CS 5973: Neuro/Cognitive Projects
The semester-long project constitutes a significant percentage of your
class grade. Projects will be experimental in nature, requiring a
carefully designed computational hypothesis, a computer
implementation, an experiment, and an analysis of the results.
Project topics must be based on a
set of at least three papers drawn from the literature; one of these
papers must come from the set listed on the course schedule page. With approval,
students may collaborate on projects in groups of two. For these
projects, it must be clear that each student will make a significant
and clearly distinguishable contribution.
For those students who do not have a significant background in
programming, we will make every effort to design an appropriate
collaborative project.
All project-related materials must be handed in on the specified due
date: for in-class presentations, you must be ready to present in
class; written materials are due at 23:59 and may be handed in via
email or the course Blackboard site.
Deadlines
- Sept 15: 1-page project proposal due
- Sept 20: In-class presentation of project proposal
- Oct 18: In-class presentation of project status
- Nov 8: In-class presentation of project status
- Nov 19: Draft of final paper due
- Nov 29: Peer paper reviews due
- Dec 6: Final paper due
- Dec 8: Final project presentations
- Dec 14 (4:30-6:30): Project presentations continued
Project Proposal
- 1-page due on September 15th at 23:59
- PostScript/PDF/plain text (no .doc files, please)
The proposal should answer:
- What is the behavioral/neural domain to which you are connecting?
- Why is it interesting?
- What is the computational problem to be solved?
- What is the computational approach? (You may not know this in great
detail yet, but take a guess.)
- What will your experiment(s) look like?
- Include proper references!
Project Ideas
Grounding Symbols for Color Descriptions
- Domain: how do children learn the meaning of the names
of colors? In other words, how do they learn the relationship
between their perception of color and the symbols used to
describe the colors?
- Interesting because:
- Children learn this mapping through example pairings of
images and the symbols.
- The examples and the symbols are often ambiguous.
- The symbol classes are often overlapping.
- Computational problem: how to establish a relationship between
a vector space (representing colors) and a discrete space of symbols?
- A possible computational approach (see the code sketch below):
- Input: tuples consisting of an image and a set of
symbols that describe the color in the image.
- Throw out all spatial information: all images are
reduced to a set of pixel colors (3D vectors)
- Construct color models for each symbol:
- Take the pixel colors of all of the images for which the symbol
is used.
- Construct a mixture-of-Gaussians model that best fits the
set of pixels (in a maximum likelihood sense). This is
a representation of the likelihood of a given pixel
vector given the symbol: p(color | symbol)
- Given the color models, we can perform the following
experiments:
- Given a novel image, generate the symbol(s) that
best describe the color in the image.
- Given a symbolic color description and a set of
images, identify the image that best matches the
color description.
- Other questions to examine:
- How to handle conjunctions of symbols? e.g.,
"dark red" versus "light red" versus "red"?
- How to handle disjunctions of symbols? e.g.,
"this image contains red and blue" (ie pixels are
either red or blue)?
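To make the approach above concrete, here is a minimal sketch in
Python. It assumes scikit-learn's GaussianMixture for the
maximum-likelihood fit; the function names, the number of mixture
components, and the implicit uniform prior over symbols are choices of
this sketch, not requirements of the project.

    # Minimal sketch of the color-grounding model described above.
    # Assumes scikit-learn is available; names and component counts
    # are illustrative, not part of the project specification.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fit_color_models(examples, n_components=3):
        """examples: list of (pixels, symbols) pairs, where pixels is an
        (N, 3) array of color vectors and symbols is a set of strings.
        Returns a dict mapping each symbol to a model of p(color | symbol)."""
        pixels_by_symbol = {}
        for pixels, symbols in examples:
            for s in symbols:
                pixels_by_symbol.setdefault(s, []).append(pixels)
        return {s: GaussianMixture(n_components).fit(np.vstack(chunks))
                for s, chunks in pixels_by_symbol.items()}

    def best_symbol(models, pixels):
        """Experiment 1: label a novel image with the symbol whose model
        gives the image's pixels the highest mean log-likelihood."""
        return max(models, key=lambda s: models[s].score_samples(pixels).mean())

    def best_image(models, symbol, images):
        """Experiment 2: given a symbol, return the image (pixel array)
        that the symbol's model scores highest."""
        return max(images, key=lambda px: models[symbol].score_samples(px).mean())

Conjunctions such as "dark red" could be handled either by treating
the compound as an atomic symbol (as this sketch does) or by composing
per-word models; disjunctive descriptions suggest fitting with more
mixture components.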
Spatial Concept Grounding from Visual Examples
This project would play out in a manner that is similar to the color
grounding problem. In this case, however, we would like to construct
spatial models of concepts such as "left of," "on top of," and "near."
Instead of constructing models in color space, we would construct
models that capture spatial relationships. Gaussians (or mixtures
thereof) could also be used, although some other distributions might
do a better job for certain concepts; a minimal sketch appears below.
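One minimal starting point, under the assumption that each training
example can be reduced to the offset of a target object relative to a
reference object, is a single Gaussian per spatial term. All names
here are choices of this sketch:

    # Sketch: model a spatial term ("left of", "near", ...) as a Gaussian
    # over the offset between a target object and a reference object.
    # The offset representation is an assumption of this sketch.
    import numpy as np

    def fit_spatial_model(offsets):
        """offsets: (N, 2) array of observed (target - reference) positions
        for one concept. Returns the maximum-likelihood mean and covariance."""
        offsets = np.asarray(offsets, dtype=float)
        return offsets.mean(axis=0), np.cov(offsets, rowvar=False)

    def log_likelihood(offset, mean, cov):
        """log N(offset; mean, cov); used to rank candidate concepts
        for a novel scene."""
        d = np.asarray(offset, dtype=float) - mean
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d @ np.linalg.inv(cov) @ d
                       + logdet + len(d) * np.log(2 * np.pi))

A concept like "near" is probably better modeled as a distribution
over distance alone, which is one instance of the "other
distributions" caveat above.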
Interaction of the Dorsal and Ventral Visual Pathways
We understand a fair amount about the roles played by the dorsal and
ventral visual pathways. However, much less is understood about how
these pathways might interact with one another. Several of the papers
on the reading list represent different aspects of how the
dorsal/ventral computations may take place individually, and in some
cases examine how their interaction might play out. These include:
- Matthew Schlesinger and Roberto Limongi (2005), "Towards a what-and-where model of infants' object representations," AAAI Spring Symposium on Developmental Robotics.
- Alexander Stoytchev (2005), "Behavior-Grounded Representation of Tool Affordances," Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Barcelona, Spain, April 18-22.
- Paul Fitzpatrick, Giorgio Metta, Lorenzo Natale, Sajit Rao, and Giulio Sandini (2003), "Learning about objects through action: initial steps towards artificial cognition," Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Taipei, Taiwan, May 12-17.
- Justus Piater and Roderic Grupen (2002), "Learning Appearance Features to Support Robotic Manipulation," Cognitive Vision Workshop (ETH Zurich).
- D. S. Wheeler, A. H. Fagg, and R. A. Grupen (2002), "Learning Prospective Pick and Place Behavior," Proceedings of the International Conference on Development and Learning (ICDL'02), electronically published.
Model of Development: Choosing which Skills to Learn and When
Many approaches to skill learning are focused on learning an
individual skill (i.e., they have a single reward function). In the rare
case in which a set of skills is learned, it is typically the
experimenter that determines the sequence in which the skills are
learned. In contrast, infants and toddlers are constantly "hopping"
from one learning problem to another -- in many cases, this process of
switching between learning tasks is determined internally. What is it
that drives this selection of learning task? We know that selection
cannot be arbitrary: many skills build on top of others that have been
previously learned, and (early in the process) the developing body
does not have the motor strength or representational capability to
take on the more complicated tasks. One computational theme that we
see in a variety of writings (see the curiosity and development section of
the schedule) is that of focusing on areas of "moderate
novelty." This means that the agent actively seeks out experience in
the world in which some success has
already been found, but that high performance has yet to be achieved.
For a class project, one could implement one such mechanism of
development.
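As one concrete (and deliberately simple) illustration, here is a
sketch of a task selector that prefers tasks of intermediate
competence. The scoring rule, the running-average success estimate,
and all names are assumptions of this sketch; learning-progress
measures (e.g., the change in prediction error over time) are a common
alternative in the literature.

    # Sketch of a "moderate novelty" task selector: keep a running
    # success estimate per task and prefer tasks whose estimated success
    # is intermediate (some success, but mastery not yet achieved).
    import math
    import random

    class ModerateNoveltySelector:
        def __init__(self, tasks, alpha=0.1):
            self.success = {t: 0.0 for t in tasks}  # running success estimates
            self.alpha = alpha                      # estimate learning rate

        def choose(self, temperature=0.1):
            """Sample a task; interest s*(1-s) peaks at 50% success."""
            weights = [(t, math.exp(s * (1.0 - s) / temperature))
                       for t, s in self.success.items()]
            r = random.uniform(0, sum(w for _, w in weights))
            for t, w in weights:
                r -= w
                if r <= 0:
                    return t
            return weights[-1][0]  # numerical fallback

        def record(self, task, succeeded):
            """Update the success estimate after an attempt at the task."""
            s = self.success[task]
            self.success[task] = s + self.alpha * (float(succeeded) - s)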
Recognition of Grasping Actions
Given a time series of hand position, orientation, and shape, how can
an observing agent extract a deep representation of the
grasping action? In particular, we would like to extract the goal of
the reach (which includes the object identity and how the object
is to be grasped). Depending upon the application, we may want to be
able to produce a prediction of the goal before the hand actually
arrives at the object. In other applications, we may wish to label a
sequence of pick-and-place operations after the fact.
There are a number of computational theories that attempt to explain
this recognition process. Some of the most interesting are inspired
by Rizzolatti's mirror neuron work, which suggests that the
control system itself is involved in the recognition process.
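One simple baseline for the early-prediction problem (a sketch, and
not one of the mirror-system-inspired models) is nearest-neighbor
matching of the observed trajectory prefix against stored,
goal-labeled example reaches. All names are illustrative:

    # Baseline sketch: predict the goal of a reach from a partial hand
    # trajectory by nearest-neighbor matching against stored examples,
    # each labeled with its goal (object identity, grasp type).
    # Time alignment (e.g., dynamic time warping) is omitted here.
    import numpy as np

    def prefix_distance(observed, example):
        """Mean frame-wise distance between the observed prefix and the
        first len(observed) frames of a stored example ((T, D) arrays)."""
        n = min(len(observed), len(example))
        return np.linalg.norm(observed[:n] - example[:n], axis=1).mean()

    def predict_goal(observed, library):
        """library: list of (trajectory, goal) pairs. Returns the goal of
        the example whose prefix best matches the observation so far."""
        _, goal = min(library, key=lambda ex: prefix_distance(observed, ex[0]))
        return goal

Called repeatedly as frames arrive, this yields a running goal
estimate whose accuracy can be plotted against the fraction of the
reach completed, which is one natural experiment for this project.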
Hardware Access
Some of my robot hardware is available for use in your projects. In
addition, some of my students are available to help in the data
collection process.
- Stereo vision system: This system can deliver raw
images, but we also have a simple object segmentation system
implemented in Matlab that can give 2D and 3D locations of
objects, as well as a number of other object statistics
(blob/object size, orientation, color, etc.). Note that the 3D
component of the system is still being calibrated, but I expect
that it will be up and running by the end of September.
- Hand tracker: This system will give very accurate hand
position/orientation and (approximate) finger flexion
information at ~15 Hz. The 3D calibration of the vision system
will place visually-reported positions into the same coordinate frame
as the hand position estimates. If you decide to work with
this data stream, then you should already be familiar with (or
willing to get up to speed on) position and orientation
representations in 3D (so a robotics or a graphics background
will help here; see the brief example below).
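For reference, here is a small sketch of one common 3D orientation
representation, the unit quaternion, converted to a rotation matrix.
The (w, x, y, z) ordering and handedness conventions are assumptions
of this sketch; check them against the tracker's actual output.

    # Sketch: convert a unit quaternion (w, x, y, z) to a 3x3 rotation
    # matrix. Ordering and handedness conventions are assumptions here;
    # verify them against the hand-tracker documentation.
    import numpy as np

    def quat_to_matrix(q):
        q = np.asarray(q, dtype=float)
        w, x, y, z = q / np.linalg.norm(q)  # guard against drift from unit norm
        return np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])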
Final Project Document
- For the final project report, we will be using the official ICDL
paper style (this is required). Templates for both LaTeX and M$ Word
are available at the ICDL submission page
- Total length of the final report is limited to 6 pages
- You should not need to do any more paper reading at this
stage, but you must still discuss (where appropriate) your basis
set of papers and provide proper references
- For your paper draft, you should have as much completed as
possible, as this is your primary opportunity to get feedback
(which will be critical for grading of your final report). If
you find yourself running out of time, you should focus on the
core pieces of the paper (these are the components that will
receive the highest weight in the grading): description of the
experimental problem,
hypothesis, experimental approach (with details!), and results
(the other components should at least be outlined).
- For a description of the key pieces of a project report, see "Writing
a Project Report" by Ray Mooney at the University of Texas (note that
the focus is on machine learning projects)
Project Presentation
Your final project presentation constitutes 10% of your course grade.
At the time of your presentation, your experiments should be complete.
You will have a total of 30 minutes to present your project and to
address questions (so plan on 25 minutes of material).
Your slides should cover:
- A reminder of the domain in which you are working.
- A description of the particular problem that you are trying to
solve, including:
- Why is it interesting/important?
- Why is it hard?
- A concise statement of your experimental hypothesis.
- A description of your computational approach (including the
components, representations, and algorithms).
This should provide enough detail for your audience to understand
how to begin to replicate your approach, but limit your discussion
of implementation details that have no bearing on the computational
approach. Illustrative examples are good.
- A description of your experiment.
- A description of your experimental results. Where appropriate,
include an example of *what* has been learned, aggregate
learning results (showing performance over many task
instances), learning curves, and statistical testing.
- A conclusion: what are the key points of your talk?
- A discussion of possible next steps.
fagg AT ou.edu