Andrew H. Fagg


Research Statement


One of the elusive goals of artificial intelligence is to build machines that can work in symbiotic collaboration with humans. To do so, the machines must learn to perform new skills and to refine old ones from their interaction with humans and with the surrounding environment. My research centers around these symbiotic relationships between humans and machines. Specifically, I study machines as models of how biological systems represent and learn motor and cognitive skills; primates as inspiration for new robot control and learning techniques; and the interaction of humans with machines. Central to all of these problems are the issues of constructing rich representations of the state of the agent, the local environment, the task, and the skills; and of using various forms of available training information to refine these representations. In this study of symbiotic computing, I draw on the disciplines of robotics, artificial intelligence, machine learning, computational neuroscience, and wearable/ubiquitous computing.

Motor Skill Representation and Learning

Skills involving reaching, grasping, and manipulation are a rich focus of inquiry because they enable both humans and monkeys to affect their environments in a flexible manner. By studying these motor skills, I hope to build robots that will be able to perform tasks within unstructured human environments, as well as environments that are inhospitable to humans, including space. I am particularly interested in drawing inspiration for robot control systems from the study of biological control and in the use of robots as a mechanism in which to test biological theories of motor control.

To better study robot control, one of my projects has been the design and construction of the UMass Torso robot (Figure 1) in the context of an NSF-sponsored Research Infrastructure project. As with many humanoid-form robots, the UMass Torso consists of many controllable degrees-of-freedom and sensors. Thus, there are often many ways in which a task may be accomplished with the available sensor and actuator set. Although this design increases the complexity of the control and sensing problem, the redundancies in which a task may be addressed can be exploited to allow the robot to perform a wide range of tasks while optimizing for a variety of task criteria. The research challenge is to manage these complexities and provide layers of abstraction that 1) enable a programmer to work at an intuitive task level, and 2) allow planning and machine learning algorithms to be used in a practical manner to automatically improve motor skills (or more specifically, control policies).

Figure 1: The UMass Torso humanoid robot has two 7 degree-of-freedom Whole Arm Manipulators (WAMs) whose kinematics are similar to those of the human arm. Each arm is equipped with a three-fingered Barrett hand that has been augmented with a 6-axis force/torque sensor at each fingertip. The arm configuration enables the exploration of a wide range of reaching and grasping skills at a variety of physical scales. The Torso is also equipped with a steerable, stereo vision system.

The control basis approach provides a general framework for sensorimotor abstraction (Platt, Jr. et al., 2003). At its most fundamental level, the framework calls for a set of closed-loop controllers that bridge the gap between the continuous sensor/actuator/time domains and the discrete, abstract representations relating to these. One class of closed-loop controllers that I have been developing with a student is aimed at the formation of stable grasps. Rather than starting with a detailed model of the object to be grasped (e.g., as derived from a vision system), the first step in our approach is to haptically explore the object to be grasped. At each contact with the object, the controller estimates the total force and torque applied to the object by the set of contacts. Given a simple model of the local object geometry, the controller computes movements of the fingers and arm that attempt to reduce the total force and torque (Platt, Fagg, and Grupen, 2002). The power of this approach to grasp formation is that the controller can be assigned a variety of different physical resources, including fingertips, palms, multiple hands, and even "virtual contacts" such as gravity (Platt, Fagg, and Grupen, 2003).
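
The quantity that such a grasp controller drives toward zero can be illustrated with a short sketch. The code below is not the controller from the cited papers; it is a minimal illustration that assumes frictionless point contacts, each applying a unit force along its inward surface normal, and that sums squared net force and squared net torque with equal weight. The function names (net_wrench, wrench_residual) and that weighting are hypothetical choices made for the example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def net_wrench(contacts):
    """Return (net force, net torque) for a list of contacts.

    Each contact is (position, inward_unit_normal), both 3-tuples in the
    object frame. Assumption: every contact pushes with unit magnitude
    along its normal (frictionless point contact).
    """
    force = [0.0, 0.0, 0.0]
    torque = [0.0, 0.0, 0.0]
    for position, normal in contacts:
        moment = cross(position, normal)
        for k in range(3):
            force[k] += normal[k]
            torque[k] += moment[k]
    return tuple(force), tuple(torque)


def wrench_residual(contacts):
    """Squared net force plus squared net torque (hypothetical equal weighting)."""
    force, torque = net_wrench(contacts)
    return sum(f * f for f in force) + sum(t * t for t in torque)


# Two opposing contacts on a common line cancel both force and torque.
antipodal = [((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)),
             ((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0))]

# Shifting the contacts off the common line leaves a net torque.
offset = [((1.0, 0.5, 0.0), (-1.0, 0.0, 0.0)),
          ((-1.0, -0.5, 0.0), (1.0, 0.0, 0.0))]
```

A controller in this spirit would move the contacts so as to descend the residual; the antipodal configuration is a minimum, while the offset configuration is not.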

Although the closed-loop controllers sense and act in the continuous domain, they provide an interface to a higher level of abstraction. From this abstract level, a single action becomes the activation of a particular grasp controller. This action terminates at some later time, with a report of success (e.g., having achieved a stable grasp) or of failure (e.g., lost contact with the object). This type of action and state transition model can be captured by the Semi-Markov Decision Process (SMDP) formalism, which supports the use of a variety of planning and learning techniques and also provides a convenient way to express task-level programs. For example, we have shown that stable manipulation of an object can be expressed in terms of a sequence of grasp controller activation actions (Platt, Fagg, and Grupen, submitted). In addition, a student and I have applied a reinforcement learning technique to the problem of discovering an appropriate sequence of grasp and place actions (Wheeler, Fagg, and Grupen, 2002). Rather than starting with a model of which grasp was appropriate for a given final object configuration, the robot learned through interaction to select a grip in anticipation of how the grasped object was to be used in future actions. The behavior that the robot exhibited during learning showed interesting qualitative similarities to grip selection by children performing a similar task.

Related to this work, I am a member of a collaborative project led by NASA/Johnson Space Center. Their humanoid robot, Robonaut, will ultimately be deployed on the International Space Station to participate in the assembly and maintenance of the station components. Although there are a number of kinematic and sensing differences with the UMass Torso, we have demonstrated that aspects of our control approach apply well to this robot. In particular, we are bringing critical automated grasping and teleoperator interface components to the Robonaut system.

Biological Motor Control

Biological systems represent the best examples of motor control and learning. I am pursuing biological models as inspiration for robot control approaches and robots as mechanisms with which to evaluate biological control theories. In my research, I seek to understand how to better describe what is represented by different areas of the nervous system, the computations that are implemented by these regions and their interconnections, and what factors drive the development of these representations. I study these questions in the context of motor control, specifically in the area of reaching, grasping, and manipulation.

One of the critical questions to be addressed when examining the role of the brain in motor control is the relative contribution of peripheral systems, specifically, muscles, the sensors embedded within the muscles and other tissue, and the neural circuitry within the spinal cord. It is common in the modeling community to assume that these peripheral systems impose a linear transformation of the motor signals generated by the brain. Although a simple assumption, it implies that the full complexity of a temporal muscle activation pattern is due to the motor commands generated by the brain itself (and requires a large number of parameters to describe). I have developed a model of muscle/spinal interaction that includes key nonlinearities, particularly within the feedback loop implemented by the spinal circuitry (Houk, Fagg, and Barto, 2002). Although these nonlinearities impose additional complexities to the modeling process, we have shown that they can drastically reduce the complexity of the motor command that is necessary to produce realistic muscle activation patterns (Fagg, Barto, and Houk, 1998a). This observation has important implications for how the brain represents and learns motor skills.

I also study models of biological motor skill learning. It is clear from the psychophysics and neuroscience areas that multiple, distinct mechanisms of learning are involved in the process of acquiring a new skill. In addition, there exist interesting parallels between theories of machine learning and the mechanisms that are implemented by several brain regions. I am interested in developing models of these learning mechanisms and their interaction. For example, an area of the brain called the cerebellum is thought to be involved in learning coordinated motor skills. Experimental evidence suggests that the motor outflow from this area is trained using a mechanism that relates to supervised learning (or regression) techniques. One unknown is the source of the error information that drives the learning process. When one examines reaching movements in adults, one often sees a gross movement to the target followed by a sequence of smaller movements. A hypothesis that I have been exploring is that this training information (in the form of an error vector) is derived from the submovement that follows the current one (Fagg, Sitkoff, Barto, and Houk, 1997; Fagg, Zelevinsky, Barto, and Houk, 1998b; Barto, Fagg, Sitkoff, and Houk, 1999).
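
The idea of using a corrective submovement as a training signal can be sketched in one dimension. This is a deliberately simplified toy and not the cerebellar model of the cited papers: it assumes a scalar feedforward gain w, a linear plant with an unknown gain plant_gain, and a delta-rule update in which the size of the correction plays the role of the error. All names and parameter values are hypothetical.

```python
def train_gain(plant_gain, targets, learning_rate=0.1, w=0.0):
    """Learn a feedforward gain from corrective submovements (toy, 1-D).

    For each target distance d:
      - the primary submovement moves the arm by plant_gain * w * d;
      - the corrective submovement covers the remaining distance;
      - that correction is used as the error signal for a delta-rule update.
    """
    for d in targets:
        primary = plant_gain * w * d      # displacement from the first submovement
        correction = d - primary          # what the follow-up movement must supply
        w += learning_rate * correction * d
    return w


# With a plant gain of 2, the ideal feedforward gain is 1/2: the primary
# movement then lands on target and no correction is needed.
learned_gain = train_gain(plant_gain=2.0, targets=[1.0] * 200)
```

As the gain converges, the corrections shrink toward zero, which mirrors the claim that the system teaches itself to produce a single, accurate movement.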

This approach is interesting in that the motor system, in some sense, is responsible for teaching itself how to produce smoother, more coordinated movements. However, this model assumes that the motor system is always capable of generating an effective sequence of corrections to take the arm to the target. One possibility is that some aspect of a corrective action is selected as a function of its utility in completing the movement (Fagg, Barto, and Houk, 1998a). Another set of brain regions, known as the basal ganglia, is thought to be involved in the assessment of the utility of actions. I am currently developing an abstract model in which a reinforcement learning (RL) module is responsible for selecting from a small number of available corrective actions, but the meaning of these actions is altered at the same time by a supervised learning mechanism. This model is particularly interesting in that it uses exploratory learning (specifically, RL) when there is little information about how to perform a movement, but then comes to rely on supervisory training information when the teacher becomes competent.

In addition, I study the formation of movement representations and execution strategies. The production of movements typically involves the differential recruitment of many more muscles than skeletal degrees of freedom and many more neurons than muscles. However, there are specific regularities in the way in which neurons and muscles are recruited in the movement generation process (for example, both muscles and cells are often recruited as a function of the cosine of the direction of movement). The question is what factors lead to these regularities despite the redundancies that exist. In our model, we explore the hypothesis that many of these effects can be explained through a process that attempts to optimize both the movement error and the degree of effort used to perform the movement (Fagg, Shah, and Barto, 2002b). The model produces patterns of systematic wrist muscle recruitment that are consistent with both human and monkey data. Furthermore, through this approach we are able to explore issues surrounding the neural representation of movement (Shah, Fagg, and Barto, submitted) and the formation of these representations (Sondhi, Shah, and Fagg, in preparation). I am currently working to extend these techniques to the area of grasp formation (Fagg and Arbib, 1998; Fagg, 1996).
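
A small idealized example shows how cosine-shaped recruitment can fall out of an error-plus-effort cost. It is not the wrist model of Fagg, Shah, and Barto (2002b). It assumes n >= 3 actuators whose pulling directions are evenly spaced in the plane, allows signed activations (real muscles can only pull), and minimizes the squared error between the summed pull and the target plus effort_weight times the sum of squared activations. Under those assumptions the minimizer has the closed form a_i = (d_i . target) / (n/2 + effort_weight), which is a cosine of the angle between each pulling direction and the target direction. All names are hypothetical.

```python
import math


def pulling_directions(n):
    """Unit vectors evenly spaced around the circle (idealizing assumption)."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]


def cost(activations, target, effort_weight):
    """Squared endpoint error plus weighted squared effort."""
    dirs = pulling_directions(len(activations))
    fx = sum(a * d[0] for a, d in zip(activations, dirs))
    fy = sum(a * d[1] for a, d in zip(activations, dirs))
    error = (fx - target[0]) ** 2 + (fy - target[1]) ** 2
    effort = sum(a * a for a in activations)
    return error + effort_weight * effort


def min_cost_activations(target, n=8, effort_weight=0.5):
    """Closed-form minimizer of cost() for evenly spaced directions, n >= 3.

    Relies on sum_i d_i d_i^T = (n/2) * I for evenly spaced unit vectors.
    """
    scale = 1.0 / (n / 2.0 + effort_weight)
    return [scale * (d[0] * target[0] + d[1] * target[1])
            for d in pulling_directions(n)]


target_angle = 0.3
target = (math.cos(target_angle), math.sin(target_angle))
activations = min_cost_activations(target)
```

Each entry of activations is proportional to cos(target_angle - 2*pi*i/n), so every actuator is recruited most strongly for movements along its own pulling direction and falls off as a cosine to either side.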

Human-Machine Interaction

There are currently many consumer electronic devices that promise to improve our daily lives by performing a wide range of tasks, especially those related to communication and memory functions. However, in practice, these devices demand greater amounts of personal attention on the part of the user, which detracts from their benefits. A solution is to develop devices capable of automatically making intelligent guesses as to the information that the user will need over the next few minutes. This information should then be presented in a form that minimizes user distraction. By reducing the user's need to attend to the mechanics of interacting with the devices, we open up a wide range of possibilities for new uses of such "wearable" computing systems.

I have developed a distributed service model to address these problems. A set of independent agents is responsible for gathering information that may be useful to the user at any given time (e.g., email, news, and location-dependent "sticky" notes). However, these agents do not communicate directly with the user, but instead submit information to a central interaction process. This process is responsible for making context-sensitive decisions about whether the information should be presented to the user and how it should be presented (e.g., displayed as text or whispered in the user's ear). I approach this decision problem as one of control in which a representation of the user's activity is translated into an appropriate presentation action. This control perspective of the user interface enables us to engage a variety of machine learning approaches, including both supervised and reinforcement learning techniques.

To date, this perspective has been applied in two experiments. First, I have shown that an effective association can be acquired between a representation of the user's current activity and a document that she will access in that context. This prediction is acquired by "looking over the user's shoulder" and observing regular patterns of document access. Predicted documents are presented to the user in menu form and can be selected with a minimal number of keystrokes, increasing the speed at which many documents can be retrieved. Second, a student of mine has examined a context-sensitive power management problem in which a mobile computer must decide at any given time to suspend for a short period of time or continue to be active so as to respond to user requests or critical sensory events. We formulated the problem in terms of an SMDP and employed Q-Learning (a form of reinforcement learning) to optimize the selection of control actions. The learned control policy acquired an implicit representation of the conditions under which the processor could safely suspend while only missing a small number of external events. In the coming semester, we will be applying similar techniques to the problem of when/how to present agent-generated messages.
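
For readers unfamiliar with Q-Learning in the SMDP setting, the sketch below shows the standard tabular update, in which the value of the next state is discounted by the number of time steps the action lasted. It is a generic illustration, not the implementation used in the power management study; the state and action labels ("idle", "busy", "suspend", "stay_active"), the reward, and the parameter values are all hypothetical.

```python
from collections import defaultdict


def smdp_q_update(q, state, action, reward, duration, next_state, actions,
                  alpha=0.1, gamma=0.95):
    """One tabular SMDP Q-Learning update.

    q          -- mapping from (state, action) to estimated value
    reward     -- discounted reward accumulated while the action ran
    duration   -- number of time steps the action took to terminate
    The bootstrap term is discounted by gamma ** duration rather than gamma.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + (gamma ** duration) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: suspending for 3 time steps while idle.
actions = ["suspend", "stay_active"]
q_table = defaultdict(float)
q_table[("busy", "stay_active")] = 2.0
updated = smdp_q_update(q_table, "idle", "suspend", reward=1.0, duration=3,
                        next_state="busy", actions=actions,
                        alpha=0.5, gamma=0.5)
```

Because a suspend action can span many time steps, discounting by the elapsed duration keeps long and short actions comparable when the policy chooses between them.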

The issues addressed in the wearable computing domain also apply to the area of human-robot interaction. Here, we wish to maximize the efficiency of communication between the human and (potentially) many robots. Several students and I have been developing mixed-reality interfaces (a combination of real and virtual environments) for this purpose (Fagg et al. 2002a; Ou, Karuppiah, Fagg, and Riseman, 2004). Here, a virtual environment is used to summarize the state of the real world as extracted by the set of sensors and to make explicit the physical relationships between the different robots and sensors. This approach allows the user to explore the data space in a spatial manner and then to select individual sensors for access to their live data streams or individual robots for control purposes.

One of the dominant paradigms in robot control for space applications or hazardous environments is for a user to teleoperate a robot. Due to the large cognitive effort required to ensure that the robot acts as intended by the teleoperator, the useful operation time of a user is often less than an hour. I have been exploring the use of mixed autonomy approaches that allow the robot to perform some subtasks autonomously after permission is given by the user. One approach that I have been pursuing is to use our already-existing humanoid control system as a mechanism for the recognition of the intended movement produced by the teleoperator. This technique is being used to preemptively complete movements initiated by the teleoperator (giving the teleoperator short periods of time to rest) and to train the control system to perform sequences of submovements within a single demonstration.

Future Research Directions

In the future, I will continue to pursue the research themes discussed in the previous sections. However, I plan to also take two additional steps.

A significant next phase in the humanoid robot work is to develop a version of this system that allows motion of the base and of the trunk. Besides the technical problems of mobility, balance, and power, this step will enable the exploration of new research problems, including the collaborative manipulation of large objects; the interaction in planning and execution of reach, grasp, posture, and body placement; and planning for long-duration tasks involving object acquisition, assembly, and delivery.

My research interests in computational neuroscience, robotics, and wearable computing converge on the emerging field of Brain-Machine Interfaces (BMI). These interfaces will ultimately involve the chronic implantation of a large number of electrodes into the brain, potentially allowing for the high bandwidth transfer of information between brain and computer. This work has important implications for prosthetic limbs that will behave and "feel" much like the biological limbs that they replace, and for the development of computational prosthetics that will augment aging brain regions. However, many technical and computational questions have yet to be addressed. The latter include how to interpret the cellular activity in real time so as to command an artificial limb in a convincing manner, how to be robust to drift in the cellular representation of movement, and how to support the collaborative learning of the human and machine to improve performance of the complete system. I am a member of a multidisciplinary group that includes Northwestern University and the University of Chicago and that has recently submitted a proposal to the National Institutes of Health in which we plan to take some of the next critical steps in this work.

Bibliography

Barto, A. G., Fagg, A. H., Sitkoff, N., and Houk, J. C. (1999).
A cerebellar model of timing and prediction in the control of reaching.
Neural Computation, 11:565-594.

Fagg, A. H. (1996).
A Computational Model of The Cortical Mechanisms Involved in Primate Grasping.
PhD thesis, Department of Computer Science, University of Southern California.

Fagg, A. H. and Arbib, M. A. (1998).
Modeling parietal-premotor interactions in primate control of grasping.
Neural Networks, 11(7/8):1277-1303.

Fagg, A. H., Barto, A. G., and Houk, J. C. (1998a).
Learning to reach via corrective movements.
In Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, New Haven, CT.

Fagg, A. H., Ou, S., Hedges, T. R., Brewer, M., Piantedosi, M., Amstutz, P., Hanson, A., Zhu, Z., Grupen, R., and Riseman, E. (2002a).
Human-robot interaction through a distributed virtual environment.
In Proceedings of the Workshop on Intelligent Virtual Environments and Human Augmentation (WIHAVE), Chapel Hill, NC.

Fagg, A. H., Shah, A., and Barto, A. G. (2002b).
A computational model of muscle recruitment for wrist movements.
Journal of Neurophysiology, 88(6):3348-3358.

Fagg, A. H., Sitkoff, N., Barto, A. G., and Houk, J. C. (1997).
Cerebellar learning for control of a two-link arm in muscle space.
In Proceedings of the IEEE Conference on Robotics and Automation. Omnipress.

Fagg, A. H., Zelevinsky, L., Barto, A. G., and Houk, J. C. (1998b).
A pulse-step model of control for arm reaching movements.
In Proceedings of the 1998 Meeting of the Society for the Neural Control of Movement.

Houk, J. C., Fagg, A. H., and Barto, A. G. (2002).
Fractional power damping model of joint motion.
In Latash, M., editor, Progress in Motor Control: Structure-Function Relations in Voluntary Movements, volume 2, pages 147-178. Human Kinetics, Champaign, IL.

Ou, S., Karuppiah, D. R., Fagg, A. H., and Riseman, E. (2004).
An augmented virtual reality interface for assistive monitoring of smart spaces.
In Proceedings of the IEEE International Conference on Pervasive Computing and Communications.

Platt, R., Fagg, A. H., and Grupen, R. A. (2003).
Whole body grasping.
In Proceedings of International Conference on Robotics and Automation (ICRA'03).

Platt, Jr., R., Brock, O., Fagg, A. H., Karuppiah, D., Rosenstein, M., Coelho, Jr., J. A., Huber, M., Piater, J., Wheeler, D., and Grupen, R. A. (2003).
A framework for humanoid control and intelligence.
In the Proceedings of Humanoids 2003.

Platt, Jr., R., Fagg, A. H., and Grupen, R. A. (2002).
Nullspace composition of control laws for grasping.
In Proceedings of the International Conference on Intelligent Robots and Systems (IROS'02).

Platt, Jr., R., Fagg, A. H., and Grupen, R. A. (2004).
Manipulation gaits: Sequences of grasp control tasks.
In Proceedings of the International Conference on Robotics and Automation (ICRA'04).

Sondhi, E., Fagg, A. H., and Shah, A. (in preparation).
Formation of cortical representations for wrist movements.

Wheeler, D. S., Fagg, A. H., and Grupen, R. A. (2002).
Learning prospective pick-and-place behavior.
In Proceedings of the IEEE/RSJ International Conference on Development and Learning (ICDL'02).
