HOW TO GET WHAT YOU SEE: SOFTWARE TOOLS
FOR VISION-BASED SYSTEMS

Gregory D. Hager
Department of Computer Science
Yale University

Technology has advanced to the point where a typical desktop computer has ample computing power for real-time, video-based applications. As a result, application areas ranging from human-computer interaction to robotics are making increasing use of vision technology. However, experience has shown that developing software for vision-based systems is, in general, a complex and time-consuming task. This has led us to search for software development techniques that can facilitate the construction of such systems.

We have recently begun to use a set of techniques, referred to collectively as "functional reactive programming" (FRP), to create a programming environment for systems involving sensors (including cameras) and actuators. The result is Frob (for Functional Robotics). In this talk, I will discuss the general structure of Frob and show how it separates the WHAT (the definition of a sensor-processing or control strategy) from the HOW (the nuts and bolts of implementing that strategy). As an illustration of the potential advantages of this approach, I will show how re-prototyping our visual tracking system in FRP led us to a novel formulation that not only reduced the code size by an order of magnitude but also greatly enhanced the system's capabilities, at almost no cost in performance.
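
To make the WHAT/HOW separation concrete, the sketch below gives a minimal FRP core in Haskell, the language in which Frob is embedded. This is not Frob's actual API: the Behavior type, the drifting target, the proportional controller, and the runLoop sampler are all illustrative assumptions. The behavior definitions state WHAT the control signal is, as a pure function of time; runLoop supplies the HOW by sampling that definition at fixed time steps.

  type Time = Double

  -- Time-varying values (the WHAT): a Behavior maps time to a value.
  newtype Behavior a = Behavior { at :: Time -> a }

  instance Functor Behavior where
    fmap f (Behavior b) = Behavior (f . b)

  instance Applicative Behavior where
    pure x                    = Behavior (const x)
    Behavior f <*> Behavior b = Behavior (\t -> f t (b t))

  -- The current time, itself a behavior.
  time :: Behavior Time
  time = Behavior id

  -- Hypothetical target: a feature position drifting sinusoidally.
  targetPos :: Behavior Double
  targetPos = fmap (\t -> 10 * sin t) time

  -- Hypothetical measurement: a slightly lagged copy of the target.
  measuredPos :: Behavior Double
  measuredPos = Behavior (\t -> at targetPos (t - 0.1))

  -- A proportional control law, stated declaratively over behaviors.
  command :: Behavior Double
  command = (\tgt meas -> 0.5 * (tgt - meas)) <$> targetPos <*> measuredPos

  -- The HOW: one generic loop samples any behavior at fixed steps.
  runLoop :: Show a => Behavior a -> [Time] -> IO ()
  runLoop b = mapM_ (\t -> putStrLn (show t ++ ": " ++ show (at b t)))

  main :: IO ()
  main = runLoop command [0.0, 0.1 .. 1.0]

Because a control strategy here is just a value, the same definition can be sampled, composed, or swapped without touching the execution loop, which is the kind of separation the talk describes.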

This is joint work in progress with Paul Hudak, John Peterson, and Alastair Reid.