  Ibots: User Interface Softbots

Interactive applications extend human abilities along an enormous number of dimensions. What can we learn from agents that use these same software tools? This project attempts to answer this question through the development of a specialized type of interface agent that we call an ibot, for interface softbot.

An ibot controls an interactive system through the graphical user interface, as human users do, without relying on an application programming interface (API) or access to source code. Instead of tailoring interface agents to the APIs of different applications, our goal is to build what Nils Nilsson has called habile agents: general tool-using agents. Our work has led to a programmable substrate for ibots, with sensors, effectors, and skeleton controllers for this purpose. Sensor modules take pixel-level input from the display, run the data through image processing algorithms, and build a representation of visible interface objects. Effector modules generate mouse and keyboard gestures to manipulate these objects. These sensors and effectors act as eyes and hands to be managed by a controller appropriate for an application domain.
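The sensor, effector, and controller roles described above can be illustrated with a minimal sketch. This is not SegMan or VisMap code: the function names (`find_regions`, `click_gesture`, `controller`), the toy pixel grid, and the gesture tuples are all hypothetical, and the "image processing" is reduced to grouping same-valued, 4-connected pixels into rectangular regions.

```python
from collections import deque


def find_regions(pixels, background=0):
    """Sensor sketch: group same-valued, 4-connected pixels into regions."""
    rows, cols = len(pixels), len(pixels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or pixels[r][c] == background:
                continue
            value = pixels[r][c]
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and pixels[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [y for y, _ in cells]
            xs = [x for _, x in cells]
            # Represent the visible object by its value and bounding box.
            regions.append({"value": value, "top": min(ys), "left": min(xs),
                            "bottom": max(ys), "right": max(xs)})
    return regions


def click_gesture(region):
    """Effector sketch: describe a mouse click at the centre of a region."""
    x = (region["left"] + region["right"]) // 2
    y = (region["top"] + region["bottom"]) // 2
    return [("move", x, y), ("press", "left"), ("release", "left")]


def controller(pixels, target_value):
    """Controller sketch: sense, pick the first matching object, act."""
    for region in find_regions(pixels):
        if region["value"] == target_value:
            return click_gesture(region)
    return []  # nothing matching is visible, so no gesture is generated


# A toy "display": 0 is background, 1 and 2 are two interface objects.
screen = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
]
gestures = controller(screen, target_value=2)
```

In this sketch the controller never touches an application API: it sees only pixels and emits only gesture descriptions, which is the separation the paragraph above describes. A real substrate would replace the toy grid with screen capture and the tuples with actual input events.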

This material is based upon work supported by the National Science Foundation under Grant No. 0083281. Related work also funded under this grant explores issues in human and agent tool use. A summary of the goals of the project, with progress annotations, is online as well. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Participants

Robert St. Amant (PI)

Publications

Kunal Shah, Sameer Rajyaguru, Robert St. Amant, and Frank E. Ritter. Connecting a cognitive model to dynamic gaming environments: Architectural and image processing issues. Proceedings of the Fifth International Conference on Cognitive Modeling (ICCM). 2003. Pp. 189-194.

Robert St. Amant and Ajay Dudani. An environment for programming user interface softbots. Proceedings of the International Working Conference on Advanced Visual Interfaces (AVI). 2002.

Mark O. Riedl and Robert St. Amant. Toward automated exploration of interactive systems. Proceedings of the International Conference on Intelligent User Interfaces (IUI).

Robert St. Amant and Christopher G. Healey. Usability guidelines for interactive search in direct manipulation systems. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 2001. Pp. 1179-1184.

Robert St. Amant and Mark O. Riedl. A perception/action substrate for cognitive modeling in HCI. International Journal of Human-Computer Studies. 55(1): 15-39. 2001. (IDEALibrary link)

Robert St. Amant and Luke S. Zettlemoyer. The user interface as an agent environment. Autonomous Agents. 2000. Pp. 483-490.

Robert St. Amant. Interface agents as surrogate users. Intelligence magazine. Summer, 2000.

Robert St. Amant, Henry Lieberman, Richard Potter, and Luke S. Zettlemoyer. Visual generalization in programming by example. Communications of the ACM, 43(3): 107-114. March, 2000.

Luke S. Zettlemoyer and Robert St. Amant. A visual medium for programmatic control of interactive applications. Human Factors in Computing Systems (CHI '99). 1999. Pp. 199-206.

Software

SegMan is a successor to the VisMap system described in the papers above. The system is written mainly in Franz Allegro Common Lisp, with low-level functionality provided by a DLL generated in Visual C++. A readme file contains basic information about getting and running the system. A more extensive introduction contains information about the functionality of the system. We have recently extended the SegMan system to handle perception in a simple driving game; we have written a very brief discussion of the technical issues for integration with a cognitive model. See the history file for changes between versions.

We are also building a Java version of the VisMap system. It is not quite ready for release, but should be in the near future. Some preliminary information about the Java version, written by Ajay Dudani, is available but will be modified significantly. Currently you can download a demo that shows how SegMan functionality can be applied to regression testing of a graphical user interface. The picture to the left shows the interface to the testing software; the original screen shot is shown here.

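One way to picture pixel-level regression testing of a graphical interface is to compare a stored baseline screen against the current one and report where they differ. The following is only a sketch under that assumption, not the method used in the demo: the function name `diff_bounding_box` and the small integer grids standing in for screenshots are hypothetical.

```python
def diff_bounding_box(baseline, current):
    """Return (left, top, right, bottom) of the changed area, or None if identical."""
    if len(baseline) != len(current) or any(
            len(a) != len(b) for a, b in zip(baseline, current)):
        raise ValueError("screenshots must have the same dimensions")
    # Collect the coordinates of every pixel whose value changed.
    changed = [(y, x)
               for y, row in enumerate(baseline)
               for x, value in enumerate(row)
               if current[y][x] != value]
    if not changed:
        return None
    ys = [y for y, _ in changed]
    xs = [x for _, x in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical baseline and current "screenshots" as grids of pixel values.
baseline = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0],
    [0, 7, 0, 0],
    [0, 0, 0, 3],
]
changed_area = diff_bounding_box(baseline, current)
```

A test would pass when the result is `None` and otherwise flag the returned rectangle for inspection. A practical tool would likely compare recognized interface objects rather than raw pixels, so that harmless rendering differences do not count as failures.
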
As part of our interest in the visual aspects of interacting with interfaces, we have built a user interface for drawing, called HabilisDraw, that relies on explicit visual tools rather than global modes in its execution. In most graphical drawing applications, one "uses tools" for drawing lines, circles, etc., by pressing a button on an icon to specify a drawing mode, and then dragging the mouse cursor over the canvas in specific patterns. Such activity does not exploit our tool-using abilities in any significant way. The picture to the left shows an alternative approach, a drawing environment that includes explicit tools such as rulers, compasses, pins, and inkwells. We believe that some applications can benefit from a richer representation of tools and tool use.

When complete, the HabilisDraw system will be released for general use. You can see the tools developed for Version 1, and a result of using Version 2. You can also watch a movie, in AVI format, of its use.

We are developing a simple statistical interface (SSI) package to assist users in analyzing the output of their systems using SegMan. SSI will contain semi-automated statistical strategies for simple forms of exploratory time series analysis. In its current state, SSI is a conventional interactive statistics package based on CLASP. The image accessible to the left shows data from the instrumented interaction between a cognitive model (ACT-R) and a driving simulation, a recent SegMan application. The code for the current version of the system is available.

Demonstrations

Our most recent work has been to integrate SegMan with the Soar cognitive modeling architecture. Here is an AVI movie that shows a Soar agent playing Minesweeper. Its moves are random, but this demo is a reasonable proof of concept.

Part of our work has involved developing programmatic interfaces for a planning system to control interactive applications. Here are two AVI movies, in which SegMan is controlled by a reactive planner (RAPs), one that involves warnings and one that does not.

One of the applications in which we used VisMap was Solitaire, as a proof of concept. Here is a movie of the now-defunct system in action.

 
  Last Updated: 8/14/00, 12:46:52 PM
 
 

Mail questions or comments to stamant@csc.ncsu.edu