This document gives a very brief introduction to the driving code in the SegMan perceptual substrate. There are several issues that a visual system for driving must address: steering, speed control, and the integration of these activities.
1. Steering (directional control)
2. Speed control
3. Integrating directional and speed control
4. Running the driving simulation
5. An index of functions
The visual environment in which the driving code operates is shown
below. The simulated car is approximately straddling the center line.
Visual information is extracted from this scene by a call to
attend-road. All function calls in the driving
simulation are in the vision package. See the instructions for running the simulation and the index for a discussion of coding issues.
The fragment of code below first computes the current center of the field of vision, offset by some amount to account for the driver's seat being on the left side of the car. It then computes the goal location in the visual field, which is the center of the lane. These two locations should be the same, but usually are not. The code moves the mouse pointer from the current location to the goal location. Notice that in order to reach the goal location, the driver must steer in the direction opposite to the movement of the mouse.
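(let* ((scan-line (nominal-scan-line))
       ;; Four degrees of visual angle, expressed in pixels.
       (horizontal-offset (* *pixels-per-degree* 4))
       (current-x (point-x (far-focus)))
       (goal-x (+ (field-vertical-center) horizontal-offset)))
  (attend-road)
  ;; Move the pointer from the current location to the goal location.
  (driver-move-mouse-to (make-point current-x scan-line))
  (driver-move-mouse-to (make-point goal-x scan-line))
  ;; Print both x-values and their difference in pixels.
  (print (list current-x goal-x (- goal-x current-x))))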
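The pixel difference printed by the fragment above can be turned into a steering correction. The sketch below is only an illustration and is not part of the vision package: steering-correction is a hypothetical name, and the default of 10 pixels per degree is an assumed stand-in for *pixels-per-degree*. It applies the rule stated above, that the driver steers opposite to the movement of the mouse.

```lisp
;; Hypothetical helper, not part of the vision package.
;; The default of 10 pixels per degree is an assumed value.
(defun steering-correction (current-x goal-x &key (pixels-per-degree 10))
  "Return two values: the size of the correction in degrees, and the
direction to steer (:left, :right, or :none)."
  (let ((delta (- goal-x current-x)))
    (values (/ (abs delta) pixels-per-degree)
            (cond ((plusp delta) :left)    ; mouse moves right, so steer left
                  ((minusp delta) :right)  ; mouse moves left, so steer right
                  (t :none)))))
```

In the simulation the two x-values would come from the fragment above; here they are plain numbers so that the function can be tried on its own.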
The code for determining the center of the lane can break down if the road is curving off too fast in one direction or another. In the image below, the left side of the road is no longer visible, suggesting that the driver needs to steer hard left to stay on the road.
The system can detect this in a few different ways. One is to use a
function that computes the average slope of the road edge at some
point. If this exceeds some value in the positive or negative
direction, then that edge of the road has left the visual field
(because the right and left edges of the visual field are vertical).
The fragment of code below shows this approach. Another possibility
is to use the points at which the road starts and ends, with the
functions road-edge-right-start,
road-edge-right-end, and the left
equivalents, computing slopes from these. This latter approach is not
equivalent when the road edges are too curved, however.
(let ((scan-line (nominal-scan-line)))
  ;; Compare magnitudes so that steep slopes in either direction are caught.
  (cond ((>= (abs (road-edge-left-slope scan-line)) most-positive-fixnum)
         (print "Can't see the left edge of the road"))
        ((>= (abs (road-edge-right-slope scan-line)) most-positive-fixnum)
         (print "Can't see the right edge of the road"))))
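The second approach, computing a slope from the start and end points of a road edge, might look like the sketch below. It assumes that points are lists of x and y, as described in the index. The names endpoint-slope and edge-out-of-view-p and the default limit of 10 are hypothetical and chosen only for illustration.

```lisp
;; Hypothetical helpers, not part of the vision package.
;; Points are assumed to be lists of the form (x y).
(defun endpoint-slope (start end)
  "Return the slope of the line from START to END, or :vertical when the
two points share an x-value."
  (let ((dx (- (first end) (first start)))
        (dy (- (second end) (second start))))
    (if (zerop dx)
        :vertical
        (/ dy dx))))

(defun edge-out-of-view-p (start end &optional (limit 10))
  "True when the edge is vertical or steeper than LIMIT in either direction.
LIMIT is an assumed threshold, not a value taken from the simulation."
  (let ((slope (endpoint-slope start end)))
    (or (eq slope :vertical)
        (> (abs slope) limit))))
```

In the simulation the two points would come from road-edge-left-start and road-edge-left-end, or from their right-hand equivalents.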
Controlling speed is slightly more complex than steering. Deciding which direction to steer can be done by looking at a snapshot of the current environment. Braking and acceleration, on the other hand, depend on how the visual environment is changing. In the images below, taken within some seconds of one another, the curvature of the road has changed and the simulated car has moved further into the lane of oncoming traffic.
Ideally we'd have some sophisticated functionality for computing
optical flow, but this is beyond our abilities at this point.
Instead, we detect changes by recording the locations of specific
points in the visual field, and measuring the distance they move from
one snapshot to the next. The code below shows how this can be done.
It is straightforward to define functions that abstract away the
recording of values over time, but I've made it explicit here because
it presumably needs to be explicit in the memory of a cognitive model.
We have a loop here so that new values can be generated from
successive calls to attend-road.
(loop with last-left = nil
      with last-right = nil
      with last-scan-line = nil
      repeat 10
      do (attend-road)
         (let ((flow (if (and last-left last-right last-scan-line)
                         ;; Re-measure the edges at the previously recorded scan line.
                         (let ((left (point-x (road-edge-left last-scan-line)))
                               (right (point-x (road-edge-right last-scan-line))))
                           (prog1
                               (print (+ (abs (- left last-left))
                                         (abs (- right last-right))))
                             (setf last-left left
                                   last-right right
                                   last-scan-line (nominal-scan-line))))
                         ;; First pass: nothing to compare against yet.
                         (let ((scan-line (nominal-scan-line)))
                           (prog1
                               (print 0)
                             (setf last-left (point-x (road-edge-left scan-line))
                                   last-right (point-x (road-edge-right scan-line))
                                   last-scan-line scan-line))))))
           ;; FLOW holds the estimate for this step; it is only printed here.
           (declare (ignorable flow))))
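As noted above, the recording of values over time can be abstracted away. One way to do this, sketched here under the assumption that the caller supplies the x-values of the left and right road edges, is a closure that remembers the previous measurements. The name make-flow-estimator is hypothetical. Unlike the loop above, the sketch leaves the choice of scan line to the caller.

```lisp
;; Hypothetical helper, not part of the vision package.
(defun make-flow-estimator ()
  "Return a closure that remembers the previous edge positions.
Each call takes the current left and right x-values and returns the
summed absolute movement since the previous call (0 on the first call)."
  (let ((last-left nil)
        (last-right nil))
    (lambda (left right)
      (prog1 (if (and last-left last-right)
                 (+ (abs (- left last-left))
                    (abs (- right last-right)))
                 0)
        ;; Record the current values for the next call.
        (setf last-left left
              last-right right)))))
```

Each estimator keeps its own state, so several can be used side by side, for example one for each scan line of interest.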
One of the interesting difficulties we face in the integration of the control of direction and speed arises from our simulation of the process in discrete time and the way that optical flow is handled. Suppose that at time t the model analyzes the road, records the data for estimating flow, and determines that steering in one direction or the other is appropriate. At time t+1 some steering command is issued, and the simulated car moves in that direction. At time t+1 or later the road is analyzed again so that flow can be computed, but by this point the actions of the model have contributed to the changes in the visual field, independently of changes that would have occurred otherwise. This contribution needs to be accounted for, or the car might end up braking every time it steers.
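One simple way to account for this contribution is to subtract an estimate of the self-induced change from the observed flow. The sketch below is purely illustrative: the function names are hypothetical, the default of 10 pixels per degree is an assumed value, and the model of self-induced flow (each of the two road edges shifting by the steering angle times the number of pixels per degree) is a crude assumption and not something the simulation is known to implement.

```lisp
;; Hypothetical helpers, not part of the vision package.
(defun expected-self-flow (degrees-steered &key (pixels-per-degree 10))
  "Estimate the flow caused by the model's own steering.
ASSUMPTION: each of the two road edges shifts by about
DEGREES-STEERED * PIXELS-PER-DEGREE pixels."
  (* 2 (abs degrees-steered) pixels-per-degree))

(defun external-flow (observed-flow degrees-steered
                      &key (pixels-per-degree 10))
  "Return the part of OBSERVED-FLOW not explained by the model's steering.
The result is clamped at 0 so that an overestimate never yields negative flow."
  (max 0 (- observed-flow
            (expected-self-flow degrees-steered
                                :pixels-per-degree pixels-per-degree))))
```

With this adjustment, braking decisions would be based on the value returned by external-flow rather than on the raw flow.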
(get-cursor) to return the cursor position, and
repeatedly (painfully) nudge the position of the window until it
matches the location (300, 315). We'll fix this problem shortly.
Once the window is in place, and (attend-road) has
been called, the code fragments above for retrieving steering
information and speed information can be evaluated.
attend-road: Process the current contents of the screen.
field-bounds: Return the bounds of the image, such
as (300 315 705 615).
horizon: Return the y-value of where the road ends
in the distance.
road-edge-left-start: Return the point (in SegMan
form, a list of x and y) that corresponds to the lower point at which
the left edge of the road can be seen, along the left border of the
visual field.
road-edge-left-end: Return the point at which the
left edge of the road disappears into the distance. This may be
closer on hills.
road-edge-right-start: See road-edge-left-start.
road-edge-right-end: See road-edge-left-end.
nominal-scan-line: Return a y-value that averages
the vertical area in the visual field occupied by the road. This
function is used to provide a point at which the edges of the road can
be acquired.
road-edge-left (scan-line): Given a scan line,
return a point corresponding to the left edge of the road. The
scan line value is the y-value of this point.
road-edge-right (scan-line): See road-edge-left.
road-stripe (scan-line): Given a scan line, return
a point whose x-value corresponds to the approximate center of the
road. This can be used to determine lane boundaries.
road-edge-left-slope (scan-line): Given a scan
line, return the approximate slope of the road edge at that point.
The slope is averaged over a small number of segments marking the road
above and below the scan line.
road-edge-right-slope (scan-line): See
road-edge-left-slope.
road-end: Return the point at which the road
disappears in the distance, at the horizon.
driver-move-mouse-to (x-or-point &optional y):
Move the mouse pointer to a position on the image. This is similar
to the function move-mouse-to in the segman
package, but has its origin at the top left of the image.
Another, smaller set of functions combines these primitives into an approximation of visual routines (the intention is that these routines should operate as visual routines in the intermediate vision sense). The values that some of these functions return are still under discussion and development; for example, should everything be computed in degrees or in pixels?
field-vertical-center: Return the center of the visual field.
far-focus (&optional (n-degrees 5.5)): Return a point
centered in the lane some number of degrees below the horizon.
near-focus (&optional (n-degrees 10)): See
far-focus. These two functions together can provide
information about steering direction, although in the code in the
steering section a different value is used for
the near focus. This is a modeling issue that should be resolved at
some point.
lane-deviation (&optional (degrees-deviation 2)): Return
the number of degrees horizontal difference between where the car
is and where it "should" be, plus the direction that steering needs to
occur for correction.
road-edge-missing: Return some maximum value (for
consistency with lane-deviation) and a direction to be
steered. See the discussion in the steering
section for further details.
estimated-flow: Return a number indicating the amount of
change since the last time the function has been called. See the
discussion in the speed control section
for further details.