Sensing Linear and Angular Change
Through a Small Optical Window
Tim Poston† Manohar Srikanth‡ Prabhakar Vaidya†
†National Institute of Advanced Studies (NIAS)
‡Indian Institute of Science (IISc)
Abstract:
We present an algorithm to sense translations and rotations of a sliding optical window with respect to a nearby observed surface. We fit a multi-frame descriptor of smooth rigid motion to image subsequences: we use the descriptor to create transformed images from a reference image, compare these to the actual views, and apply Newton's method to minimize the sum of squared differences, solving iteratively for the polynomial coefficients in the motion descriptor. Our simulation results, using synthetic test images and real CCD camera images, indicate that the algorithm robustly estimates frame-to-frame motion parameters with sub-pixel precision. The rotation estimate, besides adding a degree of freedom to user interaction, supports a directionally consistent XY output that XY-only sensors cannot provide.
Introduction
Our work on a three-degree-of-freedom mouse, interoperable with standard mice, includes means to track rotation both by comparing multiple sensors, as in [McKenzie et al.] and [Bohn], and by better analysis of the data from the lens-and-camera hardware used in current XY-only optical mice. Upgrading only the algorithm and processing components means that existing mouse form factors, internal layout, button schemes, moulds and production lines can be used unchanged. Manufacture stays very similar, with little retooling, which preserves both cost and the breadth of user choice. Here we describe our algorithm less legalistically, and report early results from it.
A typical translation and rotation task: the green star is to be overlaid on the red one. (a) is done with a standard XY mouse, (b) with an XYθ mouse (Mushaca). (a) requires significantly more time and effort than (b).
Available data
As the mouse sensor moves (far left) over a wooden desktop, it 'sees' a changing image, usually with both colour variation and shadows from sidelit irregularities in the surface. The image illustrated at left, however, shows far more detail than the usual 18×18 CCD array can capture. The input for motion estimation is a changing 18×18 pixel array.
Industrially conventional processing (as in, e.g., [Agilent] sensors) identifies common features in these images to estimate the direction and amount of mouse movement. As the figure at right shows, 'feature' here cannot be as sharply defined as, for instance, the corners of a ∇, which line fitting can locate with sub-pixel precision. A great deal of ingenuity is needed to detect features robust enough to be identified from one image to the next, since each pixel is a weighted optical average of brightness levels over a small region of the pad, complicated by noise. One cannot (for instance) simply seek repeated grey levels.
In consequence, features are located to a precision of at best one pixel diameter. This is quite adequate for a classical mouse with a 1mm window moving at 10cm/sec and sampled every millisecond, which corresponds to a displacement of about two pixels per frame. With a screen refresh rate of 60/sec, the cursor position needs to be updated only about every 18 sensor images. Summing this many integer pixel counts for displacements δX and δY suppresses enough quantization noise to give the user a smooth experience.
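The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes the 1mm window spans the 18 pixels of the array; the variable names are ours, chosen for illustration. It gives just under two pixels per frame and roughly 17 sensor images per screen refresh, in line with the rounded values quoted above.

```python
# Back-of-envelope check of the translation figures quoted in the text.
# Assumption: the 1 mm window is imaged across the full 18-pixel array width.
WINDOW_MM = 1.0
PIXELS_ACROSS = 18
SPEED_MM_PER_S = 100.0   # 10 cm/sec
SENSOR_HZ = 1000.0       # one sensor image per millisecond
SCREEN_HZ = 60.0         # screen refresh rate

pixel_mm = WINDOW_MM / PIXELS_ACROSS
pixels_per_frame = (SPEED_MM_PER_S / SENSOR_HZ) / pixel_mm
frames_per_refresh = SENSOR_HZ / SCREEN_HZ

print(f"{pixels_per_frame:.2f} pixels per sensor frame")
print(f"{frames_per_refresh:.1f} sensor frames per screen refresh")
```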
However, for rotation this sensitivity is not
enough. A mouse in the hand, rotated for close control, may turn as
slowly as 10°/sec,
and the changing angle should be smoothly reported. The difference
between a pure (δX, δY)
translation and a turning motion may be seen
as a rotation about (say) a corner of the image. In one millisecond
that rotation is 10° ÷ 1000 = 0.01°, which moves no point in an
18×18 array by more than a hundredth of a pixel; nor, therefore, does
it move any feature by more. Feature locations defined to within one
pixel are adequate for detecting translation, but suppress rotation.
After the ≈100 steps needed for a rotation to change
a feature's location (relative to pure-translation
motion) by a whole pixel, the feature is long gone from the sensor window.
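The hundredth-of-a-pixel bound follows from the small-angle arc length. As a rough check, assuming the rotation is taken about one corner of the array so that the opposite corner is the farthest point:

```python
import math

# Check of the rotation sensitivity argument (names are illustrative).
ARRAY_SIDE = 18            # pixels along each side of the sensor array
RATE_DEG_PER_S = 10.0      # slow hand rotation quoted in the text
SENSOR_HZ = 1000.0         # one sensor image per millisecond

angle_per_frame = math.radians(RATE_DEG_PER_S / SENSOR_HZ)   # 0.01 degree, in radians
farthest = math.hypot(ARRAY_SIDE, ARRAY_SIDE)                # corner-to-corner distance
max_shift = farthest * angle_per_frame                       # arc length, in pixels
frames_for_one_pixel = 1.0 / max_shift

print(f"largest per-frame shift: {max_shift:.4f} pixel")
print(f"frames before any point shifts a whole pixel: {frames_for_one_pixel:.0f}")
```

Even the farthest corner moves well under 0.01 pixel per frame, so the ≈100 steps quoted above is, if anything, a conservative count.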
We estimate motion by bypassing the idea of features: use of the whole
image avoids discarding data. We also compare multiple successive
images for each estimate, rather than each image with the previous.
This exploits more pixel data, suppressing motion jitter by fitting
smooth motion functions, as well as averaging out the effects of
per-pixel noise. It also allows estimates that go beyond the
first derivatives (or first differences) of the motion.
The result is a close fit to the actual motion, including rotation.
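To make the whole-image idea concrete, here is a much reduced sketch, not the algorithm reported here. It fits a single (dx, dy, θ) between one reference image and one view, where the method above fits polynomial motion coefficients over several frames. It also uses a Gauss-Newton step with a finite-difference Jacobian as a stand-in for the Newton iteration. The texture function `surface`, the 24×24 reference size, the bilinear warp and the noise level are all assumptions made for illustration.

```python
import numpy as np

N, MARGIN = 24, 3            # 24x24 reference image, central 18x18 window
C = (N - 1) / 2.0            # rotations are taken about the window centre

def surface(x, y):
    """Hypothetical smooth desk texture, defined at any real (x, y)."""
    return (np.sin(0.45 * x + 0.3) * np.cos(0.35 * y)
            + 0.5 * np.sin(0.25 * x + 0.4 * y + 1.0))

def moved_coords(p):
    """Where each window pixel looks after the motion p = (dx, dy, theta)."""
    dx, dy, theta = p
    y, x = np.mgrid[MARGIN:N - MARGIN, MARGIN:N - MARGIN].astype(float)
    c, s = np.cos(theta), np.sin(theta)
    xr = C + c * (x - C) - s * (y - C) + dx
    yr = C + s * (x - C) + c * (y - C) + dy
    return xr, yr

def bilinear(img, x, y):
    """Sample img at real-valued (x, y) by bilinear interpolation."""
    x0 = np.clip(np.floor(x).astype(int), 0, img.shape[1] - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, img.shape[0] - 2)
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x0 + 1]
            + (1 - fx) * fy * img[y0 + 1, x0] + fx * fy * img[y0 + 1, x0 + 1])

def predict(ref, p):
    """Transformed image created from the reference for motion p."""
    return bilinear(ref, *moved_coords(p))

def fit_motion(ref, view, iters=10, h=1e-4):
    """Gauss-Newton minimization of the sum of squared pixel differences."""
    p = np.zeros(3)
    for _ in range(iters):
        r = (predict(ref, p) - view).ravel()
        J = np.empty((r.size, 3))
        for k in range(3):                      # central-difference Jacobian
            e = np.zeros(3)
            e[k] = h
            J[:, k] = (predict(ref, p + e) - predict(ref, p - e)).ravel() / (2 * h)
        p = p + np.linalg.lstsq(J, -r, rcond=None)[0]
    return p

gy, gx = np.mgrid[0:N, 0:N].astype(float)
ref = surface(gx, gy)                            # reference image
p_true = np.array([0.3, -0.2, 0.02])             # pixels, pixels, radians
rng = np.random.default_rng(0)
side = N - 2 * MARGIN
view = surface(*moved_coords(p_true)) + rng.normal(0.0, 0.01, (side, side))

p_est = fit_motion(ref, view)
print("true     :", p_true)
print("estimated:", np.round(p_est, 3))
```

Every pixel of the 18×18 window contributes to the three estimated parameters, which is why a rotation far too small to move any single feature by a pixel can still be recovered in this toy setting.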
A standard mouse reports motion relative to its own body. If the mouse
always moves exactly sideways, whatever path it traces on the desk, the cursor
moves only in X.
(Adjust the Windows ControlPanel⇒Mouse⇒PointerOptions⇒PointerSpeed to Slow, to see this best.)
Conversely, if the mouse moves straight but rotates as it goes, the cursor's path curves.
If — and only if — the rotation is known,
simple algebra turns 'sideways' and 'forward' back to
a geometrically consistent X and Y.
(Exactly which X and Y depends on the starting
attitude of the mouse, but does not change thereafter.)
Standard XY estimation algorithms cannot support this
without doubling the sensors. We estimate
(δ(sideways), δ(forward), δ(θ)),
and can adjust (δ(sideways), δ(forward))
to (δX, δY) at trivial cost.
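The 'simple algebra' is a plane rotation by the accumulated angle. A minimal sketch follows, under conventions that are our assumptions: θ is counterclockwise in radians, X is the mouse's initial sideways direction, Y is its initial forward direction, and each increment is rotated by the heading at mid-step. The function name `to_fixed_frame` is hypothetical.

```python
import math

def to_fixed_frame(increments, theta0=0.0):
    """Convert (d_side, d_forward, d_theta) increments to fixed-frame (dX, dY)."""
    theta = theta0
    out = []
    for d_side, d_fwd, d_theta in increments:
        mid = theta + 0.5 * d_theta          # heading half-way through the step
        dx = d_side * math.cos(mid) - d_fwd * math.sin(mid)
        dy = d_side * math.sin(mid) + d_fwd * math.cos(mid)
        out.append((dx, dy))
        theta += d_theta                     # accumulate the rotation estimate
    return out

# A mouse that always moves 'sideways' while turning through 90 degrees:
steps = [(1.0, 0.0, math.radians(1.0))] * 90
deltas = to_fixed_frame(steps)
X = sum(d[0] for d in deltas)
Y = sum(d[1] for d in deltas)
radius = 90.0 / (math.pi / 2)                # arc length / angle

print(f"XY-only sensor would report: X = 90.00, Y = 0.00")
print(f"with rotation known:         X = {X:.2f}, Y = {Y:.2f} (arc radius {radius:.2f})")
```

In this example an XY-only sensor sees 90 units of pure sideways motion, while the rotated increments trace the quarter circle the mouse actually followed.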
Results
Conclusions