3. Experiment

In order to validate the idea of using the aforementioned software-based perceptual manipulations, empirical evidence of the success of this methodology should be obtained. Because problems with VEs are so task-dependent, the best way to experiment with the software manipulation of visual cues is with a concrete example. The work in this thesis has been motivated by more than pure scientific interest, of course, and a practical application of the knowledge has been a driving force for this research.

3.1 Experiment Background

The mission of the Virtual Environment Technology for Training group at the Massachusetts Institute of Technology has been to investigate the manner in which newly-developed immersive interface techniques can be applied to the learning of complex tasks. One of the main projects of interest to both the group and the sponsor (the Naval Air Warfare Training Systems Center) is the development of a submarine simulator that is capable of teaching Navy personnel the basics of boat navigation on the surface of a harbor or bay. The project aims to improve understanding of the advantages and disadvantages of VE training over more traditional simulation methods (VETREC, 1992).

In marine navigation, one officer is in charge of making all steering decisions for the boat. On a submarine, this officer stands on the "sail" and gives navigation commands to the rest of the navigation crew who are located below-deck. This position is known as the "Officer of the Deck" or "OOD" (Levison, Tenney, Getty, & Pew, 1995; Zeltzer, Aviles, Gupta, Lee, Nygren, Pfautz, & Reid, 1994).

The OOD task centers around the visual recognition of several cues: the motion of the water, the texture of the water, and, most importantly, the presence of channel buoys and range markers. The OOD guides the boat through the channel marked by the buoys, and uses the range markers to ensure that the boat is centered in the channel. For more information on the OOD task and how it was selected, see Levison, Pew, and Getty (1994) and Levison et al. (1995).

The ability to see the channel markers is extraordinarily important to the performance of the OOD task. However, the buoys and range markers are not visible in the simulator at the distance they would be visible in the real world. No data has been collected on the performance in the OOD simulator without the navigation aids. However, the following snapshots of the same scene illustrate the difficulty presented by the lack of visibility:

Figure 3.1: The effect of scaling on visibility in the OOD simulator. (a) shows the unscaled scene from a particular viewpoint, while (b) presents the result of scaling. The objects in the distance are much more visible in (b) than in (a).

Obviously, the ability to see the buoys is impaired when no scaling method is used. Trying to navigate a channel without being able to see more than a few buoys ahead is very difficult. In addition, the range markers are not clearly visible, inhibiting their use as navigational aids.

Basically, the development of a simulator like the OOD requires sufficient realism to allow the task to be trained. In this situation, critical information is eliminated, removing realism and making the virtual world too unlike the real world. The lack of resolution, in this task, is not acceptable.

3.1.1 Geometry of the OOD model

The OOD simulator had a specific model of the world that was used to determine the visual relationships between objects in the computation of the graphics. In particular, the submarine sail was said to be 34 feet above the water, while the viewpoint was calculated to be 39 feet above the water by adding 5 feet for the height of a human observer's eyes. This number was based on the size of the submarine model used in the simulator. In addition, the simulation designers consulted Navy personnel to confirm the correctness of the model's dimensions (Pioch, 1995).

These same individuals also validated the size of the buoys, whose dimensions were originally given by the U.S. Coast Guard. The following figure shows the appearance and dimensions of the buoys in the OOD simulator:

Figure 3.2: The dimensions of the buoy model used in the OOD simulator.

Given the dimensions of the submarine and buoy models, we can now extend the models presented above to accommodate objects with a three-dimensional shape. The following figure shows a three-dimensional view of the perspective geometry in the OOD simulator:

Figure 3.3: A representation of the perspective geometry model used to display a buoy object in the OOD simulator.

The relevance of the previous calculations of perspective geometry is readily apparent. We can proceed by repeating the calculations presented above and incorporating a three-dimensional object that is non-regular in one dimension (rather than the flat 15 foot by 15 foot square used before).

Figure 3.4: A side view of the perspective geometry used in the OOD simulator to view a buoy object.

From this model of the side view of the buoy object, the following formulas can be derived:

(17) (18) (19)

The formula describing the number of pixels in the object is the same as before, as are the calculations to determine the location of the top and bottom points of the object, since they depend entirely on the object's visual angle.

Proceeding to the top view of the figure, we are presented with the dilemma of deciding whether to model the object using the width of the base or the width of the top.

Figure 3.5: A top view of the perspective geometry used to present a buoy object in the OOD simulator.

Using formulas (13) through (16), and substituting the visual angle for either the top or bottom of the buoy for , the formulas for the horizontal appearance of the buoy object can be found. Before choosing either the top of the object or the bottom (or some combination) as a basis for modeling, the dominating dimension should be found. That is, because of the disparity in the number of vertical and horizontal pixels and the disparity in the height of the buoy object versus its width, one of the sets of formulas will determine the distance at which the object is last visible. Intuitively, the horizontal () dimension would seem to dictate the cutoff point. This is easy to verify via substitution.

Substituting the parameters of the HMD and the OOD model (viewpoint height, buoy dimensions) into formula (4), the visual angle subtended by half of a pixel can be determined. Half a pixel is used since the graphics system rounds half pixels to full pixels.

(4)

Not surprisingly, since the aspect ratio matches the ratio of field of view and the number of pixels in either dimension, the half-pixel angle is the same in both dimensions. Solving equations (19) and (13) for distance, with the visual angle set equal to the half- or one-pixel angle, the threshold distance can be determined. First, the vertical dimension, using the half-pixel angle:

(19)

Now, solving for the horizontal case, using the one-pixel angle:

(13)

We use the one-pixel angle in the horizontal case since, in this model, the center line of the object would match up with the split between the middle two pixels in the display. Thus, to be visible, an object would have to cover half of each of those two pixels. Given the values derived, our intuition that the horizontal aspect of the object will cut off first is correct. Still, the final determination of the cutoff point should be made empirically. By simply observing the point at which the object disappears (which is possible since 0.037° is significantly greater than the limit of human visual acuity), the actual threshold distance was found to be 5,347 feet.

The presented models could deviate from the data for several reasons. Most notably, the rounding procedure for displaying an object that covers half-pixels is not easily found. Most graphics software and hardware systems bury simple pixel-rounding functions below many layers of other mechanisms. In addition, the answer may be very complex. Since a large number of graphics packages rely on blurring (anti-aliasing) to accommodate partial pixels, finding a simple answer for how a pixel is rounded off in non-anti-aliased images is very difficult. However, since the empirical value for the cutoff distance falls between the values determined for the top and bottom widths, we can be assured that our models are sufficient for describing the visual behavior of the buoy object at different depths.

In addition to determining the threshold distance for visibility, the first point at which the object is fully visible in the display should be calculated. Because the object could subtend a visual angle greater than the FOV or could be cut off by the size of the FOV, its visual angle will not be properly represented in the display. The near limit is given in Equation (7), but both horizontal and vertical components need to be considered. Also, the more comprehensive models should be used in the calculation. Solving for distance in the vertical case, where the object's visual angle is equal to half of the FOV, we find:

In the horizontal case, the near cutoff occurs when the visual angle of the object reaches the full FOV. The difference in the solutions to the two different dimensions can be elucidated by a quick examination of Figure 3.4 and Figure 3.5. Now, we solve for the near horizontal cutoff distance:

Thus, the near fully visible point is determined by the vertical constraint. Given the previous calculations, we now are able to compute the range of visibility of any display. This presentation addresses a specific case, but extending the calculations presented here to accommodate other HMDs or other scene models is trivial.

3.1.1.1 Determining Real World Visibility

Having determined the range of visibility for buoy objects, the next step is to try to establish an approximate value for the real-world range of visibility. The problem of determining real-world visibility is extraordinarily difficult. A great deal of estimation must be done in order to find any sort of reasonable solution, and error in the answer is likely to be significant.

The problem of determining real-world visibility is difficult for a number of reasons. The most obvious explanation is that target detection is a form of visual acuity, and visual acuity varies significantly from person to person (Boff & Lincoln, 1988). Not only is population variance a factor in visual acuity, but there are a number of factors that have been shown to strongly influence the detectability of an object. A short list of these factors includes:

Most of these elements receive treatment in the design of the experiment below. Human visual acuity, as treated in experimental psychology, is a sufficiently similar problem to the visibility trouble with HMDs that the methodology is the same.

Geise, in 1946, reported that visual acuity varies with distance. However, he noted that after a viewing distance greater than 5 m was reached the change in visual acuity was fairly minimal. He noted that visual acuity was about 1.5 times worse at 5 m than at 20 cm. These data suggest that visual acuity at a distance, while decreasing significantly, will remain close to acuity at near distances.

In addition to the constraints posed by human visual performance, the environment in which buoys are seen in the real world is highly variable. The time of day, the latitude and longitude, and the weather all determine the amount of illumination a buoy receives. The color and roughness of the water also play a part in the buoy's discriminability. Furthermore, not all buoys are seen with the water as a backdrop; some buoys are seen with a land mass behind them. The color of a land mass is also highly variable.

How can a reasonable estimate be derived if the variability in the real world is so great? One method is to solicit the experience of actual naval officers who have performed the OOD task in the real world. An experienced U.S. Navy Lieutenant explained the Navy's rules for buoy placement and distribution and claimed that, based upon his experience, buoys were visible at distances up to 3 miles. In addition, the officer pointed out the simple cases where "I normally could see that" in the simulation of a bay with which he was familiar. This also provided data suggesting a visibility threshold of about two or three miles (Pioch, 1995).

Independent of the variability of the real-world data, the need for better visibility in the OOD simulator is clear. The estimation of a threshold distance can be accomplished with some degree of accuracy. The thresholds based on pixellation in the simulator's display are much shorter than is needed to adequately represent the estimated threshold found for the real world.

3.1.1.2 Assumptions

For simplicity, we will focus only on the buoys as significant examples of the visibility problem, disregarding the other objects that also suffer from reduced visibility. Solving the visibility problem with the buoys is tantamount to solving the visibility problem with the other navigational aids and may eventually be extensible to other visibility problems.

In addition, the curvature of the earth is ignored in all calculations. The simulator models the earth as a flat plane, not as an oblate spheroid (ignoring the work of such notables as Christopher Columbus), in order to simplify the model and its dynamics. Thus, for the purposes of the following calculations, the Earth is flat.

Furthermore, the task in the OOD simulator is a training task (Levison, Pew, & Getty, 1994). While the focus thus far may seem to be directed towards human performance, the actual experimentation will attempt to assess not only the effects of poor spatial resolution on depth estimation performance, but also the effects of resolution on the training of depth estimation.

Because the project involves a simulator, realism is a significant operating constraint. The solution to the difficulties in visibility should attempt to match real-world visibility and depth perception as well as possible. Bearing this in mind, we turn back to the issues of perspective geometry.

3.1.1.3 Previous Work on Visibility in the OOD Simulator

Problems with the visibility of the buoys and range markers were first reported by Pioch (1995). That work describes the implementation of a piecewise linear scaling algorithm that makes the buoys and other navigational aids visible. However, the solution failed to account for a number of effects of perspective geometry and human performance.

Figure 3.6: The behavior of a previous solution to the visibility problem in the OOD simulator. The object is gradually scaled to twice its original size over the range from 1,000 to 2,000 feet.

This algorithm fails to account for the effects on distance estimation that will occur between 1,000 and 2,000 feet when the buoy fails to shrink at the correct rate. In fact, an examination of Figure 3.7 shows that the object will remain the same size from 1,000 feet to 2,000 feet. Thus, distance estimation will be confused, since users are normally able to discriminate a number of distinct depths in that range. Essentially, gaining additional pixels at a distance in this manner sacrifices all discriminability in the 1,000 to 2,000 foot range.

Figure 3.7: Graph showing that the previously-designed algorithm fails to minimize the distortion in distance estimation. It maintains the same visual angle for 1,000 feet, eliminating depth discrimination in that range.

Clearly, the previously designed method proposed for solving the visibility problems in the OOD can only be classified as an "engineering solution" for the immediate improvement of the simulator, since it was not based upon a robust investigation of the perspective geometry which determines the effects of size constancy and linear perspective.
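The loss of depth discrimination follows directly from the shape of the scaling ramp. The minimal Python sketch below assumes that the scale factor rises linearly from 1 at 1,000 feet to 2 at 2,000 feet and is held at 2 beyond that point (the hold is an assumption), and it uses the flat 15 foot target of the earlier models purely for illustration. The function names are hypothetical and are not taken from the simulator code.

```python
import math

WIDTH_FT = 15.0  # illustrative target width (the flat 15 ft square used earlier)

def ramp_scale(distance_ft):
    """Piecewise linear scale: 1x up to 1,000 ft, rising to 2x at 2,000 ft."""
    if distance_ft <= 1000.0:
        return 1.0
    if distance_ft >= 2000.0:
        return 2.0  # assumption: the scale is held at 2x past the ramp
    return 1.0 + (distance_ft - 1000.0) / 1000.0

def visual_angle_deg(width_ft, distance_ft):
    """Visual angle of an object of the given width, centered and seen head-on."""
    return math.degrees(2.0 * math.atan(width_ft / (2.0 * distance_ft)))

for d in (500, 1000, 1250, 1500, 1750, 2000, 3000):
    angle = visual_angle_deg(ramp_scale(d) * WIDTH_FT, d)
    print(f"{d:>5} ft  scale {ramp_scale(d):.2f}  angle {angle:.3f} deg")
```

Within the ramp the scale factor equals the distance divided by 1,000 feet, so the ratio of scaled width to distance, and hence the visual angle, does not change until the ramp ends.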

The problem of visibility in an HMD can be described as a threshold perception problem: an object receding into the distance has a definite point beyond which it can no longer be seen by the user. Software solutions to the threshold problem must be careful not to introduce errors in distance estimation. Two kinds of errors can result: bias errors and resolution errors. Distortion of the ability to discriminate the depth of an object is a bias effect. A change in the variability of a subject's response at a particular depth is a resolution effect. That is, if a visibility-enhancing algorithm does not significantly increase the mean error in a subject's reply, it has a minimal bias. If an algorithm does not significantly increase the variability in the responses, it has a minimal effect on resolution.
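As a concrete reading of these two definitions, the sketch below computes both quantities for a set of distance estimates made at one true depth. The response values are invented for illustration, the function name is hypothetical, and the use of the mean signed error and the sample standard deviation is one reasonable choice of statistic rather than the analysis used in the experiment.

```python
import statistics

def bias_and_resolution(true_distance_ft, responses_ft):
    """Return (bias, resolution) for estimates made at a single true distance.

    bias: mean signed error of the estimates.
    resolution: sample standard deviation of the estimates.
    """
    errors = [r - true_distance_ft for r in responses_ft]
    return statistics.mean(errors), statistics.stdev(responses_ft)

# Hypothetical estimates of a target that is actually 1,000 ft away.
unscaled = [950.0, 1000.0, 1050.0]
scaled = [1150.0, 1200.0, 1250.0]

print(bias_and_resolution(1000.0, unscaled))
print(bias_and_resolution(1000.0, scaled))
```

In this invented data the second condition shifts every estimate by the same amount, so it shows a bias effect with no change in resolution.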

Previous solutions have failed to account for both sorts of potential judgment problems caused by algorithmic distortions of the appearance of the object. A good solution should minimize the potential for distance estimation errors while extending the threshold visibility point.

3.1.1.4 Finding a Solution

An ideal solution to the visibility problem would maximize the distance over which an object is visible and minimize depth estimation errors; more simply, the best solution is the most realistic one in terms of bias and resolution. This constraint eliminates simple algorithms that might scale the target object by enough of a constant factor to make it visible at the distance required. Simply scaling the object introduces a significant distortion, especially when the distance between the observer and the object is small. In addition, extending the visibility of a 15 foot by 15 foot target object from two miles to three miles would require scaling the object to 22.5 feet, a distortion that would be clearly discernible at close distances.
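The arithmetic behind this figure can be sketched as follows, assuming the small-angle approximation in which visual angle is proportional to object size divided by distance, so that holding the threshold angle fixed requires the size to grow in proportion to the visibility limit. The function name is hypothetical.

```python
def constant_scale_factor(current_limit_mi, desired_limit_mi):
    """Uniform scale needed to move the visibility limit (small-angle approximation)."""
    return desired_limit_mi / current_limit_mi

factor = constant_scale_factor(2.0, 3.0)
scaled_side_ft = 15.0 * factor  # side of the 15 ft by 15 ft target after scaling

print(f"scale factor {factor:.2f}, scaled side {scaled_side_ft:.1f} ft")
```

Because the factor is constant, the object is also 1.5 times too large at every nearer distance, which is the distortion noted above.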

Another possible solution would be to extend the range over which an object is one pixel in size.

Figure 3.8: Extending the visibility by making the object stay one pixel in size until the desired visibility point is reached.

This would certainly extend the visibility but would eliminate the ability to discriminate depths over a large range of distances. According to Figure 3.8, an object would stay one pixel in size for about 13,000 feet.

Human perception utilizes more than just perspective cues in the perception of depth. A successful algorithm could utilize color and atmospheric cues to extend the visibility of an object. For example, a fraction of a pixel could be displayed by blending it with its neighboring pixels. This technique is known in computer graphics as anti-aliasing. Using anti-aliasing, an object could fade out into the background gradually as it recedes into the distance. Extending the distance could be achieved simply by reducing the rate at which it changes into the color of the background.

Recalling the discussion of realism above, this solution seems to be the best at first glance. It would minimize bias and resolution errors in distance estimation, while providing a fairly close approximation of what occurs in human depth perception. However, quantifying the results of such an algorithm would be extraordinarily difficult. While HMDs are fairly consistent in their resolution and FOV, they vary greatly in their ability to produce color.

In our work at M.I.T., we have noted a significant number of color differences between displays in the same brand of HMD. As discussed above (see Difficulties with Head-Mounted Displays), most HMDs introduce a significant amount of color distortion. The optics may introduce chromatic aberration, or the color range of a particular display technology may be limited (Barfield et al., 1995). Furthermore, incorrect application of anti-aliasing techniques can actually exacerbate the pixellation of depth (Christou & Parker, 1995). Thus, the logical choice was to find an algorithm that optimized the perspective geometry rather than utilizing a color change across the visible range of distances.

A successful solution should minimize the deviation from the expected visual angle, especially at near distances, to minimize distance estimation error while also extending the visibility at far distances. These criteria match the desire for realism and also provide sufficient improvement in task-specific performance. A realistic solution will deviate very little from the real-world visual angle and will present objects that are visible at real-world distances.

3.1.2 The Geometry of Optimal Solutions

The best way to avoid distorting depth judgment is to evenly distribute the pixels across the range over which an object should be visible. An algorithm could let the number of pixels subtended by the object be normal at the closest distance and slowly increase the size of the object as it moves away so that it subtends more pixels than it would otherwise.

The number of pixels subtended by an object at a distance can be increased in two ways. One, the size of the object can be scaled as a function of distance, where at the nearest distance the object is normal sized and is gradually scaled as it recedes so that it reaches the disappearance point at the minimum size. This is best illustrated by observing the effect of the algorithm on visual angle as a function of distance.

Figure 3.9: A plot showing the effects of the size scaling algorithm on visual angle as a function of distance. The deviation from the normal (dotted line) behavior of the visual angle is minimized.

To grasp this function completely, we recall that the empirical cutoff point for an unscaled object was 5,347 feet. Given this value for distance, we can determine the width of the object that serves as the actual determinant for cutoff:

(13)

This width serves as the basis for the scale factor at a particular distance. This is the width that is scaled to match the visual angle necessary for visibility at a particular distance, rather than just the top or bottom width. We can now present the formula for scaling the size of the object:
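A rough numerical version of this step is sketched below. It assumes a simple head-on model in which an object of width w at distance d subtends an angle of 2 atan(w / 2d), ignores the 39 foot viewpoint height, and takes the one-pixel angle to be twice the 0.037° half-pixel angle. Whether formula (13) has exactly this form is an assumption, and the function names are hypothetical.

```python
import math

ONE_PIXEL_ANGLE_DEG = 0.074    # twice the 0.037 degree half-pixel angle
EMPIRICAL_CUTOFF_FT = 5347.0   # observed disappearance distance, unscaled object

def visual_angle_deg(width_ft, distance_ft):
    """Visual angle of a centered object seen head-on (assumed model)."""
    return math.degrees(2.0 * math.atan(width_ft / (2.0 * distance_ft)))

def width_for_angle_ft(angle_deg, distance_ft):
    """Inverse of visual_angle_deg: the width that subtends angle_deg at distance_ft."""
    return 2.0 * distance_ft * math.tan(math.radians(angle_deg) / 2.0)

effective_width_ft = width_for_angle_ft(ONE_PIXEL_ANGLE_DEG, EMPIRICAL_CUTOFF_FT)
print(f"effective width: {effective_width_ft:.2f} ft")
```

Under these assumptions the effective width comes out at roughly 6.9 feet; the value obtained from the full model may differ.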

(20)

Despite appearances, this is a simple computation since the scale factor (the first fraction) can be computed before run time, so that the formula used is really:

The result of this algorithm is to stretch the depth ranges caused by pixellation to accommodate the improved visibility while, as shown in Figure 3.9, the distortion of visual angle subtended is minimized.
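The general behavior described here can be illustrated with a short sketch. It is not formula (20) itself but one hypothetical linear ramp that satisfies the two stated end conditions: the object is unscaled at a near distance, and at the desired visibility limit it subtends the same angle that the unscaled object subtends at the empirical cutoff. The near distance of 125 feet, the 3 mile limit (taken from the officer's estimate), the object width, and all names are assumptions made for illustration.

```python
import math

NEAR_FT = 125.0       # hypothetical distance at which scaling begins
CUTOFF_FT = 5347.0    # empirical cutoff distance of the unscaled object
TARGET_FT = 15840.0   # assumed desired visibility limit (3 miles)
WIDTH_FT = 6.9        # illustrative object width (assumption)

# Constant part of the scale factor, computed once before run time.
SLOPE = (TARGET_FT / CUTOFF_FT - 1.0) / (TARGET_FT - NEAR_FT)

def size_scale(distance_ft):
    """Scale factor: 1 at NEAR_FT, rising linearly to TARGET_FT / CUTOFF_FT at TARGET_FT."""
    if distance_ft <= NEAR_FT:
        return 1.0
    return 1.0 + SLOPE * (distance_ft - NEAR_FT)

def visual_angle_deg(width_ft, distance_ft):
    """Visual angle of a centered object seen head-on (assumed model)."""
    return math.degrees(2.0 * math.atan(width_ft / (2.0 * distance_ft)))

for d in (125, 1100, 3000, 13000, 15840):
    scaled = visual_angle_deg(size_scale(d) * WIDTH_FT, d)
    normal = visual_angle_deg(WIDTH_FT, d)
    print(f"{d:>6} ft  scale {size_scale(d):.2f}  normal {normal:.4f}  scaled {scaled:.4f}")
```

Unlike a ramp that holds the visual angle constant, this scale factor grows more slowly than distance, so the scaled visual angle still decreases at every step and distinct depths remain distinguishable.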

Figure 3.10: The stretching of depth ranges by the size scaling algorithm accommodates a large range of visible distances.

Finally, the effect of the algorithm can be best understood by comparing the scaled version of the buoy object with the unscaled version at a number of distances:

Figure 3.11: A comparison of the normal, unscaled buoy object to the buoy object altered by the size scaling algorithm. (a) At 125 feet. (b) At 1,100 feet. (c) At 3,000 feet. (d) At 13,000 feet. Note the motion of the buoy object towards the horizon, especially in (d).

Another algorithm utilizes a distortion of the field of view to improve visibility. By narrowing the FOV with increasing distance, an object appears normal at close distances but becomes larger when it is further away. This is best understood by imagining a pair of binoculars that dynamically increase magnification as an object gets further away. In this case, the increase in magnification is scaled so as to match the minimum size of the object with the desired visibility range.

The following graph shows how the distortion from the normal visual angle is minimized:

Figure 3.12: A plot showing the effects of the FOV distortion algorithm on visual angle as a function of distance. The deviation from the normal (dotted line) behavior of the visual angle is minimized.

The formula for the FOV distortion algorithm is given as:

(21)

The width used in the calculation is the same as in the size algorithm, thereby incorporating the empirical cutoff point into the visibility calculation. Formula (21) can be simplified by precalculating the scale factor:

As in the size-scaling algorithm case, the number of pixels and the visual angle subtended are distorted, while improving visibility to cover the desired range.

Figure 3.13: The stretching of depth ranges by the size scaling algorithm accommodates a large visible range.

Again, the properties of the FOV distortion algorithm are best understood in a visual comparison to the normal case:

Figure 3.14: A comparison of the normal, unscaled buoy object to the buoy object altered by the FOV distortion algorithm. (a) At 125 feet. (b) At 1,100 feet. (c) At 3,000 feet. (d) At 13,000 feet.

The FOV distortion algorithm, unlike the size-scaling algorithm, distorts the entire scene, not just the buoy object. However, since the distortion of the surrounding scene is not critical to the visibility of the target object, it is ignored. The usefulness of this algorithm decreases when the appearance of other objects in the scene is also important. To solve the scene-warping problem, the FOV can be distorted only when the graphics engine is drawing certain objects. The location of the object, however, will still be distorted.

The clever reader will note that the pixel distributions of the size algorithm and FOV algorithms should be identical. That is, the perceived size of an object at any point is theoretically the same for both algorithms. However, the assumption that the two algorithms will result in identical performance is incorrect, since the complete scenes that result from the application of the algorithms are not the same. The two algorithms, while predicting identical object sizes, do not generate identical locations on the display surface. Figures 3.11 and 3.14, in part (d), show the locations of the buoy object at 13,000 feet for the different algorithms. The size scaling algorithm will push the object further to the horizon because it does not distort the linear perspective cues, while the FOV algorithm will manipulate both the size constancy cues and the linear perspective cues.
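The difference between the two algorithms can be made concrete with a small pinhole-projection sketch. It assumes a level gaze with the horizon at the vertical center of the display, a flat earth, an object resting on the water and scaled about its base, and a base vertical FOV of about 36° (roughly 486 pixels at 0.074° each). The object height, the magnification of 2, and all function names are hypothetical.

```python
import math

EYE_HEIGHT_FT = 39.0      # viewpoint height in the OOD model
V_PIXELS = 486            # vertical pixels displayed by the HMD
BASE_VFOV_DEG = 36.0      # assumption: about 486 pixels x 0.074 degrees
OBJECT_HEIGHT_FT = 15.0   # illustrative object height (assumption)

def narrowed_fov_deg(base_fov_deg, magnification):
    """FOV that yields the given magnification under a pinhole projection."""
    half = math.radians(base_fov_deg) / 2.0
    return math.degrees(2.0 * math.atan(math.tan(half) / magnification))

def pixels_per_unit_slope(vfov_deg):
    """Screen pixels per unit of (height / distance) for the given vertical FOV."""
    return (V_PIXELS / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)

def size_scaled_view(distance_ft, magnification):
    """(object height, base offset below horizon) in pixels when the object is scaled."""
    k = pixels_per_unit_slope(BASE_VFOV_DEG)
    return (k * magnification * OBJECT_HEIGHT_FT / distance_ft,
            k * EYE_HEIGHT_FT / distance_ft)

def fov_distorted_view(distance_ft, magnification):
    """(object height, base offset below horizon) in pixels when the FOV is narrowed."""
    k = pixels_per_unit_slope(narrowed_fov_deg(BASE_VFOV_DEG, magnification))
    return (k * OBJECT_HEIGHT_FT / distance_ft,
            k * EYE_HEIGHT_FT / distance_ft)

size_h, size_off = size_scaled_view(13000.0, 2.0)
fov_h, fov_off = fov_distorted_view(13000.0, 2.0)
print(f"size scaling:   height {size_h:.2f} px, base {size_off:.2f} px below horizon")
print(f"FOV distortion: height {fov_h:.2f} px, base {fov_off:.2f} px below horizon")
```

Under these assumptions both algorithms give the object the same height on the display, but the FOV distortion also magnifies its offset from the horizon, so the size-scaled object sits closer to the horizon.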

The solutions presented are sufficient since they provide an extended visible range to the observer while attempting to minimize depth judgment errors. In this manner, they improve the overall realism of the simulator and meet the task-specific requirements for visibility. However, this sufficiency is theoretical, and empirical evidence concerning the usefulness of these algorithms should be obtained before a full conclusion is reached.

3.2 Method

VE systems are prone to a variety of adjustment problems and other experimental noise (see section entitled Problems with Virtual Environments). Moreover, the association with a practical problem is necessary to provide a real, rather than academic, engineering solution. The experimental method described below determines the effectiveness of algorithms for improving visibility.

3.2.1 Subjects

Six students at the Massachusetts Institute of Technology participated in the study. Subjects were required to fill out forms in compliance with the Committee on the Use of Humans as Experimental Subjects and were paid for their time. The subjects ranged in age from 17 to 22. Half the individuals had vision corrected by contact lenses; the other half had normal or nearly normal vision. None of the subjects had any prior experience navigating boats or with other maritime activities which might give them knowledge of buoy location and identification. Three subjects were male and three subjects were female.

3.2.2 Apparatus

The visual stimuli for the task were presented using a Silicon Graphics Onyx (with a RealityEngine2 graphics board). Software for generating the stimuli was developed using the Performer Library from SGI. Data-collection and experiment-control programs were developed in C.

The graphics were shown using a Virtual Research VR4 head-mounted display. The spatial resolution for each eye is given in product literature as 742 pixels by 230 pixels (Virtual Research, 1995). However, the HMD took as input an NTSC composite video signal with a resolution of 648 pixels by 486 pixels. As noted earlier, the frame buffer sometimes clips the image (Rolland et al., 1995), and this was tested empirically for the VR4 HMD. The HMD was revealed to be capable of displaying 646 pixels by 486 pixels; therefore, this was the resolution used in subsequent models. The displays measured 1.3 inches across the diagonal, which means that a pixel subtended 0.074° of both horizontal and vertical visual angle.
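The relationship between these display figures can be checked with a few lines of arithmetic. The sketch below assumes that every pixel subtends the same angle (a linear approximation that ignores optical distortion), so the field of view values it derives are implied estimates rather than manufacturer specifications. The variable names are hypothetical.

```python
H_PIXELS = 646            # horizontal pixels actually displayed
V_PIXELS = 486            # vertical pixels actually displayed
PIXEL_ANGLE_DEG = 0.074   # visual angle subtended by one pixel

# Half-pixel angle used as the visibility threshold in the earlier models.
half_pixel_angle_deg = PIXEL_ANGLE_DEG / 2.0

# Implied field of view, assuming uniform angular pixel size.
implied_hfov_deg = H_PIXELS * PIXEL_ANGLE_DEG
implied_vfov_deg = V_PIXELS * PIXEL_ANGLE_DEG

print(f"half-pixel angle: {half_pixel_angle_deg:.3f} deg")
print(f"implied FOV: {implied_hfov_deg:.1f} deg x {implied_vfov_deg:.1f} deg")
```

The half-pixel angle of 0.037° matches the threshold used in the visibility calculations of Section 3.1.1.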

Figure 3.15: The characteristics of the HMD display used in the experiment.

Responses were collected using the BBN Hark Voice Recognition System. The system used a "press to talk" button and a Sennheiser microphone headset. A simple grammar was written to recognize spoken numbers from 1 to 99,999 (see Appendix A). The Hark system ran on a Silicon Graphics Indy computer. Responses were sent to data recording software running on the Onyx via standard Ethernet connections. In addition, audio feedback was given to the subject via the Sennheiser headset and an identical headset mounted on the HMD. Audio feedback consisted of the playback of a recorded message asking for a repeat of a response that had confused the recognition system. A low-level test of the Hark system resulted in an average accuracy rate of 97% on a fairly simple grammar (Pioch, 1995). This was considered more than suitable for the needs of the experiment.

For an in-depth treatment of the design and implementation of the VETT core testbed hardware and software systems, the reader should consult Zeltzer et al. (1994).

3.2.3 Design

The primary experimental goal was to assess two visibility-enhancing algorithms in terms of human perceptual performance. The usefulness of these algorithms in improving the training of depth perception was also investigated.

The experiment design was influenced by two overall factors: performing experiments on far-field visibility and accommodating the constraints of the OOD simulation. The selection of only a part-task of the OOD navigation task allowed for careful simplification of problems of immersion and simulator fidelity to problems that could be resolved experimentally. The environment and target stimulus were presented in such a way as to be consistent with the OOD simulator. However, in the interest of reducing experimental noise, the OOD models were not followed precisely.

For instance, the landmasses and clouds were removed from the scene. This was done to avoid the introduction of conflicting depth cues. In addition, the buoy object's position at a far distance would place some of its pixels next to those of a landmass, thus presenting a different background color. An inconsistent background color at far distances could cause different perceptions of depth at the same distance and was thus unacceptable. Furthermore, the color of the background could also influence target detection. The clouds were flat textures mapped into the sky. Because it was unknown how the clouds would be interpreted in depth, they were classified as noise and removed.

In addition, all other buoys in the model of the channel were removed, as were range markers and turning beacons. The presence of these other features would clearly influence the perception of depth of the target object. The submarine model that would normally be visible in a forward view was also removed. The remaining scene consisted of a flat plane that had a water texture mapped onto it and a sky that was lit by an overhead light that approximated the sun.

Figure 3.16: A number of elements were removed from the OOD simulation shown in (a) to get a noise-free environment for experimentation. (b) shows a sample scene from the experiment.

The target object was reduced to a frustum from a more complicated model. This was done to ensure that the underlying graphics software would only have six polygons to interpret and display rather than the twenty in the original model.

Figure 3.17: The buoy model used in the experiment (left) and the buoy model used in the OOD simulator (right).

The behavior of the graphics package at a far distance was unknown with respect to handling multiple polygons. If the more complex model is assumed, and the object has been reduced to one pixel, we can hypothesize that a case exists where the model with fewer polygons will be visible and the other will not.

Figure 3.18: The predicted effect of using the simpler buoy model. The simple model (left) translates to two pixels because it fills more than half of the two pixels it covers, while the original OOD model (right) is displayed as only one pixel because of its shape.

Obviously, the advantage of using the model with fewer polygons outweighs the usefulness of adhering to the exact model used in the OOD simulator. Moreover, since the new model does not differ from the previous one in its dimensions and proportions, the geometry discussed above does not change.

The target object was chosen to be red. In the OOD simulator, buoys are red, green and yellow, and have small white numbers labeling them. The decals were preserved since they did not interfere with the color of the object beyond a certain distance. Color, however, was determined to be a significant factor in determining depth in a pilot experiment. Thus, to reduce complexity, a single color was chosen.

Finally, the experiment differed from the simulator in that the point of view was fixed so that the buoy was directly straight ahead. The viewpoint was not based upon feedback from a head position tracker. Motion of viewpoint during acuity and target detection tests significantly reduces accuracy (Boff & Lincoln, 1988). In addition, motion sickness associated with tracked-head motion was avoided (Kennedy et al., 1992).

The direction (heading) of the viewpoint was chosen randomly after pilot experimentation showed a significant effect of direction on accuracy. The water texture provides an important depth cue. Failure to randomize direction could result in the use of the texture as the main depth cue, rather than the size and shape of the object.

In order to fully address the issue of performance in the OOD simulator, the effects of training had to be considered. That is, since the OOD simulator was designed to train individuals at a task, performance of a sub-task in the OOD model should also be considered as a training task. Thus, the experimental goal was not only to assess the visibility-extending algorithms in terms of human perceptual performance, but also to examine their usefulness in improving the training of depth perception.

The subjects' task was to estimate the distance of the target object in feet. The units of measurement were chosen so that the subjects (hereafter: Ss) could give a sufficiently fine-grained response. Also, the units were influential in the accuracy scores on the initial assessment trial, since some transfer effects from real world-based expectations were observed. That is, the different subjects would form preconceptions based upon the units about how to judge depth in the experiment.

The target object was presented according to the perspective geometry discussed in detail above. Two identical control conditions and the two algorithms described above at two different distance thresholds (10,560 feet and 13,160 feet) constituted the six experimental conditions. Each S started on a different condition. The following Latin square was used to remove place-in-order effects:

Table 3.1: The Latin square distribution of conditions, days, and subjects used in the experiment. Since six subjects were used, order effects between which algorithm was used could be counterbalanced. Algorithm-used was chosen over visibility distance since the interaction effect of the algorithms was deemed more important.

Two dependent variables were recorded. The first consisted of the verbal report received from the subject and ranged in value from 1 to 99,999, or -1 if the subject replied "I can't see that." The other dependent variable recorded was reaction time (RT). RT was measured as the time from the display of the stimulus to the receipt of the reply from the Hark system. The RT measurement did not subtract the time needed to speak different responses. That is, the amount of time needed to speak "thirteen thousand, one hundred and twenty-four" is larger than the time needed to say "forty," and this discrepancy was unaccounted for.

The main measure of performance in the experiment was the deviation between the Ss' responses and the distance presented according to the perspective geometry. This difference is referred to as "absolute error," which is not to be confused with "standard error" in later statistical calculations.

Distances were chosen from within ranges of depth. The depth ranges were selected so that thirty distances picked from within thirty depth ranges would constitute a set of trials. Depth ranges were chosen such that they would range from the near visibility point (calculated previously to be 123 feet) to just beyond 13,160 feet (2.5 miles). This was done to ensure that an equal number of trials would be presented for each experimental condition.

Distributing the depth "buckets" linearly across these distances made little sense. The geometrical models for the OOD simulator predict that the target object will be displayed at the same number of pixels over certain ranges, and the size of those ranges at the furthest distance is far larger than a linearly-chosen depth bucket. This implies that buckets at a distance should be larger. In addition, the number of invisible trials in the control and closer visibility point conditions was minimized to increase the number of useful data points.

Table 3.2: The ranges of depth from which distances were selected in the experiment. The depth buckets were determined according to the following equation:

(23) This formula increases the depth bucket size at the larger viewing distances to account for the expectation of decreased depth acuity at those ranges. In addition, this calculation accounts for the desire to collect relatively similar numbers of data points across distances. Too many points presented in the far range would overtrain on those points, while too few would not yield enough data to determine the effects of the algorithms at a far distance.

Trials in which the stimulus was not visible were presented in order to balance the total number of trials per condition. Keeping the number of trials constant and presenting only visible trials in a particular condition introduced the problem of training depth estimation on one condition more effectively on a particular range. Varying the number of trials and keeping the size of the depth ranges constant presented problems with the time needed to complete the trial and the amount of training for each condition. Presenting invisible trials seemed to be the best solution, even though the display of invisible trials between visible trials could interfere with the training of distance estimation. The issue of the number of trials is discussed in detail below.

The methods presented above represent a significant attempt at reducing noise inherent in the simulator. Certain problems were unavoidable (such as the color distortion in the HMD), but others were minimized or eliminated. Unfortunately, a major characteristic of experimentation in VEs is the difficulty of properly eliminating confounding factors.

3.2.4 Procedure

Subjects were solicited via advertisements sent to electronic mailing lists. Only Ss who could perform the experiment on six consecutive days were selected. Ss were scheduled to run over a seven-day period (four subjects on Days 1 through 6, two subjects on Days 2 through 7). Arranging times was difficult, but Ss were scheduled to do the experiment at the same time every day when possible.

Upon arrival on the first day, Ss completed forms in compliance with the Committee on the Use of Humans as Experimental Subjects. In addition, a set of instructions was presented (see Appendix B), and questionnaires on marine experience were completed. Finally, Ss filled out paperwork detailing their subjective physiological state (e.g. did they feel nauseous, light-headed, etc.). On subsequent days, only a short set of instructions and the physiological surveys were required. The experimenter examined the responses regarding physiological state to see if there were any conditions that may interfere with the S's well-being during experimentation. Then, Ss were asked if they had any questions about what they were asked to do; short clarifications would be given if required.

Notably, Ss were asked to make fine-grained responses. Pilot tests showed that Ss had a strong tendency to estimate distance rather than guess. Performance improved when Ss made finer-grained responses (e.g. "4,435" vs. "4,500"). Therefore, Ss were encouraged to use more digits in their estimation (see Appendix B).

Because the HMD eliminated all vision except for that inside the helmet, recording Ss' responses became an issue. A keyboard could not be used to enter responses since it would require some typing training and mistyped responses would be difficult to catch. Instead, the Hark voice recognition system was chosen for its reliability and its speaker-independent recognition. The Hark system has a recognition rate that can approach 100%. Unfortunately, in practice, the recognition rate is about 95%. For an experiment with 17,280 total data points, a 5% miss rate could represent a loss of 864 data points, clearly not an acceptable condition. With practice, however, individuals can become accustomed to the system and achieve much higher hit rates. In addition, having the Ss speak their responses allowed the experimenter to easily monitor the hit rate and make corrections as needed.

Therefore, after completing the paperwork, Ss donned the Sennheiser headset-microphone to perform a simple training regimen on the Hark voice recognition system. A number from 1 to 99,999 or the phrase "I can't see that" was presented on the screen of the workstation. The Ss would press the "push to talk" button and speak their response. If a response was not recognized, the subjects were informed of possible problems via information printed onscreen and a replay of the recorded verbal request, "Could you repeat that?" The experimenter was on hand throughout the process to monitor the S and to provide assistance in case of any difficulty. In the case of a misrecognized response, the experimenter would record the trial and the correct response and later correct the data to reflect the given response. In this manner, near 100% accuracy of response recognition could be achieved. After the first day, Hark training was reduced to a much shorter set of trials.

Once the Hark training was complete, Ss were seated comfortably. The HMD had a foam seal that prevented most external light from interfering with vision. The experiments were conducted in a room with no windows, and lighting was reduced to a single computer screen (approximately 12 cd/m²) used for experiment control and data collection monitoring.

Figure 3.19: The experiment station. A subject sits comfortably, wearing the HMD and holding the "push to talk" button for the voice recognition system.

Ss were encouraged to keep their head positioned straight ahead so as to have the displayed horizon match what would be expected in the real world. Following the arrangement of the S at the experiment station, verbal instructions on the adjustment of the HMD were given. A test pattern was displayed during this adjustment so as to ensure a proper fit. The subject adjusted the headstraps, the interocular distance, and the eye relief of the HMD to obtain the clearest picture. Again, verbal clarifications on the experimental procedure were offered.

The stimulus was presented in two different ways. Feedback trials displayed the stimulus until a response was given, then showed a number indicating the correct depth of the target object superimposed on the scene. The correct response was displayed for 1.25 seconds. On assessment trials, the correct distance was not displayed. In both feedback and assessment conditions, successive trials were separated by a blank screen shown for 0.75 seconds.

Figure 3.20: The timing of a typical assessment trial. After the stimulus was displayed and the subject gave a response at time N, the screen was blanked.

Figure 3.21: The timing of a typical feedback trial. After the stimulus was displayed and the subject gave a response at time N, the correct answer was displayed. Then, the screen was blanked.

A set of assessment trials was followed by three sets of feedback trials. Every four sets of trials, Ss received a break so as to reduce mental fatigue from repeating the task, physical fatigue from supporting the weight of the HMD, and visual fatigue from the optics of the display.

Figure 3.22: The ordering of breaks and trials for one subject during a typical day's run.

The first and third breaks were approximately three minutes long. During these breaks the Ss were told to remove the HMD but to remain seated. The midway break was ten to fifteen minutes long and Ss were encouraged to walk around outside the lab. After a break, the test pattern would be displayed and the Ss would readjust the HMD. Upon completion of that day's trials, Ss again filled out a physiological state form and were paid.

3.3 Results

On a given day of the experiment, a subject performed 16 sets of trials, 4 assessment and 12 feedback (see Figure 3.22), for a total of 480 data points. Over the six days of the experiment, each S performed a total of 2,880 trials, some of which were discarded because of noise. Noise included skipped trials caused by improper use of the voice recognition system. A pause during the enunciation of a reply could confuse the system into thinking two replies had been given (e.g. "four thousand [pause] two hundred" was recognized as 4,000 and 200, not 4,200). In such cases, the two replies were appended (by the experimenter) and the second trial discarded. Of a total of 17,280 points, 87 were discarded because of skips.
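The trial counts quoted above follow from the design by simple multiplication. The short sketch below reproduces that arithmetic; the variable names are illustrative only, and the figure of thirty trials per set is taken from the description of the depth ranges earlier in this chapter.

```python
# Trial-count bookkeeping for the experiment (names are illustrative).
TRIALS_PER_SET = 30   # one distance drawn from each of the thirty depth ranges
SETS_PER_DAY = 16     # 4 assessment sets + 12 feedback sets
DAYS = 6
SUBJECTS = 6
SKIPPED = 87          # trials discarded because of voice-recognition skips

per_day = TRIALS_PER_SET * SETS_PER_DAY
per_subject = per_day * DAYS
total = per_subject * SUBJECTS
usable = total - SKIPPED

print(per_day, per_subject, total, usable)
```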

The performance of the subjects was evaluated only in the assessment sets of trials. Only trials where the S responded with an estimate of depth were considered. Those trials on which the S replied, "I can't see that" were discarded. Using these criteria, a total of 3,223 data points were considered. The data from a typical set of assessment trials are shown in Figure 3.23.

Figure 3.23: A typical assessment trial set. The condition represented here was the size scaling algorithm with a cutoff distance of 10,560 feet. The circles at 0 feet represent cases where the subject replied that he or she could not see the buoy object.

The control condition included 4 sets of assessment trials for each subject and was used on two separate days for a grand total of 1,440 data points. However, since about half of the distances displayed were invisible, the actual number of data points was 760. The following table provides a gross summary of the performance of the Ss on the control condition:

Table 3.3: The mean, standard deviation, and standard error for RT and absolute error for the control condition. Again, the RT measure does not account for the time it took an S to speak a reply, thus those numbers should be considered to be much noisier than the error measure.

Assuming that the data is normally distributed over subjects and that the subjects form a representative sample of the population, we can analyze the significance of population variance on performance in the assessment trials.

Table 3.4: The by-subject data compiled for an analysis of variance. Subject is the independent variable and error is the dependent variable in the calculation.

According to the ANOVA presented in Table 3.4, the difference in performance between subjects over all visible assessment trials was statistically significant, F = 7.447, p < .0001.

The performance of the subjects varied with the distance presented. Performance was assessed both by the mean error and the standard deviation. Absolute mean error corresponds to the effect of bias and standard deviation corresponds to the effect of resolution. By examining the effect of distance on both mean error and its standard deviation in the control case, we can compare the effects of the visibility-enhancing algorithms on the bias and accuracy of depth estimation.
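As a minimal sketch of how the two measures could be computed per depth range, consider the following. The function name, the tuple layout, and the sample values are hypothetical, and the sketch assumes that the standard deviation is taken over absolute errors, which the text does not state explicitly. Trials coded -1 ("I can't see that") are excluded, as in the analysis.

```python
from collections import defaultdict
from statistics import mean, stdev

CANT_SEE = -1  # code recorded when the subject replied "I can't see that"

def summarize_by_depth_range(trials):
    """trials: iterable of (depth_range, presented_ft, response_ft).

    Returns {depth_range: (mean_abs_error, std_abs_error)} using only trials
    with a numeric estimate. The standard deviation is None when a depth
    range has fewer than two usable trials.
    """
    errors = defaultdict(list)
    for depth_range, presented, response in trials:
        if response == CANT_SEE:
            continue  # invisible-reply trials carry no distance estimate
        errors[depth_range].append(abs(response - presented))
    return {
        r: (mean(e), stdev(e) if len(e) > 1 else None)
        for r, e in sorted(errors.items())
    }

# Made-up trials for illustration only (not experimental data).
sample = [(1, 150, 140), (1, 200, 230), (7, 2000, 2600), (7, 2500, 1500), (7, 2200, -1)]
summary = summarize_by_depth_range(sample)
for depth_range, (bias, resolution) in summary.items():
    print(depth_range, bias, resolution)
```

Under this reading, a larger mean corresponds to greater bias and a larger standard deviation to poorer resolution within the depth range.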

Figure 3.24: A plot of error (the absolute deviation between the presented distance and the subjects' response) versus depth range for the control condition. The error increased as a function of distance.

Figure 3.25: A graph of the standard deviation of mean error (averaged over subjects) versus depth range for the control condition. The variance of the responses increased as a function of distance.

Figures 3.24 and 3.25 show the effect of distance on absolute average error and accuracy in the control case. In the control condition, only images that were presented in depth ranges 1 to 17 (distances of 123 feet to 5,612 feet) were visible. The cases utilizing the visibility-extending algorithms covered depth ranges 1 to 27 (with a cutoff distance of 10,560 feet) and depth ranges 1 to 30 (with a cutoff distance of 13,160 feet).

We can see the results of changing the cutoff distance used in each algorithm by plotting the mean error and the standard deviation of error as a function of distance.

Figure 3.26: A plot of mean error as a function of depth range for the various cutoff distances. The control case has fairly equivalent accuracy to the algorithm-enhanced cases over the ranges it is visible (1 to 17). The 10,560 ft. case cuts off at depth range 27.

Figure 3.27: A plot of mean error as a function of depth range for the various cutoff distances. The control case has, as expected, better accuracy over the ranges it is visible (1 to 17). The 10,560 ft. condition is last visible in depth range 27.

Figure 3.26 shows that the mean error of the responses continues to increase with distance even in the extended visibility cases. Furthermore, the accuracy of the responses decreased with distance. The size scaling algorithm and the FOV distortion algorithm did not yield identical performance results.

Figure 3.28: A plot of mean error as a function of depth range for the two algorithm conditions. The control case is only visible over depth ranges 1 to 17 (123 feet to 5,612 feet). The FOV distortion algorithm is better than the size scaling algorithm for all depth ranges from 16 to 29.

The standard deviation of the error (Figure 3.29) also increased with distance, although it varied more in the near ranges.

Figure 3.29: A graph of the variance in mean error as a function of depth range for the two algorithm cases. Mean standard deviation is calculated as the average for all six subjects. The control case is only visible over depth ranges 1 to 17 (123 feet to 5,612 feet).

To summarize the effect of the different algorithms and cutoff distances, the mean error and standard deviation for all conditions and subjects were compiled. To properly compare the different conditions, only data from comparable distances can be evaluated. This complicates the analysis, but permits a more fine-grained investigation of the results. Table 3.5 summarizes the by-subject mean error.

Table 3.5: The mean error (given in feet) for the various conditions and subjects.

The control algorithm showed better performance than either algorithm-enhanced case over the depth ranges 1 to 17. The average absolute error on the cases with a 10,560 foot cutoff distance was better than in the cases with a 13,160 foot cutoff. Also, the size scaling algorithm resulted in better performance than the FOV algorithm.

Before performing significance testing on these results, the accuracy for the various conditions should be examined.

Table 3.6: The standard deviation of error for the various conditions and subjects.

The control algorithm showed the best depth estimation resolution over the depth ranges 1 to 17. Also, the resolution performance of the size scaling algorithm at both cutoff distances was much better than the performance of the FOV distortion algorithm at those distances. Increasing the cutoff distance decreased the accuracy of the responses.

By assuming a normal distribution of the data, we can perform a series of analyses of variance to determine the significance of the information in Table 3.5 and Table 3.6. Again, only data from similar depth ranges can be directly compared.
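For readers unfamiliar with the statistic, the following is a minimal sketch of how an F value is formed in the simplest, single-factor case. The analyses reported in this chapter are two-factor with an interaction term, so this sketch illustrates only the principle. The function name and the sample values are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups: list of lists of observations, one inner list per factor level.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up error values for three hypothetical conditions.
f_value, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value, df_b, df_w)
```

The resulting F is then compared against the F distribution with the two degrees of freedom to obtain a p value; that lookup is omitted here.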

Table 3.7: An analysis of variance treating data collected from assessment trials performed in depth ranges 1 to 17 over all conditions. The topmost table shows the effect of cutoff distance, depth range, and their interaction on average absolute error. The middle table displays the effect of the algorithm used, depth range, and their interaction on average absolute error. The bottom table displays the effect of condition, depth range, and their interaction on average absolute error.

The difference in performance between the cutoff distances was not statistically significant for depth ranges 1 to 17. Furthermore, the algorithm had no significant effect on performance. The condition, which represents the four combinations of the algorithms and cutoff distances as well as the control, was also not statistically significant.

As expected, the effect of depth range was strong, indicating that the influence of distance on performance is statistically meaningful. However, by analyzing only the algorithm-enhanced cases over depth ranges 1 to 27, more data points can be considered (although the control cannot be compared).

Table 3.8: An analysis of variance treating data collected from assessment trials performed in depth ranges 1 to 27 over the algorithm-enhanced conditions. The topmost table shows the effect of cutoff distance, depth range, and their interaction on average absolute error. The middle table displays the effect of the algorithm used, depth range, and their interaction on average absolute error. The bottom table displays the effect of condition, depth range, and their interaction on average absolute error.

As with the results for depth ranges 1 to 17, the effects of cutoff distance and condition were not statistically significant and the effect of depth range was strong. No interaction effects were observed. Interestingly, the algorithm used was significant, F = 22.801, p < .0001. By referring to Table 3.5, we can draw the conclusion that the size scaling algorithm conditions had significantly better performance than the FOV distortion algorithm cases for depth ranges 1 to 27. Next, we compare the algorithms for the full range of depths.

Table 3.9: An analysis of variance treating data collected from assessment trials performed in depth ranges 1 to 30 over the algorithm-enhanced conditions. The table displays the effect of the algorithm used, depth range, and their interaction on average absolute error.

To properly calculate the data in Table 3.9, only the conditions with a cutoff distance of 13,160 feet were examined. Comparing the performance of the two algorithms with an F-test at the 99% confidence level showed that the size scaling algorithm was better than the FOV distortion algorithm.

In summary, the performance in the algorithm-enhanced cases did not differ significantly from the control case for the depth ranges 1 to 17. The accuracy decreased with distance. The size scaling algorithm was significantly better than the FOV distortion algorithm over all distances.

Before examining the effects of the algorithms and the various cutoff distances on training, we should clarify what is meant by training in this experiment. Training performance is given by both the final trained performance and the rate at which that performance is achieved. Performance implies both mean error and variance of the responses; therefore, the analysis of training should include both bias and resolution effects. Because of the size of the experiment, the number of assessment trials was somewhat limited, so the learning curve for a particular condition has only four data points.

Figure 3.30: The learning curves of error for the various conditions. The control condition had the best final trained mean error. Only assessment trial means are plotted. All five conditions are shown in the leftmost plot, which shows training on the first 17 depth ranges. The middle plot compares the learning curves of the algorithm-enhanced cases for depth ranges 1 to 27. The rightmost plot shows the behavior of the two conditions that were visible over all 30 depth ranges.

Comparisons of the different conditions can only occur on the same sets of depth ranges. Because the control case is only visible on depth ranges 1 to 17, the performance of the algorithm-enhanced conditions can only be assessed versus the control on that range. Figure 3.30 shows that most conditions had a learning curve with a negative slope, implying that the performance of the subjects improved as they were trained longer. Notably, both size scaling algorithm cases show a positive learning curve slope on depth ranges 1 to 17.
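One simple way to attach a number to the slope of a four-point learning curve is an ordinary least-squares fit through the assessment-set means. The sketch below is an illustration under that assumption; the thesis does not state how slopes were read from the plots, and the function name and sample values are hypothetical.

```python
def least_squares_slope(ys):
    """Slope of the least-squares line through (1, y1), (2, y2), ..., (n, yn)."""
    n = len(ys)
    xs = range(1, n + 1)  # assessment set number
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

# Made-up mean errors (feet) over four assessment sets.
improving = least_squares_slope([900, 800, 700, 600])
worsening = least_squares_slope([600, 650, 640, 700])
print(improving, worsening)
```

A negative slope means that error fell across assessment sets (positive learning); a positive slope means that error rose.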

The significance of the training effect for the various conditions is assessed by comparing the effect of the number of the trial on performance. Again, analysis can only be performed over comparable distances.

Table 3.10: An analysis of variance treating data collected from assessment trials performed in depth ranges 1 to 17 over all conditions. The topmost table shows the effect of algorithm, trial set number, and their interaction on average absolute error. The middle table displays the effect of the cutoff distance, trial set number, and their interaction on average absolute error. The bottom table displays the effect of condition, trial set number, and their interaction on average absolute error.

The significance of the training performance is revealed by the analysis of variance shown in Table 3.10. The significance of the trial set number, combined with the negative slopes in the learning curves shown in Figure 3.30, implies that training did take place. The data in Table 3.10 also suggest that the difference in error between conditions at certain trial sets was statistically significant.

Table 3.11: An analysis of variance treating data collected from assessment trials performed in depth ranges 1 to 27 over algorithm-enhanced conditions. The table shows the effect of algorithm, trial set number, and their interaction on average absolute error.

By looking at depth ranges 1 to 27, we can analyze the significance of the different algorithms on training. A training effect is quite apparent in Table 3.11, and the performance significantly changes with the different algorithms. By looking at Figure 3.30 we can conclude that the size scaling algorithm resulted in better performance than the FOV distortion algorithm. No interaction effect was found.

Table 3.12: An analysis of variance treating data collected from assessment trials performed in depth ranges 1 to 30. Only the two cases with a cutoff distance of 13,160 feet are considered. The table shows the effect of algorithm, trial set number, and their interaction on average absolute error.

An examination of the training effect on all 30 depth ranges reveals that training did take place, although no significant effect of algorithm was found.

In summary, significant training took place on all conditions. Training did not take place uniformly across all depth ranges. The algorithm used was shown to be significant for some depth ranges. Observation of the learning curves shows that positive learning took place for most conditions.

The effect of presenting invisible trials on training performance is difficult to determine. The control case had more invisible trials than the other conditions, and thus training performance may have been affected. However, as discussed earlier, the need to present an equal number of trials over all conditions outweighed the desire to have the same number of visible trials.

Table 3.13: The number of normal, false-negative, false-positive cases for the various conditions.

The algorithms did have an effect on the false-visible case, as shown in Table 3.13. The number of trials where the subject claimed they could see an object when the model predicted it would be invisible was greater for the algorithm-enhanced cases. Furthermore, the control condition had more cases where the object was presented at a distance that was expected to be invisible and the subject responded that it was visible. Clearly, the effect of the algorithms on target detection should not be entirely discarded as trivial; although we will see that the sources of false-visible and false-invisible claims are easily discovered.

3.4 Discussion

We have shown that the error in estimating the location of an object in depth is a function of the viewing distance (see Table 3.7 and Table 3.8). This corroborates the assertions made earlier regarding the number of pixels subtended for a given distance (see Equations 11 and 14). Because the number of pixels subtended is constant across particular ranges of depth, distances in those ranges can not be discriminated.
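The quantization argument can be illustrated with a generic perspective model. The sketch below is not a reproduction of Equations 11 and 14: it assumes a simple visual-angle calculation, and the field of view, pixel count, object size, and function names are placeholder values chosen for illustration rather than the parameters of the HMD or the buoy model.

```python
import math

def pixels_subtended(size_ft, distance_ft, fov_deg=40.0, pixels_across=240):
    """Whole pixels covered by an object of the given size at the given distance.

    fov_deg and pixels_across are placeholder display parameters, not the HMD's.
    """
    angle_deg = math.degrees(2 * math.atan(size_ft / (2 * distance_ft)))
    return round(angle_deg * pixels_across / fov_deg)

def constant_pixel_runs(size_ft, distances):
    """Group consecutive distances that map to the same pixel count.

    Returns a list of (pixels, first_distance, last_distance) tuples.
    """
    runs = []
    for d in distances:
        p = pixels_subtended(size_ft, d)
        if runs and runs[-1][0] == p:
            runs[-1][2] = d          # extend the current run
        else:
            runs.append([p, d, d])   # start a new run
    return [tuple(r) for r in runs]

# A hypothetical 10-foot object sampled every 500 feet.
for pixels, near, far in constant_pixel_runs(10, range(500, 8001, 500)):
    print(pixels, near, far)
```

With any such model, the runs of constant pixel count grow longer as distance increases, which is the property that limits depth discrimination at the far ranges.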

Furthermore, this assertion implies that if we were to map the depth ranges for a particular model, the accuracy in that range would depend entirely upon the size of the range. For example, if the object was two pixels from, say, 7200 feet to 10000 feet, and a number of trials were run to determine accuracy in that range, we would expect a mean error in the Ss' reply of 1400 feet. However, the linear perspective cue causes additional fragmentation of the depth ranges which results in better performance. An object may remain the same size, but move a pixel towards the horizon.

The effect of the linear perspective cue is difficult to assess because of the disappearance-reappearance and growth-shrinkage problems. Because the object may completely disappear at some depth ranges or grow in size with distance, performance in a particular depth range is hard to predict.

The control condition of the experiment reveals the effect of pixellation. An examination of Figure 3.24 shows a flattening of the curve at depth ranges 7 through 17. Looking at Table 3.2, we see that depth ranges 7 through 17 correspond to distances 1,994 feet to 5,612 feet. Then, observing the pixellation behavior shown in Figure 3.13, we see that, in the control case, the object is one pixel in size from about 2,000 feet to the cutoff distance. So, the mean error across depth ranges 7 through 17 should be about the same, which explains the plateau seen in Figure 3.24. A similar effect can be observed in the algorithm-enhanced cases, bearing in mind the logarithmic distribution of the depth ranges.

Pixellation explains the degradation of depth perception as a function of distance in an HMD and points to the effects of distance on acuity in the real world. If we are modeling the appearance of the real world in our VE, the lack of discriminability due to the spatial resolution of the HMD is only somewhat appropriate. Since the decrease in acuity with distance occurs much faster than in the real world, the issues of visibility and depth judgment are still problematic.

As a solution to the discrepancies in the degradation of depth perception between the real world and the virtual world, two algorithms were proposed. The experiment showed that the algorithms could easily extend the visibility in the simulation, without significantly affecting performance. Only performance in depth ranges 1 to 17 could be compared to a control, but no significant difference was noted in that range.

The manner in which the algorithms extend visibility may have had an effect on training. However, the most relevant aspect of the algorithms is their ability to extend visibility without significantly affecting the error. The algorithms pushed the cutoff distance to 2 and 2.5 miles as two conditions of the experiment. As the size of a depth range was stretched to accommodate a greater range of distances, the mean error increased, but not significantly on the depth ranges 1 to 17.

The size scaling algorithm showed, for the full range of depths, a significantly better effect on performance than the FOV distortion algorithm. A possible explanation is that the size scaling algorithm took better advantage of linear perspective. Because the FOV distortion algorithm distorted the distance of the object to the horizon, it had the same number of linear perspective steps as the control. The size scaling algorithm caused the object to have more steps towards the horizon. This may have provided the additional discriminability.

The results of the experiment show that the final training performance of the control case for depth ranges 1 to 17 (see Figure 2.20) was better than any of the algorithm-enhanced conditions. This is reasonable since more pixel steps occurred in the control case over that range.

Given the significant effect of the trial set number on performance, the subjects clearly displayed a learning effect. In most cases, the learning was positive: the subjects (Ss) performed with less error and better accuracy as they received more feedback trials and practice. The cases in which the learning curve had a positive slope (negative learning) all occurred with the size scaling algorithm, and only over depth ranges 1 to 17; positive training was observed when the full range from 1 to 27 was considered. This suggests that training at the farther depth ranges had a poor influence on training at the closer ranges.
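The sign convention for the learning curve can be illustrated with a short least-squares sketch. This is not the analysis performed in the experiment; the function name and the sample values in the usage lines are invented for illustration and are not the experiment's data. It shows only that a negative slope of mean error against trial set corresponds to positive learning, and a positive slope to negative learning.

```python
def learning_slope(mean_errors):
    """Least-squares slope of mean error against trial-set index.

    Trial sets are indexed 1, 2, ..., n. A negative result means
    error fell with practice (positive learning); a positive result
    means error rose (negative learning).
    """
    n = len(mean_errors)
    if n < 2:
        raise ValueError("need at least two trial sets to fit a slope")
    xs = range(1, n + 1)
    x_mean = (n + 1) / 2.0
    y_mean = sum(mean_errors) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, mean_errors))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Invented values, for illustration only.
    print(learning_slope([4.0, 3.0, 2.0, 1.0]))  # error falling
    print(learning_slope([1.0, 1.5, 2.5, 3.0]))  # error rising
```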

The training implications of the experiment are limited. The experiment was subject to design constraints which prevented noise-free and easily comparable training across all conditions. Because of the higher goal of realism, the best way to assess training effectiveness is with a transfer of training experiment. This kind of test cannot be easily accomplished for the OOD task. Thus, the results regarding training performance should be taken only as guidelines, while the concrete geometrical and performance models should be considered as empirically and rigorously justified. The difficulties in performing and interpreting training experiments in virtual environments are discussed extensively by Lintern (1996).

The conclusions that can be drawn from the analysis of absolute mean error and accuracy are highly useful. Error and variance increase with distance, but may plateau over a depth range where the size of the object is constant. We have discovered that our algorithms extend visibility without introducing a significant change in depth estimation ability (based upon comparison with the control over depth ranges 1 to 17). The size scaling algorithm was shown to be significantly better than the FOV distortion algorithm.

The ramifications for the OOD simulator are obvious. The size scaling algorithm should be implemented to extend visibility in the simulator without introducing significant distortions in depth estimation. The implications of using the algorithms for training performance should be examined further if the ability to train depth perception is determined to be a critical component of the task. A solution to the visibility problem in this scenario has been presented and empirically justified.
