From: Jeff Pierce [jpierce@cs.cmu.edu]
Sent: Wednesday, October 03, 2001 2:25 PM
To: bowman@vt.edu; 3dui List
Subject: RE: Comparison of HMD and Dome

At 11:55 AM 10/3/01, Doug Bowman wrote:
>> Generalizing the tasks performed in VEs is exactly what I am doing. Your
>> "testbed evaluation" paper is really useful.
>>
>> There is one point I am confused about. I noticed that the model for the
>> testbed (as in other previous research) was created based on tasks. For
>> instance, for selection and manipulation, a boxes scene was created; for
>> travel, a town-type scene was created. But in our study, we ran all tests
>> using only one model. The VE is a room with a set of 15 colored balls,
>> cubes, toruses, cylinders, and pyramids (5 of each type, 3 of each color)
>> along one wall and 15 matching platforms along the opposite wall. The task
>> requires subjects to move the 15 objects on the left side of the room over
>> to the matching 15 platforms on the right side of the room. The subject
>> has to go through a maze of walls while avoiding the walls. We measure
>> 6-DOF manipulation performance, task completion time, sickness, and so on.
>> Notice that subjects have to navigate to an object, pick it up, navigate
>> again, and then drop it. Two constant speeds are allowed.
>>
>> Does it matter that we measure them together? My program did record all
>> the points (in each loop) that the subject traveled to. Should I separate
>> the task performance measures and discuss them separately?
>
>Jian,
>
>If you're looking for generalizable results, my personal feeling is
>that it's best to separate the tasks and control all the outside factors
>as much as possible. That way, when you do your analysis you can find
>out statistically which factor is responsible for any changes in
>performance.
>
>There are some other people on the list who disagree with me (would Jeff
>like to chime in here?).

How can I resist the opportunity to engage in thesis procrastination? =)

For the rest of the list who are wondering what Doug is talking about, he and I had a short discussion about how well the results from particular tasks correspond to results from real work. For example, many testbed tasks involve the manipulation of generic shapes (e.g. cubes, spheres) to allow experimenters to isolate the contributions of individual factors (e.g. size, distance). If we instead had users manipulate familiar objects (e.g. chairs), their task performance could be affected because they recognize the objects and make assumptions about their properties (e.g. size, distance).

The advantage of using these types of tasks is that we can learn a great deal about the contributions of individual factors (e.g. how does doubling the size of a cube affect task performance?). The disadvantage is that the results do not necessarily "generalize" as well to real work. When engaged in real work, users have that extra information available: they're working with familiar objects, so they can take advantage of the known properties of those objects. When you create these types of tasks you're choosing a particular point on a spectrum: more confidence about the contributions of individual factors in exchange for less confidence about how well the results transfer to real work. Fred Brooks wrote a paper in 1988 that discusses this spectrum; it's worth a read if you haven't looked at it:

Frederick P. Brooks. Grasping Reality Through Illusion: Interactive Graphics Serving Science. CHI 1988 Proceedings, pages 1-11.

We can also choose a point closer to the other end of the spectrum.
On this end you get more confidence about how well your results reflect real work, but you pay with less confidence about the effects of individual factors. Consider a case where I want to learn whether technique A or technique B is better for arranging objects in a scene. If I create tasks of this type (e.g. moving furniture around a room, moving rides around an amusement park), I will arguably have more confidence in the result than if I make people move cylinders around a featureless environment. However, I won't be able to state the effects of size with as much confidence. If users are much less accurate positioning furniture at 500 feet than amusement park rides, is the difference because of the relative sizes of the objects or because of the types of objects (furniture vs. rides)?

The trick, of course, is determining where on the spectrum you should be. I tend to lean toward the latter end of the spectrum because I'm an engineer at heart. If I need to choose a technique for a VE where drama students will be prototyping stage layouts, I'm probably better served by the latter type of study. On the other hand, to learn all about a particular technique and what makes it tick, you're probably better served by the former type of study. If you're more of a scientist interested in learning Truth, you probably lean toward that end.

The question you need to answer is what exactly you want to learn. Doug's recommendation (separate the tasks and control all the outside factors as much as possible) will help you draw conclusions about how particular factors affect performance in a particular display. For example, you might learn that when you double the size of the spheres, performance gets faster in the HMD but not in the dome. On the other hand, all you might care about is whether training is more effective in display A than in display B. In this case you need to focus more on making your tasks resemble real work than on controlling the individual factors. If you're training people to pull a piece of the International Space Station from the shuttle, navigate through space, and snap the piece into place, make your tasks similar to that and don't worry about the individual factors.

So what do you want to learn? How individual factors affect performance in a particular display? Whether users will be more effective working in display A than in display B? My impression is that you're trying to learn the former, so Doug's suggestion is the way to go.

Jeff
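(A minimal sketch of the kind of factorial analysis that Doug's "separate the tasks and control the outside factors" approach enables: testing whether an object-size effect differs between the HMD and the dome. The file name, column names, and the pandas/statsmodels calls are illustrative assumptions, not anything from the study or tools discussed in this thread.)

    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    # Hypothetical per-trial log (assumed format): one row per trial with the
    # display used, the object-size condition, and the completion time.
    trials = pd.read_csv("trials.csv")  # columns: display, size, completion_time

    # Two-way ANOVA with an interaction term. A significant display:size
    # interaction is the statistical form of "doubling the sphere size speeds
    # things up in the HMD but not in the dome."
    model = smf.ols("completion_time ~ C(display) * C(size)", data=trials).fit()
    print(anova_lm(model, typ=2))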