From: Jian Chen [jchen3@Bayou.UH.EDU]
Sent: Thursday, October 04, 2001 10:08 AM
To: Jeff Pierce
Cc: bowman@VT.EDU; 3dui List
Subject: RE: Comparison of HMD and Dome

Thanks Dr. Pierce and Dr. Bowman.

Is there a way to transfer between these two, if I may ask? Because if we
want to evaluate both, the model creation is time consuming, and the human
subjects need much more time (our human tests last one year).

best,
Jian
----------------------------
Computer Science Department
University of Houston

On Wed, 3 Oct 2001, Jeff Pierce wrote:

> At 11:55 AM 10/3/01, Doug Bowman wrote:
> >> Generalizing the tasks performed in VEs is exactly what I am doing. Your
> >> "testbed evaluation" paper is really useful.
> >>
> >> There is one point where I got confused. I noticed that the model of the
> >> testbed (as in other previous research) was created based on tasks. For
> >> instance, for selection and manipulation, a scene of boxes was created;
> >> for travel, a town-type scene was created. But in our study, we did all
> >> tests using only one model. The VE is a room with a set of 15 colored
> >> balls, cubes, tori, cylinders, and pyramids (5 of each type, 3 of each
> >> color) along one wall and 15 matching platforms along the opposite wall.
> >> The task requires subjects to move the 15 balls on the left side of the
> >> room over to the matching 15 platforms on the right side of the room.
> >> The subject has to go through a maze of walls, while avoiding the walls.
> >> We measure 6-DOF manipulation performance, task completion time,
> >> sickness, and so on. Notice that subjects have to navigate to an object,
> >> pick it up, navigate again, and then drop it. Two constant speeds are
> >> allowed.
> >>
> >> Does it matter that we measure them together? My program did record all
> >> the points (in each loop) the subject traveled to. Should I separate the
> >> task performances and discuss them?
> >
> >Jian,
> >
> >If you're looking for generalizable results, my personal feeling is
> >that it's best to separate the tasks and control all the outside factors
> >as much as possible. That way, when you do your analysis you can find
> >out statistically which factor is responsible for any changes in
> >performance.
> >
> >There are some other people on the list who disagree with me (would Jeff
> >like to chime in here?).
>
> How can I resist the opportunity to engage in thesis procrastination? =)
>
> For the rest of the list who're wondering what Doug is talking about, he
> and I had a short discussion about how well the results from particular
> tasks correspond to the results from real work. For example, many testbed
> tasks involve the manipulation of generic shapes (e.g. cubes, spheres) to
> allow experimenters to isolate the contributions of individual factors
> (e.g. size, distance). If we instead had users manipulate familiar objects
> (e.g. chairs), their task performance could be affected because they
> recognized the objects and made assumptions about their properties (e.g.
> size, distance). The advantage of using these types of tasks is that we
> can learn a great deal about the contributions of individual factors (e.g.
> how does doubling the size of a cube affect task performance?). The
> disadvantage is that the results do not necessarily "generalize" as well
> to real work. When engaged in real work, users have that extra information
> available: they're working with familiar objects, allowing them to take
> advantage of the known properties of those objects.
>
> When you create these types of tasks you're choosing a particular point
> on a spectrum: more confidence about the contributions of individual
> factors in exchange for less confidence about how well the results
> transfer to real work. Fred Brooks wrote a paper in 1988 that discusses
> this spectrum; it's worth a read if you haven't looked at it:
>
> Frederick P. Brooks. Grasping Reality Through Illusion: Interactive
> Graphics Serving Science. CHI 1988 Proceedings, pages 1-11.
>
> We can also choose a point closer to the other end of the spectrum. On
> this end you get more confidence about how well your results reflect real
> work, but you pay with less confidence about the effects of individual
> factors. Consider a case where I want to learn whether technique A or
> technique B is better for arranging objects in a scene. If I create tasks
> of this type (e.g. moving furniture around a room, moving rides around an
> amusement park) I will arguably have more confidence in the result than
> if I make people move cylinders around a featureless environment.
> However, I won't be able to state the effects of size with as much
> confidence. If users are much less accurate positioning furniture at 500
> feet than amusement park rides, is the difference because of the relative
> size of the objects or because of the types of objects (furniture vs.
> rides)?
>
> The trick, of course, is determining where on the spectrum you should be.
> I tend to lean toward the latter end of the spectrum because I'm an
> engineer at heart. If I need to choose a technique for a VE where drama
> students will be prototyping stage layouts, I'm probably better served by
> the latter type of study. On the other hand, to learn all about a
> particular technique and what makes it tick, you're probably better
> served by the former type of study. If you're more of a scientist
> interested in learning Truth, you probably lean toward that end.
>
> The question you need to answer is what exactly you want to learn. Doug's
> recommendation (separate the tasks and control all the outside factors as
> much as possible) will help you draw conclusions about how particular
> factors affect performance in a particular display. For example, you
> might learn that when you double the size of the spheres, performance
> gets faster in the HMD but not in the dome.
>
> On the other hand, all you might care about is whether training is more
> effective in display A than in display B. In this case you need to focus
> more on making your tasks resemble real work than on controlling the
> individual factors. If you're training people to pull a piece of the
> International Space Station from the shuttle, navigate through space, and
> snap the piece into place, make your tasks similar to that and don't
> worry about the individual factors.
>
> So what do you want to learn? How individual factors affect performance
> in a particular display? Whether users will be more effective working in
> display A than display B? My impression is that you're trying to learn
> the former, so Doug's suggestion is the way to go.
>
> Jeff
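
A minimal sketch of the separation Doug suggests, in Python, assuming
(hypothetically) that each record Jian's program logs is a tuple
(time_sec, x, y, z, carrying), where "carrying" marks whether the subject
is currently holding an object. The field names, the flag, and the code
below are illustrative assumptions, not part of the original study:

    # Sketch: split one trial's logged trajectory into travel vs. manipulation
    # phases so the two can be analyzed separately, as Doug recommends.
    # Assumed (hypothetical) record layout: (time_sec, x, y, z, carrying).

    def split_phases(log):
        """Return a list of (phase, start_time, end_time, path_length) segments."""
        segments = []
        if not log:
            return segments
        phase_start = log[0]
        current_phase = "manipulation" if log[0][4] else "travel"
        path_len = 0.0
        for prev, cur in zip(log, log[1:]):
            dx, dy, dz = cur[1] - prev[1], cur[2] - prev[2], cur[3] - prev[3]
            path_len += (dx * dx + dy * dy + dz * dz) ** 0.5
            phase = "manipulation" if cur[4] else "travel"
            if phase != current_phase:
                segments.append((current_phase, phase_start[0], cur[0], path_len))
                current_phase, phase_start, path_len = phase, cur, 0.0
        segments.append((current_phase, phase_start[0], log[-1][0], path_len))
        return segments

    def phase_totals(segments):
        """Aggregate total time per phase for one trial."""
        totals = {}
        for phase, t0, t1, _ in segments:
            totals[phase] = totals.get(phase, 0.0) + (t1 - t0)
        return totals

Running each trial's log through split_phases and phase_totals would give
separate travel and manipulation measures per trial, which could then be
compared across the HMD and the dome without re-running the experiment.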