Situation Awareness, Mental Workload, and Trust in Automation: Viable, Empirically Supported Cognitive Engineering Constructs

Raja Parasuraman, George Mason University
Thomas B. Sheridan, Massachusetts Institute of Technology and Department of Transportation Volpe Center
Christopher D. Wickens, Alion Science and Technology, Micro Analysis & Design Division

ABSTRACT: Cognitive engineering needs viable constructs and principles to promote better understanding and prediction of human performance in complex systems. Three human cognition and performance constructs that have been the subjects of

much attention in research and practice over the past three decades are situation awareness (SA), mental workload, and trust in automation. Recently, Dekker and Woods (2002) and Dekker and Hollnagel (2004; henceforth DWH) argued that these constructs represent "folk models" without strong empirical foundations and lacking scientific status. We counter this view by presenting a brief description of the large science base of empirical studies on these constructs. We show that the constructs can be operationalized using behavioral, physiological, and subjective measures, supplemented by

computational modeling, but that the constructs are also distinct from human performance. DWH also caricatured as "abracadabra" a framework suggested by us to address the problem of the design of automated systems (Parasuraman, Sheridan, & Wickens, 2000). We point to several factual and conceptual errors in their description of our approach. Finally, we rebut DWH's view that SA, mental workload, and trust represent folk concepts that are not falsifiable. We conclude that SA, mental workload, and trust are viable constructs that are valuable in understanding and predicting human-system

performance in complex systems.

ADDRESS CORRESPONDENCE TO: Raja Parasuraman, Arch Lab, MS 3F5, George Mason University, 4400 University Dr., Fairfax, VA 22030, rparasur@gmu.edu.

Journal of Cognitive Engineering and Decision Making, Volume 2, Number 2, Summer 2008, pp. 140–160. DOI 10.1518/155534308X284417. © 2008 Human Factors and Ergonomics Society. All rights reserved.

Introduction

Cognitive engineering and, more generally, human factors/ergonomics (HF/E) depend on the development of viable constructs and principles to promote better understanding of human performance in complex systems. HF/E research puts forward theoretically based constructs and tests their generality through empirical studies in a wide variety of laboratory tasks, simulators, "microworlds," and actual
work domains. A desirable feature is that the constructs and/or principles are quantifiable in the form of either mathematical or computational models. At the same time, HF/E practice is concerned with the judicious application of general principles identified through this kind of research to the design of systems to support human performance. Well-supported constructs

and general principles that have enjoyed success according to these criteria include stimulus-response compatibility, Fitts' law, visual search models, quantification of working memory capacity, and the crossover model of tracking performance (Fitts, 1951; Jagasinski & Flach, 2006; Proctor & van Zandt, 1984; Wickens & Hollands, 2000). In this article, we discuss three more recent human performance constructs that have been the object of much attention in cognitive engineering research and practice over the past three decades: situation awareness (SA), mental workload, and trust in

automation. We chose to reexamine these areas for three reasons. First, there is now an extensive, mature record of empirical research that has deepened scientific knowledge and understanding of these human performance areas. The time is therefore right to take stock, to examine whether these constructs are worthwhile, and to see what has been learned. We do not attempt a comprehensive review but instead summarize the main points of each. We provide evidence on the scientific status of these three human performance constructs and argue that they are viable and valuable for use in HF/E

practice. A third reason for our examination is that these and related questions were discussed in two recent articles by Dekker and colleagues. The issues were raised in the context of an attack on the viability of these HF/E constructs and their use in methods for function allocation between humans and automation. The relevant articles are by Dekker and Woods (2002) and Dekker and Hollnagel (2004; henceforth Dekker, Woods, and Hollnagel, or DWH). On the basis of a highly selective review of some previous articles published by us (e.g., Parasuraman, Sheridan, & Wickens, 2000), DWH argued that

constructs such as SA and trust in automation (complacency) are "folk psychology" concepts (or "folk models") that are without empirical foundation and lacking scientific status. In this article, we counter this view by pointing to evidence on the operationalization and empirical basis of these constructs. We also show that their scientific status is as strong as any in cognitive engineering or HF/E. DWH also stated that accident and other applied investigators—as well as, by implication, HF/E researchers who study SA, mental workload, and trust—use these constructs as "explanations" for

real-world accidents and incidents in an uncritical and circular way. We agree that some consumers of HF/E research do unfortunately bandy
about these constructs in an overgeneralized way occasionally—for example, the military pilot commander who extols the virtue of the "right stuff" and equates it with high SA. However, the leading researchers do not, and they often point to moderating effects and other contributing factors that may limit their ability to account for complex real-world phenomena. DWH

appear to have focused on the writings of the applied community, often published in non-peer-reviewed outlets, and neglected to mention the large body of empirical research that constitutes the scientific literature on workload, SA, and trust.

Are SA, Mental Workload, and Trust in Automation Just "Folk Models"?

DWH took HF/E researchers and practitioners to task for using constructs such as SA, mental workload, and trust in automation. DWH used the term folk model to refer mainly to SA and to complacency. However, complacency has been studied primarily in relation to automation and can be

subsumed under the more general area of trust in automation. Specifically, complacency can be related to trust in automation when it is greater than that warranted based on automation reliability or robustness. Furthermore, workload not only is encountered in everyday parlance (by "common folk") but also is sometimes used uncritically by HF/E consumers—in the manner that DWH pointed out—to "explain" real-world incidents. DWH referred to workload in both articles and stated that high workload could "possibly be" an explanation of observed problems, so it is clear that this concept shares the same characteristics they ascribed to SA and complacency. We therefore

take DWH's critique to apply to each of these three human performance constructs. DWH characterized these constructs as "folk models" or "folk psychology." DWH apparently intended the characterization to be pejorative, with the folk tag meant to show that these constructs lack scientific status. We strongly disagree. But before we bury Caesar, let us praise him. Hence, we view DWH's indictment of SA, mental workload, and trust as mere folk model concepts not as deprecatory but rather as irrelevant. We argue, as will
be subsequently supported by reference to literature reviews covering many hundreds of studies, that

these constructs are as well operationalized as any in HF/E research. In fact, we challenge DWH to cite a larger body of supporting empirical evidence for their proposed concept of "team play." By empirical evidence, we mean controlled experimental studies and/or validated computational models and not just subjective observations, analytical exercises, or personal opinions. The supporting evidence we describe includes experimental studies, converging measures, and careful quantification and modeling efforts that have been conducted in the examination of SA, workload, and trust. In the following, we emphasize

both the scientific attributes underlying the three constructs (e.g., their operational definitions, predictions, and falsifiability) as well as their diagnostic value in assisting HF/E practitioners to formulate specific solutions or remedies when any of the three constructs suggests suboptimal human-system interaction. We go on to specifically address the concerns of DWH regarding treatments of automation in function allocation.

Situation Awareness

DWH saw little merit in the construct of SA for HF/E research and practice, instead consigning it to the folk model bin. For example, in

Dekker and Hollnagel (2004), they stated,

Human factors have [sic] a sizeable stock of concepts that are used to express insights about the functional characteristics of the human mind that underlie complex behavior. The labels refer to concepts that are intuitively meaningful in the sense that everyone associates something with them, so they feel that they understand them. The use and popularity of these labels are evidence of the psychological strategies people turn to when confronted with the daunting complexity of modern technological systems. In scientific terms it can be seen as a

tendency to rely on folk models. The concrete reason for exploring these questions is the recent problem of human performance created by high levels of automation on commercial aircraft flight decks. Various provisional explanations for the observed problems have been proposed, including complacency, loss of situation awareness, and loss of effective crew resource management. The problem is that the value of the constructs hinges on their commonsense appeal rather than their substance. Situation awareness lacks a level of detail and so fails to account for a psychological mechanism

needed to connect features of the sequence of events to the outcome. The first and most evident characteristic of folk models is that they define their central constructs, the explanandum, by substitution rather than
decomposition or reduction. So instead of explaining the central construct by statements that refer to more fundamental and presumably better known explananda, the explanation is made by referring to another phenomenon or construct that itself is in equal need of explanation. (pp. 79–80)

In addition, in reference

to SA and complacency, DWH stated that "none of these folk claims have a strong empirical basis" (Dekker & Woods, 2002, p. 242). We provide these somewhat lengthy quotes so that the reader may understand the nature of DWH's arguments against the empirical or scientific basis of SA and other constructs. However, the charge that SA is without empirical foundation is itself without foundation. In the wake of the development of the SA construct, early critiques questioned its definition and scientific status (e.g., Sarter & Woods, 1991). But now there is a substantial body of research pointing to its viability (e.g., Endsley, 2006; Tsang & Vidulich, 2006; Wickens, 2008; Wickens et al., 2008). A key

feature of SA is that the concept is diagnostic of different human operator states and, therefore, specifically prescriptive as to different HF/E remedies when SA is insufficient (described later). Endsley's (2006) enduring definition of SA is "the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future" (p. 529). This is a useful definition that remains fairly intact and respected by researchers and practitioners to this day. The prescriptive usefulness of this

definition derives as much from what SA is not as it does from what it is. Most important, SA represents a continuous diagnosis of the state of a dynamic world. As such, there is a "ground truth" against which its accuracy can be assessed (e.g., the objective state of the world or the objective unfolding of events that are predicted). SA is therefore quite distinct from (e.g., it is not) the choice or decision of what action to take as a consequence of the diagnosis. This is because there is no ground truth for choice, because choice must be made on the basis of explicit or implicit

values of outcomes, and these values can legitimately vary from person to person. SA, on the other hand, is value free. Accurate choice will depend on good SA, but choice is not the same as SA. This important distinction echoes the partial independence of diagnosis and choice in expected-value models of decision making (Edwards, 1961) and of sensitivity and response bias in signal detection models (Green & Swets, 1966).
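To make the analogy concrete, the short sketch below computes the two standard signal detection quantities from the same hit and false alarm rates: d-prime (sensitivity, the counterpart of an accurate diagnosis of the world) and the criterion (response bias, the counterpart of the value-laden choice). The numbers are hypothetical and the sketch is ours, added only to illustrate that the two quantities are separable; it is not taken from Green and Swets (1966) or from any study discussed here.

```python
from statistics import NormalDist

Z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def sdt_measures(hit_rate: float, fa_rate: float) -> tuple[float, float]:
    """Return (d_prime, criterion) for one observer.

    d_prime  : sensitivity -- how well the diagnosis separates signal from noise.
    criterion: response bias -- the observer's willingness to say "signal",
               which reflects outcome values rather than the state of the world.
    """
    d_prime = Z(hit_rate) - Z(fa_rate)
    criterion = -0.5 * (Z(hit_rate) + Z(fa_rate))
    return d_prime, criterion

# Two hypothetical observers with nearly identical sensitivity but different
# biases: equally accurate diagnosis of the world, yet different choices.
print(sdt_measures(0.84, 0.16))  # d' about 2, roughly neutral criterion
print(sdt_measures(0.69, 0.07))  # d' about 2, more conservative criterion
```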
Third, SA is not general knowledge (long-term memory for facts, procedures, or

mental models), which has a relatively long time constant, being acquired (and forgotten) over periods of hours, days, and years. SA, on the other hand, generally applies to more rapidly evolving situations. Long-lasting knowledge supports but is distinct from SA. This degree of prescriptive specificity of SA makes the construct useful for suggesting different HF/E remedies when SA is found wanting. For example, accidents related to poor SA would be addressed by teaching workers where and how to better seek information (Level 1 SA), integrate it (Level 2), and predict its

implications (Level 3) rather than, for example, teaching procedures or responses to actions or changing the value structure for different actions. Similarly, different decision aids should be provided to physicians if their SA is poor (diagnostic assistance) than if the treatment they prescribe (decision and choice) is wanting (Gar et al., 2005; Morrow, Wickens, & North, 2006). Finally, just as the distinction between SA and broader concepts of cognition (choice, knowledge) is useful to HF/E practitioners, the refinement of the three levels within the SA construct is diagnostic. Different

methodological techniques can be used to assess Level 1 SA (visual scanning) from those of Levels 2 and 3. Computational models of both Levels 1 and 2 have been developed and validated (Wickens et al., 2008). Breakdowns in noticing (Level 1) invite different classes of solutions (e.g., alerts and attention guidance) than do breakdowns in understanding (Level 2; supported by display integration) and prediction (Level 3; supported by predictive displays).
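As a toy illustration of these two points, that SA has a ground truth and that its three levels can be scored separately, the sketch below compares an operator's answers to situation queries against the objective state of a simulated world and reports accuracy per level. The queries, world state, and scoring rule are all hypothetical; this is our own sketch, not a description of any published SA measurement technique.

```python
# Hypothetical ground-truth snapshot of a dynamic world and an operator's answers.
ground_truth = {
    ("L1", "nearest_aircraft_bearing"): "090",   # Level 1: perception of elements
    ("L2", "conflict_with_ownship"):    "yes",   # Level 2: comprehension of meaning
    ("L3", "time_to_closest_approach"): "<60s",  # Level 3: projection of near future
}

operator_answers = {
    ("L1", "nearest_aircraft_bearing"): "090",
    ("L2", "conflict_with_ownship"):    "yes",
    ("L3", "time_to_closest_approach"): ">120s",
}

def sa_accuracy_by_level(truth, answers):
    """Proportion of queries answered in agreement with ground truth, per SA level."""
    scores = {}
    for level in ("L1", "L2", "L3"):
        keys = [k for k in truth if k[0] == level]
        correct = sum(answers.get(k) == truth[k] for k in keys)
        scores[level] = correct / len(keys) if keys else None
    return scores

print(sa_accuracy_by_level(ground_truth, operator_answers))
# {'L1': 1.0, 'L2': 1.0, 'L3': 0.0}  -> breakdown localized to projection (Level 3)
```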

Mental Workload

There are many parallels between workload and SA. Both are constructs that are distinct from behavior and performance, as well as constructs that are measured by triangulation from physiology, performance, and subjective assessments, coupled with task analysis and computational modeling (Wickens, 2000). Workload, like SA, also generates specific diagnoses with prescriptive HF/E remedies when workload is suboptimal (here we focus on excessive overload, although solutions for underload have also been addressed). At the most general level, mental workload can be described as the relation between the mental resources demanded by a task and those resources available to be supplied by the human operator. Its history predates that of SA by about two decades, first comprehensively reviewed in Moray (1979).
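A minimal sketch of this demand and supply relation follows. It is our own illustration rather than a model from the workload literature cited here: it simply expresses workload as the ratio of the resources concurrently demanded by active tasks to the capacity the operator can supply, with the task names and numbers being hypothetical.

```python
def workload_index(task_demands: dict[str, float], capacity: float = 1.0) -> float:
    """Ratio of summed resource demands of concurrent tasks to available capacity.

    Demands are expressed on an arbitrary 0-1 scale per task; capacity is the
    total resource supply the operator can mobilize. Values at or above 1.0
    indicate that demand meets or exceeds supply (overload), regardless of
    whether observable performance has yet degraded.
    """
    return sum(task_demands.values()) / capacity

# Hypothetical example: manual tracking plus a radio call plus a memory task.
demands = {
    "manual_tracking": 0.5,
    "radio_communication": 0.3,
    "hold_clearance_in_memory": 0.35,
}
print(workload_index(demands))  # 1.15 -> demand exceeds supply, even if performance still looks fine
```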

Assessment of mental workload had a major impact on system design in the decision to downsize the flight deck from three to two crewmembers by eliminating the flight engineer's position. The problem arose in the context of certification by the Federal Aviation Administration (FAA) of new aircraft, such as the DC-9-80 and Boeing 757/767 (Ruggerio & Fadden, 1987). The decision was directly

supported by a workload assessment of the two-crew design to ensure that the demands of flight tasks did not exceed the capacities of the two-person crew. The FAA asked one of the current authors (Sheridan) and Robert Simpson at MIT to investigate the possibility of measuring or quantitatively rating mental workload to provide a basis for deciding between three and two pilots. Following a year of riding an airline pilot's jump seat, Sheridan and Simpson (1979) developed a workload rating scale, analogous to the then well-established Cooper-Harper rating scale for handling qualities,

that was used to support the two-crew design. Furthermore, the solutions or remedies suggested by inadequate performance may not necessarily be the same as those suggested by a diagnosis of excessive workload. For example, performance decrements may result from an inadequate interface. If critical information is displayed in a location where it cannot be accessed or represented in an acronym that cannot be understood, this is not a workload issue, but it certainly degrades performance. A computer mouse with high gain will create unstable target acquisition—a performance decrement but,

again, not originally a workload problem (even though the poor performance may elicit a subjective reaction of high perceived workload). In contrast, a task environment that requires the operator to hold a large amount of information in working memory while seeking more information or while conversing clearly does describe a workload problem, even if most operators are capable of doing so without making many errors (as long as they are not interrupted by unexpected new demands). A human manual control system with very high display magnification even more clearly demonstrates the

dissociation between workload and performance. Such a system could produce exceptionally good tracking performance (low error), but the resulting investment of resources to correct even the smallest of deviations will greatly increase the workload that is experienced (Vidulich & Wickens, 1986).
A third example of the dissociation between workload and performance comes from a classic study of air traffic controllers by Sperandio (1971). As the

task load on an operator increases, one might presume that performance would decrease and workload increase. However, Sperandio found that as the number of aircraft in their sectors increased, approach controllers tasked with giving landing instructions to aircraft maintained their performance at the same level, even as they experienced greater workload. Sperandio observed that controllers adapted their strategies, including their verbal communications, to keep performance stable under high task load. (For a discussion of operator adaptive strategies and workload, see Parasuraman & Hancock, 2001.) Thus, the construct of

mental workload is vital in understanding the relation between objective task load and task strategies. A final example comes from a common daily activity: driving. Workload can be a better predictor of drivers' future performance than of their current performance. One can measure excessively high driver workload (e.g., using objective physiological or secondary task measures) when steering and speed are perfect and the car is properly and accurately following the road. In other words, high workload can be accommodated without immediate consequence for performance. However, the reduced spare capacity of the driver may not allow him

or her to respond effectively to an unexpected increase in demand; consequently, the car may veer into a ditch. Workload can also be assessed by measures of brain and autonomic system activity (e.g., Kramer & Parasuraman, 2007).

Trust in Automation: The Issue of Complacency

Automated systems have not always been used by human operators in the ways that designers intended (Wiener & Curry, 1980; Woods, 1996). Automation may be used appropriately at times, but instances of misuse and disuse are also prevalent (Parasuraman & Riley, 1997). One of the major factors influencing both misuse and disuse is trust in automation. Lee and Moray (1992) conducted pioneering research showing that an operator's use of automation to control a simulated manufacturing plant was directly related to his or her momentary trust, which in turn was related to the type and frequency of faults. In a subsequent study, Lee and Moray (1994) showed that operators chose to use automation if their trust in automation exceeded their confidence in their own ability to control the plant but otherwise chose manual control.
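The qualitative pattern reported by Lee and Moray (1994) can be written as a simple decision rule, sketched below. This is a paraphrase of their finding, not their actual quantitative model; the rating scale and the optional inertia bias parameter are assumptions introduced only for illustration.

```python
def chooses_automation(trust: float, self_confidence: float, bias: float = 0.0) -> bool:
    """Qualitative allocation rule: use automation when trust in the automation
    exceeds confidence in one's own manual ability (plus an optional inertia bias).

    trust, self_confidence: subjective ratings on a common scale (e.g., 0-10).
    bias: hypothetical threshold shift, e.g., reluctance to switch modes.
    """
    return trust > self_confidence + bias

print(chooses_automation(trust=7.2, self_confidence=5.5))  # True  -> engage automation
print(chooses_automation(trust=4.0, self_confidence=6.5))  # False -> retain manual control
```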

A noteworthy aspect of Lee and Moray's research was their development of a simple mathematical model relating variations in trust in automation and self-confidence to human-system performance. Since this work, there has been a large body of research examining the interrelationships between trust and factors such as automation reliability, error type, task difficulty, and other factors (Bisantz & Seong, 2001; Dzindolet, Peterson, Pomranky, Pierce, & Beck, 2003; Dzindolet, Pierce, Beck, Dawe, & Anderson, 2001; Wiegmann, Rich, & Zhang, 2001; see Lee & See, 2004, for a review). This large body of research has clearly established the

importance of trust in the human use of automation. Although there are different models and theories of trust in automation (Cohen, Parasuraman, & Freeman, 1998; Dzindolet et al., 2003; Lee & See, 2004; Madhavan & Wiegmann, 2007), there is broad agreement that the construct not only is worthwhile but also can be used, in conjunction with other factors, to predict human dependence (or not) on automation, as well as linked to objective measures of human performance. Is the trust construct relevant to performance in the real world? We think so. In their critique, DWH argued that human

performance, rather than psychological constructs such as SA and complacency, should be used in understanding real-world behavior, including accidents and incidents (Dekker & Hollnagel, 2004, p. 84). Their argument does not hold, because these concepts were operationalized in terms of human performance. A vast amount of human performance literature on trust also provides a sound basis for its operationalization. Variations in trust are one factor in an operator's use of automation. Low levels of trust can lead to disuse, as is often the case with automated alarm systems that generate many

false alerts (Dixon, Wickens, & McCarley, 2007; Parasuraman, Hancock, & Olofinboba, 1997). Conversely, high trust in automation that is less than perfectly reliable can be associated with overreliance and failure to monitor the "raw" information sources that provide input to the automated system. This is the complacency issue. The term is somewhat unfortunate, in that in everyday parlance, complacency tends to suggest willful and ill-advised neglect, whereas the behavior it describes may reflect a rational strategy (Moray & Inagaki, 2000). For historical reasons stemming from its long use in the aviation domain, the term continues to be used. DWH made it appear as if the preponderance of descriptors is sufficient evidence

to bring in a guilty verdict on the charge of lacking scientific status. Even in this method of guilt by association, DWH erred. On two occasions (Dekker & Hollnagel, 2004, p. 81), they attributed one proposed definition of complacency—"self-satisfaction which may result in non-vigilance based on an unjustified assumption of satisfactory system state"—to an article written by one of the present authors (Parasuraman,
Molloy, & Singh, 1993). However, the original source (as Parasuraman et al., 1993, noted) is a NASA report on the Aviation

Safety Reporting System (ASRS). The results of the original study by Parasuraman et al. (1993) suggested that complacency reflects a strategy of allocating attention away from the automated task to other concurrent tasks. The allocation strategy itself may result from high trust in the automation. One limitation of the study was that trust was not measured independently; however, this was done in subsequent studies that replicated the basic findings (Bagheri & Jamieson, 2006; Bailey & Scerbo, 2007; Manzey, Bahner, & Hueper, 2006). Moray and Inagaki (2000) suggested that the attention

allocation strategy could be rational and, furthermore, that complacency should be inferred only if the rate of monitoring was below that of an "optimal observer" who was required to attend to many sources of information (the automated task being one such source). Senders (1964) first showed that human observers tasked with monitoring for abnormal readings on multiple dials with different frequencies of changing values (bandwidth) made eye movements to the dials in proportion to bandwidth. This is consistent with the Nyquist theorem (a continuous variable can be reproduced exactly if it is sampled at least at twice its highest frequency component). Moray and Inagaki suggested that a human operator who monitored automation at a lesser rate than the optimal Nyquist frequency was complacent and one who monitored at a greater rate was skeptical, whereas one who monitored at the optimal rate was eutactic (or well calibrated).
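Moray and Inagaki's criterion lends itself to a direct computational statement. The sketch below, a minimal illustration with made-up numbers, takes the optimal sampling rate for a source to be twice its bandwidth, per the Nyquist theorem just described, and classifies an observed glance rate as complacent, eutactic, or skeptical relative to that benchmark; the tolerance band is our own assumption.

```python
def classify_monitoring(observed_rate_hz: float, bandwidth_hz: float, tolerance: float = 0.1) -> str:
    """Classify a monitoring rate against the Nyquist benchmark of 2 x bandwidth.

    observed_rate_hz: how often the operator samples (glances at) the source.
    bandwidth_hz:     highest frequency component of the monitored signal.
    tolerance:        fractional band around the optimum treated as 'eutactic'.
    """
    optimal = 2.0 * bandwidth_hz
    if observed_rate_hz < optimal * (1 - tolerance):
        return "complacent"   # sampling less often than the optimal observer
    if observed_rate_hz > optimal * (1 + tolerance):
        return "skeptical"    # sampling more often than necessary
    return "eutactic"         # well calibrated to the source's rate of change

# Hypothetical source with bandwidth 0.10 Hz (cf. Senders, 1964: sampling
# should be proportional to bandwidth).
print(classify_monitoring(observed_rate_hz=0.05, bandwidth_hz=0.10))  # complacent
print(classify_monitoring(observed_rate_hz=0.20, bandwidth_hz=0.10))  # eutactic
print(classify_monitoring(observed_rate_hz=0.60, bandwidth_hz=0.10))  # skeptical
```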

if it is sampled at least at twice the highest fr equency component). Moray and Inagaki suggested that a human operator who monitor d automation at a lesser rate than the optimal Nyquist fr equency was complacent and the one who monitor ed at a gr eater rate was skeptical, wher eas the one who monitor ed at the optimal rate was eutactic (or well calibrated). Moray and Inagaki s (2000) pr oposal is an important one. However the pr o- posal may be dif ficult to test because the fr equency spectrum of many eal-world

information sources can be complicated to compute and can also make it difficult for the human observer to perceive the different frequencies of change. Moray and Inagaki also made the point that complacency should be evaluated independently of performance with automation. (This reiterates the distinction made previously between a psychological construct such as SA, workload, or complacency and performance as an outcome.) Thus, for example, if complacency reflects allocation of attention to other concurrent tasks, then eye movement recordings should show that operators scan

the raw information sources less frequently when using automation than when performing the task manually, or less frequently when automation reliability is higher than when it is lower. This is indeed the case (Bagheri & Jamieson, 2005). Furthermore, if operators are given an explicit tool to uncover the raw information on which an automated recommendation is based, they use the tool less often under automation than under manual control. Manzey et al. (2006) conducted a series of experiments that verify this prediction. Unfortunately, DWH referred to none of this rich body of empirical work but instead got wrapped up in the issue of folk models and

inconsistent definitions of complacency, thereby implying that there is no solid empirical evidence for the construct. They also raised the legitimate issue that accident investigators can use complacency and other human performance constructs in an uncritical way. But here, too, they mistook the messenger for the message and ignored the underlying empirical evidence, as illustrated by the following:

Take as an example an automation-related accident that occurred in 1973, when situation awareness or automation-induced complacency had not yet come into use. The aircraft in question was

on approach [and] was equipped with a slightly deficient "flight director." The airplane struck a seawall bounding Boston Harbor short of the runway, killing all 89 people onboard. The safety board explained how an accumulation of discrepancies, none of which were critical in themselves, had rapidly brought about a high-risk situation without positive … [The first officer was] preoccupied with the information presented by his flight director systems, to the detriment of his attention to altitude, heading, and airspeed control. Today, both automation-induced complacency of the first officer and a loss of situation

awareness of the entire crew would most likely be cited under the causes of this crash. These "explanations" (complacency, loss of situation awareness) were obviously not needed in the early 1970s to deal with such accidents. The analysis instead proposed a set of more detailed, more falsifiable, and more traceable assertions that linked features of the situation (such as an accumulation of discrepancies) to measurable or demonstrable aspects
of human performance (diversion of attention to the flight director versus other sources of data). (Dekker & Hollnagel,

2004, p. 84)

Note the last statement, in which they mentioned "diversion of attention to the flight director versus other sources of data." This is the precise finding from the empirical studies of complacency described earlier! Studies using eye movements and other information-sampling techniques do in fact point to diversion of attention away from the primary information sources (Manzey et al., 2006; Metzger & Parasuraman, 2005; Thomas & Wickens, 2004). As to DWH's statement that explanations should be based on "measurable or demonstrable aspects" of human performance, this is

exactly what the studies described earlier did: They measured such behaviors as detection rate and response time and linked these to operational definitions of complacency or Level 1 SA, such as eye movements and use of an information-sampling tool.

Summary on SA, Workload, and Trust

These different methodologies provide converging evidence for the scientific viability of these constructs. Of course, this is not to say that there is complete agreement on the underlying mechanisms or that future research may not lead to revisions of the componential structures or modeling of these

constructs—that is the nature of scientific progress. But their scientific status seems indisputable.

Function Allocation and the Design of Automated Systems

The constructs of SA, mental workload, and trust are often invoked in considerations of function allocation between humans and automation. One of the DWH articles (Dekker & Woods, 2002) specifically addressed the function allocation issue, using as a springboard a previous article by the present authors (Parasuraman et al., 2000). A major aspect of their critique is based on their assertion of Parasuraman et al.'s (2000) belief in "fixed" human and machine strengths and weaknesses (Dekker & Woods, 2002, p. 240).

To the contrary, Parasuraman et al. (2000) stated that machine strengths and weaknesses continue to evolve (whereas human strengths and weaknesses evolve with learning), as shown in the following quotes: "Machines, especially computers, are now capable of carrying out many functions that at one time could only be performed by humans" (p. 286), and "any proposed (automation) taxonomy is likely to be superceded by technological developments in methods for information integration and presentation" (p. 294). Parasuraman et al. (2000) proposed that there was a need for determining

experimentally what should and should not be automated, based on cognitive engineering data and other considerations (e.g., cost and ease of system integration). Of course, engineers must be cognizant of the "ballpark" strengths and weaknesses of both humans and machines to practice the art of engineering in performing experimental developments. However, Parasuraman et al. (2000) explicitly stated that these capabilities are not fixed but change over time. Like many others, we appreciate satire, especially when insightful. But we see little value in demeaning the historically important Fitts

(1951) "Men are better at, machines are better at" (MABA-MABA) table by calling it, as DWH did, "abracadabra." Certainly, the Fitts list is no longer valid as originally stated, because machines have long since surpassed humans in many of the categories stated by Fitts in 1951. DWH also asserted that "these large motherhood labels for possible performance decrements know little consensus in the HF community." They also referred to a view in which the human operator resembles a linear input-output device that takes an impoverished stimulus from the world and converts it into a response by adding meaning through various … DWH's statements reveal a

navet about how engineering is done, as well as how information-pr ocessing models have evolved, the latter per haps as a esult of neg- lect or their having ead only the older and not moder n cognitive psychology liter- atur e (W ickens & Carswell, 2007). a d f i s p ( P e a DWH criticized MABA-MABA lists (and Parasuraman et al. s [2000] levels and stages of automation appr oach, by infer ence) for not including many other human 152 Journal of ognitiv e ngineering and Decision Making / Summer 2008
attributes they list—information filtering, scheduling,

anticipating, inferring, learning, and others—which "fall by the wayside," misleading the designer who relies on the list, and foster "the idea that new technology can be introduced as a simple substitution of machines for people" (Dekker & Woods, 2002, p. 240). In fact, however, many of these are treated explicitly within the Parasuraman et al. (2000) framework. DWH thus attacked a position that was never proposed—and surely never intended—so that they can knock it down. Throughout their article, Dekker and Woods (2002) stated that Parasuraman et al. (2000) are proponents of such simple substitution. Certainly, computers and automation are replacing human sensory-motor and cognitive tasks of all

kinds, and quite successfully. But the "substitution" is far from simple, as DWH suggested that Parasuraman et al. (2000) believe. Currently, for example, major government and industry efforts are under way to reallocate human and machine functions for air traffic control (ATC). The issue has arisen in the context of modernizing the ATC system, which is increasingly ill equipped to handle the projected growth in air traffic over the next decade. Because it has been ascertained empirically that human controllers simply cannot perform the necessary tasks, it is believed that more

automation can help. The work on how to revise the human and machine roles and develop the automation is a gigantic research, development, and human-in-the-loop simulation effort, not one of mere substitution. The reader should also note that the original Parasuraman et al. (2000) article was a consequence of recommendations from a three-year study (1994–1997) by the National Research Council panel on human factors in the current and future ATC system. All three of the present authors served on that panel, and it is gratifying to note that the recommendations for automation design made in the

panel eport (W ickens, Mavor , Parasuraman, & McGee, 1998) appear to have been fol- lowed by the AA (Ahlstr om, Longo, & ruitt, 2005). principles as a emedy for better team play . do not efute the latter but note that the thr ee pr escriptions for better team play that they pr opose—highlighting changes, displaying futur e pr ojections, and visually integrating information (eco- logical displays)—ar e not new but instead ar e time-honor ed principles that wer validated many years ago in HF/E esear ch. Furthermor e, all of these ecommen- dations ar e explicitly incorporated within Stages 1 and

2 of the automation taxonomy provided by Parasuraman et al. (2000).
Discussion

We have devoted much of this article to examining the constructs of SA, mental workload, and trust. We cannot burden the reader with a comprehensive review, but we can confidently state that empirical studies supporting the SA, workload, and trust constructs are numerous (in the hundreds). Quantity is not everything, of course, but the vast majority of the empirical research is of high quality. The studies meet the

most widely accepted criteria of quality (unless one counts the use of psychological constructs itself as a drawback, which, as we have argued, it is not). The product of quantity and quality is a large number of
well-supported HF/E principles, but they apply primarily to interface design, not automation functionality design. Thus, the two approaches are nicely complementary. We also see in DWH's team play concept the invocation of a third meta-component, …

A final argument raised by DWH against the scientific status of SA, workload, and trust—their attempt to put the last nail in the coffin—is that these

constructs, in their view, are not falsifiable, which would place these constructs in the same camp as Freudian psychoanalytic "theories" of toilet training of infants (Dekker & Hollnagel, 2004, p. 81). This is another example of their blame-by-association method. We have two responses. First, a fundamental principle of science and the scientific method is that it continually seeks to improve itself by falsifying current (accepted) ideas (explanations, predictions) based on evidence. This is what distinguishes science from metaphor (art, music, poetry, religion, beauty, love, etc., or pseudoscientific theories such as Freudian psychoanalysis,

which are not falsifiable), based on rational rules for what constitutes credible and acceptable evidence (e.g., there is no evidence for God). However, constructs such as SA, mental workload, and trust are falsifiable in terms of their usefulness in prediction. (Because a construct is not a statement of fact, falsifiability of the construct itself is a meaningless notion.) For example, that a particular interface to System A leads to some measurable degree of SA, whereas System B is associated with less SA, is quite falsifiable.

References

Adolphs, R. (2002). Trust in the brain. Nature Neuroscience, 5, 192–193.
Ahlstrom, V., Longo, M., & Truitt, T. (2005). Human factors design guide (DOT/FAA/CT-02-11). Atlantic City, NJ: Federal Aviation Administration.
Bagheri, N., & Jamieson, G. A. (2004). Considering subjective trust and monitoring behavior in … (Eds.), Human performance, situation awareness, and automation: HPSAA II (pp. 54–59). Mahwah, NJ: Erlbaum.
Bailey, N., & Scerbo, M. (2007). Automation-induced complacency for monitoring highly … Theoretical Issues in Ergonomics Science, 8, 321–348.

Billings, C. E., Lauber, J. K., Funkhouser, H., Lyman, G., & Huff, E. M. (1976). Aviation safety reporting system (Tech. Rep. TM-X-3445). Moffett Field, CA: NASA Ames Research Center.
Bisantz, A., & Seong, Y. (2001). Assessment of operator trust in and utilization of automated … Ergonomics, 28, 85–97.
Broadbent, D. E. (1958). Perception and communication. London: Pergamon.
Cohen, M., Parasuraman, R., & Freeman, J. (1998). … Program.
Dekker, S. A., & Hollnagel, E. (2004). Human factors and folk models. Cognition, Technology & Work, 6, 79–86.

Dekker, S. A., & Woods, D. D. (2002). MABA-MABA or abracadabra? Progress on human-automation coordination. Cognition, Technology & Work, 4, 240–244.
Dixon, S., Wickens, C. D., & McCarley, J. M. (2007). On the independence of reliance and compliance: Are false alarms worse than misses? Human Factors, 49, 564–572.
Durso, F., Rawson, K., & Girotto, S. (2007). Comprehension and situation awareness. In F. Durso (Ed.), Handbook of applied cognition (pp. 163–194). West Sussex, UK: Wiley.
Dzindolet, M. T., Peterson, S. A., Pomranky, R., Pierce, L. G., & Beck, H. (2003). The role of trust in automation reliance. International Journal of Human-Computer Studies, 6, 697–718.

Dzindolet, M. T., Pierce, L. G., Beck, H. P., Dawe, L. A., & Anderson, B. (2001). Predicting misuse and disuse of combat identification systems. Military Psychology, 13, 147–164.
Edwards, W. (1961). Behavioral decision theory. Annual Review of Psychology, 12, 473–489.
Endsley, M. (1995). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65–84.
Endsley, M. (2006). Situation awareness. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd ed., pp. 528–542). New York: Wiley.
Endsley, M., Bolte, B., & Jones, D. G. (2003). Designing for situation awareness: An approach to human-centered design. London: Taylor & Francis.

Farrell, S., & Lewandowsky, S. (2000). A connectionist model of complacency and adaptive recovery under automation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 395–410.
Fitts, P. M. (Ed.). (1951). Human engineering for an effective air navigation and traffic control system. Washington, DC: National Research Council.
Flach, J. (1995). Situation awareness: Proceed with caution. Human Factors, 37, 149–157.
Gaines, B. R., McCarthy, J. C., Fallon, E., & Bannon, L. (Eds.). (2000). Function allocation [Special issue]. International Journal of Human-Computer Studies, 52(2).

… due to automation-related complacency in IFR-rated general aviation pilots. Proceedings of the International Symposium on Aviation Psychology (pp. 245–249). Columbus: Ohio State University.
Gao, J., & Lee, J. D. (2006). Extending the decision field theory to model operators' reliance on automation in supervisory control situations. IEEE Transactions on Systems, Man, and Cybernetics, 36, 943–959.
Garland, D. J., & Endsley, M. R. (Eds.). (2000). Situation awareness analysis and measurement. Mahwah, NJ: Erlbaum.

Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Jagasinski, R., & Flach, J. (2006). Control theory for humans. Mahwah, NJ: Erlbaum.
Jordan, N. (1963). Allocation of functions between man and machines in automated systems. Journal of Applied Psychology, 47, 161–165.
Kahneman, D. (1973). Attention and effort. Upper Saddle River, NJ: Prentice Hall.
Kramer, A., Kirlik, A., & Wiegmann, D. (2007). Attention: From theory to practice. New York: Oxford University Press.
Kramer, A., & Parasuraman, R. (2007). Neuroergonomics—Application of neuroscience to human factors. In J. Caccioppo, L. Tassinary, & G. Berntson (Eds.), Handbook of psychophysiology (2nd ed., pp. 704–722). New York: Cambridge University Press.

Krueger, F., McCabe, K., Moll, J., Kriegeskorte, N., Zahn, R., Strenziok, M., et al. (2007). Neural correlates of trust. Proceedings of the National Academy of Sciences USA, 104, 20085–20089.
Lee, J. D., & Moray, N. (1992). Trust, control strategies, and allocation of function in human-machine systems. Ergonomics, 22, 671–691.
Lee, J. D., & Moray, N. (1994). Trust, self-confidence, and operators' adaptation to automation. International Journal of Human-Computer Studies, 40, 153–184.

Lee, J. D., & See, K. A. (2004). Trust in automation and technology: Designing for appropriate reliance. Human Factors, 46, 50–80.
Lorenz, B., Di Nocera, F., Rottger, S., & Parasuraman, R. (2002). Automated fault management in a simulated space flight micro-world. Aviation, Space, and Environmental Medicine, 73, 886–897.
Madhavan, P., & Wiegmann, D. A. (2007). Similarities and differences between human-human and human-automation trust. Theoretical Issues in Ergonomics Science, 8, 277–301.

Manzey, D., Bahner, J. E., & Hueper, A.-K. (2006). Misuse of automated aids in process control: … Human Factors and Ergonomics Society.
McCloskey, M. (1983). Intuitive physics. Scientific American, 248(4), 122–130.
Metzger, U., & Parasuraman, R. (2005). Automation in future air traffic management: Effects of decision aid reliability on controller performance and mental workload. Human Factors, 47, 35–49.
Miller, C., & Parasuraman, R. (2007). Designing for flexible interaction between humans and automation: Delegation interfaces for supervisory control. Human Factors, 49, 57–75.
Molloy, R., & Parasuraman, R. (1996). Monitoring an automated system for a single failure: Vigilance and task complexity effects. Human Factors, 38, 311–322.

Moray, N. (1969). Where is capacity limited? A survey and a model. Acta Psychologica, 27, 84–92.
Moray, N. (1979). Mental workload. New York: Plenum.
Moray, N., & Inagaki, T. (2000). Attention and complacency. Theoretical Issues in Ergonomics Science, 1, 354–365.
Morrow, D. G., Wickens, C. D., & North, R. (2006). Reducing and mitigating human error in medicine. In R. S. Nickerson (Ed.), Reviews of human factors and ergonomics (Vol. 1, pp. 254–296). Santa Monica, CA: Human Factors and Ergonomics Society.

Parasuraman, R. (1998). The attentive brain. Cambridge, MA: MIT Press.
Parasuraman, R., & Davies, D. R. (1984). Varieties of attention. San Diego, CA: Academic Press.
Parasuraman, R., & Hancock, P. A. (2001). Adaptive control of workload. In P. A. Hancock & E. Desmond (Eds.), Stress, workload, and fatigue (pp. 305–320). Mahwah, NJ: Erlbaum.
Parasuraman, R., Hancock, P. A., & Olofinboba, O. (1997). Alarm effectiveness in driver-centred collision-warning systems. Ergonomics, 40, 390–399.
Parasuraman, R., Molloy, R., & Singh, I. L. (1993). Performance consequences of automation-induced "complacency." International Journal of Aviation Psychology, 3, 1–23.

Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39, 230–253.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model of types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 30, 286–297.
Parasuraman, R., & Wilson, G. (2008). Putting the brain to work: Neuroergonomics past, present, and future. Human Factors, 50, 468–474.
Pashler, H. (1998). The psychology of attention. Cambridge, MA: MIT Press.

Popper, K. R. (1959). The logic of scientific discovery. London: Hutchinson & Co.
Posner, M. I. (2004). Cognitive neuroscience of attention. New York: Guilford.
Proctor, R., & van Zandt, T. (1984). Human factors in simple and complex systems. Needham Heights, MA: Allyn & Bacon.
Riley, A. (2000, October). Developing a pilot-centered autoflight interface. Proceedings of the World Aviation Congress and Exposition (pp. 241–245). Warrendale, PA: SAE International.
Ruggerio, F., & Fadden, D. M. (1987). Pilot subjective evaluation of workload during a flight test certification program. In A. H. Roscoe (Ed.), The practical assessment of pilot workload (NATO AGARDograph No. AGARD-AG-282; pp. 32–36). Essex, UK: Specialised Printing Services.

Sarter, N. B., & Woods, D. D. (1991). Situation awareness: A critical but ill-defined phenomenon. International Journal of Aviation Psychology, 1, 45–57.
Senders, J. (1964). The human operator as a monitor and controller of multidegree of freedom systems. IEEE Transactions on Human Factors in Electronics, HFE-5, 2–6.
Sheridan, T. B. (2002). Humans and automation: Systems design and research issues. New York: Wiley.

Sheridan, T. B., & Parasuraman, R. (2005). Human-automation interaction. In R. S. Nickerson (Ed.), Reviews of human factors and ergonomics (Vol. 1, pp. 89–129). Santa Monica, CA: Human Factors and Ergonomics Society.
Sheridan, T. B., & Simpson, R. (1979). Toward the definition and measurement of the mental workload … Cambridge: Massachusetts Institute of Technology.
Sheridan, T. B., & Verplank, W. L. (1979). Human and computer control of undersea teleoperators (Man-Machine Systems Lab Rep.). Cambridge: Massachusetts Institute of Technology.
Sperandio, J. C. (1971). Variation of operators' strategies and regulating effects on workload. Ergonomics, 14, 571–577.
Thomas, L. C., & Wickens, C. D. (2004). Eye-tracking and individual differences in off-normal … Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting (pp. 223–227). Santa Monica, CA: Human Factors and Ergonomics Society.

Tsang, P., & Vidulich, M. (2006). Workload and situation awareness. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd ed., pp. 243–268). New York: Wiley.
Tsang, P., & Wilson, G. (1997). Mental workload. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (2nd ed., pp. 417–449). New York: Plenum.

Vidulich, M. A., & Wickens, C. D. (1986). Causes of dissociation between subjective workload measures and performance. Applied Ergonomics, 17, 291–296.
Wickens, C. D. (1984). Processing resources in attention. In R. Parasuraman & D. R. Davies (Eds.), Varieties of attention (pp. 63–102). Orlando, FL: Academic Press.
Wickens, C. D. (2000). The trade-off of design for routine and unexpected performance: Implications of situation awareness. In D. J. Garland & M. R. Endsley (Eds.), Situation awareness analysis and measurement (pp. 211–226). Mahwah, NJ: Erlbaum.
Wickens, C. D. (2002). Multiple resources and performance prediction. Theoretical Issues in Ergonomics Science, 3, 159–177.

Wickens, C. D. (2008). Situation awareness: Review of Mica Endsley's 1995 articles on SA theory and measurement. Human Factors, 50, 397–403.
Wickens, C. D., & Carswell, C. M. (2007). Information processing. In G. Salvendy (Ed.), Handbook of human factors & ergonomics (3rd ed., pp. 111–149). New York: Wiley.
Wickens, C. D., & Dixon, S. R. (2007). The benefits of imperfect diagnostic automation: A synthesis of the literature. Theoretical Issues in Ergonomics Science, 8, 201–212.

… of the 13th International Symposium on Aviation Psychology (pp. 818–823). Dayton, OH: Wright-Patterson Air Force Base.
Wickens, C. D., & Hollands, J. G. (2000). Engineering psychology and human performance. Upper Saddle River, NJ: Prentice Hall.
Wickens, C. D., Mavor, A., Parasuraman, R., & McGee, J. (1998). The future of air traffic control: Human operators and automation. Washington, DC: National Academy Press.
Wickens, C. D., & McCarley, J. M. (2007). Applied attention theory. Boca Raton, FL: CRC Press.
Wickens, C. D., McCarley, J. S., Alexander, A., Thomas, L., Ambinder, M., & Zheng, S. (2008). Attention-situation awareness (A-SA) model of pilot error. In D. Foyle & B. Hooey (Eds.), Human performance models in aviation (pp. 213–242). Boca Raton, FL: CRC Press.

Wickens, C. D., & Prevett, T. (1995). Exploring the dimensions of egocentricity in aircraft navigation displays. Journal of Experimental Psychology: Applied, 1, 110–135.
Wiegmann, D. A., Rich, A., & Zhang, H. (2001). Automated diagnostic aids: The effects of aid reliability on users' trust and reliance. Theoretical Issues in Ergonomics Science, 2, 352–367.
Wiener, E. L., & Curry, R. E. (1980). Flight deck automation: Promises and problems. Ergonomics, 23, 995–1011.

Wilson, G. (2000). Strategies for psychophysiological assessment of situation awareness. In M. Endsley & D. Garland (Eds.), Situation awareness analysis and measurement (pp. 175–188). Mahwah, NJ: Erlbaum.
Woods, D. (1996). Decomposing automation: Apparent simplicity, real complexity. In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and application (pp. 3–17). Mahwah, NJ: Erlbaum.
Yeh, Y.-Y., & Wickens, C. D. (1988). Dissociation of performance and subjective measures of workload. Human Factors, 30, 111–120.

Raja Parasuraman is professor of psychology at George Mason University, Fairfax, VA. He is also director of the Graduate Program in Human Factors and Applied Cognition. He has engaged in long-standing research programs in human factors and cognitive neuroscience, focusing on attention, aging, human-automation interaction, neuroimaging, and vigilance. He has written 10 books, including The Psychology of Vigilance (1982), Automation and Human Performance (1996), The Attentive Brain (1998), and Neuroergonomics: The Brain at
Work (2007). He is a Fellow of the Human Factors and Ergonomics

Society and of the American Association for the Advancement of Science.

Thomas Sheridan is a professor of engineering and applied psychology emeritus in the Departments of Mechanical Engineering and Aeronautics-Astronautics at the Massachusetts Institute of Technology, where he was director of the Human-Machine Systems Lab. Currently, he is senior research fellow at the U.S. Department of Transportation, Volpe Center. His research interests are in human-automation interaction, modeling, and cognitive engineering. He has authored or coauthored five books and is a member of the National Academy

of Engineering.