
DOCUMENT RESUME

ED 118 621    TM 005 111

AUTHOR       Hyatt, C. J.; De Berg, O. H.
TITLE        Performance Evaluation: The Use of Scoring Systems in Adaptive Training.
PUB DATE     [Oct 74]
NOTE         23p.; Paper presented at the Annual Meeting of the Military Testing Association (16th, Oklahoma City, Oklahoma, October 21-25, 1974)
EDRS PRICE   MF-$0.83 HC-$1.67 Plus Postage
DESCRIPTORS  *Flight Training; Measurement; *Military Training; *Scoring; *Task Performance; *Tracking

ABSTRACT
Research is described involving the development of a scoring system for performance evaluation. The example used is aircraft landing. Tables are included which give a suggested method for establishing a relevant scoring system, in relation to this example. (DEP)

Documents acquired by ERIC include many informal unpublished materials not available from other sources. ERIC makes every effort to obtain the best copy available. Nevertheless, items of marginal reproducibility are often encountered, and this affects the quality of the microfiche and hardcopy reproductions ERIC makes available via the ERIC Document Reproduction Service (EDRS). EDRS is not responsible for the quality of the original document. Reproductions supplied by EDRS are the best that can be made from the original.

PERFORMANCE EVALUATION: THE USE OF SCORING SYSTEMS IN ADAPTIVE TRAINING

By Squadron Leader C. J. Hyatt, RAF, and Captain O. H. De Berg, USAF

INTRODUCTION

This paper is intended to describe work which we have done at the Crew Station Design Facility in the field of scoring complex tracking tasks. It is intended to provide a down-to-earth approach to the development of a philosophy for scoring systems, without getting deeply involved in mathematical approaches or analysis of results. The typical sort of tracking task we might look at in the facility is landing an aircraft in Instrument Flying Conditions, using various forms of approach aid, and with various task loadings. The approach we use to this problem will obviously read across fairly readily to many other areas of study involving the use of motor skills.

Our work to date has been entirely in the field of developmental studies rather than training, but since both these processes are amenable to treatment by an adaptive-loop approach, they have many features in common; in particular, they both need a meaningful scoring system. Future projects which will require us to develop scoring systems specifically for adaptive training are the Simulator for Air-to-Air Combat and the Advanced Simulator for Undergraduate Pilot Training.

Figure 1 shows how the Crew Station Design Facility fits into the ASD organization. It is part of the Directorate of Crew and AGE Engineering, which is responsible for providing an advisory service to the individual systems program offices of ASD. You will see that we are in the same division as the simulator branch, and we also have simulators of our own, used almost exclusively for experimental work.

As the emphasis on simulators in training increases, and these simulators become more sophisticated, the need for sophistication in our scoring systems increases, and over the last year we have found it necessary to educate our 'customers'. This paper is a distillation of our studies.

GENERAL DISCUSSION OF PHILOSOPHY

I want to start by looking at some definitions; working definitions rather than academic ones. The first of these is what we mean by an adaptive process. Figure 2 shows how we have defined it for our purposes, in very general terms.

Figure 3 shows how this process can be represented by a classic three-block loop, and I think most workers in this field would accept this general concept. More specifically, in the context of training, the three blocks can be more precisely defined, as shown in the lower half of the figure. There already appears to be a fair consensus of opinion that the quality of the performance measure is the "make-or-break" feature of the system. Most people would agree that it is unfortunately also heavily subjective.

To make the performance measure as good as possible, let us first look at what we want from a scoring system. We have made a list (Figure 4) of qualities which appear to be important. The first, and most important, is that the scoring system should be directly related to the objectives of the adaptive process. To ensure this, we have to persuade the training or experimental director to define his objectives clearly.

The remaining qualities are not in any particular order.

If the objectives are multiple, each aspect must have some element of the scoring system directed toward it.

Many parameters may be collected during a study, and the mass of data can be confusing. It is necessary to reduce this to manageable and comprehensible proportions.

Although some subjective decisions must be made in the formulation of the scoring system, the user should not have to make any when applying it; ideally, a computer should be able to handle it.

Having achieved a numerical value for the score, it should be possible to relate the various ranges to an acceptability index, for example: Excellent, Good, Average, Poor, Unacceptable.
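To make that last quality concrete, here is a minimal sketch of how a numerical score might be mapped onto such an acceptability index. The band boundaries, and the convention that lower scores are better (a penalty score), are our illustrative assumptions, not values from the paper:

```python
# Illustrative sketch: map a numerical penalty score (lower is better)
# onto the acceptability index. Band boundaries are hypothetical.
def acceptability(score: float) -> str:
    bands = [(10.0, "Excellent"), (25.0, "Good"),
             (50.0, "Average"), (100.0, "Poor")]
    for upper_bound, label in bands:
        if score < upper_bound:
            return label
    return "Unacceptable"

print(acceptability(7.2))    # Excellent
print(acceptability(130.0))  # Unacceptable
```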

For the greatest benefit to be obtained from the adaptive system, knowledge of results is a valuable aid, and hence it is desirable, if possible, for the scores to be generated in real time.

To apply these principles fully, we feel that it is necessary to develop the performance measure block of the classic adaptive loop (Figure 5). The classic adaptive training loop calls for the application of the philosophy of training in the formulation of the adaptive logic; this means simply how the next step of training is governed by present performance. Because this is the only block labelled 'Logic', there is a tendency to try to use this step to insert logic related to the training objectives. We believe that the performance measure block should be developed into three separate stages, as shown in the lower part of the figure, and that the value logic and adaptive logic should be kept entirely separate from each other. It is these three steps that we want to concentrate your attention on. In simple and direct terms, these three steps can be relabelled as shown in Figure 6.

Our aim is to develop the philosophy which should be applied to the generation of the value or scoring logic step in this process.

DEVELOPMENT OF SCORING LOGIC

Having defined the concept of 'scoring logic', it is easy to see that this logic is the primary link between the raw data which can be obtained by monitoring performance, and the score which is used to determine subsequent progress. It is the sole point in the loop at which the recorded performance is weighed against the fundamental objectives of the process. As a result, the quality of the scoring logic is the key factor in determining whether or not the adaptive cycle is efficiently directed toward the aims of the process, be it an experimental study or a training program.

It is only too easy to skip over the question of objectives when defining the logic, and there is, perhaps, an even more insidious risk of using existing scoring logic 'because it worked well last time'. Scoring systems tend to look similar to each other, especially those used in any one particular field of endeavor, and subtle but vital differences can go unnoticed.

One way of minimizing this risk - which we believe is a worthwhile investment of time - is to carry out an objective evaluation of the true aims of every new scoring system we devise, and to develop a sound rationale for the scoring logic. Better still, this rationale should be formally written up and included as an integral part of our description of the scoring system.

Let us then look at the various ways in which the aims of the adaptive process we are considering can affect the way we go about scoring it. There are a number of questions we have to ask ourselves, and some of the major ones are outlined in Figure 7.

The first question - in our developmental studies the most fundamental, and yet oddly enough the one most frequently overlooked - is which part of the man-machine system we are looking at: the man, the machine, or the interface between them. In the training context the answer is simple: the man, that is to say the trainee, is paramount.

I would like to digress for a moment, though, and speculate on the wealth of data which has, at one time and another, been collected within adaptive training systems and which, if it is still stored, could have potential value for the study of the training machines used, or of the way they display information. It seems probable that much of this data might be tapped simply by running it through new scoring systems with appropriate changes in their aims - assuming that the original scoring system was correctly aimed at training. Much valuable information on the merits or shortcomings of various monitoring or operating consoles currently in use might be acquired in this way. However, as I said, this is a digression, and in the training context it is the trainee we are trying to assess.

The next question we have to ask is what we are trying to find out about the trainee. The basic answer here is obvious: we want to know how well he performs. But to decide how to measure this performance, we need to ask a number of subsidiary questions. For example, we have to decide which of the parameters available to us - which means those parameters we can measure without undue expense - are relevant to performance. And the performance we use as a yardstick here must itself be performance which is directly relevant to the training objectives.

A good example of the sort of decision we have to make in this area arises when scoring a tracking task such as an Instrument Landing Approach. Should we consider distance along track as a measure of performance? This distance is one way of looking at the trainee's control of his speed; not just his instantaneous speed, but the integral of his speed with respect to time from the start to the present. Thus if he makes an error in speed, to minimize his resulting penalty score he must not only correct the speed, but make a suitable compensatory adjustment to bring him back to his correct position along track. The argument against using this parameter for scoring is that as long as the trainee stays on the correct line through space as defined by the landing system, it does not matter when he gets to various points along it. However, this obviously depends on the scenario. If the object of the exercise is not only to fly along the correct line and land at the correct point, but also to land at a specific time to fit in with an existing traffic pattern, then the accuracy of his position along track must be considered of some importance.

11 point, but,also.tband at a specific tit
point, but,also.tband at a specific tithe to fit .in with anexisting traffic pattern then the'accuracy of his position along trackmust be considered of some importance.Another decisiomweave to make, which is. also related to what weare trying to find out about the trainee's performance,: is whether we-should be concerned with his continuous operating or tracking' ability,or only with hIs ability to reach a certain point by any means at hisdisposal.This quite'clearly will determine whether we want a continuousscoring system or what we term 'gate' scoring - that is to say a measureof his ability to pass through a gate in space, or perhaps a Series of.gates...I realize that you may be thinking that the points we are makinghere are overly simple and obvious 7 they are simple, and they shouldbe obvious, but we think it ii.vitally important to emphasize that acold-blooded analysts of.this sort shoul4 be made, rather than the sortof approaCh whiCh we know from experienat often does occur, based on the°158t)0O by,,principle,of "Well, we usually score parameters A, B, and C - it looksa
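As a sketch of the 'gate' alternative, assume each gate is a point in space with a tolerance radius, and that only the closest approach to each gate is scored; the data layout and the pass/fail rule here are our own illustration, not the paper's:

```python
import math

# Sketch of 'gate' scoring: the trainee is judged only on passing
# through gates in space, not on the path flown between them.
# Gate positions and the 50 ft tolerance are hypothetical.
def gate_score(track, gates, tolerance_ft=50.0):
    """track: flown positions as (x, y, z) tuples, in feet.
    gates: gate centers as (x, y, z) tuples.
    Returns the fraction of gates passed within tolerance."""
    passed = 0
    for gate in gates:
        miss_distance = min(math.dist(gate, point) for point in track)
        if miss_distance <= tolerance_ft:
            passed += 1
    return passed / len(gates)
```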

I realize that you may be thinking that the points we are making here are overly simple and obvious. They are simple, and they should be obvious, but we think it is vitally important to emphasize that a cold-blooded analysis of this sort should be made, rather than the sort of approach which we know from experience often does occur, based on the principle of "Well, we usually score parameters A, B, and C - it looks as if they should be OK again this time."

Having, we hope, established which parameters we want to select from the raw data, we next have to think how the values of these parameters can best be used to give us a measure of performance. The logical approach is to compare these achieved values with some ideal. Most scoring systems adopt this approach, but once again a vital step which may be missed is to ensure that the ideal we set is truly relevant to our objectives. It is useless to set an ideal value of some parameter at 100 feet plus or minus zero, when we know that the 'Ace of the base' can only achieve 100 feet plus or minus ten feet, and plus or minus twenty feet is quite adequate for routine performance of the task. This is also a useful point at which to consider the limitations of our measured data; we can run into serious problems if we try to score to the nearest foot, when the equipment - and I include here both the instructor's monitoring equipment and the trainee's operating equipment - can only measure to the nearest five feet.
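One way of respecting that limitation is to quantize each measured error to the equipment's resolution before it enters the score, so that differences finer than the instruments can resolve never affect the result. A sketch, with the five-foot resolution taken from the example above:

```python
# Sketch: never score finer than the equipment can measure. An error
# of 2 ft read from equipment with 5 ft resolution is treated as zero.
def quantize_error(error_ft: float, resolution_ft: float = 5.0) -> float:
    """Round a measured error to the resolution of the coarsest
    piece of monitoring or operating equipment in the loop."""
    return round(error_ft / resolution_ft) * resolution_ft

print(quantize_error(2.0))   # 0.0  -- below the resolution, ignored
print(quantize_error(12.0))  # 10.0 -- nearest measurable value
```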

Even having established an appropriate ideal and a valid measure of divergence from it, we still need to consider what the implications of this divergence are. First of all, we must look at the relationship between size of divergence and importance; for example, in a certain situation a five-foot error could be acceptable, a ten-foot error merely embarrassing, but a fifteen-foot error fatal. Clearly this is not a linear relationship - at least not in terms of human values - and our score should reflect this. In an extreme case of this sort of situation, any error less than ten feet might be totally acceptable, while anything in excess of ten feet would be totally unacceptable. This is an example of what is generally known as 'time on target' scoring. In some contexts, such as air-to-air combat with guided missiles, it may be a perfectly adequate measure of performance, although even here it is more suited to competitive scoring than to training. But for a complex tracking task it contains too little data to help the instructor or trainee identify areas of weakness and plan remedial training accordingly.
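A sketch of such a non-linear size-to-importance mapping, using the five/ten/fifteen-foot example above; the exact thresholds, the square-law middle band, and the saturation value are our illustrative choices, not the paper's:

```python
# Sketch of a non-linear error-to-penalty mapping: errors inside the
# acceptable band cost nothing, errors beyond the safety limit saturate
# at a maximal penalty, and penalties in between grow on a square law
# so that importance rises faster than size. Values are hypothetical.
def altitude_penalty(error_ft: float) -> float:
    e = abs(error_ft)
    if e <= 5.0:          # acceptable band: no penalty
        return 0.0
    if e >= 15.0:         # fatal: maximal penalty regardless of size
        return 100.0
    return 100.0 * ((e - 5.0) / 10.0) ** 2   # embarrassing but survivable

for e in (3.0, 8.0, 12.0, 20.0):
    print(e, altitude_penalty(e))   # 0.0, 9.0, 49.0, 100.0
```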

Another thing we have to take into account when assessing the divergence of the achieved performance from the ideal is the fact that some parameters may be much more critical than others, and if we want a scoring system which assigns equal penalties to equally unacceptable errors, we must weight the measured errors accordingly.

A FORMAL PROCESS FOR CREATING A SCORING SYSTEM

I now want to recapitulate on the various sorts of decision we have discussed, and in so doing I want to formalize and develop a process for creating an effective scoring system. I want to show how, starting from the raw data available to us, we can adopt a systematic approach to mould it to our purposes, and ensure that our training objectives are met. There are various processes we can apply to a mass of raw data, but we believe that six of these processes, shown in Figure 8, if applied in sequence, will go far toward producing a sound system.

The first step is to select the parameters which are relevant to our training objectives.

Next we must look at what data is available to us on these parameters and edit it. By editing it, I mean deciding which values of it to use, and the choice here ranges from using it all, through using it at regular intervals, to using it only at specific points which we consider relevant.

We must then compare this edited data with what we believe to be the ideal values we are seeking to achieve by our training program. This comparison will give us what we have chosen to term 'error values'.

These error values must now have two processes applied to them: Modification and Weighting. The dividing line between them is not clear cut, and for this reason - to avoid lengthy discussion of what each comprises - I will not separate them. The operations which I include under these headings are:

Ensuring that the error values reflect the trainee's performance rather than any shortcomings in the training equipment.

Ensuring that the size of the error, and the importance of this size, is suitably reflected in the score.

Ensuring that the more critical parameters carry appropriately heavier scoring penalties.

Ensuring that any inter-relation between parameters is accounted for; this, for example, would include any weighting in respect of range if this were considered relevant.

The result of modifying and weighting the error values is to produce what we call 'scoring elements'.

Finally, the scoring elements we have arrived at can be combined to give a single comprehensive score, or they may be combined in groups to give sub-scores related to particular parts of the training objective, or particular capabilities of the trainee. Part of this combination process will be normalization of the score with respect to time or distance, if this appears to be appropriate.

We believe that this systematic approach gives the best chance of achieving a useful and meaningful scoring system.
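The six processes read naturally as a data pipeline. The sketch below wires them together in sequence; every function body stands in for a study-specific decision, and all the parameter names, ideals, and weights are our own illustrative assumptions:

```python
# Sketch of the six-step process as a pipeline. Each stage stands in
# for a study-specific decision; names and values are hypothetical.

def select(raw, relevant=("altitude", "airspeed")):
    """1. SELECT parameters relevant to the training objectives."""
    return {name: values for name, values in raw.items() if name in relevant}

def edit(series, every_nth=10):
    """2. EDIT the data: all of it, regular intervals, or specific
    points; here, every n-th sample."""
    return {name: values[::every_nth] for name, values in series.items()}

def compare_with_ideal(series, ideals):
    """3. COMPARE WITH IDEAL to produce 'error values'."""
    return {name: [v - ideals[name] for v in values]
            for name, values in series.items()}

def modify_and_weight(errors, weights):
    """4-5. MODIFY and WEIGHT the error values into 'scoring elements'
    (a square law is used here as the example modification)."""
    return {name: [weights[name] * e * e for e in values]
            for name, values in errors.items()}

def combine(elements):
    """6. COMBINE the scoring elements into a single score
    (sub-scores per parameter would be an equally valid choice)."""
    return sum(sum(values) for values in elements.values())

raw_data = {"altitude": [1010.0, 995.0], "airspeed": [142.0, 139.0],
            "heading": [271.0, 269.0]}
score = combine(modify_and_weight(
    compare_with_ideal(edit(select(raw_data), every_nth=1),
                       ideals={"altitude": 1000.0, "airspeed": 140.0}),
    weights={"altitude": 1.0, "airspeed": 4.0}))
print(score)  # one comprehensive penalty score
```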

PRACTICAL EXAMPLES

If our systematic approach is applied to a variety of training programs, a variety of scoring systems will naturally result, but in all probability they will fall into four or five broadly defined groups: for example, time-on-target systems, cumulative error systems, gate score systems, or combinations of these. Within any of these groups, one can postulate a generalized scoring system which, by manipulation of a series of constants, can be used for various slightly different training tasks. This concept lends itself very well to a computer-based scoring system within an evolving organization - for example, a pilot training school - where the training objectives remain fairly constant but the equipment used, and the associated operating procedures, will probably evolve steadily over the years.

To give you a good example of this, I will offer a brief outline of the type of scoring system we are currently using in our developmental studies to assess a pilot's ability to carry out landing approaches under instrument flying conditions. Remember, of course, that in these studies our aim is to assess the man-machine interface. This does not affect the process of scoring, but only the detailed application of the scoring logic.

We have concluded that the appropriate method to adopt in this instance is continuous scoring, with the score a function of size of error. Time-on-target scoring simply does not tell us enough about the things we need to know. However, we do also take several specific gate scores at appropriate points en route, depending on the nature of the particular study.

Our terms are defined in Figure 9, and a generalized formula for this continuous scoring is shown at Figure 10. You will see that this formula allows for easy adaptation to suit changes in the relative importance of the different errors by adjusting the constants K, changes in the relationship between magnitude and importance of errors by adjusting the functions of the error values, changes in normalization philosophy by adjusting the function of time, and changes in weighting for range by adjusting the range function.
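Read alongside Figures 9 and 10, the generalized score appears to have the shape S = k_x f(E_x) f(R) + k_y f(E_y) f(R) + k_z f(E_z) f(R), accumulated over the run and normalized by a function of time. A sketch under that reading; the concrete choices of f (square-law errors, unit range weighting, per-second normalization) are ours:

```python
# Sketch of the generalized continuous scoring formula (Figure 10):
# each axis error E_i is shaped by a function f, weighted by a range
# function f(R) and a constant k_i, summed, accumulated over the run,
# and normalized by a function of time. The default f's are stand-ins.
def continuous_score(samples, dt, k,
                     f_err=lambda e: e * e,      # magnitude vs. importance
                     f_range=lambda r: 1.0,      # weighting for range
                     f_time=lambda t: t):        # normalization philosophy
    """samples: list of (range, {axis: error}) at a fixed interval dt.
    k: {axis: weighting constant}, e.g. {'x': ..., 'y': ..., 'z': ...}."""
    total = 0.0
    for rng, errors in samples:
        total += dt * sum(k[axis] * f_err(e) * f_range(rng)
                          for axis, e in errors.items())
    return total / f_time(len(samples) * dt)
```

Adjusting k, f_err, f_time, and f_range corresponds one-for-one to the four adaptations described above.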

To illustrate how this type of formula has been applied, I will use the Microwave Landing System as an example. The object of this exercise is to ascertain whether or not certain tracks in airspace can be flown, on a time schedule, in a safe and efficient manner. Different routes are to be flown, and the object of the scoring system is to indicate the relative desirability of each route. Some typical routes are shown in Figure 11.

An MLS route differs from the standard ILS approach in several ways. First, it is time dependent: the pilot must be in the right place at the right time. Also, he must fly several different headings, some on a command basis, some by dead reckoning. Finally, the descent rate is not constant, but depends upon which portion of the approach is being flown.

Scoring these approaches depends upon our value logic (as is true in any scoring system). This logic is based upon the objectives defined for the study. As initially set forth, the purpose of MLS approaches is to control aircraft throughout a given airspace, both with respect to time and to spatial orientation relative to the prescribed path. This leads to the following logic:

(1) The aircraft must be equally controlled throughout the airspace.

(2) The aircraft must be in the "right place at the right time."

(3) For safety considerations, the further the aircraft is from track, the closer it approaches a critical situation.

(4) Some types of error are more critical than others (i.e., low on altitude is worse than high on altitude).

(5) Different tracks must be compared to one another on the same basis.

Applying these criteria, all possible parameters are analyzed and either discarded, or modified and included in the score as considered appropriate.

The formula shown in Figure 12 is the result. No range weighting is included, because the pilot must fly just as accurately at great distances as he does in close. Time-of-arrival considerations are met by using an along-track error which is time dependent; hence the error in this direction is taken to be the distance of his actual position from where he should be. The safety consideration states that big errors can be critical, hence a square law is used; this penalizes large errors very heavily. The weighting constants were arrived at subjectively, by discussion with qualified personnel, as to the relative importance of the different types of errors to be considered in a safe approach. Finally, in order to compare different tracks, the entire score is normalized with respect to time. The resulting equation can be taken as a whole, or by parts, to examine the quality of the approach.
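Putting those decisions together, and reading Figure 12 as square-law error terms with no range weighting, an asymmetric vertical constant, and normalization over total time, a hedged sketch follows; the constant values are ours, whereas the paper's were set by discussion with qualified personnel:

```python
# Sketch of the MLS scoring formula as described in the text:
# square-law penalties (big errors are critical), no range weighting,
# a heavier vertical constant when below the path (low is worse than
# high), and normalization with respect to time. Constants hypothetical.
K_X, K_Y = 1.0, 1.0
K_Z_HIGH, K_Z_LOW = 1.0, 2.0

def mls_score(samples, dt):
    """samples: (ex, ey, ez) error triples at interval dt seconds.
    ex is along-track error, measured against where the aircraft
    should be at this time, so timing errors are penalized directly;
    ez < 0 means below the prescribed path."""
    total = 0.0
    for ex, ey, ez in samples:
        k_z = K_Z_LOW if ez < 0 else K_Z_HIGH
        total += dt * (K_X * ex ** 2 + K_Y * ey ** 2 + k_z * ez ** 2)
    return total / (len(samples) * dt)   # compare different tracks fairly
```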

CONCLUSION

To conclude, we do not claim to have found all the answers, but we do feel we have gone a long way toward asking the right questions. We have described a learning process that we have gone through, and which we suspect many people go through in the process of producing scoring systems. It has been a salutary experience for us to formulate our ideas into a systematic process. We hope that our presentation today will stimulate interest in the process, and perhaps save others some of the time we have spent.

Remember! To produce an effective scoring system, three things are essential: good raw data must be collected; good value logic, or scoring logic, must be applied to it; and the resulting components must be assembled in a practical manner. Figure 13 shows how our proposed sequence of operations generates the first three steps of the five-step adaptive loop. The common thread running through the whole process is relevance - we must constantly ask ourselves if our scores relate to our aims.

We also hope that we have given you some food for thought, and that some of you will feel like contributing your own ideas in discussion now. Perhaps we may discover a few more of the answers to the questions we have posed. Thank you.

Figure 1. The Crew Station Design Facility within the ASD organization: Aeronautical Systems Division, Deputy for Engineering (EN); Crew & AGE Engineering; Simulators & Human Factors; Crew Station Design Facility, with research simulators (EF-111, C-135, T-39/T-40, RPV).

Figure 2. The Adaptive Process: a process in which the outcome of any step determines the nature of the next step; the relationship between one step and the next is governed by the application of a predetermined logic.

Figure 3. The Adaptive Loop (generalized): Training Status -> Performance Measure -> Adaptive Logic -> Adjust Task Difficulty.

Figure 4. An effective scoring system should: be directly related to objectives; deal with all aspects of objectives; condense raw data into simple measures of critical system actions; not be dependent on subjective measures; relate to an acceptability index; produce results quickly, preferably in real time.

Figure 5. The Adaptive Training Loop, development of performance measurement: the performance measure block expanded into Parameter (raw data) -> Value Logic -> Score (or Scores), with the training objectives feeding the value logic and the training philosophy feeding the adaptive logic, which drives the adaptive variable.

Figure 6. Essential elements of a scoring system: all available data (airspeed, altitude, heading, attitude, etc.) passed through the scoring logic to yield objective-related scores (tracking ability, speed control, etc.).

Figure 7. Scoring logic must be relevant to our aim. Q: Are we looking at the man, the machine, or the interface? A: In the training context, the man. Q: What do we want to know about him? A: How well he performs; specifically, which parameters are affected by performance, and which parts of the performance are relevant. Q: How can we measure his performance? A: Compare it with an ideal; specifically, how precise must the ideal be, are some apparent errors actually equipment limitations, how are size and importance of error related, and are some parameters more critical than others?

Figure 8. Application of scoring logic to raw data: SELECT -> EDIT -> COMPARE WITH IDEAL -> MODIFY -> WEIGHT -> COMBINE.

Figure 9. Scoring system definitions: subscripts x (along track), y (across track), z (vertical), t (time); E denotes an error value, R range, and S the score.

Figure 10. Generalized continuous scoring formula, of the form S = k_x f(E_x) f(R) + k_y f(E_y) f(R) + k_z f(E_z) f(R), accumulated from t = t_a and normalized by a function of time.

Figure 11. Typical MLS flight paths.

Figure 12. MLS scoring formula: square-law along-track (E_x(t)), across-track (E_y), and vertical (E_z) error terms, with the vertical weighting constant taking one value if E_z >= 0 and another if E_z < 0, the whole normalized with respect to time.

Figure 13. The complete process.