University of Salford: Clinical Gait Analysis 3.1

mitsue-stanley
Uploaded On 2016-08-21

Presentation Transcript

Measurement Variability

Why perform repeatability studies?

A repeatability study is conducted to assess how much measurements vary when they are repeated on the same subject. They generally require assessors to make repeat measurements on a number of ...

More recent functional calibration methods may be less dependent on marker placement but then become dependent on how repeatably the calibration exercises are performed. Variability in anthropometric measurements may also affect gait data, particularly joint kinetics.

... ownership of their performance and empowered to improve it. A range of studies now summarised by McGinley et al. [3] suggest that variability within the acceptable range for many parameters, and reasonable for some others, is a practical possibility. Perhaps most encouragingly, variability in gait analysis is considerably less than that reported for similar studies on the repeatability of static physical exam measures [4, 5]. If you trust clinical examination for use in clinical decision making, then you should be even more trusting of gait analysis results.

Variance components

2.1 A simplified overview

The key to performing useful repeatability studies is to understand where the variability of measurements is coming from. If we make a given measurement a number of times then we can calculate a mean value and a standard deviation. The mean describes the most likely value for the measurement and the standard deviation describes how much variability there is. Variability arises from different sources. Thus, in gait analysis, some of the variability in measurements may come from the different assessors performing measurements differently; this is known as between assessor variability. Some may come from individual assessors being variable in their own technique; this is known as within assessor variability.
Some will also arise from the subjects walking differently from walk to walk, which is known as inter-trial variability. The variability for each of these can be measured by the associated standard deviations. To explore these concepts we can consider a number of measurements that have been made on a single subject on a number of different occasions. The data might be plotted as a number of different points as depicted in Figure 1.

Figure 1 45 repeated measurements of average pelvic tilt for one person

For some reason the measurements towards the right side of the plot appear a little higher than those on the left but otherwise there doesn’t appear to be any particular pattern to the data. The mean can be calculated and then vertical lines drawn from each data point to the mean value (Figure 2).

Figure 2 Data points with mean (horizontal line) and standard deviation (average of vertical lines) illustrated

The length of each of these lines is called the deviation and the average length is the standard deviation (see the footnote below). The standard deviation calculated in this way is an estimate of how variable future measurements would be if performed under the same conditions as the original measurements. Just knowing this is quite useful if we want to understand how much weight to put on any particular measure in clinical interpretation of the data, but it doesn’t give us any particular guidance on how we could make measurements more repeatable in future. If we know something about how the measurements have been made then we may be able to go a little further. For example, if we know that measurements were made by different assessors we could calculate the mean for the measurements made by each assessor and calculate a standard deviation for each assessor separately (Figure 3).

Figure 3 Data broken up with respect to different assessors who made the measurements

Footnote: It is an “average” but not a simple arithmetic average.
Whenever we “average” deviations or standard deviations we actually perform a root mean square average using the number of degrees of freedom as the divisor rather than the number of datapoints. This does not affect how we think about things conceptually but is important when we actually calculate these quantities. For the rest of this document “average” used in relation to deviations or standard deviations will be assumed to refer to this process.

[Figure axis label: Average pelvic tilt (degrees); reference line: Person mean]

Immediately we do this we understand several things about the data. The mean values for assessor one and assessor two are very close. This suggests that the two assessors are performing measurements the same way on average. Assessor three, however, has quite a different mean value. This suggests some systematic difference in how the third assessor is performing the measurements. It is important to understand that this only tells us about consistency; it does not tell us which measurements are ‘right’ or ‘true’ and which are ‘wrong’. It is quite possible that assessor three is performing measurements correctly and that assessors one and two are making the same mistakes. In order to improve measurements it would clearly be useful for all three assessors to chat together about their respective techniques and agree on a common protocol that they can all adhere to. In the case of pelvic tilt measurements, as illustrated in the data, differences like this are attributable to the way in which the assessors identify the pelvic landmarks (ASIS and PSIS) and how they place markers in relation to these.

We can learn a little more by looking at the standard deviations. The average lengths of the vertical lines (standard deviations) for assessors one and three are quite similar. Those for assessor two are quite a lot longer.
This means that, although assessors one and three appear to be making measurements differently from each other, each is quite consistent in the way they are doing so. Assessor two, however, is less consistent. It might be useful for this assessor (with or without input of colleagues) to re-examine how he or she performs the measurements and work to do so more consistently in future. Such variability may stem from some uncertainty in identifying the landmarks (there are several bumps on the sacrum that can be mistaken for the PSIS, for example). On the other hand the assessor may have a good understanding of how the measurement should be made but not take sufficient care and attention in applying this consistently every time the measurement is made.

Figure 4 Means and standard deviations for each session

We can take the analysis one stage further if the measurements have been made on several different capture sessions. It is possible that the large variability in assessor two’s measurements might arise from the patient walking differently for each trial, but a more common pattern is that illustrated in Figure 4. Here it is clear that the standard deviations for each session (for each assessor) are very similar but there are clear differences between different session means for assessor two. This reinforces that the variability arises from the way markers are placed for any given session.

2.2 Some technical details

The aim of a repeatability study is to provide data that allows this sort of analysis and these sorts of insights into where measurement variability is coming from and what can be done to reduce it. When the work is formalised by statisticians there are some subtle refinements to this technique. The first is that formal approaches tend to talk about the variance rather than the standard deviation. The variance is simply the square of the standard deviation.
Variances are preferred because variances attributable to different sources can simply be added. (To combine standard deviations you have to square them, add them together and take the square root of the total.) The units of variance are different to those of the original measurement, however, so although the analysis is based upon variances these are always converted to standard deviations (by taking the square root) before reporting results.

Another refinement is that the assessor variance depicted in Figure 3 includes variability arising from differences between the assessors and differences between the different walks. It is preferable to have a measure that assesses the differences between the assessors separately from that arising between different walks. To do this we define the within assessor standard deviation for each assessor as the standard deviation of the session mean values for that particular assessor (see Figure 5).

Figure 5 Within assessor standard deviation for Assessor 2

Analysis of all the data will also generally result in an overall within assessor standard deviation, which is the average of the individual within assessor standard deviations across all the assessors.

Figure 6 Within assessor standard deviations for the whole group

There is a similar issue in defining the variability between assessors, which has not so far been defined. Again we want to remove the effect of within assessor and within session variability and we thus define the between assessor standard deviation as the standard deviation of the mean values for the different assessors (see Figure 7).

Figure 7 Between assessor standard deviation

The within session variability gives us some idea of how much variability there is in the individual from walk to walk (or trial to trial). Again this can be calculated for each session individually but is more often reported for the entire dataset, in which case the value is the RMS value for all sessions.
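The quantities described above can be sketched in a few lines of Python. This is a minimal illustration of the conceptual definitions only (standard deviation of assessor means, standard deviation of session means, RMS deviation of trials about the session mean), not a formal variance components analysis. The function names, the data layout and the pelvic tilt values are all hypothetical and chosen purely for illustration.

```python
import math
from statistics import mean


def pooled_sd(groups):
    """RMS 'average' deviation: squared deviations of each value about its own
    group mean, summed and divided by the total degrees of freedom.
    Assumes at least one group holds two or more values."""
    sum_sq = 0.0
    dof = 0
    for values in groups:
        centre = mean(values)
        sum_sq += sum((v - centre) ** 2 for v in values)
        dof += len(values) - 1
    return math.sqrt(sum_sq / dof)


def combine_sds(*sds):
    """Combine standard deviations from separate sources: square, add, root."""
    return math.sqrt(sum(sd ** 2 for sd in sds))


def repeatability_summary(data):
    """data maps assessor -> list of sessions -> list of trial values."""
    all_sessions = [trials for sessions in data.values() for trials in sessions]
    session_means = {a: [mean(t) for t in sessions] for a, sessions in data.items()}
    assessor_means = [mean(m) for m in session_means.values()]
    return {
        # RMS deviation of trials about their own session mean
        "inter_trial": pooled_sd(all_sessions),
        # SD of session means, separately for each assessor
        "within_assessor": {a: pooled_sd([m]) for a, m in session_means.items()},
        # RMS average of the individual within assessor SDs
        "within_assessor_group": pooled_sd(list(session_means.values())),
        # SD of the assessor mean values
        "between_assessor": pooled_sd([assessor_means]),
    }


# Hypothetical pelvic tilt values in degrees (illustrative, not study data):
# three assessors, two sessions each, three trials per session.
example = {
    "assessor_1": [[10.0, 11.0, 12.0], [11.0, 12.0, 13.0]],
    "assessor_2": [[8.0, 9.0, 10.0], [13.0, 14.0, 15.0]],
    "assessor_3": [[15.0, 16.0, 17.0], [16.0, 17.0, 18.0]],
}
summary = repeatability_summary(example)
```

In this made-up dataset the trial-to-trial spread is identical in every session, assessor 2 has session means that differ markedly, and assessor 3 is consistent but offset from the other two, which mirrors the pattern discussed for Figures 3 and 4.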
The result of a formal repeatability study is thus estimates of the quantities outlined in Table 1.

Standard deviation | Conceptual definition (see note) | Illustration
Between assessor | Standard deviation of mean values for each assessor | Figure 7
Within assessor | Standard deviation of mean values for sessions for the specified assessor | Figure 5
Within assessor (group) | Average within assessor (individual) standard deviation across all assessors | Figure 6
Inter-trial | RMS deviation of trial measurement about session mean for all trials | Figure 4

Table 1 Outputs of a formal repeatability study

Note: Take note of the footnote on page 3 about conceptual definitions and precise mathematical definitions.

It should be noted that some studies [e.g. 6] have reported inter-therapist as including inter-session and inter-trial variability (and inter-session as including inter-trial variability). Standard deviations defined in this way will be larger than those defined in Table 1 and need to be interpreted differently.

2.3 Models of variability

How we analyse variability depends on how we conceptualise the different sources of variability and how they are related, that is, how we model the variability. Different models of variability are possible and each model is likely to lead to a different value for a given variance component. It is thus important to be very clear about what model we are using, and a diagram such as that in Figure 8 is generally the best way of describing any given model.

Figure 8 Variance components comprising measurement variability in gait analysis (subject variability; between assessor variability, $\sigma_{bass}$; within assessor variability, $\sigma_{wass}$; inter-trial variability)

The model depicted in Figure 8 works well for gait analysis but depends on the assumption that the within assessor variability arises only as a consequence of the assessor’s technique.
It is quite possible, however, that people walk a little differently on different occasions, particularly, perhaps, if the sessions are separated by a considerable interval. Using this model that will be attributed to the within assessor variability, which will therefore be a little greater than that attributable to the assessor’s technique. In order to quantify the natural variability of gait across time a different model of variability is required, which gives rise to a different experimental design. One such study is described in Section 5.1 below.

Repeatability studies

3.1 Study design

Most clinical gait analysis services employ a small number of assessors and it is entirely practical to have each assessor make measurements on all subjects the same number of times. Thus, in the study reported by Schwartz et al. [6], two patients were each seen by four assessors on three occasions and data from five trials was collected from each. This is summarised in Figure 9.

Figure 9 Experimental design used by Schwartz et al. [6]

Such a design may not be practical in larger centres employing more assessors. It requires each subject to be assessed by each assessor on a number of occasions, which can become quite a big demand on the subjects. It might be more practical to use more subjects but have each assessor only assess some of them. The most robust way to do this is to ensure that all assessors see the same number of subjects on the same number of occasions and that the allocation of subjects to assessors is conducted systematically to ensure an even mix (cyclic permutation or Latin square techniques are appropriate). Such designs are known as balanced designs and the analytical techniques described below can be used. Designs in which different assessors assess different numbers of subjects, or in which the distribution of assessors is not even, are known as unbalanced and may need a more sophisticated analysis.
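As a rough illustration of the cyclic permutation idea, the sketch below allocates each subject to a rotating subset of assessors and checks that every assessor ends up with the same number of subjects. The function names and the subject and assessor labels are hypothetical; a real study would also need to schedule the repeat occasions, which this sketch does not attempt.

```python
from collections import Counter


def cyclic_allocation(subjects, assessors, per_subject):
    """Assign each subject to `per_subject` assessors by rotating through the
    assessor list, so consecutive subjects start one assessor further along."""
    n = len(assessors)
    if not 1 <= per_subject <= n:
        raise ValueError("per_subject must be between 1 and the number of assessors")
    return {
        subject: [assessors[(i + k) % n] for k in range(per_subject)]
        for i, subject in enumerate(subjects)
    }


def is_balanced(allocation, assessors):
    """True when every assessor sees the same number of subjects."""
    counts = Counter(a for assigned in allocation.values() for a in assigned)
    return len({counts[a] for a in assessors}) == 1


# Hypothetical example: six subjects, three assessors, two assessors per subject.
assessors = ["A", "B", "C"]
subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]
allocation = cyclic_allocation(subjects, assessors, per_subject=2)
```

With this rotation the allocation is balanced whenever the number of subjects is a multiple of the number of assessors; with, say, four subjects and three assessors the same function produces an unbalanced design, which `is_balanced` reports.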
3.2 Study conduct

Such studies are performed to estimate the likely variability arising in routine clinical practice and should thus be conducted under circumstances as similar to those of routine clinical practice as possible. This should involve the use of routine equipment, techniques and methods of data processing and modelling. If the results are published then these should be clearly stated. Despite explicit steps to avoid it, repeatability studies are almost always conducted under somewhat idealised conditions and staff will also probably be especially vigilant while making assessments. Estimates of variability obtained from studies will thus almost always under-estimate the variability in routine clinical practice and this should be remembered when applying results to clinical interpretation.

One of the most common factors in this regard is the conduct of repeat assessments on the same subject over a short time-scale. Part of measurement variability arises from some uncertainty as to where to place markers to suit any particular individual’s anatomy. If repeat assessments are performed close together on the same subject by the same assessor then there is potential for that assessor to remember where he or she chose to place markers for the first trial. The potential for this will be reduced if capture sessions are conducted on different occasions. (Some markers leave a mark on the skin after removal and a minimal requirement for studies is that sufficient time should elapse that this cannot guide repeat marker placement.) On the other hand there may be some increased natural variability in data captured on different occasions, which will be attributed to within assessor variability. The study reported in section 5.1, however, suggests that this effect is probably quite small.

3.3 Representative patients or people without locomotor pathology?
Following the paragraph above, the ideal repeatability study would be conducted on subjects with the same characteristics as those routinely assessed by each particular service, i.e. on typical patients. Having said this, there are many practical advantages in using subjects without any locomotor pathology. The most obvious is that patients tend to be more subject to fatigue and this might limit the practicality of some study designs. It is also often convenient to use staff members as subjects or to recruit from their families or other contacts.

In deciding which type of subjects to use the main question is whether the measurement variability is likely to depend on the subject’s characteristics. Marker placement for the Conventional Gait Model is primarily determined by physical and functional anatomy and for many pathologies it may be that these differ little between people with or without that pathology. Conditions such as cerebral palsy commonly affect the anatomy and caution is needed in applying results from repeatability studies on other populations. Perhaps the most important factor, however, is sub-cutaneous soft tissue. There is a natural tendency to select lean subjects for repeatability studies and results can considerably under-estimate the variability of measurements on more ample patients.

Subject compliance is another issue that may result in under-estimates of variability. Subjects chosen for repeatability studies will often be explicitly or implicitly pre-selected to ensure that they are “suitable”. Given the practicalities of conducting such studies it is difficult to avoid this and there is some sense in making it an explicit part of study design. When interpreting data from less co-operative patients clinically, however, it is important to remember that measurement variability may be considerably higher than is suggested by formal repeatability studies.

3.4 Numbers

Repeatability studies require repeat measurements of the same subjects.
Given the nature of gait analysis such studies require a considerable investment of time and organisation. Studies thus need to be designed to give a reasonable level of confidence in findings as efficiently as possible. Experimentally determined variance components are subject to some error which can be specified formally by calculating confidence limits. Such calculations can, in principle, be used to inform study design, particularly with regard to the number of subjects, assessors and repetitions. Unfortunately gait analysis is so time consuming that numbers based on such calculations are rarely practical and study design is always determined by what is practically possible rather than by what is statistically appropriate. It is important to remember this when analysing results and to bear in mind that estimates of individual variance components are just that: estimates.

The primary focus of such studies is to estimate the overall measurement variability. An important secondary focus, however, is to estimate within and between assessor variability, and the variability of individual assessors, in order to direct future staff training. It will thus almost always be a requirement that all assessors involved in patient testing are included. Fortunately this is generally quite a small number. Given this, the primary determinant of the confidence in estimates of within and between assessor variability is the total number of sessions. Secondary factors generally favour designs achieving this through using a larger number of subjects (rather than more repeats on each subject). Involving more subjects is likely to protect against rogue results and improve the generalisability of any findings. The more sessions that are included per subject, the more likely these are to be performed in quick succession which, it has already been suggested, may lead to under-estimates of true clinical variability.
Capturing additional trials once markers have been placed takes very little time but, on the other hand, given that trials are duplicated for every session, they are generally already well represented in the analysis and including many trials per session contributes little. There is little point including more than 5 trials in each session. In summary, studies based on two (or at most three) assessments of each subject, using as many subjects as is practical, are probably to be preferred. Collecting large numbers of trials per session is unlikely to be beneficial.

3.5 Frequency

Clinical gait analysis is all about measurement and the quality of measurement will be a fundamental determinant of the quality of any gait analysis service. Assuring the quality of measurements should thus be viewed as a high priority and an essential component of clinical service delivery. On the other hand, conducting repeatability studies is time consuming. Some sensible compromise between these factors is required. On balance, conducting a reasonably definitive study for all staff in a clinical gait analysis service annually seems reasonable. Given that study designs generally require staff to participate equally, a part-time staff member will need to contribute a higher proportion of their time. This might seem particularly onerous, but measurement performance generally improves with regular repetition and this might suggest that participation of part-time staff in quality assurance programmes is relatively more important and that full inclusion alongside full-time staff is required.

Ensuring that new members of staff learn to make measurements reliably and consistently with other staff is an important part of their induction package. Smaller studies to ensure this should be incorporated. A minimum requirement is that new staff demonstrate within assessor variability to an acceptable level and also demonstrate consistency with at least one existing member of staff.
Benchmarks for such competency should be established in line with the findings of the service wide studies. One option here is to use at least one member of staff as a subject for all studies (or someone else who will continue to be available on an ongoing basis). This allows the performance of a new member of staff to be compared with existing data rather than requiring a completely new study. There have been no studies of the consistency of individual gait patterns over periods of months or years. It is likely that a gait pattern remains stable over such timescales but the possibility that it doesn’t should be borne in mind. The specific analytical techniques outlined here may need to be modified for such a study, which is essentially a comparison with a gold standard rather than a conventional repeatability study.

Presenting the results of repeatability studies

Given that the main aim of such repeatability studies is to quantify the overall variability in relation to clinical interpretation, and the within and between assessor variability for use in directing staff training, the methods of presenting data should reflect this.

4.1 Overplotting gait graphs

Perhaps the most informative representation of results is simply to over-plot data from different sessions in different colours on the same gait graphs. This gives a direct indication of which sessions appear consistent and which differ, and of the magnitude and direction of any differences. For example, the data in Figure 10 shows the pelvic tilt data from a study in which three different assessors assessed the same subject on two separate occasions. Data from the three assessors is plotted in different colours and all walks captured are plotted. It can be seen that the red traces are most closely bunched, suggesting that the red assessor is most consistent in making measurements.
The blue traces are more widely spaced, suggesting that the blue assessor is less consistent in placing the markers used to measure pelvic tilt. The blue traces generally overlie the red traces, however, which suggests there is reasonable consistency between these two assessors. Training might be offered to the blue assessor to improve consistency of marker placement. The green traces are spaced similarly to the red traces, suggesting that the green assessor is almost as consistent in placing markers as the red assessor. All the green traces lie above both the red and blue traces, however, suggesting that this assessor is placing markers differently to the other two assessors. Training here is required to improve consistency between the assessors.

Figure 10 Data from three different assessors each captured over two sessions.

4.2 Plotting variability across the gait cycle

Schwartz et al. [6] plotted different variance components across the gait cycle. This allows an inspection of whether that variance fluctuates across the cycle. Most joint angles show fairly similar variance across the gait cycle (as in the graphs for hip abduction and flexion in Figure 11). The one exception is hip rotation, where there is increased variability around initial contact (see Figure 11).

Figure 11 Variance components plotted across the gait cycle [6].

It should be noted that Schwartz et al. used a slightly different approach to this variance analysis in which the inter-session variance is not separated from the inter-therapist variance but is included within it (and similarly for the inter-trial and inter-session variances).

4.3 Gait Reliability Profiles (GRP)

It may be useful to summarise the outcome of repeatability studies and the GRP is one way of doing this.
The variance estimates are simply averaged across the cycle to provide single numbers for each gait parameter and these are plotted as a histogram as in Figure 12. The total height of the column represents the total measurement variability of a single measurement. This includes inter-trial variability, which is assumed to be characteristic of the subject and independent of the measurement procedures. The procedural error ($\sqrt{\sigma^2_{\text{between assessors}} + \sigma^2_{\text{within assessor}}}$) is thus calculated and plotted as a blue column within this. The individual between assessor and within assessor terms are then represented by lines (white for within and black for between).

Figure 12 Gait reliability profile for a centre

A similar graph can be plotted to represent the results for any particular assessor. In this case the between assessor term is not relevant and the profile is thus even simpler. The total column height is the total variability for that individual assessor. The red column within this represents the within assessor variability.

Figure 13 Gait reliability profile for a single assessor

Using the GRPs for the centre and assessors together makes a very powerful tool for understanding the sources of variability within a centre and planning staff training and development activities to reduce variability even further.

Published repeatability studies

5.1 Natural variability

There have been surprisingly few studies which have attempted to measure natural variability. A recent study [7] assessed variability in 5 young healthy (and relatively slim) subjects at four sessions at two hour intervals throughout one day, the next day and the same days the following week. The position of markers was marked on the skin at the end of the first day to guide marker placement on later days. In most sessions subjects were asked to walk at a comfortable speed.
On the last day the second, third and fourth sessions were conducted with subjects walking in time to a metronome.

Figure 14 Inter-session kinematic variability (SD) across intervals (series: 2 hours, within day, across day, across week, cadence controlled; parameters: Pel Tilt, Pel Ob, Pel Rot, Hip Flex, Hip Ab, Hip Rot, Kn Flex, Kn Val, Ank DF, Foot Prog)

The results show that the inter-session variability increases with the interval between sessions, but not very much, and that most gait parameters show inter-session variability of less than 1°. Hip rotation is the exception with inter-session variability of around 2°. These results suggest that natural inter-session variability in healthy young people is low and not accounting for it explicitly will have little effect on calculation of within and between assessor variability. It also suggests that capturing data at sessions several days apart is likely to have little effect on calculation of inter-session variability.

5.2 Measurement Repeatability

By contrast there are now several studies which have quantified measurement repeatability in gait analysis. These have been reviewed by McGinley et al. [3] and the results are presented in Figure 15. The data shows that variability differs from joint to joint and across planes. Most joints show variability of 4° or less with the exception of hip rotation. Whilst six of the studies show hip rotation variability of greater than 6°, five show variability of around 4° or less, suggesting some difference between the variability of different practices at different centres. Using the criteria established above in section 2.1 this suggests that most measurements fall in the “reasonable” category (between 2° and 5°) and thus need to be considered when interpreting data.
Particular care is required in relation to hip rotation to maintain variability within these limits and there is evidence of “concerning” levels of variability from some studies.

Figure 15 Published data on measurement repeatability in gait analysis [1]

Some analytical techniques to avoid

The analysis of variance components is a part of mainstream statistics and the most appropriate way to quantify variability. The associated standard deviations are reported in the same units as the measurement and their likely impact on clinical interpretation is, generally, immediately obvious. Previous studies, however, have introduced a range of other measures of variability. These have been developed in particular contexts and their application to gait analysis is not always appropriate. Several of them have the potential to lead to highly misleading conclusions if accepted uncritically. Most are expressed as dimensionless indices and it can be exceedingly difficult to determine how these should be taken into account during clinical data interpretation.

6.1 Coefficient of variation (CV)

The coefficient of variation (CV) is the standard deviation of a measurement divided by its mean value (and often expressed as a percentage). Its use in gait analysis can probably be traced to the work of David Winter [e.g. 8]. Technically its use should be reserved for ratio data and it should thus not be applied to joint angles arising from gait analysis. A particular issue is that incorporation of the mean values can result in similar underlying variability giving rise to very different coefficients of variation. Thus if we summarise the data in Figure 15 as demonstrating variability in the sagittal plane as being around 4° at the pelvis, hip and knee and 2° at the ankle, and take average values of these parameters for healthy walking, we can calculate the CV (see Table 2).
The results show that even though the variability is the same for the pelvis, knee and hip, reporting this in terms of the CV suggests that the knee measurements are far more repeatable. Even more misleading is the CV for the ankle. In terms of absolute angles this is far more repeatable than the other three angles, but the CV suggests extremely high variability.

Variable             Variability   Mean   Coefficient of variation
Pelvic tilt          4°            13°    31%
Hip flexion          4°            18°    22%
Knee flexion         4°            24°    17%
Ankle dorsiflexion   2°            2°     100%

Table 2 Calculation of coefficient of variation

6.2 Coefficient of Multiple Correlation (CMC)

The CMC was introduced into repeatability studies of gait analysis by Kadaba et al. [9]. It is the square root of the Coefficient of Multiple Determination (CMD) which, like the CV, is also a ratio. In this case the fluctuation of the parameter across the gait cycle is also considered as a source of variation, and the total variance ($\sigma^2_{\text{total across cycles}}$ for a given subject across the gait cycles) is considered to be comprised of within and between cycle sources.

$$\mathrm{CMD} = 1 - \frac{\sigma^2_{\text{between cycles}}}{\sigma^2_{\text{total across cycles}}} = \frac{\sigma^2_{\text{within cycles}}}{\sigma^2_{\text{total across cycles}}}$$

The smaller the between cycle (measurement) variability, the closer this index is to 1. This is superficially attractive but, again, the scaling can give rise to misleading results. Taking the data used in Table 1 and using the within cycle variability for healthy walking we can estimate the associated CMCs. This time pelvic tilt, which has a small range of movement over the gait cycle (and hence for which $\sigma^2_{\text{total}}$ is small), appears to be particularly variable. The other three angles, which all have a considerably greater range of movement (and hence variability) over the gait cycle, appear much more reliable.
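The scaling effect can be sketched numerically. The snippet below assumes the simplest reading of the definition, in which the total variance is just the sum of the within cycle and between cycle variances. Published estimates may rest on a different estimator (for example one with degrees-of-freedom corrections), so the figures it prints are not expected to match tabulated values exactly. The function name and the example SDs are illustrative.

```python
import math


def cmc_from_sds(sd_between_cycles, sd_within_cycle):
    """Approximate CMC from two standard deviations (degrees).

    Assumption: total variance = within cycle variance + between cycle variance.
    """
    var_between = sd_between_cycles ** 2
    var_within = sd_within_cycle ** 2
    cmd = var_within / (var_within + var_between)  # coefficient of multiple determination
    return math.sqrt(cmd)


# Same between cycle (measurement) variability of 4 deg in both cases;
# only the range of movement over the gait cycle differs.
small_range = cmc_from_sds(sd_between_cycles=4.0, sd_within_cycle=0.6)
large_range = cmc_from_sds(sd_between_cycles=4.0, sd_within_cycle=16.0)

print(f"small range of movement: CMC = {small_range:.2f}")
print(f"large range of movement: CMC = {large_range:.2f}")
```

With identical measurement variability, the index is low for a curve that barely moves over the gait cycle and close to 1 for a curve with a large excursion.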
Indeed, CMCs of 0.98 and 0.99 appear exceedingly high (particularly if erroneously compared with typical values of intra-class correlation coefficients described below) despite the fact that the underlying variability of around 4° is only a little less than that which should be causing concern during clinical interpretation. Again the generally low variability of the ankle angles compared to hip and knee data is masked by the choice of statistic. All these features can be recognised in the data originally presented by Kadaba et al. [9] and in subsequent studies using similar techniques.

The CMC can also be affected by differences between healthy subjects and various patient groups. Thus many children with cerebral palsy have an increased range of pelvic tilt throughout the gait cycle. A CMC calculated for such subjects will thus be higher than that for healthy subjects even if there is no difference in the measurement variability. Similarly a reduced range of knee movement will result in reduced CMC values.

Variable             Variability (between cycles, SD)   Variability (within cycle, SD)   CMC (estimate)
Pelvic tilt          4°                                 0.6°                             0.39
Hip flexion          4°                                 16°                              0.98
Knee flexion         4°                                 17°                              0.99
Ankle dorsiflexion   2°                                 8°                               0.98

Table 3 Calculation of CMC

6.3 Intra-class correlation coefficients (ICC)

The ICC [10] is essentially another variance ratio (although more accurately it is a range of variance ratios associated with subtly different models of variance). It was developed for studies in psychology to assess how variability between and within assessors affects the ability of a measure to discriminate subjects in a sample from a given population. In this case the variances used to calculate the ICC come from the between subject and total variability across the sample:

$$\mathrm{ICC} = \frac{\sigma^2_{\text{between subjects}}}{\sigma^2_{\text{total across sample}}} = 1 - \frac{\sigma^2_{\text{within subjects}}}{\sigma^2_{\text{total across sample}}}$$

Like the CMC, this gives a value which increases towards one as variability reduces to zero.
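To illustrate how this ratio behaves, the sketch below evaluates it for two samples with the same measurement variability. All numbers are purely hypothetical and are not taken from any study. The total variance is assumed to be the sum of the between subject and within subject variances, and the function name is illustrative.

```python
def icc_from_sds(sd_between_subjects, sd_within_subjects):
    """Variance-ratio form of the ICC from two standard deviations (degrees).

    Assumption: total variance = between subject + within subject variance.
    """
    var_between = sd_between_subjects ** 2
    var_within = sd_within_subjects ** 2
    return var_between / (var_between + var_within)


measurement_sd = 4.0  # hypothetical within subject (measurement) SD, same in both samples

# Hypothetical between subject SDs for a homogeneous and a heterogeneous sample.
icc_homogeneous = icc_from_sds(sd_between_subjects=5.0, sd_within_subjects=measurement_sd)
icc_heterogeneous = icc_from_sds(sd_between_subjects=15.0, sd_within_subjects=measurement_sd)

print(f"homogeneous sample:   ICC = {icc_homogeneous:.2f}")
print(f"heterogeneous sample: ICC = {icc_heterogeneous:.2f}")
```

The measurement variability is identical in both cases; only the spread between subjects changes, yet the ICC differs substantially.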
Portney and Watkins [11] suggested that values of less than 0.75 might be taken to represent “poor to moderate” repeatability, and values above 0.75 “good” repeatability. They did, however, qualify this by acknowledging that this attribution is essentially arbitrary. As mentioned above, the ICC was developed for applications in psychology where measures can be highly subjective and subject to considerable inter-assessor variability. It does not make sense to apply these categories to biomechanical measurements, which we would expect to be much more reliable.

As with the CV and the CMC, the values of the ICC will not, in general, reflect the variance arising from different measurement sources. Given that the ICC is dependent on the variability between subjects, these values will depend on the characteristics of the population sampled as well as the variability in the data, and it is not possible to provide a simple example as in Tables 1 and 2 above. It is particularly important to remember, however, that the ICC is essentially a measure of the ability of a measurement to discriminate between individuals within a given population. It makes absolutely no sense to calculate the ICC on a sample from one population and apply the results to a different population. Samples from a population in which there is considerable between subject variability (e.g. a spectrum of children with CP) would be expected to show higher ICCs than those from a population in which there is low between subject variability (e.g. people without locomotor pathology), not because the measurement variability is any different but because the between subject variability is different.

6.4 Ad Hoc methods

The study of Noonan et al. [1] caused considerable consternation within the gait analysis community when the results were first announced [12, 13].
This was at least in part because they devised their own methods to analyse their repeatability studies, which had little statistical justification and included a measure they called the discordance index. This index focussed on the range of measurements rather than the standard deviation and tended to exaggerate the variability within measurements. Whilst the gait analysis community is generally correct to point out that these measures tend to exaggerate variability, it is also salient to remember that a single standard deviation tends to under-represent the variability. For normally distributed data nearly a third (about 32%) of measurements will be more than one standard deviation away from the mean value.

Accuracy studies and Benchmarking

7.1 Accuracy studies

Repeatability studies would not be necessary if we could assess the accuracy of measurements directly. If we had some alternative and better method of making measurements against which we could compare performance, then we could simply do this for assessors individually and could have a much simpler process than that described above. This is not possible for two reasons: one is practical and the other is theoretical. The practical reason is that, at present, there is no alternative method of making these measurements more accurately that can be used as a “gold standard”. The theoretical reason is that the quantities we are measuring are not actually defined precisely. Thus we all understand that knee flexion is the relative orientation of the long axis of the tibia to that of the femur in the sagittal plane. The bones, however, are complex shapes and there is no consensus on exactly how the long axes or the sagittal plane should be defined. The two reasons are inter-related, as there is little point defining these quantities more accurately unless there is technology available to measure them to that level of precision.
There thus seems little prospect of the introduction of accuracy studies in the foreseeable future.

7.2 Benchmarking

Whilst accuracy studies still appear a long way off, it is still desirable for measurements made by different clinical gait analysis services to be consistent. For many years there has been an assumption that differences in technique between centres are permissible as long as data is interpreted against reference normative data captured by each service using its own clinical protocols. There are few other areas in measurement where this would be tolerated. We don’t expect to measure blood pressure, for example, and then compare this with locally collected normative data because each nurse makes the measurements in a different way. To be taken seriously by other clinicians it is essential that the gait analysis community develops mechanisms to ensure the compatibility of measurements between different services.

One potential mechanism is to adopt the techniques of earlier studies [1, 2] and send a number of subjects around different laboratories and analyse the results using techniques very similar to those described above for use within individual services. This is a considerable undertaking requiring considerable organisation and some cost. Whilst such studies might be possible every so often, they do not provide a viable mechanism for ongoing validation of measurement procedures.

An alternative is to use benchmarking of studies from individual centres against nationally or internationally recognised norms. Natural variability between patients prevents this from being useful for individual subjects, but means and standard deviations calculated from healthy subjects should be comparable. Typical variability within the healthy population has a standard deviation of between 3° and 7° depending on the joint kinematic being measured. A sample of 16 subjects would result in a standard error of the mean of roughly 1° to 2° (SD/√N, which gives 0.75° to 1.75°).
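The standard error arithmetic can be checked with a short sketch. It uses the standard formula SD/√N with the population SDs and sample size quoted in the text; the function and variable names are illustrative.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of a sample mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)


n_subjects = 16
se_low = standard_error_of_mean(3.0, n_subjects)   # least variable joint angles
se_high = standard_error_of_mean(7.0, n_subjects)  # most variable joint angles

print(f"standard error range: {se_low:.2f} to {se_high:.2f} degrees")
# Two standard errors, the level the text treats as strong evidence.
print(f"two standard errors:  {2 * se_low:.2f} to {2 * se_high:.2f} degrees")
```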
Differences greater than this between centres would suggest systematic differences in the way measurements are being made (and differences of more than two standard errors would be strong evidence of this).

References

1. Noonan, K., et al., Inter-observer variability of gait analysis in patients with cerebral palsy. Journal of Pediatric Orthopaedics, 2003: p. 279-287.
2. Gorton, G.E., 3rd, D.A. Hebert, and M.E. Gannotti, Assessment of the kinematic variability among 12 motion analysis laboratories. Gait Posture, 2009(3): p. 398-402.
3. McGinley, J.L., et al., The reliability of three-dimensional kinematic gait measurements: a systematic review. Gait Posture, 2009(3): p. 360-9.
4. Fosang, A.L., et al., Measures of muscle and joint performance in the lower limb of children with cerebral palsy. Dev Med Child Neurol, 2003(10): p. 664-70.
5. McDowell, B.C., et al., The variability of goniometric measurements in ambulatory children with spastic cerebral palsy. Gait Posture, 2000(2): p. 114-21.
6. Schwartz, M.H., J.P. Trost, and R.A. Wervey, Measurement and management of errors in quantitative gait data. Gait Posture, 2004(2): p. 196-203.
7. McGinley, J., et al., Variability of walking in able bodied subjects. Clinical Biomechanics, 2010. [Submitted]
8. Winter, D., The biomechanics and motor control of human gait: Normal, Elderly and Pathological. 2nd ed. 1991, Waterloo: Waterloo Biomechanics.
9. Kadaba, M.P., et al., Repeatability of kinematic, kinetic, and electromyographic data in normal adult gait. J Orthop Res, 1989(6): p. 849-60.
10. Shrout, P.E. and J.L. Fleiss, Intra-class correlations: uses in assessing rater reliability. Psychological Bulletin, 1979: p. 420-8.
11. Portney, L.G. and M.P. Watkins, Foundations of clinical research: applications to practice. 2nd ed. 2000, Upper Saddle River, NJ: Prentice-Hall.
12. Gage, J., Con: Interobserver variability of gait analysis. Journal of Pediatric Orthopaedics, 2003: p. 290-291.
13. Wright, J., Pro: Interobserver variability of gait analysis. Journal of Pediatric Orthopaedics, 2003: p. 288-289.
14. Tirosh, O., R. Baker, and J. McGinley, GaitaBase: Web-based repository system for gait analysis. Comput Biol Med, 2010(2): p. 201-207.