Reliable versus Valid (Beware When You Compare)

 

One of the biggest questions we run into with measurement tools and equipment is whether the item we are using is reliable, valid, both, or neither.

Reliable means the piece of equipment you are using is consistent in its measurements.

For example, if someone jumps the same height three times in a row on a jump mat and the mat registers the same score all three times, the mat is reliable.

Valid means the piece of equipment you are using is measuring what you are actually trying to have it measure.

For example, if someone jumps 30 inches three times in a row on a jump mat and the mat measures 30 inches all three times, the mat is valid.

Image 1

Reliable Does Not Mean Valid

Using the above jump mat example, we can differentiate between validity and reliability. Reliability is probably the most important aspect of any assessment tool. Reliability allows us to compare results (pre and post tests), measure improvement, and see trends. For example, when we use the jump mat, it may not be the most valid tool. The jump height we are recording may be erroneously high and the results may not actually reflect the athlete’s true vertical jump height.

Image 2

However, if the machine reads too high by the same amount every time, we still have a metric that allows us to compare results. The number we record is still a representation of the vertical jump height assessed, but it is not a direct measure (the term measurement is only used when recording true values). So, we can still use this representation of the vertical jump height to see whether progress is being made and whether our training is causing positive changes.
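A minimal Python sketch of why a constant error still lets us track progress. The 3 inch offset and the pre and post test heights are hypothetical numbers chosen for illustration, not data from a real jump mat.

```python
# Hypothetical: a jump mat that always reads 3 inches too high.
MAT_OFFSET_IN = 3.0


def recorded_height(true_height_in, offset_in=MAT_OFFSET_IN):
    """Return what a consistently biased mat would display (inches)."""
    return true_height_in + offset_in


# Assumed true jump heights before and after a training block (inches).
true_pre, true_post = 30.0, 32.0

recorded_pre = recorded_height(true_pre)
recorded_post = recorded_height(true_post)

true_change = true_post - true_pre
recorded_change = recorded_post - recorded_pre

print(f"Recorded heights: {recorded_pre} -> {recorded_post} inches")
print(f"True change: {true_change} in, recorded change: {recorded_change} in")
```

Both recorded heights are wrong, but the constant offset cancels when one is subtracted from the other, so the recorded change matches the true change.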

Validity

Validity is the gold standard of what we want when it comes to assessment. Validity means your measurements are true representations of what is actually occurring (a valid tool will always be reliable as well). Valid measurements are not up for debate and allow for the most accurate form of comparison. Regardless of the tool, piece of equipment, or modality, if a measurement tool is valid, its results can be compared with those from other valid tools measuring the same quality. This is why validity is extremely important in the scientific world: it allows for accurate comparison and consistent analysis.

Here is an example of reliable but not valid. We have someone jump three times on the jump mat. Each time the person truly jumped 30 inches, but the jump mat recorded 33 inches. This means the value the jump mat is giving you is not a valid measure (it did not register the actual jump height of 30 inches). However, it is reliable because all three recordings were consistent. All three true jump heights were 30 inches, and all three jump mat readings were 33 inches. If the jump mat were valid, all three recorded jump heights would be 30 inches.

Not Worth It

If the machine is not reliable and the numbers are off in an inconsistent way, sometimes high and sometimes low, then we will never be able to accurately assess what is going on. The numbers we obtain are meaningless and can actually misguide our training.
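One rough way to put numbers on these two ideas is to look at the average error of a set of readings (how far off the tool is, which relates to validity) and their spread (how consistent the tool is, which relates to reliability). The Python sketch below does this. The 33 inch readings come from the jump mat example above; the scattered readings and the function name are made up for illustration.

```python
from statistics import mean, pstdev


def describe_readings(readings_in, true_height_in):
    """Return (bias, spread) for a list of readings in inches.

    bias   = average reading minus the true height (how far off the tool is)
    spread = population standard deviation (how inconsistent the tool is)
    """
    bias = mean(readings_in) - true_height_in
    spread = pstdev(readings_in)
    return bias, spread


TRUE_HEIGHT_IN = 30

# Reliable but not valid: always 3 inches high.
biased_bias, biased_spread = describe_readings([33, 33, 33], TRUE_HEIGHT_IN)

# Not reliable (hypothetical readings): sometimes low, sometimes high.
noisy_bias, noisy_spread = describe_readings([27, 33, 30], TRUE_HEIGHT_IN)

print(f"Consistent mat: bias = {biased_bias} in, spread = {biased_spread} in")
print(f"Scattered mat:  bias = {noisy_bias} in, spread = {noisy_spread:.2f} in")
```

The consistent mat has a spread of zero, so its 3 inch error can be lived with when tracking change over time. The scattered mat averages out to the right height in this made-up case, yet no single reading from it can be trusted.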

 

Comparing Reliable Results

Comparing reliable results is much trickier. The issue is that two reliable machines do not necessarily share the same form of error. Let’s go back to the jump mat example. Say you have two brands of jump mat. In this hypothetical situation, the athlete is somehow able to jump on both mats at the same time.

The athlete performs three jumps, and the true height of each vertical jump is 30 inches. Jump mat A registers 28 inches and jump mat B registers 33 inches all three times. Both mats are reliable (their readings are consistent), but they cannot be compared to one another because one is consistently low (28 inches) and the other is consistently high (33 inches). This becomes a big issue: if coaches want to compare programs and results, using different reliable forms of measurement doesn’t ensure that the numbers can be compared. If done improperly, you might think an athlete increased their vert by 5 inches just by switching from jump mat A to jump mat B.
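The two-mat scenario can be worked through in a few lines of Python. The offsets are taken from the example above (mat A reads 2 inches low, mat B reads 3 inches high). The correction step at the end is an added assumption: it only works if each mat's offset is already known, for instance from checking it against a trusted reference.

```python
TRUE_HEIGHT_IN = 30
MAT_A_OFFSET_IN = -2   # mat A shows 28 when the true jump is 30
MAT_B_OFFSET_IN = 3    # mat B shows 33 when the true jump is 30


def reading(true_height_in, offset_in):
    """Return what a mat with a constant offset would display (inches)."""
    return true_height_in + offset_in


# Pre test on mat A, post test on mat B, with no real change in the athlete.
pre_on_a = reading(TRUE_HEIGHT_IN, MAT_A_OFFSET_IN)
post_on_b = reading(TRUE_HEIGHT_IN, MAT_B_OFFSET_IN)

# Naive comparison across mats: the "gain" is only the gap between offsets.
apparent_gain = post_on_b - pre_on_a

# Assumption: if both offsets are known, remove them before comparing.
corrected_gain = (post_on_b - MAT_B_OFFSET_IN) - (pre_on_a - MAT_A_OFFSET_IN)

print(f"Apparent gain across mats: {apparent_gain} inches")
print(f"Gain after removing known offsets: {corrected_gain} inches")
```

The apparent 5 inch gain is entirely an artifact of switching mats. Staying on one mat for both tests avoids the problem without needing to know either offset.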

 

Both graphs show reliable results, but they do not “miss” in the same way, so the results are not comparable (Image 3 on left, Image 4 on right).

 

Application

Understanding how reliability and validity work will help coaches manage their metrics a little better. If you have pieces of equipment such as velocity measurement tools, it might be wise to get their calibration checked once a year. If you have jump mats, you might want to make sure they are the same brand and from the same year. These small steps can help make sure that your pre and post test physical assessments are more accurate and reliable.

 

Image reference links
  • Image 1: https://www.unthsc.edu/center-for-innovative-learning/assessment-reliability-and-validity/
  • Image 2: https://i.ytimg.com/vi/D8UFhlOoszc/maxresdefault.jpg (YouTube link: https://www.youtube.com/watch?v=D8UFhlOoszc)
  • Image 3: http://www.professorpok.com/2013/05/ahhh-reliability-and-validity.html
  • Image 4