
RELIABILITY AND VALIDITY

Evaluation instruments must also be reliable and valid.  Before administering an instrument, one must know if it has been assessed for reliability and validity.  If it has not, then it should not be used because the quality of the data might be questionable.

Reliability

Reliability estimates the consistency or repeatability of what one is measuring.  For example, a reliable instrument that measures change in food safety knowledge will measure knowledge the same way each time the instrument is administered under the same conditions with the same people.  There are two ways that reliability is often estimated -- test/retest and internal consistency.

Test/Retest. Simply put, a subject should get the same score on the first administration of the instrument as on the second administration of the same instrument. Thus one must:

  • Administer the instrument at two separate times for each subject.
  • Compute the correlation between the two separate administrations.
  • Assume there is no change in underlying conditions between the first and second administrations of the instrument.
  • If the scores from the two administrations are not strongly correlated, then the instrument needs to be revised because it is not reliable.
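The test/retest steps above can be sketched in a few lines of Python. This is only an illustration: the scores are hypothetical, the function name pearson_r is an assumption made for this example, and the module does not say which correlation statistic to use (the Pearson correlation is assumed here).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary within each administration")
    return cov / sqrt(var_x * var_y)


# Hypothetical knowledge scores for the same five subjects,
# listed in the same subject order for both administrations.
first_admin = [7, 5, 9, 6, 8]
second_admin = [8, 5, 9, 6, 7]

test_retest_r = pearson_r(first_admin, second_admin)
print(f"test/retest correlation: {test_retest_r:.2f}")
```

A value close to 1 suggests the instrument scores subjects consistently across the two administrations; how high is high enough is a judgment the module does not specify.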

Internal Consistency. Internal consistency estimates reliability by grouping items that measure the same concept.  For example, one could write two sets of questions that measure knowledge about handwashing.  After collecting the responses, a correlation is run between those two groups of questions to determine if handwashing was reliably measured.  Again, if the correlation is weak, then the instrument needs to be revised.
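As a rough sketch of the internal consistency check described above, the following Python totals each respondent's score on two sets of handwashing questions and correlates the two totals. The responses, the split into items 1-3 and items 4-6, and the helper names are all assumptions made for illustration, and the Pearson correlation is assumed as the statistic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical responses from ONE administration: one row per respondent,
# six handwashing items scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 1],
]

# Total each respondent's score on the two sets of questions.
set_a_totals = [sum(row[:3]) for row in responses]  # items 1-3
set_b_totals = [sum(row[3:]) for row in responses]  # items 4-6

internal_consistency_r = pearson_r(set_a_totals, set_b_totals)
print(f"correlation between question sets: {internal_consistency_r:.2f}")
```

Unlike the test/retest estimate, everything here comes from a single administration of the instrument.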

The primary difference between test/retest and internal consistency estimates of reliability is that test/retest involves two administrations of the same evaluation instrument, whereas assessing internal consistency involves only one administration of the instrument.

Validity

There are four types of validity -- conclusion validity, internal validity, construct validity, and external validity.  For the scope of this module, the focus will only be on internal validity.  Internal validity is an assessment of the relationship between the intervention and the outcome observed.  For example, does a strict handwashing policy (intervention) cause the frequency of handwashing to increase (outcome)?

Threats To Internal Validity. There are three main categories of threats to internal validity -- single-group, multiple-group, and social interaction threats.  For the scope of this module, the focus will be on single-group threats because evaluating a single group is the most common way that Extension programs are evaluated.  Single-group threats can occur when only one group that participates in an intervention is evaluated (such as the foodservice workers in one restaurant that implements a stricter handwashing policy). Examples of key threats are summarized below.

  • A History Threat occurs when a historical event causes the outcome rather than the intervention. In the handwashing example, the handwashing policy did not cause an increase in handwashing, but rather a recent foodborne illness outbreak due to the Hepatitis A virus in a nearby restaurant made the workers more conscious of the need to wash their hands.
  • A Testing Threat is when taking a pre-test affects how a group responds on a post-test.  If one measured handwashing prior to implementing the new handwashing policy, workers might become forewarned that there was about to be an emphasis on handwashing.  They might then wash their hands more often simply as a result of taking the pretest and not because there was a new handwashing policy. 
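  • An Instrumentation Threat occurs when the observed change is due to the instrument, or to the way it was administered, rather than to the intervention.  For example, an apparent increase in handwashing could result from the pretest being administered differently from the post-test.
  • A Mortality Threat occurs when subjects drop out of the study, and this leads to an inflated measure of the effect.  For example, if most foodservice workers quit as a result of a stricter handwashing policy, only the more serious workers (those who would naturally wash their hands more often) would remain, and handwashing would appear to increase even if the policy itself had no effect.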

These threats are greatly reduced by including a control group that is comparable to the experimental group (here, the foodservice workers in the establishment that has the stricter handwashing policy) but does not receive the intervention.
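One way to see why a control group helps is to compare the change in each group, as in the minimal sketch below. All counts are hypothetical (handwashes per shift for four workers in each establishment), and the function name mean_change is an assumption for this example. A change that shows up in both groups, such as a reaction to a nearby outbreak, is subtracted out.

```python
from statistics import mean


def mean_change(pre, post):
    """Difference between the post-test mean and the pretest mean."""
    return mean(post) - mean(pre)


# Hypothetical handwashes per shift, before and after the policy period.
policy_pre = [4, 5, 3, 6]    # establishment with the stricter policy
policy_post = [8, 9, 7, 8]
control_pre = [4, 4, 5, 3]   # comparable establishment, no new policy
control_post = [6, 5, 6, 5]

policy_change = mean_change(policy_pre, policy_post)
control_change = mean_change(control_pre, control_post)

# Change beyond what the control group also experienced.
net_change = policy_change - control_change
print(f"policy group change:  {policy_change}")
print(f"control group change: {control_change}")
print(f"net change:           {net_change}")
```

In this made-up data both groups wash their hands more often at the second measurement, so only the difference between the two changes can reasonably be linked to the policy.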

Test Your Knowledge

1.  What is a reliable evaluation instrument?

2.  What are two ways used to estimate the reliability of an evaluation instrument?  

3.  What is internal validity?

4.  What are the four threats to internal validity?  

5.  How does one control for the threats to validity?  

ANSWER KEY