
QUANTITATIVE DATA ANALYSIS

Quantitative data analysis is the process of presenting and interpreting numerical data, using descriptive and inferential statistics.  Descriptive statistics include measures of central tendency (the mean, median, and mode) and measures of variability about the average (the range, standard deviation, and variance). These give one a picture of the data collected.

Inferential statistics are the outcomes of statistical tests. They allow deductions to be made from the data collected, hypotheses to be tested, and findings from a sample to be related to the population.  For the purpose of this module, we will focus only on descriptive statistics, such as:

  • Frequencies
  • Percentages
  • Measures of central tendency (mean, mode, median)
  • Measures of variability (range, standard deviation, variance)

Frequencies

Frequencies (or numerical counts) tell one how many times something occurred or how many responses fit into a particular category. Examples of frequencies are:

  • Thirty-two (32) of 37 HACCP training participants were over 55 years of age.
  • Twenty-seven (27) of 30 school site managers rated the HACCP training as very useful in helping them to better implement their HACCP Plan.

In some cases, frequencies are all that is needed or wanted. In other cases, they serve as a base for other calculations. One such calculation is the percentage.
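
As a minimal sketch of how frequencies can be tallied, the Python example below counts how many responses fall into each category. The list of ratings is hypothetical: only the 27 "very useful" responses out of 30 come from the example above, and the split of the remaining three responses is assumed for illustration.

```python
from collections import Counter

# Hypothetical ratings from 30 school site managers.
# Only the 27 "very useful" responses come from the example in the text;
# the split of the remaining 3 responses is assumed for illustration.
ratings = ["very useful"] * 27 + ["somewhat useful"] * 2 + ["not useful"] * 1

# Counter tallies how many times each category occurs.
frequencies = Counter(ratings)

for category, count in frequencies.most_common():
    print(f"{category}: {count} of {len(ratings)}")
```

The counts produced this way can then serve as the base for other calculations.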

Percentages

The percentage expresses information as a proportion of a whole. Examples include:

  • Eighty-five percent (85%) of the HACCP training participants were over 55 years of age.
  • Ninety percent (90%) of the site managers rated the content of the training to be very useful.
  • Seventy-four percent (74%) of the site managers in County A participated in one of the HACCP trainings conducted in 2006.

Percentages tend to be easy to interpret. For example, it is more understandable to say that 40 percent (40%) of the respondents adopted safe food handling practices than to say that 96 of 240 people adopted safe food handling practices. Percentages are also a good way to show relationships and comparisons — either between categories of respondents or between categories of responses. For example:

  • Ten percent (10%) of NC schools had a HACCP Plan in 2005 as compared to 2006 when 100% reported having a HACCP Plan in place. (Comparing 2005 respondents to 2006 respondents).
  • While 76 percent of school site managers attended Extension-sponsored HACCP trainings, only 23 percent of the Child Nutrition Directors conducted their own training with site managers. (Comparing responses from the same respondents.)

When reporting a percentage, a common practice is to indicate the number of cases from which the percentage is calculated — either the "N" (the total group) or the "n" (the subsample or subgrouping).   For example, report "90% of the 35 participants passed the food safety certification examination."  Do not just report "90% passed the food safety certification examination."
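
The calculation and the reporting practice described above can be sketched in Python as follows. The function name `percent_with_n` and the wording of the report string are assumptions made for this illustration, not part of the module.

```python
def percent_with_n(count, total):
    """Return a percentage together with the number of cases behind it.

    `percent_with_n` is a hypothetical helper written for this example.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of cases")
    percentage = 100 * count / total
    # Round to a whole percent and keep the underlying counts visible.
    return f"{percentage:.0f}% ({count} of {total})"


# 96 of 240 respondents adopted safe food handling practices.
print(percent_with_n(96, 240))
# 27 of 30 site managers rated the training as very useful.
print(percent_with_n(27, 30))
```

Keeping the count and the total next to the percentage lets a reader judge how much weight the figure can bear.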

Measures of Central Tendency

Measures of central tendency allow one to identify common characteristics of a sample. The most commonly used measures are the mean, the mode, and the median.

Mean. The mean (or average) is commonly used to report findings. All answers or scores are totaled, and the total is divided by the number of answers or scores. For example, to get the mean class score for a food safety certification exam, sum all scores and divide by the total number of exam takers.

The mean can also be used to summarize findings from rating scales. For example, "not important, slightly important, fairly important, very important" can be assigned 1, 2, 3, and 4, respectively. The mean rating for each item is calculated by multiplying the number of responses in each category by its rating value (1, 2, 3, or 4), summing these products to get a total, and dividing by the total number of responses for that item.
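
The rating-scale calculation can be sketched in Python as shown below. The response counts are hypothetical and chosen only to illustrate the arithmetic; the rating values 1 through 4 follow the scale described above.

```python
# Hypothetical number of responses for one survey item, keyed by rating value:
# 1 = not important, 2 = slightly important,
# 3 = fairly important, 4 = very important.
responses = {1: 2, 2: 5, 3: 8, 4: 5}

# Multiply each rating value by its number of responses and sum the products.
weighted_total = sum(rating * count for rating, count in responses.items())

# Divide by the total number of responses for the item.
total_responses = sum(responses.values())
mean_rating = weighted_total / total_responses

print(f"Mean rating: {mean_rating:.1f} (n = {total_responses})")
```

With these assumed counts, the weighted total is 56 across 20 responses, giving a mean rating of 2.8.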

A disadvantage of reporting the mean is that it might give undue weight to data at one end or the other of the distribution. For example, if one reports the average number of school cafeterias in six counties, with the number of cafeterias being 5, 9, 9, 11, 13, and 37, the average number of school cafeterias would be 14. Fourteen is larger than the number of school cafeterias in all but one county.  Therefore, it is best to report the range (5 to 37) along with the mean.  The range is described later, with the other measures of variability.
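
The cafeteria example can be checked with a short Python sketch. The data are the six county values given above; the variable names are chosen only for this illustration.

```python
from statistics import mean

# Number of school cafeterias in each of the six counties (from the text).
cafeterias = [5, 9, 9, 11, 13, 37]

average = mean(cafeterias)
lowest, highest = min(cafeterias), max(cafeterias)

# Count how many counties have fewer cafeterias than the mean.
below_average = sum(1 for count in cafeterias if count < average)

print(f"Mean: {average} (range {lowest} to {highest})")
print(f"{below_average} of {len(cafeterias)} counties fall below the mean")
```

Five of the six counties fall below the mean of 14, which is why the mean alone gives a misleading picture here.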

Mode. The mode is the most commonly occurring answer or value. For example, in a study involving foodservice managers, each is asked to report how many foodservice workers they have. If more managers report ten workers than any other number, then ten is the modal number of foodservice workers in the group. The mode is important only when a large number of values is available.
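
A minimal Python sketch of finding the mode follows. The staffing figures are hypothetical, and `multimode` (from the standard `statistics` module, Python 3.8 or later) is used so that ties are reported rather than hidden.

```python
from statistics import multimode

# Hypothetical number of foodservice workers reported by seven managers.
workers = [10, 8, 10, 12, 10, 9, 10]

# multimode returns every value that occurs most often,
# so a tie between two values yields a list of two modes.
modes = multimode(workers)

print(f"Most common staff size(s): {modes}")
```

With so few values, as here, the mode carries little weight; it becomes more informative as the number of responses grows.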

Median. The median is the middle value. It is the midpoint where half of the values fall below and half are above. Like the mode, the median is not affected by extreme values. To calculate the median, list the data in order from one extreme to the other. Count halfway through the list of numbers to find the median value. When the list contains an even number of values, so that two numbers share the halfway point, add the two middle numbers and divide by 2 to get the median.
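
The steps above can be sketched in Python as follows. The function is written out by hand for illustration (Python's standard `statistics.median` gives the same result), and the data reuse the cafeteria counts from the discussion of the mean.

```python
def median(values):
    """Return the middle value, averaging the two middle values if needed."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)       # list the data from lowest to highest
    middle = len(ordered) // 2     # count halfway through the list
    if len(ordered) % 2 == 1:
        return ordered[middle]     # odd count: a single middle value
    # Even count: add the two middle values and divide by 2.
    return (ordered[middle - 1] + ordered[middle]) / 2


cafeterias = [5, 9, 9, 11, 13, 37]
print(median(cafeterias))          # the two middle values are 9 and 11
print(median(cafeterias[:-1]))     # without the extreme value of 37
```

For the six counties the median is 10, compared with a mean of 14, which shows how little the extreme value of 37 affects the median.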

It is best to calculate all three measures and then decide which provides the most meaning.

Measures of Variability

Measures of variability express the spread or variation in responses. As indicated earlier, the mean might be skewed by extreme values at either end of the distribution. For example, one high value can make the mean artificially high, or one extremely low value will result in an overall low mean. Looking at variability often provides a better understanding of the results.

Range. The range is the simplest measure of variability. It uses the highest and lowest values and is usually reported together with the mean to show the spread of responses or scores. The range can be expressed in two ways: by the highest and lowest values, or by a single number representing the difference between them. For example, one can report that the mean food safety certification score was 85 with a range of 54 to 98, or that the mean score was 85 with a range of 44 points.

Standard deviation. The standard deviation measures the degree to which individual values vary from the mean. It can be thought of as the typical distance of an individual score from the mean. A high standard deviation means that the responses vary greatly from the mean. A low standard deviation indicates that the responses are close to the mean. When all the answers are identical, the standard deviation is zero.
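
Both ways of expressing the range can be sketched in Python. The individual scores below are hypothetical; they were chosen only so that the mean (85), lowest score (54), and highest score (98) match the example above.

```python
from statistics import mean

# Hypothetical exam scores, chosen to match the figures in the example:
# mean of 85, lowest score of 54, highest score of 98.
scores = [54, 98, 85, 90, 88, 95]

low, high = min(scores), max(scores)
spread = high - low  # single-number form of the range

print(f"Mean score {mean(scores)} with a range of {low} to {high}")
print(f"Mean score {mean(scores)} with a range of {spread} points")
```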

Variance. Sometimes, instead of the standard deviation, the variance is used. It is simply the square of the standard deviation.
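
The relationship between the two measures can be illustrated with a short Python sketch using the cafeteria counts from the discussion of the mean. The module does not say whether the sample or the population formula is intended; the sample versions (`stdev` and `variance`, which divide by n - 1) are assumed here, and `pstdev` is the population counterpart.

```python
from statistics import pstdev, stdev, variance

# Number of school cafeterias in six counties (from the earlier example).
cafeterias = [5, 9, 9, 11, 13, 37]

# Sample formulas are an assumption; they divide by n - 1.
sample_var = variance(cafeterias)
sample_sd = stdev(cafeterias)

print(f"Variance: {sample_var}")
print(f"Standard deviation: {sample_sd:.2f}")

# The variance is the square of the standard deviation.
print(f"Standard deviation squared: {sample_sd ** 2:.2f}")

# When all the answers are identical, the standard deviation is zero.
identical = [7, 7, 7, 7]
print(f"Standard deviation of identical answers: {pstdev(identical)}")
```

Because the variance is expressed in squared units, the standard deviation is often easier to interpret alongside the mean.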

All of these can be calculated by hand if the data set is small. More commonly, one uses statistical analysis software, such as SPSS, to generate evaluation findings.

Test Your Knowledge

1.  What is a frequency?

2.  What are three measures of central tendency?

3.  Why should the range be reported along with the mean?

4.  What is standard deviation?

5.  What is a descriptive statistic?

ANSWER KEY