2  Central Tendency

2.1 Overview

Describing a distribution point by point usually provides more detail than is useful. Instead, most psychological statisticians summarize a dataset with a measure of #central_tendency: a representative value around which the data tend to cluster.

Three indicators of central tendency are most frequently used:

  • The #mode (abbreviated: Mo) is the most frequent single value in a distribution, and is most often used with nominal data. In most cases a distribution has only one mode (unimodal), but bimodal and multimodal distributions, in which two or more scores are tied for most frequent, are also possible. In these cases all modes are reported.

  • The #median (Mdn) is the middlemost score in the distribution, that is, the score that separates the top 50% of scores from the bottom 50% of scores. For distributions with an even number of values, the median is halfway between the two middlemost scores. For example, if the two middlemost scores in a distribution are 4 and 5, the median is 4.5. When the distribution has an odd number of values, the median is simply the middlemost score. The median is preferred when data are not normally distributed (for example, skewed data or data with extreme values), because it is less affected by extreme scores than the mean.

  • The arithmetic average of a distribution is the #mean (M) of the values. The mean is simply the sum of the values divided by the number of values in the distribution. This is the most common index of central tendency, and the sample mean (\(\bar{X}\)) is an unbiased estimator of the population mean (\(\mu\)). Generally, the mean is the preferred measure of central tendency when data are normally distributed.
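
To illustrate why the choice between the mean and the median depends on the shape of the distribution, the sketch below compares the two before and after adding a single extreme value. The scores are made up for this illustration and do not come from a real study.

```r
# Made-up, roughly symmetric scores (an assumption for illustration)
scores <- c(2, 3, 3, 4, 4, 4, 5, 5, 6)
mean_sym   <- mean(scores)      # 4
median_sym <- median(scores)    # 4

# Add one extreme value and recompute both measures
scores_skewed <- c(scores, 40)
mean_skewed   <- mean(scores_skewed)     # 7.6: pulled toward the extreme value
median_skewed <- median(scores_skewed)   # 4: unchanged
```

With the symmetric scores the two measures agree; once the extreme value is added, only the mean moves.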

For a sample of n values \(x_1, \dots, x_n\), these three measures can be stated more formally:

  • Mean (M)

\[ Mean = \frac{1}{n}\sum_{i=1}^{n} x_i \]

  • Median (Mdn)

With the values sorted in ascending order, so that \(x_{(k)}\) denotes the k-th smallest value:

If n is odd:

\[ Median_{odd} = x_{((n+1)/2)} \]

If n is even:

\[ Median_{even} = \frac{x_{(n/2)}+x_{((n/2)+1)}}{2} \]
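
As a rough illustration of the odd and even cases, the sketch below computes the median directly from the sorted values. The function name `median_by_hand` and the example vectors are assumptions made for this illustration; in practice R's built-in `median()` does the same job.

```r
# Hypothetical helper (not part of base R): median from the sorted values.
# Assumes x is a non-empty numeric vector.
median_by_hand <- function(x) {
  x_sorted <- sort(x)   # ascending order
  n <- length(x_sorted)
  if (n %% 2 == 1) {
    # Odd n: the single middle value
    x_sorted[(n + 1) / 2]
  } else {
    # Even n: average of the two middle values
    (x_sorted[n / 2] + x_sorted[n / 2 + 1]) / 2
  }
}

odd_scores  <- c(7, 3, 5, 4, 9)      # illustrative data, n = 5
even_scores <- c(7, 3, 5, 4, 9, 2)   # illustrative data, n = 6

median_odd  <- median_by_hand(odd_scores)    # sorted: 3 4 5 7 9   -> 5
median_even <- median_by_hand(even_scores)   # sorted: 2 3 4 5 7 9 -> 4.5
```

In the even case the two middlemost scores are 4 and 5, so the result is 4.5.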

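  • Mode (Mo)

The mode has no closed-form formula; it is found by counting how often each value occurs. The following R code computes all three measures for an example vector:

# Example data: 25 scores
ct <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,4,5,5,6,6,14)

# Mode: return the most frequent value in x.
# If several values tie, only the one that appears first in x is returned.
mo <- function(x) {
  unique_x <- unique(x)
  unique_x[which.max(tabulate(match(x, unique_x)))]
}

# Mean computed from its definition, checked against the built-in mean()
mean_ct <- sum(ct)/length(ct)
mean_ct == mean(ct)
#> [1] TRUE
median(ct)
#> [1] 2
mo(ct)
#> [1] 1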
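
The mo() function above returns a single value even when several values tie for the highest frequency. Because all modes are reported for bimodal and multimodal distributions, a variant that returns every tied value may be useful. The following is a minimal sketch; the name `all_modes` and the `bimodal` example vector are assumptions made for illustration.

```r
# Hypothetical helper: return every value tied for the highest frequency.
all_modes <- function(x) {
  unique_x <- unique(x)
  counts <- tabulate(match(x, unique_x))   # frequency of each unique value
  unique_x[counts == max(counts)]          # keep all values with the top count
}

ct <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,4,5,5,6,6,14)
modes_ct <- all_modes(ct)             # unimodal: 1

bimodal <- c(2, 2, 2, 5, 5, 5, 7)     # illustrative data with two modes
modes_bimodal <- all_modes(bimodal)   # 2 and 5
```

For a unimodal vector such as ct, this gives the same answer as mo().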