brettscaife.net

Glossary of statistical terms

average

The typical value of a data set. "Average" does not really have a precise meaning for a statistician; rather, there are several different averages (or "measures of central tendency"), such as the mean, median and mode.

bias

Bias can be defined as the distortion of the estimated effects caused by a systematic difference between the groups being compared.

confounder

If you don't know whether an effect is caused by the variable you are interested in (e.g. a drug or smoking) or by another variable (e.g. age or sex), then the other variable is called a confounder and it is said to cause confounding. A confounder is a variable, other than the one being investigated, which is associated with both the exposure and the outcome. A confounder can cause bias in a study.
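
As a rough illustration of confounding, the sketch below uses counts invented purely for this example (they are not from any real study). Age is the hypothetical confounder: it is associated with the exposure (younger people are more often exposed) and with the outcome (older people more often have it). The helper `risk` is an assumed name, not part of any library.

```python
# Hypothetical counts, invented for illustration only:
# (age group, exposed?) -> (number with the outcome, number of people)
counts = {
    ("young", True): (8, 80),
    ("young", False): (2, 20),
    ("old", True): (10, 20),
    ("old", False): (40, 80),
}


def risk(exposed, age=None):
    """Proportion with the outcome among exposed (or unexposed) people,
    optionally restricted to a single age group."""
    rows = [
        value
        for (group, is_exposed), value in counts.items()
        if is_exposed == exposed and (age is None or group == age)
    ]
    cases = sum(c for c, _ in rows)
    total = sum(n for _, n in rows)
    return cases / total


# Crude comparison, ignoring age: the exposed appear to do much better.
print(f"crude: exposed {risk(True):.2f}, unexposed {risk(False):.2f}")

# Within each age group the risks are identical, so the crude difference
# is produced entirely by the different age mix of the two groups.
for age in ("young", "old"):
    print(f"{age}: exposed {risk(True, age):.2f}, unexposed {risk(False, age):.2f}")
```

In these made-up data the exposure has no effect within either age group, yet the crude comparison suggests one. Comparing like with like (here, within age groups) is one common way to look for this kind of distortion.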

descriptive statistics

Descriptive statistics are the techniques we use to describe the main features of a sample.

inferential statistics

Statistical inference is the process of using the value of a sample statistic to make informed guesses about the value of a population parameter.

mean

(More properly the arithmetic mean; there are other sorts of mean.) The sum of all the observations divided by the number of observations. The mean should only be used to describe metric (numerical) data. The mean is frequently what people mean when they refer to the average.
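
The calculation can be sketched in a few lines of Python. The observations are made-up values for illustration; `statistics.mean` from the standard library is shown alongside the by-hand version as a cross-check.

```python
from statistics import mean

# Hypothetical observations, invented for illustration.
observations = [2, 4, 4, 5, 10]

# Sum of all the observations divided by the number of observations.
manual_mean = sum(observations) / len(observations)

print(manual_mean)         # 5.0
print(mean(observations))  # the standard library gives the same value
```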

median

The value of the middle observation. Place the data in order of the size of the observations (rank it). The median is the value of the observation that is halfway along; if there is an even number of observations, it is usually taken as the mean of the two middle values. The median is a type of average but isn't usually what people mean by 'average'.
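
A minimal sketch of the ranking procedure follows. The function name `median_by_ranking` and the data are assumptions made for this example. For an even number of observations it uses the common convention of averaging the two middle values, which is also what Python's `statistics.median` does.

```python
from statistics import median


def median_by_ranking(values):
    """Rank the observations and return the one halfway along.

    With an even number of observations there is no single middle
    value, so the mean of the two middle values is returned.
    """
    ranked = sorted(values)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2


# Hypothetical observations, invented for illustration.
odd_sample = [7, 1, 5, 3, 9]   # ranked: 1, 3, 5, 7, 9
even_sample = [7, 1, 5, 3]     # ranked: 1, 3, 5, 7

print(median_by_ranking(odd_sample))   # 5
print(median_by_ranking(even_sample))  # 4.0
```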

mode

The most common value observed. The mode is a type of average but isn't usually what people mean by 'average'.
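
One way to find the mode is to count how often each value occurs, sketched here with `collections.Counter`. The blood-group data are invented for illustration. The sketch returns a list because a data set can have more than one most common value.

```python
from collections import Counter

# Hypothetical observations, invented for illustration.
observations = ["A", "O", "O", "B", "O", "A", "AB"]

# Count how many times each value was observed.
counts = Counter(observations)

# Keep every value that reaches the highest count (there may be ties).
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]

print(modes)  # ['O']
```

Unlike the mean and median, this works for categories as well as numbers, since it only needs the values to be counted, not added or ranked.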

population

A population contains every member of a defined group of interest.

population parameter

The value of a particular characteristic of a population.

sample

A sample is the section of a population that we actually study.

sample statistic

The value of a particular characteristic of a sample. In general this will be an estimate of the value of a population parameter.
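
The relationship between a sample statistic and a population parameter can be sketched with a small simulation. Everything here is an assumption made for illustration: the simulated population of heights, its size, the sample size of 50 and the random seed. In real studies the population parameter is usually unknown, which is why it has to be estimated.

```python
import random
from statistics import mean

# Fixed seed so the sketch gives the same numbers each time it is run.
rng = random.Random(42)

# A simulated population: 10,000 hypothetical heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Population parameter: the mean of every member of the population.
population_mean = mean(population)

# The sample: the part of the population we actually study.
sample = rng.sample(population, 50)

# Sample statistic: the mean of the sample, used as an estimate
# of the population mean.
sample_mean = mean(sample)

print(f"population mean (parameter): {population_mean:.1f}")
print(f"sample mean (statistic):     {sample_mean:.1f}")
```

The two values will generally be close but not identical. Drawing a different sample gives a different estimate.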