The normal distribution is completely described by these two parameters.

The normal distribution, also known as the Gaussian distribution, is a theoretical continuous distribution of a random variable - and is mathematically defined by several formulae.

For non-mathematicians, a qualitative description of its properties may be more useful.

The normal distribution was so named because it was thought to be the natural or normal distribution for any continuous variable to follow. We now know that, in biology at least, that is not necessarily the case. But in statistics the distribution remains extremely important because it more-or-less describes the random variation of sample means - and many statistics that behave as means.

{Fig. 1}

Whilst many people visualise a normal distribution as a 'bell-shaped' curve, for a critical appraisal you need to define its properties much more clearly and quantitatively. Let us begin by stating the properties of the distribution.

  • Any truly normal distribution extends from minus infinity to plus infinity - and, having an infinite range, is therefore unbounded.

  • If you randomly select a value from a normal distribution, that value can be any number between minus and plus infinity. Moreover, because the probability of obtaining any predefined value is vanishingly small, the probability of obtaining the same observation twice is effectively zero. As a result, every observation of a normal population is unique. In other words, because no two observations can be the same, a truly normal population has no ties. Consequently, the normal distribution is smooth and continuous.

    In order to have these properties, a completely normal population must be infinitely large. From this it follows that, although a sample of a normal population might have almost any distribution, if a sample contains a finite number of observations it cannot be perfectly normal!

  • The normal distribution is completely symmetrical so the mean and median are identical. So, although the mean and the median of a sample of that population may be unequal, on average (given an infinite number of samples of that population) they would be the same as each other - and the same as the mean of the population from which they were drawn - provided their selection was unbiased.

  • Lastly, if a population is normal, its distribution is defined by two - and only two - values: the location of the population, described by the population mean, and the dispersion of the population, described by the population standard deviation. These are known as the population parameters, and methods which assume your observations are normal are known as parametric.

Let us consider how varying these population parameters affects the appearance of the distribution.

 

The population's location

    Because the normal distribution is smooth and symmetrical, the mean, median, and mode of any normal population are identical. The graph below shows the distribution of 3 normal populations, whose only difference is their location.

    {Fig. 2}

    The population mean is usually defined as the mean of all the values in that population - or μ. More explicitly, if we call our population X, μx would be the population mean. In contrast, the mean of a sample of that population, x̄, is only an estimate of the 'true' population mean, and the two are only the same on average.

    Another way of defining the population mean is in terms of the average result of randomly sampling that population. For example, if x is an observation of population X then, if we took an infinitely large sample, its mean would be Σx/∞ - which causes a few headaches! To avoid this dilemma, we say that, if we repeatedly sample a population, we would expect the average value of x, E(x), to be identical to the population mean, μx. In other words, μx = E(x) = E(x̄) - or various other mathematical formulae, depending upon the context.
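The idea that a sample mean only equals the population mean 'on average' can be illustrated by simulation. The following is a minimal sketch, not a definitive method: the population parameters (mean 10, standard deviation 2), the sample size, the number of samples, and the function name are all assumptions chosen for illustration.

```python
import random
import statistics

def mean_of_sample_means(mu, sigma, sample_size, n_samples, seed=0):
    """Draw many random samples from a normal population and summarise their means."""
    rng = random.Random(seed)
    sample_means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(sample_size))
        for _ in range(n_samples)
    ]
    # Return the average of the sample means plus the smallest and largest one.
    return statistics.fmean(sample_means), min(sample_means), max(sample_means)

# Hypothetical population: mean 10, standard deviation 2.
avg, lowest, highest = mean_of_sample_means(
    mu=10.0, sigma=2.0, sample_size=5, n_samples=20000
)
print(f"average of the sample means: {avg:.3f}")
print(f"individual sample means ranged from {lowest:.2f} to {highest:.2f}")
```

Individual sample means scatter either side of the population mean, but their average should settle very close to it as the number of samples grows.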

    If the probability of observing a value was unrelated to the value being observed the distribution would be uniform. However, if you sample a normal population at random, the most commonly observed values are closest to the population mean.

 

The population's dispersion

    With a normal population, for mathematical reasons, the dispersion is usually defined in terms of the root mean squared deviation of its observations about their population mean - in other words the population standard deviation, σ. The graph below shows the distribution of 3 normal populations, whose only difference is their dispersion.

    {Fig. 3}

    By convention, the standard deviation of a population called Y is generally represented by the Greek letter sigma - in other words σy - or just σ. The standard deviation of a sample of that population may be written as sy, or just s.

Aside from their mean and standard deviation, all normal populations are identical. Therefore, if you rescale normal populations to allow for these two parameters, they become indistinguishable. The commonest way to rescale a normal population is to subtract the mean from each observation, and divide by the standard deviation. This will produce a standard normal population, which has a mean of zero and a standard deviation of one. This is mathematically the simplest of all - and, because it is so useful, the standard normal distribution has its own special symbols and terminology.
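The rescaling described above can be sketched in a few lines of Python. This is an illustrative example only: the population parameters (mean 50, standard deviation 8), the sample size, and the helper name `standardise` are assumptions, and a large random sample stands in for the population.

```python
import random
import statistics

def standardise(values, mu, sigma):
    """Subtract the mean from each observation, then divide by the standard deviation."""
    return [(v - mu) / sigma for v in values]

rng = random.Random(1)
mu, sigma = 50.0, 8.0  # hypothetical population parameters
sample = [rng.gauss(mu, sigma) for _ in range(50000)]
z = standardise(sample, mu, sigma)

print(f"mean of rescaled values: {statistics.fmean(z):.3f}")
print(f"standard deviation of rescaled values: {statistics.pstdev(z):.3f}")
```

With a sample this large, the rescaled values should have a mean close to zero and a standard deviation close to one.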

 

Central limit

As we said above, the main reason why the normal distribution is so important in statistics is that many sample statistics, including the mean, tend towards a normal distribution as sample size increases, irrespective of the population distribution. The way in which a statistic's normal tendency depends upon sample size is described by what is known as the central limit theorem. Non-mathematically, there are three factors which determine how large a sample you need in order to assume a statistic is approximately normal.

  • Which statistic you are using - although you may not have much choice in this matter.
  • How normal is the population that is sampled - which, to some extent, depends upon what sort of variable you are dealing with.
  • How 'approximate', or unrealistic, an answer you and your critics are prepared to accept.
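
The effect of sample size on a statistic's normal tendency can be explored by simulation. The sketch below is only an illustration under stated assumptions: it takes an exponential population as an example of a clearly non-normal one, uses the sample mean as the statistic, and uses skewness (zero for a symmetrical distribution) as a rough measure of departure from normality. The sample sizes and function names are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Third standardised moment: zero for a symmetrical distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(sample_size, n_samples, rng):
    """Means of repeated random samples from a skewed (exponential) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

rng = random.Random(42)
skew_by_size = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 4, 16, 64)}
for n, sk in skew_by_size.items():
    print(f"sample size {n:>2}: skewness of the sample means = {sk:.2f}")
```

For this population the means of single observations are strongly skewed, whereas the means of larger samples should be much closer to symmetrical. How close is 'close enough' remains the judgement call described in the third factor above.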

Another reason the normal distribution is so popular is because its properties are well known - at least to mathematicians. In particular, there are various formulae which estimate the proportion of a normal population in a defined interval. These are known as probability functions.
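
As a rough illustration of such a probability function, the sketch below uses `NormalDist` from Python's standard `statistics` module to find the proportion of a normal population lying in an interval. The population (mean 100, standard deviation 15), the interval, and the function name `proportion_between` are assumptions made for the example.

```python
from statistics import NormalDist

def proportion_between(lower, upper, mu, sigma):
    """Proportion of a normal population lying between lower and upper."""
    dist = NormalDist(mu, sigma)
    # The cumulative distribution function gives the proportion below a value,
    # so the proportion in an interval is the difference of two such values.
    return dist.cdf(upper) - dist.cdf(lower)

# Hypothetical population: mean 100, standard deviation 15.
p = proportion_between(85, 115, mu=100, sigma=15)
print(f"proportion between 85 and 115: {p:.4f}")
```

Because the interval here runs from one standard deviation below the mean to one above it, the result should be roughly 0.68.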

For the majority of the remainder of this class, we’ll be focusing on variables that have a (roughly) normal distribution. For example, data sets consisting of physical measurements (heights, weights, lengths of bones, and so on) for adults of the same species and sex often follow a similar pattern: most individuals are clumped around the average or mean of the population, with numbers decreasing the farther values are from the average in either direction.

The shape of any normal curve is a single-peaked, symmetric distribution that is bell-shaped. A normally distributed random variable, or a variable with a normal probability distribution, is a continuous random variable that has a relative frequency histogram in the shape of a normal curve. This curve is also called the normal density curve. The actual functional notation for creating the normal curve is quite complex:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where μ and σ are the mean and standard deviation of the population of data.

What this formula tells us is that any mean μ and standard deviation σ completely define a unique normal curve. Recall that μ tells us the “center” of the peak while σ describes the overall “fatness” of the data set. A small σ value indicates a tall, skinny data set, while a larger value of σ results in a shorter, more spread out data set. Each normal distribution is indicated by the symbols N(μ,σ). For example, the normal distribution N(0,1) is called the standard normal distribution, and it has a mean of 0 and a standard deviation of 1.
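
To see how σ changes the height of the curve, the normal density can be evaluated directly. The sketch below assumes the standard density formula f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)); the function name `normal_pdf` and the three σ values are chosen purely for illustration, and `NormalDist` from Python's standard library is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal density curve N(mu, sigma) at the point x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Height of the peak (at x = mu) for three curves that differ only in sigma.
peak_narrow = normal_pdf(0.0, 0.0, 0.5)
peak_standard = normal_pdf(0.0, 0.0, 1.0)
peak_wide = normal_pdf(0.0, 0.0, 2.0)
print(f"sigma = 0.5: peak height {peak_narrow:.4f}")
print(f"sigma = 1.0: peak height {peak_standard:.4f}")
print(f"sigma = 2.0: peak height {peak_wide:.4f}")

# Cross-check one value against the standard library's implementation.
print(math.isclose(normal_pdf(1.3, 0.0, 1.0), NormalDist(0.0, 1.0).pdf(1.3)))
```

Halving σ doubles the height of the peak, which is why a small σ gives a tall, skinny curve: the total area under the curve has to stay the same.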

Properties of a Normal Distribution

  1. A normal distribution is bell-shaped and symmetric about its mean.
  2. A normal distribution is completely defined by its mean, µ, and standard deviation, σ.
  3. The total area under a normal distribution curve equals 1.
  4. The x-axis is a horizontal asymptote for a normal distribution curve.

A graphical representation of the normal distribution curve is shown below:

{Figure: normal distribution curve}

Because there are an infinite number of possibilities for µ and σ, there are an infinite number of normal curves. In order to determine probabilities for each normally distributed random variable, we would have to perform separate probability calculations for each normal distribution.

One amazing fact about any normal distribution is called the 68-95-99.7 Rule, or more concisely, the empirical rule. This rule states that:

  • Roughly 68% of all data observations fall within one standard deviation on either side of the mean. Thus, there is a 68% chance of a variable having a value within one standard deviation of the mean.
  • Roughly 95% of all data observations fall within two standard deviations on either side of the mean. Thus, there is a 95% chance of a variable having a value within two standard deviations of the mean.
  • Roughly 99.7% of all data observations fall within three standard deviations on either side of the mean. Thus, there is a 99.7% chance of a variable having a value within three standard deviations of the mean.
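
These percentages can be checked numerically. For a normal distribution, the proportion lying within k standard deviations of the mean is erf(k/√2), where erf is the error function. The short sketch below uses that identity; the function name `within_k_sd` is just an illustrative choice.

```python
import math

def within_k_sd(k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sd(k):.4f}")
```

The three values come out at about 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.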

A graphical representation of the empirical rule is shown in the following figure:

{Figure: the empirical (68-95-99.7) rule}

Image from: http://2.bp.blogspot.com/-J2YOCi9-1Tg/U95XGRQBS-I/AAAAAAAABKQ/y5vD4qMSJb4/s1600/stdeviation.png

Example:

Suppose a normally distributed variable has mean μ = 17 and standard deviation σ = 3.4. Then, according to the empirical rule:

  • Approximately 68% of individual data values will lie between 17 – 3.4 = 13.6 and 17 + 3.4 = 20.4. In interval notation we write: (13.6, 20.4).
  • Approximately 95% of individual data values will lie between 17 – 2⋅3.4 = 10.2 and 17 + 2⋅3.4 = 23.8. In interval notation we write: (10.2, 23.8).
  • Approximately 99.7% of individual data values will lie between 17 – 3⋅3.4 = 6.8 and 17 + 3⋅3.4 = 27.2. In interval notation we write: (6.8, 27.2).

The results from the third bullet point illustrate how a data value of, say, 2.1 (which is less than 6.8) or a data value of, say, 33.2 (a value greater than 27.2) would both be very unusual, since almost all data values should lie between 6.8 and 27.2.
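
The arithmetic in this example can be reproduced with a short script. This is a sketch that assumes the variable is exactly normal with μ = 17 and σ = 3.4; it uses `NormalDist` from Python's standard `statistics` module, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 17.0, 3.4
dist = NormalDist(mu, sigma)

# Intervals of one, two and three standard deviations either side of the mean.
intervals = {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}
for k, (lo, hi) in intervals.items():
    coverage = dist.cdf(hi) - dist.cdf(lo)
    print(f"{k} sd: ({lo:.1f}, {hi:.1f}) holds {coverage:.1%} of values")

# How far the two 'unusual' values lie from the mean, in standard deviations.
distances = {x: abs(x - mu) / sigma for x in (2.1, 33.2)}
for x, d in distances.items():
    print(f"x = {x}: {d:.2f} standard deviations from the mean")
```

Both 2.1 and 33.2 lie more than four standard deviations from the mean, well outside the three standard deviation interval.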

Back to the Standard Normal Curve

All normal distributions, regardless of their mean and standard deviation, share the Empirical Rule. With some very simple mathematics, we can “transform” any normal distribution into the standard normal distribution. This is called a z-transform.

Using the z-transformation, any data set that is normally distributed can be converted to the same standard normal distribution by the conversion:

$$Z = \frac{X - \mu}{\sigma}$$

where X is the normally distributed random variable, and Z is a random variable following the standard normal distribution.

Notice when X = μ that Z = (μ – μ)/σ = 0, which explains how Z transforms our mean to 0.
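
A brief sketch of the conversion Z = (X – μ)/σ in code may help. It reuses μ = 17 and σ = 3.4 from the earlier example as assumed parameters, and checks that a probability computed for X on its own scale matches the probability for the corresponding Z on the standard normal scale. The helper name `z_transform` is hypothetical.

```python
from statistics import NormalDist

def z_transform(x, mu, sigma):
    """Convert a value x from N(mu, sigma) to its standard normal equivalent."""
    return (x - mu) / sigma

mu, sigma = 17.0, 3.4
X = NormalDist(mu, sigma)  # the original distribution
Z = NormalDist(0.0, 1.0)   # the standard normal distribution

for x in (13.6, 17.0, 23.8):
    z = z_transform(x, mu, sigma)
    print(f"x = {x}: z = {z:+.2f}, P(X <= x) = {X.cdf(x):.4f}, P(Z <= z) = {Z.cdf(z):.4f}")
```

The two probabilities agree for every value, which is why a single table (or function) for the standard normal distribution is enough for any normal distribution.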

Properties of the Standard Normal Distribution

  1. The standard normal distribution is bell-shaped and symmetric about its mean.
  2. The standard normal distribution is completely defined by its mean, µ = 0, and standard deviation, σ = 1.
  3. The total area under the standard normal distribution curve equals 1.
  4. The x-axis is a horizontal asymptote for the standard normal distribution curve.
