Measures of Spread - Range, Variance, and Standard Deviation

Last Updated : 29 Jul, 2024

Collecting data and representing it in tables, graphs, and other distributions is essential. But it is equally important to get a fair idea of how the data is distributed and how scattered it is. Measures of central tendency, such as the mean, are not enough to describe the data and its nature. We also need to measure the dispersion of the data around a chosen central value. For example, we need to be able to answer questions such as: how widely are the observations spread around the mean or the median? These values allow us to describe the data better.

Dispersion, also called scatter, is measured relative to the chosen measure of central tendency and the observations available to us. Measures of dispersion tell us how much the observations vary from, or resemble, each other. There are many ways of measuring the dispersion in the data; the major measures of spread are given below:

Range
Variance
Standard Deviation

What is Range?

The range of the data is the difference between the maximum and the minimum values of the observations. For example, let's say we have data on the number of customers walking into a store on each day of a week.

10, 14, 8, 10, 15, 4, 7

Minimum value in the data = 4

Maximum value in the data = 15

Range = Maximum value in the data - Minimum value in the data

= 15 - 4

= 11

Now we can say that the range of the data is 11. This gives us an idea of the overall spread of the data but doesn't tell us how the values are distributed within that spread.
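
As a quick check, the range of the customer counts listed above can be sketched in Python with the built-in max and min functions. The variable names are illustrative only.

```python
# Daily customer counts for the week, taken from the example above
customers = [10, 14, 8, 10, 15, 4, 7]

# Range = largest observation - smallest observation
data_range = max(customers) - min(customers)
print(data_range)  # 11
```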

What is Variance?

The variance measures how far the observed values lie from the mean of the distribution. Here we are not concerned with the sign of each distance, only with its magnitude, so we square the distance of each observation from the mean. Let x_1, x_2, x_3, \dots, x_n be n observations and \bar{x} be their mean.

(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2

If this sum is zero, then each term has to be zero, which means every observation equals the mean and there is no scatter in the data. If the sum is small, the data is concentrated around the mean; if it is large, the data is widely scattered.

But this sum still depends on the number of observations in the data: with many observations it becomes large even when the individual deviations are small. So, we take the mean of the squared deviations,

\text{Variance} = \sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}
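
The formula above can be sketched directly in Python. The function name population_variance is hypothetical, and the data is the customer counts from the range example. The standard library's statistics.pvariance divides by n as well, so it is used here as a cross-check.

```python
import math
import statistics

def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

customers = [10, 14, 8, 10, 15, 4, 7]
var_manual = population_variance(customers)
print(round(var_manual, 2))  # 12.78

# Cross-check against the standard library
print(math.isclose(var_manual, statistics.pvariance(customers)))  # True
```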

What is Standard Deviation?

In the calculation of variance, notice that the units of the variance are the square of the units of the observations. To remove this problem, we define the standard deviation as the square root of the variance. It is denoted by \sigma.

\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}}

Let's see how to calculate these measures in some problems.
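
A minimal sketch of this step in Python, assuming the same customer counts as in the range example: take the square root of the variance (divided by n) and compare it with statistics.pstdev from the standard library.

```python
import math
import statistics

customers = [10, 14, 8, 10, 15, 4, 7]

variance = statistics.pvariance(customers)  # divides by n
std_dev = math.sqrt(variance)               # same units as the observations
print(round(std_dev, 2))  # 3.57

# pstdev computes the same quantity in one call
print(math.isclose(std_dev, statistics.pstdev(customers)))  # True
```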

Sample Problems

Question 1: Find out the range of the following data:

-4	5	-10	6	9

Solution:

To calculate the range of the data, we need to find its maximum and minimum:
Max = 9
Min = -10
Range = Max - Min
= 9 - (-10)
= 19

Question 2: Find out the mean and the median of the same data:

-4	5	-10	6	9

Solution:

Mean of the data = \frac{( -4 + 5 - 10 + 6 + 9 )}{5}
= \frac{6}{5}
= 1.2
The median is the middle element of the data once it is sorted in order.
Sorted data: -10, -4, 5, 6, 9
Median = 5
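
The mean and median of this data can be checked with Python's statistics module. This is a small sketch; sorting is shown only to make the middle element visible.

```python
import statistics

data = [-4, 5, -10, 6, 9]

print(sorted(data))             # [-10, -4, 5, 6, 9]
mean_value = statistics.mean(data)
median_value = statistics.median(data)
print(mean_value)               # 1.2
print(median_value)             # 5
```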

Question 3: Let's say we have the following data,

-4	-2	0	-2	6	4	6	0	-6	4

Calculate the range, variance, and standard deviation of the data.

Solution:

Range
We need to find out the minimum and the maximum values of the data distribution.
Min Value = -6
Max Value = +6
Range = Max Value - Min Value
= 6 - (-6)
= 12
Variance
To find the variance, we first need to find the mean,
Mean = \frac{( -4 - 2 + 0 - 2 + 6 + 4 + 6 + 0 - 6 + 4)}{10}
= \frac{6}{10}
= 0.6
We know the formula for Variance,
Variance = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}
= \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n}
= \frac{(-4 - 0.6)^2 + (-2 - 0.6)^2 + (0 - 0.6)^2 + \dots + (4 - 0.6)^2}{10}
= \frac{160.4}{10}
= 16.04
Standard Deviation
\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}}
⇒ \sigma = \sqrt{16.04}
⇒ \sigma ≈ 4.00
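
The variance formula used in this article divides by n. Python's statistics module calls this the population variance (pvariance) and also provides variance, which divides by n - 1 and is usually called the sample variance. The sketch below runs both on the data from this question, so the difference between the two conventions is visible.

```python
import statistics

data = [-4, -2, 0, -2, 6, 4, 6, 0, -6, 4]

pop_var = statistics.pvariance(data)   # divides by n (the formula in this article)
samp_var = statistics.variance(data)   # divides by n - 1 (sample variance)
print(round(pop_var, 2), round(samp_var, 2))  # 16.04 17.82

pop_sd = statistics.pstdev(data)       # square root of pvariance
samp_sd = statistics.stdev(data)       # square root of variance
print(round(pop_sd, 2), round(samp_sd, 2))    # 4.0 4.22
```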

Question 4: Let's say we have the following data,

-3	-3	-3	-3	0	3	3	3	3

Calculate the range, variance, and standard deviation of the data.

Solution:

Range
We need to find out the minimum and the maximum values of the data distribution.
Min Value = -3
Max Value = +3
Range = Max Value - Min Value
= 3 - (-3)
= 6
Variance
To find the variance, we first need to find the mean. There are 9 observations,
Mean = \frac{( -3 - 3 - 3 - 3 + 0 + 3 + 3 + 3 + 3 )}{9}
= 0
We know the formula for Variance,
Variance = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}
= \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n}
= \frac{(-3 - 0)^2 + (-3 - 0)^2 + (-3 - 0)^2 + \dots + (3 - 0)^2}{9}
= \frac{72}{9}
= 8
Standard Deviation
\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}}
⇒ \sigma = \sqrt{8}
⇒ \sigma ≈ 2.83

Question 5: Let's say we have the following data of the number of TVs sold by a consumer electronics shop over the week,

Monday	4
Tuesday	5
Wednesday	3
Thursday	4
Friday	5
Saturday	5
Sunday	3

Calculate the range, variance, and standard deviation of the data.

Solution:

Range
We need to find out the minimum and the maximum values of the data distribution.
Min Value = 3
Max Value = +5
Range = Max Value - Min Value
= 5 - (3)
= 2
Variance
To find the variance, we first need to find the mean. There are 7 observations,
Mean = \frac{( 4 + 5 + 3 + 4 + 5 + 5 + 3 )}{7}
= \frac{29}{7}
≈ 4.14
We know the formula for Variance,
Variance = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}
= \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n}
= \frac{(4 - \frac{29}{7})^2 + (5 - \frac{29}{7})^2 + (3 - \frac{29}{7})^2 + \dots + (3 - \frac{29}{7})^2}{7}
= \frac{34}{49}
≈ 0.694
Standard Deviation
\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}}
⇒ \sigma = \sqrt{0.694}
⇒ \sigma ≈ 0.833
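
All three measures for this question can be computed in one pass. The helper below, spread_summary, is a hypothetical name introduced for illustration. It follows the formulas in this article, dividing by n, and reads the sales figures from a dictionary keyed by day.

```python
import math

def spread_summary(values):
    """Return the range, variance (dividing by n) and standard deviation."""
    values = list(values)
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return {
        "range": max(values) - min(values),
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }

# TVs sold per day, from the table in this question
tv_sales = {
    "Monday": 4, "Tuesday": 5, "Wednesday": 3, "Thursday": 4,
    "Friday": 5, "Saturday": 5, "Sunday": 3,
}

summary = spread_summary(tv_sales.values())
print(summary["range"])               # 2
print(round(summary["variance"], 3))  # 0.694
print(round(summary["std_dev"], 3))   # 0.833
```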

Summary

Measures of spread, which include the range, the variance, and the standard deviation, tell us how scattered the values in a data set are. The range is the easiest measure to compute: it is the difference between the largest value and the smallest one. Because the range depends only on the two extreme values, it says little about how the rest of the data is distributed. Variance instead takes the mean of the squared differences from the mean, so it reflects the dispersion of every observation. The standard deviation, being the square root of the variance, is expressed in the same units as the data and is therefore easier to interpret. These measures find application in numerous fields, including finance, quality assurance, and research, where they help assess the variability and consistency of data and support decision-making.

anjalishukla1859
