How to create a Frequency Distribution Table for Grouped Data?

In general, a frequency table with grouped data is used when the dataset is large and/or the variable is continuous.

Basically, it consists of grouping the data into intervals of equal width, known as classes. For each class we then work out every type of frequency: absolute, cumulative absolute, relative and cumulative relative.

Let’s get straight to the point with an example:

We surveyed 50 people about their age and obtained the following results:

38 – 15 – 10 – 12 – 62 – 46 – 25 – 56 – 27 – 24 – 23 – 21 – 20 – 25 – 38 – 27 – 48 – 35 – 50 – 65 – 59 – 58 – 47 – 42 – 37 – 35 – 32 – 40 – 28 – 14 – 12 – 24 – 66 – 73 – 72 – 70 – 68 – 65 – 54 – 48 – 34 – 33 – 21 – 19 – 61 – 59 – 47 – 46 – 30 – 30

Step 1: Identify the maximum and minimum values

Looking through the list, the maximum value is 73 years and the minimum value is 10 years.

Step 2: Calculate the Range

To obtain the age range of the surveyed individuals, simply subtract the age of the youngest from the age of the oldest:

Range = Maximum value − Minimum value = 73 − 10 = 63 years
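If you want to double-check Steps 1 and 2 with a few lines of Python, here is a minimal sketch. The variable names (ages, data_range) are our own choice for illustration; the data is the list of 50 ages from the survey.

```python
# Ages of the 50 surveyed people, copied from the example.
ages = [
    38, 15, 10, 12, 62, 46, 25, 56, 27, 24,
    23, 21, 20, 25, 38, 27, 48, 35, 50, 65,
    59, 58, 47, 42, 37, 35, 32, 40, 28, 14,
    12, 24, 66, 73, 72, 70, 68, 65, 54, 48,
    34, 33, 21, 19, 61, 59, 47, 46, 30, 30,
]

minimum = min(ages)               # Step 1: youngest person
maximum = max(ages)               # Step 1: oldest person
data_range = maximum - minimum    # Step 2: range

print(minimum, maximum, data_range)  # 10 73 63
```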

Step 3: Calculate the Number of Intervals

Intervals are also known as classes. They are simply the “categories” into which we will sort our surveyed individuals.

There are several ways to calculate how many intervals we should use. Let’s analyze a couple of them:

Method 1 (square root of n): Number of intervals = √n

Method 2 (Sturges’ Rule): Number of intervals = 1 + 3.322 · log(n), where log is the base-10 logarithm

For both methods of calculating the number of intervals to use, the value of n corresponds to the number of data points we have to analyze. In this case, there are 50 data points.

With the first method, the square root of n, we need to round the result, since the number of intervals must be a whole number (you can’t have half an interval or a fraction of an interval… round as you normally would). For our example:

√50 ≈ 7.07, which rounds to 7 intervals

The second method is known as Sturges’ Rule, and the result obtained should be rounded UP, that is, to the next whole number (for example, if it gives you 5.1, you should round it to 6, not 5). For our example:

1 + 3.322 · log(50) ≈ 6.64, which rounds up to 7 intervals

Both methods tell us that we should use 7 intervals.
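Both calculations can be checked with a short Python sketch. It assumes the commonly quoted form of Sturges’ Rule, 1 + 3.322 · log10(n); the names k_sqrt and k_sturges are just labels we made up for this illustration.

```python
import math

n = 50  # number of data points in the example

# Method 1: square root of n, rounded to the nearest whole number.
k_sqrt = round(math.sqrt(n))

# Method 2: Sturges' Rule, always rounded UP to the next whole number.
k_sturges = math.ceil(1 + 3.322 * math.log10(n))

print(k_sqrt, k_sturges)  # 7 7
```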

Step 4: Calculate the Width of the Intervals – The class width

We already know the age range of our surveyed individuals… and we know how many intervals we need to DISTRIBUTE that range across… Here’s how to calculate the width:

Interval width = Range / Number of intervals = 63 / 7 = 9 years

Step 5: Building the Intervals

The first interval has a lower limit equal to the minimum value of the data, in this case, 10 years. Add the interval width, which is 9 years, and you’ll get the upper limit of 19 years. That gives us the first interval:

[10 – 19)

Watch out! Pay close attention — a bracket is used for the value that IS INCLUDED, and a parenthesis is used for the value that is NOT INCLUDED. This means that data points of 10 years are counted, but those of 19 are NOT.

The value 19 is included in the next interval, where it becomes the lower limit. Add the interval width, which is 9 years, and you’ll get the upper limit of 28 years. That gives us the second interval:

[19 – 28)

The use of the bracket means that we DO include 19 here, but the parenthesis indicates that we do NOT include those who are 28 years old. That value is included in the next interval.

Let’s take a look at the 7 constructed intervals:

[10 – 19), [19 – 28), [28 – 37), [37 – 46), [46 – 55), [55 – 64), [64 – 73]

If you look closely, the last interval must end at the maximum value, which is 73 years. Logically, that final interval must end with a bracket to make sure the data point of 73 years is not left out.
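Steps 4 and 5 can be sketched in Python as follows. This is only an illustration: it assumes the range divides evenly by the number of intervals (63 / 7 = 9 in our example), and the (lower, upper) tuple representation is our own choice.

```python
minimum, maximum = 10, 73          # from Step 1
k = 7                              # number of intervals from Step 3
width = (maximum - minimum) // k   # 63 / 7 = 9, exact in this example

# Each interval is a (lower, upper) pair. The upper limit is excluded
# in every interval except the last one, which must include the maximum.
intervals = [(minimum + i * width, minimum + (i + 1) * width) for i in range(k)]

for i, (lower, upper) in enumerate(intervals):
    closing = "]" if i == k - 1 else ")"
    print(f"[{lower} - {upper}{closing}")
```

The loop prints the seven intervals, closing the last one with a bracket so that 73 is not left out.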

Step 6: Calculating the Class Mark of Each Interval

The class mark is simply the midpoint of each interval.

What you need to do is add the lower and upper limits of each interval and divide the result by 2. Like this:

Class mark = (Lower limit + Upper limit) / 2

For the first interval: (10 + 19) / 2 = 14.5
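Here is a small Python sketch that applies the same midpoint calculation to all 7 intervals of our example. The list name class_marks is our own.

```python
# The 7 intervals built in Step 5, as (lower limit, upper limit) pairs.
intervals = [(10, 19), (19, 28), (28, 37), (37, 46), (46, 55), (55, 64), (64, 73)]

# Class mark = midpoint of each interval.
class_marks = [(lower + upper) / 2 for lower, upper in intervals]

print(class_marks)  # [14.5, 23.5, 32.5, 41.5, 50.5, 59.5, 68.5]
```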

Step 7: Determine the Absolute Frequency of Each Interval

The absolute frequency simply consists of COUNTING the number of data points that fall within each interval. It is represented by a lowercase “f” with a subscript (a small number written below) indicating the interval it belongs to (fi).

Let’s see how many data points fall within the first interval of [10 – 19).

The values in this interval are 10, 12, 12, 14 and 15.

If you look closely, we are NOT counting the data point of 19 years… it will be counted in the next interval. For the first interval, we have 5 data points, so that will be its absolute frequency, its COUNT.

Let’s see how many data points fall within the second interval of [19 – 28).

The values in this interval are 19, 20, 21, 21, 23, 24, 24, 25, 25, 27 and 27.

If you look closely, we are NOT counting the data point of 28 years… it will be counted in the next interval. For the second interval, we have 11 data points, so that will be its absolute frequency, its COUNT.

Let’s see how many data points fall within the third interval of [28 – 37).

The values in this interval are 28, 30, 30, 32, 33, 34, 35 and 35.

If you look closely, we are NOT counting the data point of 37 years… it will be counted in the next interval. For the third interval, we have 8 data points, so that will be its absolute frequency, its COUNT.

These are the absolute frequencies for the 7 intervals:

Interval | fi
[10 – 19) | 5
[19 – 28) | 11
[28 – 37) | 8
[37 – 46) | 5
[46 – 55) | 8
[55 – 64) | 6
[64 – 73] | 7
Total | 50

Obviously, the sum of all the absolute frequencies should give us the total number of data points, in this case, 50.
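Counting by hand is easy to get wrong, so here is a minimal Python sketch of the same count. The helper in_interval is a hypothetical function written for this illustration; it treats every interval as [lower, upper) except the last one, which also includes its upper limit.

```python
ages = [
    38, 15, 10, 12, 62, 46, 25, 56, 27, 24,
    23, 21, 20, 25, 38, 27, 48, 35, 50, 65,
    59, 58, 47, 42, 37, 35, 32, 40, 28, 14,
    12, 24, 66, 73, 72, 70, 68, 65, 54, 48,
    34, 33, 21, 19, 61, 59, 47, 46, 30, 30,
]
intervals = [(10, 19), (19, 28), (28, 37), (37, 46), (46, 55), (55, 64), (64, 73)]


def in_interval(value, index):
    """Return True if value belongs to the interval at position index."""
    lower, upper = intervals[index]
    is_last = index == len(intervals) - 1
    if is_last:
        return lower <= value <= upper   # closed on both ends: [64 - 73]
    return lower <= value < upper        # upper limit excluded: [a - b)


# Absolute frequency: how many ages fall in each interval.
absolute = [sum(in_interval(age, i) for age in ages) for i in range(len(intervals))]

print(absolute)       # [5, 11, 8, 5, 8, 6, 7]
print(sum(absolute))  # 50
```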

Step 8: Determine the Cumulative Absolute Frequency for Each Interval

Don’t overcomplicate it: ACCUMULATING means ADDING UP everything we have so far.

The Cumulative Absolute Frequency (Fi) for each interval consists of adding all the absolute frequencies of the previous intervals and the current one. To differentiate its symbol from the absolute frequency, simply use an uppercase F.

The first cumulative absolute frequency is the same as the first absolute frequency because we’re just starting… there’s nothing to accumulate yet.

The second cumulative absolute frequency is 16 because we need to add 5+11, as those are the absolute frequencies we’ve accumulated so far.

The third cumulative absolute frequency is 24 because we need to add 5+11+8, as those are the absolute frequencies we’ve accumulated so far.

The fourth cumulative absolute frequency is 29 because we need to add 5 + 11 + 8 + 5, as those are the absolute frequencies we’ve accumulated so far.

When you reach the last interval, you should get an accumulated total equal to the total number of data points, which in this case is 50.

F1 = 5, F2 = 16, F3 = 24, F4 = 29, F5 = 37, F6 = 43, F7 = 50
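The running total can be sketched in Python with itertools.accumulate from the standard library. The list absolute holds the 7 absolute frequencies counted from the example data; the variable names are our own.

```python
from itertools import accumulate

# Absolute frequencies (fi) of the 7 intervals, counted from the example data.
absolute = [5, 11, 8, 5, 8, 6, 7]

# Cumulative absolute frequency (Fi): running sum of the absolute frequencies.
cumulative = list(accumulate(absolute))

print(cumulative)  # [5, 16, 24, 29, 37, 43, 50]
```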

Step 9: Determine the Relative Frequency of Each Interval

The word RELATIVE tells us that we are going to RELATE each Absolute Frequency to the Total… and in mathematics, when you’re told to relate one thing to another… it means DIVIDING the first by the second.

A small example with money (this makes things more interesting… right?)

Everyone in my family contributes money for the monthly groceries… together we contribute a TOTAL of 200 dollars. Of those 200, I only contribute 20 dollars.

Let’s find the RELATION of MY CONTRIBUTION to the TOTAL.

Easy, 20 ÷ 200 = 0.1

If I convert it to a percentage… 0.1 x 100% = 10%

So, MY RELATIVE CONTRIBUTION is 10% of the TOTAL.

I hope you understand what the word RELATIVE means.

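The Relative Frequency (fr) of each interval consists of dividing the Absolute Frequency of that interval by the Total number of data points.

[10 – 19): 5 ÷ 50 = 0.10 = 10%
[19 – 28): 11 ÷ 50 = 0.22 = 22%
[28 – 37): 8 ÷ 50 = 0.16 = 16%
[37 – 46): 5 ÷ 50 = 0.10 = 10%
[46 – 55): 8 ÷ 50 = 0.16 = 16%
[55 – 64): 6 ÷ 50 = 0.12 = 12%
[64 – 73]: 7 ÷ 50 = 0.14 = 14%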

From the table built so far, we can observe that the relative frequency can be expressed either as a decimal or as a percentage, and that the sum of all relative frequencies should equal 1 (that is, 100%).
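As a quick check, here is a minimal Python sketch that divides each absolute frequency by the total, in both decimal and percentage form. The variable names are our own, and the frequencies are the ones counted from the example data.

```python
# Absolute frequencies (fi) of the 7 intervals, counted from the example data.
absolute = [5, 11, 8, 5, 8, 6, 7]
n = sum(absolute)  # total number of data points: 50

# Relative frequency (fr): each absolute frequency divided by the total.
relative = [f / n for f in absolute]

# The same values expressed as percentages.
percentages = [100 * f / n for f in absolute]

print(relative)     # [0.1, 0.22, 0.16, 0.1, 0.16, 0.12, 0.14]
print(percentages)  # [10.0, 22.0, 16.0, 10.0, 16.0, 12.0, 14.0]
```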

Step 10: Determine the Cumulative Relative Frequency of Each Interval

Here we go again with the cumulative part… don’t overcomplicate things, ACCUMULATE just means ADD UP everything we have so far.

The Cumulative Relative Frequency (Fr) of each interval consists of adding all the relative frequencies of the previous intervals and the current one. To differentiate its symbol from the relative frequency, simply use an uppercase F.

The first cumulative relative frequency is the same as the first relative frequency because we’re just starting… there’s nothing to accumulate yet.

The second cumulative relative frequency is 0.32 because we need to sum 0.1 + 0.22, which are the relative frequencies we’ve accumulated so far.

The third cumulative relative frequency is 0.48 because we need to sum 0.1 + 0.22 + 0.16, which are the relative frequencies we’ve accumulated so far.

By now you have probably got the hang of it… let’s go ahead and see all the Cumulative Relative Frequencies from our example:

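Interval | Class mark | fi | Fi | fr | Fr
[10 – 19) | 14.5 | 5 | 5 | 0.10 | 0.10
[19 – 28) | 23.5 | 11 | 16 | 0.22 | 0.32
[28 – 37) | 32.5 | 8 | 24 | 0.16 | 0.48
[37 – 46) | 41.5 | 5 | 29 | 0.10 | 0.58
[46 – 55) | 50.5 | 8 | 37 | 0.16 | 0.74
[55 – 64) | 59.5 | 6 | 43 | 0.12 | 0.86
[64 – 73] | 68.5 | 7 | 50 | 0.14 | 1.00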
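To wrap up, here is a short Python sketch that rebuilds every column of the table from the intervals and the absolute frequencies. It is an illustration only, with variable names of our own. One design choice to note: the cumulative relative frequency is computed as Fi ÷ n instead of adding the decimals one by one, which gives the same result while avoiding small floating-point rounding errors.

```python
from itertools import accumulate

# Intervals and absolute frequencies from the worked example.
intervals = [(10, 19), (19, 28), (28, 37), (37, 46), (46, 55), (55, 64), (64, 73)]
absolute = [5, 11, 8, 5, 8, 6, 7]

n = sum(absolute)                                   # 50 data points
cumulative = list(accumulate(absolute))             # Fi
relative = [f / n for f in absolute]                # fr
cumulative_relative = [F / n for F in cumulative]   # Fr

print("Interval    Mark   fi   Fi    fr     Fr")
for i, (lower, upper) in enumerate(intervals):
    # Only the last interval is closed on the right.
    closing = "]" if i == len(intervals) - 1 else ")"
    label = f"[{lower} - {upper}{closing}"
    mark = (lower + upper) / 2
    print(f"{label:<10} {mark:>5} {absolute[i]:>4} {cumulative[i]:>4} "
          f"{relative[i]:>5.2f} {cumulative_relative[i]:>6.2f}")
```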
