Grouping of Data in Statistics Explained

How to Group Data in Statistics with Steps and Examples

Grouping data plays a significant role when we have to deal with large data sets. Grouped data is formed by arranging individual observations of a variable into groups, so that a frequency distribution table of these groups provides a convenient way of summarizing or analyzing the data. This information can also be displayed using a pictograph or a bar graph.


In mathematics, the topic of grouping data teaches us to define grouped data precisely. When the number of observations is very large, we condense the data into several groups and record the frequency of observations falling in each group. The presentation of data in groups along with the frequency of each group is called the frequency distribution of the grouped data.


What are the Advantages of Grouping Data?

The advantages of grouping data in statistics are:

  • It helps to focus on important subpopulations and ignore irrelevant ones.

  • Grouping of data improves the accuracy/efficiency of estimation.


Frequency Distribution Table for Grouped Data

When the collected data is large, a frequency distribution table for grouped data makes it easier to analyze.


Example

Consider the marks obtained by 50 students of class VII in an examination. The maximum possible mark is 50.


23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32


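If we create a frequency distribution table with a row for every individual observation, the table becomes very large. For easier understanding, we can instead make a table with groups of observations, say 0 to 10, 10 to 20, etc. Counting the marks above in each group (a mark on a shared boundary is counted in the higher group) gives:

Marks (class interval)    Number of students (frequency)
0-10                      3
10-20                     11
20-30                     14
30-40                     14
40-50                     8
Total                     50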


The distribution obtained in the above table is known as the grouped frequency distribution. It lets us draw significant inferences such as:

  1. Most students (28 of 50) scored between 20 and 40, i.e. in the classes 20-30 and 30-40.

  2. 8 students scored 40 marks or more, i.e. they got 80% or more in the examination.


In the table above, the groups 0-10, 10-20, 20-30, ... are known as class intervals (or classes). Notice that 10 appears in two intervals, 0-10 and 10-20. Similarly, 20 appears in both 10-20 and 20-30. But an observation of 10 or 20 cannot belong to two classes at once. To avoid this inconsistency, we adopt the rule that a common boundary value belongs to the higher class. It means that 10 belongs to the class interval 10-20 but not to 0-10. Similarly, 20 belongs to 20-30 but not to 10-20, etc.
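
The boundary rule can be checked with a short Python sketch. It tallies the 50 marks from the example into classes where the lower limit is included and the upper limit is excluded. The function name `grouped_frequencies` and its parameters are illustrative choices, not part of the original text.

```python
# Marks of the 50 students from the example above.
marks = [
    23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15, 20,
    30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35, 45, 31, 26,
    42, 18, 28, 30, 22, 20, 33, 39, 40, 32,
]


def grouped_frequencies(values, start, stop, width):
    """Return a list of (lower, upper, frequency) rows.

    A value v is counted in the class with lower <= v < upper, so a
    boundary value such as 10 goes to 10-20 and not to 0-10.
    """
    table = []
    for lower in range(start, stop, width):
        upper = lower + width
        frequency = sum(1 for v in values if lower <= v < upper)
        table.append((lower, upper, frequency))
    return table


table = grouped_frequencies(marks, 0, 50, 10)
for lower, upper, frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

With this rule a mark of exactly 50 would fall outside the last class. No student scored 50 here, so the five classes account for all 50 observations.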


Consider a class, say 10-20, where 10 is the lower class limit and 20 is the upper class limit. The difference between the upper and lower class limits is called the class height, class size or class width of the class interval. For 10-20 it is 20 − 10 = 10.


This is how we create a frequency distribution table for grouped data as shown above.


Histogram

We can show the above frequency distribution table graphically using a histogram. Class intervals are marked on the horizontal axis and frequencies on the vertical axis.
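
As a rough stand-in for a drawn histogram, the following Python sketch prints a text version with class intervals along the bottom and frequency up the side. The frequencies 3, 11, 14, 14, 8 are the counts of the 50 marks above in the classes 0-10 to 40-50. The helper `text_histogram` is a hypothetical name used only for illustration.

```python
labels = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [3, 11, 14, 14, 8]  # tallied from the 50 marks in the example


def text_histogram(labels, frequencies):
    """Build a column-style text histogram as a single string."""
    column = max(len(label) for label in labels) + 1
    lines = []
    # One row per frequency level, from the tallest bar down to 1.
    for level in range(max(frequencies), 0, -1):
        row = "".join(
            ("#" if f >= level else " ").center(column) for f in frequencies
        )
        lines.append(f"{level:>3} |{row}".rstrip())
    # Horizontal axis and the class-interval labels beneath it.
    lines.append("    +" + "-" * (column * len(labels)))
    lines.append("     " + "".join(label.center(column) for label in labels))
    return "\n".join(lines)


histogram = text_histogram(labels, frequencies)
print(histogram)
```

Each `#` stands for one student, so the two tallest columns (20-30 and 30-40) are easy to pick out.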


Grouped Data Examples with Step-by-Step Explanations


Example 1. The marks obtained by forty students of class VIII in an examination are listed below: 

16, 17, 18, 3, 7, 23, 18, 13, 10, 21, 7, 1, 13, 21, 13, 15, 19, 24, 16, 2, 23, 5, 12, 18, 8, 12, 6, 8, 16, 5, 3, 5, 0, 7, 9, 12, 20, 10, 2, 23 


Divide the data into five groups, namely, 0-5, 5-10, 10-15, 15-20 and 20-25, where 0-5 means marks greater than or equal to 0 but less than 5 and similarly 5-10 means marks greater than or equal to 5 but less than 10, and so on. Prepare a grouped frequency table for the grouped data.


Solution: First, arrange the given observations in ascending order:


0, 1, 2, 2, 3, 3, 5, 5, 5, 6, 7, 7, 7, 8, 8, 9, 10, 10, 12, 12, 12, 13, 13, 13, 15, 16, 16, 16, 17, 18, 18, 18, 19, 20, 21, 21, 23, 23, 23, 24


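Thus, the frequency distribution of the data may be given as follows:

Marks (class interval)    Number of students (frequency)
0-5                       6
5-10                      10
10-15                     8
15-20                     9
20-25                     7
Total                     40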
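
The tally for Example 1 can be reproduced with a minimal Python sketch. It assumes the five classes stated in the question and uses integer division to find each mark's class. The variable names are illustrative.

```python
from collections import Counter

# Marks of the forty students from Example 1.
marks = [
    16, 17, 18, 3, 7, 23, 18, 13, 10, 21, 7, 1, 13, 21, 13, 15, 19, 24, 16, 2,
    23, 5, 12, 18, 8, 12, 6, 8, 16, 5, 3, 5, 0, 7, 9, 12, 20, 10, 2, 23,
]

CLASS_SIZE = 5
NUM_CLASSES = 5

# Integer division gives the index of the class a mark falls in:
# 0-4 -> 0, 5-9 -> 1, and so on, so the upper limit is never included.
tally = Counter(mark // CLASS_SIZE for mark in marks)

frequency_table = {
    f"{i * CLASS_SIZE}-{(i + 1) * CLASS_SIZE}": tally[i]
    for i in range(NUM_CLASSES)
}

for interval, frequency in frequency_table.items():
    print(f"{interval:>5}  {frequency}")
```

The frequencies add up to 40, which is a quick check that no observation was missed or counted twice.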


Note: Each of the groups 0-5, 5-10, 10-15, 15-20 and 20-25 is known as a class interval. In the class interval 10-15, the number 10 is the lower limit and 15 is the upper limit. The difference between the upper limit and the lower limit of a class interval is known as the class size.


Thus, the class size in the above frequency distribution is equal to 5.


The mid-value of a class is called its class mark. It is obtained by adding the upper and lower class limits and dividing the sum by 2.


Thus, the class mark of 0-5 range is equal to (0 + 5)/2 = 2.5


And the class mark of 5-10 range is equal to (5 + 10)/2 = 7.5, etc.
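
The same two definitions, written as a small Python sketch and applied to all five classes of Example 1. The function names `class_size` and `class_mark` are chosen here for illustration.

```python
def class_size(lower, upper):
    """Class size = upper limit - lower limit."""
    return upper - lower


def class_mark(lower, upper):
    """Class mark = (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]

sizes = [class_size(lower, upper) for lower, upper in classes]
class_marks = [class_mark(lower, upper) for lower, upper in classes]

print(sizes)        # [5, 5, 5, 5, 5]
print(class_marks)  # [2.5, 7.5, 12.5, 17.5, 22.5]
```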


Questions to be Solved:

Question 1) The weights (in kg) of 35 persons are given below:

43, 51, 62, 47, 48, 40, 50, 62, 53, 56, 40, 48, 56, 53, 50, 42, 55, 52, 48, 46, 45, 54, 52, 50, 47, 44, 54, 55, 60, 63, 58, 55, 60, 53, 58


Prepare a frequency distribution table with classes of equal size, one such class being 40-45 (where 45 is not included).

FAQs on Grouping of Data in Statistics Explained

1. What is grouping of data in statistics?

Grouping of data is the process of organizing raw data into classes or intervals to make it easier to understand and analyze. In statistics, large data sets are arranged into groups called class intervals to form a grouped frequency distribution.

  • Raw data → Collected observations
  • Grouped data → Data arranged into intervals
  • Each interval has a corresponding frequency
This method helps in simplifying calculations of mean, median, and mode and in drawing graphs like histograms.

2. Why do we group data?

We group data to make large and unorganized data easier to interpret and analyze. When there are many observations, grouping helps in:

  • Summarizing information clearly
  • Identifying patterns and trends
  • Calculating statistical measures like mean, median, and mode
  • Creating graphs such as histograms
Without grouping, raw data can be difficult to read and compare.

3. What is a grouped frequency distribution?

A grouped frequency distribution is a table that shows how many observations fall within each class interval. It contains:

  • Class intervals (e.g., 0–10, 10–20)
  • Frequency (number of values in each interval)
Example:
  • 0–10 → 3
  • 10–20 → 5
This table helps in summarizing large datasets in a compact form.

4. How do you group raw data into class intervals?

To group raw data into class intervals, divide the range of data into equal-sized intervals and count the frequencies. Follow these steps:

  • Find the range = Maximum − Minimum
  • Decide the number of classes (usually 5–10)
  • Calculate class width = Range ÷ Number of classes
  • Form equal intervals and tally frequencies
Example: If data ranges from 10 to 50, range = 40. For 5 classes, class width = 40 ÷ 5 = 8.
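
The steps above can be sketched in Python as follows. The data set is hypothetical, chosen only so that it ranges from 10 to 50 as in the example. The sketch also assumes that the last class includes its upper limit, so that the maximum value is counted; the text itself does not state how to treat the maximum.

```python
def make_classes(values, num_classes):
    """Return (range, class width, table of (lower, upper, frequency))."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest            # Step 1: range = maximum - minimum
    width = data_range / num_classes         # Step 3: width = range / classes
    table = []
    for i in range(num_classes):             # Step 4: form intervals and tally
        lower = lowest + i * width
        upper = lower + width
        is_last = i == num_classes - 1
        # Assumption: only the last class includes its upper limit.
        frequency = sum(
            1 for v in values if lower <= v < upper or (is_last and v == upper)
        )
        table.append((lower, upper, frequency))
    return data_range, width, table


# Hypothetical observations ranging from 10 to 50.
data = [10, 12, 17, 19, 24, 26, 27, 31, 33, 36, 44, 50]

data_range, width, table = make_classes(data, num_classes=5)  # Step 2: 5 classes
print(data_range, width)
for lower, upper, frequency in table:
    print(f"{lower:g}-{upper:g}: {frequency}")
```

In practice the class width is often rounded up to a convenient number such as 10, which avoids the special case for the last class.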

5. What is class interval in grouping of data?

A class interval is the range of values between two limits in grouped data. It has:

  • Lower class limit (smallest value)
  • Upper class limit (largest value)
Example: In 20–30,
  • Lower limit = 20
  • Upper limit = 30
The size of the class interval is called the class width.

6. What is the formula for mean of grouped data?

The mean of grouped data is calculated using the formula Mean (x̄) = Σ(fx) / Σf. Here:

  • f = frequency
  • x = class mark (midpoint of class interval)
  • Σf = total frequency
Class mark is calculated as:
Class mark = (Lower limit + Upper limit) / 2.
This method is also called the direct method of finding mean.
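
A minimal Python sketch of the direct method, applied to the frequencies obtained by tallying the forty marks of Example 1 (6, 10, 8, 9, 7 for the classes 0-5 to 20-25). The function name `grouped_mean` is illustrative.

```python
def grouped_mean(classes):
    """Mean = sum(f * x) / sum(f), where x is the class mark of each class.

    classes is a list of (lower limit, upper limit, frequency) rows.
    """
    total_fx = 0.0
    total_f = 0
    for lower, upper, frequency in classes:
        x = (lower + upper) / 2      # class mark
        total_fx += frequency * x
        total_f += frequency
    return total_fx / total_f


# Frequencies tallied from the forty marks in Example 1.
classes = [(0, 5, 6), (5, 10, 10), (10, 15, 8), (15, 20, 9), (20, 25, 7)]

mean = grouped_mean(classes)
print(mean)  # 12.625
```

Here Σ(fx) = 505 and Σf = 40. The result is an estimate, because every observation in a class is treated as if it were equal to the class mark.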

7. How do you find the median of grouped data?

The median of grouped data is found using the formula Median = l + [(N/2 − cf) / f] × h. Where:

  • l = lower boundary of median class
  • N = total frequency
  • cf = cumulative frequency before median class
  • f = frequency of median class
  • h = class width
The median class is the class where cumulative frequency first exceeds N/2.
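
The formula can be sketched in Python as below, again using the frequencies tallied from Example 1. The function name `grouped_median` is illustrative, and the sketch assumes the classes are listed in ascending order.

```python
def grouped_median(classes):
    """Median = l + ((N/2 - cf) / f) * h for the median class.

    classes is a list of (lower limit, upper limit, frequency) rows
    in ascending order.
    """
    n = sum(frequency for _, _, frequency in classes)
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, frequency in classes:
        # Median class: cumulative frequency first exceeds N/2.
        if cf + frequency > n / 2:
            h = upper - lower
            return lower + ((n / 2 - cf) / frequency) * h
        cf += frequency
    raise ValueError("the table has no observations")


# Frequencies tallied from the forty marks in Example 1.
classes = [(0, 5, 6), (5, 10, 10), (10, 15, 8), (15, 20, 9), (20, 25, 7)]

median = grouped_median(classes)
print(median)  # 12.5
```

Here N = 40, so N/2 = 20. The cumulative frequencies are 6, 16, 24, 33, 40, which makes 10-15 the median class with l = 10, cf = 16, f = 8 and h = 5.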

8. How do you find the mode of grouped data?

The mode of grouped data is calculated using the formula Mode = l + [(f₁ − f₀) / (2f₁ − f₀ − f₂)] × h. Where:

  • l = lower limit of modal class
  • f₁ = frequency of modal class
  • f₀ = frequency of preceding class
  • f₂ = frequency of succeeding class
  • h = class width
The modal class is the class with the highest frequency.
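
A short Python sketch of this formula, using the frequencies tallied from Example 1. The function name `grouped_mode` is illustrative. Treating a missing neighbouring class as having frequency 0 is an assumption of the sketch, not something stated in the text.

```python
def grouped_mode(classes):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h for the modal class.

    classes is a list of (lower limit, upper limit, frequency) rows
    in ascending order with equal class widths.
    """
    frequencies = [frequency for _, _, frequency in classes]
    m = frequencies.index(max(frequencies))  # modal class (first one on a tie)
    lower, upper, f1 = classes[m]
    # Assumption: a missing neighbour at either end counts as frequency 0.
    f0 = frequencies[m - 1] if m > 0 else 0
    f2 = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    h = upper - lower
    # The denominator is zero when f0 == f1 == f2; the formula is
    # undefined in that case and this sketch does not handle it.
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * h


# Frequencies tallied from the forty marks in Example 1.
classes = [(0, 5, 6), (5, 10, 10), (10, 15, 8), (15, 20, 9), (20, 25, 7)]

mode = grouped_mode(classes)
print(round(mode, 2))  # 8.33
```

The modal class is 5-10 with f₁ = 10, f₀ = 6, f₂ = 8 and h = 5, so Mode = 5 + (4 / 6) × 5 ≈ 8.33.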

9. What is the difference between grouped and ungrouped data?

The main difference is that grouped data is arranged into class intervals, while ungrouped data is listed as individual values.

  • Ungrouped data: Raw list of observations (e.g., 5, 8, 10, 12)
  • Grouped data: Values organized into intervals (e.g., 0–10, 10–20)
Grouped data is useful for large datasets, while ungrouped data is suitable for small datasets.

10. Can you give an example of grouping of data?

Grouping of data means arranging raw values into class intervals with their frequencies. Example: Marks of students: 12, 18, 22, 25, 27, 32, 35, 40.

  • 10–20 → 2
  • 20–30 → 3
  • 30–40 → 2
  • 40–50 → 1
This forms a grouped frequency distribution table, which makes analysis easier and clearer.