Sampling in Statistics Explained for Students

Types of Sampling Methods with Formulas and Examples

A sample statistic is a quantity calculated from a sample of a given population. A sample is a group of elements chosen from the population. The features we use to describe the population are called parameters, and the corresponding properties of the sample data are known as statistics. Population and sample are both important concepts in statistics. A sample statistic is a piece of information that we collect from a fraction of a population. Here we will study sampling methods, the hypothesized mean, the mean and standard deviation, and the distribution of sample means.


What is a Sample in Statistics?

A sample statistic is a numerical descriptive measure of the data points in a sample. A statistic is generally derived from measurements of the individual items in the sample. Statistics describe characteristics of the sample data distribution, such as the mean, median, mode, standard deviation and proportions. A sample statistic can be used to measure any characteristic of the sample.


Hypothesized Mean

Hypothesis testing is an essential procedure in statistics. A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is better supported by the sample data.

The process of hypothesis testing involves setting up two competing hypotheses: the null hypothesis and the alternative hypothesis. The hypothesized mean is the value of the population mean stated in the null hypothesis.

The techniques for hypothesis testing depend on

(i) the type of outcome variable being analyzed (continuous, dichotomous, discrete)

(ii) the number of comparison groups in the investigation

(iii) whether the comparison groups are independent


Estimating the Mean

The steps for estimating the mean of grouped data are:

Step 1. Add a new column to the table and write down the midpoint (middle value) of each group.

Step 2. Multiply each midpoint by the frequency of that group and write the results in another new column.

Step 3. Add up the values in the midpoint × frequency column.

Step 4. Finally, divide that total by the total frequency to get the estimate of the mean.
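
The four steps can be sketched in Python as follows. This is only a minimal illustration: the grouped table and the function name `estimate_grouped_mean` are made up for the example and do not come from any particular data set.

```python
# Hypothetical grouped frequency table: (lower bound, upper bound, frequency).
grouped_table = [(0, 10, 3), (10, 20, 5), (20, 30, 2)]


def estimate_grouped_mean(groups):
    """Estimate the mean of grouped data from midpoints and frequencies."""
    total_frequency = 0
    total_midpoint_times_frequency = 0.0
    for lower, upper, frequency in groups:
        midpoint = (lower + upper) / 2                           # Step 1
        total_midpoint_times_frequency += midpoint * frequency   # Steps 2 and 3
        total_frequency += frequency
    return total_midpoint_times_frequency / total_frequency      # Step 4


estimated_mean = estimate_grouped_mean(grouped_table)
print(estimated_mean)  # 14.0
```

Here the midpoints are 5, 15 and 25, the midpoint × frequency values are 15, 75 and 50, and their total of 140 divided by the total frequency of 10 gives 14.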


Sample Standard Deviation

The sample standard deviation formula is:

\[s = \sqrt{\frac{\sum (X - \bar{X})^{2}}{n-1}}\]

where,

s = sample standard deviation

\[\sum\] = sum over all scores in the sample

X = an individual score in the sample

\[\bar{X}\] = sample mean

n = number of scores in the sample.


Sampling Distribution

A sampling distribution is the probability distribution of a statistic obtained from random samples of a given population. It is also known as a finite-sample distribution, and it describes how the values of the statistic are spread across the different samples that could be drawn from the population.


The sampling distribution depends on multiple factors such as the statistic chosen, the sample size, the sampling process, and the overall population. It helps describe how statistics such as means, ranges, variances and standard deviations vary from sample to sample.
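
To see the effect of sample size, we can simulate a sampling distribution of the mean. The sketch below assumes a made-up population of 10,000 uniformly distributed values and a fixed random seed; the helper name `sample_means` is hypothetical and the numbers are for illustration only.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values between 0 and 100.
population = [random.uniform(0, 100) for _ in range(10_000)]


def sample_means(population, sample_size, num_samples):
    """Draw repeated random samples and return the mean of each one."""
    return [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(num_samples)
    ]


means_small = sample_means(population, sample_size=10, num_samples=1000)
means_large = sample_means(population, sample_size=100, num_samples=1000)

print("population mean:", statistics.mean(population))
print("n = 10 :", statistics.mean(means_small), statistics.stdev(means_small))
print("n = 100:", statistics.mean(means_large), statistics.stdev(means_large))
```

Both lists of sample means centre close to the population mean, but the means from the larger samples are much less spread out.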


Sample Mean

The sample mean is the average value found in a sample. A sample is just a small part of the whole data. For example, if we work for a polling company and want to know how much people pay for food in a year, we are not going to want to poll over 300 million people. Instead, we take a fraction of that 300 million (perhaps a thousand people); that fraction is called a sample. The word mean simply refers to the “average.” So in this example, the sample mean is the average amount that those thousand people pay for food in a year.


The sample mean is useful when we have to estimate what the whole population is doing without surveying everyone. Suppose the sample mean for the food example was $2400 per year. The odds are that we would get a very similar figure if we surveyed all 300 million people. So the sample mean is a way to save a lot of time as well as money.


The Sample Mean Formula 

The sample mean formula is: \[\bar{X} = \frac{\sum x_{i}}{n}\]

Here

  • \[\bar{X}\] just stands for the “sample mean”

  • \[\sum\] is summation notation

  • \[x_{i}\] stands for all of the x-values

  • n is the number of items in the sample
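
As a small illustration, the formula can be written in Python like this. The spending figures are invented for the food example, and `sample_mean` is a hypothetical helper name.

```python
def sample_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the sample must contain at least one value")
    return sum(values) / len(values)


# Hypothetical yearly food spending (in dollars) for five surveyed people.
yearly_food_spending = [2100, 2600, 2400, 2300, 2600]
mean_spending = sample_mean(yearly_food_spending)
print(mean_spending)  # 2400.0
```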

Mean and Standard Deviation

The mean refers to the average, or central value, of a collection of numbers. There are multiple ways to calculate the mean; the two most popular are the arithmetic mean and the geometric mean.


The standard deviation measures how spread out a data set is relative to its mean. It is calculated as the square root of the variance, which is found from each data point's deviation from the mean. The further the data points are from the mean, the greater the deviation within the data set. Therefore, the more spread out the data, the higher the standard deviation.


The Formula for Standard Deviation is Given Below:

Standard deviation: \[s = \sqrt{\frac{\sum_{i=1}^{n} (X_{i} - \bar{X})^{2}}{n-1}}\]

Where 

\[X_{i}\] = the value of the i\[^{th}\] point in the data set

\[\bar{X}\] = the mean value of the data set

n = the number of data points in the data set
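
The formula can be checked with a short Python sketch. The scores below are invented for illustration and `sample_standard_deviation` is a hypothetical helper; the result is compared with `statistics.stdev` from the Python standard library, which also divides by n − 1.

```python
import math
import statistics


def sample_standard_deviation(values):
    """Square root of the summed squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores with mean 5
s = sample_standard_deviation(scores)
print(s)
print(statistics.stdev(scores))  # standard-library result for comparison
```

For these scores the squared deviations add up to 32, so s is the square root of 32/7, which is roughly 2.14.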


Probability Sample

Probability sampling is a sampling technique in which researchers choose samples from a larger population using a method based on the theory of probability. For a sample to count as a probability sample, its participants must be chosen by random selection.


The most critical requirement of probability sampling is that every member of the population has a known, non-zero chance of being selected; in the simplest case, that chance is equal for everyone. Suppose we have a population of 100 people and every person has odds of 1 in 100 of being selected. In this case probability sampling gives us the best chance of creating a sample that is representative of the population.


It uses statistical theory to select a small group of people (a sample) from an existing large population and then to predict that their responses will match those of the overall population.


Errors in Sampling

Sampling error occurs when the sample used in a study is not fully representative of the whole population. Because it occurs so often, researchers routinely report a margin of error with their final results. The margin of error is the amount of error allowed for when representing the difference between the sample and the actual population. We can control and reduce sampling errors by creating a careful sample design, having a large enough sample to reflect the entire population, or using an online sample or survey audience to collect responses.

FAQs on Sampling in Statistics Explained for Students

1. What is sampling in statistics?

Sampling in statistics is the process of selecting a subset of individuals from a population to estimate characteristics of the whole population. In statistics, a population is the entire group of interest, while a sample is a smaller group chosen from it. Sampling is used because studying the entire population is often expensive or impractical. For example, surveying 200 voters to estimate the preference of 10,000 voters is an example of statistical sampling.

2. What are the main types of sampling methods?

The main types of sampling methods are probability sampling and non-probability sampling.

  • Probability sampling: Every member of the population has a known chance of being selected (e.g., simple random sampling, stratified sampling, systematic sampling, cluster sampling).
  • Non-probability sampling: Members are selected without a known probability (e.g., convenience sampling, quota sampling, judgment sampling).
Probability sampling is generally preferred in statistics because it reduces sampling bias and supports valid statistical inference.

3. What is simple random sampling?

Simple random sampling is a method where every member of the population has an equal chance of being selected. In simple random sampling, selections are made using random methods such as random number tables or computer-generated random numbers. For example, choosing 5 students randomly from a class of 30 by drawing names from a box ensures each student has an equal probability of selection.
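
The class example can be sketched with computer-generated random numbers using Python's standard `random` module. The student names and the seed are assumptions made for the illustration.

```python
import random

rng = random.Random(42)  # seeded generator so the draw is repeatable

# Hypothetical class of 30 students.
class_roll = [f"Student {i}" for i in range(1, 31)]

# Draw 5 different students; every student is equally likely to be chosen.
chosen_students = rng.sample(class_roll, k=5)
print(chosen_students)
```

`sample` draws without replacement, so no student can be picked twice, just like drawing names from a box.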

4. What is the difference between population and sample?

A population is the entire group being studied, while a sample is a smaller subset selected from that population.

  • Population (N): All individuals or items of interest.
  • Sample (n): A part of the population used for analysis.
For example, if you study the heights of all 500 students in a school, that is the population. If you study only 50 students, those 50 form the sample.

5. What is sampling bias?

Sampling bias occurs when certain members of a population are more likely to be selected than others, leading to unrepresentative results. Sampling bias affects the accuracy of statistical conclusions. For example, conducting an online survey to estimate average income may exclude people without internet access, causing biased results. Proper random sampling methods help reduce bias.

6. What is the formula for sample mean?

The formula for the sample mean is x̄ = (Σx) / n. Here:

  • Σx = sum of all sample values
  • n = sample size
For example, if the sample values are 4, 6, and 10, then Σx = 20 and n = 3, so the sample mean is x̄ = 20/3 = 6.67 (approximately).

7. What is sample size and why is it important?

Sample size is the number of observations included in a sample, and it directly affects the accuracy of statistical estimates. The sample size (n) determines how reliable the results are.

  • Larger samples generally reduce sampling error.
  • Smaller samples may give less precise estimates.
For example, a survey of 1,000 people usually gives more reliable results than a survey of 50 people.

8. What is stratified sampling?

Stratified sampling is a probability sampling method where the population is divided into groups called strata, and random samples are taken from each group. In stratified sampling, each stratum shares similar characteristics (e.g., age groups, income levels). For example, if a school has 60% girls and 40% boys, the sample can be selected in the same proportion to ensure fair representation.
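
A minimal sketch of proportional stratified sampling in Python is shown below. It assumes a hypothetical school of 100 students split 60/40, and the helper `stratified_sample` is an illustrative name, not a library function.

```python
import random


def stratified_sample(strata, sample_size, rng):
    """Draw a random sample from each stratum in proportion to its size.

    strata maps a stratum name to the list of its members. Because of
    rounding, the shares may not always add up exactly to sample_size.
    """
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, share)
    return sample


# Hypothetical school of 100 students: 60% girls and 40% boys.
school = {
    "girls": [f"Girl {i}" for i in range(1, 61)],
    "boys": [f"Boy {i}" for i in range(1, 41)],
}
stratified = stratified_sample(school, sample_size=10, rng=random.Random(1))
print({name: len(members) for name, members in stratified.items()})
# {'girls': 6, 'boys': 4}
```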

9. What is systematic sampling?

Systematic sampling is a method where every kth member of a population is selected after a random starting point. The selection interval is calculated using k = N / n, where N is population size and n is sample size. For example, if N = 100 and n = 10, then k = 10, so every 10th person is chosen after a random start between 1 and 10.
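
The same example can be sketched in Python. The helper `systematic_sample` is a hypothetical name, and the sketch assumes that N is exactly divisible by n so that k is a whole number.

```python
import random


def systematic_sample(population, sample_size, rng):
    """Pick every k-th member after a random start, with k = N / n."""
    k = len(population) // sample_size  # selection interval (assumes N divisible by n)
    start = rng.randrange(k)            # random 0-based starting position below k
    return population[start::k][:sample_size]


people = list(range(1, 101))  # hypothetical population numbered 1 to 100
picked = systematic_sample(people, sample_size=10, rng=random.Random(7))
print(picked)
```

The first person picked is numbered between 1 and 10, and each later pick is exactly 10 places further along the list.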

10. What is sampling error?

Sampling error is the difference between a sample statistic and the true population parameter. Sampling error occurs because a sample does not perfectly represent the population. For example, if the population mean is 50 and the sample mean is 48, the sampling error is 2. Increasing the sample size generally reduces sampling error in statistical analysis.