STATISTICS 101: TYPES OF DISTRIBUTIONS
In the last post, we talked about different types of data. Data, as we know, can be discrete or continuous. Discrete data can take only countable values, such as the number of patients in a ward. Continuous data can take any value within a range, including decimal values, such as height or weight.
While data in its raw form may seem unremarkable, its true value lies in its distribution. Think of a distribution as a visual storyteller, painting a picture of how the data is scattered across different values or ranges. These patterns, when deciphered, can unlock a wealth of insights and guide us towards informed decisions.
The most commonly used distributions are described below.
A Binomial Distribution is one of the simplest types of distribution. It is used when each event (or trial) can have only two possible outcomes, such as a pass or a fail, and the probability of success is the same for every trial. The two outcomes do not need to be equally likely. The distribution describes the number of successes in a fixed number of independent trials. For example, a researcher may want to know the success of a treatment on a set of patients. For each patient, the treatment can either be a success or a failure. Using the binomial distribution, the researcher can estimate how many patients are likely to be successfully treated.
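To make the treatment example concrete, here is a minimal sketch using only Python's standard library. The numbers (10 patients, a 70% chance of success per patient) and the function name binomial_pmf are assumptions invented for this illustration, not figures from any study.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    where each trial succeeds with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical values chosen only for illustration.
n_patients = 10
p_success = 0.7

# Probability that exactly 7 of the 10 patients are treated successfully.
prob_exactly_7 = binomial_pmf(7, n_patients, p_success)

# Probability that at least 7 patients are treated successfully.
prob_at_least_7 = sum(
    binomial_pmf(k, n_patients, p_success) for k in range(7, n_patients + 1)
)

# The expected number of successes is n * p.
expected_successes = n_patients * p_success

print(f"P(exactly 7 successes)  = {prob_exactly_7:.4f}")
print(f"P(at least 7 successes) = {prob_at_least_7:.4f}")
print(f"Expected successes      = {expected_successes:.1f}")
```

Changing p_success shows how the most likely number of successful treatments shifts with the per-patient success rate.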
A Poisson Distribution gives the probability of a given number of events occurring in a fixed period, when the events happen independently of each other and at a constant average rate. It can, for example, predict how many deaths are likely to occur in a day in a town or how many patients might come to a doctor’s clinic in a fixed period.
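The clinic example can be sketched in a few lines of Python. The average of 4 patients per hour and the function name poisson_pmf are assumptions made up for this illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events in a period in which
    `rate` events occur on average."""
    return rate**k * exp(-rate) / factorial(k)


# Hypothetical value: the clinic sees 4 patients per hour on average.
average_per_hour = 4.0

# Probability that exactly 6 patients arrive in one hour.
prob_exactly_6 = poisson_pmf(6, average_per_hour)

# Probability that nobody arrives in one hour.
prob_none = poisson_pmf(0, average_per_hour)

# Probability that more than 6 patients arrive: 1 minus P(0..6).
prob_more_than_6 = 1 - sum(poisson_pmf(k, average_per_hour) for k in range(7))

print(f"P(exactly 6 patients)   = {prob_exactly_6:.4f}")
print(f"P(no patients)          = {prob_none:.4f}")
print(f"P(more than 6 patients) = {prob_more_than_6:.4f}")
```

A clinic manager could use a calculation like this to judge how often an hour will be busier than the staff can comfortably handle.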
The Normal Distribution is the most widely used and perhaps the most important distribution for continuous data. It is symmetric and bell-shaped. It is useful because it can represent many real-life phenomena in physics, biology, mathematics, finance, and economics. Another essential feature of the Normal Distribution is that it can approximate other types of distributions, for example a binomial distribution with a large number of trials.
Some commonly encountered examples of approximately normally distributed variables in a population are the birthweight of newborn babies, the heights of males and diastolic blood pressure.
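As a rough sketch of how a normal distribution is used in practice, the snippet below relies on NormalDist from Python's standard statistics module. The mean (3.4 kg) and standard deviation (0.5 kg) for birthweight are assumed values picked only for illustration.

```python
from statistics import NormalDist

# Hypothetical parameters: mean birthweight 3.4 kg, standard deviation 0.5 kg.
mean_kg = 3.4
sd_kg = 0.5
birthweight = NormalDist(mu=mean_kg, sigma=sd_kg)

# Share of babies within one standard deviation of the mean.
# For any normal distribution this is roughly 68%.
share_within_one_sd = birthweight.cdf(mean_kg + sd_kg) - birthweight.cdf(mean_kg - sd_kg)

# Share of babies weighing less than 2.5 kg under these assumptions.
share_below_2_5 = birthweight.cdf(2.5)

# Weight below which 95% of babies fall (the 95th percentile).
weight_95th = birthweight.inv_cdf(0.95)

print(f"Within one SD of the mean: {share_within_one_sd:.1%}")
print(f"Below 2.5 kg:              {share_below_2_5:.1%}")
print(f"95th percentile:           {weight_95th:.2f} kg")
```

Because the curve is symmetric, the share of babies more than one standard deviation above the mean equals the share more than one standard deviation below it.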
Student’s t-distribution, also known simply as the t-distribution, is a bell-shaped, continuous probability distribution. It is similar to the normal distribution but has heavier tails and a lower, flatter peak. It is used when the data are approximately normally distributed but the sample size is small and the population standard deviation is unknown, a situation in which the normal distribution would be unsuitable. As the sample size increases, the t-distribution approaches the normal distribution.
The t-distribution was developed in 1908 by William Sealy Gosset, an Englishman who published under the pseudonym Student. Gosset worked at the Guinness brewery in Dublin and found that existing statistical techniques using large samples were not useful for the small sample sizes he encountered.
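The difference in shape between the t-distribution and the normal distribution can be checked numerically. The sketch below writes out the standard t density formula by hand, since Python's standard library has no t-distribution; the function name t_pdf and the choice of 5 degrees of freedom are assumptions for this illustration.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom.
    lgamma is used instead of gamma to avoid overflow for large df."""
    coefficient = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)


standard_normal = NormalDist()  # mean 0, standard deviation 1

# Compare the two curves at the centre (x = 0) and far out in the tail (x = 3).
for x in (0.0, 3.0):
    print(
        f"x = {x}: t (df=5) = {t_pdf(x, 5):.4f}, "
        f"normal = {standard_normal.pdf(x):.4f}"
    )

# With many degrees of freedom the t density is very close to the normal one.
print(f"x = 0.0: t (df=1000) = {t_pdf(0.0, 1000):.4f}")
```

At the centre the t density is lower than the normal density, and in the tail it is higher, which is what "heavier tails" means in practice.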
The Exponential Distribution is another distribution used for continuous data. It is widely used in reliability analysis. This distribution is concerned with the amount of time until some specific event occurs. For example, the time that the battery of an electric car will last or the time taken for a metal hip implant to fail can be modelled using an Exponential Distribution.
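The battery example can be sketched as follows. The average lifetime of 8 years is an assumed figure for illustration only, and the function name prob_survives_beyond is hypothetical.

```python
from math import exp, log

# Hypothetical value: the battery lasts 8 years on average.
mean_lifetime_years = 8.0
failure_rate = 1 / mean_lifetime_years  # average failures per year


def prob_survives_beyond(t: float, rate: float) -> float:
    """Probability that the time until the event exceeds t."""
    return exp(-rate * t)


# Probability that the battery is still working after 10 years.
prob_lasts_10_years = prob_survives_beyond(10, failure_rate)

# Probability that the battery fails within the first 5 years.
prob_fails_within_5 = 1 - prob_survives_beyond(5, failure_rate)

# Time by which half of all batteries are expected to have failed.
median_lifetime = log(2) / failure_rate

print(f"P(lasts more than 10 years) = {prob_lasts_10_years:.4f}")
print(f"P(fails within 5 years)     = {prob_fails_within_5:.4f}")
print(f"Median lifetime             = {median_lifetime:.2f} years")
```

Note that the median lifetime comes out shorter than the average lifetime, because a few long-lived batteries pull the average up.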