
Understanding Histograms and Kernel Density Estimation

An in-depth exploration of histograms and KDE

Towards Data Science

A histogram is a graph that visualizes the frequency of numerical data. It is commonly used in data science and statistics to get a rough estimate of the distribution of a dataset. Kernel density estimation (KDE) is a technique for estimating the probability density function (PDF) of a random variable with an unknown distribution, using a random sample drawn from that distribution. Hence, it allows us to infer the probability density of a population based on a finite dataset sampled from it. KDE is widely used in signal processing and data science as a tool for estimating probability densities. This article discusses the mathematics and intuition behind histograms and KDE, along with their benefits and limitations, and demonstrates how KDE can be implemented in Python from scratch. All figures in this article were created by the author.

Probability density function

Let X be a continuous random variable. The probability that X takes a value in the interval [a, b] can be written as

P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx    (Equation 1)

where f(x) is X's probability density function (PDF). The cumulative distribution function (CDF) of X is defined as:

F(x) = P(X ≤ x)

Hence the CDF of X, evaluated at x, is the probability that X takes a value less than or equal to x. Using Equation 1, we can write:

F(x) = ∫₋∞ˣ f(t) dt

Using the fundamental theorem of calculus, we can show that

f(x) = dF(x)/dx

which means that the PDF of X can be obtained by taking the derivative of its CDF with respect to x. A histogram is the simplest way to estimate the PDF of a dataset, and as we show in the next section, it uses Equation 1 for this purpose.
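As a quick numerical sanity check of this PDF–CDF relationship (a minimal sketch using the standard normal distribution, not code from the article), differentiating the CDF with a central difference should recover the PDF:

```python
import math

def norm_cdf(x):
    # Standard normal CDF, expressed via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    # Standard normal PDF
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Central-difference derivative of the CDF should match the PDF
h = 1e-5
max_err = max(
    abs((norm_cdf(x + h) - norm_cdf(x - h)) / (2 * h) - norm_pdf(x))
    for x in [i / 10.0 for i in range(-30, 31)]
)
print(max_err)  # effectively zero
```

The maximum discrepancy over the grid is dominated by the O(h²) truncation error of the central difference, so it is vanishingly small.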


In Listing 1, we create a bimodal distribution as a mixture of two normal distributions and draw a random sample of size 1000 from it. Here we mix two normal distributions, N(0, 1) and N(4, 0.8):

Hence, the means of the normal distributions are 0 and 4, and their variances are 1 and 0.8, respectively. The mixing coefficients are 0.7 and 0.3, so the PDF of the mixture of these distributions is:

f(x) = 0.7 N(x; 0, 1) + 0.3 N(x; 4, 0.8)

Listing 1 plots this PDF together with the sample; the result is shown in Figure 1.
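The listing itself is not reproduced here; the following is a minimal sketch of how such a mixture sample could be drawn with NumPy (variable names and the seed are illustrative assumptions, not the article's code):

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed, not from the article
n = 1000

# Pick a mixture component for each draw: 0 with prob 0.7, 1 with prob 0.3
components = rng.choice(2, size=n, p=[0.7, 0.3])
means = np.array([0.0, 4.0])
stds = np.sqrt(np.array([1.0, 0.8]))  # variances 1 and 0.8

# Sample from the normal component selected for each draw
sample = rng.normal(means[components], stds[components])

def mixture_pdf(x):
    """PDF of the mixture 0.7*N(0, 1) + 0.3*N(4, 0.8)."""
    x = np.asarray(x, dtype=float)
    p = np.zeros_like(x)
    for w, m, s in zip([0.7, 0.3], means, stds):
        p += w * np.exp(-((x - m) ** 2) / (2 * s ** 2)) / (s * np.sqrt(2 * np.pi))
    return p
```

Plotting `mixture_pdf` on a grid with Matplotlib, together with the sample points, would reproduce a figure like Figure 1.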

