Let’s find out with KDE plots
At what age are singer-songwriters most successful? I wondered this the other day when I heard an old Stevie Wonder song. My impression was that, like mathematicians, singer-songwriters peak in their mid-to-late 20s. But what does the data say?
In this Quick Success Data Science project, we’ll use Python, pandas, and the Seaborn plotting library to investigate this question. We’ll look at the careers of 16 prominent singer-songwriters with over 500 hits among them. We’ll also incorporate an attractive graphic, known as the kernel density estimate (KDE) plot, into the analysis.
To find out when songwriters are most successful, we’ll need some guidelines. The plan is to look at:
- Singer-songwriters, including those who work with co-writers.
- Singer-songwriters with decades-long careers.
- A diverse set of singer-songwriters and musical genres.
- Singer-songwriters on the Billboard Hot 100 chart.
The Hot 100 is a weekly chart, published by Billboard magazine, that ranks the best-performing songs in the US. The rankings are based on physical and digital sales, radio play, and online streaming. We’ll use it as a consistent and objective way to judge success.
We’ll use songs written by the following highly successful artists:
I’ve recorded the age of each artist at the time of each of their hits and saved it as a CSV file stored in this Gist. If they had multiple hits in the same year, their age entry was repeated for each hit. Here’s a glimpse at the top of the file:
Cross-referencing this information is tedious (ChatGPT refused to do it!). Consequently, a few hits written by these artists but performed by others may have been inadvertently excluded.
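To make the data layout concrete (one age entry per hit, repeated when several hits fall in the same year), here is a minimal sketch in pandas. The artist names, birth years, and hit years are made up for illustration and are not taken from the actual file. It also assumes age can be approximated as hit year minus birth year, which ignores birthdays.

```python
import pandas as pd

# Hypothetical artists and years (made-up values, not the real data).
hits = pd.DataFrame({
    "artist": ["Artist A", "Artist A", "Artist A", "Artist B"],
    "birth_year": [1950, 1950, 1950, 1960],
    "hit_year": [1975, 1975, 1980, 1990],
})

# Approximate age at the time of each hit (assumption: year difference only).
# Two hits in the same year produce the same age twice.
hits["age"] = hits["hit_year"] - hits["birth_year"]

# Collect the age entries per artist, keeping the repeats.
ages_by_artist = hits.groupby("artist")["age"].apply(list).to_dict()
print(ages_by_artist)
```

Keeping the repeated entries matters: each hit counts once, so a year with several hits carries more weight in any distribution built from these ages.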
A kernel density estimate plot is a technique, similar to a histogram, for visualizing the distribution of data points. While a histogram bins and counts observations, a KDE plot smooths the observations using a Gaussian kernel. This…
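The difference between binning and smoothing can be sketched with NumPy alone. This is an illustrative example, not the article's code: the ages are made up, the helper `gaussian_kde_sketch` is a hypothetical name, and the bandwidth of 3 years is an assumed value (plotting libraries such as Seaborn normally choose one automatically).

```python
import numpy as np

# Made-up ages at the time of each hit (illustrative only, not the article's data).
ages = np.array([22, 24, 25, 25, 27, 30, 31, 35, 42], dtype=float)

# Histogram: split the age range into bins and count observations per bin.
counts, bin_edges = np.histogram(ages, bins=5)

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return bumps.mean(axis=1)

grid = np.linspace(0.0, 70.0, 701)  # ages at which to evaluate the density
bandwidth = 3.0                     # assumed smoothing width, in years
density = gaussian_kde_sketch(ages, grid, bandwidth)

# The smoothed curve is a probability density, so its area is close to 1.
area = float(np.sum(density) * (grid[1] - grid[0]))
peak_age = float(grid[np.argmax(density)])
print(counts, round(area, 3), peak_age)
```

A larger bandwidth gives a smoother, flatter curve, while a smaller one follows individual observations more closely, much as bin width changes the look of a histogram.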