How to turn a wall of numbers into a beautiful visual? - DSBoost Guest Post #1
🙋♂️ Guest Post
Today I’ll show you one of the graphs I’ve created recently and explain some of the design choices I made. The data for the visualization you’ll see in a moment is very simple and comes from the following tweet:
The World of Statistics often posts very simple stats like the one above. And as much as I enjoy them, every time I see a wall of numbers, I think:
It would have looked so much better if it was a plot!
So this time I decided to plot the data and see whether it really looks better when visualized. This is what I got:
This plot was created using only Python and Matplotlib. At the bottom of this post, you’ll find a link to the full code.
It started off as a simple plot with dots along the x-axis, each marking a founding year with a label next to it. But it was ugly. The labels were jam-packed and stacked on top of each other.
So I moved the labels to the y-axis. This gave me an effect similar to a horizontal bar chart, except with dots instead of bars. I’ve also added a mild grey shading underneath.
I think it’s a nice touch as it looks a bit like a mountain slope. Every manufacturer (every new dot) is like another milestone that the automotive business has collectively achieved (by growing bigger and having more players in the game).
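A minimal sketch of this first stage, for readers who want to try the idea: labels on the y-axis, dots at the founding years, and a grey fill between the left edge and the dots. The five rows are illustrative only (not the dataset from the tweet), and using `fill_betweenx` for the shading is an assumption about one way to get the slope effect, not necessarily how the original plot does it.

```python
import matplotlib.pyplot as plt

# Illustrative rows only (not the post's dataset): manufacturer -> founding year.
founded = {"Fiat": 1899, "Ford": 1903, "Toyota": 1937, "Hyundai": 1967, "Tesla": 2003}

names = list(founded)              # already ordered by founding year
years = [founded[n] for n in names]
rows = list(range(len(names)))     # one y position per manufacturer

fig, ax = plt.subplots(figsize=(6, 3))

# Grey area between the earliest year and each dot: the "slope".
ax.fill_betweenx(rows, min(years), years, color="lightgrey", alpha=0.5)

# Markers only ("o"), so no line is drawn between the dots.
ax.plot(years, rows, "o", color="black")

ax.set_yticks(rows)
ax.set_yticklabels(names)          # manufacturer labels on the y-axis
ax.invert_yaxis()                  # oldest manufacturer at the top
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result.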
I was quite happy with the result as it was but I wanted to add more depth to the data. Without spending too much time, it was easy to expand the original data by adding the continent the manufacturer was founded in.
At this point, it could have been any other feature: current market value, current market share, annual volume of cars, etc. But I kept it simple and limited myself to the continent of origin.
Once I added the continent, I color-coded the dots accordingly and added a legend. The graph had more information now, but it became cluttered and difficult to read again.
Therefore, I decided to add the bottom part, where we can see a clear split between the continents. This is also the plot where I provide one more bit of information by using the country code to indicate each manufacturer’s country of origin.
At this point, I felt the graph design was complete and all that was left to do was visual touches (choosing fonts/colors/hiding spines/grids, etc.).
From the graph you can clearly see that:
The majority of the automotive industry was developed in the first half of the 20th century.
It started in Europe.
Asia started developing its own cars well after Europe and the US.
After publication, I had another idea of how you can add some more context to this modest dataset.
I’ve decided to add a bar chart next to the main graph. The bar chart shows the time difference between the current manufacturer (one per bar) and the first established company. Alternatively, this could have been replaced by the time difference between the current manufacturer’s and the previous manufacturer’s founding year.
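Both variants of the time difference are short to compute with pandas. The sketch below is only an illustration under assumptions: the column names (`founded`, `since_first`, `since_previous`) are hypothetical, and the rows are a small sample rather than the data behind the plot.

```python
import pandas as pd

# Illustrative rows only (not the post's dataset).
df = pd.DataFrame({
    "manufacturer": ["Fiat", "Ford", "Toyota", "Hyundai", "Tesla"],
    "founded": [1899, 1903, 1937, 1967, 2003],
}).sort_values("founded", ignore_index=True)

# Variant 1: years elapsed since the earliest founding year in the table.
df["since_first"] = df["founded"] - df["founded"].min()

# Variant 2: gap to the previously founded manufacturer (0 for the first row).
df["since_previous"] = df["founded"].diff().fillna(0).astype(int)
```

Either column can then be passed as the bar widths of a horizontal bar chart.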
From the programmatic side:
The data was stored as a Pandas DataFrame.
After adding the continent information, I used Pandas’ groupby() function to split the data. This helped me assign the respective color to each continent.
To plot the dots, I used Matplotlib’s ax.plot() function, as it does the job well and there was no need to resort to ax.scatter().
The sidebar chart in the extended version of the plot was created using the ax.barh() function.
The last element of the plot was the lines connecting the dots with the labels. They were plotted using ax.vlines() and ax.hlines().
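The points above can be put together in a rough sketch. It is not the code behind the plot: the rows, the `row` column, and the color palette are hypothetical, only `ax.hlines()` is shown for the connecting lines, and `plt.subplots()` is used for the two panels purely to keep the example short.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only (not the post's dataset).
df = pd.DataFrame({
    "manufacturer": ["Fiat", "Ford", "Toyota", "Hyundai", "Tesla"],
    "founded": [1899, 1903, 1937, 1967, 2003],
    "continent": ["Europe", "North America", "Asia", "Asia", "North America"],
}).sort_values("founded", ignore_index=True)
df["row"] = range(len(df))  # y position of each manufacturer

# Hypothetical palette; any mapping from continent to color works.
palette = {"Europe": "tab:blue", "North America": "tab:red", "Asia": "tab:green"}

fig, (ax, ax_side) = plt.subplots(1, 2, sharey=True, figsize=(8, 3))

# One ax.plot() call per continent: one color and one legend entry each.
for continent, group in df.groupby("continent"):
    ax.plot(group["founded"], group["row"], "o",
            color=palette[continent], label=continent)

# Thin guide lines running from the label side of the axes to each dot.
ax.hlines(df["row"], df["founded"].min() - 5, df["founded"],
          color="lightgrey", linewidth=0.8, zorder=0)

ax.set_yticks(df["row"])
ax.set_yticklabels(df["manufacturer"])
ax.legend(frameon=False)

# Side panel: years elapsed since the earliest founding year.
ax_side.barh(df["row"], df["founded"] - df["founded"].min(), color="grey")

ax.invert_yaxis()  # shared y-axis, so both panels flip together
```

Plotting each group separately is what makes `ax.plot()` sufficient here: every call takes a single color, so per-point coloring via `ax.scatter()` is not needed.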
All this was bundled together in a subplot using Matplotlib’s fairly recent function:
This function makes it very easy to create a grid layout.
The raw structure for my plot was coded with the following lines:
And it produced the following grid:
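As a rough sketch of how such a grid can be declared, assuming the function in question is `plt.subplot_mosaic` (available since Matplotlib 3.3): the panel names, the arrangement, and the size ratios below are hypothetical and may differ from the actual layout of the plot.

```python
import matplotlib.pyplot as plt

# Hypothetical panel names and arrangement. A name repeated in
# neighbouring cells makes that panel span those cells.
layout = [
    ["main", "side"],
    ["continents", "side"],
]

fig, axes = plt.subplot_mosaic(
    layout,
    figsize=(9, 6),
    # Assumed proportions: wide main column, shorter bottom row.
    gridspec_kw={"width_ratios": [3, 1], "height_ratios": [2, 1]},
)

# axes is a dict keyed by panel name, so panels are addressed by label.
for name, ax in axes.items():
    ax.set_title(name)
```

Addressing panels by name (`axes["main"]`, `axes["side"]`) keeps the plotting code readable compared with indexing into an array of axes.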
And that’s it for today.
You can explore the full code here:
📑 Recommended Content
Pawel has a list on X with people who create amazing visuals.
You can follow it here:
While we’re on the subject of visuals, why not check out this roadmap, which also covers how to create charts like this one?