It’s actually been quite a while since I put together these free sample data sets for everyone, but it just struck me today that I’ve never blogged about them. It was in December 2020 that I finally published and announced these data sets on Twitter. But what am I actually talking about here, you may ask? Well, here comes a story of “when life gives you lemons, make lemonade!”
It all started, like most stories of 2020, with COVID-19 and being stuck at home because of lockdown. Well, technically, it started slightly before then. You see, my wife Federica gave me one of my best Christmas presents ever back in 2019, the freshly printed National Geographic Atlas of the World, 11th Edition. Little did I know that the atlas would come in handy so soon. Pretty much as soon as the lockdown hit, I took the opportunity to study the atlas. What caught my attention immediately was the list of all the countries. My wife was quite curious herself and so we started checking out the atlas together, making it a regular evening activity.
The geek that I am, it didn’t take long for me to dislike her taking notes on her notepad. Naturally, an Excel spreadsheet would be so much better: we could then even sort and filter the data and see, for example, what the smallest country in the world is by square kilometers or by population. It would almost be as if that data was in a database… You guessed it, my next idea was “never mind the spreadsheet, let’s put it all in a data model”. And so, over the course of a couple of months, we studied each country, wrote down the details, and proofread everything twice. I then decided to publish that work under https://github.com/gvenzl/sample-data under the Creative Commons Attribution 4.0 International License (in other words, it’s fully free to use, although a shout-out is always appreciated). The project was not only educational and a lot of fun, but it also inspired me to provide another, fictional data set about employees in a company. That actually happened because my wife got curious about how databases work, but she seems to have lost that curiosity again very quickly. 😀
Under https://github.com/gvenzl/sample-data you will find a repository of free data sets to use. Everything up there has been put together by me and is, as already said, free to use and also free of any copyright infringements. My goal was to provide data that anybody can use for whatever (legal) purposes without having to worry about it. What I like a lot about the countries data set is that it is also educational about our planet. However, the employees data set is equally useful for, as an example, demonstrating the basic capabilities of relational databases. You can download the data as a set of CSV files and import them into Excel. I also provided .sql files that will create a relational model and insert the data for the most commonly used databases. Over time, my aim is to provide more such data sets in that repository; however, I am in no rush to do so. So don’t expect to see anything anytime soon.
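If you would rather script it than open Excel, here is a minimal Python sketch of the "smallest country by area or by population" question that started all this. Note that the column names (name, area_sq_km, population) and the three rows are made-up placeholders for illustration only; they are assumptions and not the actual layout or content of the CSV files in the repository.

```python
import csv
import io

# Made-up placeholder rows. The column names are assumptions for this
# sketch, not the actual schema of the CSV files in the repository.
SAMPLE_CSV = """name,area_sq_km,population
Country A,1200.5,500000
Country B,0.5,800
Country C,30000,12000000
"""


def load_countries(source):
    """Read CSV rows into dicts and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(source):
        rows.append({
            "name": row["name"],
            "area_sq_km": float(row["area_sq_km"]),
            "population": int(row["population"]),
        })
    return rows


def smallest_by(rows, column):
    """Return the row with the lowest value in the given column."""
    return min(rows, key=lambda r: r[column])


countries = load_countries(io.StringIO(SAMPLE_CSV))

# The same sort and filter questions you would ask in a spreadsheet.
print(smallest_by(countries, "area_sq_km")["name"])
print(smallest_by(countries, "population")["name"])
```

To try this on a downloaded file, pass an open file handle to load_countries instead of the io.StringIO object and adjust the column names to whatever the file's header row actually contains.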
Just as I hope that this will help others with their demos, I of course also use the data sets for my own demos. Just today I loaded the data into my Always Free Tier Oracle Autonomous Database and made the data accessible via REST endpoints. It was quite fun and now anybody in the world can also use the data for any REST-based demo under the following endpoints:
- Get all countries in the world:
- Get a specific country by its country code:
- Get all employees:
- Get a specific employee by the ID:
While doing all that, I also learned about a cool little feature of Brave that allows you to generate QR codes for a URL (just click in the URL field and click on the square that appears to the right). Just point your phone camera at these guys: