Free sample data of our solar system

About two years ago I introduced a couple of free sample data sets on GitHub ( You can read the entire back story of them in this blog post. Back then I stated, “Over time, my aim is to provide more such data sets in that repository, however, I am in no rush to do so. So don’t expect to see anything anytime soon“. Well, that time has now come. Last year my wife and I visited the Kennedy Space Center in Florida which reignited my curiosity about space. I was always very interested in space and space travel and last year I finally fulfilled a childhood dream of mine seeing the Apollo capsules and Space Shuttles with my own eyes. Back then I joked with my wife about how cool it would be to try and put data about our solar system together and sure thing not long after I couldn’t resist any longer and started looking into it.

As it turns out, we have a lot of information when it comes to our solar system yet the sources on the web are fewer than I first thought. Luckily I found the NASA Space Science Data Coordinated Archive (NSSDCA) Planetary Fact Sheets, which provide a great overview and a great many numbers for the sun and the planets in our solar system. These pages are curated by Dr. David R. Williams, the acting head of the NSSDCA, to whom I have reached out to inquire about the license that the data is under. To my luck, the data is all in the public domain and hence I was free to reuse it for my repository.

Having data about the solar system is one thing. Putting it in a format so that it can be loaded into a database and still be meaningful, however, is a whole different story. Almost all numbers are so large that they are best expressed in scientific notation (10n). Some of them are even so large that the data types of some databases wouldn’t even be able to store them otherwise. And then there are the units for the individual values. They range from Joule, Kelvin, Celsius, Kilograms, Kilometers per Second, Meters per Second2, and many, many more. When I looked at the data, storing the values was easy, but conveying what they mean was a challenge on its own. Hence this data set is different from the other ones you will find in the GitHub repository. For the solar system data, the column names are long. They include not only the meaning of the value itself, such as Volume or Mass but also the scientific notation and the unit. So, for example, the column name for Mass (1024 kg) is mass_10_exp_24_kg, the column name for Volume (1010 km3) is volume_10_exp_10_km_exp_3. Of course, there are many other ways that this could be done but to me, this approach seems the most pragmatic one as the column name itself carries the meaning of the values, rather than having to, e.g., look it up somewhere else every time.
Additionally, this data set also provides a metadata table that contains more human-readable information about the meaning of all the columns and their values.

SQL fun with solar data

Just like with the other data sets you can have some SQL fun with the solar system data. For example, how do the planets in our solar system rank in terms of volume? Easy:

SELECT name, volume_10_exp_10_km_exp_3 AS volume
  FROM planets
    ORDER BY volume DESC;

NAME           VOLUME
__________ __________
Jupiter        143128
Saturn          82713
Uranus           6833
Neptune          6254
Earth         108.321
Venus          92.843
Mars           16.312
Mercury         6.083
Pluto           0.702

Their relative percentage in size to all of them combined, also easy and quite insightful:

    volume_10_exp_10_km_exp_3 AS volume,
    ,2) AS percentage
  FROM planets
    ORDER BY volume DESC;

__________ __________ _____________
Jupiter        143128         59.85
Saturn          82713         34.59
Uranus           6833          2.86
Neptune          6254          2.62
Earth         108.321          0.05
Venus          92.843          0.04
Mars           16.312          0.01
Mercury         6.083             0
Pluto           0.702             0

Just look how massive Jupiter and Saturn are! Every other planet is just a tiny “blip” compared to the two. However, what about their density alongside their volume:

    volume_10_exp_10_km_exp_3 AS volume,
    ,2) AS "VOLUME %",
    mean_density_kg_per_m_exp_3 AS density,
    ,2) AS "DENSITY %"
  FROM planets
    ORDER BY volume DESC;

__________ __________ ___________ __________ ____________
Jupiter        143128       59.85       1326         4.93
Saturn          82713       34.59        687         2.55
Uranus           6833        2.86       1270         4.72
Neptune          6254        2.62       1638         6.09
Earth         108.321        0.05       5514         20.5
Venus          92.843        0.04       5243        19.49
Mars           16.312        0.01       3934        14.63
Mercury         6.083           0       5429        20.19
Pluto           0.702           0       1854         6.89

Now, this is also interesting! Although Saturn is the second largest planet in the solar system, it’s also the least dense of all of them. On the flip side, Earth is the densest planet of them all with close competition by Mercury.

How do they all compare to the sun? Here is where the real fun starts. Note how the stars table uses volume_10_exp_12_km_exp_3 (Volume (1012 km3)) instead of the planets volume_10_exp_10_km_exp_3 (Volume (1010 km3)). That’s because the sun is just so much more massive compared to all of our planets. So, to calculate the percentages with the sun included, one first has to make sure these numbers are normalized. This was one of the reasons why I thought having the scientific notion and units in the column names makes this more apparent and hopefully easier to normalize when working with the data:

    ROUND((100*RATIO_TO_REPORT(volume) OVER()),7) AS "VOLUME %",
    ROUND((100*RATIO_TO_REPORT(density) OVER()),3) AS "DENSITY %"
  FROM (
        volume_10_exp_10_km_exp_3 AS volume,
        mean_density_kg_per_m_exp_3 AS density
      FROM planets
        volume_10_exp_12_km_exp_3*100 AS volume,
        mean_density_kg_per_m_exp_3 AS density
      FROM stars
    ORDER BY volume DESC;

NAME             VOLUME      VOLUME %    DENSITY    DENSITY %
__________ ____________ _____________ __________ ____________
Sun           141200000    99.8309151       1408        4.975
Jupiter          143128      0.101194       1326        4.685
Saturn            82713     0.0584796        687        2.427
Uranus             6833     0.0048311       1270        4.487
Neptune            6254     0.0044217       1638        5.787
Earth           108.321     0.0000766       5514       19.482
Venus            92.843     0.0000656       5243       18.525
Mars             16.312     0.0000115       3934         13.9
Mercury           6.083     0.0000043       5429       19.182
Pluto             0.702     0.0000005       1854        6.551

It should be no big surprise that the sun is by far the largest body in the solar system. What is interesting, however, is that it is not the densest. In the result above, you can actually somewhat guess which of the bodies are gas/ice giants and which ones are solid bodies.

Ok, to round up, one last one that also surprised me. Everybody knows that Saturn is the planet with the rings around it, yes? But did you know that it is not the only one? Turns out, there are a couple of planets that have rings:

  FROM planets
    WHERE has_ring_system='Y';


Final words

I hope that you enjoy the addition of our solar system data to the data sets just as much as I do.

A special Thanks goes to Dr. David R. Williams who not only responded to my initial request but also took the time to answer some of my amateur astronomy questions!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.