Using PostgreSQL for JSON Storage
Let's imagine a scenario in which you are dealing with JSON in your application and you want to store it in your database. You let out a heavy sigh and think, "I guess I am going to have to add something besides my favorite DB (Postgres) to my architecture. I wish I could just keep using PostgreSQL."
You start thinking that instead of one problem, now you have many. You're probably going to have to learn a new data query syntax, data creation statement, install new software, and worst of all for you as an app developer, running some new server in production. Now rather than developing your features, you are going to be spending valuable time learning, experimenting, and praying.
Well my friend, today is the day your wishes come true. In the blog post we will talk a little about how you can use PostgreSQL for all your JSON needs. We will also point you at some free learning resources so you can dig in deeper.
JSON versus JSONB
Since 9.2, released in September 2012, PostgreSQL has had a JSON type. This original JSON type was not much more than just a simple storage field that let you dump JSON into your database table. It is just a simple text field that checks to make sure your JSON is well formed. Other than that it doesn't do much and I would not recommend using it.
With PostgreSQL release 9.4 in December 2014, a JSONB type was added. Though I joke that the B stands for better it really stands for Binary. When you put JSON data into a JSONB column, in addition to checking for well formed JSON, you now have the ability to indexing and query and retrieve portions of the document. Generally for all your work you should use JSONB unless you have a compelling reason not to. Here are a couple of nice discussions on the tradeoffs and choosing JSON versus JSONB
What can you do with JSONB in PostgreSQL
Say you had some JSON like:
{
"person": {
"first_name": "Steve",
"last_name": "Pousty"
},
"score": 100,
"status": "Awesome",
"best_feature": "humbleness"
}
Once you put it in a JSOB column named json_content (and make a GIN Index for faster queries) you can do all sorts of fun things. Please note that I will be using the JSONB navigation and function syntax found in PostgreSQL version 11. There was a major improvement to JSON document navigation and querying in version 12 which will be the focus of another blog post.
In the select part of the query
Let's get the users last name.
SELECT json_content #> {person, last_name}
FROM mytable;
The #> or #>> is the JSON path navigator with the difference being #> returns JSON and the #>> returns the JSON text value.
In the Where clause
Using that same document navigation syntax, we can then combine that with the containment check. Just like the name sounds, we will check to see if the stored JSON contains the JSON we are looking for. For example, if we wanted to return only those records that had a status of awesome we would write:
SELECT json_content
FROM mytable
WHERE json_content @> '{"status": "Awesome"}'::jsonb;
This
@>
operator looks for JSON that contains the JSON on the right side of the operator.
The Beauty of All This
The best part of working with JSON in PostgreSQL is that you get to leverage all the normal SQL you already love along with these JSON functions. SQL processing can be used to greatly reduce the amount of code you need in your application. For example, here is the query to get all the distinct status types in the table:
SELECT distinct(json_content #>> {status}) as status, count(json_content) FROM mytable
GROUP BY json_content #>> {status};
Learn more
So if you are intrigued by what you saw here and want to learn more here are some great resources (if I do say so myself).
We have an online tutorial to get your started with JSON in PostgreSQL. It is free and available 24/7 and I also dida live stream walking through the material above.
Let us know @crunchydata what you think of the material! I would also love to hear about how YOU are using JSON with PostgreSQL.
Thanks and happy coding.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read