A PostgreSQL Row Level Security Primer + Creating Large Policies
Row Level Security, aka "RLS," allows a database administrator to define if a user should be able to view or manipulate specific rows of data within a table according to a policy. Introduced in PostgreSQL 9.5, row level security added another layer of security for PostgreSQL users who have additional security and compliance considerations for their applications.
At Crunchy Data, we care a lot about data security and supporting PostgreSQL. When we discovered an issue with creating a large row level security policy, we worked to find a workaround that provided a secure outcome but avoided creating a custom fork of PostgreSQL. As of the publication of this post, the Crunchy Data team is working to address the issue with a patch that will be submitted for inclusion upstream.
To understand the issue and how our team found a solution, I'll first give a quick primer of how row level security works in PostgreSQL.
A Brief RLS Introduction
Example Setup
For this example, I created a PostgreSQL 12 instance using Docker with the following settings:
PG_PRIMARY_USER=postgres
PG_PRIMARY_PASSWORD=securepassword
PG_DATABASE=accounts
PG_USER=jkatz
PG_PASSWORD=securepassword
PG_ROOT_PASSWORD=securepassword
You can substitute jkatz with your desired username.
Once the container booted up, I logged into the PostgreSQL instance as the postgres user from the command-line with the following command:
psql -h localhost -U postgres accounts
In a separate window, I logged into the PostgreSQL instance as the jkatz user from the command-line with the following command:
psql -h localhost -U jkatz accounts
Using the postgres user, I created a table called accounts inside the public schema:
CREATE TABLE public.accounts (
id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
username text UNIQUE NOT NULL,
full_name text NOT NULL,
created_at timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP
);
I populated the public.accounts table with some random names I generated using the Faker library in Python. I ran the following SQL as the postgres user:
Last, but not least, as the postgres user I inserted a reference to the jkatz user into the "public.accounts" table:
INSERT INTO public.accounts (username, full_name, created_at)
VALUES ('jkatz', 'Jonathan Katz', CURRENT_TIMESTAMP);
What happens when I try to view the table as the jkatz user? I ran the following SQL from my window connected as jkatz:
SELECT * FROM public.accounts;
and received the following message:
**ERROR: permission denied for relation accounts**
That is to be expected. I did not give any read permissions (i.e., being able to run SELECT) to jkatz on the public.accounts table. As the postgres user, I ran the following command to give jkatz access to public.accounts
GRANT SELECT ON public.accounts TO jkatz;
Now, as the jkatz user, when I run the following
SELECT * FROM public.accounts;
I will see a list of all accounts.
Bonus: A PostgreSQL shortcut for running SELECT * FROM *tablename*
is to use the TABLE *tablename*
, e.g.
TABLE public.accounts;
Setting Up Your First Row Level Security Policy
The basis of row level security is to create policies that define how a user interacts with rows within a given table. Policies can be defined over a current database user, specific roles, or over all roles in a database by using the "PUBLIC" attribute. Policies can be created by database superusers and table owners.
We will start with a simple policy: we will enforce that the current database user can only view their own entry in the public.accounts table. As the postgres user, run the following command:
CREATE POLICY accounts_policy
ON public.accounts
USING (username = CURRENT_USER);
The USING clause evaluates whether an existing rows in a table should be returned in a query. In this case, the accounts_policy will only return rows where the username matches the current database user.
Now as the jkatz user, run the following:
TABLE public.accounts;
And watch every single row in the table be returned. This does not seem like row level security at all. Confused?
Don't worry, I left out the most important step: you MUST EXPLICITLY TURN ROW LEVEL SECURITY ON IN A TABLE in order for the policies to be enforced. Row level security is turned off by default (you will see why shortly). To enforce the row level policies for the public.accounts, run the following command as the postgres user:
ALTER TABLE public.accounts ENABLE ROW LEVEL SECURITY;
Switch back to the jkatz user and run:
TABLE public.accounts;
And only a single row that represents jkatz will be returned.
As a bonus, run the above query as the postgres user: you will see every row returned. This is because row level security policies do not apply to superusers (or in this case, the table owner as well as ENABLE ROW LEVEL SECURITY is used vs. FORCE ROW LEVEL SECURITY, which is more restrictive) in your database.
Setting a RLS Policy for an Update
I want to change my name in the system to add my middle initial. As the jkatz user, I try running:
UPDATE public.accounts SET full_name = 'Jonathan S. Katz' WHERE username = 'jkatz';
I receive an error:
**ERROR: permission denied for relation accounts**
This makes sense: we never granted permission for any user other than the superuser (who is also the table owner in this case) to perform UPDATE queries on the public.accounts table. Additionally, we may want to have our user to only update his or her name, otherwise the user could arbitrarily change the username, and perhaps violate our row level security policies.
To only grant the ability to only update the full_name column on the public.accounts table, as the postgres user run:
GRANT UPDATE (full_name) ON public.accounts TO jkatz;
Then, as the jkatz user, try running the update command again:
UPDATE public.accounts SET full_name = 'Jonathan S. Katz' WHERE username = 'jkatz';
which returns:
**UPDATE 1**
What happens if the jkatz user tries to update the created_at timestamp? If you run:
UPDATE public.accounts SET created_at = CURRENT_TIMESTAMP WHERE username = 'jkatz';
You will receive:
**ERROR: permission denied for relation accounts**
A "Too Large" Policy
Another feature of PostgreSQL row level policies is the CHECK clause, which is checked when data is manipulated to ensure they adhere to a policy. For instance, you could set a CHECK clause to disallow a row from being edited more than five minutes after its creation time.
To reproduce the issue (which exists as of this writing in PostgreSQL 10.2), let's create a policy that restricts the user from being set to certain full_name based on a list. Similar to populating the data set, I generated a random list of 1000 names to create the policy. As the postgres user, try running the following code (note: the code is run in a transaction so that if there is an error, the old policy is not accidentally dropped):
You will receive a message like this:
**ERROR: row is too big: size 23368, maximum size 8160**
(For safe measure, check that the existing policy is still in place by running the following as the jkatz user:
TABLE public.accounts;
If it returns only the one row, then the original policy is still being enforced).
What does that error mean?
PostgreSQL groups its data into "pages" and each page by default is 8KB, or 8192 bytes in size. The row level security policies are supposed to fit into a single page, with 32 bytes allocated for meta information.
The definition of the USING and CHECK clauses are stored in a data type called pg_node_tree, which is a system data type that stores a parsed query tree. Thus, if you look at how large the CHECK clause is in the previous policy definition, you can see that it is larger than the 8KB page limit.
The Fix and "The Fix"
A team of Crunchy Data engineers quickly identified the above problem in the PostgreSQL code and came up with a solution: TOAST. TOAST stands for "The Oversized Attribute Storage Technique" and is the system used by PostgreSQL to store data that does not fit within a page. In short, the system columns storing the data for the USING and CHECK clauses in a policy definition need to be marked as "toastable" in order to save large policies.
However, as I mentioned in the beginning, the Crunchy Data team is committed to submitting patches upstream to the PostgreSQL project versus keeping a separate fork, and thus we needed at least a temporary solution for the problem at hand.
The team came up with the pattern of using a PostgreSQL function, which is TOASTable, to encapsulate any complicated logic within a policy. Thus, the policy size should be well within the 8KB page limit.
For the above case, I put all the names in a set-returning function which I created with the postgres user:
After that, as the postgres user, I ran the following code:
BEGIN;
DROP POLICY IF EXISTS accounts_policy ON public.accounts;
CREATE POLICY accounts_policy
ON public.accounts
USING (username = CURRENT_USER)
WITH CHECK (
NOT accounts_is_excluded_full_name(full_name)
);
COMMIT;
As the jkatz user, I now attempt to set full_name to "Robert Johnson," which is one of the names on the list:
UPDATE public.accounts SET full_name = 'Robert Johnson' WHERE username = 'jkatz';
The result:
**ERROR: new row violates row-level security policy for table "accounts"**
In Summary
PostgreSQL is robust and contains a lot of features to both help your development and keep your data secure. In fact, PostgreSQL is so robust that if you do run into issues with the platform, you already have the tools to efficiently work around them.
If you do believe you've found a bug in PostgreSQL, please read up on the community bug reporting policy and file a report.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read