Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Postgres Query Boost: Using ANY Instead of IN

Avatar for Matt Hudson

Matt Hudson

4 min read

The IN operator has to be in the top-3 SQL-specific operators that people learn, the close competitors are LIKE and BETWEEN. Thus, IN feels familiar when crafting SQL: you can use it with nested SQL statements, or create your own list. But, it does have limitations — what if you wanted to send a list of unknown length? You can’t really use SQL placeholders without also modifying the SQL statement.

What if I told you there was a less ubiquitous operator that offered all of the power and less of the drawbacks. In this article we'll talk about using = ANY(array) in instead of IN(list) to improve code and query quality.

We don't give Java enough love in our tutorials, so these examples are written for JDK with the Vert.x reactive Postgres library. Don’t worry, you’ll be able to read it. We chose a direct Postgres client because ORMs typically default to IN(list), and I’m making an argument they should convert to = ANY(array) where applicable.

Unforeseenn Consequences

It starts simply enough: I have a collection of items that I want to filter by id. Perhaps, this is a user-selectable form that permits users to multi-select values. The most common query looks like this:

SELECT i.*
FROM item AS i
JOIN shipment AS s ON s.item_id=i.id
WHERE s.status <> 'shipped' -- Grab only Unshipped items
  AND i.id IN (1,2,3,4,5,6);

Seems perfectly reasonable. I'll just copy that query into our code and replace the literal list with a placeholder right?

Introduce Complexity

Back in the code window, I drop it into some code approximately like this.

// Prepare Tuple of IDs
Tuple t_list = Tuple.of(1, 2, 3, 4, 5, 6);

// Prepare a parameterized query
pool.preparedQuery("SELECT i.*\\n" +
                "FROM item AS i\\n" +
                "JOIN shipment AS s ON s.item_id=i.id\\n" +
                "WHERE s.status <> 'shipped'\\n" +
                "  AND i.id IN ($1)")
				// Attempt to run it
        .execute(t_list)
				// Log the IDs returned
        .onSuccess(rows -> rows.forEach(
					row -> logger.info("id: " + row.getInteger("id"))))
				// Failure: Log the complaint
        .onFailure(t -> logger.error("QUERY FAIL", t));

But this doesn’t work. It will throw an exception about the wrong number of parameters. The $1 placeholder expects a single value, but it receives 6 values. Aha, that Tuple must be packed wrong for the IN parameter. Maybe sending an array is better.

// Prepare Tuple of array of IDs
Integer[] items = {1, 2, 3, 4, 5, 6};
Tuple t_list = Tuple.of(items);

// Prepare a parameterized query
pool.preparedQuery("SELECT i.*\\n" +
                "FROM item AS i\\n" +
                "JOIN shipment AS s ON s.item_id=i.id\\n" +
                "WHERE s.status <> 'shipped'\\n" +
                "  AND i.id IN ($1)")
				// Attempt to run it
        .execute(t_list)
				// Log the IDs returned
        .onSuccess(rows -> rows.forEach(
					row -> logger.info("id: " + row.getInteger("id"))))
				// Failure: Log the complaint
        .onFailure(t -> logger.error("QUERY FAIL", t));

This doesn’t work either. It throws a wrong type error. Because Postgres is strongly typed, when comparing id SQL expects a Number, but gets Integer[].

Time to Read the Fine Manual

The exceptions coming back from these attempts are not getting better. Let's head over to the PostgreSQL 15 documentation to see if anything leaps out, regarding this.

9.24.1 IN

The right-hand side is a parenthesized list of scalar expressions.

How do I send a parenthesized list of scalars? Is that even possible?

Well, I tried sending an array and I tried sending multiple scalars, but none of those matched a parenthesized list of scalar expressions. As it turns out, you can't prepare a variable list of scalar expressions. If you mean to send 5 scalars, you must write IN ($1, $2, $3, $4, $5). This operator is finicky.

The case for ANY (or SOME)

Fortunately the solution is easily found on the same PostgreSQL 15 documentation page if I continue scrolling.

9.24.3. ANY/SOME (array)

	expression operator ANY (array expression)
	expression operator SOME (array expression)

The right-hand side is a parenthesized expression, which must yield an array value.

Possible match? From the definitions, expression IN (...) is equivalent to expression = ANY (...) but for the parameter type!

Endgame

Let's give it a shot.

// Prepare Tuple of array of IDs
Integer[] items = {1, 2, 3, 4, 5, 6};
Tuple t_list = Tuple.of(items);

// Prepare a parameterized query
// this time replacing IN with =ANY
pool.preparedQuery("SELECT i.*\\n" +
                "FROM item AS i\\n" +
                "JOIN shipment AS s ON s.item_id=i.id\\n" +
                "WHERE s.status <> 'shipped'\\n" +
                "  AND i.id = ANY($1)")
				// Attempt to run it
        .execute(t_list)
				// Log the IDs returned
        .onSuccess(rows -> rows.forEach(
					row -> logger.info("id: " + row.getInteger("id") + " name: " + row.getString("name"))))
				// Failure: Log the complaint
        .onFailure(t -> logger.error("QUERY FAIL", t));

Success! My array binds as a single parameter to ANY.

Conclusions

Given the constraints on IN and the need to send parameters securely, it makes sense use = ANY where you might otherwise use IN.

In addition to parameter binding, it's worth mentioning that ANY works with all boolean comparison operators:

* foo LIKE ANY ('{"%bar", "%baz"}')
* foo ILIKE ANY ('{"%bar", "%baz"}')
* id <> ANY ('{1, 2, 3}')