Temporal Filtering in pg_featureserv with CQL
In a previous post we announced the CQL filtering capability in pg_featureserv
. It provides powerful functionality for attribute and spatial querying of data in PostgreSQL and PostGIS.
Another important datatype which is often present in datasets is temporal. Temporal datasets contain attributes which are dates or timestamps. The CQL standard defines some special-purpose syntax to support temporal filtering. This allows pg_featureserv
to take advantage of the extensive capabilities of PostgreSQL for specifying queries against time-valued attributes. This post in the CQL series will show some examples of temporal filtering in pg_featureserv
.
CQL Temporal filters
Temporal filtering in CQL is provided using temporal literals and conditions.
Temporal literal values may be dates or timestamps:
2001-01-01
2010-04-23T01:23:45
Note: The temporal literal syntax is based on an early version of the OGC API Filter and CQL standard. The current draft CQL standard has a different syntax: DATE('1969-07-20')
and TIMESTAMP('1969-07-20T20:17:40Z')
. It also supports intervals: INTERVAL('1969-07-16', '1969-07-24')
. A subsequent version of pg_featureserv
will support this syntax as well.
Temporal conditions allow time-valued properties and literals to be compared via the standard boolean comparison operators <
,>
,<=
,>=
,=
,<>
and the BETWEEN..AND
operator:
start_date >= 2001-01-01
event_time BETWEEN 2010-04-22T06:00 AND 2010-04-23T12:00
The draft CQL standard provides dedicated temporal operators, such as T_AFTER
, T_BEFORE
, T_DURING
, etc. A future version of pg_featureserv
will likely provide these operators.
Publishing Historical Tropical Storm tracks
We'll demonstrate temporal filters using a dataset with a strong time linkage: tracks of tropical storms (or hurricanes). There is a dataset of Historical Tropical Storm Tracks available here.
The data requires some preparation. It is stored as a set of records of line segments representing 6-hour long sections of storm tracks. To provide simpler querying we will model the data using a single record for each storm, with a line geometry showing the entire track and attributes for the start and end time for the track.
The data is provided in Shapefile format. As expected for a worldwide dataset, it is in the WGS84 geodetic coordinate system (lat/long). In PostGIS this common Spatial Reference System is assigned an identifier (SRID) of 4326.
The PostGIS shp2pgsql
utility can be used to load the dataset into a spatial table called trop_storm_raw
. The trop_storm_raw
table is a temporary staging table allowing the raw data to be loaded and made available for the transformation phase of data preparation.
shp2pgsql -c -D -s 4326 -i -I -W LATIN1 "Historical Tropical Storm Tracks.shp" public.trop_storm_raw | psql -d database
The options used are:
-c
- create a new table-D
- use PostgreSQL dump format to load the data-s
- specify the SRID of 4326-i
- use 32-bit integers-I
- create a GIST index on the geometry column (this is not strictly necessary, since this is just a temporary staging table)-W
- specifies the encoding of the input attribute data in the DBF file
Next, create the table having the desired data model:
CREATE TABLE public.trop_storm (
btid int PRIMARY KEY,
name text,
wind_kts numeric,
pressure float8,
basin text,
time_start timestamp,
time_end timestamp,
geom geometry(MultiLineString, 4326)
);
It's good practice to add comments to the table and columns. These will be displayed in the pg_featureserv
Web UI.
COMMENT ON TABLE public.trop_storm IS 'This is my spatial table';
COMMENT ON COLUMN public.trop_storm.geom IS 'Storm track LineString';
COMMENT ON COLUMN public.trop_storm.name IS 'Name assigned to storm';
COMMENT ON COLUMN public.trop_storm.btid IS 'Id of storm';
COMMENT ON COLUMN public.trop_storm.wind_kts IS 'Maximum wind speed in knots';
COMMENT ON COLUMN public.trop_storm.pressure IS 'Minumum pressure in in millibars';
COMMENT ON COLUMN public.trop_storm.basin IS 'Basin in which storm occured';
COMMENT ON COLUMN public.trop_storm.time_start IS 'Timestamp of storm start';
COMMENT ON COLUMN public.trop_storm.time_end IS 'Timestamp of storm end';
Now the power of SQL can be used to transform the raw data into the simpler data model. The track sections can be combined into single tracks with a start and end time using the following query.
- The original data represents the track sections as
MultiLineString
s with single elements. The element is extracted usingST_GeometryN
so that the result of aggregating them usingST_Collect
is aMultiLineString
, not aGeometryCollection
. (An alternative is to aggregate into a GeometryCollection and useST_CollectionHomogenize
to reduce it to aMultiLineString
.) - The final
ST_Multi
ensures that all tracks are stored asMultiLineStrings
, as required by the type constraint on thegeom
column. - the filter condition
time_end - time_start < '1 year'::interval
removes tracks spanning the International Date Line.
WITH data AS (
SELECT btid, name, wind_kts, pressure, basin, geom,
make_date(year::int, month::int, day::int) + ad_time::time AS obs_time
FROM trop_storm_raw ORDER BY obs_time
),
tracks AS (
SELECT btid,
MAX(name) AS name,
MAX(wind_kts) AS wind_kts,
MAX(pressure) AS pressure,
MAX(basin) AS basin,
MIN(obs_time) AS time_start,
MAX(obs_time) AS time_end,
ST_Multi( ST_LineMerge( ST_Collect( ST_GeometryN(geom, 1)))) AS geom
FROM data GROUP BY btid
)
INSERT INTO trop_storm
SELECT * FROM tracks WHERE time_end - time_start < '1 year'::interval;
This is a small dataset, and pg_featureserv
does not require one, but as per best practice we can create a spatial index on the geometry column:
CREATE INDEX trop_storm_gix ON public.trop_storm USING GIST ( geom );
Once the trop_storm
table is created and populated, it can be published in pg_featureserv
. Issuing the following request in a browser shows the feature collection in the Web UI:
http://localhost:9000/collections.html
http://localhost:9000/collections/public.trop_storm.html
The dataset can be viewed using pg_featureserv
's built-in map viewer (note that to see all 567 records displayed it is probably necessary to increase the limit on the number of response features):
http://localhost:9000/collections/public.trop_storm/items.html?limit=1000
Querying by Time Range
That's a lot of storm tracks. It would be easier to visualize a smaller number of tracks. A natural way to subset the data is by querying over a time range. Let's retrieve the storms between the start of 2005 and the end of 2009. This is done by adding a filter
parameter with a CQL expression against the dataset temporal property time_start
(storms typically do not span the start of years). To query values lying between a range of times it is convenient to use the BETWEEN
operator. The filter condition is time_start BETWEEN 2005-01-01 AND 2009-12-31
. The full request is:
http://localhost:9000/collections/public.trop_storm/items.html?filter=time_start BETWEEN 2005-01-01 AND 2009-12-31&limit=100
Submitting this query produces a result with 68 tracks:
Querying by Time and Space
Temporal conditions can be combined with other kinds of filters. For instance, we can execute a spatio-temporal query by using a temporal condition along with a spatial condition. In this example, we query the storms which occurred in 2005 and after in Florida. The temporal condition is expressed as time_start > 2005-01-01
.
The spatial condition uses the INTERSECTS
predicate to test whether the line geometry of a storm track intersects a polygon representing the (simplified) coastline of Florida. The polygon is provided as a geometry literal using WKT. (For more information about spatial filtering with CQL in pg_featureserv
see this blog post.)
POLYGON ((-81.4067 30.8422, -79.6862 25.3781, -81.1609 24.7731, -83.9591 30.0292, -85.2258 29.6511, -87.5892 29.9914, -87.5514 31.0123, -81.4067 30.8422))
Putting these conditions together in a boolean expression using AND
, the request to retrieve the desired tracks from pg_featureserv
is:
http://localhost:9000/collections/public.trop_storm/items.html?filter=time_start > 2005-01-01 AND INTERSECTS(geom, POLYGON ((-81.4067 30.8422, -79.6862 25.3781, -81.1609 24.7731, -83.9591 30.0292, -85.2258 29.6511, -87.5892 29.9914, -87.5514 31.0123, -81.4067 30.8422)) )&limit=100
This query produces a result with only 9 tracks, all of which cross Florida:
Try it yourself!
CQL temporal filtering is included in the forthcoming pg_featureserv
Version 1.3. But you can try it out now by downloading the latest build. Let us know what use cases you find for CQL temporal filtering! Crunchy Data offers full managed PostGIS in the Cloud, with Container apps to run pg_featureserv. Try it today.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read