Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more
Paul Ramsey
Paul Ramsey
With the extension, it is possible to access gigabytes of raster data from the cloud, without ever downloading the data . How? The venerable extension (released 13 years ago ) already has the critical core support built-in! Rasters can be stored inside the database, or outside the database, on a local file system or anywhere it can be accessed by the underlying GDAL raster support library. The storage options include S3, Azure, Google, Alibaba, and any HTTP server that supports RANG...
Read MorePaul Ramsey
Paul Ramsey
The number of cool things you can do with the http extension is large, but putting those things into production raises an important problem. The amount of time an HTTP request takes, 100s of milliseconds, is 10- to 20-times longer that the amount of time a normal database query takes. This means that potentially an HTTP call could jam up a query for a long time. I recently ran an HTTP function in an update against a relatively small 1000 record table. The query took 5 minutes to run, and durin...
Read MorePaul Ramsey
Paul Ramsey
"Retrieval Augmented Generation" (RAG) is a useful technique in working with large language models (LLM) to improve accuracy when dealing with facts in a restricted domain of interest. Asking an LLM about Shakespeare: works pretty good. The model was probably fed a lot of Shakespeare in training. Asking it about holiday time off rules from the company employee manual: works pretty bad. The model may have ingested a few manuals in training, but not yours ! Is there a way around this LLM limi...
Read MorePaul Ramsey
Paul Ramsey
In late November, on the day after GIS Day, we hosted the annual PostGIS day online event. 22 speakers from around the world, in an agenda that ran from mid-afternoon in Europe to mid-afternoon on the Pacific coast. We had an amazing collection of speakers, exploring all aspects of PostGIS, from highly technical specifics, to big picture culture and history. A full playlist of PostGIS Day 2024 is available on the Crunchy Data YouTube channel . Here’s a highlight reel of the talks and themes t...
Read MorePaul Ramsey
Paul Ramsey
Large language models (LLM) provide some truly unique capacities that no other software does, but they are notoriously finicky to run, requiring large amounts of RAM and compute. That means that mere mortals are reduced to two possible paths for experimenting with LLMs: • Use a cloud-hosted service like OpenAI . You get the latest models and best servers, at the price of a few micro-pennies per token. • Use a small locally hosted small model. You get the joy of using your own hardware, and on...
Read MorePaul Ramsey
Paul Ramsey
If you missed some of the headlines and release notes, Postgres 17 added another huge JSON feature to its growing repository of strong JSON support with the JSON_TABLE feature. JSON_TABLE lets you query JSON and display and query data like it is native relational SQL. So you can easily take JSON data feeds and work with it like you would any other Postgres data in your database. A few days ago, I was awakened in the middle of the night when my house started to shake. Living in the Cascadia su...
Read MorePaul Ramsey
The Overture Maps collection of data is enormous, encompassing over 300 million transportation segments , 2.3 billion building footprints , 53 million points of interest , and a rich collection of cartographic features as well. It is a consistent global data set, but it is intimidatingly large -- what can a person do with such a thing? Building cartographic products is the obvious thing, but what about the less obvious. With an analytical engine like PostgreSQL and Crunchy Bridge for Analy...
Read MorePaul Ramsey
Paul Ramsey
Back in the 1990s, before anything was cool (or so my children tell me) and at the dawn of the Age of the Meme, a couple of college students invented a game they called the " Six Degrees of Kevin Bacon ". The conceit behind the Six Degrees of Kevin Bacon was that actor Kevin Bacon could be connected to any other actor, via a chain of association of no more than six steps. Why Kevin Bacon? More or less arbitrarily, but the students had noted that Bacon said in an interview that "he had worked...
Read MorePaul Ramsey
Paul Ramsey
Calculating distance is a core feature of a spatial database, and the central function in many analytical queries. • "How many houses are within the evacuation radius?" • "Which responder is closest to the call?" • "How many more miles until the school bus needs routine maintenance?" "How many houses are within the evacuation radius?" "Which responder is closest to the call?" "How many more miles until the school bus needs routine maintenance?" PostGIS and any other spatial database let you answ...
Read MorePaul Ramsey
Paul Ramsey
Clustering points is a common task for geospatial data analysis, and PostGIS provides several functions for clustering. • ST_ClusterDBSCAN • ST_ClusterKMeans • ST_ClusterIntersectingWin • ST_ClusterWithinWin ST_ClusterDBSCAN ST_ClusterKMeans ST_ClusterIntersectingWin ST_ClusterWithinWin We previously looked at the popular DBSCAN spatial clustering algorithm, that builds clusters off of spatial density. This post explores the features of the PostGIS ST_ClusterKMeans function. K-means cluste...
Read More