Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Posts about Analytics

  • 19 min read

    Postgres Tuning & Performance for Analytics Data

    Karen Jex

    Your database is configured for the needs of your day-to-day OLTP (online transaction processing) application workload, but what if you need to run analytics queries against your application data? How can you do that without compromising the performance of your application?

    OLTP & OLAP activity in the same database

    Read More
  • 11 min read

    pg_incremental: Incremental Data Processing in Postgres

    Marco Slot

    Today I’m excited to introduce pg_incremental, a new open source PostgreSQL extension for automated, incremental, reliable batch processing. This extension helps you create processing pipelines for append-only streams of data, such as IoT / time series / event data workloads.

    Notable pg_incremental use cases include:

    • Creation and incremental maintenance of rollups, aggregations, and interval aggregations
    • Incremental data transformations
    • Periodic imports or export of new data using standard SQL
    Read More
  • 8 min read

    Iceberg ahead! Analyzing Shipping Data in Postgres

    Marco Slot

    PostgreSQL is one of the most versatile data storage and processing tools available. We enhanced it even further by adding Iceberg tables to PostgreSQL in Crunchy Data Warehouse with a fast analytical query engine.

    What is Iceberg? Iceberg tables are stored in a compressed columnar format for fast analytics in object storage (S3). This means storage is cheap and there are no storage limits. Yet the tables are still transactional and work with nearly all PostgreSQL features. Crunchy Data Warehouse can also query or load raw data from object storage into Iceberg tables via PostgreSQL commands.

    A pattern we repeatedly see in data analytics scenarios is:

    • Use temporary or external tables to collect raw data
    • Use Iceberg as a central repository to organize data
    • Use PostgreSQL tables or materialized views for querying insights
    Read More
  • 8 min read

    Crunchy Data Warehouse: Postgres with Iceberg for High Performance Analytics

    Marco Slot

    PostgreSQL is the bedrock on which many of today’s organizations are built. The versatility, reliability, performance, and extensibility of PostgreSQL make it the perfect tool for a large variety of operational workloads.

    The one area in which PostgreSQL has historically been lacking is analytics, which involves queries that summarize, filter, or transform large amounts of data. Modern analytical databases are designed to query data in data lakes in formats like Parquet

    Read More
  • 5 min read

    Easy Totals and Subtotals in Postgres with Rollup and Cube

    Elizabeth Christensen

    Postgres is being used more and more for analytical workloads. There’s a few hidden gems I recently ran across that are really handy for doing SQL for data analysis, ROLLUP and CUBE. Rollup and cube don’t get a lot of attention, but follow along with me in this post to see how they can save you a few steps and enhance your date binning

    Read More
  • 8 min read

    8 Steps in Writing Analytical SQL Queries

    Christopher Winslett

    It is never immediately obvious how to go from a simple SQL query to a complex one -- especially if it involves intricate calculations. One of the “dangers” of SQL is that you can create an executable query but return the wrong data. For example, it is easy to inflate the value of a calculated field by joining to multiple rows.

    Use Crunchy Playground to follow allow with this blog post using a Postgres terminal:

    Postgres Playground w/ Sample Data

    Read More
  • 8 min read

    4 Ways to Create Date Bins in Postgres: interval, date_trunc, extract, and to_char

    Christopher Winslett

    You followed all the best practices, your sales dates are stored in perfect timestamp format …. but now you need to get reports by day, week, quarters, and months. You need to bin, bucket, and roll up sales data in easy to view reports. Do you need a BI tool? Not yet actually. Your Postgres database has hundreds of functions that let you query data analytics by date. By using some good old fashioned SQL - you have powerful analysis and business intelligence with date details on any data set.

    In this post, I’ll walk through some of the key functions querying data by date.

    Read More
  • 4 min read

    pg_parquet: An Extension to Connect Postgres and Parquet

    Craig Kerstiens

    Today, we’re excited to release pg_parquet - an open source Postgres extension for working with Parquet files. The extension reads and writes parquet files to local disk or to S3 natively from Postgres. With pg_parquet you're able to:

    • Export tables or queries from Postgres to Parquet files
    • Ingest data from Parquet files to Postgres
    • Inspect the schema and metadata of existing Parquet files

    Code is available at: https://github.com/CrunchyData/pg_parquet/.

    Read on for more background on why we built pg_parquet or jump below to get a walkthrough of working with it.

    Read More
  • 14 min read

    Vehicle Routing with PostGIS and Overture Data

    Paul Ramsey

    The Overture Maps collection of data is enormous, encompassing over 300 million transportation segments, 2.3 billion building footprints, 53 million points of interest, and a rich collection of cartographic features as well. It is a consistent global data set, but it is intimidatingly large -- what can a person do with such a thing?

    Building cartographic products is the obvious thing, but what about the less obvious. With an analytical engine like PostgreSQL and Crunchy Bridge for Analytics, what is possible? Well turns out, a lot of things.

    Read More
  • 8 min read

    Window Functions for Data Analysis with Postgres

    Elizabeth Christensen

    SQL makes sense when it's working on a single row, or even when it's aggregating across multiple rows. But what happens when you want to compare between rows of something you've already calculated? Or make groups of data and query those? Enter window functions.

    Window functions tend to confuse people - but they’re a pretty awesome tool in SQL for data analytics. The best part is that you don’t need charts, fancy BI tools or AI to get some actionable and useful data for your stakeholders. Window functions let you quickly:

    • Calculate running totals
    • Provide summary statistics for groups/partitions of data
    • Create rankings
    • Perform lag/lead analysis, ie comparing two separate sets of data with each other
    • Compute moving/rolling averages
    Read More