Paul Ramsey
"Retrieval Augmented Generation" (RAG) is a useful technique for working with large language models (LLMs) to improve accuracy when dealing with facts in a restricted domain of interest.
Asking an LLM about Shakespeare: works pretty well. The model was probably fed a lot of Shakespeare in training.
Asking it about holiday time off rules from the company employee manual: works pretty badly. The model may have ingested a few manuals in training, but not yours
Paul Ramsey
Large language models (LLMs) provide some truly unique capabilities that no other software does, but they are notoriously finicky to run, requiring large amounts of RAM and compute.
That means that mere mortals are reduced to two possible paths for experimenting with LLMs:
Christopher Winslett
Over the past 12 months, AI has taken over budgets and initiatives. Postgres is a popular store for AI embedding data because it can store, calculate, optimize, and scale using the pgvector extension. A recently introduced gem to the Ruby on Rails ecosystem, the neighbor gem, makes working with pgvector and Rails even better.
Christopher Winslett
Postgres’ pgvector extension recently added HNSW as a new index type for vector data. This levels up the database for vector-based embeddings output by AI models. A few months ago, we wrote about approximate nearest neighbor pgvector performance using the available list-based indexes
Christopher Winslett
Note: We have additional articles in this Postgres AI series.
Vector data has made its way into Postgres and I’m seeing more and more folks using it by the day. As I’ve seen use cases trickle in, I have been thinking a lot about scaling data and how to set yourself up for performance success from the beginning. The primary trade-off is performance versus accuracy. When seeking performance with vector data, we are using approximate nearest neighbor algorithms, and those algorithms are built around probability of proximity. If your use case requires 100% accuracy on nearest neighbor, performance will be sacrificed.
After choosing between performance and accuracy, the next tools in the toolbox are caching and partitioning. Caching is an obvious fit in some situations: if your product is finding “similar meals” or “similar products” or “similar support questions”, then the similarities will not change rapidly.
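As a rough illustration of the caching idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `expensive_similarity_lookup` is a hypothetical stand-in for a real vector similarity query, the tiny catalog is made-up sample data, and `functools.lru_cache` is just one simple in-process way to cache results that change slowly.

```python
from functools import lru_cache

# Counts how often the "expensive" lookup actually runs (illustration only).
CALLS = {"count": 0}

# Made-up sample data standing in for precomputed similarity results.
SAMPLE_SIMILAR = {
    "lasagna": ("baked ziti", "moussaka"),
    "ramen": ("pho", "udon"),
}

def expensive_similarity_lookup(item_id):
    """Hypothetical stand-in for a nearest neighbor query against the database."""
    CALLS["count"] += 1
    return SAMPLE_SIMILAR.get(item_id, ())

@lru_cache(maxsize=1024)
def similar_items(item_id):
    """Return cached similar items; only a cache miss reaches the lookup."""
    return expensive_similarity_lookup(item_id)

first = similar_items("lasagna")   # cache miss: runs the lookup
second = similar_items("lasagna")  # cache hit: served from memory
```

Note that `lru_cache` never expires entries on its own, so a real setup would likely want some form of expiry or invalidation when the underlying data changes.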
For the most part, the keys to scaling AI data are the same as scaling any other data type: reduce the number of rows in the index and reduce the number of concurrent queries hitting the database. Once the index has done its work, CPU becomes the primary constraint: how fast can you calculate and compare distances between vectors? Scaling vector data is currently about performance mitigation as much as it is overpowering the data.
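To make the CPU cost concrete, here is a minimal sketch of an exact (brute-force) nearest neighbor search in plain Python. It is not how pgvector is implemented; the function names and the sample rows are assumptions for illustration. The point is that every candidate row costs one distance calculation across all dimensions, which is why fewer rows means less work.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_nearest(query, rows, k=1):
    """Exact nearest neighbors: compute a distance for every row, then sort.

    `rows` is a list of (id, vector) pairs. Work grows with
    (number of rows) x (number of dimensions).
    """
    scored = sorted(rows, key=lambda row: l2_distance(query, row[1]))
    return [row_id for row_id, _ in scored[:k]]

# Made-up sample data: three 2-dimensional vectors.
rows = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [5.0, 5.0])]
nearest = exact_nearest([0.9, 1.2], rows, k=2)
```

An approximate index avoids scoring every row, trading some accuracy for fewer distance calculations.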
In the next few weeks, the Postgres pgvector extension is launching HNSW indexes (see the commit history for pgvector
Craig Kerstiens
We are happy to unveil the newest release of Crunchy Postgres for Kubernetes version 5.4. This update brings an array of features set to improve your experience including:
Craig Kerstiens
There's a lot of excitement around AI, and even more discussion than excitement. The question of Postgres and AI isn't a single question; there are a ton of paths you can take under that heading...
Christopher Winslett
Note: pgvector 0.5 released HNSW indexes, which improved performance significantly. Read more about it in HNSW Indexes with Postgres and pgvector. We have additional articles in this Postgres AI series.
Christopher Winslett
Note: We have additional articles in this Postgres AI series.
In the past month at Crunchy Data, we have talked to a steady stream of customers & community folks wanting to know how to augment their data platforms for AI. Fortunately, Postgres is equipped, nearly out of the box, and ready for the task of storing and querying this data. Through the magic of OpenAI’s API we can easily send data for classification and return the values.
Alongside this post, I created a sample code-base and data packet here