Posts about Fun with SQL

Fun with SQL
7 min read
Steve Pousty, Dec 2, 2020
Replacing Lines of Code with 2 Little Regexs in Postgres
Greetings readers, today we're going to take a semi-break from my “doing data science in SQL” series to cover a really cool use case I just solved with regular expressions (regex) in Postgres. For those of you who have a bad taste in your mouth from earlier run-ins with regexs, this will be more use case focused and I will do my best to explain the search patterns I used. If you've never heard of regex, there are good resources to learn more about them but I will not be giving a t...
Fun with SQL
8 min read
Steve Pousty, Nov 25, 2020
Using Postgres for Statistics: Centering and Standardizing Data
In the last two blog posts on data science in Postgres, we got our data ready for regression analysis and had predictive variables that are on wildly different scales. Another example of data on different scales would be annual income versus age. The former is usually at least tens of thousands while age rarely gets to a hundred. If you do the regression with non-transformed variables, it becomes hard to compare the effect of the different variables. Statisticians account for this by convertin...
Fun with SQL
10 min read
Joe Conway, Nov 3, 2020
Election Night Prediction Modeling using PL/R in Postgres
I was sent a link to a tweet regarding election night forecasting using R, and of course the default question was ... could it be run under PL/R inside Postgres? Like almost everything at Crunchy Data, we believe all things are better with Postgres. So I decided to give it a shot, and a bit of a database spin as it were. Since I had to get this blog done quickly, it is going to be mostly code -- sorry about that! The code in this blog (please see a small but important correction at the end)...
Fun with SQL
7 min read
Steve Pousty, Oct 27, 2020
Using PostgreSQL and SQL to Randomly Sample Data
In the last post of this series we introduced trying to model fire probability in Northern California based on weather data. We showed how to use SQL to do data shaping and preparation. We ended with a data set that was ready with all the fire occurrences and weather data in a single table almost prepped for logistic regression. There is now one more step: sample the data. If you have worked with logistic regression before you know you should try to balance the number of occurrences (1) with a...
Fun with SQL
7 min read
Steve Pousty, Sep 11, 2020
Joins or Subquery in PostgreSQL: Lessons Learned
My introduction to databases and PostgreSQL was for web application development and statistical analysis. I learned just enough SQL to get the queries to return the right answers. Because of my work with PostGIS (and FOSS4G) I became friends with Paul Ramsey. We are now co-workers at Crunchy Data and he is helping me up my SQL-fu. One of the first lessons he taught me was "Try to use joins rather than subqueries." Today's post is going to work through this advice, as Paul and I work throug...
Fun with SQL
5 min read
Craig Kerstiens, Aug 14, 2020
Building a recommendation engine inside Postgres with Python and Pandas
I'm a big fan of data in general. Data can tell you a lot about what users are doing and can help you gain all sorts of insights. One such aspect is in making recommendations based on past history or others that have made similar choices. In fact, years ago I wrote a small app to see if I could recommend wines based on how other ones were rated. It was a small app that I shared among just a handful of friends, some with similar taste, some with different taste. At first it was largely an academi...
Fun with SQL
5 min read
Steve Pousty, May 18, 2020
Announcing the Crunchy Data Developer Portal
Greetings friends of Crunchy Data, it is my pleasure to announce the initial release of our application developer portal. An awesome team has been working behind the scenes to bring together this nice little website to help application developers find all their Postgres needs in one place. Our goal is to become a single-stop resource for application developers looking to work with PostgreSQL. We have released three main parts to the site that form the foundation for future growth. Let’s go ove...
Fun with SQL
8 min read
Paul Ramsey, May 14, 2019
Quick and Dirty Address Matching with LibPostal
Most businesses have databases of previous customers, and data analysts will frequently be asked to join arbitrary data to the customer tables in order to provide analysis. Unfortunately joining address data together is notoriously difficult:
• The same address can be expressed in many ways
• The parts of addresses are not always clear
• There are valid lexically very similar addresses very nearby any given address
