Heroku PostgreSQL and its limits

I have a pet project that I’m passionate about in my free time, and as most of the web-related pet projects, I needed to host it somewhere and make it live, at least for myself.

This time I decided to go with Heroku, just like my good old times, because it’s just so much easier to go live with, scale, configure, and make sure it’s always up and running.

Since this project involves scheduling and concurrent executions, I needed it to never sleep, and that requires a paid plan on Heroku. That plan comes with some higher limits than the free one, but it still has limitations here and there. For example, I could only store a maximum of 10k records in the PostgreSQL database (via Heroku PostgreSQL addon) or double my expenses to have a higher limit. This wasn’t really convenient because the amounts of data were growing quite quickly, and the more single records to track I would have, the more amounts of related data would grow exponentially. Going up a plan or two is still an option, but I wanted to find out if there are any other opportunities.

Screenshot 2022-08-09 at 20.58.51.png

Searching for alternatives

I started exploring alternative solutions:

  • Dumping outdated data (e.g. longer than 1 week) somewhere else
  • Compressing data (removing/dumping granular data and leaving only “one record per day” for stats)
  • Using a different storage solution

One thing that came up to my mind, and that relates to the third option as well, was hosting an external instance with a database and having an API layer for writing/reading data, but it instantly felt redundant and not very productive because of the additional HTTP interaction, time and resources I would need to build that communication.

Then I thought: every Rails or other app can read the DATABASE_URL env variable and instantiate a connection that way. The database doesn’t have to live on the same machine the app is running. What if I could just create an instance of PostgreSQL anywhere else and use it?

Amazon Web Services

Turns out, Heroku uses AWS too. Under the hood, the Heroku PostgreSQL addon creates an instance with PostgreSQL via AWS for you, and automatically sets up the environment to use that database. If you have a Rails (or any other backend app) on Heroku, most probably you can confirm it by going to the “Settings” section on Heroku dashboard of your app and revealing the env variables.

So since Heroku uses AWS and has its convenient monitoring of the database on its own level, why can’t I just use any other provider I choose and maintain my database myself, but make my application connect to it?

Turns out, Heroku generously lets you do it! While being a very easy to set up and easy to use platform, it still lets you customize a lot of things.

I found Amazon RDS for PostgreSQL a nice option, plus for the moment of writing this post it has a free tier for 12 months, which should be enough for experimenting and making the project grow. Turns out, folks at Heroku are aware of this option, and they have a guide about using Amazon RDS with MySQL for your Heroku Rails application. It’s not exactly PostgreSQL, but the guide itself is very close and lets you get an overall picture.

Setting it up

Here is the general outline of what needs to be done:

  1. Create and configure a PostgreSQL instance on Amazon RDS
    1. Sign up at Amazon Web Services
    2. Create an RDS instance for PostgreSQL
    3. Configure incoming traffic
    4. Creating the database for your app
  2. Configure your Heroku app to use the new database
    1. Download and track the certificate into your project’s git
    2. Set up the DATABASE_URL environment variable

Below are the detailed instructions.

1. Creating the PostgreSQL instance on Amazon RDS

Signing up at Amazon Web Services

Obviously, you need an Amazon account in order to be able to create an instance. You can do it here, or maybe you already have an Amazon account.

Keep in mind that you will also need to add a payment method to your Amazon account if you don’t have one, but don’t worry, if you use a free tier it shouldn’t charge you anything.

Creating an RDS instance

It is really useful to check out this section which will give you an overall look at the service, and you cal also find out the details of pricing and performance of different types of instances if you consider a paid tier.

When you’re about to create an instance, make sure you have a region that corresponds geographically to your Heroku application region, so that there would be less distance for the data/traffic between your application and your database.

Screenshot 2022-08-09 at 20.11.50.png

NOTE: If, for some reason, your Heroku app exists in a region it will not operate, and you want to migrate it closer to yourself or the region it will operate, there is a guide Migrating an Application to Another Region

When you’ve selected the region and are ready to go, you can proceed and create an RDS instance. Choose RDS, then “Create database”, select PostgreSQL (you can use anything else but this guide is all around the PostgreSQL example for the sake of exact details).

For the moment of writing, it is possible to choose the Free tier, so that’s what I would propose for getting started. You can modify your instance later and scale it if you need to.

Screenshot 2022-08-09 at 20.17.02.png

Further, you can follow this official guide to get more into the details of different configuration options you can choose.

It is also recommended to set a custom Master username (I’ll use jane in the examples) and corresponding password (I’ll use mypassword) so that you would use it in your DATABASE_URL on Heroku to authenticate to the database.

And it’s also important to create the database as “Public” so that you would be able to access it from the outside of the AWS ecosystem.

When your instance is created, please don’t forget to save your password somewhere safe.

From the guide mentioned above, you can try to connect to your database from console.

Make sure you can get these variables:

YOUR_HOST - It’s in your instance page under the “Connectivity & security” section – the “Endpoint”. Looks like mysuperapp.qweasdzxc.eu-north-1.rds.amazonaws.com

username – what you used while creating your instance. In the examples we’ll jane

The command would look like this:

psql \
   --host=YOUR_HOST \
   --port=5432 \
   --username=jane \
   --password \
       --dbname=postgres

and you will be prompted with your password that was mentioned above. If you’ve got to the postgres console – congrats!

If you have a timeout or something – chances are your IP is not allowed to access your database, and you might need to configure it.

Configuring incoming traffic

What helped me was creating a custom security group that would allow all incoming IPv4 addresses:

  1. Navigate to the EC2 console (under Compute > EC2), and select Security Groups
  2. Tick the ‘default’ one, click “Actions” and “Copy to new security group” (we need to create a copy and modify it)
  3. Give it a name
  4. Make sure that under “Inbound rules” it has Source type as “Anywhere-IPv4”, and the same for “Outbound rules”. Save the group

    Screenshot 2022-08-09 at 20.33.32.png

  5. Going back to your RDS instance settings (the “Modify” button), “Connectivity” section – make sure you only pick the security group that you’ve just created.

Now you should be good to go and proceed connecting to your new database from your local console!

If that goes fine, it’s time to configure the Heroku app.

Creating a database

For convenience, create the database for your Rails app ahead using the PostgreSQL console:

create database mysuperapp owner jane;

2. Configuring your Heroku app

Download the Amazon RDS certificate into your project

As written in the official guide, download the Amazon RDS CA Certificate.

Navigate to your project and download it:

curl https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem > ./config/amazon-rds-ca-cert.pem

Then add it to your git repository. Don’t deploy yet.

Edit your config/database.yml to look something like this:

default: &default
  adapter: postgresql
  encoding: unicode
  # For details on connection pooling, see Rails configuration guide
  # https://guides.rubyonrails.org/configuring.html#database-pooling
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
# .....
production:
  <<: *default
  url: <%= ENV['DATABASE_URL'] %>
# .....

Detach your current Heroku PostgreSQL addon

WARNING: The actions above for detaching/destroying your current PostgreSQL addon can be very dangerous, and maybe not necessary if you have important production data. Make sure you back it up first!

After making sure you agree to lose the data, detach your current Heroku PostgreSQL addon.

I tried it it with both detach and destroy variations:

heroku addons:detach DATABASE --app mysuperapp

heroku addons:destroy DATABASE --app mysuperapp

But the value in place of DATABASE can actually vary. So you can also do it from the Heroku UI by going to Resources and deleting the Heroku PostgreSQL addon.

After that you can update the DATABASE_URL env variable in your Heroku app settings (Go to Settings → Reveal Config Vars)

Make sure your DATABASE_URL looks like following:

postgresql://jane:mypassword@myhost:5432/mysuperapp?sslrootcert=config/amazon-rds-ca-cert.pem&sslmode=require

Now you can commit both the certificate and your config/database.yml, push it and deploy to Heroku and you should be good to go.

NOTE: I had to play a bit with Heroku to finally load the DATABASE_URL properly. It looks like it only reloads env variables with successful deployments, so make sure you get a successful deploy so that the env variable changes would apply afterwards. You can get into the rails console of your heroku app and check the actual state of the env variable by doing this:

heroku run console -a hellomrdoctor

and then running ENV['DATABASE_URL'] – it should correspond to your new one.

Enjoying

If you’ve made it to this step – congratulations! You now have a db instance that is more “personal”, scalable and configurable, but it also might not be doing some automatic magic that Heroku used to do for you. And all the monitoring and logging for the database will happen on Amazon instead of on Heroku now. So make sure it’s well-maintained.

There are also lots of different things you can explore – creating replicas for easier reading under heavy loads, taking snapshots, also scaling – in other words, a lot of fun things to explore and experiment with.

While being such an easy to set up platform, Heroku still lets you choose your own modules and options that fit your exact case, and I think it makes Heroku an even more versatile solution for many engineers and companies. In my case it was really important to get off that 10k limit by any means and make room for exponentially growing data, but in other cases it might be much easier to purchase higher-tier plans on Heroku because it just works, and it saves time. It was also a great experience of connecting different tools and platforms and making them work, and so far, work perfectly. It’s great that tech nowadays can be so pluggable, having so many different external modules and interfaces, and still deliver its essence, or its goal, or its product really well.

Useful links