I am in the process of setting up a very small web-app that I hope to eventually grow into something larger. For now it is a near-zero-budget personal project so I don’t have a lot of resources to throw at it. I do want to set things up in a way that I’ll be able to scale up if I start to see some success.
I am intending to set up a single micro instance on EC2 with a Rails app using PostgreSQL as the database. I’m new to EC2 and getting quite confused about the process of setting up a server.
I have read enough to know that running PostgreSQL on a Micro instance is not generally recommended and running the web server on the same instance even less so, but performance is not of any substantial concern at this time (I only have 3 users for now!). Up-time is more important, but not critical. What is critical is the integrity and reliability of the database.
From what I can tell, just setting up a default PostgreSQL installation on the EC2 instance will work, but the data would disappear if the instance were terminated. What I want to know is:
- How do I set up PostgreSQL to store its data somewhere that will persist?
- How do I set up continuous backups so there is always another copy of my users’ data?
I realise that neither of these questions is trivial but just being pointed in the right direction would be a huge help as this is a pretty big topic to get my head around.
Create your instance on EBS with “delete on terminate” disabled for the root volume. That’ll make it harder to destroy your data by accident. I wrote a bit about EBS vs instance store in this post a few days ago. The same post discusses options for making PostgreSQL on EC2 perform acceptably (hint: don’t use a micro instance).
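As a rough illustration of the "delete on terminate" setting, here is a minimal Python sketch that builds the block device mapping EC2 accepts at launch time (for example via `aws ec2 run-instances --block-device-mappings`). The device name `/dev/xvda` and the 8 GiB size are assumptions; the device name has to match the root device of whichever AMI you pick.

```python
import json


def root_volume_mapping(device_name="/dev/xvda", size_gib=8):
    """Build an EC2 block device mapping that keeps the root EBS volume.

    device_name and size_gib are illustrative assumptions; check the
    root device name of your AMI before using this.
    """
    return [
        {
            "DeviceName": device_name,
            "Ebs": {
                # Keep the volume (and the PostgreSQL data on it) when the
                # instance is terminated.
                "DeleteOnTermination": False,
                "VolumeSize": size_gib,
            },
        }
    ]


# Print JSON that can be passed to --block-device-mappings.
print(json.dumps(root_volume_mapping()))
```

The surviving volume can then be attached to a replacement instance, but that is a recovery path, not a backup.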
Now, most importantly, ensure you set up regular pg_dump backups or pg_basebackup + WAL archiving to somewhere off the Amazon cloud. Check out barman for that. You could archive to S3 (preferably another AWS region’s S3) instead if you’re willing to trust the AWS cloud that much or you don’t mind the odd outage.
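For the simpler of those two options, a scheduled pg_dump could look something like the sketch below. It only covers taking a timestamped dump; copying the file off the host is left out. The database name `myapp_production` and the directory `/var/backups/postgres` are made-up placeholders, and the script assumes `pg_dump` is on the PATH and can authenticate without a prompt.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import PurePosixPath


def build_pg_dump_command(dbname, backup_dir, now):
    """Return the pg_dump argument list for a timestamped custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = PurePosixPath(backup_dir) / f"{dbname}-{stamp}.dump"
    # --format=custom produces a compressed archive that pg_restore can read.
    return ["pg_dump", "--format=custom", f"--file={target}", dbname]


def run_backup(dbname, backup_dir):
    """Run pg_dump and raise CalledProcessError if it fails."""
    cmd = build_pg_dump_command(dbname, backup_dir, datetime.now(timezone.utc))
    subprocess.run(cmd, check=True)
    return cmd


# Show the command that would be run; the names here are placeholders.
print(" ".join(build_pg_dump_command(
    "myapp_production", "/var/backups/postgres", datetime.now(timezone.utc))))
```

Run from cron, `run_backup` gives periodic snapshots only. Anything written between dumps is lost in a failure, which is why pg_basebackup plus WAL archiving is the better fit for "continuous" backups.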
If possible, also set up streaming replication to a second Pg server in another availability zone or region.
Test your backups regularly. Monitor your replication (possibly using tools like repmgr to help automate it).
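One way to test a pg_dump backup is to restore it into a throwaway database and see whether pg_restore completes without errors. The sketch below assumes a custom-format dump, a scratch database name of `restore_test` (a placeholder), and that `createdb`, `pg_restore` and `dropdb` are on the PATH with suitable credentials. It does not check that the restored data is actually correct; you would still want a few sanity queries on top.

```python
import subprocess


def build_restore_test_commands(dump_path, scratch_db="restore_test"):
    """Return the commands that restore a dump into a scratch database."""
    return [
        ["createdb", scratch_db],
        # --exit-on-error stops at the first problem instead of carrying on;
        # --no-owner avoids failures when the original roles do not exist.
        ["pg_restore", "--exit-on-error", "--no-owner",
         f"--dbname={scratch_db}", dump_path],
    ]


def run_restore_test(dump_path, scratch_db="restore_test"):
    """Restore the dump, then always drop the scratch database."""
    try:
        for cmd in build_restore_test_commands(dump_path, scratch_db):
            subprocess.run(cmd, check=True)
    finally:
        subprocess.run(["dropdb", "--if-exists", scratch_db], check=False)
```

Ideally this runs on a different machine from the production database, so that the test also proves the off-site copy is usable.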
Snapshots of your host aren’t a bad idea, but shouldn’t really be necessary if you document how you configured it and you test your backups.