6 things you must back up on your web application

Are you taking it for granted that your hosting provider or developers have everything under control? In our years of inheriting and taking over Ruby on Rails and other web applications for our clients, we have seen plenty of situations where little more than shoestring backup scripts were in place (and sometimes not even that!). Are you prepared, or have you just been lucky so far?

Here are 6 things you should check you have covered so when the proverbial hits the fan, you’re not the one running for cover.

1. Databases

Let’s start with what should be an easy win: your databases. Notice the plural. Many modern web applications extend beyond a single data store. The primary store might be a relational database such as PostgreSQL or MySQL, or one of the many NoSQL databases like MongoDB. However, you may also have additional data stores providing more specialised operations, such as a search service like Elasticsearch or a key-value store like Redis. Whether each data store needs backing up, and how often, is a decision to make with your development team. Make sure each one is backed up often enough to meet your disaster recovery requirements. Is a daily backup enough, or would losing a whole day of entered data cause your organisation big problems?
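One way to make the "is daily enough?" question concrete is to write down, for each data store, how much data (measured in time) you could afford to lose, and compare that with the age of the newest backup. The following is a minimal Python sketch of that check. The store names, tolerances and timestamps are all hypothetical; in practice the timestamps would come from whatever backup tooling you use.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tolerances: the most data (measured in time) each store
# could afford to lose. Agree the real values with your development team.
MAX_DATA_LOSS = {
    "postgresql": timedelta(hours=1),
    "elasticsearch": timedelta(days=7),  # assumed rebuildable from the primary store
    "redis": timedelta(days=1),
}


def stale_backups(last_backup_at, now, max_data_loss=MAX_DATA_LOSS):
    """Return the stores whose newest backup is older than the agreed tolerance.

    A store with no recorded backup at all is always reported.
    """
    stale = []
    for store, tolerance in max_data_loss.items():
        taken_at = last_backup_at.get(store)
        if taken_at is None or now - taken_at > tolerance:
            stale.append(store)
    return sorted(stale)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    last_backup_at = {
        "postgresql": now - timedelta(hours=3),  # older than the 1 hour tolerance
        "elasticsearch": now - timedelta(days=2),
        # no entry for redis: it has never been backed up
    }
    print(stale_backups(last_backup_at, now))
```

Run on a schedule, a check along these lines turns "we think the backups are running" into something that raises an alert when a backup is late or missing.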

2. Files

I can see your eyes rolling at this heading already, but there are a few nuances worth picking up here. First, let’s differentiate between the files that make up the application itself (the source code) and files that hold configuration or are uploaded by users. We’ll come back to source code shortly. For now, let’s concentrate on the others. The specifics will depend on how your web application is written and what it does, but configuration files, and any credentials they hold, are not always kept with the source code, so it’s worth making sure they are covered to make restoration easier. More important are the files that your users upload. In a modern web application, they may be stored in a cloud storage service such as Amazon S3. While the provider probably won’t lose your files, you should still consider taking your own backups in case a user accidentally deletes a file and later wants it restored. (And how much do you want to rely on someone else backing up your data anyway?)

How you back up files and databases together can matter. You might need the two to be in sync, each reflecting the same point in time, so that if you have to restore to other servers you’re not left with orphaned files or missing data. For example, when a user uploads a file, there is the file itself and a reference to it in the database. Both need to be in place for that file to be visible to the user later. If your file and database backups are taken independently, there is a risk that one won’t match the other.
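A simple way to detect this kind of mismatch after a test restore is to compare the file references held in the database with the files that actually exist in the file backup. The Python sketch below assumes you can produce both lists; the function name and the sample keys are hypothetical.

```python
def compare_backups(db_file_keys, stored_file_keys):
    """Compare file references in a database backup with the files in a file backup.

    Returns (missing_files, orphaned_files):
      missing_files  - referenced in the database but absent from the file backup
      orphaned_files - present in the file backup but not referenced in the database
    """
    referenced = set(db_file_keys)
    stored = set(stored_file_keys)
    return sorted(referenced - stored), sorted(stored - referenced)


if __name__ == "__main__":
    # Hypothetical data: keys read from an uploads table in the restored
    # database, and keys listed from the restored file store.
    db_keys = ["uploads/1/invoice.pdf", "uploads/2/logo.png", "uploads/3/report.csv"]
    file_keys = ["uploads/1/invoice.pdf", "uploads/2/logo.png", "uploads/9/old-draft.docx"]

    missing, orphaned = compare_backups(db_keys, file_keys)
    print("Missing files:", missing)
    print("Orphaned files:", orphaned)
```

Missing files are usually the more serious result, because users will see references to uploads that can no longer be opened. Orphaned files mostly waste storage, although they can also indicate that database rows were lost.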

3. Source Code

Ideally, you, or your developers, are using version control which provides a blow-by-blow history of the changes to the application. If your development team is using an external third-party service such as GitHub to host the repository, then you should also make sure you have copies as backups where you have control in case anything should happen to GitHub or your development team.

In our team, we use a self-hosted GitLab Enterprise instance, which we back up to a separate, independent backup service, and we also mirror the repositories to private repositories on GitHub every hour. Generally speaking, you can never have too many backups.

4. Dependencies

Any modern application is likely to rely on third-party code and services. In our world, we write Ruby applications, and the primary dependency ecosystem is RubyGems. It is good practice to vendor these dependencies into the source code so you can be sure they will always be available. It is more common than you might think for a specific version to be yanked or removed. If you rely on a particular forked version of code hosted on a site like GitHub, it’s even more likely to disappear over the course of a few years. Have your development team fork a version they control and reference that. It might take a little more work to maintain, but you’ll thank your past selves for the foresight a few years later.
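To find out whether an application depends on forks you don’t control, you can inspect its Gemfile.lock, where gems installed from a git repository appear under a GIT heading with a remote URL. The Python sketch below lists those remotes and flags any outside a prefix you trust. The sample lockfile, the organisation name example-org and the function names are all made up for illustration.

```python
def git_remotes(lockfile_text):
    """Return the remote URLs listed under GIT sections of a Gemfile.lock."""
    remotes = []
    section = None
    for line in lockfile_text.splitlines():
        if line and not line.startswith(" "):
            # Unindented lines such as GIT, GEM or PLATFORMS start a new section.
            section = line.strip()
        elif section == "GIT" and line.strip().startswith("remote:"):
            remotes.append(line.split("remote:", 1)[1].strip())
    return remotes


def uncontrolled_remotes(lockfile_text, trusted_prefix):
    """Return git remotes that do not start with a prefix you control."""
    return [url for url in git_remotes(lockfile_text) if not url.startswith(trusted_prefix)]


# Hypothetical lockfile contents, shortened for the example.
SAMPLE_LOCKFILE = """\
GIT
  remote: https://github.com/someone-else/forked-gem.git
  revision: 0123456789abcdef0123456789abcdef01234567
  specs:
    forked-gem (1.2.0)

GIT
  remote: https://github.com/example-org/patched-gem.git
  revision: 89abcdef0123456789abcdef0123456789abcdef
  specs:
    patched-gem (0.4.1)

GEM
  remote: https://rubygems.org/
  specs:
    rake (13.0.6)
"""

if __name__ == "__main__":
    for url in uncontrolled_remotes(SAMPLE_LOCKFILE, "https://github.com/example-org/"):
        print("Depends on a repository you do not control:", url)
```

Anything this reports is a candidate for forking into an account your organisation owns and referencing from there instead.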

5. Credentials

We rescue many applications from developers who’ve disappeared, or from situations where the client has had a disagreement and wanted to move on. I am always surprised by how little thought is put into ensuring that the client has everything they need to be in full control of their web application, especially considering how critical the application often is to their business.

It is essential to make sure you have recorded all the credentials for the servers and other services the application relies on. This includes domain name and DNS management services. It is likely to be a long laundry list. Here are some typical examples from applications that we host and maintain for our clients.

Logins and API credentials for:

  • Mail services – servers that send mail, parse emails or act as “traps” for testing, and possibly even mail hosting services.
  • Exception monitoring
  • Logging services
  • Uptime monitoring
  • PDF generation
  • Content management services
  • File storage services
  • Image manipulation services
  • DNS management
  • Domain name management

It should go without saying that these must be stored securely and access to them should be tightly controlled. They are the keys to your digital locks.

6. Documentation

Your applications are more than just the source code and data. It took a lot of time and money to create the system, and it didn’t just spring from nowhere. In its daily running, there will be procedures that your users and developers follow. As the person responsible for the application, you should ensure that you have access to all the documentation that details these procedures, and that it is stored securely and backed up. Without it, if there are issues and staff or developers have moved on, rediscovering the processes will take longer than it needs to. The middle of an incident is not the time to waste reinventing already solved problems.

We create document libraries for our clients’ applications and share them with our clients securely. These are backed up and versioned.

Backup Storage and Encryption

While not the focus of this article, you also need to consider where backups are located and how the data is transferred and stored securely. I’ll cover this in more detail in another article.

Restoration and Verification

I’ll talk more in-depth about this in a separate article too, but it’s also critical to regularly test and verify that these backups are successful. You don’t want to find out during a crisis that the backups weren’t doing what everyone thought. GitLab found out the hard way. They thought they had multiple backup systems in place, but they hadn’t checked any of them. In their words:

“So in other words, out of five backup/replication techniques deployed none are working reliably or set up in the first place. We ended up restoring a six-hour-old backup.”
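As a first, very basic layer of verification, you can check that a backup file exists, is not empty, and still matches a checksum recorded when it was written. The Python sketch below shows the idea using a temporary stand-in file; the file name and function names are hypothetical. A matching checksum only shows the file is intact. It does not prove the backup can be restored, which only a real test restore will tell you.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path, expected_sha256):
    """Return a list of problems found with a backup file (empty means none found)."""
    if not os.path.exists(path):
        return ["backup file is missing"]
    problems = []
    if os.path.getsize(path) == 0:
        problems.append("backup file is empty")
    if sha256_of(path) != expected_sha256:
        problems.append("checksum does not match the value recorded at backup time")
    return problems


if __name__ == "__main__":
    # Stand-in for a real dump file so the example runs anywhere.
    with tempfile.TemporaryDirectory() as workdir:
        backup = os.path.join(workdir, "db-backup.sql")
        with open(backup, "wb") as handle:
            handle.write(b"-- pretend this is a database dump\n")
        recorded = sha256_of(backup)  # would be stored when the backup is taken

        print(verify_backup(backup, recorded))  # intact copy
        with open(backup, "ab") as handle:
            handle.write(b"corruption")
        print(verify_backup(backup, recorded))  # altered copy
```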

30-second test

Why don’t you try a quick 30-second test? Think of a scenario that is reasonably likely to occur for your web application. Maybe one of your biggest users calls you because they deleted a file a few days ago and now realise they need it back. Alternatively, think of something a little worse. Perhaps your developers tell you that they just accidentally deleted the production database (it happens – read the GitLab article above).

If you think you’re well prepared, choose something even more extreme. The data centre your application is hosted within is the victim of a terrorist attack, and your site is not only down, but the server is destroyed.

In any of the scenarios you dream up, can you begin taking meaningful action within 30 seconds? Do you know where to go to start initiating action immediately? I don’t mean ringing up your developers or IT Support asking them if they have got the passwords, or whatever.

If you can’t, you’ve got work to do.

If you’re responsible for a Ruby on Rails web application and you’re concerned that maybe you aren’t backing up and verifying everything you should be, fill out the form below for a free chat to discuss your specific worries. We can provide a health-check for your application and infrastructure if you’d like some independent verification and assurance.

About the author

Andy Henson is the CTO at Foxsoft and is an advocate for growth through continual improvement. Embracing the motto "Obstacles Make Me Stronger," he sees challenges as stepping stones to greater opportunities. With over 20 years of experience, Andy is dedicated to creating solutions that meet client needs and likes to leave things and people better than he found them.
