ActiveRecord validations are a very powerful tool. They allow you to clearly and concisely declare rules for your model that must be met in order for any instance of that model to be considered “valid”. And using them gives you all sorts of nifty error handling and reporting out of the box.
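As a minimal sketch, assuming a hypothetical Account model backed by an accounts table with an email column, a uniqueness validation might be declared like this:

```ruby
# Hypothetical model for illustration; assumes a Rails application with an
# `accounts` table that has an `email` column.
class Account < ActiveRecord::Base
  # Reject the record if another account already uses this email address.
  validates :email, uniqueness: true
end
```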
This validation does exactly what it says. It won’t consider an instance of Account valid unless the email address associated with it is unique across all accounts.
Or will it?
This particular validation will perform a SELECT to see if there are any other accounts with the same email address as the one being validated. If there are, then the validation will trip and an error will be recorded. If not, you will be allowed to save the record to the database.
But here is the problem. Somewhat obviously, ActiveRecord validations are run within the application. There is a gap in time between when ActiveRecord checks to see if an account exists with the same email address and when it saves that account to the database. This gap is small, but it most certainly exists.
Let’s say we have two processes running our application, serving requests concurrently. At roughly the same time, both processes get a request to create an account with the email address “email@example.com”. Consider the following series of events:
Process 1: Receive request
Process 2: Receive request
Process 1: Verify no account exists with the email “email@example.com”
Process 2: Verify no account exists with the email “email@example.com”
Process 1: Save the record to the database
Process 2: Save the record to the database => whoops!
At the end of this series of events, we will have two accounts in the database with an email of “email@example.com”.
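The series of events above can be replayed in plain Ruby. This is only a rough sketch: NaiveAccountsTable and simulate_race are hypothetical names standing in for a table with no unique index and for the two application processes, and the interleaving is forced by hand rather than produced by real concurrency.

```ruby
# In-memory stand-in for an accounts table with NO unique index.
# (Hypothetical class, used only to illustrate the check-then-save gap.)
class NaiveAccountsTable
  def initialize
    @rows = []
  end

  # Mimics the SELECT issued by the uniqueness validation.
  def email_taken?(email)
    @rows.any? { |row| row[:email] == email }
  end

  # Mimics the INSERT; nothing here rejects duplicates.
  def insert(email)
    @rows << { email: email }
  end

  def count_for(email)
    @rows.count { |row| row[:email] == email }
  end
end

# Replays the interleaving: both "processes" validate before either saves.
def simulate_race(table, email)
  process_1_valid = !table.email_taken?(email)
  process_2_valid = !table.email_taken?(email)

  table.insert(email) if process_1_valid
  table.insert(email) if process_2_valid

  table.count_for(email)
end

puts simulate_race(NaiveAccountsTable.new, "email@example.com") # prints 2
```

Both checks pass because neither insert has happened yet, so both inserts go through.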
Relational database constraints exist for this exact reason. Unlike the application processes, the database knows exactly what it contains when it processes each request. It is the only one that can reliably determine if it is about to create a duplicate record.
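To illustrate why the check has to live where the data lives, here is a small plain-Ruby sketch. ConstrainedAccountsTable, DuplicateEmailError, and save_concurrently are hypothetical names; the Set plays the role of a unique index, and the Mutex stands in for the database performing the check and the write as one atomic step. A real database does this very differently under the hood, so treat it as an analogy only.

```ruby
require "set"

# Raised when an insert would violate the simulated unique index.
class DuplicateEmailError < StandardError; end

# In-memory stand-in for a table WITH a unique index on email.
class ConstrainedAccountsTable
  def initialize
    @rows = []
    @emails = Set.new # plays the role of the unique index
    @lock = Mutex.new
  end

  # The uniqueness check and the write happen together, under one lock.
  def insert(email)
    @lock.synchronize do
      # Set#add? returns nil when the value is already present.
      raise DuplicateEmailError, "duplicate email: #{email}" unless @emails.add?(email)
      @rows << { email: email }
    end
  end

  def count
    @lock.synchronize { @rows.size }
  end
end

# Several writers try to save the same email at roughly the same time.
def save_concurrently(table, email, writers: 2)
  threads = Array.new(writers) do
    Thread.new do
      begin
        table.insert(email)
        :saved
      rescue DuplicateEmailError
        :rejected
      end
    end
  end
  threads.map(&:value)
end

table = ConstrainedAccountsTable.new
results = save_concurrently(table, "email@example.com")
puts results.sort.inspect # prints [:rejected, :saved]
puts table.count          # prints 1
```

Whichever writer gets there first wins, and the other is turned away no matter how the two requests interleave.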
Some Rails developers feel that database constraints, including null constraints, foreign keys, and unique indexes, are not necessary because of the validations performed by ActiveRecord. As you can see from the example above, this is certainly not the case. It is almost a certainty that non-null columns without a non-null constraint will eventually contain null data, foreign key columns without a foreign key constraint will eventually contain ids of records that do not exist, and unique columns without a unique index will eventually contain non-unique data. I’ve seen many projects that relied solely on ActiveRecord validations to ensure the integrity of their data. Every single one of those projects had junk data in the database.
A database with inconsistent or bad data can be incredibly difficult to work with. Relational databases have many tried and true tools that can be used to ensure that the integrity of your data remains intact. All of them can easily be set up in a Rails migration. So, be sure to use them when creating or altering tables or columns.
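Here is a rough sketch of what such a migration might look like. The table and column names are hypothetical (an accounts table with email and user_id columns, plus a users table), and the `[7.0]` migration version is an assumption you should adjust to match your own Rails version.

```ruby
# Hypothetical migration; assumes `accounts` has `email` and `user_id`
# columns and that a `users` table exists.
class AddIntegrityConstraintsToAccounts < ActiveRecord::Migration[7.0]
  def change
    # Null constraint: the database rejects accounts without an email.
    change_column_null :accounts, :email, false

    # Unique index: the database rejects a second account with the same email.
    add_index :accounts, :email, unique: true

    # Foreign key: accounts.user_id must point at an existing users row.
    add_foreign_key :accounts, :users
  end
end
```

With the unique index in place, the process that loses the race gets an ActiveRecord::RecordNotUnique exception on save instead of quietly creating a duplicate, and the application can rescue it and report the error. The validation is still worth keeping for its friendly error messages in the common, non-racing case.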
Note: This article has been cross-posted on the UrbanBound product blog.