Database migrations made simple

Introduction

I make no secret of the fact that I don’t like ORMs. Part of why I don’t like them is the way they handle database migrations. To successfully create and execute database migrations, you often need to know quite a bit about the framework, and I don’t like having to learn things that can be solved in a much simpler way.
Apart from ORM migrations, there are dedicated tools out there such as Redgate’s database tools. While these are very good and useful, they’re often overkill for small to medium-sized applications. They’re also quite expensive, so perhaps not the best choice for your start-up.

KISS

Database migrations, contrary to popular belief, are not rocket science. Essentially, you want to execute some scripts whenever you release a new version of your software and possibly execute some scripts to undo those in case you want to rollback a faulty deployment. Building such a thing is not very difficult. It won’t have all the bells and whistles that a tool such as Redgate has, but the knowledge required is far less and, more importantly, instantly understandable for any new hires on a team. If all you need is upgrade and downgrade, you can use the code from this article with your modifications and tweaks. It’s based on some simple conventions without any possibilities for configuration or customization, but again, YAGNI (you aren’t gonna need it). If you end up needing it, then you can just modify the script and get on with more interesting stuff.

The code is very simple in its setup and it works like this:

  • A table MigrationScripts is created in the database, which records every script that has been executed and when (a sketch of what that table might look like follows this list)
  • On startup of the application (or whichever moment you choose), it scans a folder for .sql scripts with the naming convention YYYYMMDD-HHMM-<some_name>.sql
  • The code then does a diff to see which scripts have already been executed in the database
  • It then runs the scripts that haven’t been executed yet, in the order of the date parsed from the naming convention
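
The article doesn’t show the contents of the initial script itself. Based on the columns the rest of the code inserts (Name and ExecutionDate), a minimal sketch of what 20151204-1030-Init.sql could create looks like this:

-- Hypothetical contents of 20151204-1030-Init.sql (the original article doesn't show this script);
-- the column names are taken from the INSERT statements in the migration code below.
CREATE TABLE MigrationScripts
(
    Name          NVARCHAR(255) NOT NULL PRIMARY KEY,
    ExecutionDate DATETIME      NOT NULL
);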

Database migrations: upgrading

The example that I’ll be showing here is something I used previously on an ASP.NET MVC app, which was the only client accessing the database. If you have multiple applications accessing the same database, you’ll probably want to extract this code into a separate application that you deploy whenever a database update is needed. I’ll show the simple version, though.

NOTE: I’m using Dapper in this example, as we were using it in that project, but this could easily be done with any other micro-ORM or with ADO.NET directly.

using System;
using System.Configuration;
using System.Data.SqlClient;
using System.IO;
using System.Linq;
using System.Web.Hosting;
using Dapper;

// The original snippet doesn't show a containing class; one is added here so it compiles.
public static class DatabaseMigrations
{
    public static void Run()
    {
        string conString = ConfigurationManager.ConnectionStrings["sql_migrations"]
                                               .ConnectionString;
        using (var con = new SqlConnection(conString))
        {
            // Check if the migrations table exists, otherwise execute the first script (which creates that table)
            if (con.ExecuteScalar<int>(@"SELECT count(1) FROM sys.tables
                                         WHERE name = 'MigrationScripts'") == 0)
            {
                con.Execute(GetSql("20151204-1030-Init"));
                con.Execute(@"INSERT INTO MigrationScripts (Name, ExecutionDate)
                              VALUES (@Name, GETDATE())",
                            new { Name = "20151204-1030-Init" });
            }

            // Get all scripts that have already been executed from the database
            var executedScripts = con.Query<string>("SELECT Name FROM MigrationScripts").ToList();

            // Get all scripts from the filesystem
            Directory.GetFiles(HostingEnvironment.MapPath("/App_Data/Scripts/"))
                     // Strip out the extensions
                     .Select(Path.GetFileNameWithoutExtension)
                     // Filter out the ones that have already been executed
                     .Where(fileName => !executedScripts.Contains(fileName))
                     // Order by the date in the filename
                     .OrderBy(fileName => DateTime.ParseExact(fileName.Substring(0, 13), "yyyyMMdd-HHmm", null))
                     // Materialize so we can use List<T>.ForEach
                     .ToList()
                     .ForEach(script =>
                     {
                         // Execute each one of the scripts
                         con.Execute(GetSql(script));
                         // Record that it was executed in the MigrationScripts table
                         con.Execute(@"INSERT INTO MigrationScripts (Name, ExecutionDate)
                                       VALUES (@Name, GETDATE())",
                                     new { Name = script });
                     });
        }
    }

    static string GetSql(string fileName) =>
        File.ReadAllText(HostingEnvironment.MapPath($"/App_Data/Scripts/{fileName}.sql"));
}

That’s it, about 20 lines of code (comments and line breaks don’t count ;-)) for a fully working database migration infrastructure.
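
Wiring it in is then just a matter of calling Run() at whatever moment you choose; for the ASP.NET MVC app from the article, application startup is the natural place. A minimal sketch, assuming the DatabaseMigrations wrapper class introduced above (the article itself doesn’t show this wiring):

// Global.asax.cs -- one possible way to trigger the migrations on startup.
// The DatabaseMigrations class name is the one added above for compilation; the original
// article doesn't show this wiring, so treat it as illustrative.
public class MvcApplication : System.Web.HttpApplication
{
    protected void Application_Start()
    {
        // Run any pending migration scripts before the application starts serving requests
        DatabaseMigrations.Run();

        // ... the usual MVC registrations (routes, filters, bundles) would follow here
    }
}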

Database migrations: downgrading

In cases where you want to be able to roll back the database, you could add another convention: all rollback scripts have the same name but with _rollback appended to the filename. Then you can add a separate function which takes as an argument the name of the script you want to roll back. From there on, it’s a case of loading the correct rollback scripts, sorting them, executing them and removing the records from the MigrationScripts table. All in all, another 20 lines.
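
The article doesn’t include that code, but here’s a minimal sketch of what such a method could look like, reusing the conventions and the GetSql helper from the Run method above (the method name and the “roll back this script and everything after it” behaviour are assumptions):

// A possible Rollback method, following the _rollback naming convention described above.
// Not from the original article; the method name and the exact rollback semantics are
// illustrative assumptions.
public static void Rollback(string scriptName)
{
    string conString = ConfigurationManager.ConnectionStrings["sql_migrations"]
                                           .ConnectionString;
    using (var con = new SqlConnection(conString))
    {
        var targetDate = DateTime.ParseExact(scriptName.Substring(0, 13), "yyyyMMdd-HHmm", null);

        // Load the executed scripts from the given one onwards, newest first
        con.Query<string>("SELECT Name FROM MigrationScripts")
           .Where(name => DateTime.ParseExact(name.Substring(0, 13), "yyyyMMdd-HHmm", null) >= targetDate)
           .OrderByDescending(name => DateTime.ParseExact(name.Substring(0, 13), "yyyyMMdd-HHmm", null))
           .ToList()
           .ForEach(name =>
           {
               // Execute the matching <name>_rollback.sql script
               con.Execute(GetSql($"{name}_rollback"));
               // Remove the record so the script can be re-applied later
               con.Execute("DELETE FROM MigrationScripts WHERE Name = @Name", new { Name = name });
           });
    }
}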

Conclusion

The above code allows you to:

  • Store all your database changes in Git
  • Do code reviews on SQL scripts (by anyone, including DB Admins)
  • Pull the repo with a fresh database, run the application and get started (handy for new hires)
  • Test the migrations locally and in every test environment
  • Do data migrations (or any SQL you want to write for your migrations)
  • Be flexible: you own the code, so anything is possible (change the convention, extract it, deploy it separately, etc.)

It does have a small cost of ownership, as you may need to modify it from time to time. I’d argue, however, that this cost is smaller than having to know your ORM’s migration intricacies or learning how to use a database management tool.

Also, this code is fully certified “Works on my machine”-ware. You can use it, tweak it, ask me questions about it, but don’t ask me to create a NuGet package of it, as that would take it right back to the place I wanted to avoid with this code snippet.

Kenneth Truyers