Moving Azure storage (Tables & Queues) to another data center

As a follow-up to my last post, Moving a SQL Azure database to another data center, I have one final step left to get all components into the same data center: moving Azure Storage (tables & queues).

This article starts from a setup where everything is already in Azure but still split across two separate data centers (North Central US and North Europe).

I haven’t found a way to sync table and queue storage between accounts (there are sync tools for blobs, but I don’t use blobs), so this migration will need an outage: data must stop being written to the old storage account while I migrate to the new one.

My plan of attack for this migration is:

  1. Implement a way to put my application (web and API) in offline mode with a friendly user message
  2. Migrate all static and historic data that doesn’t change during operations
  3. Put my application in offline mode
  4. Let my background worker process empty all queues
  5. Migrate the rest of the data
  6. Change the storage connection string
  7. Put the application back online
  8. Delete all tables and queues from the old storage account after smoke testing the app

Let’s do this and hope that my plan works! (documenting as I go…)

Step 1 – Implement a way to put my application offline

I implemented this step by adding two new configuration settings in ServiceConfiguration and a handler in global.asax that uses the settings to determine whether to redirect to a separate page showing that the application is offline. By placing the setting in ServiceConfiguration, instead of web.config, I can update the setting without re-deploying the app.

      <Setting name="Offline" value="false" />
      <Setting name="OfflineMessage" value="Am I Interesting is down for maintanence. Please try again in a few minutes!" />

And some code in global.asax to handle the setting:

        protected void Application_BeginRequest(object sender, EventArgs e)
        {
            ...
            // Handle offline mode

            if (Request.Url.AbsolutePath.ToLower().Contains("/offline"))
            {
                return;
            }

            if (!_configurationHelper.Offline)
            {
                return;
            }

            Response.Redirect("/Offline/Index");
        }

I added a simple offline page and then deployed the new version of the app (still with Offline=false).
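For reference, here is a minimal sketch of what a configuration helper like my _configurationHelper could look like (the class shape is an assumption). The key point is that it reads the values through RoleEnvironment from Microsoft.WindowsAzure.ServiceRuntime, which is what makes the setting updatable without a re-deploy:

        public class ConfigurationHelper
        {
            // Read from ServiceConfiguration on every access so that a value
            // changed in the portal is picked up when the instances recycle
            public bool Offline
            {
                get { return bool.Parse(RoleEnvironment.GetConfigurationSettingValue("Offline")); }
            }

            public string OfflineMessage
            {
                get { return RoleEnvironment.GetConfigurationSettingValue("OfflineMessage"); }
            }
        }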

Step 2 – Migrate all static and historic data that doesn’t change during operations

For the data migration I used Cloud Storage Studio from Cerebrata, now part of Red Gate. First I created the same tables in the new storage account, the one I had created in North Central US. I then downloaded all static data, as well as all “old” data (modified earlier than yesterday), to one XML file per table on my machine.

Uploading the XML files was just as easy using the “Upload Table Data” function in Cloud Storage Studio. The upload was quite a bit more time consuming (about 3 minutes per 1,000 rows) since the entities were uploaded individually, but that was still OK since my application remained online throughout the operation.
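If you would rather script this step than use a GUI tool, a rough equivalent with the StorageClient library from the 1.x SDK could look like the sketch below. MyEntity (a TableServiceEntity subclass), its UpdatedDateUtc property, and the connection string variables are stand-ins for my own schema, not something the SDK provides:

        // Sketch: copy "old" rows from the source table to the target account
        var source = CloudStorageAccount.Parse(sourceConnectionString).CreateCloudTableClient();
        var target = CloudStorageAccount.Parse(targetConnectionString).CreateCloudTableClient();
        target.CreateTableIfNotExist("MyTable");

        var sourceContext = source.GetDataServiceContext();
        var targetContext = target.GetDataServiceContext();

        var cutoff = new DateTime(2012, 2, 17, 0, 0, 0, DateTimeKind.Utc);
        var oldRows = sourceContext.CreateQuery<MyEntity>("MyTable")
                                   .Where(e => e.UpdatedDateUtc <= cutoff)
                                   .AsTableServiceQuery(); // follows continuation tokens

        foreach (var row in oldRows)
        {
            // One request per entity - the same reason the GUI upload is slow
            targetContext.AddObject("MyTable", row);
            targetContext.SaveChangesWithRetries();
        }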

Now the remaining steps need to go quickly to minimize downtime, even though users will at least get a message stating that we’re performing maintenance and that they should try again in a few minutes.

Step 3 – Put the application offline

I changed my new Offline setting to True in Azure Management Portal and clicked OK to recycle the instances and put the application offline.

Here I encountered an unexpected behavior! I expected the setting to be applied to one instance at a time (I’m running two instances of my hosted service) with no downtime, but for about one minute I was unable to reach the application while the setting was applied. I never actually received an error, but the response time from the app was very long. After that minute I started seeing the offline page served from the instance that had been updated, while the other instance was still recycling (nice behavior). Still, not too bad: just a slightly unresponsive app for one minute…

Step 4 – Let my background worker process empty all queues

After putting the application offline and the instances recycled to apply the setting, I checked the queue lengths and made sure they were empty before continuing.
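A quick way to check this in code, rather than eyeballing a tool, would be something like this sketch:

        // Sketch: print the approximate length of every queue in the account
        var queueClient = CloudStorageAccount.Parse(connectionString).CreateCloudQueueClient();

        foreach (CloudQueue queue in queueClient.ListQueues())
        {
            // The count is approximate, but zero across the board is what we want
            int count = queue.RetrieveApproximateMessageCount();
            Console.WriteLine("{0}: ~{1} messages", queue.Name, count);
        }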

Step 5 – Migrate the rest of the data

I did a new download of fresh data in Cloud Storage Studio by changing the query, for example from:
(UpdatedDateUtc le datetime'2012-02-18')
to:
(UpdatedDateUtc gt datetime'2012-02-18')

For those of you who are not familiar with querying the storage services:
le = Less than or Equal to
gt = Greater than

I then appended the data by uploading it to the new tables and verified that the entity counts in the new tables matched those in the old ones.
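Verifying the counts is less convenient than it sounds, since table storage has no server-side count; you have to enumerate the entities. A sketch, again using the hypothetical MyEntity:

        // Sketch: count entities by enumerating the whole table
        // (fine for modest table sizes, expensive for big ones)
        public static int CountEntities(CloudTableClient client, string tableName)
        {
            return client.GetDataServiceContext()
                         .CreateQuery<MyEntity>(tableName)
                         .AsTableServiceQuery() // follows continuation tokens
                         .AsEnumerable()
                         .Count();
        }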

Step 6 – Change the storage connection string

I changed the connection string in ServiceConfiguration via the Azure Management portal…

Step 7 – Put the application back online

…and at the same time changed to Offline=false. The instances recycled once again and came back online, without interruption, now working against the new storage account in a completely different part of the world!

Step 8 – Delete all tables and queues from the old storage account after smoke testing the app

For smoke testing, I checked that new data was written to tables in the new storage account. I also observed that new queues were created – as expected (I have code for this on startup) – when the instances recycled. I then deleted all tables and queues from the old storage account!
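The cleanup can also be scripted; a sketch of deleting everything from the old account:

        // Sketch: wipe all tables and queues from the old storage account.
        // Only run this after smoke testing against the new account!
        var oldAccount = CloudStorageAccount.Parse(oldConnectionString);

        var tableClient = oldAccount.CreateCloudTableClient();
        foreach (string tableName in tableClient.ListTables())
        {
            tableClient.DeleteTable(tableName);
        }

        var queueClient = oldAccount.CreateCloudQueueClient();
        foreach (CloudQueue queue in queueClient.ListQueues())
        {
            queue.Delete();
        }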

Done!

I only had about 10-15 minutes of app downtime during this operation, but it was “pretty” downtime, since I had taken the time to implement offline handling in the app first.

Ahh…finally all my services live in the same data center – mission accomplished!

Using the Table Storage Upsert (Insert or Replace)

I just implemented the new Upsert functionality introduced in the Azure SDK 1.6 and wanted to document the solution as I missed one detail that kept me struggling for a while – hopefully this post can help someone else avoid it!

My intent with this change was to replace another update method that first checked whether the entity already existed and then either added the new entity or updated the existing one. Upsert cuts my storage transactions for updates in half, since it executes only one request per update instead of two. It also improves performance for the same reason.
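For contrast, a check-then-write update along those lines looks roughly like this sketch (assuming T derives from TableServiceEntity; the details are approximate, not my exact old code):

        // Sketch of the old pattern: one read plus one write per update
        public void AddOrUpdate(T obj)
        {
            // Request 1: look for an existing entity. With
            // IgnoreResourceNotFoundException = true this returns null
            // instead of throwing when nothing matches.
            var existing = _tableServiceContext.CreateQuery<T>(_tableName)
                .Where(e => e.PartitionKey == obj.PartitionKey && e.RowKey == obj.RowKey)
                .FirstOrDefault();

            if (existing == null)
            {
                _tableServiceContext.AddObject(_tableName, obj);
            }
            else
            {
                _tableServiceContext.Detach(existing); // stop tracking the queried copy
                _tableServiceContext.AttachTo(_tableName, obj, "*");
                _tableServiceContext.UpdateObject(obj);
            }

            // Request 2: persist the change
            _tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
        }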

Step 1:
Make sure that a specific version header is sent in the REST requests to the storage service. I have a static method that creates a TableServiceContext, which I updated with a single new line (the SendingRequest handler below):

        public static TableServiceContext CreateContext()
        {
            var context = new TableServiceContext(BaseAddress, Credentials)
            {
                IgnoreResourceNotFoundException = true // FirstOrDefault() will return null if no match, instead of 404 exception
            };

            // Add specific header to support Upsert
            // TODO: Remove when Azure supports this by default
            context.SendingRequest += (sender, args) => (args.Request).Headers["x-ms-version"] = "2011-08-18";

            return context;
        }

Step 2:
Use the AttachTo method combined with UpdateObject and the ReplaceOnUpdate option for the SaveChanges method to execute the upsert (I have a generic class with the table storage methods):

        public void AddOrReplace(T obj)
        {
            _tableServiceContext.AttachTo(_tableName, obj);
            _tableServiceContext.UpdateObject(obj);
            _tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
        }

Note: I already had an Update method that used the same code but with a slightly different AttachTo call, passing “*” as the third (ETag) parameter. With an ETag attached, SaveChanges sends a conditional update that requires the entity to already exist, so it fails with a ResourceNotFound error when the entity is new; leaving the ETag out is exactly what turns the request into an upsert. It took me a while to find and fix this.

_tableServiceContext.AttachTo(_tableName, obj, "*");

That’s it!