Exploring Azure Functions for Scheduler


During my first job after finishing my undergraduate degree at NUS, I worked in a local startup which was then the largest bus ticketing portal in Southeast Asia. In 2014, I worked with a senior to successfully migrate the whole system from on-premise servers to Microsoft Azure Virtual Machines, the IaaS option. Maintaining the virtual machines was a painful experience because we needed to set up load balancing with Traffic Manager, database mirroring, database failover, availability sets, and so on.

In 2015, when I first worked at Singapore Changi Airport, with the support of the team we made use of PaaS technologies such as Azure Cloud Services, Azure Web Apps, and Azure SQL, and successfully expanded our online business to 7 countries in a short time. With the PaaS options in Microsoft Azure, we could finally have a more enjoyable working life.

Azure Functions

Now, in 2017, I decided to explore Azure Functions.

Azure Functions allows developers to focus on the code for just the problem they want to solve, without worrying about the infrastructure, as we must with Azure Virtual Machines, or even the entire application, as with Azure Cloud Services.

There are two important benefits that I like in this new option. Firstly, our development can be more productive. Secondly, Azure Functions has two pricing models: Consumption Plan and App Service Plan, as shown in the screenshot below. The Consumption Plan lets us pay per execution, and the first 1,000,000 executions are free!

Two hosting plans in Azure Functions: Consumption Plan vs. App Service Plan

After setting up the Function App, we can choose "Quick Start" for a simpler user interface to get started with Azure Functions.

Under the "Quick Start" section, there are three triggers available for us to choose from: Timer, Data Processing, and Webhook + API. Today, I'll only talk about Timer and how we can achieve scheduler functionality on Microsoft Azure.

Quick Start page in Azure Functions.

Timer Trigger

A Timer Trigger executes the function according to a schedule, defined using a CRON expression. Let's say we want our function to be executed every four hours; we can write the schedule as follows.

0 0 */4 * * *

This is similar to a standard cron job, except that the CRON expression here consists of six fields: second (0-59), minute (0-59), hour (0-23), day of month (1-31), month (1-12), and day of week (0-6).
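
For example, here are a few more six-field expressions:

0 */5 * * * * (every five minutes)
0 30 9 * * 1-5 (at 9:30am on weekdays)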

Similar to the usual Azure Web App, the default time zone used in Azure Functions is UTC. Hence, if we would like to use another time zone, all we need to do is add the WEBSITE_TIME_ZONE application setting in the Function App.
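
For example, to make our schedules run on Singapore time, the setting would be the following (the same setting works for Azure Web Apps, as we will see later).

WEBSITE_TIME_ZONE            Singapore Standard Time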

Companion File: function.json

So, where do we set the schedule? The answer is in a special file called function.json.

Every function directory needs a function.json file, which contains the configuration metadata for the function. Normally, a function can only have a single trigger binding, and can have zero or more input/output bindings.

The trigger binding will be the place we set the schedule.

{
    "bindings": [
        {
            "name": "myTimer",
            "type": "timerTrigger",
            "direction": "in",
            "schedule": "0 0 */4 * * *"
        },
        ...
    ],
    ...
}

The name attribute specifies the name of the parameter used in the C# function later. It is used for the bound data in the function.

The type attribute specifies the binding type. In our case, it will be timerTrigger.

The direction attribute indicates whether the binding is for receiving data into the function ("in") or sending data from the function ("out"). For a scheduler, the direction will be "in" because later, in our C# function, we can retrieve info from the myTimer parameter.

Finally, the schedule attribute is where we put our CRON expression.
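
For illustration, here is a minimal sketch of how the myTimer parameter can then be consumed in the C# function (the logging calls are my own placeholders):

public static void Run(TimerInfo myTimer, TraceWriter log)
{
    // The bound TimerInfo tells us, among other things, whether this
    // invocation fired later than scheduled.
    if (myTimer.IsPastDue)
    {
        log.Info("Timer is running behind schedule!");
    }

    log.Info($"Function executed at: {DateTime.Now}");
}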

To know more about bindings in Azure Functions, please refer to the Azure Functions Developer Guide.

Function File: run.csx

The second file that we must have in the function directory is the function itself. For a C# function, it is a file called run.csx.

The .csx format allows developers to focus on just writing the C# function to solve the problem. Instead of wrapping everything in a namespace and class, we just need to define a Run method.

#r "Newtonsoft.Json"

using System;
using Newtonsoft.Json;
...

public static async Task Run(TimerInfo myTimer, TraceWriter log)
{
    ...
}

Assemblies in .csx File

As in any C# project, when we need to import namespaces, we just use the using directive. For example, in our case we need to process JSON, so we make use of the Newtonsoft.Json library.

using Newtonsoft.Json;

To reference external assemblies, for example in our case, Newtonsoft.Json, we just need to use the #r directive as follows.

#r "Newtonsoft.Json"

We are allowed to do this because Newtonsoft.Json and a few other assemblies are special cases that can be referenced by simple name. As of Jan 2017, these assemblies are as follows.

  • Newtonsoft.Json
  • Microsoft.WindowsAzure.Storage
  • Microsoft.ServiceBus
  • Microsoft.AspNet.WebHooks.Receivers
  • Microsoft.AspNet.WebHooks.Common
  • Microsoft.Azure.NotificationHubs

For other assemblies, we need to upload the assembly file, for example MyAssembly.dll, into a bin folder relative to the function first. Only then can we reference it as follows.

#r "MyAssembly.dll"

Async Method in .csx File

Asynchronous programming is a recommended best practice. To make the Run method above asynchronous, we use the async keyword and return a Task object. However, we should always avoid referencing the Task.Result property, because it blocks the current thread while waiting for another thread to complete, and blocking like this creates the potential for deadlocks.
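
As a quick illustration, here is a minimal sketch contrasting the two approaches (the URL and the HttpClient call are my own placeholders, not part of the actual function):

using System;
using System.Net.Http;
using System.Threading.Tasks;

public static async Task Run(TimerInfo myTimer, TraceWriter log)
{
    using (var client = new HttpClient())
    {
        // Good: await releases the thread while the request is in flight.
        string json = await client.GetStringAsync("https://example.com/feeds");

        // Bad: .Result blocks the current thread and risks deadlock.
        // string json = client.GetStringAsync("https://example.com/feeds").Result;

        log.Info($"Fetched {json.Length} characters.");
    }
}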

Inputs in .csx File and DocumentDB

This section displays the four latest Facebook posts pulled by the Azure Function.

In our case, the purpose of the Azure Function is to process the Facebook Group feeds and then store them somewhere for later use. The "somewhere" here is DocumentDB.

To get the inputs from DocumentDB, we first need a second binding specified in function.json as follows.

{
    "bindings": [
        ...
        {
            "type": "documentDB",
            "name": "inputDocument",
            "databaseName": "feeds-database",
            "collectionName": "facebook-group-feeds",
            "id": "41f7adb1-cadf-491e-9973-28cc3fca57df",
            "connection": "dotnetsg_DOCUMENTDB",
            "direction": "in"
        }
    ],
    ...
}

In the DocumentDB input binding above, the name attribute, as in the previous example, specifies the name of the parameter in the C# function.

The databaseName and collectionName attributes correspond to the names of the database and collection in our DocumentDB, respectively. The id attribute is the Document Id of the document that we want to retrieve. In our case, we store all the Facebook feeds in one document, so we specify the Document Id in the binding directly.

The connection attribute is the name of the Function App application setting storing the connection string of the DocumentDB account endpoint. Yes, Azure Functions has Application Settings too. =)

Finally, the direction attribute must be “in”.

We can now enhance our Run method to include input from DocumentDB as follows. What it does is basically read the existing feeds from the document and then update it with new feeds found in the Singapore .NET Facebook Group.

#r "Newtonsoft.Json"

using System;
using Newtonsoft.Json;
...

private const string SG_DOT_NET_COMMUNITY_FB_GROUP_ID = "1504549153159226";

public static async Task Run(TimerInfo myTimer, dynamic inputDocument, TraceWriter log)
{
    string sgDotNetCommunityFacebookGroupFeedsJson = 
        await GetFacebookGroupFeedsAsJsonAsync(SG_DOT_NET_COMMUNITY_FB_GROUP_ID);
    
    ...

    // Use dynamic so that properties like .Feeds can be accessed below.
    dynamic existingFeeds = JsonConvert.DeserializeObject(inputDocument.ToString());

    // Processing the Facebook Group feeds here...
    // Updating existingFeeds here...

    inputDocument.data = existingFeeds.Feeds;
}

Besides getting input from DocumentDB, we can also have a DocumentDB output binding, as follows, to write a new document to the DocumentDB database.

{
    "bindings": [
        ...
        {
            "type": "documentDB",
            "name": "outputDocument",
            "databaseName": "feeds-database",
            "collectionName": "facebook-group-feeds",
            "id": "41f7adb1-cadf-491e-9973-28cc3fca57df",
            "connection": "dotnetsg_DOCUMENTDB",
            "createIfNotExists": true,
            "direction": "out"
        }
    ],
    ...
}

We don’t really use this in our dotnet.sg case. However, as we can see, there are only two major differences between DocumentDB input and output bindings.

Firstly, there is a new createIfNotExists attribute, which specifies whether to create the DocumentDB database and collection if they don't exist.

Secondly, we will have to set the direction attribute to be “out”.

Then, in our function code, we just need a parameter declared as "out object outputDocument" instead of "dynamic inputDocument".
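
A minimal sketch of such a function might look like the following (note that out parameters cannot be used in async methods, so this version is synchronous; the payload is purely hypothetical):

public static void Run(TimerInfo myTimer, out object outputDocument, TraceWriter log)
{
    // Whatever we assign here is serialized as JSON and written as a new
    // document to the bound DocumentDB collection when the function completes.
    outputDocument = new
    {
        id = Guid.NewGuid().ToString(),
        data = "hypothetical feed payload"
    };

    log.Info("New document queued for DocumentDB.");
}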

You can read more in the Azure Functions DocumentDB bindings documentation to understand how they work together.

Application Settings in Azure Functions

Yes, our familiar features, such as Application Settings, Continuous Integration, and Kudu, are in Azure Functions as well. All of them can be found under the "Function App Settings" section.

Screen Shot 2017-02-18 at 4.40.24 PM.png
Azure Function App Settings

As we have been doing in Azure Web Apps, we can also set the time zone and store app secrets in the Function App Settings.

Deployment of Azure Functions with Github

We can link the Azure Function with a variety of deployment options, such as GitHub, to enable continuous deployment too.

One thing I'd like to highlight here: if you also started by setting up your new Azure Function via the Azure Portal, then when you later set up continuous deployment for the function, please make sure that you first create a folder having the same name as your Azure Function. All the files related to the function need to be put in that folder.

For example, in the dotnet.sg case, we have an Azure Function called "TimerTriggerCSharp1", so we have the following folder structure.

Folder structure of TimerTriggerCSharp1.

When I first linked GitHub with the Azure Function, I made a mistake: I didn't create a folder with the name "TimerTriggerCSharp1", the name of my Azure Function. So, when I deployed the code via GitHub, the code in the Azure Function on the Azure Portal was not updated at all.

In fact, once Continuous Deployment is set up, we are no longer able to edit the code directly on the Azure Portal. Hence, setting up the correct folder structure is important.

Read-only once we set up Continuous Deployment in Azure Functions.

If you would like to add in more functions, simply create new folders at the same level.

Conclusion

Azure Functions and the whole concept of Serverless Architecture are still very new to me. However, what I like is that Azure Functions allows us to focus on the code that solves a problem, without worrying about the whole application and infrastructure.

In addition, we can solve different problems using the programming language that best suits each problem.

Finally, Azure Functions is cost-saving because we can choose to pay only for the time our code is executed.

You can check out my code for TimerTriggerCSharp1 above at our Github repository: https://github.com/sg-dotnet/FacebookGroupFeedsProcessor.

Never Share Your Secrets (Secret Manager and Azure Application Settings)


It’s important to keep app secrets out of our codes. Most of the app secrets are however still found in .config files. This way of handling app secrets becomes very risky when the codes are on public repository.

Thus, there are people who put dummy text in the .config files and inform their teammates to enter their respective app secrets. Things go ugly when this kind of "common understanding" among teammates gets messed up.

The moment when your app secrets are published on a GitHub public repo. (Image from “Kono Aozora ni Yakusoku o”)

Secret Manager Tool

So, when working on the dotnet.sg website, which is an ASP .NET Core project, I use the Secret Manager tool. It offers a way to store sensitive data such as app secrets on our local development machine.

To use the tool, firstly, I need to add it in project.json as follows.

{
    "userSecretsId": "aspnet-CommunityWeb-...",
    ...
    "tools": {
        ...
        "Microsoft.Extensions.SecretManager.Tools": "1.0.0-preview2-final"
    }
}

Because the Secret Manager tool makes use of project-specific configuration settings kept in the user profile, we need to specify a userSecretsId value in project.json as well.

After that, I can start storing the app secrets in the Secret Manager tool by entering the following command in the project directory.

$ dotnet user-secrets set AppSettings:MeetupWebApiKey ""

Take note that currently (Jan 2017) the values stored with the Secret Manager tool are not encrypted. So, it is for development only.
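
To double-check what has been stored for the current project, the tool can also list the secrets.

$ dotnet user-secrets list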

As shown in the example above, the name of the secret is “AppSettings:MeetupWebApiKey”. This is because in the appsettings.json, I have the following.

{
    "AppSettings": {
        "MeetupWebApiKey": ""
    },
    ...
}

Alright, now that the API key is stored with the Secret Manager tool, how is it accessed from the code?

By default, appsettings.json is already loaded in Startup.cs. However, we still need the if (env.IsDevelopment()) block below in the Startup constructor to enable User Secrets as part of our configuration.

public class Startup
{
    public Startup(IHostingEnvironment env)
    {
        var builder = new ConfigurationBuilder()
            .SetBasePath(env.ContentRootPath)
            .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
            .AddJsonFile($"appsettings.{env.EnvironmentName}.json", optional: true);
            
        if (env.IsDevelopment())
        {
            builder.AddUserSecrets();
        }

        builder.AddEnvironmentVariables();

        Configuration = builder.Build();
    }
    ...
}

Then in the Models folder, I create a new class called AppSettings which will be used later when we load the app secrets:

public class AppSettings
{
    public string MeetupWebApiKey { get; set; }

    ...
}
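
One thing worth noting: for the IOptions injection below to work, the AppSettings section also needs to be registered in ConfigureServices. The original setup is not shown here, but a minimal sketch would look something like this.

public void ConfigureServices(IServiceCollection services)
{
    // Bind the "AppSettings" section (from appsettings.json, User Secrets,
    // or environment variables) so IOptions<AppSettings> can be injected.
    services.Configure<AppSettings>(Configuration.GetSection("AppSettings"));

    services.AddMvc();
}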

So, let’s say I want to use the key in the HomeController, I just need to do the following.

public class HomeController : Controller
{
    private readonly AppSettings _appSettings;

    public HomeController(IOptions<AppSettings> appSettings)
    {
        _appSettings = appSettings.Value;
    }

    public async Task<IActionResult> Index()
    {
        string meetupWebApiKey = _appSettings.MeetupWebApiKey;
        ...
    }
    
    ...
}

Azure Application Settings

The Secret Manager tool has helped us manage app secrets in the local development environment. How about when we deploy our web app to Microsoft Azure?

For dotnet.sg, I am hosting the website on Azure App Service. What's so great about Azure App Service is a feature called Application Settings.

Application Settings option is available in Azure App Service.

For .NET applications, the settings in "App Settings" are injected into the application settings at runtime, overriding existing settings. Thus, even though I have empty strings in the appsettings.json file in the project, as long as the correct values are stored in App Settings, there is nothing to worry about.

Thus, when we deploy a web app to Azure App Service, we should never put our app secrets and connection strings in .config and .json files, or, even worse, hardcode them.

Application Settings and Timezone

Oh ya, one more cool feature in App Settings, introduced in 2015: we can easily change the server time zone for a web app hosted on Azure App Service, just by adding a new entry as follows in the App Settings.

WEBSITE_TIME_ZONE            Singapore Standard Time

The setting above changes the server time zone to Singapore local time, so DateTime.Now returns the current local time in Singapore.


Deploy ASP .NET Core Directly via Git


You can deploy ASP .NET Core web apps to Azure App Service directly using Git.

This is actually part of the Continuous Deployment workflow for apps in Azure App Service. Currently, Azure App Service integrates not only with GitHub, but also Visual Studio Team Services, BitBucket, Dropbox, OneDrive, and so on.

Available deployment source options in Azure App Service.

Although the dotnet.sg source code is on GitHub, choosing the "GitHub" option could not detect its repository. This is because the GitHub option only lists the repositories on my personal GitHub account, whereas the dotnet.sg repo is under the sg-dotnet GitHub Organization account. Hence, I had to choose "External Repository" as the deployment source instead.

Setting up External Repository (Git) as the deployment source in Azure App Service.

After that, whenever there is a new commit, doing a "Sync" will create a new deployment record, as shown in the screenshot below. We can revert to a previous deployment at any time by right-clicking the desired deployment record and selecting "Redeploy".

Deployment history in Azure App Service.

Kudu

So what if we want to customize the deployment process?

Before going into that, the first thing we need to say hi to is Kudu. What is Kudu? Kudu is the engine behind Git deployment in Azure App Service. It is also a set of troubleshooting and analysis tools for use with Azure App Service. It can, for example, capture hang dumps of the worker process for performance analysis.

On Kudu, we can also download the deployment script, deploy.cmd. We can then edit the file with any custom steps we have and put it at the root of the repository.

There is a simpler way: use a file named ".deployment" at the root of the repository. In the content of the file, we specify the command to run during deployment as follows.

[config]
command = THE COMMAND TO RUN FOR DEPLOYMENT
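
For instance, assuming our custom script is named deploy.cmd and sits at the root of the repository, the .deployment file would simply be as follows.

[config]
command = deploy.cmd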

To learn more about Kudu, please watch the following video clip from Channel 9.


Machine Learning in Microsoft Azure

Let me begin with a video showing how Machine Learning helps to improve our life.

The lift in the video is from ThyssenKrupp Elevator, an example of Predictive Maintenance. For more information, please read the article about how the system works and the challenges of implementing it on different types of lifts.

I first learnt about the term "Machine Learning" when I was taking the online Stanford AI course in 2011. The course taught us the basics of Artificial Intelligence, so I got the opportunity to learn about Game Theory, object recognition, robotic cars, path planning, machine learning, etc.

We learnt about Machine Learning, Path Planning, AI, and more in the online Stanford AI course.

Meetup in Microsoft

I was very excited to see the announcement from Azure Community Singapore saying that there would be a Big Data expert to talk about Azure Machine Learning in the community monthly meetup.

Doli telling us the story of Azure Machine Learning. (Photo Credit: Azure Community Singapore)

The speaker was Doli, a Big Data engineer working at iProperty Group in Malaysia. He gave us a good introduction to Azure Machine Learning, followed by Market Basket Analysis, Regression, and how a recommendation system works on Azure Machine Learning.

I found the talk interesting, especially for those who want to know more about Big Data and Machine Learning but are still new to them. I will try my best to share here what I learned from Doli's 2-hour presentation.

Ano… What is Machine Learning?

Could we make computers learn and behave more intelligently based on data? For example, is it possible that, from both flight and weather data, we can know which scheduled flights are going to be delayed? Machine Learning makes it possible. Machine Learning takes historical data and makes predictions about future trends.

This Sounds Similar to Data Mining

During the meetup, there was a question raised. What is the difference between Data Mining and Machine Learning?

Data Mining is normally carried out by a person to discover patterns in a massive, complicated dataset. Machine Learning, however, can be done without human guidance, making predictions based on previous patterns and data.

There is a very insightful discussion on Cross Validated that I recommend for those who want to understand more about Data Mining and Machine Learning.

Supervised vs. Unsupervised Learning

Two types of Machine Learning tasks were highlighted in Doli's talk: supervised and unsupervised learning.

Machine Learning – Supervised vs Unsupervised Learning

In supervised learning, new data is classified based on training data accompanied by labels that help the system learn by example. The web app how-old.net, which recently went viral, uses supervised learning. There is an interesting discussion on Quora about how how-old.net works. In the discussion, Microsoft Bing Senior Program Manager Eason Wang also shared his blog post about the how-old.net project that he works on.

Gmail also uses supervised learning to find out which emails are spam or need to be prioritized. The slides of Introduction to Apache Mahout use YouTube recommendations as an example of supervised learning, because the recommendations given by YouTube take into account the videos explicitly liked, added to favourites, or rated by the user.

I love watching anime so YouTube recommended me some great anime videos. =P

Unlike supervised learning, unsupervised learning tries to find structure in unlabeled data. Clustering, one of the unsupervised learning techniques, groups data into small groups based on similarity, such that data in the same group are as similar as possible and data in different groups are as different as possible. A well-known example of unsupervised learning is k-means clustering.
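
To make the idea concrete, here is a tiny k-means sketch in C# (one-dimensional data with two clusters only, purely for illustration; real implementations handle initialization and convergence much more carefully):

using System;
using System.Linq;

public static class KMeansSketch
{
    public static void Main()
    {
        double[] points = { 1.0, 1.2, 0.8, 8.0, 8.3, 7.9 };
        double[] centroids = { 1.0, 8.0 }; // naive initial guesses

        for (int iteration = 0; iteration < 10; iteration++)
        {
            // Assignment step: attach each point to its nearest centroid.
            var groups = points.GroupBy(p =>
                Math.Abs(p - centroids[0]) <= Math.Abs(p - centroids[1]) ? 0 : 1);

            // Update step: move each centroid to the mean of its group.
            foreach (var group in groups)
            {
                centroids[group.Key] = group.Average();
            }
        }

        Console.WriteLine($"Cluster centres: {centroids[0]:F2} and {centroids[1]:F2}");
    }
}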

Clearly, Machine Learning prediction is not about perfect accuracy.

Azure Machine Learning: Experiment!

With Azure Machine Learning, we are now able to perform cloud-based predictive analysis.

Azure Machine Learning is a service that developers can use to build predictive analytics models from training datasets. Those models can then be deployed for consumption as web services from C#, Python, and R. Hence, the process can be summarized as follows.

  1. Data Collection: Understanding the problem and collecting data
  2. Train: Training the model
  3. Analyze: Validating and tuning the data
  4. Deploy: Exposing the model to be consumed

Data Collection

Collecting data is part of the Experiment stage in Machine Learning. In case some of you wonder where to get large datasets, Doli shared a link to a discussion on Quora about where to find publicly accessible large datasets.

In fact, there are quite a number of sample datasets available in Azure Machine Learning Studio too. During the presentation, Doli also showed us how to use the Reader module to connect to an MS SQL Server database to get data.

Get data either from sample dataset or from reader (database, Azure Blob Storage, data feed reader, etc.).

To see the data of the dataset, we can click on the output port at the bottom of the box and then select “Visualize”.

Visualize the dataset.

After getting the data, we need to do pre-processing, i.e. cleaning up the data. For example, we need to remove rows with missing values.

In addition, we choose the relevant columns from the dataset (aka "features" in machine learning) which will help in the prediction. Choosing columns requires a few rounds of experiments before finding a good set of features for a predictive model.

Let’s clean up the data and select only what we need.

Train and Analyze

As mentioned earlier, Machine Learning learns from a dataset and applies what it learns to new data. Hence, in order to evaluate an algorithm, the collected data is split into two sets: the Training Set, for training the algorithm, and the Testing Set, for evaluating the prediction.

Doli said the more data we use to train the model, the better. However, many people hold different opinions. For example, there is an online discussion about the optimal ratio between the Training Set and the Testing Set: some say 3:2, some say 1:1, and some say 3:1. I don't know much about statistical analysis, so I will just make it 1:1, as shown in the tutorial in Machine Learning Studio.

Randomly split the dataset into two halves: a training set and a testing set.

So, what “algorithm” are we talking about here? In Machine Learning Studio, there are many learning algorithms to choose from. I won’t go into details about which algorithm to choose here. =)

Choose learning algorithm and specify the prediction target.

Finally, we just hit the “Run” button located at the command bar to train the model and make a prediction on the test dataset.

After the run is successfully completed, we can view the prediction results.

Visualize results.

Deploy

From here, we can improve the model by changing the features, the properties of the algorithm, or even the algorithm itself.

When we are satisfied with the model, we can publish it as a web service so that we can directly use it on new data in the future. Alternatively, we can download an Excel workbook from Machine Learning Studio with a macro added to compute the predicted values.

Read More and Join Our Meetup!

If you would like to find out more about Azure Machine Learning, there is a detailed step-by-step guide available on Microsoft Azure documentation about how to create an experiment in Machine Learning Studio. There is also a free e-book from Microsoft about Azure Machine Learning. Please take a look!

Oh ya, in case you would like to know more about how-old.net, which uses Machine Learning, please visit the homepage of Microsoft Project Oxford to find out more about the Face APIs, Speech APIs, Computer Vision APIs, and other cool APIs that you can use.

Please correct me if you spot any mistake in my post because I am still very, very new to Machine Learning. Please join our meetup too, if you would like to know more about Azure.

Journey to Microsoft Azure: Good and Bad Times

I told my friends about problems I encountered on Microsoft Azure. One of my friends, Riza, then asked me to share my experience of hosting web applications on Azure during the Singapore Azure Community meetup two weeks ago.

Azure Community March Meetup in Microsoft Singapore office. (Photo credit: Riza)

Problems with On-Premise Servers

Our web applications were hosted on-premise for about 9 years. Recently, we realized that our systems were running slower and slower, and clients kept receiving timeout exceptions. At the same time, we also ran out of storage space. We had to drive all the way to the data centre, which is about 15km away from our office, just to connect a 1TB external hard disk to our server.

Hence, in one of our company meetings in June, we finally decided to migrate our web applications and databases to the cloud. None of the developers besides me knew about cloud hosting, so we all agreed to use Microsoft Azure, the only cloud computing platform I was familiar with.

Self Learning Microsoft Azure on MVA

When I first heard, last year, that the top management of our company intended to migrate our web applications to the cloud, I had already started learning Azure on Microsoft Virtual Academy (MVA) in my own time and at my own pace.

MVA is an online learning platform offering the public free IT training, including some useful introductory courses on Microsoft Azure, as listed below.

  1. Establish Microsoft Azure IaaS Technical Fundamentals
  2. Windows Azure for IT Pros Jump Start
  3. Microsoft Azure IaaS Deep Dive Jump Start
  4. SQL Server in Windows Azure Virtual Machines Jump Start

As you may have noticed, the courses above are mostly related to IaaS. This is because IaaS was the most straightforward option for us, migrating existing systems and databases from on-premise to the cloud. If we had chosen PaaS, we would have needed to redo our entire code base.

You can enjoy the fun shows presented by David and David on MVA

If you are more into reading books, you can also check out some free eBooks about Microsoft Azure available on MVA. Personally, I didn't read any of the books because I found watching MVA training videos far more interesting.

I learnt after work and during weekends. I started learning Azure around March, and we did the migration from on-premise to Azure in July. So I basically had a crash course in Azure in just four months.

Looking back, I would say this learning approach is not recommended. If you are going to learn Azure, it's important to understand the key concepts by reading books and talking to people who are more experienced with Microsoft Azure and networking. Otherwise, you might encounter problems that are hard to fix at a later stage.

Migration at Midnight

Before doing a migration, we had to do some preparation work.

Firstly, we called our clients one by one, because we also hosted clients' websites on our server and needed to inform them to update the A records in their DNS. Later, we found out that they should in fact have been using CNAME records, so that a change of IP address on our side wouldn't affect them.

Secondly, we prepared a file called app_offline.htm, to be put in the root folder of our web applications hosted on our on-premise server. It would show a page telling our online users that the application was under maintenance, no matter which web page the user visited.

Website is under maintenance. Sorry about that!

Finally, we backed up all the databases running on our on-premise servers. Because our databases were big, it took about 20-30 minutes just to back up one database. Of course, this could only be done right before we migrated to the cloud.

We chose to do the migration at midnight because we had many online transactions going on in the daytime. In our company, only my senior and I were in charge of the migration. The following schedule lists the main activities during our midnight migration.

  • 2am – 3am: Uploading app_offline.htm and backing up databases
  • 3am – 4am: Restoring databases on Azure
  • 4am – 5am: Uploading web applications to Azure and updating DNS in Route 53

Complaints Received on First Day after Migration

We needed to finish the migration by 5am because that was when our clients started logging in to our web applications. So, everything was done in a rush, and we received a number of calls from our clients after 6am that day.

Some clients complained that our system had become very slow. It turned out that this was because we had not put our web application and databases in the same virtual network (VNet). Without that, every call from our web application to the databases went over the Internet instead of the internal connection, so the connection was slow and expensive (Azure charged us for outbound data transfer).

We also received calls complaining that websites were gone. That was actually caused by clients not updating their DNS records fast enough.

Another interesting problem: part of our system was rejected by a client's network because they only allowed traffic from certain IP addresses. So, we had to give them the new IP address of our Azure server before everything could work on their side again.

Downtime: The Impact and Microsoft Responses

The web applications have been running in the Azure environment for about 8 months, since July 2014. We encountered roughly 10 downtimes. Some were caused by our own misconfiguration; some were due to Azure platform errors, as reported by the Microsoft Azure team.

Our first downtime happened on 4 August 2014, from 12pm to 1:30pm. Traffic to our websites is expected to be high at noon, so the downtime caused us to lose a huge amount of online sales. The cause, as later reported by the Microsoft Azure team, was that all our deployments were in the affected cluster in the Southeast Asia data centre.

Traffic Manager Came to Rescue

That was when we started to plan hosting a backup of all our web applications in another Azure data centre, using Traffic Manager to do failover load balancing, so that when our primary server went down, the backup server would still be running fine.

Azure Traffic Manager helps to redirect traffic to deployments in another DC when current DC fails to work.

In their reply, the Microsoft Azure team also mentioned that the uptime SLA for virtual machines requires 2 or more instances, so they highly recommended implementing an availability set for our deployment. Before that, we had always thought it was sufficient to have one instance running. However, planned maintenance in Azure was in fact quite frequent, and sometimes took a long time to complete.

Database Mirroring: DB Will Always be Available

So, in addition to Traffic Manager, we also applied database mirroring to our setup. We then had three database servers instead of just one: one as principal, one as witness, and one as mirror. The steps for setting that up can be found in another post of mine.

Elements in my simple database mirroring setup.

With all of this set up, we thought downtime would not happen again. However, we soon realized that the database mirroring was not working.

When the principal went down, there was an auto failover. However, none of our web applications could connect to the mirror. Also, when the original principal came back online, it would remain a mirror until I did a manual failover. After a few experiments with Microsoft engineers, we concluded that it could be due to our web applications not being in the same virtual network as the database instances.

Availability Set: At Least One VM is Running

Up to this point, I haven't talked about configuring two virtual machines in an availability set. That is to make sure that, within the same data centre, when one of the virtual machines goes down, another is still up and running. However, because our web applications were all using an old version of the .NET Framework, the Azure Redis Cache service couldn't help.

Our web applications use session state a lot. Hence, without Redis as an external session state provider, we had no choice but to use SQL Server as the external session state provider. Otherwise, we would be limited to running the web applications on only one instance.

Soon, we found out that we couldn't even use SQL Server mode for session state, because some of the values stored in our session are not serialisable. We had no option at that moment but to rely on Traffic Manager.

In October 2014, a few days after we encountered our third downtime, Microsoft Azure announced a new distribution mode in the Azure Load Balancer called Source IP Affinity. We were so happy when we heard that, because it meant sticky sessions would be possible on Azure. Soon after, we successfully configured a second instance in the same availability set.

Source IP Affinity

High Availability

After all this had been done, there were still downtimes and restarts for one of the virtual machines. However, thanks to the load balancer and Traffic Manager, our websites stayed up and running. Regarding the random restarts of virtual machines, the Microsoft Azure team investigated the issue and identified that some of them were due to platform bugs.

There is still more work to be done to achieve high availability for our web applications on Azure. If you are interested in finding out more about high availability and disaster recovery on Azure, please read this article from Microsoft Azure.

Migrating Back to On-Premise?

When we were still on-premise, we had only one web server and one database server. When we moved to Azure, however, we had to set up seven servers. So it was a challenge to explain the cost increase to our managers.

Sometimes, our developers were also asked by managers whether moving back to on-premise was a better option. I have no answer to that. However, if we migrated back to on-premise and a downtime happened, who would be in charge of fixing the problems rapidly?

Hence, what we can do now as developers is learn as much as we can about improving the performance and stability of our web applications on Azure. In addition, we will also need to seek help from the Microsoft Azure team, when necessary, to introduce new cloud solutions for our web applications.

Claudia Madobe, the heroine of Microsoft Azure, is cute but how much do we really know about her? (Image Credit: Microsoft)

500TB Storage for Database Backup

When your database is big, sometimes just one backup will already be around 20GB in size. Hence, keeping backups on disk is not always a solution, even though Microsoft Azure provides data disks of up to 1TB.

Fortunately, Microsoft Azure offers scalable and larger storage: Microsoft Azure Storage, with a 500TB capacity limit per storage account. The good thing about it is that we only pay for the amount of storage we actually use.

Hence, Chun Siong from Microsoft Singapore suggested that my company try out this service to store our database backups. It turns out this can be done in just 3 steps.

Step 1: Create Azure Storage Account and Retrieve Access Keys

To create a new Azure Storage Account, I simply log in to the Azure Management Portal and choose the Quick Create option of Storage under the Data Services section. I am able to specify the affinity group and replication rule for the Storage Account.

Creating a Storage Account.

After the Storage Account is created, I can retrieve the access keys, which will be used later by SQL Server to access the Storage Account.

Retrieve the access keys to the Storage Account.

Finally, I just need to create a Container in the Storage Account. All the database backup files will be put inside the Container later.

Created a container in the Storage Account.

Step 2: Create SQL Server Credentials

I then execute the following T-SQL statement to create a credential so that SQL Server can later connect to the Storage Account. The identity must be the name of the Storage Account.

CREATE CREDENTIAL mycredential 
WITH IDENTITY= 'chunlindbbackup', 
SECRET = '<storage account access key>'

The Storage Account access key here can be either Primary or Secondary access key retrieved in Step 1 above.

Step 3: Backup Database

I create a scheduled job in SQL Server Agent to back up my database daily. The URL is built from the URL of the container created in Step 1. Note that BACKUP TO URL does not accept an expression, so the file name has to be built in a variable first.

-- BACKUP TO URL cannot take an expression, so build the file name first.
DECLARE @url VARCHAR(400) = 'https://chunlindbbackup.blob.core.windows.net/dbbackup/mydatabase_'
    + REPLACE(CONVERT(VARCHAR, GETDATE(), 126), ':', '_') + '.bak';

BACKUP DATABASE mydatabase 
TO URL = @url
WITH CREDENTIAL = 'mycredential', INIT, NAME = 'Backup of Database mydatabase'

So yup, now the database backups will be stored on the Storage Account directly.

Restore Database Backup from Azure Storage

To restore a database backup from the Storage Account, if the backup file is small, I can simply execute the following T-SQL statements. The URL of the database backup can be found in the Container in the Azure Management Portal.

RESTORE DATABASE mydatabase_test 
FROM URL = 'https://chunlindbbackup.blob.core.windows.net/dbbackup/mydatabase_2014-10-14T13_16_01.243.bak' 
WITH RECOVERY,
MOVE 'mydatabase_db_Data' TO 'F:\db\mydatabase_test_Data.mdf',
MOVE 'mydatabase_db_Log' TO 'F:\db\mydatabase_test_Log.ldf',
CREDENTIAL = 'mycredential'
GO

Unfortunately, the backup that I have is too big for that. So, I could only download it from the Azure Management Portal to the database server first, before restoring the database. The download is quite fast.

Download backup file from Storage Account

In case you wonder why I did not use a tool like Azure Storage Explorer: it did not work; it would also crash when the backup file was too big.

Pricing

Oh ya, just in case you would like to know the pricing of Azure Storage, you can check it out here: http://azure.microsoft.com/en-us/pricing/details/storage/.

Pricing of Azure Storage in Southeast Asia.

Monitoring Azure VM with System Center Advisor

MS System Center Advisor + Azure VM

In order to proactively avoid problems in our Microsoft Azure Virtual Machines, the system admin needs to receive alerts for unpatched, misconfigured, or unsupported configurations. System Center Advisor from Microsoft can do this: it is a free web service which monitors and analyses installations of Windows Server 2008 (and later versions).

Alerts and regular assessment of server configurations

Activate and Deploy System Center Advisor

Before we can configure System Center Advisor, we need to enable the service on the Advisor website. To do that, we just log in to the website with our Microsoft account to activate it. After that, we need to deploy the Advisor software on our server on Azure; it must be installed locally in the virtual machine.

Activated account before the deployment of Advisor software on the server

To deploy Advisor on the server, we need to install gateways and agents on our selected servers. Since we are only installing a stand-alone Advisor, to give the system admin a way to access the alerts in the Advisor web portal, gateways and agents on the selected servers are all we need.

The agent is responsible for collecting data about the server and storing it locally on the server. Every 24 hours, the agent passes the information to the gateway, which is in charge of sending it to the Advisor account.

Plan for Advisor Deployment

After the Advisor configuration is completed, within the next 24 hours we should be able to see data shown in the Advisor web portal.

Conclusion

The entire installation process is very simple.