[KOSD Series] Discussion about Cosmos DB Performance

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

kosd-cosmos-db.png

During a late dinner with my friend on 12 January last month, he commented that he encountered a very serious performance problem in retrieving data from Cosmos DB (pka DocumentDB). It’s quite strange because, in our IoT project which also stores millions of data in Cosmos DB, we never had this problem.

Two weeks later, on 27 January, he happily showed me his improved version of the code which could query the data in about one to two seconds.

Yesterday, after having a discussion, we further improved the code. Hence, I’d like to write down this learning experience here.

Preparation

Due to the fact that we couldn’t demonstrate using the real project code, I thus created a sample project getting data from database and collection on my personal Azure Cosmos DB account. The database contains one collection which has 23,967 records of Student data.

The Student class and the BaseEntity class that it inherits from are as follows.

public class Student : BaseEntity
{
    public string Name { get; set; }

    public int Age { get; set; }

    public string Description { get; set; }
}
public abstract class BaseEntity
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }

    public string Type { get; set; }

    public DateTime CreatedAt { get; set; } = DateTime.Now;
}

You may wonder why I have Type defined.

Type and Cost Saving

The reason of having Type is that, before DocumentDB was rebranded as Cosmos DB in May 2017, the DocumentDB pricing is based on collections. Hence, the more collection we have in the database, the more we need to pay.

confused-about-documentdb-pricing.png

DocumentDB was billed per collection in the past. (Source: Stack Overflow)

To overcome that, we squeeze the different types of entities in the same collection. So, in the example above, let’s say we have three classes — Students, Classroom, Teacher that inherit from BaseEntity, then we will put the data of the three classes in the same collection.

Then here comes a problem: How do we know which document in the collection is Student, Classroom or Teacher? There is where the property Type will help us. So in our example above, the possible value for Type will be Student, Classroom, and Teacher.

Hence, when we add a new document through repository design pattern, we have the following method.

public async Task<T> AddAsync(T entity)
{
    ...

    entity.Type = typeof(T).Name;

    var resourceResponse = await _documentDbClient.CreateDocumentAsync(UriFactory.CreateDocumentCollectionUri(_databaseId, _collectionId), entity);

    return resourceResponse.StatusCode == HttpStatusCode.Created ? (dynamic)resourceResponse.Resource : null;
}

Original Version of Query

We used the following code to retrieve data of a class from the collection.

public async Task<IEnumerable<T>> GetAllAsync(Expression<Func<T, bool>> predicate = null)
{
    var query = _documentDbClient.CreateDocumentQuery<T>(UriFactory.CreateDocumentCollectionUri(_databaseId, _collectionId));

    var documentQuery = (predicate != null) ?
        (query.Where(predicate)).AsDocumentQuery():
        query.AsDocumentQuery();

    var results = new List<T>();
    while (documentQuery.HasMoreResults)
    {
        results.AddRange(await documentQuery.ExecuteNextAsync<T>());
    }

    return results.Where(x => x.Type == typeof(T).Name).ToList();
}

This query will run very slow because the line where it filters the class is after querying data from the collection. Hence, in the documentQuery, it may already contain data of three classes (Student, Classroom, and Teacher).

Improved Version of Query

So one obvious way is to move the line of filtering by Type above. The improved version of code now looks as such.

public async Task<IEnumerable<T>> GetAllAsync(Expression<Func<T, bool>> predicate = null)
{
    var query = _documentDbClient
        .CreateDocumentQuery<T>(UriFactory.CreateDocumentCollectionUri(_databaseId, _collectionId))
        .Where(x => x.Type == typeof(T).Name);

    var documentQuery = (predicate != null) ?
        (query.Where(predicate)).AsDocumentQuery():
        query.AsDocumentQuery();

    var results = new List<T>();
    while (documentQuery.HasMoreResults)
    {
        results.AddRange(await documentQuery.ExecuteNextAsync<T>());
    }

    return results;
}

By doing so, we managed to reduce the query time significantly because all the actual filtering will be done at Cosmos DB side. For example, there was one query I managed to reduce the query time of it from 1.38 minutes to 3.42 seconds using the 23,967 records of Student data.

Multiple Predicates

The code above however has a disadvantage. It cannot accept multiple predicates.

I thus changed it to be as follows so that it returns IQueryable.

public IQueryable<T> GetAll()
{
    return _documentDbClient
        .CreateDocumentQuery<T>(UriFactory.CreateDocumentCollectionUri(_databaseId, _collectionId))
        .Where(x => x.Type == typeof(T).Name);
}

This has another inconvenience is there whenever I call GetAll, I need to remember to load the data with HasMoreResults as shown in the code below.

var studentDocuments = _repoDocumentDb.GetAll()
    .Where(s => s.Age == 8)
    .Where(s => s.Name.Contains("Ahmad"))
    .AsDocumentQuery();

var results = new List<T>();
while (studentDocuments.HasMoreResults)
{
    results.AddRange(await studentDocuments.ExecuteNextAsync<T>());
}

Conclusion

This is just an after-dinner discussion about Cosmos DB between my friend and me. If you have any better idea of designing repository for Cosmos DB (pka DocumentDB), please let us know. =)

Advertisements

TCP Listener on Microsoft Azure for IoT Devices

cloud-service-worker-role-automation-runbook.png

After working on the beacon projects back half a year ago, I was given a new task which is building a dashboard for displaying data collected from IoT devices. The IoT devices basically are GPS tracker with a few other additional sensors such as temperature and shaking detection.

I’m new to IoT field, so I’m going to share in this article what I had learnt and challenges I faced in this project so that it would benefit to juniors who are going to do similar things.

Project Requirements

We plan to have the service to receive data from the IoT devices to be on Microsoft Azure. There will be thousands or even millions of the same devices deployed eventually, so choosing cloud platform to help us scaling up easily.

We also need to store the data in order to display it on dashboard and reports for business use cases.

Challenge 1: Azure IoT Hub and The Restriction of Device Firmware

In the documentation of the device protocol, there is a set of instructions as follows.

First when device connects to server, module sends its IMEI as login request. IMEI is sent the same way as encoding barcode. First comes short identifying number of bytes written and then goes IMEI as text (bytes).

After receiving IMEI, server should determine if it would accept data from this module. If yes server will reply to module 01 if not 00.

I am not sure who wrote the documentation but I am certain that his English is not that easy to comprehend in the first read.

Anyway, this is a good indication that Azure IoT Hub will be helpful because it provides secure and reliable C2D (Cloud-to-Device) and D2C communication with HTTP, AMQP, and MQTT support.

However, when I further read the device documentation, I realized that the device could only send TCP packets over in a protocol the device manufacturer defined. In addition, the device doesn’t allow us to update its firmware at this moment, making it to send data using protocols accepted by Azure IoT Hub is impossible.

There is a fierce discussion about this on Stack Overflow. Unfortunately, none of the respondents understood what the OP was trying to say.

So, I have to say bye-bye to Azure IoT Hub and move on to build TCP Listener myself on Azure.

Challenge 2: Hosting TCP Listener on Azure

There is a great code sample on how to build a TCP listener in C# to listen for connections from TCP network clients.

So, where could we put this code at?

Could we use Azure App Service, such as Functions or Web Apps? Unfortunately, no. This is because only 80/TCP and 443/TCP are exposed publicly and the only protocol that works is HTTP. In addition, App Service is all IIS, the web server provides the entire platform, there is no room for long running processes or threads that can sit and wait for communication on another port outside of IIS.

The only easy option we have now is to use Azure Cloud Service with Worker Role. Worker Role does not use IIS and it can run our app standalone.

creating-worker-role.png

Creating a new Cloud Service project with one Worker Role on Visual Studio 2017.

A default template of WorkerRole class will be provided.

public class WorkerRole : RoleEntryPoint
{
    private readonly CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
    private readonly ManualResetEvent runCompleteEvent = new ManualResetEvent(false);

    public override void Run()
    {
        Trace.TraceInformation("TrackerTcpListener is running");

        try
        {
            this.RunAsync(this.cancellationTokenSource.Token).Wait();
        }
        finally
        {
            this.runCompleteEvent.Set();
        }
    }

    public override bool OnStart()
    { 
        // Set the maximum number of concurrent connections
        ServicePointManager.DefaultConnectionLimit = 12;

        // For information on handling configuration changes
        // see the MSDN topic at https://go.microsoft.com/fwlink/?LinkId=166357.

        bool result = base.OnStart();

        Trace.TraceInformation("TrackerTcpListener has been started");

        return result;
    }

    public override void OnStop()
    {
        Trace.TraceInformation("TrackerTcpListener is stopping");

        this.cancellationTokenSource.Cancel();
        this.runCompleteEvent.WaitOne();

        base.OnStop();

        Trace.TraceInformation("TrackerTcpListener has stopped");
    }

    private async Task RunAsync(CancellationToken cancellationToken)
    {
        // TODO: Replace the following with your own logic.
        while (!cancellationToken.IsCancellationRequested)
        {
            Trace.TraceInformation("Working");
            await Task.Delay(1000);
        }
    }
}

It’s obvious that the first method we are going to work on is the RunAsync method with a “TODO” comment.

However, before that, we need to define an IP Endpoint for this TCP listener so that we can tell the IoT device to send the packets to the specified port on the IP address.

worker-role-endpoints.png

Configuring Endpoints of a Cloud Service.

With endpoints defined, we can then proceed to modify the code.

private async Task RunAsync(CancellationToken cancellationToken)
{
    try
    {
        TcpClient client;

        while (!cancellationToken.IsCancellationRequested)
        {
            var ipEndPoint = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["TcpListeningEndpoint1"].IPEndpoint;
            
            var listener = new System.Net.Sockets.TcpListener(ipEndPoint) { ExclusiveAddressUse = false };
            listener.Start();

            // Perform a blocking call to accept requests.
            client = listener.AcceptTcpClient();

            // Get a stream object for reading and writing
            NetworkStream stream = null;

            try
            {
                stream = client.GetStream();

                await ProcessInputNetworkStreamAsync(stream);
            }
            catch (Exception ex)
            {
                // Log the exception
            }
            finally
            {
                // Shutdown and end connection
                if (stream != null)
                {
                    stream.Close();
                }

                client.Close();

                listener.Stop();
            }
        }
    }
    catch (Exception ex)
    {
        // Log the exception
    }
}

The code for the method ProcessInputNetworkStreamAsync above is as follows.

private async Task ProcessInputNetworkStreamAsync(string imei, NetworkStream stream)
{
    Byte[] bytes = new Byte[5120];
    int i = 0;
    byte[] b = null;
    var receivedData = new List<string>();

    while ((i = stream.Read(bytes, 0, bytes.Length)) != 0)
    {
        receivedData = new List<string>();

        for (int reading = 0; reading < i; reading++)
        {
            using (MemoryStream ms = new MemoryStream())
            {
                ms.Write(bytes, reading, 1);
                b = ms.ToArray();
            }
            
            receivedData.Add(ConvertHexadecimalByteArrayToString(b));
        }

        Trace.TraceInformation("Received Data: " + string.Join(",", receivedData.ToArray()));

        // Respond from the server to device
        byte[] serverResponse = ConvertStringToHexadecimalByteArray("<some text to send back to the device>");
        stream.Write(serverResponse, 0, serverResponse.Length);
    }
}

You may wonder what I am doing above with ConvertHexadecimalByteArrayToString and ConvertStringToHexadecimalByteArray methods. They are needed because the packets used in the TCP protocol of the device is in hexadecimal. There is a very interesting discussion about how to do the conversion on Stack Overflow, so I won’t repeat it here.

Challenge 3: Multiple Devices

The code above is only handling one port. Unfortunately, the IoT device doesn’t send over the IMEI number or any other identification number of the device when the actual data pack is sent to the server. Hence, that means if there is more than one IoT device sending data to the same port, we will have no way to identify who is sending the data at the server side.

Hence, we need to make our TCP Listener to listen on multiple ports. The way I chose is to use List<Task> in the Run method as shown in the code below.

public override void Run()
{
    try
    {
        // Reading a list of ports assigned for trackers use
        ...

        var tasks = new List<Task>();
        
        foreach (var port in trackerPorts)
        {
            tasks.Add(this.RunAsync(this.cancellationTokenSource.Token, port));
        }
 
        Task.WaitAll(tasks.ToArray());
    }
    finally
    {
       this.runCompleteEvent.Set();
    }
}

Challenge 4: Worker Role Not Responding Irregularly

This turns out to be the biggest challenge in using Worker Role. After receiving data from the IoT devices for one or two days, the server was not recording any further new data even though the devices are working fine. So far, I’m still not sure about the cause even though there are people encountering similar issues as well.

Hence, I have to find a way to automatically restart the Worker Role for me. Thus, I decided to use PowerShell script to reboot the instance. There is a sample code on Microsoft Technet Gallery – Script Center which does similar thing.

I proceed to use Azure Automation which provides Runbooks to help handling the creation, deployment, monitoring, and maintenance of Azure resources. The Powershell Workflow Runbook that I use for rebooting the worker role daily is as follows.

workflow Reboot-CloudService
{
    Write-Output "Started!"
    
    $azureSubscriptionId = Get-AutomationVariable -Name "AzureSubscriptionId"
    $cloudServiceName = Get-AutomationVariable -Name "CloudServiceName"
    $workerRoleInstanceName = Get-AutomationVariable -Name "WorkerRoleInstanceName" 
    
    $myCredential = Get-AutomationPSCredential -Name "Chun Lin"
    Add-AzureAccount -Credential $myCredential
    
    Select-AzureSubscription -SubscriptionId $AzureSubscriptionId

    Write-Output "Restarting for cloud service: $cloudServiceName."

    ReSet-AzureRoleInstance -ServiceName $cloudServiceName -Slot "Production" -InstanceName $workerRoleInstanceName -Reboot

    Write-Output "Restarted successfully!"
}

In case you wonder where I defined the values for variables such as AzureSubscriptionId, CloudServiceName, and WorkerRoleInstanceName, as well as automation PowerShell credential, there are all easily found in the Azure Portal under “Share Resources” section of Azure Automation Account.

variables-and-credentials-in-automation.png

Providing credentials and variables for the Runbook.

After setting up the Runbook, we need to define schedules in Automation Account and then link it to the Runbook.

setting-schedules-for-automation.png

Setting up schedule and linking it to the Runbook.

There is another tool in the Azure Portal that I find it to be very useful to debug my PowerShell script in the Runbook. It is called the “Test Pane”. By using it, we can easily find out if the PowerShell script is correctly written to generate desired outcome.

test-pane.png

Test Pane available in Runbook.

After that, we can easily get a summary of how the job runs on Azure Portal, as shown in the following screenshot.

azure-automation.png

Job Statistics of Azure Automation.

Yup, that’s all what I had learnt in the December while everyone was enjoying the winter festivals. Please comment if you find a better alternative to handle the challenges above. Thanks in advance and happy new year to you!

References

Create a Docker Image from CentOS Minimal ISO

virtual-box-centos-docker.png

When we are dockerizing an ASP .NET Core application, there will be a file called Dockerfile. For example, the Dockerfile in my previous project, Changshi, has the following content.

FROM microsoft/aspnetcore:2.0
ARG source
WORKDIR /app
EXPOSE 80
COPY ${source:-obj/Docker/publish} .
ENTRYPOINT ["dotnet", "changshi.dll"]

The Dockerfile basically is a set of instructions for Docker to build images automatically. The FROM instruction in the first line initializes a new build stage and sets the Parent Image for subsequent instructions. In the Dockerfile above, it is using microsoft/aspnetcore, the official image for running compiled ASP .NET Core apps, as the Parent Image.

If we need to control the contents of the image, then one way that we can do is to create a Base Image. So, in this post, I’m going to share about my journey of creating a Docker image from CentOS Minimal ISO.

Step 1: Setting up Virtual Machine on VirtualBox

We can easily get the minimal ISO of CentOS on their official website.

download-centos-iso.png

Minimal ISO is available on CentOS Download Page.

After successfully downloading the minimal ISO, we need to proceed to launch the Oracle VM VirtualBox (Download here if you don’t have one).

turn-off-hyperv.png

Switching off Hyper-V.

For Windows users who have Hyper-V enabled because of Docker for Windows, please disable it first otherwise you will either not able to start a VM with 64-bit guest OS even though your host OS is 64-bit Windows 10 or simply encounter a BSOD.

bsod.png

Please switch off Hyper-V before running CentOS 64-bit OS on VirtualBox.

Funny thing is that after switching off Hyper-V, Docker for Windows will make noise saying that it needs Hyper-V to be enabled to work properly. So currently I have to keep switching on and off the Hyper-V feature option depends on which tool I’m going to use.

the-conflict-of-virtualbox-and-docker-between-hyperv.png

VirtualBox vs. Docker for Windows. Pick one.

There is one important step on running CentOS on the VM. We need to remember to configure the Network of the VM to use network adapter attached to “Bridged Adapter”. This is to connect the VM through the host to whatever is our default network device that allocates IP addresses for our physical network. Doing so will help us to retrieve the Docker image tar file via SCP later.

Then in the Network & Host Name section of the installation, we shall see the IP address allocated to the VM.

centos-7-network-and-host-name.png

The IP Address should be available when Ethernet is connected.

To verify whether it works or not, we simply need to use the following command to check if an IP address is successfully allocated to the VM or not. In the minimal installation of CentOS 7, the command ifconfig is already not in use.

# ip a

We then can get the IP Address which is allocated to the VM. Sometimes, I need to wait for about 5 minutes before it can display the IP address successfully.

getting-ip-address.png

The IP address!

Step 2: Installing Docker on VM

After we get the IP address of the VM, we then can SSH into it. On Windows, I use PuTTY, a free SSH client for Windows, to easily SSH to the VM.

ssh-into-vm.png

SSH to the VM with the IP address using PuTTY.

We proceed to install EPEL repository before we can install Docker on the VM.

Since we are going to use wget to retrieve EPEL, we first need to install wget as following.

# yum install wget

Then we can use the wget command to download EPEL repository on the VM.

# wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

The file will be downloaded to the temp folder. So, to install it we will do the following.

# cd /tmp
# sudo yum install epel-release-latest-7.noarch.rpm

After the installation is done, there should be a success message as following showing on the console.

Installed:
    epel-release.noarch 0:7-11
Complete!

Now if we head to /etc/yum.repos.d, we will see the following files.

CentOS-Base.repo        CentOS-fasttrack.repo       CentOS-Vault.repo
CentOS-CR.repo          CentOS-Media.repo           epel.repo
CentOS-Debuginfo.repo   CentOS-Sources.repo         epel-testing.repo

In the CentOS-Base.repo, we need to enable the CentOS Plus repository which is by default disabled. To do so, we simply change the value of enabled to 1 under [centosplus] section.

Then we can proceed to install docker on the VM using yum.

# yum install docker

Step 3: Start Docker

Once docker is installed, we can then start the docker service with the following command.

# service docker start

So now if we list the images and containers inside the docker, the results should be 0 image and 0 container, as shown in the screenshot below.

docker-installed-without-images-and-containers (2)

No image and no container.

Step 4: Building First Docker Image

Thanks to the people in Moby Project, a collaborative project for the container ecosystem to assemble container-based systems, we have a script to create a base CentOS Docker image using yum.

The script is now available on Moby Project Github repository.

We now need to create a folder called scripts in the root and then create a file called createimage.sh in the folder. This step can be summarized as the following commands.

# mkdir scripts
# cd scripts
# vim createimage.sh

We then need to copy-and-paste the script from Moby Project to createimage.sh.

After that, we need to make createimage.sh executable with the following command.

# chmod +x createimage.sh

To run this script now, we need to do as follows, where centos7base is the name of the image file.

# ./createimage.sh centos7base

After it is done, we will see the centos7base image added in docker. The image is very, very small with only 271MB as its size.

first-docker-image.png

First docker image!

Step 5: Add Something (.NET Core SDK) to Container

Since now we have our first Docker image, then we can proceed to create a container with the following command.

# docker run -i -t  /bin/bash

We will be brought into the container. So now we can simply add something, such as the .NET Core SDK to the container by following the .NET Core installation steps for CentOS 7.1 (64-bit) which can be summarized as the following commands to execute.

# sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc

# sudo sh -c 'echo -e "[packages-microsoft-com-prod]\nname=packages-microsoft-com-prod \nbaseurl=https://packages.microsoft.com/yumrepos/microsoft-rhel7.3-prod\nenabled=1\ngpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc" > /etc/yum.repos.d/dotnetdev.repo'

# sudo yum update
# sudo yum install libunwind libicu
# sudo yum install dotnet-sdk-2.0.0

# export PATH=$PATH:$HOME/dotnet

We then can create a new image from the changes we have done on the container using the following command where the centos_netcore is the repository name and 1.0 is its tag.

docker commit  [centos_netcore:1.0]

We will then realize the new image container will be quite big with 1.7GB as its size. Thanks to .NET Core SDK.

Step 6: Moving the New Image to PC

The next step that we are going to do is exporting the new image as a .tar file using the following command.

docker save  > /tmp/centos_netcore.tar

Now, we need to launch WinSCP to retrieve the .tar file via SCP (Secure Copy Protocol) to local host.

login-as-root-on-winscp.png

Ready to access the VM via SCP.

Step 7: Load Docker Image

So now we can shutdown the VM and enable back the Hyper-V because the subsequent steps will need Docker for Windows to work.

After restarting our local computer with Hyper-V enabled, we can launch Docker for Windows. After that, we load the image to the Docker using the following command in the directory where we keep the .tar file in local host.

docker load < centos_netcore.tar

Step 8: Running ASP .NET Core Web App on the Docker Image

Now, we can change the Dockerfile to use the new image we created.

FROM centos_netcore:1.0
ARG source
WORKDIR /app
EXPOSE 80
COPY ${source:-obj/Docker/publish} .
ENTRYPOINT ["dotnet", "changshi.dll"]

When we hit F5 to make it run in Docker, yup, we will get back the website.

No, just kidding. We will actually get an error message that says localhost doesn’t send any data.

localhost-did-not-send-any-data.png

Localhost did not send any data. Why?

So if we read the messages in Visual Studio Output Window, we will see one line of message saying that it’s unable to bind to http://localhost:5000 on the IPv6 loopback interface.

error--99-eaddrnotavail.png

Error -99 EADDRNOTAVAIL

According to Cesar Blum Silveira, Software Engineer from Microsoft ASP .NET Core Team, this problem is because “localhost will attempt to bind to both the IPv4 and IPv6 loopback interfaces. If IPv6 is not available or fails to bind for some reason, you will see that warning.

ipv6-problem-explanation.png

Explanation of Error -99 EADDRNOTAVAIL by Microsoft engineer. (Link)

Then I switch to view the output from Docker on the Output Window.

output-docker.png

Output from Docker

It turns out that the port on docker is port 80. So I tried to add the following line in Program.cs.

public static IWebHost BuildWebHost(string[] args) =>
    WebHost.CreateDefaultBuilder(args)
    .UseUrls("http://0.0.0.0:80") // Added this line
    .UseStartup()
    .Build();

Now, it works again with the beautiful web page.

launched-at-localhost

Success!

Containers, Containers Everywhere

containers-containers-everywhere.png
The whole concept of Docker images, containers, micro-services are still very new to me. Hence, if you spot any problem in my post, feel free to point out. Thanks in advance!

References

[KOSD Series] IP Addresses of Our Azure App Services that need to be Whitelisted by Our API Providers

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

kosd-azure_web_app-powershell

It is a common scenario for developers to integrate with different parties by using their APIs. Most of the time, the APIs are located in a locked-down network environment where only whitelisted IP addresses are allowed to access their APIs. We will then be asked to give the API providers the IP addresses of our servers.

If it’s our web back-end calling the APIs and we host our web applications on Microsoft Azure App Services, then how could we get the IP addresses?

As mentioned in a discussion about inbound IP address by Benjamin Perkins, the Escalation Engineer on the Azure team, there are about 4 outgoing IP addresses for an Azure Web Apps normally. To retrieve the outbound IP addresses of an Azure web app, we simply need to get it from the Properties of the web app on Azure Portal.

outbound-ip-addresses-in-azure-app-service.png

Locate the outbound IP addresses here.

We can also get the same result if we use the Azure Resource Explorer which is still in preview now.  Benjamin covered this in a video clip on his article too.

For PowerShell lovers, as pointed out by Adrian Calinescu, one of the commenters on Benjamin’s article, we can use PowerShell to find out the outbound IP addresses too. With the new Azure Cloud Shell, we can simply use the following command to retrieve directly the outbound IP addresses of an Azure web app on Azure Portal directly.

Get-AzureRmResource -ResourceGroupName  -ResourceType Microsoft.Web/sites -ResourceName  | select -expand Properties | Select-Object outboundIpAddresses
outbound-ip-addresses-in-azure-app-service-using-powershell.png

Managing Azure resources using shell directly on a browser.

For those who would like to have your own set of outbound IP addresses, please check out ASE (App Service Environment) which grants users control over inbound and outbound application network traffic.

Finally, we can also whitelist all the IP addresses of the Azure datacentres, which can be downloaded here.

azure-datacenter-ip-range-download.png

List of Microsoft Azure Datacentre IP addresses are available on Microsoft website.

References

[KOSD Series] First Attempt of Deploying ASP .NET Core to Azure Container Service

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

kosd-docker-azure_container_registry-vsts

Last month, after sharing the concepts and use cases of Domain Driven Development, Riza moved on to talk about Containers in the sharing session of Singapore .NET Developers Community.

microservices-not-equal-to-containers.png

Riza’s talking about Containers. Yes, microservices are not containers!

Learning Motivation

In the beginning of Riza’s talk, he mentioned GO-JEK, an Indonesia ride-hailing phone service. Due to their rapid growth, the traditional monolithic architecture can no longer support their business. Hence, they switched to use a modern approach which includes moving apps to containers.

Hence, after the meetup, I was very excited to find out more about micro-services and Docker containers. With the ability of .NET Core to be cross-platform, as a Azure lover, I am interested to find out more how I can deploy ASP .NET Core web app to a container in Azure. So, I decided to write this short article to share with my teammates about this that they can learn while drinking a cup of coffee.

Creating New Project with Docker Support

Since I am trying it out as personal project, I choose to start it with a new ASP .NET Core project. Then in the Visual Studio, I can easily turn it to be a Docker supporting app easily by checking the “Enable Docker Support” option.

enable-docker-support.png

Enable Docker Support

For existing web application projects, we will not have the screen above. Luckily, it is still easy to add Docker Support to an existing ASP .NET Core project on Visual Studio.

add-docker-support-to-existing-project

Enabling Docker Support in existing projects.

Then by clicking on the “F5” button to run the project, I manage to get the following screen (The background is customized by me). The message is displayed using the following line.

System.Runtime.InteropServices.RuntimeInformation.OSDescription;
launched-at-localhost.png

Yay, we managed to run the web app inside a Linux container locally.

Publishing to Microsoft Azure with Continuous Delivery

Without Continuous Delivery, we also can easily right-click the web application to publish it to the Container Registry on Azure.

publishing-to-container-registry

Creating a new Azure Container Registry which will have the Docker image published to.

Then, on Azure Portal, we will see three new resources added. Firstly, we will have the Container Registry.

Then, we will also have an app service site which is running the image downloaded from the Container Registry. Finally, we have an App Service Plan which needs to be at least B1 because free and shared SKUs are not available for apps running on Linux (The official Microsoft documentation says we should have the VM size of the App Service Plan to be S1 or larger though).

container-registry-on-azure.png

Container Registry for my new web app, Changshi.

To enable Continuous Delivery, I choose to use Github + Visual Studio Team Services (VSTS). By doing so, build and release will be automatically started whenever I check in code to Github.

build-on-vsts

Build history and details on VSTS.

Yup, this is so far what I have tried out in my first step of playing with containers. If you are interested, please check out the references listed below.

References

Load Balancing Azure Web Apps with Nginx

nginx-ubuntu-azurevm.png

This morning, my friend messaged me a Chinese article about how to do clustering with Linux + .NET Core + Nginx. As we are geek first, we are going to try it out with different approaches. While my friend was going to set up on RaspberryPi, as a developer who loves playing with Microsoft Azure, I proceed to do load balancing of Azure Web Apps in different regions with Nginx.

Setup Two Azure Web Apps

Firstly, I deployed the same ASP .NET Core 2 web app to two different Azure App Services. One of them is deployed at Australia East; another one is deployed at South India (Huuray, Microsoft opens Azure India to the world in April 2017!).

The homepage of my web app, Index.cshtml, is as follows to display the information in Request.Headers.

 

Index.png

Since WordPress cannot show the HTML code properly, I show the code as an image here.

 

In the code above, Request.Headers[“X-Forwarded-For”] is used to get the actual visitor’s IP address instead of the IP address of the Nginx load balancer. To allow this to work, we need to have the following codes added in Startup.cs.

app.UseForwardedHeaders(new ForwardedHeadersOptions
{
    ForwardedHeaders = 
        ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto
});
azure-regions.png

In this article, we will set up load balancer in Singapore for websites hosting in India and Australia.

Configure Linux Virtual Machine on Azure

Secondly, as described in the Chinese article mentioned above, the Nginx needs to be set up on a Linux server. The OS used in my case is Ubuntu 17.04.

installing-ubuntu-server-17-on-azure.png

Creating a new Ubuntu server running on Microsoft Azure virtual machine.

The Authentication Type that was chosen is the SSH Public Key option. Hence, we need to create public and private keys using OpenSSL tool. There is a tutorial from Microsoft showing steps on how to generate the keys using Git Bash and Putty.

Installing Nginx

After that, I installed Nginx by using the following command.

sudo apt-get install nginx

After installing it, in order to test whether Nginx is installed properly, I visited the public IP address of the virtual machine. However, it turns out that I couldn’t visit the server because the port 80 by default is not opened on the virtual machine.

Hence, the next step I need to do is opening port using Azure Portal by adding a new inbound security rule for the port 80 and then associate it to the subnet of the virtual network of the virtual machine.

Then when I revisited the public IP of the server, I could finally see the “Welcome to Nginx” success page.

successfully-opened-port-and-installed-nginx.png

Nginx is now successfully running on our Ubuntu server!

Mission: Load Balancing Azure Web Apps with Nginx

As the success page mentioned, further configuration is required. So, we need to edit the configuration file by first opening it up with the following command.

sudo nano /etc/nginx/sites-available/default

The first section that I added is the Cache Configuration.

# Cache configuration
proxy_temp_path /var/www/proxy_tmp;
proxy_cache_path /var/www/proxy_cache levels=1:2 keys_zone=my_cache:20m inactive=60m max_size=500m;

The proxy_temp_path is the path to the directory where the temporary files should be stored at when the response from the upstream server cannot fit into the configured buffers.

The proxy_cache_path is about in which directory the cache should be stored at. The levels=1:2 means that the cache will be stored in a single-character directory with a two-character subdirectory. The keys_zone parameter defines a my_cache cache zone which can store 20MB of keys at most but with the maximum size of the actual data to be 500MB. The inactive=60m means the maximum inactive time cache can be stored, which is 60 minutes in this case.

Next, upstream needs to be defined as follows.

# Cluster sites configuration
upstream backend {
    server dotnetcore-clustering-web01.azurewebsites.net fail_timeout=30s;
    server dotnetcore-clustering-web02.azurewebsites.net fail_timeout=30s;
}

For the default server configuration, we need to make a few modifications to it.

# Default server configuration
# 
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name localhost;
    
    ...
    
    location / {
        proxy_pass http://backend;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        try_files $uri $uri/ =404;
    }
}

Now, we just need to restart the Nginx with the following command.

sudo service nginx restart

Then when we visit the Ubuntu server again, we will realize that we sort of able to reach Azure Web Apps but not really so because it says 404!

404-on-azure.png

Oops, the Nginx routes the visitor to 404 land.

Troubleshooting 404 Error

According to another article which is written by Issac Lázaro, he said this was due to the fact that Azure App Service uses cookies to do ARR (Application Request Routing), hence we need to have the Ubuntu server to pass the header to the web apps by modifying our Nginx configuration to the following.

# Cluster sites configuration
upstream backend {
    server localhost:8001 fail_timeout=30s;
    server localhost:8002 fail_timeout=30s;
}
...

server {
    listen 8001;
    server_name web01;

    location / {
        proxy_set_header Host dotnetcore-clustering-web01.azurewebsites.net;
        proxy_pass http://dotnetcore-clustering-web01.azurewebsites.net;
    }
}

server {
    listen 8002;
    server_name web02;
    
    location / {
        proxy_set_header Host dotnetcore-clustering-web02.azurewebsites.net;
        proxy_pass http://dotnetcore-clustering-web02.azurewebsites.net;
    }
}

Then when we refresh the page, we shall see the website is loaded correctly with the content will be delivered from either web01 or web02.

success.png

Yay, we make it!

Yup, that’s all about setting up a simple Nginx to load balance multiple Azure Web Apps. You can refer to the following articles for more information about Nginx and load balancing.

References

  1. How to open ports to a virtual machine with the Azure portal
  2. Can’t start Nginx – Job for nginx.service failed
  3. Linux+.NetCore+Nginx搭建集群
  4. Understanding Nginx HTTP Proxying, Load Balancing, Buffering, and Caching
  5. Module ngx_http_upstream_module
  6. How To Set Up Nginx Load Balancing with SSL Termination

 

[KOSD Series] Code Review and VSTS

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

kosd-vsts-azure.png

Code reviews are a best practice for software development projects but it’s normally ignored in startups and SMEs because

  • the top management doesn’t understand the value of doing so;
  • the developers have no time to do code reviews and even unit testing.

So, in order to improve our code quality and management standards, we decided to introduce the idea of code reviewing by enforcing pull requests creating in our deployment procedure, even though our team is very small and we are working in a startup environment.

Firstly, we set up two websites on Azure App Service, one for UAT and another for the Production. We enabled Continuous Deployment feature for two of them by configuring Azure App Service integration with our Git repository on Visual Studio Team Services (VSTS).

Secondly, we have two branches in the Git repository of the project, i.e. master and development-deployment. Changes pushed to the branches will automatically be deployed to the Production and the UAT websites, respectively.

In order to prevent that our codes are being deployed to even the UAT site without code reviews, we created a new branch known as the development branch. The development branch allows all the relevant developers (in the example below, we call them Alvin and Bryan) to pull/push their local changes freely from/to it.

git-flow-on-vsts.png

Once any of the developers is confident with his/her changes, he/she can create a new pull request on VSTS.

creating-pull-request.png

Creating a new pull request on VSTS.

We then proceed to make use of the new capability on VSTS, which is to set policies for the branches. In the policy setting, we checked the option “Require a minimum number of reviewers” to prevent direct pushes to both master and development-deployment branches.

branch-policies.png

Enabled the code review requirement in each pull request to protect the branch.

So for every deployment to our UAT and Production websites, the checking step is in place to make sure that the deployments are all properly reviewed and approved. This is not just to protect the system but also to protect the developers by having a standardized quality checking across the development team.

This is the end of this episode of KOSD series. If you have any comment or suggestion about this article, please shout out. Hope you enjoy this cup of electronic Kopi-O Siew Dai. =)