Load Balancing Azure Web Apps with Nginx

nginx-ubuntu-azurevm.png

This morning, my friend messaged me a Chinese article about how to do clustering with Linux + .NET Core + Nginx. As we are geek first, we are going to try it out with different approaches. While my friend was going to set up on RaspberryPi, as a developer who loves playing with Microsoft Azure, I proceed to do load balancing of Azure Web Apps in different regions with Nginx.

Setup Two Azure Web Apps

Firstly, I deployed the same ASP .NET Core 2 web app to two different Azure App Services. One of them is deployed at Australia East; another one is deployed at South India (Huuray, Microsoft opens Azure India to the world in April 2017!).

The homepage of my web app, Index.cshtml, is as follows to display the information in Request.Headers.

 

Index.png

Since WordPress cannot show the HTML code properly, I show the code as an image here.

 

In the code above, Request.Headers[“X-Forwarded-For”] is used to get the actual visitor’s IP address instead of the IP address of the Nginx load balancer. To allow this to work, we need to have the following codes added in Startup.cs.

app.UseForwardedHeaders(new ForwardedHeadersOptions
{
    ForwardedHeaders = 
        ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto
});
azure-regions.png

In this article, we will set up load balancer in Singapore for websites hosting in India and Australia.

Configure Linux Virtual Machine on Azure

Secondly, as described in the Chinese article mentioned above, the Nginx needs to be set up on a Linux server. The OS used in my case is Ubuntu 17.04.

installing-ubuntu-server-17-on-azure.png

Creating a new Ubuntu server running on Microsoft Azure virtual machine.

The Authentication Type that was chosen is the SSH Public Key option. Hence, we need to create public and private keys using OpenSSL tool. There is a tutorial from Microsoft showing steps on how to generate the keys using Git Bash and Putty.

Installing Nginx

After that, I installed Nginx by using the following command.

sudo apt-get install nginx

After installing it, in order to test whether Nginx is installed properly, I visited the public IP address of the virtual machine. However, it turns out that I couldn’t visit the server because the port 80 by default is not opened on the virtual machine.

Hence, the next step I need to do is opening port using Azure Portal by adding a new inbound security rule for the port 80 and then associate it to the subnet of the virtual network of the virtual machine.

Then when I revisited the public IP of the server, I could finally see the “Welcome to Nginx” success page.

successfully-opened-port-and-installed-nginx.png

Nginx is now successfully running on our Ubuntu server!

Mission: Load Balancing Azure Web Apps with Nginx

As the success page mentioned, further configuration is required. So, we need to edit the configuration file by first opening it up with the following command.

sudo nano /etc/nginx/sites-available/default

The first section that I added is the Cache Configuration.

# Cache configuration
proxy_temp_path /var/www/proxy_tmp;
proxy_cache_path /var/www/proxy_cache levels=1:2 keys_zone=my_cache:20m inactive=60m max_size=500m;

The proxy_temp_path is the path to the directory where the temporary files should be stored at when the response from the upstream server cannot fit into the configured buffers.

The proxy_cache_path is about in which directory the cache should be stored at. The levels=1:2 means that the cache will be stored in a single-character directory with a two-character subdirectory. The keys_zone parameter defines a my_cache cache zone which can store 20MB of keys at most but with the maximum size of the actual data to be 500MB. The inactive=60m means the maximum inactive time cache can be stored, which is 60 minutes in this case.

Next, upstream needs to be defined as follows.

# Cluster sites configuration
upstream backend {
    server dotnetcore-clustering-web01.azurewebsites.net fail_timeout=30s;
    server dotnetcore-clustering-web02.azurewebsites.net fail_timeout=30s;
}

For the default server configuration, we need to make a few modifications to it.

# Default server configuration
# 
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name localhost;
    
    ...
    
    location / {
        proxy_pass http://backend;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        try_files $uri $uri/ =404;
    }
}

Now, we just need to restart the Nginx with the following command.

sudo service nginx restart

Then when we visit the Ubuntu server again, we will realize that we sort of able to reach Azure Web Apps but not really so because it says 404!

404-on-azure.png

Oops, the Nginx routes the visitor to 404 land.

Troubleshooting 404 Error

According to another article which is written by Issac Lázaro, he said this was due to the fact that Azure App Service uses cookies to do ARR (Application Request Routing), hence we need to have the Ubuntu server to pass the header to the web apps by modifying our Nginx configuration to the following.

# Cluster sites configuration
upstream backend {
    server localhost:8001 fail_timeout=30s;
    server localhost:8002 fail_timeout=30s;
}
...

server {
    listen 8001;
    server_name web01;

    location / {
        proxy_set_header Host dotnetcore-clustering-web01.azurewebsites.net;
        proxy_pass http://dotnetcore-clustering-web01.azurewebsites.net;
    }
}

server {
    listen 8002;
    server_name web02;
    
    location / {
        proxy_set_header Host dotnetcore-clustering-web02.azurewebsites.net;
        proxy_pass http://dotnetcore-clustering-web02.azurewebsites.net;
    }
}

Then when we refresh the page, we shall see the website is loaded correctly with the content will be delivered from either web01 or web02.

success.png

Yay, we make it!

Yup, that’s all about setting up a simple Nginx to load balance multiple Azure Web Apps. You can refer to the following articles for more information about Nginx and load balancing.

References

  1. How to open ports to a virtual machine with the Azure portal
  2. Can’t start Nginx – Job for nginx.service failed
  3. Linux+.NetCore+Nginx搭建集群
  4. Understanding Nginx HTTP Proxying, Load Balancing, Buffering, and Caching
  5. Module ngx_http_upstream_module
  6. How To Set Up Nginx Load Balancing with SSL Termination

 

Advertisements

[KOSD Series] Code Review and VSTS

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

kosd-vsts-azure.png

Code reviews are a best practice for software development projects but it’s normally ignored in startups and SMEs because

  • the top management doesn’t understand the value of doing so;
  • the developers have no time to do code reviews and even unit testing.

So, in order to improve our code quality and management standards, we decided to introduce the idea of code reviewing by enforcing pull requests creating in our deployment procedure, even though our team is very small and we are working in a startup environment.

Firstly, we set up two websites on Azure App Service, one for UAT and another for the Production. We enabled Continuous Deployment feature for two of them by configuring Azure App Service integration with our Git repository on Visual Studio Team Services (VSTS).

Secondly, we have two branches in the Git repository of the project, i.e. master and development-deployment. Changes pushed to the branches will automatically be deployed to the Production and the UAT websites, respectively.

In order to prevent that our codes are being deployed to even the UAT site without code reviews, we created a new branch known as the development branch. The development branch allows all the relevant developers (in the example below, we call them Alvin and Bryan) to pull/push their local changes freely from/to it.

git-flow-on-vsts.png

Once any of the developers is confident with his/her changes, he/she can create a new pull request on VSTS.

creating-pull-request.png

Creating a new pull request on VSTS.

We then proceed to make use of the new capability on VSTS, which is to set policies for the branches. In the policy setting, we checked the option “Require a minimum number of reviewers” to prevent direct pushes to both master and development-deployment branches.

branch-policies.png

Enabled the code review requirement in each pull request to protect the branch.

So for every deployment to our UAT and Production websites, the checking step is in place to make sure that the deployments are all properly reviewed and approved. This is not just to protect the system but also to protect the developers by having a standardized quality checking across the development team.

This is the end of this episode of KOSD series. If you have any comment or suggestion about this article, please shout out. Hope you enjoy this cup of electronic Kopi-O Siew Dai. =)

MS SQL on AWS: Amazon RDS

amazon-rds-ms-sql-server

There are some startups and SMEs hosting their databases on AWS. However, most of them choose to use Amazon EC2 because doing so is similar to running a SQL Server on-premise at data centres. So, to them, it’s something that they are familiar with back in the old days. However, doing so actually increases their cost of hosting services on AWS. The companies also need to hire experts to do database administration such as database backup and recovery and OS patching.

Hence, if I’m given the opportunity, I usually recommend the small companies with limited resources to consider Amazon RDS (or Azure SQL) first. Amazon RDS is a fully managed service which provides cost-efficient and resizable capacity while automating time-consuming database administration tasks.

Multi-AZ Deployments for MS SQL Server

Starting from May 2014, Amazon RDS also provides a highly available database solution with the synchronous Multi-AZ replication for MS SQL. Multi-AZ deployments for MS SQL database instances use SQL Server Mirroring.

Currently, Amazon RDS only supports Standard Edition and Enterprise Edition of SQL Server 2008 R2, 2012, 2014, and 2016. Amazon RDS also does not support Multi-AZ with Mirroring for the following regions yet:

  • US West (N. California);
  • Asia Pacific (Singapore);
  • European Union (Frankfurt);
  • AWS GovCloud (US);
  • Asia Pacific (Sdyney): Supported for DB instances in VPCs only;
  • Asia Pacific (Tokyo): Supported for DB instances in VPCs only;
  • South America (São Paulo): Supported for all DB instance classes except m1/m2.

It’s quite unfortunate that Singapore Region is one of them.

use-multi-az-deployment-for-production-sql-server-se.png

In N. Virginia Region, we’re able to specify to use Multi-AZ Deployment in Production SQL Server SE.

DB Instance Class

We can specify the DB Instance Class that allocates the computational, network, and memory capacity required by planned workload of the database instance.

Standard (db.m4) instances offer a balance of compute, memory, and network resources, and are a good choice for many applications.

Memory Optimized (db.r3) instances are designed to deliver fast performance for workloads that process large data sets in memory. The instances are well suited for the applications, such as high performance relational databases, in-memory analytics, and enterprise applications (for example, Microsoft SharePoint).

Burst Capable (db.t2) instances are instances that provide baseline performance level with the ability to burst to full CPU usage.

Storage Types

Most of the Amazon RDS are using Amazon EBS (Elastic Block Store) volumes for database and log storage. There are currently two main Storage Types available when setting up MS SQL database instances, as listed below.

General Purpose (SSD) storage, aka gp2, offers cost-effective storage which is suitable for a broad range of database workloads. Hence, it’s ideal for small to medium-sized databases. It provides baseline of 3 IOPS/GB and ability to burst to 3,000 IOPS for extended periods of time. Its volume can range from 20GB to 4TB for MS SQL database instances. However, provisioning less than 100 GB of General Purpose (SSD) storage for high throughput workloads could result in higher latencies upon exhaustion of the initial General Purpose (SSD) I/O Credit balance.

Provisioned IOPS (SSD) storage, aka io1, is suitable for I/O intensive database workloads which pay attention to storage performance and consistency in random access I/O throughput. It provides flexibility to provision I/O ranging from 1,000 to 30,000 IOPS. MS SQL can have provisioned IOPS volumes between 100GB (Express/Web edition) or 200GB (Standard/Enterprise edition) and 4TB.

Allocated Storage and I/O Credits

General Purpose (SSD) storage performance is controlled by the volume size. Larger volumes have higher base performance levels and can accumulate I/O Credits faster. The more storage, the greater the base performance is and the faster it replenishes the credit balance.

For General Purpose (SSD) storage, the DB instance has an initial I/O Credits balance of 5.4 million. When the storage requires more than the base performance I/O level, it uses I/O credits in the credit balance to burst to the required performance level, up to a maximum of 3,000 IOPS. If the storage uses all of its I/O credit balance, its maximum performance will remain at the base performance level until I/O demand drops below the base level and unused credits are added to the I/O credit balance at the baseline performance rate of 3 IOPS/GB of volume size. Hence, we can use the formula below to calculate the Burst Duration.

burst-duration-formula.png

burst-duration-tabular.png

Thus, for production application that requires fast and consistent I/O performance, it’s recommended to use Provisioned IOPS (SSD) storage that is optimized for I/O intensive, online transaction processing workloads that have consistent performance requirements. Note that we cannot decrease storage allocated for a DB instance.

For MS SQL Server, Amazon RDS does not currently support increasing storage. Hence, we need to provision storage based on anticipated future storage growth. If we predict it wrongly, then we need to increase the storage of an existing SQL Server DB instance by first exporting the data, creating a new database instance with increased storage, and then importing the data into the new database instance.

Specifying Database Instance Specification

After understanding key concepts above, we can then proceed to setup our database instance.

specifying-db-instance-specifications.png

Although there is Free Tier available but allocating storage > 20GB or adding provisioned IOPS will disqualify the databse instance from being eligible for the Free Tier.

Network and Security: VPC (Virtual Private Cloud)

Amazon RDS database instances can be hosted on either EC2-VPC platform or the legacy EC2-Classic platform, the original platform used by Amazon RDS. Amazon VPC launches AWS resources, such as database instances, into a virtual private cloud.

Nowadays, if we are creating a database instance in a region that we have not used before, we normally are already on the EC2-VPC platform.

rds-supported-platforms.png

We are already on EC2-VPC platform.

There are many scenarios for accessing a database instance in a VPC. Today, I will only focus on having an EC2 web server to access the database instance in the same VPC.

web-server-and-db-instance-in-the-same-vpc.png

A database instance in a VPC accessed by an EC2 instance in the same VPC (Source: AWS Documentation)

In such scenario, Amazon RDS database instance normally needs to be available to the web server, and not to the public Internet. Hence, we can create a VPC with both public and private subnets. The web server will be hosted in the public subnet so that it is accessible by the public. The database instance is hosted in the private subnet so that it won’t be available to the public Internet, providing greater security.

The Security Group used to restrict access to the database instances can have a custom rule that allows TCP access using the port 1433 and an IP address we will use to access the database instance for development or other purposes. In addition, we also need to set the Public Accessible option to Yes first (It is recommended to set the option to No for production database instance to limit the potential thread with no public routes).

Encryption of Database Instances using Key Management Service (KMS)

Amazon RDS for MS SQL supports the encryption of database instances with encryption keys managed in AWS KMS. Once the data is encrypted, Amazon RDS handles authentication of access and decryption of the data transparently without having the need to change our database client applications.

enable-database-encryption.png

Currently, encryption of database instances (Data-in-Rest Protection) is not available for those which are running SQL Server Express Edition.

Backup and Maintenance

Amazon RDS automatically backup our database instances. It creates a storage volume snapshot of our database instance, backing up the entire database instance and not just individual databases. We can setup and modify our preferred Backup Window from time to time. During the automatic backup window, storage I/O might be suspended briefly while the backup process initializes (typically under a few seconds). For SQL Server, I/O activity is suspended briefly during backup for Multi-AZ deployments.

By default, Amazon RDS has a 30-minute backup window randomly selected from an 8-hour block (Singapore region will be 14:00–22:00 UTC).

Periodically, Amazon RDS also automatically does maintenance work such as, updating the databse instance’s or database cluster’s OS. We can choose to manually apply maintenance, or wait for the automatic maintenance process initiated during our preferred maintenance window. There is one thing to take note is that the maintenance window determines when pending operations start, but does not limit the total execution time of these operations.

By default, Amazon RDS also has a 30-minute maintenance window randomly selected from an 8-hour block (Singapore region will be 14:00–22:00 UTC).

maintenance-window-collide-with-backup-window.png

We’re not allowed to make the maintenance window and the backup window overlap.

CloudWatch

Amazon RDS sends metrics to CloudWatch for each active database instance every minute. Detailed monitoring is enabled by default.

cloudwatch.png

Amazon RDS Metrics

When setting up the database instance, there is an option for us to specify whether to enable Enhanced Monitoring or not. Enhanced Monitoring is not exactly like CloudWatch. CloudWatch gathers metrics about CPU utilization from the hypervisor for a database instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.

enable-enhanced-monitoring.png

Enhanced monitoring requires permission to act on our behalf to send OS metric information to CloudWatch Logs.

Conclusion

It’s true that AWS allows us to deploy our MS SQL Server database on either Amazon RDS and Amazon EC2. However, it’s very crucial to analyze our needs and our application before deciding which one to use. In general, it is still recommended to consider Amazon RDS first so that developers can focus on high-level tasks and business logic implementation.

That’s all for my first trip to Amazon RDS. As a frequent user of Microsoft Azure, I never host MS SQL Server on AWS platform. So, if there is any mistake made in this article, kindly feedback to me. Thanks in advance!

Further Reading

Deploying Microsoft SQL Server on Amazon Web Services

Magical Experience with Beacons

One month ago on 27th of March, my friend passed me a box of Estimote Proximity Beacons. That day marks the beginning of my journey towards a greater understanding of beacons and IoT.

Since the day I joined travel industry, I have always been thinking of providing a fun travel experience with beacon technology. When I joined Changi Airport team in 2015, I proposed to my manager the possibility of applying beacons in the airport. The idea was rejected. Now, I finally get the chance to build something with the small little Estimote Proximity Beacons.

estimote-beacons.png

We forcefully opened up the beacons and replaced the batteries.

Claiming Beacons

Every Estimote beacons are shipped with an unique ID which we can modify. By default, the beacon ID is in iBeacon format and consists of 3 values:

The three values are hierarchical. The purpose of UUID is to distinguish our beacons from all other beacons in the network. Major and Minor values allow us to label the beacons with higher accuracy.

ibeacon-format

An example of how a chain of retail shops will deploy and label their beacons. (Source: Estimote Developer Docs)

The iBeacon ID can be changed. One way is to use the Estimote app to do it. Since I wasn’t the owner of the beacons, my first step is to claim the beacon using the app. After I successfully claim the beacons, I can then proceed to retrieve detailed info of the beacons and modify their info.

claiming-beacons-and-changing-broadcasting-power.png

Claiming beacon and modifying its info, such as its range (by default it’s ~3.5m).

Google Beacon Platform

After configuring our beacons, we can then proceed to claim the ownership of our beacons on the Google Beacon Registry. There is a mobile app called Beacon Tools available from Google to help us registering our beacons on Google Beacon Registry. There is a very interesting video interviewing Peter Lewis in the Coffee with a Googler season talking about the steps of beacon registration.

google-beacon-registry.png

Peter shares about Google Beacon Registry and Google Beacon Platform. (Source: YouTube)

After that, we can associate a lot of information with our beacons. To do so, we first are recommended to use Google Beacon Dashboard. There is a very simple tutorial guiding us to use the Google Beacon Dashboard to associate the attachments with the beacons.

attachments

My beacon project, Icy Marshmallow, and the attachments of a beacon in the project.

Read Attachments

I’m using the Nearby Messages API to retrieve the attachments from the beacons. I did a small little Android app (which is properly configured following the recommended steps) with the codes as shown below to achieve this.

package gclprojects.icymarshmallow;

...
import com.google.android.gms.common.ConnectionResult;
import com.google.android.gms.common.api.GoogleApiClient;
import com.google.android.gms.Nearby;
import com.google.android.gms.nearby.messages.Message;
import com.google.android.gms.nearby.messages.MessageListener;
import com.google.android.gms.nearby.messages.Strategy;
import com.google.android.gms.nearby.messages.SubscribeOptions;

public class MainActivity extends AppCompatActivity 
        implements GoogleApiClient.ConnectionCallbacks,
        GoogleApiClient.OnConnectionFailedListener {
    
    private GoogleApiClient mGoogleApiClient;
    private MessageListener mMessageListener;
    ...

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        ...

        mGoogleApiClient = new GoogleApiClient.Builder(this)
                .addApi(Nearby.MESSAGES_API)
                .addConnectionCallbacks(this)
                .enableAutoManage(this, this)
                .build();

        mMessageListener = new MessageListener() {
            @Override
            public void onFound(final Message message) {
                // Called when a new message is found.
                // Use message.getType().toString() to read the attachment Type
                // Use new String(message.getContent()) to read the attachment Value
            }
        }
    }

    @Override
    protected void onStart() {
        super.onStart();
        mGoogleApiClient.connect();
    }

    @Override
    protected void onStop() {
        super.onStop();
        mGoogleApiClient.disconnect();
    }

    @Override
    public void onConnected(@Nullable Bundle bundle) {
        if (mGoogleApiClient != null && mGoogleApiClient.isConnected()) {
            subscribe();
        }
    }

    ...

    private void subscribe() {
        SubscribeOptions options = new SubscribeOptions.Builder()
                .setStrategy(Strategy.BLE_ONLY)
                .build();

        Nearby.Messages.subscribe(mGoogleApiClient, mMessageListener);
    }
}

With the codes above, when beacon gets detected by the mobile app, the onFound method gets called for each of the attachment associated with the beacons. If we print the variable message into Log, we shall see something as follows.

Message{namespace='icy-marshmallow', type='string', content=[29 bytes]}

As shown above, the Value of the attachment is base64 encoded. So to read it, we just need to use new String(message.getContent()).

In the subscribe method, since we are only interested in messages attached to BLE (Bluetooth Low Energy) beacons, we use Strategy.BLE_ONLY.

Problem #1: Unsubscribe Method

When the app is running and another app comes into the foreground, we also need to stop subscribing to messages from the beacons. Otherwise, when we navigate back to the app, the messages can no longer be received even though we re-trigger the subscribe method.

So, I added the following codes.

@Override
protected void onPause() {
    if (mGoogleApiClient != null && mGoogleApiCLient.isConnected) {
        unsubsribe();
    }

    super.onPause();
}

@Override
protected void onResume() {
    if (mGoogleApiClient != null && mGoogleApiClient.isConnected) {
        subscribe();
    }

    super.onResume();
}

private void unsubscribe() {
    Nearby.Messages.unsubscribe(mGoogleApiClient, mMessageListener);
}

Problem #2: Stop Receiving Messages After Few Minutes

Another problem I notice is that the messages will stop be “found” after one to two minutes. However, if I re-trigger the mobile app, then I can start seeing the messages being detected for another one or two minutes.

To solve this issue, I use a simple timer which helps to check whether it has been quite some time the app doesn’t detect the beacons. If it’s more than 1 minute, then the timer will do a unsubscribe-then-subscribe-again action. This will help the mobile app to keep receiving the messages from the beacons. It also solve the problem of the mobile app re-visiting the beacons.

Problem #3: Geo-Location

This is not a real problem if we don’t need the geo-location information of the beacons. However, if we need to know the geo-location of the beacon, one simple way is to just use the LocationManager which provides periodic updates of the mobile geographical location.

package gclprojects.icymarshmallow;

...
import android.location.Location;
import android.location.LocationListener;
import android.location.LocationManager;

public class MainActivity extends AppCompatActivity 
        implements GoogleApiClient.ConnectionCallbacks,
        GoogleApiClient.OnConnectionFailedListener {
    LocationManager locationManager;
    ...

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        ...

        LocationListener locationListener = new LocationListener() {
            public void onLocationChanged(Location location) {
                // Record down the latitude and longitude of the mobile
            }

            ...
        }

        locationManager = (LocationManager) this.getSystemService(Context.LOCATION_SERVICE);
    
        int permissionCheck = ContextCompat.checkSelfPermission(this, Manifest.permission.ACCESS_FINE_LOCATION);
        if (permissionCheck == PackageManager.PERMISSION_GRANTED) {
            locationManager.requestLocationUpdates(LocationManager.NETWORK.PROVIDER, 0, 0, locationListener);

            ...
        }
    }
}

Writing Data to Firebase

This step is optional unless the data collected needs to be stored for future use.

I use the following codes to write the beacon data to Firebase database.

package gclprojects.icymarshmallow;

...
import com.google.firebase.database.DatabaseReference;
import com.google.firebase.database.FirebaseDatabase;

public class MainActivity extends AppCompatActivity 
        implements GoogleApiClient.ConnectionCallbacks,
        GoogleApiClient.OnConnectionFailedListener {
    private DatabaseReference mDatabase;
    ...

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        ...
        
        mDatabase = FirebaseDatabase.getInstance().getReference();

        mMessageListener = new MessageListener() {
            @Override
            public void onFound(final Message message) {
                ...

                Beacon beaconInfo = new Beacon(...);

                Format formatter = new SimpleDateFormat("yyyy-MM-dd-HH:mm:ss");
                mDatabase.child("Person A")
                        .child(formatter.format(new Date())
                        .setValue(beaconInfo);
            }
        }
    }
}

...

public class Beacon { ... }
firebase.png

Successfully recorded the data from beacons in my Firebase database!

To integrate our Android app with Firebase, our friendly Android Studio comes with a tool called the Firebase Assistance which will help us connect to the Firebase. The assistance also comes with short getting-started tutorial to show us how to cinfigure and add realtime database to our mobile app.

beacon-in-changi-airport.png

Spot the beacon. =)

Installing Beacons in Changi Airport

Installing beacons in our Changi Airport is always one of my dreams to enhance the experience of millions of travelers flying in and out of the airport. In fact, currently the Armsterdam city is already making use of beacon technology to build a powerful beacon networks to give the people a better experience when they are walking around in the city. So why can’t we do the same in our friendly Changi Airport? =)

IoT Hub First Peek

The Internet of Things (IoT) is here today, and it begins with the data, devices, and services already at work in your organization. When your “things” are connected to each other and to the cloud, you create new ways to improve efficiency, enable innovation, and transform your business.

This line is printed on the front page of a Microsoft booklet distributed during the lunchtime workshop “Connecting and Building the Internet of Things (IoT)” conducted by Gerald Goh, Microsoft Technical Evangelist. Gerald shared with us technologies such as AMQP, MQTT, Message Broker in Azure, Device Explorer, and so on.

gerald-goh.png

Gerald is sharing Azure IoT Hub during the lunchtime workshop.

IoT hasn’t gone totally mainstream, however, and we have yet to feel its impact. In many ways it is roughly where the big data movement was few years ago — consisting mainly of a buzzword that’s not yet widely understood.

Nevertheless, Gerald’s workshop does give me, a web developer who doesn’t know much about this field, a helpful quick start about IoT. After reading and experimenting, I learn more about the capability of Microsoft Azure in IoT and thus I’d like to share with you about what I’ve learnt so far about Azure IoT Hub.

Message Broker

I’m working in Changi Airport. In the airport, we have several shops serving the travelers and visitors. Most of the shops have a back-end system that integrates several systems such as the retail system, e-commerce website, payment system, Changi Rewards system, inventory management system, the finance system.

So there will be cases where, when a customer buys something at the shop, the retail system needs to send as request to the payment system. Then when the purchase is successful, another purchase request will be sent to the inventory management system and the finance system.

I’m not too sure how the shops link different systems, especially this kind of point-to-point integration will cause a large number of connections among the systems. Hence, the developers of their system may find Message Broker useful.

Message Broker is a physical component that handles the communication between systems. A system sends a message to the message broker, providing the logical name of the receiving systems. The message broker will then search for the receiving systems and then passes the message to them.

message-broker

A message broker mediating the communication between systems. (Image Credit: Message Broker – MSDN)

Messaging Protocols: AMQP and MQTT

Sending a message between systems seems to be an easy task, however, doing it in a reliable and secure manner can be a challenging work.

As shown in the article “Scalable Eventing over Mesos!”, Autodesk is using AMQP (Advanced Message Queuing Protocol) as messaging protocols between two parties with the following main characteristics as goals.

  • Security
  • Reliability
  • Interoperability
  • Standard
  • Open
autodesk-messaging-protocol

AMQP communication between two parties (Image Credit: Autodesk)

AMQP 1.0 is the current specification version. It is also the primary protocol of Azure Event Hubs and Azure Service Bus Messaging after the SBMP (Service Bus Messaging Protocol), the TCP-based protocol which is used inside of .NET client library, is phased out.

Besides AMQP, MQTT (Message Queue Telemetry Transport) is another open protocol based on TCP/IP for asynchronous message queuing which has been developed and matured over past few years.

ibm.png

Dr Andy Stanform-Clark from IBM invented the MQTT protocol. (Image Source: IBM – Wikipedia)

While AMQP is designed to provide the full vibrancy of messaging scenarios, MQTT is designed as an extremely lightweight publish/subscribe message transport for small and simple devices sending small messages on low-bandwidth networks. Hence, MQTT is said to be ideal for mobile applications because of its low power usage and minimized data packets.

MQTT is also simple because it just has five API methods:

  • Connect to an MQTT broker;
  • Disconnect from an MQTT broker;
  • Subscribe to an MQTT topic filter;
  • Unsubscribe from an MQTT topic filter;
  • Publish MQTT messages.

If you are interested to know more about the comparison of AMQP and MQTT, there is a detailed white paper from StormMQ discussing the difference between AMQP and MQTT.

Brokered Messaging – Service Bus Messaging

When two or more systems want to exchange information, they need a communication facilitator. This is where Microsoft Azure Service Bus comes into picture.

Azure Service Bus is a reliable information delivery service, which is similar to a postal service in the physical world.

One of the messaging patterns offered in Azure Service Bus is called Service Bus Messaging, or Brokered Messaging. By using it, both senders and receivers do not have to be available at the exact same time.

AMQP 1.0 support is available in the Service Bus SDK since its version 2.1. Since the Service Bus .NET client library by default using a dedicated SOAP-based protocol, to use AMQP 1.0, we need to specify in the Service Bus Connection String as highlighted below in bold.

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
    <appSettings>
        <add 
            key="Microsoft.ServiceBus.ConnectionString" 
            value="Endpoint=sb://[namespace].servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=[SAS key];TransportType=Amqp" /> 
    appSettings> 
configuration>

In AMQP transport mode, the client library of sender will serialize the brokered message into an AMQP message so that the message can be received and interpreted by a receiver running on a different platform.

Azure Event Hub

When our event-based messaging needs to be handled at a very huge scale, we can either continue to pay even more to use Azure Service Bus or we can switch to use Event Hub. Event Hub is a cheaper way for us to be able to deal with huge bursts of messages and retain messages for a longer period of time.

event-hub-is-cheaper.png

Event Hub is cheaper, reliable and also fully managed. (Full video: Azure Service Bus Event Hubs 101 with Dan Rosanova)

Although Event Hub does not support MQTT, it does support AMQP (and HTTP) where there could be at most 5,000 concurrent AMQP connections.

Event Hubs event and telemetry handling capabilities, such as ingesting millions of events per second, make it especially usefu for IoT scenarios. However, since it is ingestion only thus Event Hub has no facility for sending traffic, for example, from the cloud back to the devices (C2D).

Azure IoT Hub

Since Event Hubs only enable event ingress, i.e. C2D, Azure offers another service, IoT Hub, for both C2D and D2C (Device-to-Cloud) communications which are reliable and secure. Not only allowing bi-directional communication, IoT Hub also supports AMQP, HTTP, and MQTT.

IoT Hub has an identity registry storing information about devices which are given the permission to connect to the IoT Hub. Before a device can connect to an IoT Hub, there must be an entry for that device in the identity registry of the IoT Hub.

In a Hello World tutorial of connecting stimulated device to IoT Hub using C#, there is a way to add device and retrieve device identity programmatically as shown below.

private static async Task AddDeviceAsync()
{
    string deviceId = "gclRasPi2";
    Device device;

    try
    {
        device = await registryManager.AddDeviceAsync(new Device(deviceId));
    }
    catch (DeviceAlreadyExistsException)
    {
        device = await registryManager.GetDeviceAsync(deviceId);
    }

    Console.WriteLine("Generated device key: {0}", device.Authentication.SymmetricKey.PrimaryKey);
}

The Registry Manager, which is connecting to the IoT Hub using a Connection String with proper Policy, will add an device identity with the Device ID “gclRasPi2” to the Device Explorer in Azure.

azure-iot-hub-device-explorer.png

The device “gclRasPi2” is now in the Device Explorer.

After doing so, a message then can be sent from (stimulated) device to the IoT Hub. For example, the device wants to send data about the temperature and humidity at that moment using MQTT, we can use the following code.

var deviceClient = DeviceClient.Create(
    iotHubUri, 
    new DeviceAuthenticationWithRegistrySymmetricKey("gclRasPi2", deviceKey), 
    TransportType.Mqtt);

var telemetryDataPoint = new
{
    deviceId = "gclRasPi2",
    temperature = currentTemperature,
    humidity = currentHumidity
};

var messageString = JsonConvert.SerializeObject(telemetryDataPoint);

var message = new Message(Encoding.ASCII.GetBytes(messageString));
message.Properties.Add("temperatureAlert", (currentTemperature > 30) ? "true" : "false");

await deviceClient.SendEventAsync(message);

To read the message, please follow the steps shared by the tutorial on setting up to read data-point messages.

Message Routing

Besides reading normal data-point messages, what really interests me is another tutorial about message processing with Message Routing.

iot-hub-routing

Message Routing (Image Source: Microsoft Azure Blog)

According to the tutorial, we first need to setup a Service Bus queue in the same Azure subscription and region as our IoT Hub.

service-bus-queue.png

Created a Queue in the Service Bus.

We can then add an Endpoint in the IoT Hub for the queue we just created. As shown in the following screenshot, there is a message saying that “You may have up to 1 endpoint on the IoT hub.” This is because I am using the free IoT Hub. For its paid versions, only at most 10 custom endpoints are allowed.

Interestingly, each Azure subscription can only have at most 10 IoT Hubs, and only 1 free IoT Hub.

iot-hub-endpoint.png

Adding a new endpoint to the IoT Hub.

After adding endpoint, we need to setup the Message Routing. For free version, we can only have 5 routing rules.

iot-hub-route.png

Creating new route with query string following special syntax.

In the query string, I used temperatureAlert = “true” as the condition. Also, as shown on the screenshot above, there is a line saying “Messages which do not match any rules will be written to the ‘Events (messages/events)’ endpoint.” Hence, the following two console applications will show different results: The left one is connecting to the messages/events endpoint while the right one is showing messages that match the CustomizedMessageRoutingRule created above.

consoles-results.png

Only data with temperatureAlert = “true” will be sent to the “CustomizedMessageRoute”.

Now if we visit the Service Bus Queue page and IoT Hub page again, we will see some updates on the numbers.

queue-results.png

Usage statistics in Service Bus Queue.

iot-hub-usage.png

2% of 8k messages sent from the stimulated device console application.

Conclusion

That’s all about my first try of Azure IoT Hub after attending the workshop delivered by Gerald. It’s a great lunchtime workshop.

For those who are interested, there is an article on Microsoft sharing the benefits of using Azure IoT Hub service, you can read it to understand more.

This is just the beginning of my IoT learning journey. There are still more things for me to learn, such as Azure Stream Analysis and Microsoft Azure IoT Suite which is briefly brought up in the booklet mentioned above.

If you spot any mistake in this article or you have more to talk about IoT and in particular IoT in Azure ecosystem, please share with me. =)

Rewrite

This is a post about code refactoring, the long journey of removing technical debts in the system I have built at work.

Since late 2015, I have been working on a system which is supposed to be extensible and scalable to support our branches in different countries. With the frequent change of user requirements and business strategies, mistakes and system-wide code refactoring are things that I can’t avoid.

Today, I’m going to list down some key lessons and techniques I learned in this one year plus of journey of clearing technical debts.

…we are not living in a frozen world of requirements and changes. So you can’t and shouldn’t avoid refactoring at all.

The risk in avoiding would definitely bring up messy codes, and tough maintenance of your software, you can leave it behind but somebody else would suffer.

— Nail Yuce, Author of the redbook “Collaborative Application Lifecycle Management”

Databases Organization

Originally, we have three branches in three countries. The branches are selling the same product, laptop. It’s business decision that these three branches should behave individually with different names so that customers generally can’t tell that these three branches are operated by one company.

Let’s say these three branches have names as such, Alpha, Beta, and Gamma. Instead of setting up different databases for the branches, I put the tables of these three branches in the same database for two reasons:

  1. Save cost;
  2. Easy to maintain because there will be only one database and one connection string.

These two points turn out to be invalid few months after I designed the system in such a way.

I’m using Azure SQL, actually separating databases won’t incur higher cost because of the introduction of Elastic Database Pool in Microsoft Azure. It’s also not easy to maintain because to put three business entities in one database, I have two ways to do it.

  1. Have a column in each table specifying the record is from/for which branch;
  2. Prefix the table names.

I chose the 2nd way. Prefix the table names. Hence, instead of a simple table name such as Suppliers, I now have three tables, APSupplier, BTSupplier, and GMSupplier, where AP, BT, and GM are the abbreviation of the branch names.

This design decision leads to the second problem that I am going to share with you next.

Problem of Prefixing Table Names

My senior once told me that experience was what made a software developer valuable. The “experience” here is not about technical experience because technology simply moves forward at a very fast pace, especially in web development.

The experience that makes a software developer valuable is more about system design and decision making in the software development process. For example, now I know prefixing table names in my case is a wrong move.

TooYoungTooSimple

Experience helps in building a better system which is not “too simple and naive”.

There are actually a few obvious reasons of not prefixing.

For those who are using Entity Framework and love intellisense feature in Visual Studio IDE, they will know my pain. When I’m going to search for Supplier table, I have to type the two-letter branch abbreviation first then search for the Supplier table. Hence in the last one year, our team spent lots of man hours going through all these AP, BT, and GM things. Imaging the company starts to grow to 20 countries, we will then have AP, BT, GM, DT (for Delta), ES (for Epsilon), etc.

To make things worse, remember that three branches are actually just selling the laptops with the similar business models? So what would we get when we use inheritance for our models? Many meaningless empty sub-classes.

public abstract class BaseSupplier
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string PersonInCharge { get; set; }
    public string Email { get; set; }
    public bool Status { get; set; }
}

public class APSupplier : BaseSupplier { }

public class BTSupplier : BaseSupplier { }

public class GMSupplier : BaseSupplier { }

So if we have branches in 20 countries (which is merely 10% of the total number of countries in the world), then our software developers’ life is going to be miserable because they need to maintain all these meaningless empty classes.

Factory Design Pattern and Template Methods

However, the design above actually at the same time also makes our system to be flexible. Now, imagine that one of the branches requests for a system change which requires addition of columns in its own Supplier table, we can simply change the corresponding sub-class without affecting the rest.

It-Is-Not-Bad-If-You-Know-How-To-Use_It

Design Patterns are good if we know how to use them properly. (Image Source: Rewrite)

This leads us to the Factory Design Pattern. Factory Design Pattern allows us to standardize the system design for each of the branch in the same system while at the same time allowing for individual branch to define their own business models.

public abstract class SupplierFactory
{
    public static SupplierFactory GetInstance(string portal)
    {
        return Activator.CreateInstance(Type.GetType($"Lib.Factories.Supplier.{portal}SupplierFactory")) as SupplierFactory;
    }

    protected abstract BaseSupplier CreateInstanceOfSupplier();

    protected abstract void InsertSupplierToDatabase(BaseSupplier newSupplier);

    public abstract IQueryable RetrieveSuppliers();

    public async Task AddNewSupplierAsync(SupplierManageViewModel manageVM)
    {
        ...
        var newSupplier = CreateInstanceOfSupplier();
        newSupplier.Name = manageVM.Name;
        newSupplier.PersonInCharge = manageVM.PersonInCharge;
        newSupplier.Email= manageVM.Email;
        newSupplier.Status = manageVM.Status;
        InsertSupplierToDatabase(newSupplier);
    }

    ...
}

For each of the branch, then I define their own SupplierFactory which inherits from this abstract class SupplierFactory.

public class AlphaSupplierFactory : SupplierFactory
{
    private AlphaDbContext db = new AlphaDbContext();

    public override IQueryable RetrieveSuppliers()
    {
        return db.APSuppliers;
    }

    protected override void InsertSupplierToDatabase(BaseSupplier newSupplier)
    {
        db.APSuppliers.Add((APSupplier)newSupplier);
    }

    ...
 }

As shown in the code above, firstly, I no longer use the abbreviation for the prefix of the class name. Yup, having abbreviation hurts.

Secondly, I have also split the big database to different smaller databases which store each branch’s info separately.

The standardization of the workflow is done using Template Method such as AddNewSupplier method shown above. The creation of new supplier will be using the same workflow for all branches.

Reflection

public static SupplierFactory GetInstance(string portal)
{
    return Activator.CreateInstance(Type.GetType($"Lib.Factories.Supplier.{portal}SupplierFactory")) as SupplierFactory;
}

For those who wonder what I am doing with the Activator.CreateInstance in the GetInstance method, I use it to create an instance of the specified type that type’s default constuctor with portal acts as an indicator on which sub-class the code should pick using reflection with the Type.GetType method. So the values for portal will be Alpha, Beta, Gamma, etc. in my case.

This unfortunately adds one more burden to our development team to pay attention to the naming convention of the classes.

Fluent API: ToTable

All these unnecessary complexity is finally coming to an end after my team found out how to make use of the Fluent API. Fluent API provides several important methods to configure entities which help to override Code First conventions. ToTable method is one of them.

ToTable helps mapping entity to the actual table name. Hence, now we can fix our naming issues in our databases with the codes below in each of the branch’s database context class.

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);

    modelBuilder.Entity().ToTable("APSuppliers");
    ... // mappings for other tables
 }

With this change, we can furthermore standardize the behavior and workflow of the system for each of our branch. However, since we only start to apply this after we have expanded our business to about 10 countries, there will be tons of changes waiting for us if we are going to apply Fluent API to refactor our codes.

Technical Debts

I made a few mistakes in the beginning of system design because of lacking of experience building extensible and scalable systems and lack of sufficient time and spec given to do proper system design.

This naming issue is just one of the debts we are going to clear in the future. Throughout the one year plus of system development, the team also started to realize many other ways to refactor our codes to make it more robust.

As a young team in a non-software-development environment, we need to be keen on self-learning but at the same time understand that we can’t know everything. We will keep on fighting to solve the crucial problems in the system and at the same time improving the system to better fit the changing business requirements.

I will discuss more about my journey of code refactoring in the future posts. Coming soon!

I-Was-Such-An-Idiot-Back-Then.png

Reading my old codes sometimes makes me slap myself. (Image Source: Rewrite)

Exploring Azure Functions for Scheduler

azure-function-documentdb.png

During my first job after finishing my undergraduate degree in NUS, I worked in a local startup which was then then the largest bus ticketing portal in Southeast Asia. In 2014, I worked with a senior to successfully migrate the whole system from on-premise to Microsoft Azure Virtual Machines, which is the IaaS option. Maintaining the virtual machines is a painful experience because we need to setup the load balancing with Traffic Manager, database mirroring, database failover, availability set, etc.

In 2015, when I first worked in Singapore Changi Airport, with the support of the team, we made use of PaaS technologies such as Azure Cloud Services, Azure Web Apps, and Azure SQL, we successfully expanded our online businesses to 7 countries in a short time. With the help of PaaS option in Microsoft Azure, we can finally have a more enjoyable working life.

Azure Functions

Now, in 2017, I decided to explore Azure Functions.

Azure Functions allows developers to focus on the code for only the problem they want to solve without worrying about the infrastructure like we do in Azure Virtual Machines or even the entire applications as we do in Azure Cloud Services.

There are two important benefits that I like in this new option. Firstly, our development can be more productive. Secondly, Azure Functions has two pricing models: Consumption Plan and App Service Plan, as shown in the screenshot below. The Consumption Plan lets us pay per execution and the first 1,000,000 executions are free!

Screen Shot 2017-02-01 at 2.22.01 PM.png

Two hosting plans in Azure Functions: Consumption Plan vs. App Service Plan

After setting up the Function App, we can choose “Quick Start” to have a simpler user interface to get started with Azure Function.

Under “Quick Start” section, there are three triggers available for us to choose, i.e. Timer, Data Processing, and Webhook + API. Today, I’ll only talk about Timer. We will see how we can achieve the scheduler functionality on Microsoft Azure.

Screen Shot 2017-02-05 at 11.16.40 PM.png

Quick Start page in Azure Function.

Timer Trigger

Timer Trigger will execute the function according to a schedule. The schedule is defined using CRON expression. Let’s say if we want our function to be executed every four hours, we can write the schedule as follows.

0 0 */4 * * *

This is similar to how we did in the cron job. The CRON expression consists of six fields. The first one is second (0-59), followed by minute (0 – 59), followed by hour (0 – 23), followed by day of month (1 – 31), followed by month (1 – 12) and day of week (0-6).

Similar to the usual Azure Web App, the default time zone used in Azure Functions is also UTC. Hence, if we would like to change it to use another timezone, what we need to do is just add the WEBSITE_TIME_ZONE application setting in the Function App.

Companion File: function.json

So, where do we set the schedule? The answer is in a special file called function.json.

In the Function App directory, there always needs a function.json file. The function.json file will contain the configuration metadata for the function. Normally, a function can only have a single trigger binding and can have none or more than one I/O bindings.

The trigger binding will be the place we set the schedule.

{
    "bindings": [
        {
            "name": "myTimer",
            "type": "timerTrigger",
            "direction": "in",
            "schedule": "0 0 */4 * * *"
        },
        ...
    ],
    ...
}

The name attribute is to specify the name of the parameter used in the C# function later. It is used for the bound data in the function.

The type attribute specifies the binding time. Our case here will be timerTrigger.

The direction attribute indicates whether the binding is for receiving data into the function (in) or sending data from the function (out). For scheduler, the direction will be “in” because later in our C# function, we can actually retrieve info from the myTimer parameter.

Finally, the schedule attribute will be where we put our schedule CRON expression at.

To know more about binding in Azure Function, please refer to the Azure Function Developer Guide.

Function File: run.csx

2nd file that we must have in the Function App directory is the function itself. For C# function, it will be a file called run.csx.

The .csx format allows developers to focus on just writing the C# function to solve the problem. Instead of wrapping everything in a namespace and class, we just need to define a Run method.

#r "Newtonsoft.Json"

using System;
using Newtonsoft.Json;
...

public static async Task Run(TimerInfo myTimer, TraceWriter log)
{
    ...
}

Assemblies in .csx File

Same as how we always did in C# project, when we need to import the namespaces, we just need to use the using clause. For example, in our case, we need to process the Json file, so we need to make use of the library Newtonsoft.Json.

using Newtonsoft.Json;

To reference external assemblies, for example in our case, Newtonsoft.Json, we just need to use the #r directive as follows.

#r "Newtonsoft.Json"

The reason why we are allowed to do so is because Newtonsoft.Json and a few more other assemblies are “special case”. They can be referenced by simplename. As of Jan 2017, the assemblies that are allowed to do so are as follows.

  • Newtonsoft.Json
  • Microsoft.WindowsAzure.Storage
  • Microsoft.ServiceBus
  • Microsoft.AspNet.WebHooks.Receivers
  • Microsoft.AspNet.WebHooks.Common
  • Microsoft.Azure.NotificationHubs

For other assemblies, we need to upload the assembly file, for example MyAssembly.dll, into a bin folder relative to the function first. Only then we can reference is as follows.

#r "MyAssembly.dll"

Async Method in .csx File

Asynchronous programming is recommended best practice. To make the Run method above asynchronous, we need to use the async keyword and return a Task object. However, developers are advised to always avoid referencing the Task.Result property because it will essentially do a busy-wait on a lock of another thread. Holding a lock creates the potential for deadlocks.

Inputs in .csx File and DocumentDB

latest-topics-on-dotnet-sg-facebook-group

This section will display the top four latest Facebook posts pulled by Azure Function.


For our case, the purpose of Azure Function is to process the Facebook Group feeds and then store the feeds somewhere for later use. The “somewhere” here is DocumentDB.

To gets the inputs from DocumentDB, we first need to have 2nd binding specified in the functions.json as follows.

{
    "bindings": [
        ...
        {
            "type": "documentDB",
            "name": "inputDocument",
            "databaseName": "feeds-database",
            "collectionName": "facebook-group-feeds",
            "id": "41f7adb1-cadf-491e-9973-28cc3fca57df",
            "connection": "dotnetsg_DOCUMENTDB",
            "direction": "in"
        }
    ],
    ...
}

In the DocumentDB input binding above, the name attribute is, same as previous example, used to specify the name of the parameter in the C# function.

The databaseName and collectionName attributes correspond to the names of the database and collection in our DocumentDB, respectively. The id attribute is the Document Id of the document that we want to retrieve. In our case, we store all the Facebook feeds in one document, so we specify the Document Id in the binding directly.

The connection attribute is the name of the Azure Function Application Setting storing the connection string of the DocumentDB account endpoint. Yes, Azure Function also has Application Settings available. =)

Finally, the direction attribute must be “in”.

We can then now enhance our Run method to include inputs from DocumentDB as follows. What it does is basically just reading existing feeds from the document and then update it with new feeds found in the Singapore .NET Facebook Group

#r "Newtonsoft.Json"

using System;
using Newtonsoft.Json;
...

private const string SG_DOT_NET_COMMUNITY_FB_GROUP_ID = "1504549153159226";

public static async Task Run(TimerInfo myTimer, dynamic inputDocument, TraceWriter log)
{
    string sgDotNetCommunityFacebookGroupFeedsJson = 
        await GetFacebookGroupFeedsAsJsonAsync(SG_DOT_NET_COMMUNITY_FB_GROUP_ID);
    
    ...

    var existingFeeds = JsonConvert.DeserializeObject(inputDocument.ToString());

    // Processing the Facebook Group feeds here...
    // Updating existingFeeds here...

    inputDocument.data = existingFeeds.Feeds;
}

Besides getting input from DocumentDB, we can also have DocumentDB output binding as follows to, for example, write a new document to DocumentDB database.

{
    "bindings": [
        ...
        {
            "type": "documentDB",
            "name": "outputDocument",
            "databaseName": "feeds-database",
            "collectionName": "facebook-group-feeds",
            "id": "41f7adb1-cadf-491e-9973-28cc3fca57df",
            "connection": "dotnetsg_DOCUMENTDB",
            "createIfNotExists": true,
            "direction": "out"
        }
    ],
    ...
}

We don’t really use this in our dotnet.sg case. However, as we can see, there are only two major differences between DocumentDB input and output bindings.

Firstly, we have a new createIfNotExists attribute which specify whether to create the DocumentDB database and collection if they don’t exist or not.

Secondly, we will have to set the direction attribute to be “out”.

Then in our function code, we just need to have a new parameter with “out object outputDocument” instead of “in dynamic inputDocument”.

You can read more at the Azure Functions DocumentDB bindings documentation to understand more about how they work together.

Application Settings in Azure Functions

Yes, there are our familiar features such as Application Settings, Continuous Integration, Kudu, etc. in Azure Functions as well. All of them can be found under “Function App Settings” section.

Screen Shot 2017-02-18 at 4.40.24 PM.png

Azure Function App Settings

As what we have been doing in Azure Web Apps, we can also set the timezone, store the App Secrets in the Function App Settings.

Deployment of Azure Functions with Github

We are allowed to link the Azure Function with variety of Deployment Options, such as Github, to enable the continuous deployment option too.

There is one thing that I’d like to highlight here is that if you are also starting from setting up your new Azure Function via Azure Portal, then when in the future you setup the continuous deployment for the function, please make sure that you first create a folder having the same name as the name of your Azure Function. Then all the files related to the function needs to be put in the folder.

For example, in dotnet.sg case, we have the Azure Function called “TimerTriggerCSharp1”. we will have the following folder structure.

Screen Shot 2017-02-18 at 4.49.11 PM.png

Folder structure of the TimerTriggerCsharp1.

When I first started, I made a mistake when I linked Github with Azure Function. I didn’t create the folder with the name “TimerTriggerCSharp1”, which is the name of my Azure Function. So, when I deploy the code via Github, the code in the Azure Function on the Azure Portal is not updated at all.

In fact, once the Continuous Deployment is setup, we are no longer able to edit the code directly on the Azure Portal. Hence, setting up the correct folder structure is important.

Screen Shot 2017-02-18 at 4.52.17 PM.png

Read only once we setup the Continuous Deployment in Azure Function.

If you would like to add in more functions, simply create new folders at the same level.

Conclusion

Azure Function and the whole concept of Serverless Architecture are still very new to me. However, what I like about it is the fact that Azure Function allows us to care about the codes to solve a problem without worrying about the whole application and infrastructure.

In addition, we are also allowed to solve the different problems using the programming language that best suits the problem.

Finally, Azure Function is cost-saving because we can choose to pay only for the time our code is being executed.

If you would like to learn more about Azure Functions, here is the list of references I use in this learning journey.

You can check out my code for TimerTriggerCSharp1 above at our Github repository: https://github.com/sg-dotnet/FacebookGroupFeedsProcessor.