MS SQL on AWS: Amazon RDS

amazon-rds-ms-sql-server

There are some startups and SMEs hosting their databases on AWS. However, most of them choose to use Amazon EC2 because doing so is similar to running a SQL Server on-premise at data centres. So, to them, it’s something that they are familiar with back in the old days. However, doing so actually increases their cost of hosting services on AWS. The companies also need to hire experts to do database administration such as database backup and recovery and OS patching.

Hence, if I’m given the opportunity, I usually recommend the small companies with limited resources to consider Amazon RDS (or Azure SQL) first. Amazon RDS is a fully managed service which provides cost-efficient and resizable capacity while automating time-consuming database administration tasks.

Multi-AZ Deployments for MS SQL Server

Starting from May 2014, Amazon RDS also provides a highly available database solution with the synchronous Multi-AZ replication for MS SQL. Multi-AZ deployments for MS SQL database instances use SQL Server Mirroring.

Currently, Amazon RDS only supports Standard Edition and Enterprise Edition of SQL Server 2008 R2, 2012, 2014, and 2016. Amazon RDS also does not support Multi-AZ with Mirroring for the following regions yet:

  • US West (N. California);
  • Asia Pacific (Singapore);
  • European Union (Frankfurt);
  • AWS GovCloud (US);
  • Asia Pacific (Sdyney): Supported for DB instances in VPCs only;
  • Asia Pacific (Tokyo): Supported for DB instances in VPCs only;
  • South America (São Paulo): Supported for all DB instance classes except m1/m2.

It’s quite unfortunate that Singapore Region is one of them.

use-multi-az-deployment-for-production-sql-server-se.png
In N. Virginia Region, we’re able to specify to use Multi-AZ Deployment in Production SQL Server SE.

DB Instance Class

We can specify the DB Instance Class that allocates the computational, network, and memory capacity required by planned workload of the database instance.

available-instance-class-for-ms-sql.png
DB Instance Classes available in MS SQL 2016 on AWS.

Standard (db.m4) instances offer a balance of compute, memory, and network resources, and are a good choice for many applications.

Memory Optimized (db.r3) instances are designed to deliver fast performance for workloads that process large data sets in memory. The instances are well suited for the applications, such as high performance relational databases, in-memory analytics, and enterprise applications (for example, Microsoft SharePoint).

Burst Capable (db.t2) instances are instances that provide baseline performance level with the ability to burst to full CPU usage.

Storage Types

Most of the Amazon RDS are using Amazon EBS (Elastic Block Store) volumes for database and log storage. There are currently two main Storage Types available when setting up MS SQL database instances, as listed below.

General Purpose (SSD) storage, aka gp2, offers cost-effective storage which is suitable for a broad range of database workloads. Hence, it’s ideal for small to medium-sized databases. It provides baseline of 3 IOPS/GB and ability to burst to 3,000 IOPS for extended periods of time. Its volume can range from 20GB to 4TB for MS SQL database instances. However, provisioning less than 100 GB of General Purpose (SSD) storage for high throughput workloads could result in higher latencies upon exhaustion of the initial General Purpose (SSD) I/O Credit balance.

Provisioned IOPS (SSD) storage, aka io1, is suitable for I/O intensive database workloads which pay attention to storage performance and consistency in random access I/O throughput. It provides flexibility to provision I/O ranging from 1,000 to 30,000 IOPS. MS SQL can have provisioned IOPS volumes between 100GB (Express/Web edition) or 200GB (Standard/Enterprise edition) and 4TB.

amazon-ebs-pricing.png
Amazon Elastic Block Store (EBS) Pricing for Singapore region.

Allocated Storage and I/O Credits

General Purpose (SSD) storage performance is controlled by the volume size. Larger volumes have higher base performance levels and can accumulate I/O Credits faster. The more storage, the greater the base performance is and the faster it replenishes the credit balance.

For General Purpose (SSD) storage, the DB instance has an initial I/O Credits balance of 5.4 million. When the storage requires more than the base performance I/O level, it uses I/O credits in the credit balance to burst to the required performance level, up to a maximum of 3,000 IOPS. If the storage uses all of its I/O credit balance, its maximum performance will remain at the base performance level until I/O demand drops below the base level and unused credits are added to the I/O credit balance at the baseline performance rate of 3 IOPS/GB of volume size. Hence, we can use the formula below to calculate the Burst Duration.

burst-duration-formula.png

burst-duration-tabular.png

Thus, for production application that requires fast and consistent I/O performance, it’s recommended to use Provisioned IOPS (SSD) storage that is optimized for I/O intensive, online transaction processing workloads that have consistent performance requirements. Note that we cannot decrease storage allocated for a DB instance.

For MS SQL Server, Amazon RDS does not currently support increasing storage. Hence, we need to provision storage based on anticipated future storage growth. If we predict it wrongly, then we need to increase the storage of an existing SQL Server DB instance by first exporting the data, creating a new database instance with increased storage, and then importing the data into the new database instance.

Specifying Database Instance Specification

After understanding key concepts above, we can then proceed to setup our database instance.

specifying-db-instance-specifications.png
Although there is Free Tier available but allocating storage > 20GB or adding provisioned IOPS will disqualify the databse instance from being eligible for the Free Tier.

Network and Security: VPC (Virtual Private Cloud)

Amazon RDS database instances can be hosted on either EC2-VPC platform or the legacy EC2-Classic platform, the original platform used by Amazon RDS. Amazon VPC launches AWS resources, such as database instances, into a virtual private cloud.

Nowadays, if we are creating a database instance in a region that we have not used before, we normally are already on the EC2-VPC platform.

rds-supported-platforms.png
We are already on EC2-VPC platform.

There are many scenarios for accessing a database instance in a VPC. Today, I will only focus on having an EC2 web server to access the database instance in the same VPC.

web-server-and-db-instance-in-the-same-vpc.png
A database instance in a VPC accessed by an EC2 instance in the same VPC (Source: AWS Documentation)

In such scenario, Amazon RDS database instance normally needs to be available to the web server, and not to the public Internet. Hence, we can create a VPC with both public and private subnets. The web server will be hosted in the public subnet so that it is accessible by the public. The database instance is hosted in the private subnet so that it won’t be available to the public Internet, providing greater security.

The Security Group used to restrict access to the database instances can have a custom rule that allows TCP access using the port 1433 and an IP address we will use to access the database instance for development or other purposes. In addition, we also need to set the Public Accessible option to Yes first (It is recommended to set the option to No for production database instance to limit the potential thread with no public routes).

Encryption of Database Instances using Key Management Service (KMS)

Amazon RDS for MS SQL supports the encryption of database instances with encryption keys managed in AWS KMS. Once the data is encrypted, Amazon RDS handles authentication of access and decryption of the data transparently without having the need to change our database client applications.

enable-database-encryption.png
Currently, encryption of database instances (Data-in-Rest Protection) is not available for those which are running SQL Server Express Edition.

Backup and Maintenance

Amazon RDS automatically backup our database instances. It creates a storage volume snapshot of our database instance, backing up the entire database instance and not just individual databases. We can setup and modify our preferred Backup Window from time to time. During the automatic backup window, storage I/O might be suspended briefly while the backup process initializes (typically under a few seconds). For SQL Server, I/O activity is suspended briefly during backup for Multi-AZ deployments.

By default, Amazon RDS has a 30-minute backup window randomly selected from an 8-hour block (Singapore region will be 14:00–22:00 UTC).

Periodically, Amazon RDS also automatically does maintenance work such as, updating the databse instance’s or database cluster’s OS. We can choose to manually apply maintenance, or wait for the automatic maintenance process initiated during our preferred maintenance window. There is one thing to take note is that the maintenance window determines when pending operations start, but does not limit the total execution time of these operations.

By default, Amazon RDS also has a 30-minute maintenance window randomly selected from an 8-hour block (Singapore region will be 14:00–22:00 UTC).

maintenance-window-collide-with-backup-window.png
We’re not allowed to make the maintenance window and the backup window overlap.

CloudWatch

Amazon RDS sends metrics to CloudWatch for each active database instance every minute. Detailed monitoring is enabled by default.

cloudwatch.png
Amazon RDS Metrics

When setting up the database instance, there is an option for us to specify whether to enable Enhanced Monitoring or not. Enhanced Monitoring is not exactly like CloudWatch. CloudWatch gathers metrics about CPU utilization from the hypervisor for a database instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.

enable-enhanced-monitoring.png
Enhanced monitoring requires permission to act on our behalf to send OS metric information to CloudWatch Logs.

Conclusion

It’s true that AWS allows us to deploy our MS SQL Server database on either Amazon RDS and Amazon EC2. However, it’s very crucial to analyze our needs and our application before deciding which one to use. In general, it is still recommended to consider Amazon RDS first so that developers can focus on high-level tasks and business logic implementation.

That’s all for my first trip to Amazon RDS. As a frequent user of Microsoft Azure, I never host MS SQL Server on AWS platform. So, if there is any mistake made in this article, kindly feedback to me. Thanks in advance!

Further Reading

Deploying Microsoft SQL Server on Amazon Web Services

500TB Storage for Database Backup

When your database is big, sometimes just its backup will be around 20GB in size already. Hence, keeping them on disk is always not a solution even though Microsoft Azure provides a data disk with 1TB.

Fortunately, Microsoft Azure offers a scalable and larger storage. It is called Microsoft Azure Storage, a storage with 500TB capacity limit. The good thing about it is we only need to pay for the amount we are using in the storage.

Hence, Chun Siong from Microsoft Singapore suggested my company to try out this service to store our database backups. It turns out that it can be easily done in just 3 steps.

Step 1: Create Azure Storage Account and Retrieve Access Keys

To create a new Azure Storage Account, I simply login to the Azure Management Portal and then choose the Quick Create option of the Storage under Data Services section. I am able to specify the affinity group and replication rule for the Storage Account.

Creating a Storage Account.
Creating a Storage Account.

After the Storage is created, I can retrieve access keys which will be used later in SQL Server to access the Storage Account.

Retrieve the access keys to the Storage Account.
Retrieve the access keys to the Storage Account.

Finally, I just need to create a Container in the Storage Account. All the database backup files will be put inside the Container later.

Created a container in the Storage Account.
Created a container in the Storage Account.

Step 2: Create SQL Server Credentials

I then execute the following T-SQL statement to create credentials so that SQL Server later can connect to the Storage Account.

CREATE CREDENTIAL mycredential 
WITH IDENTITY= 'chunlindbbackup', 
SECRET = '<storage account access key>'

The Storage Account access key here can be either Primary or Secondary access key retrieved in Step 1 above.

Step 3: Backup Database

I create a scheduled job in SQL Server Agent to do backup of my database daily. The URL is the URL of the container created in Step 1.

BACKUP DATABASE mydatabase 
TO URL = 'https://chunlindbbackup.blob.core.windows.net/dbbackup/mydatabase_' + REPLACE(CONVERT(VARCHAR ,GETDATE(),126) ,':','_')+ '.bak'
WITH CREDENTIAL = 'mycredential', INIT, NAME = 'Backup of Database mydatabase'

So yup, now the database backups will be stored on the Storage Account directly.

Restore Database Backup from Azure Storage

To restore a database backup from Storage Account, if the backup file is small, I can just simply execute the following T-SQL statements. The URL of the database backup can be found in the Container in Azure Management Portal.

RESTORE DATABASE mydatabase_test 
FROM URL = 'https://chunlindbbackup.blob.core.windows.net/dbbackup/mydatabase_2014-10-14T13_16_01.243.bak' 
WITH RECOVERY,
MOVE 'mydatabase_db_Data' TO 'F:\db\mydatabase_test_Data.mdf',
MOVE 'mydatabase_db_Log' TO 'F:\db\mydatabase_test_Log.ldf',
CREDENTIAL = 'mycredential'
GO

Unfortunately, the backup that I have is too big. So, I can only download it from Azure Management Portal to the database server first before restoring the database. The download is quite fast.

Download backup file from Storage Account
Download backup file from Storage Account

In case, you wonder why I do not use tool like Azure Storage Explorer, no, it did not work. It would crash also if the backup file was too big.

Pricing

Oh ya, just in case you would like to know the pricing of Azure Storage, you can check it out here: http://azure.microsoft.com/en-us/pricing/details/storage/.

Pricing of Azure Storage in Southeast Asia.
Pricing of Azure Storage in Southeast Asia.

Setting Up MS SQL Server on Azure Virtual Machine

MS SQL Server 2012 + Azure VM

So, now we have an ASP .NET web application running on Microsoft Azure. What we are going to do next is to host our MS SQL Server on the cloud also.

There are two options available in Microsoft Azure to host our SQL database. One is the well-known Azure SQL Database, an implementation of Platform as a Service for a relational database service in the cloud. The other one option is introduced after the new Infrastructure as a Service capabilities of Microsoft Azure. It is now possible to easily deploy instances of MS SQL Server in Azure Virtual Machine.

Azure SQL Database or SQL Server in Azure VM?

Personally, I prefer to directly deploy SQL Server in the virtual machine. At least the entire process looks about the same as what I have already done in our on-premise database server. So, having SQL Server deployed on Azure virtual machine actually means that the developers do not need to make huge changes to our existing applications. In addition, it’s also because migrating existing applications to the cloud normally needs to emulate on-premises behaviour. In short, choosing SQL Server in Azure virtual machine saves the time on migration.

The following is a nice decision diagram that I found on MSDN blog for us to choose which option to use. Also, there is a comparison summary between those two options, Azure SQL Database or SQL Server in Azure Virtual Machine.

To use Azure SQL Database or SQL Server in Azure VM?
To use Azure SQL Database or SQL Server in Azure VM? (Image Credit: MSDN Windows Azure Blog)

 

Creating the Virtual Machine with MS SQL Server Installed

There entire process of creating a virtual machine to host the MS SQL Server is similar to the creation of virtual machine for Windows Server. The only main difference is probably the part of choosing an appropriate image. There are a few editions of SQL Server 2012 for us to choose. You can find a comprehensive comparison among them on MSDN website, again.

Choose "SQL Server 2012" image to deploy MS SQL Server on the new virtual machine.
Choose “SQL Server 2012” image to deploy MS SQL Server on the new virtual machine.

The following table shows the pricing of each edition running on Azure VM as well as the disk sizes available. Here I only pay attention to the memory intensive instances, i.e. A5, A6, and A7. They have larger RAM and disk sizes for the virtual machine and they are thus considered optimal for hosting databases and other high-throughput application. The data shown in the table is applicable for virtual machines deployed in Asia Pacific Southeast, i.e. Singapore.

Asia Pacific Southeast (Singapore) VM pricing for each edition of SQL Server
Asia Pacific Southeast (Singapore) VM pricing for each edition of SQL Server (screenshot taken on 20 April 2014)

Connect to SQL Server Database Engine on Azure VM

After the virtual machine is up and running, we can immediately RDP in to the VM. Then in there, we just need to launch Microsoft SQL Server Management Studio to access the database with the Windows Authentication.

Running SQL Server Management Studio on the virtual machine.
Running SQL Server Management Studio on the virtual machine.

Open TCP Port 1433

SQL Server typically uses TCP port 1433 for remote connections to the database. So, we need to add an endpoint as well as to open the port in the virtual machine firewall for this. However, to avoid security attack, it’s recommended to specify a different Public Port when creating the endpoint in Azure.

1433: A TCP port normally used by MS SQL Server for remote connection to the database.
1433: A TCP port normally used by MS SQL Server for remote connection to the database.

SQL Server Authentication

We need to change the server authentication to “SQL Server and Windows Authentication mode”. This enables us to create logins in SQL Server which are not based on Windows user accounts. Both the login ID and passwords will be stored in the SQL Server. This allows SQL Server to continue supporting our third-party applications that require SQL Server Authentication. After that, we just right-click on the server in Microsoft SQL Server Management Studio Object Explorer to restart the server.

SQL Server and Windows Authentication Mode
SQL Server and Windows Authentication Mode

Connecting Application to the SQL Server

To connect your ASP .NET web application with the database, in web.config, you can just key in the server name, port number together with login ID and password in the following connection string that is used to connect the instance of the SQL Server running on Azure VM.

<add key=”strDBconn” value=”Data Source=****.cloudapp.net,<port-number>;Initial Catalog=<database name>;UID=<login ID>;PWD=<login password>” />

Conclusion

The steps taken to deploy a Microsoft SQL Server on Azure virtual machine are quite straight-forward. There is also an official detailed documentation about provisioning a SQL Server Azure Virtual Machine. I like one of its diagrams which shows the two main connection paths. The complete diagram is shown below.

SQL Server Azure VM Connection Paths
SQL Server Azure VM Connection Paths (Image Credit: Microsoft Azure Documentation Center)

In addition, there are some other online resources which has more detailed discussion on several topics, such as