MS SQL on AWS: Amazon RDS

amazon-rds-ms-sql-server

There are some startups and SMEs hosting their databases on AWS. However, most of them choose to use Amazon EC2 because doing so is similar to running a SQL Server on-premise at data centres. So, to them, it’s something that they are familiar with back in the old days. However, doing so actually increases their cost of hosting services on AWS. The companies also need to hire experts to do database administration such as database backup and recovery and OS patching.

Hence, if I’m given the opportunity, I usually recommend the small companies with limited resources to consider Amazon RDS (or Azure SQL) first. Amazon RDS is a fully managed service which provides cost-efficient and resizable capacity while automating time-consuming database administration tasks.

Multi-AZ Deployments for MS SQL Server

Starting from May 2014, Amazon RDS also provides a highly available database solution with the synchronous Multi-AZ replication for MS SQL. Multi-AZ deployments for MS SQL database instances use SQL Server Mirroring.

Currently, Amazon RDS only supports Standard Edition and Enterprise Edition of SQL Server 2008 R2, 2012, 2014, and 2016. Amazon RDS also does not support Multi-AZ with Mirroring for the following regions yet:

  • US West (N. California);
  • Asia Pacific (Singapore);
  • European Union (Frankfurt);
  • AWS GovCloud (US);
  • Asia Pacific (Sdyney): Supported for DB instances in VPCs only;
  • Asia Pacific (Tokyo): Supported for DB instances in VPCs only;
  • South America (São Paulo): Supported for all DB instance classes except m1/m2.

It’s quite unfortunate that Singapore Region is one of them.

use-multi-az-deployment-for-production-sql-server-se.png

In N. Virginia Region, we’re able to specify to use Multi-AZ Deployment in Production SQL Server SE.

DB Instance Class

We can specify the DB Instance Class that allocates the computational, network, and memory capacity required by planned workload of the database instance.

Standard (db.m4) instances offer a balance of compute, memory, and network resources, and are a good choice for many applications.

Memory Optimized (db.r3) instances are designed to deliver fast performance for workloads that process large data sets in memory. The instances are well suited for the applications, such as high performance relational databases, in-memory analytics, and enterprise applications (for example, Microsoft SharePoint).

Burst Capable (db.t2) instances are instances that provide baseline performance level with the ability to burst to full CPU usage.

Storage Types

Most of the Amazon RDS are using Amazon EBS (Elastic Block Store) volumes for database and log storage. There are currently two main Storage Types available when setting up MS SQL database instances, as listed below.

General Purpose (SSD) storage, aka gp2, offers cost-effective storage which is suitable for a broad range of database workloads. Hence, it’s ideal for small to medium-sized databases. It provides baseline of 3 IOPS/GB and ability to burst to 3,000 IOPS for extended periods of time. Its volume can range from 20GB to 4TB for MS SQL database instances. However, provisioning less than 100 GB of General Purpose (SSD) storage for high throughput workloads could result in higher latencies upon exhaustion of the initial General Purpose (SSD) I/O Credit balance.

Provisioned IOPS (SSD) storage, aka io1, is suitable for I/O intensive database workloads which pay attention to storage performance and consistency in random access I/O throughput. It provides flexibility to provision I/O ranging from 1,000 to 30,000 IOPS. MS SQL can have provisioned IOPS volumes between 100GB (Express/Web edition) or 200GB (Standard/Enterprise edition) and 4TB.

Allocated Storage and I/O Credits

General Purpose (SSD) storage performance is controlled by the volume size. Larger volumes have higher base performance levels and can accumulate I/O Credits faster. The more storage, the greater the base performance is and the faster it replenishes the credit balance.

For General Purpose (SSD) storage, the DB instance has an initial I/O Credits balance of 5.4 million. When the storage requires more than the base performance I/O level, it uses I/O credits in the credit balance to burst to the required performance level, up to a maximum of 3,000 IOPS. If the storage uses all of its I/O credit balance, its maximum performance will remain at the base performance level until I/O demand drops below the base level and unused credits are added to the I/O credit balance at the baseline performance rate of 3 IOPS/GB of volume size. Hence, we can use the formula below to calculate the Burst Duration.

burst-duration-formula.png

burst-duration-tabular.png

Thus, for production application that requires fast and consistent I/O performance, it’s recommended to use Provisioned IOPS (SSD) storage that is optimized for I/O intensive, online transaction processing workloads that have consistent performance requirements. Note that we cannot decrease storage allocated for a DB instance.

For MS SQL Server, Amazon RDS does not currently support increasing storage. Hence, we need to provision storage based on anticipated future storage growth. If we predict it wrongly, then we need to increase the storage of an existing SQL Server DB instance by first exporting the data, creating a new database instance with increased storage, and then importing the data into the new database instance.

Specifying Database Instance Specification

After understanding key concepts above, we can then proceed to setup our database instance.

specifying-db-instance-specifications.png

Although there is Free Tier available but allocating storage > 20GB or adding provisioned IOPS will disqualify the databse instance from being eligible for the Free Tier.

Network and Security: VPC (Virtual Private Cloud)

Amazon RDS database instances can be hosted on either EC2-VPC platform or the legacy EC2-Classic platform, the original platform used by Amazon RDS. Amazon VPC launches AWS resources, such as database instances, into a virtual private cloud.

Nowadays, if we are creating a database instance in a region that we have not used before, we normally are already on the EC2-VPC platform.

rds-supported-platforms.png

We are already on EC2-VPC platform.

There are many scenarios for accessing a database instance in a VPC. Today, I will only focus on having an EC2 web server to access the database instance in the same VPC.

web-server-and-db-instance-in-the-same-vpc.png

A database instance in a VPC accessed by an EC2 instance in the same VPC (Source: AWS Documentation)

In such scenario, Amazon RDS database instance normally needs to be available to the web server, and not to the public Internet. Hence, we can create a VPC with both public and private subnets. The web server will be hosted in the public subnet so that it is accessible by the public. The database instance is hosted in the private subnet so that it won’t be available to the public Internet, providing greater security.

The Security Group used to restrict access to the database instances can have a custom rule that allows TCP access using the port 1433 and an IP address we will use to access the database instance for development or other purposes. In addition, we also need to set the Public Accessible option to Yes first (It is recommended to set the option to No for production database instance to limit the potential thread with no public routes).

Encryption of Database Instances using Key Management Service (KMS)

Amazon RDS for MS SQL supports the encryption of database instances with encryption keys managed in AWS KMS. Once the data is encrypted, Amazon RDS handles authentication of access and decryption of the data transparently without having the need to change our database client applications.

enable-database-encryption.png

Currently, encryption of database instances (Data-in-Rest Protection) is not available for those which are running SQL Server Express Edition.

Backup and Maintenance

Amazon RDS automatically backup our database instances. It creates a storage volume snapshot of our database instance, backing up the entire database instance and not just individual databases. We can setup and modify our preferred Backup Window from time to time. During the automatic backup window, storage I/O might be suspended briefly while the backup process initializes (typically under a few seconds). For SQL Server, I/O activity is suspended briefly during backup for Multi-AZ deployments.

By default, Amazon RDS has a 30-minute backup window randomly selected from an 8-hour block (Singapore region will be 14:00–22:00 UTC).

Periodically, Amazon RDS also automatically does maintenance work such as, updating the databse instance’s or database cluster’s OS. We can choose to manually apply maintenance, or wait for the automatic maintenance process initiated during our preferred maintenance window. There is one thing to take note is that the maintenance window determines when pending operations start, but does not limit the total execution time of these operations.

By default, Amazon RDS also has a 30-minute maintenance window randomly selected from an 8-hour block (Singapore region will be 14:00–22:00 UTC).

maintenance-window-collide-with-backup-window.png

We’re not allowed to make the maintenance window and the backup window overlap.

CloudWatch

Amazon RDS sends metrics to CloudWatch for each active database instance every minute. Detailed monitoring is enabled by default.

cloudwatch.png

Amazon RDS Metrics

When setting up the database instance, there is an option for us to specify whether to enable Enhanced Monitoring or not. Enhanced Monitoring is not exactly like CloudWatch. CloudWatch gathers metrics about CPU utilization from the hypervisor for a database instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.

enable-enhanced-monitoring.png

Enhanced monitoring requires permission to act on our behalf to send OS metric information to CloudWatch Logs.

Conclusion

It’s true that AWS allows us to deploy our MS SQL Server database on either Amazon RDS and Amazon EC2. However, it’s very crucial to analyze our needs and our application before deciding which one to use. In general, it is still recommended to consider Amazon RDS first so that developers can focus on high-level tasks and business logic implementation.

That’s all for my first trip to Amazon RDS. As a frequent user of Microsoft Azure, I never host MS SQL Server on AWS platform. So, if there is any mistake made in this article, kindly feedback to me. Thanks in advance!

Further Reading

Deploying Microsoft SQL Server on Amazon Web Services

Advertisements

AWSome Day – Learning AWS from Experts and IAM

AWS + IAM

It’s fortunate to work in a company which encourages employees to attend courses, workshops, and training to expand their skill set. Last month, when I told my boss about AWSome Day, a training event hold by AWS expert technical instructors, my boss immediately gave me one day leave (without deducting my annual leave) to attend the event. In addition, I’m glad to have awesome teammates who helped me to handle my work on that day so that I could concentrate during the event. Thus, I would like to write a series of blog posts to share about what I’ve learnt in AWSome Day.

Amazon AWSome Day

This is the second time the AWSome Day was organized in Singapore. Based on last year AWS Summit attendees, a lot of them were looking for more professional training from AWS, and thus AWSome Day once again came to Singapore. This year, the event is at Raffles City Convention Centre, which is just a 5-minute walk from my office. Oh my tian, that is so convenient!

AWSome Day, Awesome Place - Raffles City Convention Centre

AWSome Day, Awesome Place – Raffles City Convention Centre

The registration started at 8am. After that, Richard Harshman, the Head of AWS ASEAN, gave an opening keynote. He shared with us how AWS had removed barrier of entry to start a business online and to increase innovation. My friend who worked in MNC once told me that he was given access to powerful servers to do crazy stuff. I am not as lucky as him. I am working in a startup which does not have sufficient financial capability for that. Hence, I agreed with Richard that AWS (and other cloud computing services as well) does reduce the cost of innovation and experimentation.

Richard also shared with us a story how with the help of AWS, some startup in Malaysia managed to get a few million of visits monthly without an in-house system admin. Yup, our company also does not have a sysadmin. Normally, the work of sysadmin is done by the developers. Hence, we are always looking for a way to reduce the time used on sysadmin tasks so that developers have more time to focus on improving the applications to serve our customers better. So, cloud computing infrastructure with board and deep services to support online workload helps high volume and low margin businesses like ours.

Currently, our company is using both AWS and Microsoft Azure. So, when Richard shared a graph how both AWS and Microsoft are now leaders in cloud computing service, I was glad that we made a right choice to use services from both of them.

After the opening keynote, we had a short coffee break and then we began the 6-hour journey of AWS training which was done by Denny Daniel, Technical Trainer at AWS. Since the training covers many interesting topics, I will not blog all of them here because most of the readers will just tl;dr. I will only write what I learnt and I found useful in my career. So, if you are interested in the event, why not join the future training offered by AWS Singapore? =)

Episode 01: Who am I? I am, I am… I am Identity and Access Management (IAM)!

One of the main concerns about hosting our applications on clouds is security. One of the security tools provided by AWS is called Identity and Access Management, or IAM. It enables the system admin to manage users and their access rights in AWS. Hence, in AWS, each user accessing AWS will have their own security credentials and individual permissions to each AWS service and resource.

Create User

Create User

After users have been created, we will be given a one-time opportunity to download and keep the user security credentials (Access Key ID and Secret Access Key). Since the keys are displayed only for one time, once the secret key is lost, we must delete the access key and then create a new key.

IAM is secured by default. It means that, by default, IAM users do not have permission to create or modify Amazon EC2 resources. Hence, an IAM policy, which is just a JSON document specifying the rules, is needed.

Besides creating users, we are able to create groups. Thus, instead of assigning each similar user a same set of access control policies, we can also assign the users to a group and then bind the access control policies to the group. This undoubtedly eases the user management. In addition, AWS even allows us to customize the permissions based on a given template!

There are many, many permission templates available when creating a user group.

There are many, many permission templates available when creating a user group.

Another thing that I find interesting is how IAM works with tags.

In order to  manage Amazon EC2 resources effectively, we can now tag the resources ourselves with a combination of a key and a value. For example, we can tag our instances in EC2 by owner. So, we can have one instance tagged with “Environment=Production” and another instance tagged with “Environment=Test”. After that, we then can grant IAM user permission to the instances by using the tag with condition key ec2:ResourceTag/Environment.

Finally, in the event, Denny also shared with us a YouTube video about the best practices of using IAM. I am not sure if I got the one he was referring to. Anyway, the following video is what I found on YouTube.

The video is a bit long. So for those who say tl;dw, I summarize the 10 tips below.

  1. Create individual users. Do not just use root credential. Do not have one user account where everybody in the team uses to do everything;
  2. Manage permissions with groups so that only one change needed to update permissions for multiple users. Even now you only have one user in the team, it’s encourage to create a group for that user because at some point there will be new users who are going to need the same permissions;
  3. Grant leas privilege. Only grant the permissions that are required by the users to do their jobs. Less chance of people making mistakes. Avoid assigning asterisk (*) policy for permissions which means full access unless the account is for admin;
  4. Use a policy to force users having a strong password;

    Password Policy

    Password Policy

  5. Enable Multi-Factor Authentication (MFA) for privileged users;

    Enable MFA.

    Enable MFA.

  6. Use IAM roles for Amazon EC2 instances;
  7. Use IAM roles to share access without the need to share security credentials;
  8. Rotate security credentials regularly. Access keys need to be rotated. Make sure the old access keys have been deleted after the rotation;
  9. Restrict privileged access further with conditions. There are 2 types of conditions. One is AWS common condition, such as date, time, MFA, secure transport (allow traffic coming over SSL only), source of IP, etc. Another one is service-specified condition. Some services provide hundreds of conditions that we can control;
  10. Reduce or remove the use of root account.
"What? You are always using root credential?" The best practice of all: Don't use root access.

“What? You are always using root credential?” The best practice of all: Don’t use root access. (Image Credit: Is the Order a Rabbit?)

Next Episode

There are many topics about AWS covered during the event. IAM is just a small part of it. However, with just IAM alone, I already feel that there are too many areas in IAM waiting for me to discover. Hence, I will continue to write more about what I’ve learned in the future blog posts.

Also, due to the fact that I am new to AWS, if you spot anything wrong in my posts, feel free to correct me in the comment section below. =)