How Aurora Serverless made the database server obsolete • The Register
Sponsored Feature Application development has transformed in recent years, with software designed for the cloud serving highly volatile user workloads at scale across the globe. Serverless computing has evolved to support these applications, completely eliminating traditional performance and capacity management issues.
The venerable relational database model, which still supports millions of applications around the world today, must keep pace with advances in serverless computing. That’s why Amazon created a serverless version of its Aurora cloud-native relational database system to support changing customer demands.
“More and more customers want to build applications that matter to their end users instead of focusing on managing database infrastructure and capacity,” said Chayan Biswas, Senior Technical Product Manager at Amazon Web Services. “Serverless computing is a very easy way to do that.”
A History of Serverless Computing
Amazon first introduced its concept of Lambda serverless computing in 2014. It was a natural evolution of the virtualization trend that preceded it, which eliminated the need to run every application on a separate physical server by separating the operating systems from the hardware.
Virtual machines are much more efficient than dedicated hardware servers, compressing the compute footprint of applications. But many apps don’t run all the time, only needing to run in response to other events. This is especially true when breaking down monolithic applications into container-based services.
Lambda uses Amazon’s Firecracker microVM framework under the hood. It lets developers invoke a function and retrieve the result without provisioning or managing a server to run it. The underlying framework takes care of the rest.
This offers at least two advantages. The first is that not running a dedicated virtual machine or container for the function reduces the running cost. The second is that the underlying container-based infrastructure can rapidly scale the capacity of the function, maintaining performance even as the volume of events calling it increases.
The birth of the serverless database
Lambda supports cloud-based applications, but customers wanted the next logical step: support for serverless databases. AWS released a serverless version of Aurora MySQL in 2018, followed by a version for PostgreSQL in 2019.
Amazon Aurora Serverless brings those cost and performance benefits to the database, allowing customers to rapidly scale their relational data workloads without disruption. They also pay only for the database resources they use, which is especially useful for applications that don’t call the database frequently, like a low-volume blog or development and test databases.
Beyond that, serverless database operations free DBAs from having to provision and manage database capacity. That job is part of what Amazon calls “undifferentiated heavy lifting”: mundane work that doesn’t make full use of a DBA’s skills. Automating it with serverless lets DBAs focus on more important tasks such as database optimization and data governance.
Before Aurora Serverless, customers had to scale their databases by manually changing the type of virtual machine the system ran on. This created additional management overhead and also halted the database for 30 seconds, which was unacceptable to many users.
Instead, customers routinely provisioned their databases for peak workloads. This was expensive and wasteful, forcing them to pay for large virtual machines that sat partially idle for long periods.
Aurora Serverless Operation
Amazon Aurora Serverless v1 changed everything by allowing customers to resize their VMs without disrupting the database. It would look for gaps in the transaction streams that would give it time to resize the VM. It would then freeze the database, switch to another virtual machine behind the scenes, and then restart the database.
It was a great start, says Biswas, but finding gaps between transactions isn’t always easy. “When we have a very talkative database, we run a bunch of concurrent, overlapping transactions,” he explains. If there is no gap between them, he adds, the service cannot find a point at which it can scale.
As a result, the scaling process could take between five and 50 seconds. It could sometimes end up disrupting the database if a suitable gap between transactions could not be found. This limited Aurora Serverless v1 to sporadic and infrequent workloads.
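The gap-hunting described above can be pictured with a toy model. The Python sketch below is not how Aurora detects quiet points; it simply assumes each transaction is a hypothetical (start, end) pair in seconds and looks for the first idle window long enough to resize in. The function name and the `min_gap` parameter are illustrative.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # hypothetical (start, end) of one transaction, in seconds


def find_scaling_point(transactions: List[Interval], min_gap: float) -> Optional[float]:
    """Return the start of the first idle window of at least min_gap seconds
    between observed transactions, or None if the transactions leave no such gap."""
    if not transactions:
        return 0.0  # nothing running, so scaling can start immediately
    ordered = sorted(transactions)
    busy_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start - busy_until >= min_gap:
            return busy_until  # the database is idle from here until `start`
        busy_until = max(busy_until, end)  # overlapping work extends the busy period
    return None


quiet = [(0.0, 1.0), (3.0, 4.0)]               # two transactions with a pause between them
chatty = [(0.0, 2.0), (1.0, 3.0), (2.5, 5.0)]  # overlapping transactions, no pause

print(find_scaling_point(quiet, 1.0))   # 1.0
print(find_scaling_point(chatty, 1.0))  # None
```

The chatty workload never yields a usable window, which is the situation Biswas describes.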
“One of the comments we heard from customers was that they wanted us to scale Aurora Serverless Databases for their most demanding and mission-critical workloads,” Biswas explained. This included those with strict service level agreements and high availability needs.
Improved serverless database services
With that in mind, version two of Aurora Serverless brings significant improvements, including a new approach that allows it to scale to thousands of transactions in seconds. AWS achieved this by providing the database process with more resources, overprovisioning them under the hood. This eliminates the need to find a gap in database traffic, as the serverless process does not move between different virtual machines to scale.
This may seem like a losing proposition for Amazon, as the company has to absorb the cost of this overprovisioning. AWS, however, is used to finding new internal efficiencies through its economies of scale. To improve scalability in Aurora Serverless v2, it has made workload placement smarter.
AWS can now place workloads with complementary profiles on the same machine. A reporting workload that runs at night can run on the same virtual machine as a line-of-business application that runs during the day, for example. This is one of the benefits of the cloud’s multi-tenant operating model.
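A rough way to see why complementary profiles help is to compare the peak of the combined load with the sum of the individual peaks. The sketch below uses made-up hourly load figures in arbitrary units; the numbers and names are assumptions for illustration, not AWS data or AWS’s placement algorithm.

```python
from typing import List


def combined_peak(*profiles: List[float]) -> float:
    """Peak of the hour-by-hour sum of several equally long load profiles."""
    return max(sum(hour) for hour in zip(*profiles))


# Hypothetical 24-hour load profiles (index = hour of day, arbitrary capacity units).
daytime_app = [1.0] * 8 + [8.0] * 10 + [1.0] * 6     # busy from 08:00 to 18:00
nightly_report = [6.0] * 6 + [0.5] * 16 + [6.0] * 2  # busy from 22:00 to 06:00

separate = max(daytime_app) + max(nightly_report)  # capacity if each gets its own machine
shared = combined_peak(daytime_app, nightly_report)  # capacity if they share one machine

print(separate, shared)  # 14.0 8.5
```

Because the two workloads peak at different times, the shared machine in this example needs far less capacity than two machines sized for each peak.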
Serverless v2 also scales in finer increments. V1 customers could only double their provisioned amounts of the database compute unit, known as the Aurora Capacity Unit (ACU), when usage exceeded a set threshold. Aurora Serverless v2 allows increases in 0.5 ACU increments.
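The difference between doubling and 0.5 ACU steps is easy to see with a little arithmetic. The sketch below is a simplified model rather than AWS’s scaling logic: it ignores thresholds and minimum and maximum capacity settings, and the function names and the 1 ACU starting point are assumptions.

```python
import math


def v1_capacity(demand_acu: float, start_acu: float = 1.0) -> float:
    """Smallest capacity reachable by repeated doubling that covers the demand."""
    capacity = start_acu
    while capacity < demand_acu:
        capacity *= 2
    return capacity


def v2_capacity(demand_acu: float, step: float = 0.5) -> float:
    """Smallest multiple of `step` that covers the demand (at least one step)."""
    return max(step, math.ceil(demand_acu / step) * step)


demand = 9.2  # hypothetical demand, in ACUs
print(v1_capacity(demand))  # 16.0
print(v2_capacity(demand))  # 9.5
```

In this example the doubling scheme provisions 16 ACUs to serve a 9.2 ACU demand, while 0.5 ACU steps land on 9.5.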
There are also improvements in other areas, including availability. Although high availability for storage is standard, Aurora Serverless v1 does not provide high availability for compute. V2 offers configuration across multiple Availability Zones. It will also support read replicas on these instances for faster reads, as well as Aurora Global Database support for read-only workloads. This means faster data replication between regions and sub-minute failover times for increased reliability in an Aurora Serverless v2 environment.
Amazon also introduced technology that reconciles a fundamental difference between the serverless operating model and the principles of relational databases.
DynamoDB, Amazon’s managed NoSQL key-value database, is already serverless due to its underlying architecture. You can easily introduce autoscaling rules directly in the web interface when configuring DynamoDB tables.
Things are different with Aurora because of how relational databases make connections, Biswas says.
“In a serverless environment, a Lambda function runs, and then it’s done,” he points out. “Relational databases tend to be persistent.”
Relational databases are typically stateful, keeping a single connection to an application open over time so they don’t have to waste time re-establishing it each time the application makes a request. Serverless computing is a stateless concept that opens and closes connections as needed.
Applications using modern container architectures are designed to scale quickly. If every container-based function opens a connection to a database, the relational engine will spend all of its time managing connections rather than answering queries.
At AWS re:Invent 2019, Amazon launched RDS Proxy, a service to solve the connection problem. The service, which entered general availability in June 2020, sits between the application and the database and pools connections. Instead of bombarding the database server, container-based applications connect to the proxy, which hands out connections from the pool. It supports serverless Lambda functions, Kubernetes containers, or any other stateless application that does not natively support connection pooling.
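The idea behind pooling can be shown in a few lines of Python. This is an in-process sketch of the general technique, not RDS Proxy itself, which is a managed network service; `FakeConnection`, `ConnectionPool` and `handle_request` are hypothetical names, and the fake connection only counts how many times a connection is opened.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection; counts how many get opened."""
    opened = 0

    def __init__(self) -> None:
        FakeConnection.opened += 1

    def query(self, sql: str) -> str:
        return f"result of {sql}"


class ConnectionPool:
    """Opens a fixed number of connections up front and lends them out."""

    def __init__(self, size: int) -> None:
        self._idle: "queue.Queue[FakeConnection]" = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a pooled connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


pool = ConnectionPool(size=5)


def handle_request(n: int) -> str:
    """Models one short-lived, stateless invocation that needs the database."""
    with pool.connection() as conn:
        return conn.query(f"SELECT {n}")


results = [handle_request(n) for n in range(100)]
print(len(results), FakeConnection.opened)  # 100 5
```

A hundred requests are served, but the database only ever sees the five connections the pool opened.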
AWS not only supports efficient access to Aurora Serverless from AWS Lambda functions; it supports the reverse. Lambda integration allows clients to invoke serverless functions from the database. This allows developers to write business logic in a Lambda function, which supports different languages, rather than writing stored procedures in a procedural dialect of SQL.
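As a sketch of what such business logic might look like, here is a minimal Python Lambda handler. The `handler(event, context)` signature is the standard shape of a Python Lambda entry point; the discount rule and the `order_total` and `customer_tier` fields are invented for illustration, and the database-side call that would invoke the function is not shown.

```python
import json


def handler(event, context):
    """Lambda entry point; `event` is the payload sent by the caller."""
    # Hypothetical business rule that might otherwise live in a stored procedure.
    order_total = float(event["order_total"])
    tier = event.get("customer_tier", "standard")
    rate = {"standard": 0.0, "silver": 0.05, "gold": 0.10}.get(tier, 0.0)
    return {"tier": tier, "discount": round(order_total * rate, 2)}


# Local call with a sample payload; in AWS the service supplies event and context.
print(json.dumps(handler({"order_total": 200, "customer_tier": "gold"}, None)))
```

Run locally, the sample call returns a discount of 20.0 for the gold tier.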
Lambda integration does more than give developers more flexibility. It also places computing power outside of the database, allowing it to focus on queries rather than hampering its performance by running embedded application logic. Finally, it simplifies application workflows. For example, a developer can request Aurora to invoke a machine learning model directly as a Lambda function rather than coding that request into their application.
Amazon continues to move forward with its serverless database applications. Amazon Aurora Serverless v2 with PostgreSQL and MySQL compatibility GA’d last week at the AWS Summit in San Francisco. “We will basically support all Aurora features with Aurora Serverless v2,” Biswas concludes. Soon, for many customers, the concept of a database server may become an anachronism.
This article is sponsored by AWS.