Wednesday, May 19, 2021

System design: Amazon DynamoDB | Server-side TTL | My 20-minute study


Here is the article. 

AWS News Blog

New – Manage DynamoDB Items Using Time to Live (TTL)

AWS customers are making great use of Amazon DynamoDB. They love the speed and flexibility and build Ad Tech (reference architecture), Gaming (reference architecture), IoT (reference architecture), and other applications that take advantage of the consistent, single-digit millisecond latency. They also love the fact that DynamoDB is a managed, serverless database that scales to handle millions of requests per second to tables that are many terabytes in size.

Many DynamoDB users store data that has a limited useful life or is accessed less frequently over time. Some of them track recent logins, trial subscriptions, or application metrics. Others store data that is subject to regulatory or contractual limitations on how long it can be stored. Until now, these customers implemented their own time-based data management. At scale, this sometimes meant that they ran a couple of Amazon Elastic Compute Cloud (EC2) instances that did nothing more than scan DynamoDB items, check date attributes, and issue delete requests for items that were no longer needed. This added cost and complexity to their application.

New Time to Live (TTL) Management
In order to streamline this popular and important use case, we are launching a new Time to Live (TTL) feature today. You can enable this feature on a table-by-table basis, specifying an item attribute that contains the expiration time for the item.

Once the attribute has been specified and TTL management has been enabled (a single API call takes care of both operations), DynamoDB will find and delete items that have expired. This processing takes place automatically and in the background and does not affect read or write traffic to the table.

You can use DynamoDB streams (see DynamoDB Update – Triggers (Streams + Lambda) + Cross-Region Replication App for more info) to process or archive the actual deletions. Like other update records in a stream, the deletions are available on a rolling 24-hour basis. You can move the expired items to cold storage, log them, or update other tables using AWS Lambda and DynamoDB Triggers.
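As a rough illustration of the archiving idea above, here is a minimal sketch of a Lambda-style handler that picks TTL deletions out of a batch of stream records. It relies on the documented convention that TTL deletions arrive as REMOVE events whose userIdentity has type "Service" and principalId "dynamodb.amazonaws.com". The archive function is a hypothetical placeholder, and the sketch assumes the stream's view type includes old images, since OldImage is otherwise absent.

```python
def is_ttl_deletion(record):
    """True if a stream record describes a deletion performed by TTL."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def archive(old_image):
    """Hypothetical sink; swap in a write to cold storage or a log."""
    print("archiving", old_image)


def handler(event, context=None):
    """Archive every TTL-deleted item in the batch; ignore other records."""
    archived = 0
    for record in event.get("Records", []):
        if not is_ttl_deletion(record):
            continue  # inserts, updates, and user-issued deletes
        # OldImage is only present when the stream captures old images.
        old_image = record.get("dynamodb", {}).get("OldImage")
        if old_image is not None:
            archive(old_image)
            archived += 1
    return {"archived": archived}
```

Deletes issued by your own application carry no such userIdentity, so they pass through this handler untouched.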

Here’s how you enable TTL for a table and specify the desired attribute:

[Screenshot: enabling TTL on a table in the DynamoDB console]

The attribute must be in DynamoDB’s Number data type, and is interpreted as seconds per the Unix Epoch time system.
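To make the format concrete, here is a small sketch of how an application might compute the TTL value before writing an item. The attribute name expires_at, the session_id key, and the helper names are illustrative assumptions, not part of the article. The point is only that the stored value is whole seconds since the Unix epoch, not milliseconds and not a date string.

```python
import time
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def expires_in(days, now=None):
    """Return an epoch timestamp (whole seconds) `days` after `now`."""
    if now is None:
        now = int(time.time())
    return int(now) + days * SECONDS_PER_DAY


def expires_at(moment):
    """Convert a timezone-aware datetime to whole epoch seconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return int(moment.timestamp())


# Hypothetical item: a login session that should expire after 30 days.
item = {
    "session_id": "abc123",
    "expires_at": expires_in(30, now=1_600_000_000),
}
```

A value accidentally written in milliseconds would land tens of thousands of years in the future, so the item would effectively never expire.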

As you can see from the screenshot above, you can also enable DynamoDB Streams, and you can look at a preview of the items that will be deleted when you enable TTL.

You can also call the UpdateTimeToLive API operation from your code, or you can use the update-time-to-live command from the AWS Command Line Interface (CLI).
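As a sketch of the programmatic route, the function below wraps the UpdateTimeToLive call as exposed by boto3's low-level DynamoDB client (update_time_to_live). The table name Sessions and attribute name expires_at in the comment are made-up examples. The client is passed in so the function can be tried without AWS credentials.

```python
def enable_ttl(client, table_name, attribute_name):
    """Enable TTL on `table_name`, using `attribute_name` as the expiry attribute.

    `client` is expected to behave like boto3.client("dynamodb").
    """
    return client.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification={
            "Enabled": True,
            "AttributeName": attribute_name,
        },
    )


# Typical use, assuming boto3 is installed and credentials are configured:
#   import boto3
#   enable_ttl(boto3.client("dynamodb"), "Sessions", "expires_at")
```

The CLI equivalent for the same hypothetical table would be: aws dynamodb update-time-to-live --table-name Sessions --time-to-live-specification "Enabled=true, AttributeName=expires_at"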

TTL at TUNE
AWS customer TUNE is already making good use of this feature as part of their HasOffers product.


HasOffers helps customers to analyze the effectiveness of their marketing campaigns, storing massive amounts of ad engagement data in the process. Once the customer-defined time window for the campaign has passed, the data is no longer needed and can be deleted. Before we made the TTL feature available to TUNE, they manually identified and then deleted the stale data. This was labor- and compute-intensive, and also consumed some of the provisioned throughput for the table.

Now, they simply set an expiration time for each item and leave the rest to DynamoDB. The stale data disappears automatically, with no impact on the available throughput. As a result, TUNE has been able to purge 85 terabytes of stale data and has reduced their costs by over $200K per year, while also simplifying their application logic.

Things to Know
Here are a few things to keep in mind as you are thinking about putting TTL to use in your application.

TTL Attribute – The TTL attribute can be indexed or projected, but it cannot be an element of a JSON document. As I indicated earlier, it must have the Number data type. You can use IAM to regulate access to this attribute, just as you can for any other attribute. Items that do not have the designated TTL attribute will not be considered for deletion. To guard against accidental deletion caused by a malformed TTL value, items whose TTL value is more than 5 years in the past will not be deleted.

Tables – You can apply a TTL to a new or an existing table. The process of enabling TTL for a table can take up to an hour, and you can only make one change per table at a time.

Background Processing – The scans and the deletions take place in the background and do not count against the provisioned throughput. Deletion times will vary based on the number and nature of the expired items. After the expiration but before the actual deletion, the items remain in the table and will appear in reads and scans.

Indexes – Items are removed from any Local Secondary Indexes immediately, and from Global Secondary Indexes in the usual eventually consistent fashion.

Pricing – There is no charge for the internal scan operation or for the deletion. You will pay for storage until the item is actually deleted.
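The TTL Attribute and Background Processing notes above can be modeled in a few lines. This is only a sketch of the rules as described in this post, not DynamoDB's actual implementation. It assumes items in the low-level attribute-value format (for example {"N": "1600000000"}), approximates the 5-year cutoff as 5 x 365 days, and uses hypothetical function names. Because expired items stay readable until the background process removes them, live_items shows one way a reader could hide them in the meantime.

```python
import time

# Assumption: the exact cutoff DynamoDB uses is not given here; 5 * 365 days is an approximation.
FIVE_YEARS_SECONDS = 5 * 365 * 86_400


def ttl_eligible(item, attribute_name, now):
    """True if the rules above would let TTL delete `item` at time `now`."""
    value = item.get(attribute_name)
    if not isinstance(value, dict) or "N" not in value:
        return False  # attribute missing, or not of the Number type
    try:
        expiry = float(value["N"])
    except (TypeError, ValueError):
        return False  # malformed number
    if expiry > now:
        return False  # not expired yet
    # Values far in the past are treated as malformed and left alone.
    return now - expiry <= FIVE_YEARS_SECONDS


def live_items(items, attribute_name, now=None):
    """Drop items that have expired but may not have been deleted yet."""
    if now is None:
        now = int(time.time())
    return [item for item in items if not ttl_eligible(item, attribute_name, now)]
```

Filtering on the client like this still reads the expired items from the table; a filter expression on the same attribute is an alternative that trims them from the response on the server side.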

Available Now
This feature is available now and you can start using it today! To learn more, read about Time to Live in the DynamoDB Developer Guide.

— Jeff;

Modified 2/10/2021 – In an effort to ensure a great experience, expired links in this post have been updated or removed from the original post.
