Introduction
AWS has two native backup solutions, Data Lifecycle Manager and AWS Backup. Both of these, like a lot of AWS services, started life as a Minimum Viable Product. When first deployed, they were better than nothing, but that was about it. If you wanted a decent backup, you needed to look to a third party. I previously wrote a blog post comparing them to another backup solution, and the comparison wasn’t favourable (LINK). So, how good are they now? Let me take you on a little trip down memory lane first and look at where they started and what has been added over the last two years.
Data Lifecycle Manager
Data Lifecycle Manager, or DLM, was the first of AWS’ backup solutions. It launched in July 2018 as a simple method to automate EBS snapshots. It performed this somewhat limited task quite well. The biggest issue was the narrow choice of schedules: every 12 or 24 hours.
November saw the only other DLM updates of 2018:
- The ability for DLM to copy EBS volume tags to EBS snapshots
- CloudFormation support for DLM policies
2019 Updates
February saw the first DLM update of 2019, a series of shorter schedules. Snapshots could now be scheduled every 2, 3, 4, 6 or 8 hours, along with the previous 12 and 24 hours. The extra intervals were a big win for shorter Recovery Point Objectives (RPO).
Up to now, DLM performed snapshots on a volume-by-volume basis. If you had multiple EBS volumes attached to your instance, you needed to tag each one, and there was no consistency between those snapshots. In May, AWS introduced instance-based DLM snapshots. Now, you could just tag an instance, and all the associated volumes were included. Even better, these would all be snapped at the same time, giving a crash-consistent instance snapshot.
There was a bit of a lull, then November added a couple of updates. Policies could now be tagged, and time-based retention was added. Previously, retention was a count of versions. For example, if you had a DLM backup running every 12 hours and wanted to retain the snapshots for one week, retention would need to be set at 14. That would be 2 backups/day x 7 days. Now you just needed to set it to 1 week or 7 days. Intervals for time-based retention were days, weeks, months and years.
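The count-based arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation; the function name `count_retention` is made up for this post and is not part of any AWS SDK.

```python
def count_retention(interval_hours: int, retain_days: int) -> int:
    """Return how many snapshots a count-based retention rule must keep
    so that the oldest retained snapshot is roughly retain_days old."""
    if interval_hours <= 0 or 24 % interval_hours != 0:
        raise ValueError("interval_hours must divide evenly into 24")
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")
    snapshots_per_day = 24 // interval_hours
    return snapshots_per_day * retain_days


# The example from the text: a 12-hour schedule kept for one week.
print(count_retention(12, 7))  # 14
# A 2-hour schedule kept for one week needs a much larger count.
print(count_retention(2, 7))   # 84
```

With time-based retention, none of this is needed: the count no longer has to be recalculated whenever the schedule interval changes.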
The final update for 2019 was cross-region DR. Once snapshots completed, they could be automatically copied to one or more (up to 3) regions. Each copy has its own retention period.
On the whole, 2019 saw a significant improvement for DLM. It went from usable to something that could nicely protect your EBS-based instances. It still had one big hole: DLM created snapshots well, but there was no recovery option. Recovery was always a manual, or at least outside DLM, process.
2020 Updates
2020 saw more improvements to DLM. Nothing too major that year, but some nice additions. The first update was in March with the addition of a 1-hour backup interval. While DLM still had fixed intervals, there were now enough options to suit most requirements. May turned that on its head with the addition of support for cron expressions. Further, the May update also added support for weekly, monthly and annual schedules.
In September, the scheduling got another boost with the ability to have multiple schedules within a single DLM policy. Multiple schedules were a welcome addition to simplify the management of long-term retention. You can now have your daily and monthly schedules in the one policy, making it easier to manage an instance or volume’s full retention.
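To make the idea concrete, here is a rough sketch of what a single policy with a daily and a monthly schedule might look like as a Python dictionary. The field names follow the DLM `PolicyDetails` structure as I understand it, so check them against the current API reference before relying on them. The tag key and value, schedule names, times and retention periods are all made-up examples, and `build_policy_details` is a hypothetical helper, not an AWS function.

```python
def build_policy_details(tag_key: str, tag_value: str) -> dict:
    """Build a DLM-style policy body with a daily and a monthly schedule.

    Assumption: key names mirror the DLM PolicyDetails structure; all
    values here are illustrative only.
    """
    daily = {
        "Name": "daily",  # hypothetical schedule name
        "CopyTags": True,  # copy tags from the source to the snapshots
        "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
        "RetainRule": {"Interval": 7, "IntervalUnit": "DAYS"},  # time-based retention
    }
    monthly = {
        "Name": "monthly",  # hypothetical schedule name
        "CopyTags": True,
        # Cron-based schedule: 03:00 UTC on the first day of every month.
        "CreateRule": {"CronExpression": "cron(0 3 1 * ? *)"},
        "RetainRule": {"Interval": 12, "IntervalUnit": "MONTHS"},
    }
    return {
        "ResourceTypes": ["INSTANCE"],  # instance-based, multi-volume snapshots
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [daily, monthly],
    }


details = build_policy_details("Backup", "true")
for schedule in details["Schedules"]:
    print(schedule["Name"], schedule["RetainRule"])
```

A dictionary like this would typically be passed as the `PolicyDetails` argument when creating the policy through an SDK, along with an execution role and description, neither of which is shown here.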
November saw an excellent addition in the implementation of AMI creation and management within DLM. While AMIs aren’t needed for Linux-based instances, they do make a big difference for recovery of Windows-based instances. Adding the option to automate the creation and management of these AMIs and associated snapshots is a positive step for recovery.
The final improvement, in December, and the last one covered in this post, was the addition of cross-account copies. While December 2019 implemented cross-region copy, a cross-account copy is often a requirement for DR purposes. The cross-account copy is further improved with the option to encrypt the copied snapshot with a different Customer Master Key (CMK).
Overall, 2020 mostly gave us improvements in scheduling. While not a drastic change to DLM, these additions were some welcome polish to make it shine.
AWS Backup
While DLM launched in 2018 intending to automate EBS volume snapshots, there were several resources that DLM did not protect. In January 2019, AWS introduced AWS Backup to a limited number of regions. AWS Backup provided a central management platform for EBS volumes, RDS databases, DynamoDB tables, Storage Gateway volumes and EFS file systems. While most of these already had their own, somewhat limited, backup solutions, AWS Backup brought them all into a central console. The killer feature in this bundle was EFS backups. Previously, taking a backup of an EFS file system amounted to making a copy and involved other AWS services. It was expensive and clunky. Now, a couple of clicks could back up EFS. EFS backups were, and still are, huge!
There were not a lot of significant advances for AWS Backup in 2019. April, June and October saw expansion to more regions. May added CloudFormation support, and July included the option to copy tags from resources to the recovery point. October gave the ability to use SNS for status notification and to filter on status types.
The wrap-up for 2019 was a service of interest, but still very lacking compared to third-party options. The only real “must-have” was the EFS backup.
2020 Updates
AWS Backup burst into 2020 with some nice updates. In 2019, EBS backups were done on a volume-by-volume basis, sort of like 2018 DLM. While DLM added instance-based backups in 2019, AWS Backup was still stuck in volume land. January finally added instance backups to AWS Backup. Thankfully, the instance backups also included AMIs.
January also added EFS Item-Level Recovery. AWS seems to be onto a big win with their EFS options. The initial launch of EFS backups was massive, but recovery was all or nothing: the whole file system. This update made restores a lot more granular, allowing for individual file or folder recovery.
The final January update was the addition of cross-region backup. Cross-region backups could either be done automatically or manually for all services other than DynamoDB. You could also perform recovery in the new region. Cross-region backups were a big boost for the disaster recovery capabilities of AWS Backup.
After a little break, June saw the addition of Amazon Aurora to the AWS Backup management portfolio. Aurora clusters were able to be managed by AWS Backup, including cross-region backup and recovery. June also saw a big jump in usability with the option of cross-account management within an Organisation. Companies with multiple accounts could centralise their backup with a single pane of glass in the master account.
This is more of an EFS update, but July saw a change to EFS file systems: a recommended configuration of daily backups retained for 35 days using AWS Backup. This configuration is just a simple check box and could be applied to existing EFS file systems. RDS has had this type of backup for quite some time.
The middle of September added the automatic copying of tags from nested EBS volumes. Nice, but not groundbreaking. A week later, AWS did announce something quite interesting: application-consistent backups for Microsoft workloads on EC2. Using an agent, which SSM can deploy, and Microsoft’s Volume Shadow Copy Service (VSS), Windows instances could have application-consistent backups. Application-consistent backups also covered Microsoft applications like SQL Server, Active Directory and Exchange Server, plus any other application that supported VSS.
November saw the final AWS Backup updates, starting with the inclusion of FSx, both FSx for Windows File Server and FSx for Lustre. Quite frankly, I was surprised it took that long, considering FSx had been able to take its own backups for a while. Still, better late than never.
The last update was an enhancement to the cross-account management. Previously, the cross-account update in June was just for centralised management. Backups were still limited to the one account. The final update for 2020 was the addition of cross-account backups and recovery. AWS still restricted cross-account backup to accounts within an Organization, but it was a welcome addition. Unfortunately, cross-account backups are not supported for DynamoDB or FSx file systems.
Summary
Way back in early 2019 when I did my last review of AWS backups, the native solutions were lacking. DLM showed promise and was probably OK for non-prod, but both were a “wait and see” proposition. I wouldn’t have recommended either of them for production use. Now, they are worth a look. DLM is excellent for quickly and easily configuring instance and EBS volume backups. Given the lack of a restore feature, I’d still not use it for production, and probably not even for non-prod. DLM shines in those quick and dirty environments that you spin up to test something out, where you still want a fast, simple way to take snapshots.
In my opinion, the interface for AWS Backup is still a bit clunky and restores aren’t super intuitive, but that’s about the only drawback. In the short(ish) time it’s been around, AWS Backup has come a long way. I’d now be happy to recommend this tool for production environments. Some third-party solutions may offer better UI options, other bells and whistles, or MSP options, but, as a general backup solution for your Organisation, AWS Backup is a viable option. The price of the solution is also attractive. While you do pay for the storage used, the AWS Backup service itself is free.