WHY YOUR DATA BACKUP STRATEGY IS NOT A DATA ARCHIVE STRATEGY
It’s widely claimed that 90 percent of the world’s data has been created in the last two years. The flood of information shows no sign of slowing, and many corporate systems are approaching overload. Budgets tighten under the load, yet service expectations keep rising and compliance mandates keep getting stricter.
Most companies rely on data management strategies to ease the pain. Many strategies focus on data backup as the mechanism to preserve digital data assets. However, for corporations that have mandated data retention rules, backup alone won’t be sufficient. Regulatory compliance may require data archiving in addition to data backup. Data archiving provides specific features that ensure full digital data preservation.
This paper will review data backup and data archiving, highlight why they are different, and discuss where each is necessary. A complementary backup and archiving strategy helps ensure regulatory compliance while at the same time optimizing data backup and storage budgets.
DATA BACKUP
WHAT IT IS
The terms backup and archive are often used interchangeably. But data backups and data archives actually serve different purposes. Generally speaking, backups are used for data protection, while archives are used for long-term data retention. Today, many businesses maintain backup images and archival images on different storage platforms with different price-performance characteristics.
Backup vs Archive
Backups are typically used for disaster recovery. If you experience a hardware failure, a ransomware attack, or a data center catastrophe like a flood or a fire, you can use a backup image to restore your applications and data to a well-known, previously working state. You can back up data to a storage array, to tape, to the cloud or to another data center using a wide variety of commercial and open-source data protection tools. Today’s backup and recovery products support a range of backup methods—full, incremental, differential, snapshot—each with advantages and disadvantages.
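The difference between the full, differential and incremental methods mentioned above comes down to which reference point a backup run compares against. The following is a minimal sketch under stated assumptions, not the behavior of any particular product: the `FileInfo` type, the `select_for_backup` function and the numeric timestamps are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified: float  # modification time, e.g. seconds since the epoch


def select_for_backup(files, method, last_full, last_backup):
    """Return the paths a backup run of the given method would copy.

    last_full:   time of the most recent full backup
    last_backup: time of the most recent backup of any kind
    (Hypothetical helper; real tools track this state themselves.)
    """
    if method == "full":
        # A full backup copies everything, changed or not.
        return [f.path for f in files]
    if method == "differential":
        # Everything changed since the last FULL backup.
        return [f.path for f in files if f.modified > last_full]
    if method == "incremental":
        # Only what changed since the last backup of ANY kind.
        return [f.path for f in files if f.modified > last_backup]
    raise ValueError(f"unknown backup method: {method}")


# Illustrative data: full backup at t=100, latest incremental at t=200.
files = [FileInfo("a.db", 50.0), FileInfo("b.log", 150.0), FileInfo("c.txt", 250.0)]
full = select_for_backup(files, "full", last_full=100.0, last_backup=200.0)
differential = select_for_backup(files, "differential", last_full=100.0, last_backup=200.0)
incremental = select_for_backup(files, "incremental", last_full=100.0, last_backup=200.0)
```

In this sketch the incremental run copies the least data, but a restore would need the full backup plus every incremental taken since. A differential run copies more each time, but a restore needs only the full backup and the latest differential.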
Archives are typically used for long-term, off-site data preservation. By law, some businesses must retain customer records for an extended period. In the financial services industry, for example, stockbrokers and dealers must preserve certain types of records for up to six years.
Some businesses also archive data to free up primary storage capacity. Studies suggest that about 80% of data is rarely accessed once it is a few months old. The latest policy-based data management tools let you automatically archive infrequently accessed files to a secondary storage platform to reclaim space. You can save money and avoid primary storage upgrades by moving dormant data from costly network-attached storage to a more economical storage platform like Wasabi.
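A policy of this kind can be as simple as an idle-time threshold. The sketch below assumes the tool already knows each file's last access time. The `select_dormant` function, the 180-day default and the sample paths are illustrative assumptions, not settings from any specific product.

```python
from datetime import datetime, timedelta


def select_dormant(records, now, max_idle_days=180):
    """Return paths whose last access is older than the idle threshold.

    records: iterable of (path, last_access) pairs, last_access a datetime.
    max_idle_days: hypothetical policy setting; tune to your own data.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return [path for path, last_access in records if last_access < cutoff]


# Illustrative inventory of primary storage.
now = datetime(2024, 7, 1)
records = [
    ("/nas/projects/2021-audit.zip", datetime(2022, 3, 15)),
    ("/nas/projects/current-plan.docx", datetime(2024, 6, 20)),
    ("/nas/scans/batch-0042.tif", datetime(2023, 11, 2)),
]
to_archive = select_dormant(records, now)
```

Here the two files untouched for more than 180 days are selected for archiving, and the recently opened document stays on primary storage.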
Legacy Cloud Storage Services for Backup and Archive
Many businesses are now using cloud storage for backup and archival. Cloud storage services eliminate equipment expenses and hassles and provide pay-as-you-grow scalability. But legacy cloud vendors like AWS, Microsoft Azure and Google offer tiered storage services that are costly and complicated. Each storage tier is intended for a distinct purpose—primary storage, backup storage or long-term retention. Each has unique performance characteristics and pricing schedules (faster, more expensive storage for backup and slower, less expensive storage for archival). And to make matters worse, multiple pricing variables (egress fees, API fees, etc.) make it difficult to forecast costs and manage a budget.
WHY DO WE DO IT
Data backup is driven by internal company needs. Every company, regardless of size, needs a reliable backup solution, and this solution is typically part of a broader disaster recovery plan. The value provided by good disaster recovery planning becomes apparent when disaster strikes. In this situation, quality disaster preparedness directly affects the ongoing viability of a company. Backups are a critical component.
WHAT IT LOOKS LIKE
A data backup system is a storage infrastructure that is completely independent of any primary storage you already have. It may take the form of spinning disk (SAN/NAS/DAS) or it may take the form of robotic tape libraries. Smaller companies have additional options. Data backup is best achieved by a combination of on- and off-site storage. The off-site piece is commonly provided through a private or public cloud solution.
DATA ARCHIVING
WHAT IT IS
Data archiving (digital data preservation) is a formal endeavor to ensure that digital information of continuing value remains accessible, searchable and usable.
It combines policies, strategies and actions to ensure access to content, regardless of the challenges of media failure and technological change.
The goal of data archiving is the accurate rendering of authenticated content over time.
Data archiving focuses on preserving non-changing content that might not be essential to the daily operation of the business, but may be required for historical, compliance, legal or other reasons.
WHY DO WE DO IT
Data archiving is often driven by external factors (e.g. legal compliance). In most cases, archiving does not affect the day-to-day operations of a company. Instead, it is a required overhead that is performed by mandate or regulatory requirement. However, storage-heavy companies (e.g. those holding medical PACS data) often benefit from lower overall storage costs when they adopt a good archiving system.
While it may have little impact on day-to-day operations, a good archive’s value comes if and when there is an event that requires the company to reproduce data in the archive. Correctly satisfying a legal discovery or compliance audit can save a company time and money.
ARCHIVING IS DIFFERENT THAN BACKUP
We have defined data backup, why we back up, and what a good data backup system looks like.
We have defined data archiving and why we archive. Now, before we examine what a good archiving system looks like, let’s briefly discuss why archive storage is different than backup storage.
When archiving is mandated by external or regulatory groups it is useful to understand what those groups may require from the solution. For example, a legal or regulatory event may trigger a data discovery process that requires the production of expired information in its native state. A compliance mandate may require retention of records in a format that is unalterable. Your team must be able to find, reproduce and provably verify data content over some defined length of time. Depending on the nature of the query, you may also be expected to be able to preserve data beyond its intended expiration date (such as for legal hold). For some liability issues, you may be required to provide records of who accessed specific data and when. Finally, because archived data may need to be accessed very far in the future, there are issues that must be addressed that are not typically part of the backup landscape.
Archive systems keep track of a lot of information that describes the data that they store. This information is commonly known as metadata, or meta-information (information about the information).
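To make the requirements above concrete, here is a minimal sketch of the kind of record an archive might keep for each object: a retention date, a legal-hold flag that overrides it, and a log of who accessed the data and when. The `ArchiveRecord` class and its field names are hypothetical and do not reflect any particular archive product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchiveRecord:
    object_id: str
    retain_until: date                 # end of the mandated retention period
    legal_hold: bool = False           # set when litigation requires preservation
    access_log: list = field(default_factory=list)  # (user, date) pairs

    def record_access(self, user, when):
        # Keep an audit trail of who accessed the data and when.
        self.access_log.append((user, when))

    def may_expire(self, today):
        # A legal hold preserves data beyond its intended expiration date.
        return not self.legal_hold and today > self.retain_until


# Illustrative record and events.
rec = ArchiveRecord("invoice-001", retain_until=date(2030, 1, 1))
rec.record_access("auditor", date(2029, 6, 1))

before_expiry = rec.may_expire(date(2029, 12, 31))   # still inside retention
after_expiry = rec.may_expire(date(2030, 1, 2))      # retention has lapsed
rec.legal_hold = True
held = rec.may_expire(date(2030, 1, 2))              # hold blocks expiration
```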
WHAT IT LOOKS LIKE
Like a data backup system, a data archiving system is a storage infrastructure that is completely independent of any primary storage you already have. Also, like data backup, data archiving is best achieved by a combination of on- and off-site storage.
Data is placed into the archiving system either by conscious choice or by policy. Once there, it may be safely removed from primary storage (optionally leaving a reference pointer in its place). On ingest, the archiving system generates the metadata that describes each file. The metadata and the file are then written to the archiving system and kept according to the data retention policy.
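The ingest step described above can be sketched as follows. This is an illustration under stated assumptions rather than how any given archive works: the `build_archive_metadata` function and its field names are hypothetical, and SHA-256 is used here simply as one common choice of content fingerprint.

```python
import hashlib
from datetime import date, timedelta


def build_archive_metadata(name, content, ingested_on, retention_days):
    """Generate descriptive metadata for a file entering the archive.

    content is the file's bytes; the checksum lets the archive later
    prove the content has not changed since ingest.
    """
    return {
        "name": name,
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
        "ingested_on": ingested_on.isoformat(),
        "retain_until": (ingested_on + timedelta(days=retention_days)).isoformat(),
    }


# Illustrative ingest of one small file with a one-year retention policy.
content = b"quarterly results"
meta = build_archive_metadata("report.pdf", content, date(2023, 1, 1), 365)
```

The metadata and the file would then be written together, so that a later discovery request can locate the file by its description and verify it against the stored checksum.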
Like a backup system, the storage for an archive system can take the form of disk or tape.
Since data written to an archive system may be removed from primary storage, you can immediately reclaim primary storage space. This process (known as storage offloading or static-data offloading) also immediately improves your backup processes by reducing the amount of data to be protected. Data in the archiving system should be stored redundantly so that it does not require a separate backup.
BACKUPS OF YOUR ARCHIVE
By its nature, WORM (write once, read many) or immutable storage is immune to some of the issues that require us to back up primary storage. Specifically, accidental file deletion, file data corruption and modification by malware are not possible on a WORM system. The data cannot be modified or deleted, even by an administrator.
However, hardware failures, site failures, and data decay can cause data loss from your archive. If your archiving system supports writing multiple copies of data, offsite replication, and autonomic healing, you don’t need to maintain a separate backup of your archive, because the system already applies these best practices itself.
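One way to picture autonomic healing is a periodic integrity check that compares every copy against the checksum recorded at ingest and repairs any copy that has decayed. The sketch below is a simplified assumption of how such a check might work, with copies modeled as in-memory bytes. The `verify_and_heal` function and the site names are hypothetical, and a real archive would perform this inside the storage layer.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def verify_and_heal(copies, expected_sha256):
    """Check each copy against the recorded checksum and repair bad ones.

    copies: dict mapping a site name to that site's bytes for one object.
    Returns the list of site names that were repaired.
    """
    # Find any copy that still matches the checksum recorded at ingest.
    good = next((d for d in copies.values() if sha256_hex(d) == expected_sha256), None)
    if good is None:
        raise RuntimeError("no intact copy remains; restore from another source")

    repaired = []
    for site, data in list(copies.items()):
        if sha256_hex(data) != expected_sha256:
            copies[site] = good  # overwrite the decayed copy with an intact one
            repaired.append(site)
    return repaired


# Illustrative object with one intact copy and one silently corrupted copy.
original = b"patient scan 0042"
expected = sha256_hex(original)
copies = {"onsite": original, "offsite": b"patient sc\x00n 0042"}
repaired = verify_and_heal(copies, expected)
```

If every copy fails the check, the sketch raises an error, which corresponds to the case where a separate backup of the archive would be needed.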
If you do not have multiple geographically separated copies of your archive, then you will need to protect your archive via a backup that has an offsite component. Having to back up your archive will offset some (but not all) of the advantages that archiving provides to your backup. In either case, you still maintain the advantages gained from storage offloading.
AT-A-GLANCE: COMPARE BACKUP TO ARCHIVING
DATA BACKUP
- What does it do? Active or inactive data copied to storage for the purpose of internal recovery.
- Why do it? To protect critical operational internal data and computing processes.
- When does it matter? Weather disaster, outages, employee error, criminal mischief, hacking, virus, malware, lost or outdated devices.
- Who benefits? Internal users, stakeholders, partners, customers.
- Recovery Profile: Files, folders, databases, system images saved at periodic intervals (hourly, daily, weekly, as required).
- Data Management Benefits: Version control, disaster recovery, business continuity, peace-of-mind.
- What’s at risk without these? Loss of trust, reduced employee productivity, potential demise of the company.
DATA ARCHIVING
- What does it do? Inactive data and metadata stored long-term for internal recovery or external discovery.
- Why do it? To protect critical operational, historical, legal or customer data in its native form for retrieval on request.
- When does it matter? Interruption of business to comply with a lawsuit, regulatory audit or open records request.
- Who benefits? Public entities, regulators, auditors, courts, clients, partners.
- Recovery Profile: Files, folders and complete data determined by the corporate retention policy or by regulatory mandate.
- Data Management Benefits: Compliance, legal protection, operational assurance, additional storage resources.
- What’s at risk without these? Loss of trust, reduced employee productivity, potential demise of the company.