Can Data Ever Be Deleted?
Inspired by Enron and other data cover-up fiascos, legislative bodies around the country appear to be taking the words of Joni Mitchell to heart:
Don't it always seem to go
That you don't know what you've got
Till it's gone.
In this case, though, they are thinking less of big yellow taxis and more about making it illegal for companies to jettison data. As a result, the already colossal global repository of data storage is being overwhelmed by a tidal wave of compliance-related storage demands.
"Compliant records data is presently estimated to grow over 60 percent per year, generating more than 1.6 PB of new storage capacity requirements in 2006," says Fred Moore, an analyst at Horison Information Strategies. "This represents the single fastest growing application segment of the storage industry."
So what should you do with company data? Should it all be stored and archived forever, or can it ever be deleted? Many of those interviewed said data should be kept forever; however, some companies overwhelmed by regulatory demands admit to regularly deleting data and taking other steps to ease the compliance burden.
Let's take a look at the various fates your data could face.
Data retention requirements vary widely by record type. Records relating to the manufacture, processing and packaging of food must be kept for at least two years; those for drugs, three years after distribution; and those for the manufacturing of biological products, five years after manufacturing of the product ends. Financial statements must be kept for three years; member registration for broker-dealers, until the end of the life of the enterprise; trading account records, six years after the account is terminated; and medical records, in some cases, for up to two years after the patient's death.
"For some patients, this represents a retention period of over 100 years," says Moore. "And the Sarbanes-Oxley act requires every public company to save every record related to the audit process, including e-mails, for seven years."
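The fixed-term rules quoted above can be expressed as a simple lookup. The following Python fragment is a minimal sketch, not a compliance tool: the record-type labels and function names are hypothetical, the periods are only those cited in this article, and open-ended rules (medical records, broker-dealer registration) are left out because they depend on events with no fixed date.

```python
from datetime import date

# Retention periods in years, using only the figures quoted in the article.
# The keys are hypothetical labels invented for this sketch.
RETENTION_YEARS = {
    "food_manufacturing": 2,     # a minimum ("at least two years")
    "drug_manufacturing": 3,     # counted from distribution
    "biological_products": 5,    # counted from end of manufacturing
    "financial_statements": 3,
    "trading_account": 6,        # counted from account termination
    "sox_audit_records": 7,
}


def earliest_deletion_date(record_type: str, trigger: date) -> date:
    """Return the first date on which a record could be considered for deletion.

    `trigger` is the event that starts the clock (creation, distribution,
    account termination, and so on). Unknown record types raise KeyError,
    so nothing is treated as deletable by default.
    """
    years = RETENTION_YEARS[record_type]
    try:
        return trigger.replace(year=trigger.year + years)
    except ValueError:
        # The trigger was 29 February and the target year is not a leap year.
        return trigger.replace(year=trigger.year + years, month=3, day=1)


def may_delete(record_type: str, trigger: date, today: date) -> bool:
    """True once the retention period for this record has fully elapsed."""
    return today >= earliest_deletion_date(record_type, trigger)
```

Raising an error for unlisted record types reflects the cautious stance described in this article: when nobody is sure a record is safe to remove, the default is to keep it.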
It's no surprise, then, that e-mail storage alone is bringing some storage networks close to the breaking point. A few will break altogether as the number of e-mails sent each day worldwide exceeds 36 billion in 2006. Those that attempt to retain and comply face a massive bill: Horison estimates that compliance could account for as much as 5 percent of a typical IT budget. To put it another way, companies spent as much as $15.5 billion in 2005 on compliance.
Hold On To What You've Got
What data should you get rid of? The simple answer is anything whose removal carries no legal ramifications. The sad thing is that nobody knows what that is. Any time you think about removing something, it's impossible to know whether that data could become important in the future.
It can even be argued that e-mail about mundane matters such as the annual picnic should always be retained. If someone is injured at the event, for example, OSHA may be all over it and workers' comp might feel the company has to pay. If that person then dies, law enforcement will want to see that e-mail and lawyers will be subpoenaing every record under the sun, moon and stars.
Analyst Mike Karp
If you dump e-mail after five years and someone comes along with a lawsuit, you could be legally exposed on two fronts. On the one hand, you may have deleted data that could establish your guilt. On the other side of the coin, perhaps your only evidence for defense disappeared with a swift flick of an index finger on the delete key.
As a result, in some companies, compliance officers or legal execs are being called upon to sign off on deletions. Being lawyers, they tend to opt for the safe solution: hold onto it.
It's not their problem that the disk arrays are overflowing or that database performance has sputtered to a halt. But that's the reality the storage administrator increasingly has to deal with.
"You should delete any information that you are certain you do not need to keep legally and that has no potential value in the future," says Tony Lock, an analyst at Bloor Research. "Frankly, this does not leave a lot of room to delete anything, and I suspect that soon it will be accepted that no data is ever truly erased."
Archive to Survive
If you can't dump it, the best way to keep it from slowing down the production environment is to archive it.
Saint-Gobain Crystals of Newbury, Ohio, for example, has instituted an archiving plan for its SAN and NAS environment using BrightStor ARCserve Backup from Computer Associates (CA). Over 3TB of data is backed up every night, but the company maintains a separate archiving environment in order to enhance access to data and reduce the number of infrequently accessed files on production servers. The CA software moves such files off the production servers after about a year. Later, they are relegated further down the storage totem pole.
"Files older than five years are copied to tapes and DVD drives, then labeled and stored in a fireproof safe at a remote site," says Mohamad Alkazaz, IT & telecommunications manager at Saint-Gobain Crystals. "This ensures easy, quick access to archived data across the WAN."
For e-mail, the company archives messages on the server that are older than 60 days. Workstations, however, are a different matter. Employees are taught to archive their own e-mail locally. Alkazaz also firmly recommends backing up before any attempt to delete or archive files, and enforcing disk quotas to prevent users from filling up network storage resources.
"Companies should get as much information as possible off Shark, TagmaStore or Symmetrix, and archive it to something cheaper," says analyst Mike Karp. "Such decisions, though, should be based on the value of the data, how much it is used and how much capacity you have."
Such information lifecycle management (ILM) approaches have evolved to meet compliance demands.
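The age-based tiering described for Saint-Gobain Crystals can be sketched in a few lines of Python. This is an illustration only, not the company's or CA's actual configuration: the tier names and function names are hypothetical, "about a year" is approximated as 365 days, and the reference date used to measure a file's age (last access versus last modification) is an assumption the article does not settle.

```python
from datetime import date, timedelta

# Age thresholds as described in the article; tier names are hypothetical.
FILE_ARCHIVE_AFTER = timedelta(days=365)       # moved off production "after about a year"
FILE_OFFLINE_AFTER = timedelta(days=5 * 365)   # "older than five years": tape and DVD
MAIL_ARCHIVE_AFTER = timedelta(days=60)        # server e-mail older than 60 days


def file_tier(reference_date: date, today: date) -> str:
    """Pick a storage tier for a file based on its age."""
    age = today - reference_date
    if age > FILE_OFFLINE_AFTER:
        return "offline"      # tape or DVD kept at a remote site
    if age > FILE_ARCHIVE_AFTER:
        return "archive"      # separate archiving environment
    return "production"


def mail_tier(received: date, today: date) -> str:
    """Server-side e-mail is archived once it is older than 60 days."""
    return "archive" if today - received > MAIL_ARCHIVE_AFTER else "server"
```

A real ILM policy would also weigh the factors Karp lists, namely the value of the data, how often it is used and how much capacity is available, rather than age alone.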
There certainly is a groundswell of opinion that data can no longer be deleted. However, some companies have never thought it wise to hold on to everything and are not about to.
At a user group meeting in Southern California hosted by the Association of Storage Networking Professionals, storage administrators traded war stories about data deletion. Fortune 100 firms, huge retail chains and telecom giants, it turns out, still purge e-mail and workstations every 28 to 60 days, according to the admissions of those gathered.
Another storage manager admitted that "archiving" was largely foisted off on the end user. Under the existing policy, when a user reached a certain storage threshold, IT automatically burned a CD with the archived data and e-mail and sent it to that user. The storage manager said this worked wonders for controlling disk space, although more recently he had been hearing rumblings that the practice might cause audit trouble down the line.
Horison's Moore tells another story of a large privately held insurance company in the Midwest that has a stated goal of getting rid of every bit of data it possibly can. The philosophy is to hold onto only the data that could possibly be needed. But the company takes a hard line so as to avoid the trap of retaining everything.
"There is a growing backlash against compliance," says Moore. "While the big boys may be able to afford it, the majority can't. To them, the risk of deleting it is far less than the price of complying."
He likens today's compliance push to the ISO fad of the nineties. Back then, companies rushed headlong into the arena to become ISO certified. They spent billions of dollars in documentation. Yet today, nobody mentions ISO.
"Companies need to reconnect with their delete keys," says Moore. "Many organizations now believe that the time is right to repeal Sarbanes-Oxley."
Or at least find a way to soften its demands on overburdened IT departments.
This article was first published on Enterprisestorageforum.com.