Hyper Storage: We’re Turning Our Enterprises into Pack Rats

IBM recently released figures estimating that worldwide digital archive capacity will grow at a compound annual rate of nearly 60 percent through 2012, which means the total amount of data will have increased by 800 percent during that time.
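As a rough check on how those two figures relate, compound growth can be worked out directly. The sketch below is an illustration, not IBM's method: the function names are hypothetical, and it assumes "increased by 800 percent" means nine times the starting capacity. The figures above do not state a base year, so the code solves for the number of years instead of assuming one.

```python
import math

def growth_multiple(cagr: float, years: float) -> float:
    """Size relative to the start after compounding at `cagr` per year."""
    return (1.0 + cagr) ** years

def years_to_reach(cagr: float, multiple: float) -> float:
    """Years of compounding at `cagr` needed to reach `multiple` times the start."""
    return math.log(multiple) / math.log(1.0 + cagr)

# Assumption: an 800 percent increase means 9x the starting capacity.
target_multiple = 1.0 + 8.0
years_needed = years_to_reach(0.60, target_multiple)
five_year_multiple = growth_multiple(0.60, 5)

print(f"Years to grow 9x at 60% a year: {years_needed:.1f}")
print(f"Multiple after 5 years at 60% a year: {five_year_multiple:.1f}x")
```

Under those assumptions, the two figures are consistent with a window of a little under five years.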

Organizations are facing a sharp increase in the amount of digital content that must be managed under new compliance and discovery requirements. Worldwide file, database and email archive capacity will each skyrocket at a compound annual growth rate of up to 73 percent, altogether totaling nearly two trillion full filing cabinets of information.

To put it bluntly, that’s a lot of bits and bytes to be managed. We all know that storage requirements — for video, audio files, images, and transactional data — are exploding. Layer on top of that the various compliance mandates that dictate that organizations hold on to certain types of data for extended periods of time, and you get the idea.

A couple of years back, I was chatting with the IT director for one of the nation’s largest insurance companies, who remarked that as a result of Sarbanes-Oxley, when it comes to e-mails, or anything else, the information stays. “Right now, we save everything, no matter how far back,” he said. And this is an organization that handles almost a million and a half e-mails a day. I asked him how they were going to be able to store and manage all those files. At the time, everything was backed up on disk, and his department was investigating ways to move older files to tape.

Society and the legal system are turning enterprises into pack rats, forcing them to spend time and money on the ludicrous exercise of saving and cataloging every scrap of correspondence and data that ever crossed their domains. As a result of this, and some very legitimate data requirements, multi-terabyte databases have become commonplace. Many databases are now cracking the 100-TB mark, compared to a few that finally hit the one-terabyte mark a decade ago. Our research with the Independent Oracle Users Group found that multi-terabyte databases are fairly common these days.

I had the chance to speak with Craig Butler, IBM’s storage products marketing manager, on the occasion of the 50th anniversary of disk-based storage. Basically, Butler explained, storage devices have been able to keep up with exploding data demands because of a sort of “Moore’s Law of storage.” Over the past 50 years, capacity has grown by at least 40 percent a year, and even accelerated to 100 percent a year over the last decade. All this came with no change in price, and the pace is likely to continue into the near future.
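To get a feel for what those growth rates mean, it helps to convert them into doubling times. This is a back-of-the-envelope sketch, not a figure from Butler or IBM; the function name is hypothetical, and it assumes the rates compound once a year.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years for capacity to double when it compounds at `annual_growth` per year."""
    return math.log(2.0) / math.log(1.0 + annual_growth)

# Growth rates quoted in the interview: 40 percent and 100 percent a year.
long_run = doubling_time_years(0.40)
last_decade = doubling_time_years(1.00)

# Assumption: the 40 percent floor held for the full 50 years.
fifty_year_multiple = 1.40 ** 50

print(f"Doubling time at 40% a year: {long_run:.2f} years")
print(f"Doubling time at 100% a year: {last_decade:.2f} years")
print(f"Capacity multiple after 50 years at 40%: {fifty_year_multiple:,.0f}x")
```

At 40 percent a year, capacity doubles roughly every two years; at 100 percent, every year. Even the 40 percent floor compounds to a multiple in the tens of millions over five decades.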

However, nothing has come along that really can take the place of disk or tape, Butler said, and there is nothing emerging on the horizon. In fact, Butler predicts that we’ll still be relying on disk drives a decade from now, except that they’ll be smaller and more ubiquitous.

There are two challenges that we face with storage as time goes by, Butler said. First, there’s a need to be able to better manage and access the huge proliferation of data piling up in enterprises. “We need new search and retrieval techniques to find the right data,” he said. “We have amassed all this capacity, but human beings don’t have time to search through it all.” Unstructured information is a real challenge. Take video files, for example: when the police need to review security camera footage for suspects, someone has to spend hours or even days watching analog video.

The other challenge is being able to access storage media decades from the time the data or image is captured. “A lot of valuable data could get stranded in an application that no longer exists, or in a file format that no longer exists, or a hard disk storage of some type that no longer works with new systems,” Butler said. IBM has a research initiative looking at storing data and artifacts together with descriptive metadata, he said.

In terms of hardware, IBM has been exploring approaches such as molecular-size switches that could toggle molecules to different states — a sort of dense way to store ones and zeros.

Ultimately, Butler pointed out, it’s not disk technology that will restrain storage; it’s the organization and applications around it. “The energy is shifting from how we store our hard disk drives to the applications, security concerns, how we use all this data, and how we search through all this data,” he said. “Then there’s the privacy and ethical considerations of keeping all this data, because a lot of it is going to be about you and me. Who owns it? How do we keep it safe? We’ve got a lot of legal, ethical, and privacy concerns to sort through.”
