Data De-Duplication and the Economy
By tasaro on Nov 25, 2008 | In Data Management, Business Issues for IT
I think it makes sense to focus on the economy - since we are in unprecedented times. One of the most important technologies as it relates to the economy is data de-duplication. I have been analyzing and writing about data de-duplication since 2003 and in that time it has proven to be real, compelling and valuable to end users. The only reason not to invest in data de-duplication is that you simply don't have that much data.
Data de-duplication has been most widely deployed with disk-to-disk (D2D) backup. If you have been considering a D2D backup solution with data de-dupe, then think about actually making it a priority this year. Yes, the stuff really works.
If you are just using tape, the economics may not be clear to you because the physical cost per MB of storage is lower for tape. But you have to look at the cost per "virtual" MB. The right approach is to measure the cost of the de-dupe solution based on the effective amount of data stored. A 10-to-1 de-dupe solution stores 20 TB of backup data on 2 TB of physical capacity. And achieving 10-to-1 is conservative.
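To make the "virtual" MB point concrete, here is a minimal Python sketch of the arithmetic. The function names and the dollar figures are my own illustrative assumptions, not vendor pricing or market data; only the 10-to-1 ratio and the 20 TB on 2 TB example come from the paragraph above.

```python
def physical_capacity_needed(logical_tb, dedupe_ratio):
    """Physical TB required to hold `logical_tb` of backup data."""
    if dedupe_ratio <= 0:
        raise ValueError("dedupe_ratio must be positive")
    return logical_tb / dedupe_ratio


def effective_cost_per_tb(physical_cost_per_tb, dedupe_ratio):
    """Cost per TB of data actually protected, not per TB of raw media."""
    if dedupe_ratio <= 0:
        raise ValueError("dedupe_ratio must be positive")
    return physical_cost_per_tb / dedupe_ratio


if __name__ == "__main__":
    # The example from the text: 20 TB of backups at 10-to-1.
    print(physical_capacity_needed(20, 10), "TB of physical capacity")

    # Hypothetical prices, for illustration only.
    disk_cost_per_tb = 1000.0  # assumed raw disk cost, $/TB
    tape_cost_per_tb = 200.0   # assumed raw tape cost, $/TB
    print(effective_cost_per_tb(disk_cost_per_tb, 10), "$/TB effective on de-duped disk")
    print(effective_cost_per_tb(tape_cost_per_tb, 1), "$/TB effective on tape (no de-dupe)")
```

With these made-up prices, disk that looks five times more expensive per raw TB comes out cheaper per TB of data stored once the 10-to-1 ratio is applied. Plug in your own quotes and your own measured ratio before drawing conclusions.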
And let's face it - tape sucks. You can keep your tape system but reduce the number of times you use it - instead of daily or even weekly fulls, do your full backups to tape once a month. This reduces tape rotations, lowers operational costs and leaves less room for human error.
Data de-dupe is also finding its way into primary storage. IMO this makes a ton of sense. And the good news is that the savings from primary and backup de-dupe should be cumulative.
But first things first - D2D backup with data de-dupe is a no-brainer. Not that I am implying that you have no brain if you don't implement it ;-)
Okay - for those of you who have objections to data de-dupe for D2D backup or primary storage - send me some comments and we can discuss the issues one by one.
I for one am a big proponent of data de-duplication and believe that it should and eventually will become pervasive. Data de-duplication is really a form of virtualization - virtual data, if you will. Data de-duplication changes the economics of storing data - it reduces power, cooling and floor space consumption - and it enables you to do more than you could otherwise. Data de-duplication is landscape-changing and we are still in the early chapters with more to come. But don't wait to see how it can help you in the future - look into it now. Data de-duplication solutions should be a high priority for the data center. Go get some.
2 comments
I couldn't agree with you more; dedupe is definitely game changing. Having tested nearly a dozen dedupe solutions and spoken to many end-users of the technology, I would caution users not to get caught up in the dedupe ratio 'wars'. Make your decision based on what will address your needs and integrate into your environment best rather than who claims the highest ratios. A 10:1 dedupe ratio means you've reduced your storage requirements by 90%. Beyond that, returns rapidly diminish.
Tony.
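The diminishing-returns point in the comment above follows from simple arithmetic: the fraction of storage saved is 1 - 1/ratio. A small Python sketch, with a function name of my own choosing, shows how little is gained past 10:1:

```python
def storage_saved(dedupe_ratio):
    """Fraction of physical storage avoided at a given de-dupe ratio."""
    if dedupe_ratio <= 0:
        raise ValueError("dedupe_ratio must be positive")
    return 1 - 1 / dedupe_ratio


if __name__ == "__main__":
    for ratio in (2, 5, 10, 20, 50):
        print(f"{ratio}:1 -> {storage_saved(ratio):.1%} saved")
```

Going from 10:1 to 20:1 doubles the ratio but only moves the savings from 90% to 95%, which is why fit with your environment is likely to matter more than the headline number.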