Druvaa inSync Roadmap

We receive numerous requests to shed some light on inSync’s roadmap, so here it is. We have tried to include most of the suggestions we received from users. At the same time, we left out some features because they do not fit our vision for data protection. We discuss our viewpoint on some of those features towards the end of this blog entry.

Our focus, as always, is a “Light-weight, Simple, Fast and Trustable” backup solution.

Version 2.2 (Oct 10th, 2008)

  1. Admin configured backup folders – The admin can choose “must have” folders for backup for each profile. The admin can also choose whether users may configure additional folders.
  2. Browser Restore – Enable users to restore files and folders using just the browser when they are away from their desk.
  3. Linux Port (beta) – Initial support for Ubuntu 8+, openSUSE 10+ and RHEL 5+.
  4. Advanced Reporting – Six different reporting options for flexible and detailed reporting.
  5. Dump user data locally (on server) while disabling the user.
  6. Restore user data on server – We plan to allow dumping user data locally (on the server) when a user is in a disabled state. This could be useful for archiving a user’s data before deleting the user.
  7. Publish configuration API – Publish the server configuration API to enable third-party software vendors to integrate inSync backup into their management consoles.

Version 3.0 (Dec 21st, 2009)

  1. Full PC Backup – Use de-duplication to effectively back up the entire PC (operating system and application executables, in addition to the application configuration and data).
  2. Bare-metal Restore – Use restore points created by full PC backup to restore a machine that does not have a working operating system.
  3. Performance Improvements – for large (1GB+) file incremental backup.
  4. Search in Restore – Search files in restore.

Excluded Features

These are the features which we believe should not be implemented, even though some key players offer them -

  1. Disable inSync client’s desktop visibility – Don’t show the inSync client running on the user’s PC, so that the backup stays hidden. In our opinion, this is not a solution. The right approach is to provide a light-weight backup solution that does not hamper PC performance, so that the user has no reason to disable it.
  2. Server initiated backup – It is not useful in a PC backup environment, especially for mobile laptops that are not always connected. We may consider this for the server version of inSync.
  3. Allow USB backup or tape backup on the user’s PC – We believe that media-based backup is inherently unusable. With falling disk prices and Druvaa’s data de-duplication technology, the best backup policy is to maintain backups on hard disks.

Green-ness of Data De-duplication

The Storage Hunger

Sales of disk-based storage systems have already crossed 2500 Petabytes in 2008, up 58.1% YOY (one petabyte = 1 million GB). These figures do not include the direct-attached storage which comes pre-loaded with PCs or servers.[1]

This is understandable, as 1TB (1000GB) NAS/SAN storage devices are now a commodity. The top three vendors in this space are HP, IBM and EMC, with market shares of approximately 29%, 20% and 14% respectively.[2]

The overall consumption doubles when this storage is backed up :)

Energy Consumption

On average, a datacenter consumes 100 Watts/sq-foot of power, and the best solid state storage consumes about 5 watts for 1MB IOPs.[3]

This puts the total cost of maintaining (cooling + power) a 1 TB disk array at about USD 2,500 annually (at 16c per kWh and 20 GB average daily usage).

This makes the annual energy cost of newly bought storage = USD 5 Billion !!!

And backing up this 5 Billion dollar inventory surely adds a couple more billions.
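
As a rough check of the storage energy estimate above, the two quoted figures (2500 Petabytes sold, about USD 2,500 per TB per year) can simply be multiplied. The sketch below assumes decimal units (1 PB = 1000 TB); the function and constant names are made up for illustration.

```python
# Back-of-the-envelope check, using only figures quoted in the post.
PETABYTES_SOLD = 2500            # disk-based storage sold in 2008 (from the post)
TB_PER_PB = 1000                 # assumption: decimal units
COST_PER_TB_PER_YEAR_USD = 2500  # cooling + power per TB per year (from the post)


def annual_energy_cost_usd(petabytes, cost_per_tb=COST_PER_TB_PER_YEAR_USD):
    """Yearly cooling + power cost for the given amount of storage, in USD."""
    return petabytes * TB_PER_PB * cost_per_tb


if __name__ == "__main__":
    cost = annual_energy_cost_usd(PETABYTES_SOLD)
    print(f"USD {cost / 1e9:.2f} billion per year")  # USD 6.25 billion per year
```

The straight product comes to about USD 6.25 billion, the same order of magnitude as the rounded 5 Billion figure; the gap may come from rounding or from assumptions not spelled out here.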

Data De-duplication

Data de-duplication technology saves a single copy of duplicate data. There are two important aspects of any data de-duplication solution/product -

  1. Scope of duplicate discovery – File-level / Sub-File level / Block level
  2. Point of duplicate discovery – Source / Target

Most storage vendors that use data de-duplication provide block-level duplicate removal at the target (i.e. when the data reaches the storage). But it’s not very difficult to imagine that source-level removal of sub-file or block-level duplicates would be much better, for two reasons -

  1. Sending less (de-duplicated) data saves time and bandwidth (apart from storage)
  2. Duplicate discovery would be much better as you have access to the structured data
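
To make the source-versus-target distinction concrete, here is a minimal sketch of block-level de-duplication at the source. It is not how inSync works internally; it assumes fixed-size blocks, SHA-256 digests as block identities, and a toy in-memory server. `BackupServer`, `backup` and `BLOCK_SIZE` are hypothetical names used only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, in bytes


def split_blocks(data, size=BLOCK_SIZE):
    """Yield consecutive fixed-size blocks of data (the last may be shorter)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


class BackupServer:
    """Toy in-memory target that stores each unique block once, keyed by digest."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> ordered list of digests

    def missing(self, digests):
        """Return the digests the server has not stored yet."""
        return {d for d in digests if d not in self.blocks}

    def store(self, name, digests, new_blocks):
        """Record the file's block order and keep any newly sent blocks."""
        self.blocks.update(new_blocks)
        self.files[name] = list(digests)

    def restore(self, name):
        """Rebuild a file from its stored blocks."""
        return b"".join(self.blocks[d] for d in self.files[name])


def backup(server, name, data):
    """Back up data, sending only blocks the server lacks. Returns bytes sent."""
    blocks = list(split_blocks(data))
    digests = [hashlib.sha256(b).hexdigest() for b in blocks]
    needed = server.missing(digests)  # only digests cross the wire here
    payload = {d: b for d, b in zip(digests, blocks) if d in needed}
    server.store(name, digests, payload)
    return sum(len(b) for b in payload.values())


if __name__ == "__main__":
    server = BackupServer()
    a, b, c = b"a" * BLOCK_SIZE, b"b" * BLOCK_SIZE, b"c" * BLOCK_SIZE
    print(backup(server, "pc1/report.doc", a + b))         # 8192
    print(backup(server, "pc2/report_v2.doc", a + b + c))  # 4096
    print(server.restore("pc2/report_v2.doc") == a + b + c)  # True
```

The second backup sends only the one block the server has never seen, which is where the time and bandwidth saving comes from. A target-side design would ship all three blocks and discard the duplicates on arrival, saving storage but not bandwidth.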

Considering Microsoft’s report on de-duplication assessment [4] -

  1. 20-30% data duplicates are easily visible even in unstructured data sources like ERP databases
  2. 40-80% data duplicates can be seen in file-servers and mail servers.
  3. 60-90% data duplicates can be seen between different PCs. (Just my observation and opinion)

On average, a conservative 30% data duplicate removal can save $1.6B on storage energy and $2B on bandwidth costs and backups.

De-duplication and Druvaa

We see Druvaa inSync as a product/platform to provide de-duplicated (at source) backup for PCs, PDAs and servers. The current version is available for just PCs, and we can easily see up to 90% savings in time and cost (bandwidth and storage) for enterprises.

I just don’t see a reason why all storage and backup vendors wouldn’t do it. EMC and NetApp have already announced de-duplication as an additionally licensable technology on their arrays (target based).[5] No major vendor except EMC has announced agent/source-based de-dup though.[6]

Surely, Druvaa has a good lead and is cashing in on it :)