The World Belongs to Those Who See Its Potential

Every time I find myself hopping from airport to airport, I love seeing the HSBC “think different” ad campaigns. Some interesting facts I read at Heathrow recently:

  • Brazilian football teams have earned over $1 billion in fees from selling players internationally.
  • On average, Russian billionaires are 19 years younger than those in America.
  • Indian e-tutors generate $20 million annually teaching English to American students.
  • The Halal industry is worth $3 trillion worldwide.
  • Two-thirds of the world’s billionaires started from scratch.

The last one is definitely my favorite. You can read more about the campaign here.

These captions put forward a very interesting proposition. They provoke us to think differently. Who would have thought that selling audio oscillators would lay the foundation for a billion-dollar company (Hewlett Packard)?

Disruption always starts small, and the key to success is to keep your eyes open to the opportunities around you.

New Advanced Data Insights Feature: And How It Can Help You

The upcoming inSync upgrade v4.5 adds an exciting new feature called “Data Insights”. This one was on my list for a long time, and was actually one of the award winners at the most recent 24-hour Druva Hack-Day event.

inSync Data Insights

Endpoint data is more dispersed and diverse than ever before. Some of our customers are protecting up to 1PB of data with inSync, and one of the key requests we get is for tools to understand “what” they are backing up and how they can defuse this rapidly expanding data.

This new addition is designed to answer most of these questions with the following features:

  • Data Analytics: This lets the admin gain insight into the composition of the backed-up data and understand the key trends fueling the growth of data.
  • Global Search: Searches the meta-data to find files with certain keywords across users. It also helps the admin gain access to the data referenced in the results.
  • Reports: Helps schedule searches and analytics and generate custom reports.
  • APIs & Tools: APIs and different export formats to integrate with any existing software.

Essentially this is an eDiscoveryLite for enterprise endpoints, and yet again demonstrates our commitment to enterprise endpoints, an area we feel is extremely important but lacking in tools and support.

So far, the feedback from the Data Insights beta has been great. Interestingly, some customers came back with a different use case and asked whether this could be combined with our SafePoint product (data loss prevention) or with data sharing, to help admins better understand the propagation of data and how to secure it.

You can learn more about the product, which is currently free for existing customers, by going to the Advanced Data Insights page.

Thank You For Changing The World



He was truly the entrepreneur and CEO I admired the most. He helped us discover how technology can change a lifestyle. A visionary who defined new categories, and built one of the most profitable companies in the world.

Today, I could see tweets from Libya, Egypt and Yemen, all united by one thing. And I could see an adult cry while typing on his iPad.

Thank you for leading the way. Your vision will continue to guide entrepreneurs like me in years to come.

R.I.P

Druva Momentum Continues with $12M Series-B “Scale-Up” Funding

I can still remember working from a “shared” garage office when Druva was a bootstrapped company with just 7 people.

After wasting about 6 months, the “eureka” moment came in July 2008 with the launch of inSync, and since then we haven’t had the time to look back.

Today I’m delighted to announce the closing of a new “scale-up” Series B round of funding for $12M led by Nexus Venture Partners with participation from Sequoia Capital. The official announcement and more details can be found here.

This round will help us continue our momentum and make some strategic investments. In the recent past Druva has demonstrated technological leadership with the introduction of innovative products such as SafePoint and support for iOS/Android, and led the way by being the first data protection company to offer strict enterprise SLAs for data backup to the cloud. Over the next 12 to 18 months, our key focus will be to significantly grow sales and marketing while keeping the spirit and culture of innovation and reinvention that has brought us success so far.

As an amateur hiker, I find that the trip so far reminds me of hiking a steep ascent. As you go higher, the view just gets better, and I’m confident that we are soon going to be at the top.

High Availability and Amazon AWS

A lightning strike in Dublin knocked Amazon and Microsoft data centres offline for a few hours, and it took some time to get all the services restored.

Although it did affect Netflix, foursquare and a few others, thankfully Druva cloud services were completely unaffected. Here is a small note on how we managed to keep our promised SLA.

I think it’s plain ignorance or poor planning to assume 100% availability of the underlying infrastructure. Just like any hardware, the AWS infrastructure is prone to failures, but knowledge of these potential failure points can help improve availability.

Since it’s a backup service, we have divided our cloud design into 3 parts based on the availability and durability guarantees:

  • Config (most available): Configuration data stored in Amazon RDS
  • Meta-Data: Druva Dedupe file-system spanning across Cassandra nodes
  • Data (most durable): Stored in S3

And some design changes we incorporated to avoid downtime:

  • Multi-Zone replication: Both RDS and Cassandra nodes are replicated across 3 availability zones. We use Cassandra in full-consistency mode and rely heavily on its self-healing in case of service failures.
  • Reduced Dependency on EBS: EBS is a software abstraction over underlying SAN storage, and two independent EC2 instances may share the same SAN for EBS. Given this, we shifted our focus from EBS to local storage for meta-data.
  • Extra spare copies in S3: We also maintain some extra redundancy on top of S3 for the most referenced blocks. This is essentially to avoid the random (but less frequent) S3 time-outs and improve durability of the most concurrent data.

We certainly paid more for improved availability, but some simple design changes can help save costs as well. For example, the 3-way replication increased our compute (EC2) cost by over 200%, but because of the extra spares we could increase the data stored per instance, which was earlier restricted to maintain a good cache-vs-on-disk ratio.

High Performance Deduplication

Time and again, enterprise customers, especially those who are migrating from competing solutions, ask us about the scalability of Druva inSync. Since the launch of v4.0, inSync has scaled exceptionally well, especially for large deployments. The software has succeeded where the majority of competing solutions have failed or turned off deduplication.

About a week back (at the request of a large customer), we started testing one of the competing solutions. We tested the software with 1 million files totalling 2TB, of which 48% was duplicate. inSync finished the backup in about 22 hours and the competing software is still backing up.

inSync doesn’t offer deduplication as an “integration”; the whole software was designed around deduplication and CDP. There is NO flag to turn off dedupe and there never will be.

This article focuses on my thoughts on how Druva succeeds where the majority of others fail.

Why Source Deduplication Fails to Scale for the Majority of Vendors?
The biggest bottleneck for the performance scalability of deduplication is random disk I/O performance. Almost all dedupe systems include a database to store the block-hash index, which needs to be consulted for every hash check. A server-class magnetic disk usually offers a latency of 8-12ms, which restricts hash matches to about 100/sec, throttling dedupe performance drastically.

Now, when the data set is small, the entire index can reside in memory and hence the hash checks are much faster. As the index grows, the I/O congestion brings down the software’s capacity to perform inline deduplication.
Consider this: just about 1000 users can create over 10 billion blocks for backup. Checking them at a rate of 100/sec could take about 3.2 years.
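
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes one random disk read per hash check and a 10 ms latency (the midpoint of the 8-12ms range), and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate of how long disk-bound hash lookups take.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def lookups_per_second(seek_latency_ms: float) -> float:
    """One random index read per hash check, so throughput is 1 / latency."""
    return 1000.0 / seek_latency_ms


def years_to_check(total_blocks: int, rate_per_second: float) -> float:
    """Wall-clock years needed to check every block hash at the given rate."""
    return total_blocks / rate_per_second / SECONDS_PER_YEAR


rate = lookups_per_second(10.0)               # assumed midpoint of 8-12 ms
years = years_to_check(10_000_000_000, rate)  # 10 billion block hashes
print(f"{rate:.0f} lookups/sec -> {years:.2f} years")
```

At roughly 100 lookups per second, 10 billion checks come out at a little over three years, which is why an index that lives only on magnetic disk cannot keep up.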

Learnings from Storage Guys
Data Domain had an interesting approach. They optimized their inline dedupe performance for backup streams. Since the backup was mostly for servers with a few large files, and the data streams were mostly long streams of data in tar format, Data Domain used a simple index read-ahead algorithm to load the relevant parts of the index before the stream’s block hashes reached the server. Since the streams changed less than 10% across two consecutive backups, the algorithm helped deduplicate them at a very fast pace.
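
To make the read-ahead idea concrete, here is a toy sketch. It is not Data Domain’s actual algorithm: it simply assumes the index is stored in segments that follow the order of the previous backup, so that one disk read brings in the neighbouring hashes as well. The class name, the segment layout and the in-memory `segment_of` map (a stand-in for the on-disk lookup) are all assumptions made for illustration.

```python
class ReadAheadIndex:
    """Hypothetical index that loads a whole segment of hashes per disk read."""

    def __init__(self, segments):
        self.segments = segments      # hashes grouped in previous-backup order
        # Stand-in for the on-disk lookup from hash to segment number.
        self.segment_of = {h: i for i, seg in enumerate(segments) for h in seg}
        self.loaded = set()           # hashes currently held in memory
        self.disk_reads = 0

    def contains(self, block_hash):
        if block_hash in self.loaded:
            return True               # served from read-ahead data, no disk I/O
        self.disk_reads += 1          # consult the on-disk index
        seg_id = self.segment_of.get(block_hash)
        if seg_id is None:
            return False              # new block, not in the index
        self.loaded.update(self.segments[seg_id])  # pull in the neighbours too
        return True


previous_backup = [["a", "b", "c", "d"], ["e", "f", "g", "h"]]
index = ReadAheadIndex(previous_backup)
stream = ["a", "b", "c", "x", "e", "f", "g", "h"]  # one changed block: "x"
hits = [index.contains(h) for h in stream]
```

In this toy run, eight hash checks cost three disk reads instead of eight, because the stream arrives in nearly the same order as the previous backup. With random data from many sources that ordering assumption no longer holds.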

Solid State Disks
A simple solution to the random-I/O problem is using SSDs to store the index. Although we did tweak certain features to support SSDs, the solution wasn’t complete because of the size limitation they impose.

Two Step Approach for Druva: No-SQL + HyperCache
The “Data Domain approach” did not work for us as our data was much more random and coming from different sources. But on the flip side we had much more knowledge of the data formats we were backing up.
The first step towards scalability was to get rid of the inbuilt SQL database, which imposed a lot of latency because of SQL query serialization and execution. We replaced PostgreSQL with Oracle’s NoSQL Berkeley DB (BDB) as an embedded database, which improved performance and is much simpler to maintain.

The second major innovation was HyperCache, a selective in-memory cache of the index. HyperCache consists of both a positive and a negative cache, which remember both the most probable and the least probable hashes for the ongoing backup. HyperCache uses an ever-learning algorithm that weighs parameters like time, frequency and probability of a hash to decide whether to cache it.
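
The positive/negative cache idea can be illustrated with a minimal sketch. This is not HyperCache’s implementation: the class and method names are hypothetical, a plain dict stands in for the on-disk index, and simple LRU eviction is swapped in for the learning algorithm described above.

```python
from collections import OrderedDict


class TwoSidedHashCache:
    """Hypothetical positive/negative cache in front of a slow block-hash index."""

    def __init__(self, disk_index, capacity=1024):
        self.disk_index = disk_index     # stand-in for the on-disk index
        self.capacity = capacity         # max entries per cache side
        self.positive = OrderedDict()    # hash -> block location (known present)
        self.negative = OrderedDict()    # hash -> None (known absent)
        self.disk_reads = 0

    def _remember(self, cache, key, value):
        cache[key] = value
        cache.move_to_end(key)
        if len(cache) > self.capacity:
            cache.popitem(last=False)    # evict the least recently used entry

    def lookup(self, block_hash):
        """Return the stored location for a hash, or None if it is a new block."""
        if block_hash in self.positive:
            self.positive.move_to_end(block_hash)
            return self.positive[block_hash]
        if block_hash in self.negative:
            self.negative.move_to_end(block_hash)
            return None
        self.disk_reads += 1             # only cache misses touch the disk
        location = self.disk_index.get(block_hash)
        if location is None:
            self._remember(self.negative, block_hash, None)
        else:
            self._remember(self.positive, block_hash, location)
        return location

    def insert(self, block_hash, location):
        """Record a newly stored block and drop any stale 'absent' entry."""
        self.disk_index[block_hash] = location
        self.negative.pop(block_hash, None)
        self._remember(self.positive, block_hash, location)


cache = TwoSidedHashCache({"aa11": 7})
results = [cache.lookup(h) for h in ["aa11", "aa11", "bb22", "bb22"]]
```

Here four lookups cost two disk reads: the repeated hit is answered by the positive side and the repeated miss by the negative side. The negative side matters for backups because a block that is genuinely new would otherwise cost a disk read every time it is checked.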

The Result
The result was an 85% reduction in disk I/O by using 4GB of RAM for every 1TB of data stored. The reduction in I/O translates to 4X better scalability, and the solution can easily scale to thousands of users with linear improvement in scalability/performance.

Use of SSDs further improves the performance by 6X. InSync core has been modified to keep only the most concurrent part of the database index on SSDs and optimize it for solid state drives.

Sales Meetings and My iPad

Yesterday I got a ping from a PC magazine editor asking about my tablet usage and how I see the adoption in the enterprise. This was actually great timing (my brother had given me a gift of an iPad2 almost the same week it was released).

I’ve been travelling pretty heavily over the last 2 months, and had an opportunity to meet and learn from a lot of customers. The best companion during this travel has been my iPad. Here’s my small list of productivity apps/accessories which I use the most:

  • inSync iPad app – lets me access all files and all the versions backed up from my laptop
  • Smart cover – makes typing emails simple
  • Easy Sign app – makes signing documents easy (I realized this was the only reason I used my fax/scanner)
  • Kindle app – Bought 2 books: The idea book & The upside of irrationality
  • iPad VGA adapter – allows me to present directly from my iPad
  • Evernote – for taking meeting notes

I’m currently a happy beta tester of the new version of the inSync app for iPad which allows offline access of folders marked as “favorites”, and this was an absolute life saver when I was travelling in Europe and didn’t want to pay heavy roaming charges :)

I see tablets as great access devices, which in my opinion will affect two markets the most: desktops and printers. Although I am still adjusting to carrying one extra device, IMO the laptop may soon become more of an in-office/limited-mobility device.

Enterprise Laptop Backup Market Heating Up?

We did a webinar with Curtis Preston (Mr. Backup Blog) yesterday and the response was overwhelming. It’s good to see more and more companies focusing on laptop backup. Below are the slides for reference.

When we started the journey about 3 years ago, we were almost the only startup focused on Enterprise Laptop Backup, and the only case study for us to follow was that of Connected (which was acquired by Iron Mountain and then eventually left to die). We have come a long way and have 600+ strong customers with some great names like PWC, NASA, Xerox, Schlumberger etc. When you have a fire at home, all you care about is a good, reliable fire extinguisher, and you value the brand less. I believe that’s what happened for us with all these great brands.

And now it’s good to see many other players joining the game. Last week we saw announcements by i365 Cloud (Seagate) and CommVault. Mozy and Carbonite have also done a good job of educating the consumer markets, which is eventually spreading awareness in businesses as well.

With the February launch of Druva inSync Enterprise and last week’s Backup Cloud announcement, we have yet again established our leadership in innovation.

Our salesforce stats reveal that about 63% of qualified leads eventually buy inSync and we almost never lose to our competitors (we lost just 8 of 320 opportunities in the last year). We’re super confident of improving this track record with upcoming customer reach programs and product announcements. Stay tuned.

Druva inSync now available for iPads and iPhones

I have a confession. I admit I’m a former Windows user turned Mac Geek (no, I don’t have a black turtleneck, you have to draw the line somewhere). Of course, being an avid Mac user I spend a lot of time on my iPad, checking email, reading blogs, reviewing documents, checking designs, surfing the web, downloading apps etc.

Druva inSync Login on iPad

It seems like I constantly have my head in my iPhone these days too, mostly for work (though I have been known to play the odd game or three of Fruit Ninja – highest score so far is 645).

Up to now, if I ever needed a file urgently, say a PowerPoint deck or a PDF, I would either pull that from my laptop, or have to dig it out of email.

With the recent release of inSync 4.1 support for iPhone & iPad, I can now access all of the files on my laptop via backups on my iPhone or iPad using the inSync remote client.

The inSync client installs just like any other app. Once installed, it’s very easy to use: you configure it to point to your inSync server, add your username and password, and voila, you now have access to all of your backed-up data over time. Simple.

The 4.1 release of inSync Enterprise also included some important and exciting additions, such as HyperCache, multi-admin support, and Active Directory integration, making it extremely easy to import and maintain users. And this is just the tip of the iceberg: we already have plans for additional development to add more functionality and extend our protection beyond the laptop to iPads, iPhones and more. Exciting times ahead…

Druva inSync Enterprise Edition

It’s not often that a new release announcement comes on Valentine’s Day, and I am super excited to announce the new inSync v4.1 Enterprise release. It brings some great new features and scalability improvements, and yet again demonstrates our razor-sharp focus on end-point devices.

Druva announced and demonstrated the new release recently at the Tech Field Day event in San Jose and got a fantastic response.

New features and improvements include:

  1. App for iPhone/iPad for 1-click data access
  2. HyperCache: new in-memory dedupe subsystem to boost backup performance by 3X
  3. Support for solid state disks (SSD) to boost performance and scalability
  4. Multi-admin support with profile-level quotas
  5. Better AD integration

The new release, pricing structure and upgrade details should hit the website by Feb 21st.

The engineering team has spent more than 12 hours/day for the last 5 months, and I am sure you will appreciate the hard work. The next few posts will discuss this release and each of these features in detail. Stay tuned!