This is a follow-up to my recent column about Steve Wozniak’s warning on the perils of cloud computing, especially cloud storage. It might surprise many users to know there are firms that sell cloud storage and do not back it up. They rely on the disk RAID and some redundancy in the cloud to “protect” your data. If something happens to their data center, they probably could not recover your data.
Remember MailandNews.com? They did not have a viable business model. They also didn’t back up their servers. One day they had a big crash and relied on the RAID array to recover the data. It took two weeks and still not all of the data was recovered.
RAID is not a data backup technology.
What happens if your cloud storage firm goes out of business? Some companies will put your data on tapes and send them to you. Others will tell you to download it. If you’ve accumulated a lot of data, that could take some time, especially if everyone is downloading their data at the same time.
Firms like IBM provide a professional backup service. This means customer data is stored on both disk and at least one tape. If there is a requirement for offsite data storage, a second tape is produced and sent to wherever. All data is encrypted and the customer controls the encryption keys.
Thanks to Enron, the financial crisis, and other wrongdoing, there are boatloads of regulations on how to secure business data. Professional data storage firms know how to do it and can pass audits. Other companies do not. If you are in business and fail to meet government regulations it will be you and not your cloud storage provider who will face fines and/or imprisonment. If you are legally responsible for sensitive data, you had better make sure your service provider won’t let you down.
And that brings us back to the many cloud vendors we deal with regularly. Are Dropbox, iCloud, SkyDrive and others prepared to pass a PCI, HIPAA, SAS 70, or Sarbanes-Oxley style storage audit?
No.
That means your data is probably not backed up.
Actually this is why I like Dropbox. Dropbox is only a sync’d copy of what’s on my computer. All I have to do is back up my computer regularly and my Dropbox is backed up as well.
A service like iCloud isn’t so straightforward. It appears you can manually copy files, but I haven’t seen if it’s possible to automatically back up iCloud data to your computer. For example, if I have an iPad set to automatically back up to iCloud instead of my computer, is there a way to back up the backup? A quick web search didn’t answer that question.
“RAID is not a data backup technology.”
This should be tattooed on the inside of every systems administrator’s eyelids.
I’ve been running datacenters for almost two decades and have seen this bite so many people, so many times…. Sigh.
–chuck
People who put the only copy of their data into the hands of a third party deserve what they’ll no doubt get at some point. Even if the disks don’t die, what if there’s a problem with your account so you can’t get in? The cloud company goes out of business? Etc, etc.
And even if none of that happens, what if the network is down? Or just very slow? I just returned from fireworks at the beach in The Hague, and the whole time my 3G connection was unusable because of the huge crowd. That’s just a routine example. Fibers break, businesses go bankrupt, there are denial-of-service attacks.
Whoever invented the $150 2-TB disk drive should be shot! Seriously, we store way, way more than we need, then make a copy or two of what we have, and when things get tight, a quick trip to Staples and another 2-TB is available. Imagine if we could add physical storage as easily as digital storage — everyone would have a thousand times as many garages as just a few years ago. And when those thousand garages are full of useless crap, we’d be able to add a thousand more.
The only solution to data backup is to actually think about what you need to back up versus what you can cheaply store.
That’s an incredibly simplistic viewpoint that has no relevance to nearly any business environment. Most often it is legal and regulatory requirements that dictate what data needs to be retained, not just what you can “cheaply store”.
> That’s an incredibly simplistic viewpoint that has no relevance to nearly any business environment.
On the contrary, many big businesses have email and document retention policies to get *rid* of old emails and documents, so that they *aren’t* subject to future legal and regulatory requirements such as legal discovery, subpoenas, and such.
I certainly plead guilty to trying to offer a simplified viewpoint. And I agree that the legal requirements for what needs to be retained have been increasing. But remember, a decade or so ago, we seemed to be managing OK with 1 or 2 GB, commodity disk drives, and now we have 1 or 2 TB drives, at a much lower $/GB cost. Do you really think those legal requirements have gone up by a factor of a thousand or more?
there is the issue of “code diarrhea,” in which just because the machines run faster and the disks are cheaper, nobody gives a rip about pruning junk software. you get big fat wallowing codebases and executables that trail slime because… you can… and everybody does it. cheaper than refining the output of your cheap offshore coders.
no, I don’t expect rearchitecting over and over to get the native code from assembly language down another 480 bytes. we’re off card storage for good.
but bloatware making the floors groan needs to go the (!) away.
Although I agree 100% about the problems of bloated software, I find that most of my disk space is taken up not by software, but by music files, photos and HD movie clips of my family etc.
It was so much easier for my parents – just a few photo albums and that was it!
Looking after large amounts of data, even just for home use, can be quite a burden…
“but bloatware making the floors groan needs to go the (!) away.”
Please, if you are going to make claims like this, back them up.
Have you ever LOOKED inside one of these “bloatware” apps you mock to see why they are so much larger than older apps?
I’m unfamiliar with the Windows world, but I’ve looked at this on Macs.
It has very little to do with actual code growing larger.
A relatively minor issue is Apple’s somewhat unfortunate explosion in ISAs. This is now under control going forward, and an app that only needs to support 10.7 or higher can be x86-64 only, but an app that also wants to support 10.6 also needs an i386 binary in there, and one that wants to support 10.5 or earlier also needs a PPC, and perhaps also a PPC64 binary in there. This is not code bloat, and it’s not really anyone’s fault; it’s simply a consequence of change.
A larger issue is much higher quality assets. Most of us want our apps to look good. We prefer textures to flat color, we prefer 44kHz 16-bit audio to 8kHz 8-bit audio. We like icons that don’t look like pixel art.
You’re welcome to complain that life was better in 1995; heck you’re welcome to use Linux apps where it’s always 1995; but it’s not really clear what an argument against larger assets is achieving. Is it wrong to be delighted by the subtle beauty of an app? Is it problematic to spend an extra 20MB on such assets when we have TB of storage to hold them, and VM to handle swapping them to and from RAM?
A third significant factor is support for a substantial array of languages, so that a wide variety of users can feel comfortable, even when as guests on machines that are not their own. Once again, we could go back to 1995, when software shipped supporting only one language, and anyone who actually cared about multi-lingual language support lived in a special form of hell, but once again it’s not clear exactly what the value is in wearing a hair shirt when a perfectly good silk shirt is in the closet right next door.
Go look at the average Mac. Code makes up a small fraction of what’s stored. Much of what’s stored is photos, more is music, even more is movies and TV.
(For some people it’s video games, which are perhaps something like a movie.)
And the tradeoff is that these same people no longer have a closet full of photo albums, CDs, and DVDs.
I’m not sure why it’s something to bemoan that people are getting more value out of their computers today than ever before.
If you’re complaining about the size of programs then you’re probably talking about home backups vs enterprise. The floors shouldn’t be groaning under the weight of the backups of programs because those should be fairly static and you don’t back them up more than once.
But giant files are a problem. The too-large backups should be largely comprised of vast steaming piles of crap “data” files (and even more vast fields of cold, dry data chips.) That’s where the explosion has occurred.
It used to take some serious effort to crank out a spreadsheet larger than 10MB, but now they’re waddling all over the place. Now you’ve got high res screen caps, SQL extracts into XLS, XML output from all too many programs, the usual mountain of non-business-related movies and pictures (plus a bunch that are actually business-related), and PPT files that include them all. It’s a fat, slow-moving place. I like to think that there are parallels between these marginally useful, fat files and how simple it is to consume thousands of extra calories a day. All these pudgy formats are high fructose corn syrup and there are dozens of drive-thrus that make it way too easy to crank them out. These days document retention takes almost as much discipline and exercise as a diet.
It simply takes too much time and effort to make efficient code. I think Steve Gibson is the only one left doing it. (Assembly language.) Besides, for years Intel has spoiled the programmers into believing the next chip will make their code run well, and now that Intel can’t keep up, the programmers are depending on the “Cloud” computers. Unfortunately, the Internet is the next bottleneck until everyone can get FIOS and LTE without data caps or huge monthly data charges.
Nice article. I’ve been thinking about the role of the Cloud recently, trying to think of a useful purpose. I think perhaps we should regard it not as a backup (of course), but rather as a distributed data cache. Funnily enough, I guess that’s what its original purpose was. With a distributed cache, collaboration between permitted groups of people becomes so much easier and access to a data resource is easier since you are accessing a high performance system. Although this ‘cache’ is persistent it is in no way a backup. Which means you have to manage backing up from the cloud to your own hardware/storage.
This was enlightening. I had thought Apple would have had backups of iCloud. Good to know not to trust it too much. The only thing I use iCloud for is to sync and back up my phone. I’m ok with that level of storage with them. I would never let more important things be stored only in a cloud.
“I had thought Apple would have had backups of iCloud. Good to know not to trust it too much.”
Cringely’s statements in this respect are pure FUD. Use some common sense.
Apple have more money than god, they have built a reputation on things just working, and they’ve invested their company future on iCloud. (Quite seriously. iCloud is Apple’s System 360, because Apple believes the future is individuals owning multiple Apple products, each doing their targeted tasks well but not doing everything badly, and all tied together via iCloud.)
Given this, do you honestly think they have not put together some sort of serious backup plan?
(And not just Apple. I’d say the same thing for Google. MS I’m less dogmatic about because it’s not yet clear to me how central the cloud is to their vision of their future, but I would imagine they’re also following best practices simply out of common sense.)
Yes yes we can all trot out some story of how someone lost a web page in iTools ten years ago. MS has also made mistakes in this field. So has Google. So has Amazon. No-one is perfect, but they’re all getting better all the time, and they’re all learning from each other.
(Vide this last week’s issue about the joint Apple+Amazon hack attack. Not especially relevant to backup, but obviously less than ideal, and just as obviously something both companies — and Google, and MS and Facebook, and …, will learn from. There ARE certainly companies that have no interest in improving their practices or serving their customers well — the telcos, the banks, the airlines all spring to mind — but I see no evidence that the top-tier CS companies should be thought of that way.)
That’s a fanboy response if I ever saw one.
It’s remarkably easy to dismiss a reasoned response as “fanboy” – you don’t have to actually think about what the person said. That’s too bad… if you’d really thought about it, you might have admitted the possibility that Bob presented something of a false dilemma by intimating that the requirements for HIPAA, SOX, or SAS 70 are the requirements anyone storing data must meet. Relatively speaking, there are few entities, corporate or individual, who have to meet those standards – or could afford to do so even if they desired to do so. Acting as though providers who don’t meet those standards, which were never designed for them, aren’t backing up or providing some level of data protection, borders on alarmism – and I’d be willing to bet you that Bob hasn’t seen the inside of most, if any, of the data centers for the big companies he mentioned. They’re not the kind of folks who give tours.
Now, their terms of service may not guarantee you data protection, and it’s always worth reminding people that these FREE services aren’t the same as data storage you pay for. But the implication that your data’s in danger of suddenly disappearing at DropBox, Apple, or Microsoft, especially as two of the three are syncing services, is worth at least a little critical analysis. And Mr. Handley’s reply to that effect was worth your critical analysis as well, not an out-of-hand dismissal.
“Bob presented something of a false dilemma by intimating that the requirements for HIPAA, SOX, or SAS 70 are the requirements anyone storing data must meet” Bob did not intimate this to me. He merely said that he was disappointed that Apple was not dependable for cloud storage…at least no better than other providers. But I agree that wishful thinking on the part of people who expect it to “just work” is the real cause of the problem. The “fanboy” defense of Apple is this part: “Apple have more money than god, they have built a reputation on things just working, and they’ve invested their company future on iCloud.”
Re: “I had thought Apple would have had backups of iCloud. Good to know not to trust it too much.” Any company can fail in that respect, including Apple. That’s why the second sentence is true if “it” refers to the “cloud” in general. I don’t expect any of the current cloud services to be up to the standards of the banking system for several decades. So local backups are still important.
It isn’t clear what Apple is doing. To this point iCloud seems to be made up of services from other companies including Amazon and Microsoft. So what’s actually happening with those huge Apple data centers? Nobody knows. You could be completely correct… or totally wrong.
“Firms like IBM provide a professional backup service.”
Hehehe…..you better do your due diligence and check an actual IBM cloud backup contract before you call it a “professional backup service”!
Professional….LoL!
I went on to describe in the next sentence what “professional” meant. I believe you’ll find the description compatible with what IBM generally does, though still not that impressive.
“Professional” means they do it as their profession, i.e. they make a significant amount of money from doing it. As with any profession, the service is what it is. If it turns out to be unsuitable for your needs, you seek another solution. When it comes to certain types of jobs you can do yourself, you can take the time, money, and effort to do a better job than any professional who has to get it done quickly so he can move on to the next job.
You’d be surprised at how many corporations don’t care whether their cloud storage vendor can pass an audit. Oh, they all talk a good game, but do they actually practice what they preach? Quite a few, unfortunately, do not. If they did, they’d emphasize the audits in their contracts and insist on stringent audit requirements, but those items drive up costs. Cloud service contracts aren’t driven by security, they’re driven by costs, and anything that makes it more expensive is usually tossed aside.
I imagine the data is not stored in a RAID array but more likely HDFS or another GFS alternative. In this respect, it is harder to argue that the data is not backed up. Nothing is 100%, so it becomes important to pay for the level of risk you want to take with your data. Hey, you could always have 2 or more “cloud” servers each with a copy of your data, plus the copy buried in the back yard 😉
SAS70 has now been replaced, more or less, by SSAE-16. The cloud services provide help with offsite backups, but yes, you should always assume they could go belly-up or have an incredible meltdown.
***slightly, but not wholly off topic***
I once wanted to see a naturopath physician but their office only used a paperless system for all their medical data and financial transactions. I thought initially that that would be alright but when I went to the login screen, it wasn’t on a secure page. When I called the office and told them that their login page was not secure (http instead of https), their IT guy told me that since the info from the login screen goes to (or ends up in) a secure server it was secure. But is this true? What about the info in transit before it gets to the secure server? I ended up not seeing that doctor because I didn’t want to log in on a page that didn’t have https. Sorry if this is off topic but if someone can explain this to me, that would be great.
You are correct – it’s like your postman telling you, “No, I can’t read that postcard you sent, because, once it’s delivered, it goes in a locked filing cabinet.”
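To make the postcard point concrete, here is a minimal Python sketch of the check the questioner did by eye. The function name and the example URLs are hypothetical, and it only inspects URL schemes; it says nothing about how the server stores data once it arrives.

```python
from urllib.parse import urljoin, urlparse


def credentials_encrypted_in_transit(page_url, form_action=""):
    """Return True only if both the login page and the form target use HTTPS.

    page_url:    address of the page showing the login form
    form_action: where the form posts to (may be relative or empty)
    """
    # A relative or empty action resolves against the page's own URL.
    target = urljoin(page_url, form_action)
    page_ok = urlparse(page_url).scheme == "https"
    target_ok = urlparse(target).scheme == "https"
    # If the page itself arrives over plain http, it can be altered in
    # transit, so a secure form target alone is not enough.
    return page_ok and target_ok


# Hypothetical clinic site: insecure page, "secure server" behind it.
print(credentials_encrypted_in_transit(
    "http://clinic.example/login", "https://secure.example/auth"))
```

A server that is locked down at rest does nothing for a password that crossed the network in the clear, which is exactly what this check flags.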
“I once wanted to see a naturopath physician…”
And you’re surprised that someone who makes a living being delusional doesn’t use IT properly?
As for the software bloat, I don’t care how many languages and binaries are in there, it’s ridiculous that iPhoto on the Mac is a gigabyte in size.
+1 🙂
It might help to think about the motivations for cloud storage companies. One is to access a steady subscriber base for a medium that decreases in hardware costs over time. Secondly, some can scan the data to learn about trends (or, for some “free” sites, more about you so that ads, etc. can be better targeted). Thirdly, some data are generic (professional songs, movies, tv shows, etc). Rather than housing everyone’s copy of the same thing, they can store master copies and distribute them. That’s why iTunes Match and the new Amazon match are so relatively cheap. Since professional media represents a significant amount of home use, this model relieves massive duplication and reduces storage need by the individual. That leaves user generated data (personal documents, images/photos and home movies – and pirated material) as the main storage needs. This should be independently backed up by a service that has the first motivation (i.e. it’s their business and you pay for it).
A note on iCloud. This is a hybrid model and it seems to me that Apple provides it more as a syncing service than for storage. Docs that are sent to iCloud still reside in a directory on each device (at least on Macs). However, deleting on one device deletes on all. Hence, this is not a backup at all.
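The difference between syncing and backing up can be shown with a toy model. This is only a sketch: the SyncService class is hypothetical and is not how iCloud or Dropbox is actually implemented. It just mirrors every change, deletions included, to every device.

```python
import copy


class SyncService:
    """Toy sync service: every change is mirrored to all devices."""

    def __init__(self, device_count):
        self.devices = [dict() for _ in range(device_count)]

    def save(self, name, content):
        for device in self.devices:
            device[name] = content

    def delete(self, name):
        # A deletion is just another change, so it propagates too.
        for device in self.devices:
            device.pop(name, None)


sync = SyncService(device_count=3)
sync.save("taxes.xls", "2011 return")

# An independent backup is a point-in-time copy the sync service cannot touch.
snapshot = copy.deepcopy(sync.devices[0])

sync.delete("taxes.xls")  # accidental delete on one device

still_synced = any("taxes.xls" in device for device in sync.devices)
print("still on a synced device:", still_synced)
print("still in the snapshot:", "taxes.xls" in snapshot)
```

Three synced copies give no protection against the delete; only the separate snapshot still holds the file.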
I’m guessing that for precisely the reasons mentioned the cloud storage phenomenon will go “pop” at some point.
The costs of a properly backed-up and maintained service, capable of being audited by the most stringent government or other agencies, should be high. And let’s be honest, the reason so many companies are jumping on a cloud bandwagon is to be seen to be reducing their overheads. Of course, they are, but only by opting for a data storage model built on sand, a model overpaid managers have no real concept of nor the real dangers associated with.
Local storage is now relatively cheap, it’s access to it that needs infrastructure and robust networking. I don’t see something as vague as a “Cloud” being solid enough.
The name says it all.
Actually Dropbox is built on top of AWS which is PCI, HIPAA, SAS 70 and Sarbanes-Oxley compliant. So Dropbox can be considered compliant. That’s one of the reasons why a lot of companies use AWS to provide their cloud services.
https://www.dropbox.com/help/238/
Is Dropbox HIPAA, FERPA, SAS 70, Safe Harbor, ISO 9001, ISO 27001, or PCI compliant?
We certified that we adhere to the US-EU and US-Swiss Safe Harbor as of February 2012.
Unfortunately, Dropbox does not currently have HIPAA, FERPA, SAS 70, ISO 9001, ISO 27001, or PCI certifications. We’ll update this page with any new certifications as we receive them, so please do check back.
You beat me to it.
SOX compliance is pretty much the minimum these days for data security.
As an amateur developer, I find cloud computing to be something of a dark art. I’d long thought it could be used as an alternative to making backups, with the theory that the cloud itself would likely have frequent viable backups I could fall back on. It seems that isn’t the case!
As an IT consultant, I have a few, not many, of my clients using some cloud based programs for their businesses, and beyond data loss is the issue of data DROP. Let us have old Optimum on-line climb up a pole, make modifications to their network and not tell anybody. Whammo, no internet. Happened to a medical office I support and without internet, NO BUSINESS, no patient management, no appts they could track. Now imagine if an EMERGENCY PATIENT was impacted by this loss?? We are talking genuine lawsuit here. And medical issues for the patient too. Medications? Nope. Not there. But everyone says the cloud is a beautiful thing.
The cloud exists for one reason: cheap. You can take room D303, get rid of that awful, expensive data center, get rid of the cooling devices, fire the IT staff and ship the whole damn thing to India and have it IN THE CLOUD! Note to reality: there is nothing IN the cloud. The cloud stores nothing, it is a global cable from your network to servers in another country. With somebody else’s hands on your secure data.
But again, no internet, no cloud, no business. Now how long can a business stay in business without its software?
Which is why I advocate LOCAL STORAGE and REDUNDANT levels of storage and TESTING too.
I guess cheap rules!!
When I first heard about “cloud storage” I thought it was like RAID; you’d install a filesystem driver that would split your data reads and writes across five, or seven, or however many remote servers. No server would ever have more than 1/5 of your data, and if one or two of them were inaccessible for some reason, no big deal.
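The scheme imagined above can be sketched in a few lines of Python. Assumptions, loudly: this uses four data chunks plus one XOR parity chunk (RAID-5 style), so it survives the loss of exactly one server; riding out two lost servers would need a stronger code such as Reed-Solomon. The function names are hypothetical and the "servers" are just list entries.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe(data, data_servers=4):
    """Split data into equal chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // data_servers)  # ceiling division
    padded = data.ljust(chunk_len * data_servers, b"\0")  # zero padding
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len]
              for i in range(data_servers)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]  # one share per server


def rebuild(shares, missing_index):
    """Recover the share of one unreachable server from the survivors."""
    survivors = [s for i, s in enumerate(shares) if i != missing_index]
    return reduce(xor_bytes, survivors)


shares = stripe(b"family photos 2012")
recovered = rebuild(shares, missing_index=2)  # pretend server 2 is down
print(recovered == shares[2])
```

Note that redundancy of this kind is still the RAID idea stretched across a network: it covers a dead server, not an accidental delete or a provider that goes out of business.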
Instead, “cloud storage” turned out to be the same old “internet storage” schtick, like the “back up your hard drive to the internet!” companies from 1995, or people who stored all their TRS-80 files on Compuserve in 1985…
I think I’ll stick with local storage and verified backups with off-site rotation…
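For what "verified" might mean in practice, here is a minimal sketch that copies a folder and then re-reads both copies to compare SHA-256 checksums. The function names are hypothetical, it assumes the backup folder is not inside the source folder, and a real setup would also do test restores and rotate a copy off-site.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Checksum a file in 1 MB blocks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def backup_and_verify(source_dir, backup_dir):
    """Copy every file, then compare checksums of original and copy.

    Returns the relative paths that failed verification (empty if all good).
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    failures = []
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents and timestamps
        if sha256_of(src) != sha256_of(dst):
            failures.append(str(rel))
    return failures
```

Pointing source_dir at a local Dropbox folder and backup_dir at an external drive would give a copy that a later deletion in the synced folder cannot reach.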
A little late to this party but anyways
http://thedailywtf.com/Articles/The-Unimportant-Clients.aspx
So what does Amazon’s Glacier say about this? I can’t see if it’s ISO worthy or FIPS certified – anyone know?