DavsDisorder

This blog captures some of the observations of Tim Davoren, Data Engines' founder and Managing Consultant. Do not expect an especially coherent delivery here!

Please let us know what you think of our new logo

Tim Davoren - Tuesday, May 10, 2011
Hi everyone, if you are interested, please jump over to here and tell us what you think of the new logo we are thinking of going with as we transition away from the ENSTOR name and logo.

Designing for Failure

Tim Davoren - Friday, April 30, 2010
I recently read a news article about Intermedia's service level agreement 'miss' that was linked to a performance issue on an EMC CLARiiON array.

http://searchstorage.techtarget.com/news/article/0,289142,sid5_gci1510721,00.html


There have also been a couple of subsequent posts and email responses linked to this story:

http://itknowledgeexchange.techtarget.com/storage-soup/one-storage-pros-response-to-intermedias-hosted-email-outage/

http://chucksblog.emc.com/chucks_blog/2010/04/helping-to-avoid-a-really-bad-day.html


I wanted to make a few comments myself regarding the story and the responses shown above.

Firstly, I agree with everything Chuck Hollis at EMC says in his post, and I wanted to emphasise and elaborate on his points.

Products Fail?
Damn right they do, all the time...sometimes without causing much of a fuss, but trust me, failures only seem uncommon because you only hear about the big ones (like Intermedia's). It is a testament to IT hardware vendors' engineering that a lot of these "failures" go unnoticed because of the rigorous redundancy built into their systems...not to mention field support services which, in the case of EMC, are some of the best around.

A short anecdote that relates to this story: an insurance client of ours suffered a similar failure on their IBM N-Series (NetApp) devices a few years back. A controller panicked due to a power supply issue and tried to hand over its load to the other controller but, due to incorrect configuration of multi-pathing, dropped all the workloads that it was serving. Result: reboots, reboots, reboots. Missed SLA.

Design for Failure
It will happen...not if, but when. You will have a component failure somewhere in your data path at some point in the future. Design for it (or insure for it!).

CLARiiON arrays (like N-Series, HDS and arrays from many other vendors) have controllers that operate in an active/active configuration, which is great when both controllers are working, and 99.99% of the time it works fine when one fails (the beauty of PowerPath). But the disadvantage of running an active/active architecture in a disk array is that, unless you religiously monitor your workloads, you can never be sure if you can meet performance demands in a degraded state (this principle applies all down the data path, even to RAID Group design and LUN layout). My favourite disk array of the last 10 years is EqualLogic's PS Series, now owned by Dell. These fellas only operate in active/passive mode to ensure customers don't accidentally find themselves in Intermedia's situation where peak load cannot be accommodated in degraded mode.
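The degraded-state concern can be checked with simple arithmetic. What follows is a minimal sketch and not a vendor tool: the function name, the choice of IOPS as the load metric, the assumption that surviving controllers share the load evenly, and the sample figures are all made up for illustration.

```python
def survives_controller_failure(peak_loads, controller_capacity):
    """Return True if the array can still carry peak load with one controller down.

    peak_loads: peak load on each controller (e.g. IOPS); hypothetical figures.
    controller_capacity: the most a single controller can sustain, in the same unit.
    Assumes the surviving controllers share the total load evenly.
    """
    if len(peak_loads) < 2:
        return False  # no surviving partner to fail over to
    total = sum(peak_loads)
    survivors = len(peak_loads) - 1
    # After any single failure, the survivors must absorb the whole load.
    return total <= survivors * controller_capacity


# Two controllers each at 70% of a 10,000 IOPS ceiling at peak:
print(survives_controller_failure([7000, 7000], 10000))  # False: 14,000 > 10,000
# The same pair held at 50% each:
print(survives_controller_failure([5000, 5000], 10000))  # True
```

For a two-controller active/active array this boils down to keeping each controller at or below half of its ceiling at peak, which is effectively what an active/passive design enforces for you.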

The Alarm is Ringing but Everyone's Asleep
This is an interesting point...vendors and integrators like ourselves put effort into engineering and deploying monitoring and alerting for systems in client sites. That's great, but if the client doesn't put in place procedural steps that are triggered into action by these tools, all is for nought. There is no point in having a tight RPO and the ability to deliver a quick RTO unless you have the procedural surety to act when issues are identified. EMC's DialHome feature is a good example of removing this dependency, but it's simply not possible (nor do you want it to be possible) for all system or component failures. In short, your recovery time is only as good as the weakest trigger point, and usually that trigger point is simply deciding to act on an error/mis-configuration event.

Practice Failure
Great tip here...I hear clients and prospects talk about their highly redundant environments and their sub-minute failover setups, and when I ask whether they have tested them...usually the answer is no. Reminds me of people who love to talk about how much their house has gone up in value...inevitably when they actually want to sell they are a little disappointed. Proof is in the pudding. You must test your failure recovery procedures. VMware's SRM product is an excellent tool for doing this non-disruptively. Clients should regularly test failover of their Tier 1/2 applications to ensure that the 'best laid plans' are also the 'tried and true' method.


Approaches to Archiving - a cheatsheet for IT Executives

Tim Davoren - Monday, April 05, 2010

This is a little reflection I wrote some years back that I just uncovered. I think most of the key points are still relevant, as they ought to be. Regardless of what technology is used, archiving is a generic information management technique (like backups) that should be approached from high-level first principles before assessing what tools an organisation can leverage.

 

Why Archive Anyway? To Manage RISK & Manage RETURN

  • Reduce storage costs (acquisition and management).
  • Improve email/file server performance.
  • Manage unstructured information – security, business value, risk...
  • User accessibility and user expiration/termination
  • Simplify Discovery
  • Simplify Protection (backup)

What to Think About?

 

  • A long term perspective must be taken in considering an archive product. This product/mechanism/solution will be a strategic platform and is going to be there for a long time!
    • Growth – staff/data?
    • Mergers, multiple business unit integration?
    • Retention policies should match the vendor’s longevity!!
    • Data portability - can I move my archives around?
  • Manageability and Security and Reporting
    • Other applications that the archive may touch/be touched by
    • Role based administration of the archive application
    • Trending feedback – scale up before you hit the wall
  • Focus on Content Intelligence
    • Key organisational data types and storage locations
    • Where and how will data be created
  • Protect your archive application like a Tier 1 application
    • The archive application must be deployed as if it were a Tier 1 system...it now stands as a critical link in business data access.
  • Relate policy to Directory Services.
    • Don’t re-invent the wheel...you will already have IT governance policy in some basic form in Directory Services...ensure your solution can utilise and hinge from this.
  • IT Budget generally grows at about 2%, but Storage grows at about 7%...Stress that archiving is a ‘storage’ application not an email application.
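The gap between budget growth and storage growth compounds year on year, which is easy to underestimate. Here is a rough sketch of that arithmetic, taking the 2% and 7% figures above at face value and assuming both rates stay constant; the function names are made up for illustration.

```python
import math

BUDGET_GROWTH = 0.02   # roughly 2% a year, the figure quoted above
STORAGE_GROWTH = 0.07  # roughly 7% a year, the figure quoted above


def storage_to_budget_ratio(years):
    """How far storage demand has outgrown budget after `years` (1.0 at year zero)."""
    return ((1 + STORAGE_GROWTH) / (1 + BUDGET_GROWTH)) ** years


def years_until_ratio(target):
    """Years of compounding before the ratio reaches `target`."""
    return math.log(target) / math.log((1 + STORAGE_GROWTH) / (1 + BUDGET_GROWTH))


for year in (1, 5, 10):
    print(year, round(storage_to_budget_ratio(year), 2))
print(round(years_until_ratio(2.0), 1))
```

On these assumptions, storage demand doubles relative to budget in somewhere between 14 and 15 years, which sits well inside the lifetime of many retention policies.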

A Farewell to Arms

Tim Davoren - Thursday, December 31, 2009
Well, with a tip of the hat to old Ernest, it is with some degree of relief we farewell a turbulent and bloody year. We look forward to a more peaceful and prosperous 2010. Me and the boys have just had our last 'meeting', making sure all is in order for next week's start to 2010, and as a coda to that meeting we very hastily compiled a list of the 10 most influential and impactful technologies/products to have held sway throughout the past decade. We mainly focused on stuff in the realm of 'technology' (whatever that means) in the consumer and business space...we left rocket science out of the equation. Well, here it is...please feel free to comment on them and remember they are in no particular order:

  1. Google's Search Engine (no explanation needed)
  2. Wired & Wireless, Broadly Available, High Speed Internet Access (hard to overestimate this, and it's all thanks to bullish business overestimation boom & bust)
  3. Apple's iPhone (the crowning glory of a decade of mobile telephony innovation...the social impact of mobile phones is vast and worthy of further study)
  4. Peer 2 Peer Networking Technology (free music...legal disputes)
  5. YouTube (collective therapy for failed ambitions of all kinds...and cats doing funny things)
  6. Social Networking (think twice before you post...the end of syndicated journalism?)
  7. Digital Cameras (read, digital phone cameras as above...massive social data creation...most recorded decade in human history)
  8. Computer Virtualisation (read VMware's ESX technology...amongst other lesser candidates...impact still being felt...VMware fastest growing software in history - 2008)
  9. Global Positioning Systems (GPS) (warfare is remote controlled...couples stop fighting in cars...big brother knows where you are...cf. #3)
  10. Malware (whilst not a technology per se...represents the single biggest threat {alongside dwindling power supply} to our connected world)

An age old question: backup or archive?

Tim Davoren - Sunday, December 06, 2009

I refer to the article Data backup vs. data archiving: Is data backup closing the gap? published on TechTarget's SearchDataBackup.com by Ron Scruggs.

I am amazed that, after almost 60 years of data storage and backup on electro-magnetic media, people are still confused as to what a "Backup" is and what an "Archive" is. Before I pass on my very simple explanation, let me just say that the article by Scruggs is relevant for those wanting to understand the 'vendor-speak' around backup and archive. It is focused on the differences between vendors' individual 'backup' and 'archive' products...but as I am sure you all know, vendors are sometimes given to blurring lines, and likewise drawing strict demarcation lines, between technologies and technical practices where they ought not really exist (practically speaking)!

Ok, are you ready...this is the difference between a 'Backup' and an 'Archive'...drum roll:

A backup is a copy of a primary source of data, whereas an archive is an immutable primary source of data

Did you get it?

You see, people and organisations create data for all sorts of reasons. Think about your own 'personal data space'. You own note pads, photo albums, Foxtel IQ, white boards, phone directories, property deeds, etc, etc. Each of these pieces of data has characteristics you assign to it...hopefully you are getting the picture here...a photo album is an archive (i.e. you would not wish to have the data contained therein changed in the future...unfortunately for some that is a problem!), whereas a phone directory is 'live data' (i.e. it is susceptible to change...your friend may move house and get a new phone number...or you might even make a new friend, heaven forbid). You would like to protect both forms of data, I would assume?

You'll notice that another dynamic is entering the discussion here...archives need to be backed up...that discussion is for another time. Suffice it to say here that most IT departments have backup of 'live data' on their lists of "expected duties" and by and large the task is well understood. The decision to 'archive' data is generally not in the hands of IT departments; it should be treated as a business governance and workflow issue. Firms that understand DMS (Document Management Systems) probably already know this; for others out there, please don't let vendors hoodwink you into buying a backup product simply because it archives, nor an archive product simply because it backs up data too.

Consider the role of the humble 'archive bit', a file attribute common to many file systems (notably FAT and NTFS). This switch indicates whether a file's contents have changed and thus whether a backup product should make another copy of that file. If the file has not changed since the last backup, the archive bit stays off. If you (or your business stakeholders) are happy for that bit to be forced to 'off' (i.e. the file cannot be changed) then it ought to live in an archive and hence attract different consideration to your 'live data'.
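To make the archive bit's role concrete, here is a small sketch that models the behaviour in memory rather than reading attributes from a real file system. The File class, the incremental_backup function and the file names are all made up for illustration; real backup products differ in the detail.

```python
from dataclasses import dataclass


@dataclass
class File:
    name: str
    content: str
    archive_bit: bool = True  # a new file has not been backed up yet

    def write(self, content):
        self.content = content
        self.archive_bit = True  # any change flags the file for the next backup


def incremental_backup(files):
    """Copy every flagged file, then clear its flag. Returns the names copied."""
    copied = []
    for f in files:
        if f.archive_bit:
            copied.append(f.name)
            f.archive_bit = False
    return copied


files = [File("phone_directory.txt", "v1"), File("photo_album.jpg", "holiday")]
print(incremental_backup(files))  # both files are new, so both are copied
files[0].write("v2")              # the live data changes
print(incremental_backup(files))  # only the phone directory is copied again
```

The photo album never changes, so after its first copy it never shows up in a backup run again. That is the behaviour that marks it out as a candidate for an archive.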

