<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blog:  Enterprise Storage Sense &#187; storage reliability</title>
	<atom:link href="http://blog.enterprisestoragesense.com/category/storage-reliability/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.enterprisestoragesense.com</link>
	<description>Insight, analysis and commentary on data storage industry trends and technologies.</description>
	<lastBuildDate>Mon, 30 Nov 2009 20:52:16 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Settling the SSD ‘High-Cost’ Debate</title>
		<link>http://blog.enterprisestoragesense.com/2009/06/08/settling-the-ssd-%e2%80%98high-cost%e2%80%99-debate/</link>
		<comments>http://blog.enterprisestoragesense.com/2009/06/08/settling-the-ssd-%e2%80%98high-cost%e2%80%99-debate/#comments</comments>
		<pubDate>Mon, 08 Jun 2009 19:27:31 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Enterprise Flash Drive]]></category>
		<category><![CDATA[Flash Technology]]></category>
		<category><![CDATA[Green IT]]></category>
		<category><![CDATA[HDD]]></category>
		<category><![CDATA[Hard Disk Drives]]></category>
		<category><![CDATA[I/O performance]]></category>
		<category><![CDATA[IOPs]]></category>
		<category><![CDATA[SSD]]></category>
		<category><![CDATA[Solid State Drives]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data center storage]]></category>
		<category><![CDATA[energy consumption]]></category>
		<category><![CDATA[enterprise storage]]></category>
		<category><![CDATA[storage reliability]]></category>
		<category><![CDATA[data centers]]></category>
		<category><![CDATA[energy efficiency]]></category>
		<category><![CDATA[Enterprise Flash Drives]]></category>
		<category><![CDATA[Flash]]></category>
		<category><![CDATA[hard drive]]></category>
		<category><![CDATA[TCO]]></category>

		<guid isPermaLink="false">http://blog.enterprisestoragesense.com/?p=81</guid>
		<description><![CDATA[A criticism I often hear from industry insiders and ‘experts’ is that the higher cost and TCO (Total Cost of Ownership) of SSD technology is a significant barrier to rapid and widespread enterprise adoption.
Nothing could be further from the truth.
I believe that this stems from the fact that the industry is stuck on using the [...]]]></description>
			<content:encoded><![CDATA[<p>A criticism I often hear from industry insiders and ‘experts’ is that the higher cost and TCO (Total Cost of Ownership) of SSD technology is a significant barrier to rapid and widespread enterprise adoption.</p>
<p>Nothing could be further from the truth.</p>
<p>I believe that this stems from the fact that the industry is stuck on using the HDD metric of $/GB and single drive cost as the primary measures of the cost. As I wrote in a previous post, “<a href="http://blog.enterprisestoragesense.com/2009/04/17/storage-managers-getting-wise-to-prevailing-ssd-limitations/" target="_blank">Storage managers getting wise to prevailing SSD limitations</a>”, looking at historical or single drive cost metrics doesn’t accurately measure solution-level costs. So let’s try this again.</p>
<p>Yes, individual enterprise-class solid state drives (Enterprise Flash Drives) cost more than individual enterprise hard drives. So having stated this fact, let’s also be sure to state the fact that EFDs offer tremendous performance boosts (&gt;100X), and can replace many 15K RPM HDDs. Budget constraints require that enterprises and data centers focus on maximizing both performance and efficiency, so <em><strong>transaction cost </strong></em>($/IOPS) is also a key metric.</p>
<p>The goal is to provide a storage solution that optimizes for both $/GB and $/IOPS.</p>
<p>Let’s look at a typical transaction processing (OLTP) workload from the <a href="http://www.tpc.org/tpcc/results/tpcc_perf_results.asp" target="_blank">TPC-C benchmarks</a> (<a href="http://www.tpc.org/tpcc/results/tpcc_perf_results.asp" target="_blank">http://www.tpc.org/tpcc/results/tpcc_perf_results.asp</a>). The storage solution must provide 640,000 transactions/minute (320,000 IOPS) for 18 TB of data. With a typical all-HDD solution, this requires:</p>
<ul>
<li>1,000 15K RPM 2.5-inch HDDs (each short-stroked to 18 GB)</li>
<li>40 rack-mounted shelves</li>
<li>8,000 watts to operate and (<span style="text-decoration: underline;">an additional</span>) 8,000 watts to cool</li>
<li>Price tag = $450,000</li>
</ul>
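<p>To make the solution-level metrics concrete, here is a minimal sketch that derives $/GB, $/IOPS and total power from the figures listed above. It assumes 1 TB = 1,000 GB; the function and constant names are illustrative only.</p>

```python
# Figures quoted above for the all-HDD configuration.
HDD_COUNT = 1000            # 15K RPM 2.5-inch drives
GB_PER_HDD = 18             # short-stroked capacity per drive
REQUIRED_IOPS = 320_000     # workload requirement
TOTAL_PRICE_USD = 450_000
OPERATING_WATTS = 8000
COOLING_WATTS = 8000


def hdd_only_metrics():
    """Return solution-level cost and power metrics for the all-HDD build."""
    capacity_gb = HDD_COUNT * GB_PER_HDD
    return {
        "capacity_gb": capacity_gb,
        "iops_per_drive": REQUIRED_IOPS / HDD_COUNT,
        "usd_per_gb": TOTAL_PRICE_USD / capacity_gb,
        "usd_per_iops": TOTAL_PRICE_USD / REQUIRED_IOPS,
        "total_watts": OPERATING_WATTS + COOLING_WATTS,
    }


if __name__ == "__main__":
    for name, value in hdd_only_metrics().items():
        print(f"{name}: {value}")
```

<p>With these inputs the all-HDD build works out to 320 IOPS per drive, $25 per GB, roughly $1.41 per IOPS, and 16,000 watts in total.</p>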
<p>Now, let’s look at how a ‘hybrid’ approach combining EFDs and existing HDDs can not only provide a lower transaction cost, but also a <span style="text-decoration: underline;">lower cost/GB</span> and a <span style="text-decoration: underline;">lower total cost</span>. This hybrid solution would be configured as outlined below:</p>
<p><a href="http://blog.enterprisestoragesense.com/wp-content/uploads/2009/06/ssd-cost-comparison-chart.png"><img class="aligncenter size-full wp-image-84" title="ssd-cost-comparison-chart" src="http://blog.enterprisestoragesense.com/wp-content/uploads/2009/06/ssd-cost-comparison-chart.png" alt="" width="500" height="318" /></a></p>
<p>Not only does the hybrid approach offer a much lower $/GB and $/IOPS (and require 34 fewer shelves), but the total cost is <span style="text-decoration: underline;">one-half</span> that of the HDD-only configuration.</p>
<p>Did you catch that?  <span style="text-decoration: underline;"><em><strong>One-half</strong></em></span> the total cost.</p>
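<p>The comparison can be sketched from the ratios stated in the text alone. This is an illustration under a loud assumption: the hybrid configuration is taken to deliver the same 18 TB and 320,000 IOPS as the all-HDD build, and the hybrid price is derived from the "one-half" statement rather than read from the chart, so the chart's exact values may differ.</p>

```python
HDD_ONLY_PRICE_USD = 450_000
HDD_ONLY_SHELVES = 40
CAPACITY_GB = 18_000        # 18 TB, taking 1 TB = 1,000 GB
REQUIRED_IOPS = 320_000

# Ratios stated in the post; the hybrid's capacity and IOPS are ASSUMED equal
# to the all-HDD requirement, since the chart data is not reproduced here.
HYBRID_PRICE_FRACTION = 0.5
SHELVES_SAVED = 34


def compare():
    """Derive per-GB and per-IOPS costs for both configurations."""
    hybrid_price = HDD_ONLY_PRICE_USD * HYBRID_PRICE_FRACTION
    rows = {}
    for label, price, shelves in (
        ("hdd_only", HDD_ONLY_PRICE_USD, HDD_ONLY_SHELVES),
        ("hybrid", hybrid_price, HDD_ONLY_SHELVES - SHELVES_SAVED),
    ):
        rows[label] = {
            "price_usd": price,
            "shelves": shelves,
            "usd_per_gb": price / CAPACITY_GB,
            "usd_per_iops": price / REQUIRED_IOPS,
        }
    return rows


if __name__ == "__main__":
    for label, row in compare().items():
        print(label, row)
```

<p>Under that assumption, halving the total price halves both $/GB and $/IOPS, and the shelf count drops from 40 to 6.</p>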
<p>At the end of the day, the numbers don’t lie. The value proposition of EFDs is simple: they provide ‘more for less’ – more performance and more reliability for less cost, less power and less floor space. And EFDs can be managed with existing software.</p>
<p>What will IT managers do with all the savings?</p>
<p>Amyl Ahola</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.enterprisestoragesense.com/2009/06/08/settling-the-ssd-%e2%80%98high-cost%e2%80%99-debate/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Storage managers getting wise to prevailing SSD limitations</title>
		<link>http://blog.enterprisestoragesense.com/2009/04/17/storage-managers-getting-wise-to-prevailing-ssd-limitations/</link>
		<comments>http://blog.enterprisestoragesense.com/2009/04/17/storage-managers-getting-wise-to-prevailing-ssd-limitations/#comments</comments>
		<pubDate>Fri, 17 Apr 2009 21:20:55 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Enterprise Flash Drive]]></category>
		<category><![CDATA[Flash Technology]]></category>
		<category><![CDATA[HDD]]></category>
		<category><![CDATA[Hard Disk Drives]]></category>
		<category><![CDATA[IOPs]]></category>
		<category><![CDATA[SSD]]></category>
		<category><![CDATA[data center storage]]></category>
		<category><![CDATA[enterprise storage]]></category>
		<category><![CDATA[storage reliability]]></category>
		<category><![CDATA[Enterprise Flash Drives]]></category>
		<category><![CDATA[Solid State Drives]]></category>
		<category><![CDATA[storage performance]]></category>

		<guid isPermaLink="false">http://blog.enterprisestoragesense.com/?p=68</guid>
		<description><![CDATA[The industry is catching on to what I’ve been talking about for some time: flash technology offers tremendous value for the enterprise, yet adoption hinges on addressing the prevailing limitations of existing SSDs first.
This ‘revelation’ appeared in a SearchStorage.com article by Beth Pariseau, “Storage admins mull SSDs at SNW.”  The article quotes multiple storage administrators [...]]]></description>
			<content:encoded><![CDATA[<p>The industry is catching on to what I’ve been talking about for some time: flash technology offers tremendous value for the enterprise, yet adoption hinges on addressing the prevailing limitations of existing <a href="http://en.wikipedia.org/wiki/Solid-state_drive" target="_blank">SSD</a>s first.</p>
<p>This ‘revelation’ appeared in a <a href="http://www.searchstorage.com" target="_blank">SearchStorage.com</a> article by Beth Pariseau, “<a href="http://searchstorage.techtarget.com/news/article/0,289142,sid5_gci1353007,00.html?track=NL-52&amp;ad=698619&amp;asrc=EM_NLN_6518606&amp;uid=8206148" target="_blank">Storage admins mull SSDs at SNW</a>.”  The article quotes multiple storage administrators who all basically believe in the benefits of SSD, but stop short of saying that the technology is ready for prime time.</p>
<p>Here are their top concerns: predictable performance, data integrity, the lack of consistent, industry-accepted SSD benchmarks, and cost.</p>
<p>Let’s quickly look at each of these:</p>
<ol>
<li> <strong>Predictable performance </strong>– I covered this recently in my “<a href="http://blog.enterprisestoragesense.com/2008/11/05/%E2%80%9Cpredictable-performance%E2%80%9D-for-changing-business-dynamics/" target="_blank">’Predictable performance’ for changing business dynamics</a>” post. This area has traditionally been a challenge for SSDs in enterprise applications because workloads are random and indeterminate. Predictability requires consistent performance, independent of whether reading or writing data, because enterprise applications typically vary the read-to-write ratio between 60/40 and 90/10. Enterprise SSDs should be able to maintain performance across this range.</li>
<li><strong>Data integrity</strong> – I couldn’t agree more that data integrity features are critical if flash technology is to perform at enterprise levels, and the <a href="http://en.wikipedia.org/wiki/Data_Integrity_Field" target="_blank">Data Integrity Field (DIF)</a> standard is an important step in this direction. Yet, today so few storage devices support the DIF standard. Pliant began mapping toward the DIF standard early on, recognizing how important it was for enterprise-class storage systems.</li>
<li><strong>Standardized benchmarks</strong> – In my post, “<a href="http://blog.enterprisestoragesense.com/2009/03/05/ssd-jargon-and-the-need-for-standards/" target="_blank">SSD jargon and the need for standards</a>,” I listed a number of pivotal questions that must be addressed if the industry is ever to develop more accurate, relevant – and yes, consistent – SSD benchmarks. These include making sure that real performance is measured and that product lifecycle benchmarks are based on true, 100% duty cycle operation. If product life metrics are contingent on usage limitations – e.g., based on a maximum number of writes or writes per day due to limited error management capability – then the benchmarks are virtually useless.</li>
<li><strong>Cost</strong> – Transaction cost ($/IOPS) is the key SSD metric to consider, not the old HDD industry metric of $/GB. Cost per gigabyte is an irrelevant measure of SSD value as a performance solution, and we expect EFDs (Enterprise Flash Drives) to complement high-capacity HDDs to optimize for both $/IOPS and $/GB.</li>
</ol>
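<p>The predictable-performance point can be illustrated with a simple model. The sketch below assumes operations are served one after another, so the blended rate is the weighted harmonic mean of the read and write rates; real devices overlap operations, and both example devices are hypothetical rather than measurements of any product.</p>

```python
def mixed_iops(read_iops, write_iops, read_fraction):
    """Effective IOPS when a given fraction of operations are reads.

    Each operation takes 1/read_iops or 1/write_iops seconds on average,
    so the blended rate is the weighted harmonic mean of the two rates.
    """
    write_fraction = 1.0 - read_fraction
    seconds_per_op = read_fraction / read_iops + write_fraction / write_iops
    return 1.0 / seconds_per_op


# Hypothetical devices, not measurements of any real product.
DEVICES = {
    "asymmetric": (100_000, 10_000),   # fast reads, slow writes
    "balanced": (100_000, 80_000),
}

if __name__ == "__main__":
    for name, (read_rate, write_rate) in DEVICES.items():
        at_90 = mixed_iops(read_rate, write_rate, 0.90)
        at_60 = mixed_iops(read_rate, write_rate, 0.60)
        print(f"{name}: 90/10 = {at_90:,.0f} IOPS, 60/40 = {at_60:,.0f} IOPS")
```

<p>In this model the asymmetric device falls from about 52,600 IOPS at a 90/10 mix to about 21,700 IOPS at 60/40, while the balanced device stays above 90,000 IOPS across the whole range.</p>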
<p>With most existing vendors either falling short on a number of these points, or masking the limitations of their devices behind carefully crafted marketing spin, it’s no wonder that some storage admins are still skeptical.</p>
<p>This is why I continue to extol the values of EFDs, a new class of solid state storage devices designed with key enterprise considerations in mind. By definition, EFDs are designed to address all of the above issues.</p>
<p>And, as we prepare to announce availability of our first products shortly, my hope is that our approach will help turn the heads and change the minds of the remaining nay-sayers in the industry.</p>
<p>Amyl Ahola</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.enterprisestoragesense.com/2009/04/17/storage-managers-getting-wise-to-prevailing-ssd-limitations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SSD jargon and the need for standards</title>
		<link>http://blog.enterprisestoragesense.com/2009/03/05/ssd-jargon-and-the-need-for-standards/</link>
		<comments>http://blog.enterprisestoragesense.com/2009/03/05/ssd-jargon-and-the-need-for-standards/#comments</comments>
		<pubDate>Thu, 05 Mar 2009 18:44:19 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[I/O performance]]></category>
		<category><![CDATA[SSD]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data center storage]]></category>
		<category><![CDATA[enterprise storage]]></category>
		<category><![CDATA[storage reliability]]></category>
		<category><![CDATA[STORAGESearch.com]]></category>

		<guid isPermaLink="false">http://blog.enterprisestoragesense.com/?p=57</guid>
		<description><![CDATA[A recent article by editor Zsolt Kerekes of STORAGEsearch.com entitled, “flash SSD Jargon Explained,” got my attention.  The fact that there is a need to explain the jargon is a reminder that the marketing wizards keep inventing new terms to ‘differentiate’ their products, while confusing most of us and masking issues of real importance to [...]]]></description>
			<content:encoded><![CDATA[<p>A recent article by editor Zsolt Kerekes of <a href="http://www.storagesearch.com/" target="_blank">STORAGEsearch.com</a> entitled, “<a href="http://www.storagesearch.com/ssd-jargon.html" target="_blank">flash SSD Jargon Explained,</a>” got my attention.  The fact that there is a need to explain the jargon is a reminder that the marketing wizards keep inventing new terms to ‘differentiate’ their products, while confusing most of us and masking issues of real importance to data center operations.   A version of marketing 101: If you have a weakness, flaunt it.</p>
<p>The list of <a href="http://en.wikipedia.org/wiki/Solid-state_drive" target="_blank">SSD</a> jargon Kerekes cites in the article includes: dynamic leveling, active leveling, static leveling, BCH codes, Reed Solomon codes, write endurance, write amplification, write attenuation, garbage collection, read patrol, wear leveling, read disturb, and program disturb.</p>
<p>Look at the last two.  These are rarely discussed but are among the most important issues to those who care about losing data.  An earlier STORAGEsearch.com article asks the question, “<a href="http://www.storagesearch.com/ssd-testart.html" target="_blank">Can you trust your flash SSD specs &amp; Benchmarks?</a>”  The answer can only be ‘of course not!’  At least not until there is some semblance of standardization.  This is especially true when considering using SSDs to meet the performance and reliability demands of enterprise applications.</p>
<p>With this in mind, some questions that should be asked (and answered) about SSD performance and reliability specs and benchmarks are:</p>
<p>1.    What is the real performance?</p>
<p>A simple question but rarely, if ever, addressed in the specifications.  Typical enterprise workloads are random, 60%-70% read, with 4K/8K blocks, not small blocks (512 bytes) chosen to show high IOPS or large blocks chosen to show high bandwidth.</p>
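<p>Why block size flatters a spec sheet can be seen from the relation bandwidth = IOPS x block size. The sketch below assumes a hypothetical device with two ceilings, one on operations per second and one on bytes per second; both limits are made-up values for illustration.</p>

```python
def achievable(block_bytes, iops_limit, bandwidth_limit_bytes_per_s):
    """Return (IOPS, MB/s) at one block size under two device ceilings."""
    iops = min(iops_limit, bandwidth_limit_bytes_per_s / block_bytes)
    megabytes_per_s = iops * block_bytes / 1e6
    return iops, megabytes_per_s


# Hypothetical ceilings, not taken from any real product.
IOPS_LIMIT = 50_000
BANDWIDTH_LIMIT = 200e6     # bytes per second

if __name__ == "__main__":
    for block in (512, 4096, 8192, 131072):
        iops, mbps = achievable(block, IOPS_LIMIT, BANDWIDTH_LIMIT)
        print(f"{block:7d} B blocks: {iops:10,.0f} IOPS, {mbps:6.1f} MB/s")
```

<p>With these limits, 512-byte blocks show the headline 50,000 IOPS but move only 25.6 MB/s, while 128 KB blocks show the headline 200 MB/s at roughly 1,500 IOPS. Neither number describes the 4K/8K case.</p>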
<p>2.    Is the performance deterministic?</p>
<p>The writing process for flash is inherently slower than reading.  Does the performance drop substantially as a function of the read/write mix or does it stay relatively constant as needed to maintain consistent response times?  Is the performance dependent upon the use of cache (and the associated power loss and recovery issues of volatile cache memory)?</p>
<p>3.    Is the performance sustainable?</p>
<p>What does ‘sustainable’ mean? It is not unusual for performance to degrade as more and more of the device gets written to…it may take minutes or hours, but degradation of 50% or more may occur.</p>
<p>4.    What is the capacity available to the user?</p>
<p>Another simple question, but all SSDs contain more flash than that available for end user data. For example, the additional (or over-provisioned) flash may be used to optimize write performance, provide for spare blocks, CRC codes, ECC codes, and meta data.  Does the stated capacity net this all out?</p>
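<p>A small sketch of the netting-out question, using the common definition of over-provisioning as reserved flash relative to user capacity. The reserve budget below is entirely hypothetical and only shows how the categories named above add up.</p>

```python
# Hypothetical reserve budget in GB; the categories follow the text above,
# the sizes are invented for illustration.
RESERVED_GB = {
    "write optimization": 32,
    "spare blocks": 16,
    "CRC and ECC codes": 6,
    "metadata": 2,
}


def user_capacity_gb(raw_flash_gb, reserved_gb):
    """Capacity left for end-user data after all reserves are netted out."""
    return raw_flash_gb - sum(reserved_gb.values())


def over_provisioning_pct(raw_flash_gb, user_gb):
    """Reserved flash expressed as a percentage of user capacity."""
    return (raw_flash_gb - user_gb) / user_gb * 100.0


if __name__ == "__main__":
    raw = 256
    user = user_capacity_gb(raw, RESERVED_GB)
    print(f"raw flash: {raw} GB, user capacity: {user} GB")
    print(f"over-provisioning: {over_provisioning_pct(raw, user):.1f}%")
```

<p>In this example a device built from 256 GB of flash offers 200 GB to the user, which is 28% over-provisioning. A stated capacity of 256 GB would not have netted anything out.</p>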
<p>5.    Are there duty cycle or other limitations on usage in order to achieve/maintain the specifications?</p>
<p>Does the architecture provide for 100% duty cycle, or is the product life contingent on a maximum number of writes or writes per day due to limited error management capability?</p>
<p>Is it assumed there will be ‘adequate’ idle time (what’s that in the enterprise?) to perform the necessary flash management activities?</p>
<p>6.    Are the error management and ECC algorithms powerful enough to correct read disturb and program disturb errors without resulting in excessive rates of uncorrectable errors and/or losing capacity due to bad block mapping?</p>
<p>Error correction approaches which utilize limited ECC to correct random bit failures may not have sufficient correction capability for read/program disturb errors. Correction capabilities may appear adequate but be based on codes, such as the Reed Solomon code, which is great for hard drives but not really applicable to flash failure modes. The lack of idle time for background flash management makes this problematic for many / most SSD architectures.</p>
<p>Kerekes sums it up well: &#8220;Better user education about SSDs is a critical factor for the industry to sustain its growth. Design trade offs in products go far deeper than the choice of memory and interface. Being aware that there are other parameters which SSD vendors have implemented well, badly (or not at all) can be the difference between a satisfactory or disillusionary experience.&#8221;</p>
<p>What do you think?</p>
<p>Amyl Ahola</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.enterprisestoragesense.com/2009/03/05/ssd-jargon-and-the-need-for-standards/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Enterprise Flash Drives:  A definition</title>
		<link>http://blog.enterprisestoragesense.com/2008/07/14/enterprise-flash-drives-a-definition/</link>
		<comments>http://blog.enterprisestoragesense.com/2008/07/14/enterprise-flash-drives-a-definition/#comments</comments>
		<pubDate>Mon, 14 Jul 2008 20:07:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Enterprise Flash Drive]]></category>
		<category><![CDATA[Flash Technology]]></category>
		<category><![CDATA[Green IT]]></category>
		<category><![CDATA[HDD]]></category>
		<category><![CDATA[Hard Disk Drives]]></category>
		<category><![CDATA[I/O performance]]></category>
		<category><![CDATA[SSD]]></category>
		<category><![CDATA[Solid State Drives]]></category>
		<category><![CDATA[energy consumption]]></category>
		<category><![CDATA[enterprise storage]]></category>
		<category><![CDATA[storage reliability]]></category>
		<category><![CDATA[Flash]]></category>
		<category><![CDATA[hard drives]]></category>

		<guid isPermaLink="false">http://blog.enterprisestoragesense.com/2008/07/14/enterprise-flash-drives-a-definition/</guid>
		<description><![CDATA[I have written about a new class of SSDs referred to as Enterprise Flash Drives (EFDs) many times.  But what does it take to make a true “enterprise-class” SSD drive?  With so many different SSDs targeted for the enterprise it can be difficult to tell which SSDs really qualify as EFDs, and which do not. 
So, [...]]]></description>
			<content:encoded><![CDATA[<p>I have written about a new class of SSDs referred to as Enterprise Flash Drives (EFDs) many times.  But what does it take to make a true “enterprise-class” SSD drive?  With so many different SSDs targeted for the enterprise it can be difficult to tell which SSDs really qualify as EFDs, and which do not. </p>
<p>So, I think a description and definition is in order. </p>
<p>In the world of disk drives, enterprise-class products are distinguished from desktop and laptop products by their ability to provide superior performance and reliability.  This means that they are expected to perform flawlessly in mission critical environments.  This same requirement also holds true for enterprise SSD devices.  However, just like lower-end disk drives, SSDs designed for laptops and desktops simply can’t pass muster when expected to provide the performance and reliability required in a mission-critical enterprise environment.  There are a number of existing SSD products marketed for the enterprise, many of which are nothing more than re-packaged consumer grade (laptop) SSD technology.  In fact, many of the so-called “enterprise SSD” drives actually underperform HDDs in laptop applications…hardly what I would call enterprise class. </p>
<p>Therefore, a true EFD must provide high levels of performance and reliability for flawless operation in mission critical, I/O-intensive environments.  Given the growing power and space concerns of today’s large enterprise environments, reduced energy consumption is becoming an equally important criterion for any new class of primary storage devices.  An EFD’s superior performance, energy efficiency and improved reliability allow data centers to substantially grow capacity and performance in existing installations while reducing energy needs and TCO.</p>
<p>Given these requirements, an Enterprise Flash Drive should, at a minimum, provide the following:</p>
<ol>
<li><strong>Superior I/O Performance</strong> – Adequate I/O performance levels to prevent bottlenecks, even during peak activity periods (generally 3-5 times greater than typical activity periods), without requiring extra hardware (i.e., cache)  while providing ample scalability for growth.  An EFD should deliver at least 100,000 random IOPS and be able to sustain this rate for typical block sizes (4K bytes or more). </li>
<li><strong>Exceptional Reliability</strong> – EFDs need to deliver significantly lower failure rates than disk drives, given the inherent benefit of solid state technology (no moving parts).  Performance and reliability must be predictable and sustainable at 100 percent duty cycles (24/7/365) without cycle-stealing maintenance or “housekeeping” actions.  Lifetime should exceed five years without performance or capacity degradation.  Robust reliability monitoring and reporting capabilities are essential.</li>
<li><strong>Energy Efficiency</strong> – EFDs should meet new standards for green data center excellence of greater than 20,000 IOPS per Watt, with activity-based power management to limit energy consumption when the device is less than 100 percent utilized.</li>
<li><strong>Cost Efficiency</strong> – Transaction costs ($/IOPS) must be substantially reduced from that of an HDD (&lt;10%).  And, it goes without saying that an EFD must be form factor and interface compatible with HDDs (while providing similar storage capacities).</li>
</ol>
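<p>The numeric thresholds in the list above can be written down as a simple checklist. This is a sketch only: the <code>DriveSpec</code> fields, the candidate drive and the HDD reference cost are all hypothetical, and the qualitative criteria (no housekeeping pauses, monitoring, form factor compatibility) are not modeled.</p>

```python
from dataclasses import dataclass


@dataclass
class DriveSpec:
    """Hypothetical spec-sheet fields; names are illustrative only."""
    random_iops: float      # sustained random IOPS
    block_bytes: int        # block size the IOPS figure was measured at
    watts: float            # active power draw
    price_usd: float
    lifetime_years: float   # rated life at 100 percent duty cycle


def efd_checklist(drive, hdd_usd_per_iops):
    """Evaluate the numeric thresholds listed above, one result per criterion."""
    usd_per_iops = drive.price_usd / drive.random_iops
    return {
        "performance": drive.random_iops >= 100_000 and drive.block_bytes >= 4096,
        "lifetime": drive.lifetime_years > 5,
        "energy": drive.random_iops / drive.watts > 20_000,
        "cost": usd_per_iops < 0.10 * hdd_usd_per_iops,
    }


if __name__ == "__main__":
    # Made-up candidate and HDD reference, for illustration only.
    candidate = DriveSpec(120_000, 4096, 5.0, 3000.0, 6.0)
    hdd_usd_per_iops = 300.0 / 300   # a $300 HDD delivering 300 IOPS
    for criterion, passed in efd_checklist(candidate, hdd_usd_per_iops).items():
        print(f"{criterion}: {'pass' if passed else 'fail'}")
```

<p>One consequence worth noting: a drive that just meets the 100,000 IOPS floor must draw less than 5 watts to exceed 20,000 IOPS per watt.</p>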
<p>While these requirements are very demanding, I believe they only begin to define the needs and ability of solid state technology to transform future system and storage architectures.  In my opinion, the vast majority of today’s SSD products are already falling short of the true needs. </p>
<p>Interested to hear what you think…</p>
<p>Amyl Ahola</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.enterprisestoragesense.com/2008/07/14/enterprise-flash-drives-a-definition/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Storage reliability for the enterprise</title>
		<link>http://blog.enterprisestoragesense.com/2008/05/20/storage-reliability-for-the-enterprise/</link>
		<comments>http://blog.enterprisestoragesense.com/2008/05/20/storage-reliability-for-the-enterprise/#comments</comments>
		<pubDate>Tue, 20 May 2008 17:06:50 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[ECC]]></category>
		<category><![CDATA[Enterprise Flash Drive]]></category>
		<category><![CDATA[Flash Technology]]></category>
		<category><![CDATA[Hard Disk Drives]]></category>
		<category><![CDATA[I/O performance]]></category>
		<category><![CDATA[MTBF]]></category>
		<category><![CDATA[SSD]]></category>
		<category><![CDATA[storage reliability]]></category>
		<category><![CDATA[Enterprise Flash Drives]]></category>
		<category><![CDATA[Flash]]></category>
		<category><![CDATA[reliability]]></category>

		<guid isPermaLink="false">http://blog.enterprisestoragesense.com/2008/05/20/storage-reliability-for-the-enterprise/</guid>
		<description><![CDATA[I’ve written a lot about I/O performance on this blog, and with good reason.  When I discuss Pliant’s EFD device and enterprise IT system performance issues with partners and the press, one of the questions that almost always comes up is about performance.  But, I often point out that, just like when considering a sports [...]]]></description>
			<content:encoded><![CDATA[<p>I’ve written a lot about I/O performance on this blog, and with good reason.  When I discuss Pliant’s EFD device and enterprise IT system performance issues with partners and the press, one of the questions that almost always comes up is about performance.  But, I often point out that, just like when considering a sports car, performance is only part of the equation.  Reliability is of equal importance as well.</p>
<p>Enterprise storage applications are demanding, and it is essential that reliability specifications are met at a 100-percent duty cycle operation on a 24/7/365 basis.  Those in the industry know that true enterprise-class disk drives are required for this environment, and that disk drives designed for low cost and low duty cycle laptop/desktop applications literally fall apart when employed in an enterprise application.  Likewise, SSDs designed for laptop/desktop applications also do not even come close to meeting the need.  So, for Enterprise Flash Drives to be accepted in the enterprise they must meet or exceed enterprise class HDD reliability. This is not a trivial task. </p>
<p>The primary enterprise reliability specifications take the form of MTBF (or more meaningfully: annualized failure rate) and non-recoverable error rates (lost data).  Flash technology has three primary failure phenomena that have a significant impact on reliability:</p>
<ul>
<li>Write Endurance – the limit on how many times a cell can be written/erased before it becomes damaged</li>
<li>Write/Program Disturb –  writing to a given page in a Flash chip can alter bit(s) in a page that is not being written (does not damage the cell); this is sometimes referred to as “bit flip”</li>
<li>Read Disturb – similar to Write Disturb, reading a page in a Flash chip can alter bit(s) in a page not being read (does not damage the cell)</li>
</ul>
<p>A further complication is that these failure modes are not independent.  For example, the read disturb error rate is related to the number of writes or erases so that write endurance and read disturbs (and write disturbs) must be holistically considered.  It is obvious that they all contribute to non-recoverable errors, but perhaps not as obvious that they contribute to MTBF as well.  MTBF is a measurement of performance to specification, not just to some catastrophic event, as is typical with a disk drive.  This includes meeting performance and capacity specifications.</p>
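<p>For readers who want to move between MTBF and annualized failure rate (AFR), here is a minimal sketch. It assumes a constant failure rate (an exponential lifetime model), which is a simplification, and it only counts failures that the MTBF figure itself counts; it does not capture the broader "performance to specification" sense described above.</p>

```python
import math

HOURS_PER_YEAR = 8760   # 24/7/365 operation


def afr_from_mtbf(mtbf_hours):
    """Probability that a drive fails within one year of continuous use.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(drive_count, mtbf_hours):
    """Expected number of failed drives per year in a fleet."""
    return drive_count * afr_from_mtbf(mtbf_hours)


if __name__ == "__main__":
    mtbf = 1_000_000   # hypothetical rating, in hours
    print(f"AFR: {afr_from_mtbf(mtbf):.2%}")
    print(f"failures/year in 1,000 drives: {expected_failures_per_year(1000, mtbf):.1f}")
```

<p>A hypothetical 1,000,000-hour MTBF corresponds to an AFR of roughly 0.87%, or about nine failed drives per year in a 1,000-drive installation.</p>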
<p>A common approach used in typical SSDs to deal with write endurance is to incorporate a wear-leveling algorithm to distribute writes across blocks within the chip(s), together with error correction (ECC), so that any damaged cells can be corrected when read.  This same ECC can then be applied for all reads to detect and correct altered bits (‘bit flips’) independently of how they became defective, i.e., write endurance, read disturb, or write disturb.  If the number of defective bits exceeds the ECC threshold, the sector(s) being read would then have to be marked as defective (non-recoverable error) and made unavailable to the system.  Depending on the amount of spare Flash capacity, at some point the resulting system capacity may well drop below the specification.</p>
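<p>The threshold behavior described above can be sketched in a few lines. This is an illustration of the logic only, not any vendor's firmware: the function names are hypothetical, and real controllers track errors per codeword and retire whole blocks rather than single sectors.</p>

```python
def classify_read(bit_errors, ecc_correctable_bits):
    """Outcome of reading one sector with a fixed-strength ECC."""
    if bit_errors == 0:
        return "clean"
    if bit_errors <= ecc_correctable_bits:
        return "corrected"          # ECC repairs the altered bits
    return "non-recoverable"        # beyond the ECC threshold: data is lost


def usable_capacity_bytes(total_sectors, spare_sectors, retired_sectors, sector_bytes):
    """User capacity after retired sectors are absorbed by the spare pool.

    Capacity holds steady while spares last, then erodes one sector
    for every additional sector retired.
    """
    user_sectors = total_sectors - spare_sectors
    shortfall = max(0, retired_sectors - spare_sectors)
    return (user_sectors - shortfall) * sector_bytes


if __name__ == "__main__":
    for errors in (0, 8, 9):
        print(errors, "bit errors:", classify_read(errors, ecc_correctable_bits=8))
    for retired in (50, 100, 150):
        print(retired, "retired:", usable_capacity_bytes(1000, 100, retired, 512), "bytes")
```

<p>Once the retired count passes the spare pool, every further non-recoverable sector comes directly out of the capacity promised to the user.</p>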
<p>As an example, a well-known supplier of SSDs advertises an ECC that corrects up to 8 bytes in 1024 bytes, while another supplier advertises 6 bytes in 528 bytes.  At the same time, both talk about program erase/write cycles well in excess of 1 million.  However, tests show that both ECC levels would frequently result in non-recoverable errors after as few as 200,000 write/erase cycles.  These error rates result in SSD reliability falling far short of disk drive reliability in terms of non-recoverable error rate.  At the same time, overall capacity begins to erode and eventually falls below the device specification, resulting in an MTBF failure. </p>
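<p>To see why the gap between an advertised 1 million cycles and an effective 200,000 cycles matters at 100 percent duty cycle, here is a back-of-the-envelope endurance estimate. It assumes ideal wear leveling (every cell receives an equal share of writes); the drive capacity, write rate and write amplification values are hypothetical.</p>

```python
SECONDS_PER_DAY = 86_400


def endurance_years(capacity_gb, pe_cycles, write_gb_per_day, write_amplification=1.0):
    """Years until every cell reaches its program/erase cycle limit.

    Assumes ideal wear leveling, so total writable data is capacity x cycles.
    """
    total_writable_gb = capacity_gb * pe_cycles
    flash_gb_per_day = write_gb_per_day * write_amplification
    return total_writable_gb / flash_gb_per_day / 365.0


if __name__ == "__main__":
    capacity_gb = 100                               # hypothetical drive
    write_gb_per_day = 100 * SECONDS_PER_DAY / 1000  # 100 MB/s written 24/7
    for cycles in (1_000_000, 200_000):
        years = endurance_years(capacity_gb, cycles, write_gb_per_day)
        print(f"{cycles:,} cycles: {years:.1f} years")
    years = endurance_years(capacity_gb, 200_000, write_gb_per_day, write_amplification=2.0)
    print(f"200,000 cycles at 2x write amplification: {years:.1f} years")
```

<p>For this made-up drive, 1 million cycles gives about 32 years of continuous writing, 200,000 cycles gives about 6 years, and a write amplification of 2 cuts that to about 3 years.</p>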
<p>And, that’s not all.  There is also a significant performance impact resulting from the management of these high error rates (performance drops dramatically!).</p>
<p>The primary point is that enterprise-level reliability, whether it’s MTBF or non-recoverable error rate, cannot be addressed with just traditional ECC.  Other techniques must be employed in addition to ECC to manage errors.  Moreover, these techniques cannot be allowed to significantly impact performance (IOPs or bandwidth).</p>
<p>Sounds like a daunting task…or is it??  Stay tuned.</p>
<p>Amyl Ahola</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.enterprisestoragesense.com/2008/05/20/storage-reliability-for-the-enterprise/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
