Posts Tagged ‘SSD’

“Predictable performance” for changing business dynamics

Wednesday, November 5th, 2008

In a previous blog, I suggested that performance, reliability, IOPS per watt, and IOPS per $ are key storage metrics for enterprises. However, satisfying demanding enterprise needs goes far beyond the attainment of just these metrics. I/O-intensive enterprise IT applications require IOPS and bandwidth levels to be predictable and sustainable across a variety of workload requirements.

Predictable performance has traditionally been a challenge for SSDs in enterprise applications because workloads are random and indeterminate. Predictability therefore requires consistent performance regardless of whether the drive is reading or writing data, as enterprise applications typically vary the read-to-write ratio between 60/40 and 90/10. Maintaining predictable performance while the workload changes is another example of how an Enterprise Flash Drive (EFD) offers differentiation from traditional SSDs.

A performance comparison (IOmeter-based) between a well-publicized ‘enterprise’ SSD and the new Pliant EFD illustrates this difference. From the chart, you can see how the ‘enterprise’ SSD(I) performance drops by over 80% as the read/write ratio changes. The Pliant EFD maintains its performance across the range from 100% reads to a 50/50 read/write ratio. This is because the Pliant EFD can read from and write to the drive simultaneously and therefore offers substantially better and more predictable performance for these demanding applications. Traditional SSDs and HDDs can only perform one read or write at a time.
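To see why a drive that services one operation at a time loses throughput as writes enter the mix, here is a minimal sketch. It assumes a simple serial model in which each operation takes a fixed time; the function name and the IOPS figures are hypothetical illustrations, not measurements from the chart or from any real product.

```python
def mixed_iops(read_fraction, read_iops, write_iops):
    """Effective IOPS for a device that services one operation at a time.

    Each read takes 1/read_iops seconds and each write 1/write_iops seconds,
    so the average time per operation is the weighted sum of the two.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    avg_seconds_per_op = read_fraction / read_iops + write_fraction / write_iops
    return 1.0 / avg_seconds_per_op


# Hypothetical drive with fast reads and much slower writes (illustrative only).
READ_IOPS = 20_000
WRITE_IOPS = 1_000

for read_pct in (100, 90, 60, 50):
    iops = mixed_iops(read_pct / 100, READ_IOPS, WRITE_IOPS)
    print(f"{read_pct}/{100 - read_pct} read/write: {iops:,.0f} IOPS")
```

Under this model, a drive whose read and write rates are equal stays flat across every mix, which is the behavior described here as predictable.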

The bottom line: EFDs enable enterprises to achieve higher I/O performance, maintain performance predictability with changing workloads, offer higher levels of service quality, and dynamically address changing business requirements without adding hardware.

I’m curious to hear what you think, so please feel free to comment.

Amyl

 

EFD on Wikipedia

Monday, August 18th, 2008

If you’ve been following my blog you’re undoubtedly aware of my views on the advantages of Enterprise Flash Drives (EFDs) over “traditional” SSD technology.

However, as with any new technology, it takes some time for the concept to catch on and for the industry to understand how the technology works and how it can solve real IT and business problems. So, under the heading of “Industry Education,” I’m very excited that there is now a Wikipedia (www.wikipedia.org) definition of EFD that clearly outlines EFD benefits, characteristics and applications.

Take a look:  http://en.wikipedia.org/wiki/Enterprise_Flash_Drive

The great thing about having a universal EFD definition is that it will allow any IT or storage professional to easily access an up-to-date, detailed explanation of the technology. Also, because Wikipedia is publicly editable, the EFD definition will evolve as the market grows and the technology advances – making it always relevant to the challenges and issues IT managers will face now and in the future.

Amyl Ahola

Enterprise Flash Drives: A definition

Monday, July 14th, 2008

I have written about a new class of SSDs referred to as Enterprise Flash Drives (EFDs) many times. But what does it take to make a true “enterprise-class” SSD? With so many different SSDs targeted at the enterprise, it can be difficult to tell which SSDs really qualify as EFDs, and which do not.

So, I think a description and definition are in order.

In the world of disk drives, enterprise-class products are distinguished from desktop and laptop products by their ability to provide superior performance and reliability.  This means that they are expected to perform flawlessly in mission critical environments.  This same requirement also holds true for enterprise SSD devices.  However, just like lower-end disk drives, SSDs designed for laptops and desktops simply can’t pass muster when expected to provide the performance and reliability required in a mission-critical enterprise environment.  There are a number of existing SSD products marketed for the enterprise, many of which are nothing more than re-packaged consumer grade (laptop) SSD technology.  In fact, many of the so-called “enterprise SSD” drives actually underperform HDDs in laptop applications…hardly what I would call enterprise class. 

Therefore, a true EFD must provide high levels of performance and reliability for flawless operation in mission critical, I/O-intensive environments.  Given the growing power and space concerns of today’s large enterprise environments, reduced energy consumption is becoming an equally important criterion for any new class of primary storage devices.  An EFD’s superior performance, energy efficiency and improved reliability allow data centers to substantially grow capacity and performance in existing installations while reducing energy needs and TCO.

Given these requirements, an Enterprise Flash Drive should, at a minimum, provide the following:

  1. Superior I/O Performance – Adequate I/O performance levels to prevent bottlenecks, even during peak activity periods (generally 3-5 times greater than typical activity periods), without requiring extra hardware (e.g., cache) while providing ample scalability for growth. At a minimum, an EFD should deliver at least 100,000 random IOPS and be able to sustain this rate for typical block sizes (4 KB or more).
  2. Exceptional Reliability – EFDs need to deliver significantly lower failure rates than disk drives, given the inherent benefit of solid state technology (no moving parts).  Performance and reliability must be predictable and sustainable at 100 percent duty cycles (24/7/365) without cycle-stealing maintenance or “housekeeping” actions.  Lifetime should exceed five years without performance or capacity degradation.  Robust reliability monitoring and reporting capabilities are essential.
  3. Energy Efficiency – EFDs should meet new standards for green data center excellence of greater than 20,000 IOPS per Watt, with activity-based power management to limit energy consumption when the device is less than 100 percent utilized.
  4. Cost Efficiency – Transaction costs ($/IOPS) must be substantially lower than those of an HDD (less than 10% of the HDD figure). And, it goes without saying that an EFD must be form factor and interface compatible with HDDs (while providing similar storage capacities).
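The numeric thresholds in the list above can be collected into a quick screening check. This is only a sketch: the `DriveSpec` fields, the function name, and the example figures are hypothetical, and the qualitative requirements (monitoring, form factor, power management) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class DriveSpec:
    random_iops: float     # sustained random IOPS at 4 KB or larger blocks
    watts: float           # power draw while delivering random_iops
    price_usd: float       # purchase price of the drive
    lifetime_years: float  # rated life without performance or capacity loss


def efd_shortfalls(drive, hdd_usd_per_iops):
    """Return the numeric criteria from the list that the drive fails to meet."""
    shortfalls = []
    if drive.random_iops < 100_000:
        shortfalls.append("random IOPS below 100,000")
    if drive.random_iops / drive.watts <= 20_000:
        shortfalls.append("IOPS per watt not above 20,000")
    if drive.price_usd / drive.random_iops >= 0.10 * hdd_usd_per_iops:
        shortfalls.append("$/IOPS not below 10% of the HDD figure")
    if drive.lifetime_years <= 5:
        shortfalls.append("lifetime does not exceed five years")
    return shortfalls


# Hypothetical candidates; none of these numbers describe a real product.
candidate = DriveSpec(random_iops=120_000, watts=5.0, price_usd=3_000.0, lifetime_years=6)
laptop_ssd = DriveSpec(random_iops=10_000, watts=2.0, price_usd=500.0, lifetime_years=3)

print(efd_shortfalls(candidate, hdd_usd_per_iops=1.0))
print(efd_shortfalls(laptop_ssd, hdd_usd_per_iops=1.0))
```

An empty list means the drive clears every numeric bar; anything else names the criteria it misses.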

While these requirements are very demanding, I believe they only begin to define the needs and ability of solid state technology to transform future system and storage architectures.  In my opinion, the vast majority of today’s SSD products are already falling short of the true needs. 

Interested to hear what you think…

Amyl Ahola

An SSD Revolution?

Monday, June 9th, 2008

I just read a very interesting article on Computerworld.com written by Jim Damoulakis: http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9092918

Jim suggests that we’re on the verge of an “SSD Revolution” because of the significant performance advantage of Flash over disk drives. He makes several compelling and informative points regarding the pros and cons of SSD technology; however, I really wish he hadn’t begun the article by discussing SSDs in laptops. I believe the enterprise storage industry is missing an important distinction by making comparisons to consumer SSD products when discussing the adoption of the technology in the enterprise. There are fundamental differences between commodity consumer-grade and enterprise-class products. Comparing consumer and enterprise flash-based products is a little like comparing me to Tiger Woods; we both play golf but, believe me, the similarities end there.

The intense I/O and reliability needs of 24/7 enterprise data centers make them the most demanding applications, and they require more than the products found in Jim’s laptop. The Enterprise Flash Drives (EFDs) that meet these requirements will not be the same as the drives used in notebooks, cameras or MP3 players.

-Amyl Ahola.

Storage reliability for the enterprise

Tuesday, May 20th, 2008

I’ve written a lot about I/O performance on this blog, and with good reason. When I discuss Pliant’s EFD device and enterprise IT system performance issues with partners and the press, one of the questions that almost always comes up is about performance. But I often point out that, just as with a sports car, performance is only part of the equation. Reliability is equally important.

Enterprise storage applications are demanding, and it is essential that reliability specifications are met at a 100 percent duty cycle on a 24/7/365 basis. Those in the industry know that true enterprise-class disk drives are required for this environment, and that disk drives designed for low-cost, low-duty-cycle laptop/desktop applications literally fall apart when employed in an enterprise application. Likewise, SSDs designed for laptop/desktop applications do not come close to meeting the need. So, for Enterprise Flash Drives to be accepted in the enterprise, they must meet or exceed enterprise-class HDD reliability. This is not a trivial task.

The primary enterprise reliability specifications take the form of MTBF (or, more meaningfully, annualized failure rate) and non-recoverable error rates (lost data). Flash technology has three primary failure phenomena that have a significant impact on reliability:

  • Write Endurance – the limit on how many times a cell can be written/erased before it becomes damaged
  • Write/Program Disturb –  writing to a given page in a Flash chip can alter bit(s) in a page that is not being written (does not damage the cell); this is sometimes referred to as “bit flip”
  • Read Disturb – similar to Write Disturb, reading a page in a Flash chip can alter bit(s) in a page not being read (does not damage the cell)

A further complication is that these failure modes are not independent.  For example, the read disturb error rate is related to the number of writes or erases so that write endurance and read disturbs (and write disturbs) must be holistically considered.  It is obvious that they all contribute to non-recoverable errors, but perhaps not as obvious that they contribute to MTBF as well.  MTBF is a measurement of performance to specification, not just to some catastrophic event, as is typical with a disk drive.  This includes meeting performance and capacity specifications.
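For readers who want to move between MTBF and annualized failure rate, the conversion can be sketched as follows. It assumes a constant failure rate (an exponential lifetime model) and continuous operation; the function name and the sample MTBF value are illustrative, not figures for any drive discussed here.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours x 365 days, i.e. a 100 percent duty cycle


def annualized_failure_rate(mtbf_hours):
    """Probability that a device fails within one year of continuous operation,
    assuming a constant failure rate (exponential lifetime model)."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# Hypothetical MTBF of one million hours, chosen only for illustration.
afr = annualized_failure_rate(1_000_000)
print(f"AFR: {afr:.2%}")
```

For large MTBF values the result is close to the simpler ratio of 8,760 hours to the MTBF, which is why the two are often quoted interchangeably.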

A common approach used in typical SSDs to deal with write endurance is to incorporate a wear-leveling algorithm to distribute writes across blocks within the chip(s), together with error correction (ECC), so that any damaged cells can be corrected when read.  This same ECC can then be applied for all reads to detect and correct altered bits (‘bit flips’) independently of how they became defective, i.e., write endurance, read disturb, or write disturb.  If the number of defective bits exceeds the ECC threshold, the sector(s) being read would then have to be marked as defective (non-recoverable error) and made unavailable to the system.  Depending on the amount of spare Flash capacity, at some point the resulting system capacity may well drop below the specification.
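The wear-leveling idea described above can be illustrated with a toy allocator. This is a deliberately simplified sketch, not any vendor's algorithm: the class and method names are hypothetical, and real controllers also handle static data, bad blocks, and garbage collection.

```python
class WearLeveler:
    """Toy wear-leveling allocator: always erase and reuse the least-worn free block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.free_blocks = set(range(num_blocks))

    def allocate(self):
        """Pick the free block with the fewest erases, erase it, and hand it out."""
        if not self.free_blocks:
            raise RuntimeError("no free blocks")
        # Ties are broken by block index so the choice is deterministic.
        block = min(self.free_blocks, key=lambda b: (self.erase_counts[b], b))
        self.free_blocks.remove(block)
        self.erase_counts[block] += 1
        return block

    def release(self, block):
        """Return a block to the free pool once its data is no longer needed."""
        self.free_blocks.add(block)


leveler = WearLeveler(num_blocks=4)
for _ in range(1000):
    block = leveler.allocate()
    leveler.release(block)

print(leveler.erase_counts)  # [250, 250, 250, 250]
```

Without the least-worn selection, a workload that rewrites the same data repeatedly would concentrate all 1,000 erases on a single block.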

As an example, a well-known supplier of SSDs advertises an ECC that corrects up to 8 bytes in 1024 bytes, while another supplier advertises 6 bytes in 528 bytes.  At the same time, both talk about program erase/write cycles well in excess of 1 million.  However, tests show that both ECC levels would frequently result in non-recoverable errors after as few as 200,000 write/erase cycles.  These error rates result in SSD reliability falling far short of disk drive reliability in terms of non-recoverable error rate.  At the same time, overall capacity begins to erode and eventually falls below the device specification, resulting in an MTBF failure. 
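To make the ECC threshold and the capacity erosion concrete, here is a small sketch. The two ECC strengths are the advertised figures quoted above; the function names and the sector counts are hypothetical, and the model ignores how errors are actually distributed within a codeword.

```python
def classify_read(bad_bytes, correctable_bytes):
    """Outcome of reading one ECC-protected chunk with bad_bytes corrupted bytes."""
    if bad_bytes == 0:
        return "clean"
    if bad_bytes <= correctable_bytes:
        return "corrected"
    return "unrecoverable"  # the sector is retired and its data is lost


def meets_capacity_spec(spare_sectors, retired_sectors):
    """The drive still delivers its rated capacity while spares cover retirements."""
    return retired_sectors <= spare_sectors


# Advertised ECC strengths quoted in the post: (correctable bytes, chunk size in bytes).
for correctable, chunk in ((8, 1024), (6, 528)):
    print(f"{correctable} of {chunk} bytes: {correctable / chunk:.2%} correctable")

print(classify_read(bad_bytes=9, correctable_bytes=8))
print(meets_capacity_spec(spare_sectors=1_000, retired_sectors=1_001))
```

Both schemes can repair only about one percent of the bytes in a chunk, so once wear pushes the error count past that margin, sectors start to be retired and the spare pool shrinks.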

And that’s not all. Managing these high error rates also has a significant performance impact (performance drops dramatically!).

The primary point is that enterprise-level reliability, whether measured as MTBF or non-recoverable error rate, cannot be addressed with traditional ECC alone. Other techniques must be employed alongside ECC to manage errors, and those techniques cannot be allowed to significantly impact performance (IOPS or bandwidth).

Sounds like a daunting task…or is it??  Stay tuned.

Amyl Ahola

Never send HDD to do the job fit for EFD…

Monday, April 14th, 2008

Who could ask for more than seeing a new storage industry product announcement to highlight the points you’ve been trying to make?
 
I found myself in that position, and was quite surprised (well not really surprised…more like incredulous) to see a recent announcement of what had been frequently referred to as the Seagate “brick” project (not related to MiniScribe), but minimally disguised within a Seagate-funded private company.  The product that was announced is another version of a sealed unit consisting of multiple hard drives “purpose-built to maximize performance and reliability.”  The announcement makes it clear that many new techniques must have been employed to achieve “self-healing,” and to enable the product to essentially repair itself in place “to the equivalent of a fresh, factory-manufactured drive.”  Wow!  I will leave it up to people smarter than me to respond to this.

What I’d like to discuss is the price-performance aspect of this announcement. The systems tested were fully mirrored, so comparisons are never quite “apples to apples.” However, one needs to keep in mind that the MTBF of the drives employed requires mirroring to reach any reasonable reliability level. While I could not find any real price or performance data on the company’s web site, the SPC benchmarks it references provided considerable data.
 
From a pricing standpoint, the 1.03TB configuration sells for more than $36 per gigabyte (after a 40% discount from $60/GB)…and, flash-based SSD at $30/GB is considered expensive?
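The pricing arithmetic is easy to check. The sketch below uses the figures quoted above and assumes 1 TB = 1,000 GB; the function name is illustrative, and the system total is only a floor, since the post says the real price is "more than" $36/GB.

```python
def discounted_price(list_price, discount):
    """Price after applying a fractional discount (0.40 means 40% off)."""
    return list_price * (1.0 - discount)


LIST_USD_PER_GB = 60.0   # list price quoted in the post
DISCOUNT = 0.40          # discount quoted in the post
CAPACITY_TB = 1.03       # configuration quoted in the post
GB_PER_TB = 1000         # assumption: decimal terabytes
FLASH_USD_PER_GB = 30.0  # flash SSD price quoted in the post

usd_per_gb = discounted_price(LIST_USD_PER_GB, DISCOUNT)
system_floor_usd = usd_per_gb * CAPACITY_TB * GB_PER_TB

print(f"Discounted price: ${usd_per_gb:.2f}/GB")
print(f"System price floor: ${system_floor_usd:,.0f}")
print(f"Ratio to flash at $30/GB: {usd_per_gb / FLASH_USD_PER_GB:.2f}x")
```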
 
This benchmark is also said to be record-breaking, with the lowest cost per SPC-1 IOPS. I’m not suggesting that $36/GB is unreasonable, only that it illustrates the true cost of hard drives in high-performance environments. A closer look at the benchmark is even more telling. This “record-breaking” performance correlates to a response time of nearly 30 milliseconds. In fact, response time increases dramatically starting at about 50% of the maximum IOPS, which is certainly troublesome for high transaction-rate systems.
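A rough intuition for why response time climbs well before a system reaches its maximum IOPS comes from basic queueing theory. The sketch below uses the textbook M/M/1 single-queue model as a stand-in; it is an assumption for illustration only, not the SPC-1 methodology or a model of the tested system, and the 5 ms service time is hypothetical.

```python
def mm1_response_time_ms(service_time_ms, utilization):
    """Mean response time of an M/M/1 queue: service time divided by idle fraction."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


SERVICE_TIME_MS = 5.0  # hypothetical per-request service time

for pct in (10, 50, 80, 90, 95):
    rt = mm1_response_time_ms(SERVICE_TIME_MS, pct / 100)
    print(f"{pct}% of max IOPS: {rt:.1f} ms")
```

In this model the response time has already doubled at 50% utilization and grows without bound as the load approaches the maximum.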

This project was started a few years ago, apparently to address the growing price, performance and reliability gap in enterprise applications that we have been talking about, and to hold off the encroachment of solid state storage devices. However, with today’s technology, well-designed Enterprise Flash Drives will be not only lower in cost per GB, but also less than one-fourth the cost per IOPS and more reliable. And did I mention power? EFDs will be well under 1/100th the watts per IOPS. I cannot help but be reminded of the Anderson Cooper segment on CNN: “What were they thinking!”

Amyl Ahola