<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>agigatech.com &#187; RAID</title>
	<atom:link href="http://agigatech.com/blog/tag/raid/feed/" rel="self" type="application/rss+xml" />
	<link>http://agigatech.com/blog</link>
	<description>AgigA Tech Inc Company Blog</description>
	<lastBuildDate>Wed, 30 Dec 2009 15:10:35 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Bulletproof Memory for RAID Servers, Part 3</title>
		<link>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-3/</link>
		<comments>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-3/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 05:42:54 +0000</pubDate>
		<dc:creator>AgigA Moderator</dc:creator>
				<category><![CDATA[backup]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ultra-capacitor]]></category>
		<category><![CDATA[ultracapacitor]]></category>
		<category><![CDATA[DRAM]]></category>
		<category><![CDATA[NAND Flash]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[Server]]></category>

		<guid isPermaLink="false">http://agigatech.com/blog/?p=103</guid>
		<description><![CDATA[What’s the right way to create memory for RAID servers that can withstand power outages? Bulletproof server memory. Because that’s what RAID server designers need; that’s what RAID server buyers want. They want a safe place to stash their bits where they no longer need to worry about them.
The question’s not as simple as it [...]]]></description>
			<content:encoded><![CDATA[<p>What’s the right way to create memory for RAID servers that can withstand power outages? Bulletproof server memory. Because that’s what RAID server designers need; that’s what RAID server buyers want. They want a safe place to stash their bits where they no longer need to worry about them.</p>
<p>The question’s not as simple as it seems. There’s a temporal quality to the question. What was right ten years ago isn’t right today and probably won’t be right ten years from now. Semiconductor technology is both fluid and extremely dynamic. One thing’s certain. You need to deal with today’s problems today. If you can address the same problem in the same way two or three years from now, that’s great! But you still need to address today’s problem today. You need to use components you can get today, not some time in the future. The future may include some surprises that change today’s answer, but today’s answer must be based on what you can do today.</p>
<p>Why the emphasis on today? Well, any RAID server memory used today must be based on some sort of memory technology (or technologies) that’s commercially viable now. Researchers are working on more than a dozen new memory technologies that may someday produce a more ideal memory than the semiconductor memories we have at our fingertips today. It’s not clear when that might happen. Tantalizing technology announcements are made almost weekly. But technology announcements are generally light years away from being commercially competitive products and that’s never truer than when you’re talking about digital memory.</p>
<p>Bulletproof RAID server memory must have some mechanism to ride through power outages without data loss.  The previous two entries in this series (<a href="../bulletproof-memory-for-raid-servers-part-1/">Part 1</a> and <a href="../bulletproof-memory-for-raid-servers-part-2/">Part 2</a>) discussed various approaches to creating bulletproof memory using battery-backed RAM. Seems like a great idea, but batteries aren’t particularly reliable in data-center environments where they live inside of heat-generating boxes squeezed into rack upon rack upon rack where they get no light and precious little maintenance. High-maintenance components like batteries just seem like a poor choice for creating memory that’s supposed to be bulletproof. Wouldn’t you agree?</p>
<p><strong>So what’s that leave?</strong></p>
<p>Well, you could use NAND Flash for memory rather than DRAM. NAND Flash devices have many excellent attributes. They do not require power to provide nonvolatile storage. They are currently the semiconductor industry’s cost-per-bit leader. NAND Flash chips are available in higher capacities than DRAMs, which translates into more bits per same-size board, fewer devices per board for the same capacity, or smaller boards, depending on application needs. These are all great attributes.</p>
<p>Unfortunately, NAND Flash devices have some unhappy qualities as well. You can only write to them relatively slowly—much more slowly than DRAM. They also exhibit wearout failure, which is getting to be a bigger and bigger problem as lithographies shrink. NAND Flash devices are block oriented so you can’t write just one word. These three failings are major and make NAND Flash memories unsuitable for RAID server memories.</p>
<p><em>Unsuitable, that is, when used alone.</em></p>
<p>However, volatile DRAM paired with non-volatile NAND Flash makes a pretty good team when it comes to building bulletproof RAID server memory. When the power’s good, use the DRAM like&#8230;well&#8230;DRAM. When there’s an indication that power’s about to fail, save the contents of the DRAM in NAND Flash devices.</p>
<p>Note that you can’t let the host CPU save the data when power’s already on the slippery downhill slope. You really don’t know how much time there is before the host CPU loses its mind. You need something more—bulletproof. You need a backup power supply that will sustain the memory subsystem during the data-backup operation and you need a local processor to oversee the transfer.</p>
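<p>The save-and-restore sequence described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name <code>HybridMemoryModule</code> and its three event handlers are hypothetical names invented for illustration, not the interface of any real module.</p>

```python
from dataclasses import dataclass


@dataclass
class HybridMemoryModule:
    """Toy model (hypothetical) of a DRAM + NAND Flash module with a local controller."""
    dram: bytearray
    flash: bytes = b""          # non-volatile image of the DRAM contents
    backup_valid: bool = False  # set only after a complete save

    def on_power_fail_warning(self) -> None:
        # Assumed to run on the module's local processor from backup power,
        # so it does not depend on the host CPU staying alive.
        self.flash = bytes(self.dram)
        self.backup_valid = True

    def on_power_loss(self) -> None:
        # DRAM is volatile: model the loss by zeroing its contents.
        self.dram = bytearray(len(self.dram))

    def on_power_restored(self) -> None:
        # Copy the saved image back so the host sees the data it last wrote.
        if self.backup_valid:
            self.dram[:] = self.flash
            self.backup_valid = False


module = HybridMemoryModule(dram=bytearray(b"cached write data"))
module.on_power_fail_warning()
module.on_power_loss()
module.on_power_restored()
print(module.dram.decode())  # cached write data
```

<p>The point of the sketch is the ordering: the copy to Flash finishes before the DRAM contents disappear, and nothing in that path relies on the host.</p>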
<p><strong>Batteries are still bad</strong></p>
<p>The previous two installments of this series have already dealt with the many reasons that batteries are not suitable as the backup power supply. Barring the sudden invention of the Mr. Fusion portable reactor last seen attached to the back of Doc Brown’s DeLorean time machine in the <em>Back to the Future</em> movies, there’s really only one good alternative for emergency backup power for RAID server memories: ultra-capacitors.</p>
<p>Ultra-capacitors are capacitors that have electrodes with greatly expanded area, which results in greatly expanded capacitance. The electrode area expansion originates in porous carbon electrodes. Ultra-capacitors have capacitances measured in Farads, much greater than those of conventional electrolytic capacitors. Although they require the proper care when designed into a backup power supply, ultra-capacitors can provide enough backup energy to support the emergency transfer of data from DRAM to NAND Flash memory in a bulletproof RAID server memory subsystem.</p>
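<p>To get a feel for “enough backup energy,” here is a rough sizing sketch. It uses the standard capacitor energy formula, E = ½·C·V², applied between a starting voltage and the minimum voltage the backup circuit can still use. Every number in the example (10 F, 5.0 V down to 2.5 V, a 5 W load, 80% conversion efficiency) is an illustrative assumption, not a figure from any product.</p>

```python
def usable_energy_joules(capacitance_f: float, v_start: float, v_min: float) -> float:
    """Energy released as the capacitor discharges from v_start down to v_min."""
    return 0.5 * capacitance_f * (v_start ** 2 - v_min ** 2)


def holdup_seconds(capacitance_f: float, v_start: float, v_min: float,
                   load_watts: float, efficiency: float = 1.0) -> float:
    """How long the capacitor can carry a constant-power load."""
    energy = usable_energy_joules(capacitance_f, v_start, v_min)
    return energy * efficiency / load_watts


# Illustrative values only (all assumed): 10 F, 5.0 V down to 2.5 V, 5 W, 80% efficient
t = holdup_seconds(10.0, 5.0, 2.5, load_watts=5.0, efficiency=0.8)
print(f"{t:.1f} s")  # 15.0 s
```

<p>Whether that is long enough depends on how much DRAM must be copied and how fast the Flash can absorb it, which is why the backup supply has to be sized together with the transfer.</p>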
<p>How practical is all this? Very practical. Take a look at the following graph, which plots projected memory costs in dollars per megabyte over the next few years. (This graph is based on iSuppli projections.)</p>
<p><img class="aligncenter size-full wp-image-104" title="Memory Costs" src="http://agigatech.com/blog/wp-content/uploads/2009/11/Memory-Costs.jpg" alt="Memory Costs" width="520" height="366" /></p>
<p>As you can see, DRAM and NAND Flash are the least expensive semiconductor memories, per megabyte, and a megabyte of NAND Flash costs about one tenth of what a megabyte of DRAM costs. All of the leading “future” memories, which may someday replace DRAM, cost more. Some cost much more and they will continue to cost more into the foreseeable future. These “future” memory technologies are not about to replace DRAM today or tomorrow. They cost too much.</p>
<p>Finally, note the dashed blue line. This line represents the per-megabyte cost of AGIGARAM, which fuses DRAM, NAND Flash, and ultra-capacitors to create the closest thing to a bulletproof RAID server memory that you can get today. Over time, the cost of a megabyte of AGIGARAM approaches the cost of the equivalent amounts of DRAM and NAND Flash added together. The cost of the memories will essentially dominate the other costs (controller, ultra-capacitor backup power source). Consequently, AGIGARAM, which is AgigA Tech’s bulletproof memory for RAID servers that’s available today, is not only the best technical approach to creating bulletproof memory, it’s the most cost-effective approach available today&#8230;and tomorrow.</p>
]]></content:encoded>
			<wfw:commentRss>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bulletproof Memory for RAID Servers, Part 2</title>
		<link>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-2/</link>
		<comments>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-2/#comments</comments>
		<pubDate>Fri, 13 Nov 2009 22:45:43 +0000</pubDate>
		<dc:creator>AgigA Moderator</dc:creator>
				<category><![CDATA[backup]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[Server]]></category>
		<category><![CDATA[Write Cache]]></category>

		<guid isPermaLink="false">http://agigatech.com/blog/?p=98</guid>
		<description><![CDATA[Just what is the real cost of the memory in a RAID server? Seems like a simple question, right? No matter what technology a RAID server design team uses to add non-volatile memory, there will be costs beyond the acquisition cost of the memory and those extra costs should be factored into the system design if the design is to be competitive.]]></description>
			<content:encoded><![CDATA[<p>Just what is the real cost of the memory in a RAID server? Seems like a simple question, right? For volatile memories such as DRAM and SRAM, the cost is pretty much the purchase cost of the memory DIMMs. Sure, DRAM and SRAM modules might occasionally fail and require replacement, but the associated failure rate is pretty low so the reliability tax on the failures is also relatively low. Not true for non-volatile memory. No matter what technology a RAID server design team uses to add non-volatile memory, there will be costs beyond the acquisition cost of the memory and those extra costs should be factored into the system design if the design is to be competitive.</p>
<p>As we discussed in <a href="../bulletproof-memory-for-raid-servers-part-1/" target="_blank">Part 1 of this blog entry series</a>, RAID servers must use non-volatile memory for their write caches to prevent data loss during power failures. There are many ways to achieve nonvolatility. One way is to back up the entire server with an uninterruptible power supply. That takes a lot of battery power or a diesel-driven generator (or a hydroelectric turbine, if there’s one handy). Another way is to use a much smaller battery to back up the RAM used as a write cache. Yet another is to use NAND Flash as a write cache. All of these design approaches have problems and no matter the approach, the server processor must be involved in safely preparing for the imminent loss of power. Let’s examine these last two design approaches more closely, assuming that diesel generators and water power are out of the question.</p>
<p>Backup batteries have short lives and require regular maintenance, which they often do not get. NAND Flash memory has relatively slow write times, so it makes a poor write cache when used directly. Worse, NAND Flash memory exhibits write-induced wearout failure. You really must minimize the number of times you write to Flash memory. For both of these reasons, using Flash memory like it’s RAM is clearly a misapplication of Flash memory technology.</p>
<p><strong>So what’s the real cost?</strong></p>
<p>Back to the original question posed in this blog entry: What’s the real cost of the memory in a RAID server? Let’s run a thought experiment and see where it takes us. Consider a battery-backed RAM. Besides the cost of the RAM, which is the same whether there’s battery backup or not, there’s the cost of the battery. What’s the cost of a battery pack? It’s on the order of $100 for the RAID server customer. However, if your customers are replacing these batteries annually as they should, then there’s roughly $500 worth of batteries to buy per server over the course of a four-year lifespan for the memory. (That’s $100 initially for the first battery and $100 per year for each year following.)</p>
<p>However, that’s not the only cost. Someone must go into the server room, take the server down, replace the battery, and then bring the server back up. For the sake of argument, let’s say it takes an hour for an IT tech to do all of this for one server. What’s the burdened cost of an hour of an IT tech’s time? Well, that number varies, but again it’s on the order of $100. And you need to do it four times over the course of the 4-year life of the server memory. That’s another $400. (We’re ignoring recycling costs here, but batteries should be recycled properly.)</p>
<p>So if battery maintenance occurs as it should, the cost of non-volatile server memory is roughly the cost of the memory plus $900 in maintenance costs. These costs greatly exceed the cost of the memory itself.</p>
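<p>The arithmetic behind that $900 figure is easy to reproduce. The sketch below uses the round numbers from the thought experiment ($100 per battery pack, $100 of labor per swap, a four-year life); the function name and parameter names are made up for illustration, so plug in your own figures.</p>

```python
def scheduled_battery_cost(years: int = 4,
                           battery_cost: int = 100,
                           labor_cost_per_swap: int = 100) -> tuple:
    """Return (battery spend, labor spend, total) for annual battery swaps."""
    # One initial pack plus one replacement pack per year of service
    batteries = battery_cost * (1 + years)
    # Labor is only charged for the replacements, one per year
    labor = labor_cost_per_swap * years
    return batteries, labor, batteries + labor


batteries, labor, total = scheduled_battery_cost()
print(batteries, labor, total)  # 500 400 900
```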
<p>But what if battery maintenance doesn’t take place as it should? What if the battery fails in service? What’s the cost then? Well, in this scenario, you need to make some big assumptions. First, you need to assume that the batteries are all properly monitored so that there’s an alert as soon as a battery fails. If not, then the RAID servers are always subject to catastrophic data loss because their write caches are unprotected from power failures. Actually, it’s not so easy to sense battery failure without putting a load on the battery, but let’s ignore this detail for now.</p>
<p>Next you need to assume that there’s a replacement battery handy, sitting ready to go on the shelf next to the server room, and that someone knows where this battery is stored. Otherwise the RAID server with the failed battery will need to be taken out of service and replaced with another server until a new battery can be found, flown in, or otherwise delivered from the warehouse, wherever that is. Battery spares are cheaper to keep on the shelf than spare RAID servers so it’s likely that it’ll be a spare battery on the shelf. Likely as not, the battery on the shelf won’t be fully charged, but let’s ignore that detail for now as well.</p>
<p>Finally, you need to assume that there’s always an IT tech on hand who knows how to replace a server backup battery and can act quickly when a battery fails.</p>
<p>These are all big assumptions and they are all most assuredly <span style="text-decoration: underline;">bad</span> assumptions, but they set a lower bound on the associated maintenance costs. An unattainable lower bound, most certainly, but a lower bound nevertheless.</p>
<p><strong>$300 for one failure, $500 for two</strong></p>
<p>If you make all of these assumptions, then the costs for server-memory nonvolatility using battery backup include the initial $100 battery cost, plus the cost of replacing the failed batteries over the four-year life of the server memory. In the highly unlikely event that there’s only one failure during that time, the one-time replacement cost is about $200 ($100 for the replacement battery plus $100 for the labor cost to replace it) for a total of $300 for the initial battery plus one replacement. If the battery fails twice during the four years, then the total cost is $500.</p>
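<p>The replace-on-failure numbers follow the same pattern: the initial pack, plus one pack and one hour of labor for each failure. A minimal sketch, again using the thought experiment’s $100 figures and a hypothetical function name:</p>

```python
def failure_driven_cost(failures: int,
                        battery_cost: int = 100,
                        labor_cost_per_swap: int = 100) -> int:
    """Initial battery plus a replacement pack and labor for each failure."""
    return battery_cost + failures * (battery_cost + labor_cost_per_swap)


for failures in (1, 2):
    print(failures, failure_driven_cost(failures))
# 1 300
# 2 500
```

<p>Note what the function leaves out: it counts only parts and labor, not the cost of any downtime or lost data.</p>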
<p>While this second scenario sets a lower bound on cost, it’s clearly built on unrealistic assumptions. There will most certainly be unplanned downtime with this scenario.</p>
<p>Batteries almost never fail at convenient times. They seem to have a sixth sense about these things. Batteries fail at night and when the IT team is otherwise occupied. So you also need to figure in the cost of lost business due to the unplanned server outage. Realistically, that’s clearly going to happen.</p>
<p><strong>Lost time counts too</strong></p>
<p>Now the dollar value of lost data is really tough to set. However, as discussed in the <a href="../bulletproof-memory-for-raid-servers-part-1/" target="_blank">previous blog entry</a>, an hour’s loss of server time could easily cost a large customer thousands or millions of dollars, especially if that server customer is Amazon, Google, or a fast-transaction securities trader that relies on response times that are microseconds faster than competing traders. For such customers, the cost of server memory is clearly irrelevant because uninterrupted server uptime is so very valuable to them. These customers know to the penny what server uptime is worth per minute, per second, and even per millisecond. That’s how valuable server uptime is to this class of customer.</p>
<p><em>These customers don’t want to know how much the memory in the server costs. They want to know how the server’s design will prevent unplanned downtime.</em></p>
<p>The server design team must therefore have bulletproof, nonvolatile memory as a goal. This memory should not require annual maintenance, so that the server’s design avoids both frequent planned downtime and unplanned downtime due to memory failure. The economics of this goal are simply undeniable.</p>
<p>If you’re thinking that this discussion is leading to a discussion of why AgigA Tech’s approach to non-volatile server memory is worth more money, you’re wrong. After taking maintenance costs into account, AgigA Tech’s AGIGARAM modules actually cost less. Taking the cost of lost data and server downtime into account, AGIGARAM modules cost a lot less. Something to be discussed in the next blog entry.</p>
]]></content:encoded>
			<wfw:commentRss>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bulletproof Memory for RAID Servers, Part 1</title>
		<link>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-1/</link>
		<comments>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-1/#comments</comments>
		<pubDate>Thu, 12 Nov 2009 23:26:20 +0000</pubDate>
		<dc:creator>AgigA Moderator</dc:creator>
				<category><![CDATA[backup]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ultra-capacitor]]></category>
		<category><![CDATA[ultracapacitor]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[Server]]></category>
		<category><![CDATA[Write Cache]]></category>

		<guid isPermaLink="false">http://agigatech.com/blog/?p=94</guid>
		<description><![CDATA[Envision a data center with row upon row of rack-mounted RAID servers. All of these servers have battery-backup units for their RAM caches but buried somewhere deep inside this maze of racks, there’s a battery years past its prime. Perhaps there are several such batteries. These batteries are supposed to be changed out annually, but [...]]]></description>
			<content:encoded><![CDATA[<p><em>Envision a data center with row upon row of rack-mounted RAID servers. All of these servers have battery-backup units for their RAM caches but buried somewhere deep inside this maze of racks, there’s a battery years past its prime. Perhaps there are several such batteries. These batteries are supposed to be changed out annually, but you know how things go. Sometimes, preventative maintenance just doesn’t happen on time. Or at all.</em></p>
<p><em>In fact, one of those batteries has failed. The RAM cache it protects is at risk when the next power outage occurs. When that happens, one or more of the data center’s customers will lose data. Critical data. After all, what data isn’t critical?</em></p>
<p><em>Worse, the failed battery is leaking. Acid is oozing out of the battery. It’s quite possible that the acid is leaking onto critical circuitry inside of the RAID enclosure. Drip. Drip. Drip. The acid starts to etch into the circuitry. The disaster is perhaps moments away&#8230;</em></p>
<p>Customers buy one thing from RAID vendors: a safe haven for their bits. The bulletproof aspect of a RAID array’s disk storage resides in the redundancy of the disk drives themselves. A RAID 5 array protects against data loss should one disk drive fail and a RAID 6 array protects against data loss should two drives fail. Both types employ disk striping with parity (double parity for RAID 6). Because data has value, and some data has tremendous value, the use of RAID systems based on hardware RAID controllers is skyrocketing. However, power loss can negate the efficacy of a RAID system and put the data at risk.</p>
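<p>For readers who haven’t seen parity at work, here is a minimal sketch of the single-parity idea behind RAID 5: the parity block is the XOR of the data blocks, so any one missing block can be rebuilt from the survivors. It is deliberately simplified. Real arrays rotate parity across the drives, and RAID 6’s second parity uses a different calculation that this sketch does not cover. The helper name <code>xor_blocks</code> is just for illustration.</p>

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per drive, plus a parity block on a fourth drive
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive and rebuilding it from the survivors plus parity
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # b'BBBB'
```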
<p>One critical point of failure in RAID systems with respect to power outages is the write cache. RAID systems employ write caches to speed disk transactions—to boost the IOPS (I/O operations per second) rating. Once a computer system squirts a chunk of data into a RAM-based cache, the RAID system can immediately acknowledge the transaction before actually writing the data to disk. So there’s a critical period of time when the data is at risk from a power failure, after the acknowledgement but before the data is on the disk. If power is lost while the data is in RAM cache, then it’s lost forever.</p>
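<p>That window of risk is easy to see in a toy model. The class below is a hypothetical sketch of a write-back cache, not any controller’s real firmware: a write is acknowledged as soon as it lands in the volatile cache, and only a later flush puts it on disk.</p>

```python
class WriteBackCache:
    """Toy write-back cache: acknowledge first, write to disk later."""

    def __init__(self):
        self.pending = []  # acknowledged but not yet on disk (volatile)
        self.disk = []     # safely written blocks

    def write(self, block):
        self.pending.append(block)
        return "ack"       # the host believes the data is safe from here on

    def flush(self):
        self.disk.extend(self.pending)
        self.pending.clear()

    def power_failure(self):
        # Without backup power, everything still pending is gone
        lost = list(self.pending)
        self.pending.clear()
        return lost


cache = WriteBackCache()
cache.write("block-1")
cache.flush()
cache.write("block-2")        # acknowledged, but never flushed
print(cache.power_failure())  # ['block-2']
print(cache.disk)             # ['block-1']
```

<p>The host received an acknowledgement for both blocks, yet only the first one survived.</p>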
<p>One way to avoid this problem entirely is to disable the RAID system’s RAM cache. This approach preserves the data but with a huge performance hit. No RAM cache, no performance.</p>
<p>Another way to avoid the problem is to protect the data in a write cache from power failures using a battery-backup unit (BBU). That way, the RAID controller can recognize an impending power failure, can halt transactions, and the BBU will maintain any data yet to be written to disk and thus ride through the power failure.</p>
<p>Sounds great in theory, but in practice there are many problems with BBUs:</p>
<ul>
<li>Batteries have short, finite lifetimes compared to other electronic components and heat further shortens their electrochemical lives. There’s heat aplenty inside most server enclosures. Consequently, battery health should be closely monitored but it’s often not monitored at all. In fact, some data-center operations teams are surprised to discover that there’s a high-maintenance battery inside of many RAID systems. Of course, by the time they realize that there’s a battery to be maintained, it’s often too late because the event that brought this fact to light was a data failure induced by power loss.</li>
</ul>
<ul>
<li>Batteries need to be replaced every one to two years. First, that’s not going to happen if no one knows there’s a battery to be replaced. Second, battery maintenance often falls pretty low on the priority list of tasks to be performed and the replacement may be dangerously deferred when it’s done at all. Third, there’s no standardization in BBUs so the correct battery pack may not be on hand. Worse, the required BBU may be discontinued and no longer available. If you can’t order a new one, then what? Fourth, battery packs cost money and so does the time it takes to install new ones.</li>
</ul>
<ul>
<li>When replacing the BBU, the RAID server must be taken offline, or at least the RAM cache needs to be taken offline, and it must stay offline until the BBU charges up. RAID performance suffers during the downtime. Consumer-level products such as PCs and PVRs (personal video recorders) may not benefit much from faster disk drives. Enterprise systems do. Enterprise computing clients know precisely what a second’s worth of delay costs in their business. Sometimes a microsecond’s delay costs big money. For example, Google and Amazon know to the penny what each additional second of response delay costs them in terms of lost customer purchases. High-frequency securities traders and arbitrage houses employ trading strategies that are highly dependent on ultra-low latency networks. In fact, they co-locate their trading servers with the trading floor to minimize communications latency with the computers at the market exchange. These traders profit only by feeding information on competing bids and offers to their trading algorithms microseconds faster than their competitors. Loss of write-cache performance in a RAID system could literally cost such traders millions of dollars per microsecond of delay.</li>
</ul>
<ul>
<li>Batteries are not environmentally friendly so it’s a bad idea to just toss them in the trash. Batteries should be properly recycled and proper recycling is expensive, beyond the cost of the replacement BBU. Even when recycled properly, batteries just aren’t that great for the environment.</li>
</ul>
<p>So what’s the right answer to the need for bulletproof RAID write cache? AgigA Tech believes that the answer can be found in a fusion of NAND Flash and ultra-capacitor technologies. Ultra-capacitors are essentially made of benign carbon and have many superior qualities compared to batteries. In particular, they charge faster (less downtime) and they have longer lives (when properly applied). NAND Flash can save a RAM cache’s contents indefinitely and without power. So AgigA Tech’s AGIGARAM modules can be used as RAID RAM-cache modules, providing all of the benefits of battery-backed write caches but without the many liabilities batteries incur.</p>
<p>What about the cost of such an approach? Stay tuned. We’ll address that in the next blog entry.</p>
]]></content:encoded>
			<wfw:commentRss>http://agigatech.com/blog/bulletproof-memory-for-raid-servers-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
