James' USENIX 2007 notes: Warehouse-scale Computers

Warehouse-scale Computers
Luiz André Barroso, Google Inc.

The computing systems that are powering many of today's large-scale Internet services look less like refrigerators and more like warehouses. Designing efficient warehouse-scale computers requires many of the traditional tools and methods developed by computer architects, and some new tricks as well. In this talk I'll describe some of the defining characteristics of these systems, with a focus on failure handling and power management.

If you just built fault-tolerant software, you will be in one of two states: very happy or very unhappy. This is because fault-tolerant software hides failures instead of correcting them. When you have enough faults so that they can no longer be hidden (i.e., your fault-tolerant software finally does fail), that failure tends to be massive and very painful.

It's better to know that your failure rate is 2% than to have a failure rate of 1% and not know it. You can provision for the former; you cannot provision for the latter.

Google found that temperature had no noticeable correlation with failure rates. As far as Google can tell, operating your drives at a higher temperature doesn't increase the rate of failure.

Drives with SMART scan errors are 10x more likely to fail. But 70% of drives survive for over 8 months after a scan error occurs. Over half of drive failures appear to be completely unpredictable.
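To see why a 10x relative risk can still make a weak predictor, here is a minimal back-of-the-envelope sketch. It reads the figures above as: 30% of drives with a scan error fail within the 8-month window, and drives without one fail at a tenth of that rate. The share of drives that ever report a scan error (flagged_share) is a made-up illustrative value, not a number from the talk.

```python
def scan_error_recall(flagged_share, p_fail_flagged=0.30, relative_risk=10.0):
    """Fraction of all failures that were preceded by a scan error.

    flagged_share is a hypothetical input: the share of the drive
    population that reports a scan error. The defaults encode one
    reading of the talk's figures (30% fail after a scan error, 10x risk).
    """
    p_fail_clean = p_fail_flagged / relative_risk
    failures_flagged = flagged_share * p_fail_flagged
    failures_clean = (1.0 - flagged_share) * p_fail_clean
    return failures_flagged / (failures_flagged + failures_clean)


if __name__ == "__main__":
    # Illustrative shares only; the talk did not give this number.
    for share in (0.01, 0.02, 0.05):
        recall = scan_error_recall(share)
        print(f"{share:.0%} of drives flagged -> {recall:.0%} of failures flagged")
```

Under these assumptions, a signal that fires on only a few percent of drives catches a minority of the failures, even though each flagged drive is far riskier than an unflagged one.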

We need to worry more about energy efficiency, because it's not a problem that's going to go away.

Moore's law still works, because it speaks to transistor count.

Check out the Climate Savers Computing Initiative.

Good fans draw only 2-3 watts or so; they get a bad rap. (But the fans need to have temperature-based speed control.)

The power provisioning problem: $10-22 per watt to build a datacenter; the 10-year energy costs are ~$8 per watt.
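As a rough illustration of those per-watt figures, the sketch below scales them to a facility of a given size. The 1 MW facility is a hypothetical example, and the function name is mine; only the per-watt costs come from the talk.

```python
BUILD_COST_PER_WATT = (10.0, 22.0)   # USD per watt, range quoted in the talk
ENERGY_COST_PER_WATT_10Y = 8.0       # USD per watt over 10 years, quoted in the talk


def ten_year_costs(provisioned_watts):
    """Return ((build_low, build_high), energy) in USD for a facility."""
    low, high = BUILD_COST_PER_WATT
    build = (low * provisioned_watts, high * provisioned_watts)
    energy = ENERGY_COST_PER_WATT_10Y * provisioned_watts
    return build, energy


if __name__ == "__main__":
    watts = 1_000_000  # hypothetical 1 MW facility
    (build_low, build_high), energy = ten_year_costs(watts)
    print(f"build:  ${build_low:,.0f} to ${build_high:,.0f}")
    print(f"energy: ${energy:,.0f} over 10 years")
```

On these figures, building the capacity costs at least as much as a decade of the energy that flows through it, which is why provisioned watts that sit unused are expensive.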

Even a very efficient server, when it is completely idle, consumes no less than 50% of the power it would draw if it were completely busy.

For energy, average utilization matters. In an ideal world, power consumption would be proportional to load, but computers today don't work that way; the range from doing nothing to doing everything is only a factor of two.
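A minimal sketch of what that factor of two means for efficiency. It assumes power rises linearly from idle to peak, which is a simplification of mine rather than something stated in the talk; the 50% idle floor is the figure quoted above.

```python
def power_fraction(utilization, idle_fraction=0.5):
    """Power draw as a fraction of peak, assuming a linear idle-to-peak model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_fraction + (1.0 - idle_fraction) * utilization


def relative_efficiency(utilization, idle_fraction=0.5):
    """Work done per unit of power, relative to running at full load."""
    return utilization / power_fraction(utilization, idle_fraction)


if __name__ == "__main__":
    for u in (0.1, 0.3, 0.5, 1.0):
        print(f"load {u:.0%}: power {power_fraction(u):.0%} of peak, "
              f"efficiency {relative_efficiency(u):.0%} of full-load efficiency")
```

With idle_fraction set to 0.0 the model becomes the ideal proportional machine, and relative_efficiency returns 1.0 at every nonzero load.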

A human is the equivalent of a 3-year-old PC in terms of power consumption. Our activity range is very wide, and our power efficiency is much better (40W to 1200W).

Power efficiency for other components: 50% (DRAM), 75% (disks), 85% (switches). Switches were actually the worst; Google found many switches that consumed 99% of the peak power when they had no load whatsoever.

We need more active low-power modes. CPUs are getting better at this, but most other components fail at this. (E.g., the mode is triggered only when there's zero activity, and there's a high wake-up penalty.) Load balancing makes this worse, because you wind up with a whole lot of barely-utilized servers that are still consuming almost as much power as they would under full load.
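The cost of spreading load thinly can be sketched with the same kind of arithmetic. The code below assumes a linear idle-to-peak power model with a 50% idle floor, and compares an evenly balanced fleet against a hypothetical packed layout in which unused servers draw nothing. The packed layout is only a reference point for the comparison, not something the talk proposed, and it ignores wake-up penalties entirely.

```python
import math


def fleet_power(n_servers, utilization, idle_fraction=0.5):
    """Total power in units of one server's peak (linear idle-to-peak model)."""
    per_server = idle_fraction + (1.0 - idle_fraction) * utilization
    return n_servers * per_server


def spread_vs_packed(n_servers, utilization, target_utilization=0.9):
    """Compare evenly spread load with the same load packed onto fewer servers.

    Returns (spread_power, packed_power, active_servers). Servers left
    without work in the packed case are assumed to draw 0 W, which is an
    idealization.
    """
    total_work = n_servers * utilization
    active = math.ceil(total_work / target_utilization)
    active = min(max(active, 1), n_servers)
    spread = fleet_power(n_servers, utilization)
    packed = fleet_power(active, total_work / active)
    return spread, packed, active


if __name__ == "__main__":
    spread, packed, active = spread_vs_packed(100, 0.2)
    print(f"spread over 100 servers: {spread:.1f} server-peaks of power")
    print(f"packed onto {active} servers: {packed:.1f} server-peaks of power")
```

The gap between the two numbers is the energy spent keeping lightly loaded machines awake, which is the gap that better active low-power modes would close without giving up load balancing.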

Summary: datacenters are complex computers; techniques that improve programming efficiency are needed to use them effectively. Deeper understanding of failure characteristics pays off. Several opportunities for improving power and energy efficiency exist.

http://labs.google.com/papers

Failure Trends in a Large Disk Drive Population. Pinheiro, Weber & Barroso, USENIX FAST 2007.

Power Provisioning for a Warehouse-sized Computer. Fan, Weber & Barroso, ACM ISCA 2007.

http://www.climatesaverscomputing.org

Q&A session

Did you gather any data on correlations between power swings and reliability?

We weren't really looking for this, but one would think that there would be a correlation.

