
Hey buddy, can you spare a Gigabyte?

I’m currently hard at work building a new monster home PC for a shiny copy of Windows Vista Ultimate 64-bit (no comments from the peanut gallery…please) that I picked up during a trip to see Microsoft last month. In the course of sourcing the components (because no self-respecting engineer buys a complete turnkey PC), I have been drooling over Western Digital Raptor X SATA drives, and my eyes drifted to the fine print at the bottom of the web page:

One gigabyte (GB) = one billion bytes. One terabyte (TB) = one trillion bytes. Total accessible capacity varies depending on operating environment.

This disclaimer came to mind when I stumbled across a great blog entry by Jeff Atwood, “Gigabyte: Decimal vs. Binary”, which delves into the prefixes that we will start to hear from the hard drive manufacturers and the “penalty” that we pay for letting the PR and marketing machines run rampant with decimal descriptions of binary quantities.

When you buy a “500 Gigabyte” hard drive, the vendor defines it using the decimal, powers-of-ten definition of the “Giga” prefix:

500 * 10^9 bytes = 500,000,000,000 bytes = 500 Gigabytes

But the operating system determines the size of the drive using the computer’s binary, powers-of-two definition of the “Giga” prefix:

465 * 2^30 bytes = 499,289,948,160 bytes = 465 Gigabytes
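For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion. The function name `reported_capacity` and the two constants are my own labels for illustration, and the sketch assumes the operating system simply divides the raw byte count by 2^30; real systems also lose some space to formatting and file system overhead, which is not modeled here.

```python
# Vendor (decimal) and operating-system (binary) definitions of "giga".
DECIMAL_GIGA = 10**9   # 1,000,000,000 bytes
BINARY_GIGA = 2**30    # 1,073,741,824 bytes


def reported_capacity(labeled_gb):
    """Convert a labeled capacity in decimal gigabytes to binary gigabytes.

    Hypothetical helper: assumes the OS divides raw bytes by 2**30.
    """
    total_bytes = labeled_gb * DECIMAL_GIGA
    return total_bytes / BINARY_GIGA


if __name__ == "__main__":
    # A "500 Gigabyte" drive as the operating system would count it.
    print(f"{reported_capacity(500):.2f} binary gigabytes")
```

Run as a script, this reports a little over 465 binary gigabytes for the “500 Gigabyte” drive, which is where the 465 figure comes from once the fraction is dropped.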

Seeing this, I began to think about how this never really bothered me much when my hard drives were 500MB, but in considering a 500GB or a 1TB drive, suddenly the “lost” space began to add up. When I then forced myself to consider the 440TB SATA-based parallel file system on my latest Government HPC box, the problem seemed to take on a life of its own.
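To put rough numbers on that feeling, the sketch below compares the labeled and binary sizes at each prefix level. The helper `binary_shortfall` and its `prefix_level` parameter are hypothetical names of my own, and the “lost” figure is only the apparent gap between the two counting schemes, not space that is physically missing.

```python
def binary_shortfall(labeled, prefix_level):
    """Return (reported, lost) for a capacity labeled in decimal units.

    prefix_level is 2 for mega, 3 for giga, 4 for tera (hypothetical
    convention for this sketch). 'reported' is the same byte count
    expressed in binary units; 'lost' is the apparent difference.
    """
    total_bytes = labeled * 1000**prefix_level
    reported = total_bytes / 1024**prefix_level
    return reported, labeled - reported


if __name__ == "__main__":
    # The sizes mentioned above: label, quantity, prefix level.
    drives = [
        ("500 MB", 500, 2),
        ("500 GB", 500, 3),
        ("1 TB", 1, 4),
        ("440 TB", 440, 4),
    ]
    for label, quantity, level in drives:
        reported, lost = binary_shortfall(quantity, level)
        percent = 100 * lost / quantity
        print(f"{label}: reported {reported:.2f}, apparent loss {lost:.2f} ({percent:.1f}%)")
```

The gap widens with each prefix: roughly 4.6% at the megabyte level, 6.9% at the gigabyte level, and 9.1% at the terabyte level. On a 440TB file system that works out to nearly 40 binary terabytes of apparent “loss”.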
