Unbelievable Progress in Compressing Data

I just got in the mail a small shiny object, about as long as a packet of cigarettes (remember those?) but much less wide or thick, that holds 2 Terabytes of data.

It weighs about 43 grams (or one and a half ounces).

(I am neither a Luddite nor an early adopter! I like my technology to be cheap!)

It’s an external hard drive, which I will use to transfer data from my old (10-year-old) laptop to a new one. It only cost me 40 bucks.

See the photo for scale:

my very first 2-TB external hard drive (Not SSD)

It holds 2 Terabytes of data.

It is not solid state, because (a) I’m not an early adopter and (b) I’m frugal. (Heck, I even build my own telescopes!)

I looked at the little device, and decided to compare its memory capacity to the biggest library I know of, the Library of Congress.

(By the way, when I was younger, I spent many hours in various sections of the LOC, researching all sorts of stuff. The halls and stacks of the LOC have a very old-fashioned atmosphere, totally different from this little gizmo.)

How big is the LOC? If you look it up, you will find that the estimates made by different people are not very close to each other. Obviously the degree of compression matters a lot and varies from work to work, as does whether you include all the videos, songs, and other recordings.

If you leave out all the digital material, some estimates (like here) found that the printed part of the LOC (books, newspapers, magazines, maps, menus, and so on), if scanned from the printed page into digital versions, would add up to somewhere between 8 and 200 Terabytes of data.

8 to 200 Terabytes.

And my cheap little gizmo holds 2 Terabytes.

In other words, anywhere between 4 and 100 of these cheap little metal-and-plastic boxes would hold ALL of the useful information in ALL of the printed material in the world’s largest library ever.
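That range is easy to check. Here is a minimal Python sketch of the arithmetic, assuming the 8 and 200 Terabyte estimates quoted above and taking the drive's 2 Terabytes at face value (no allowance for formatting overhead). The function name is made up for illustration.

```python
import math

DRIVE_TB = 2  # advertised capacity of one drive, from the post


def drives_needed(collection_tb, drive_tb=DRIVE_TB):
    """Whole number of drives needed to hold a collection of the given size."""
    return math.ceil(collection_tb / drive_tb)


low = drives_needed(8)     # low-end estimate of the printed LOC, in TB
high = drives_needed(200)  # high-end estimate, in TB
print(low, high)           # prints: 4 100
```

Rounding up matters only when the collection size is not an exact multiple of the drive size; a 3 TB collection, for instance, would need 2 drives.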

LOC says their printed collections fill over 500 linear miles of shelving. Or maybe ‘only’ 100 miles of shelving if you stack your shelves 5 units high.

(Yes, I’m leaving out the electronic material.)

For somewhere between $160 and $4,000 (that is, 4 to 100 of these drives at 40 bucks each), depending on whose estimate is correct, and assuming someone were to digitize all that material, you could theoretically hold the biggest print library in the world, one that holds a copy or two of every single copyrighted book published in the US and most of the world.
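The price tag follows from the same numbers. A rough sketch, assuming every drive costs the $40 I paid and ignoring the (much larger) cost of actually scanning everything; the names here are invented for the example.

```python
DRIVE_TB = 2          # capacity of one drive, in TB
DRIVE_PRICE_USD = 40  # what this one cost


def library_cost_usd(collection_tb):
    """Cost in dollars of enough 2 TB drives to hold the collection."""
    drives = -(-collection_tb // DRIVE_TB)  # ceiling division on integers
    return drives * DRIVE_PRICE_USD


print(library_cost_usd(8))    # prints: 160
print(library_cost_usd(200))  # prints: 4000
```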

That’s just incredible. Yes, we had microfilm when I was young, and it appears it may stay with us for the foreseeable future, but the compression factor for microfilm or microfiche is nothing like what we get now, electronically. For example, a single roll of microfilm might hold a month or so of a daily newspaper, and that roll of film occupies roughly the same volume as, and weighs close to the same as, one of these little external hard drives.

But my little drive can hold anywhere from 1% to 25% (one quarter) of the entire printed collection of the Library of Congress!

A shoe box could hold all of the printed data in the world!

And have room for lots of the film, video, recordings as well!

Amazing.

Now let’s see if it actually works!

========

ON THE OTHER HAND:

There is a lot of meta-information in each and every physical, printed object, and much of the time, the scanned copy of a printed map, painting or photograph is way less satisfactory than the original, and harder to use. Plus, there is no guarantee that an electromagnetic pulse won’t wipe out all of your data in a microsecond. Plus, we can’t guarantee that our smart electronic devices will always be able to read this data. Have you ever tried to get old data or an old program off a 5.25″ floppy, a large reel-to-reel tape, or an 8-track tape? Not easy!

Newspapers from the mid-1700s are often in very good, readable shape.
But where are all the photos you took on your very first cell phone?

So don’t scrap old important documents just because you have a digital copy. Back it all up! Your hand-written diary, or a paperback book, will probably survive much longer than your cell phone. And they don’t need any batteries.

=================================

The URI to TrackBack this entry is: https://gfbrandenburg.wordpress.com/2022/03/21/unbelievable-progress-in-compressing-data/trackback/

One Comment

  1. I wonder how much storage Project Gutenberg has used so far to digitize books.

    https://www.gutenberg.org/
