Information Theory – Memory of All Permutations of a Kilobyte Block

information theorypointersstoragetheory

This is a hard enough idea to wrap my head around and I would greatly appreciate any edits/help to get it more readable for those in-the-know.

Is it theoretically possible to have a hard drive that has saved on it one copy of every possible binary permutation of one kilobyte and then have the rest of the system simply create pointers to these locations?

Would a system made such a way be any faster than simply having information stored directly?

To explain another way, say instead of having sentences:

"Hello, I'm Bob." and "That sandwich looks delicious."

…stored on the hard drive, we would have all permutations of the alphabet and other characters up to some number (say, 1000 characters or so), and then have store our sentences as something like:

[Pointer#21381723]

Best Answer

There are 28192 possible different 1K blocks. Storing them all would take 28202 bits of storage. Since the universe contains only about 1080 (or ~2266) particles, it's a safe bet that it isn't possible to store them all, and you don't have to wonder about whether it would save time or not.

But there is, in fact a more interesting way of answering this. You are suggesting creating an index into a huge pool of constants. But how would you know which index to dereference? Imagine for the sake of an argument that you want to store only 1-character blocks: a, b, c... Presumably your indices would be 0, 1, 2 etc., since that's the most efficient layout of storing those blocks.

Do you notice something about the arrangement? Your index is, in fact, a coded representation of the stored data! In other words, you don't have to dereference at all, you just have to transform the index into the data you want.

When you store all possible values of something in a table, this always happens: your index becomes merely an encoded version of the data itself, so storing the data becomes unnecessary in the first place. This why in the real world, indices are only useful for sparse data (e.g. all web pages you've visited, not all web pages that could exist, or even all that do exist).

Related Topic