Once upon a time, computers were slow and disks were small. Every kilobyte—nay, every byte!—was precious, and a thing to be hoarded and safeguarded against the awful ravages of time and bloat. In those days, a great many stupid things were done in the name of efficiency. Here is one.

One thing many programs have in common is a need to pull in a list of words. Don't ask why, ok? It's not important why. Hush. We can come back to that later on. Anyway, we need words, but we don't want to use our precious disk quota on those, because those aren't important. They are a means to an end. How can we save space while still storing every single freaking word in the English language?

A method for compact storage of ordered strings

Did that sound official? I hope so; I worked hard on that.

The idea is pretty simple. Each word consists of either A) a word, or B) the number of characters retained from the previous word followed by any new characters. For instance:

cello
5s

The challenge

For this challenge, you will need to write two programs.

  1. The converter.
  2. The unconverter.

The converter will accept any list of words (you can find one at /usr/share/dict/words or by searching for enable1.txt online) and spit out a compacted list of words. The unconverter will accept a compacted list and return the original list of words. Obviously, the compacter will work better if the words are sorted, but that's not strictly necessary.

You may want to write a third program to compare the two lists, but, to be honest, you can probably do that with just diff or something, so whatever.

Our next challenge will probably be some kind of assignment making use of this code, unless I come up with something amazing between now and then. Good luck!