问题描述:

Currently using the run length encoding for encoding bit-vectors, and the current run time is

2log(i), where is the size of the run. Is there another way of doing it to bring it down to log(i)?

Thanks.

The most efficient way of encoding a bit vector is to isolate any specific properties of the bit source. If it is totally random, there is no real noticeable gain (actually, a totally random stream of bit cannot be compressed in any way).

If you can find properties in your bit stream you could try to define a collection of vectors which will define the base of a Vector Space. In such case, the result will be very efficient.

We'll need a few more details on your bit stream.

(Edit)

Just a few more details to understand the previous statement: "a totally random stream of bits cannot be compressed in any way"

It is not possible to compress a totally random vector of bits if by "compress" we mean the "transformed/compressed stream" **plus** the "vector base definition" **plus** the decompression program. But in most cases the decompression program (and often the vector base too) is embedded in client software. Thus, only the "compressed stream" is needed.

A good explanation (and funny story) about that is Patrick Craig 5000$ compression challenge

More scientific the theory of information, especially entropy section

And, the final one, the full story.

But whatever the solution is, if you have an unknown number of unknown streams to compress you won't be ale to do anything. You have to find a pattern.