No. If your encryption is not busted, then the entropy under /any/ model tends to the maximum (8 bits per byte). If a model can predict the stream with /any/ success, then your crypto is probably broken.
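You can check this yourself. Here's a rough Python sketch (mine, just for illustration) that estimates per-byte entropy under a naive order-0 model; os.urandom stands in for the output of an unbroken cipher, since good ciphertext should be indistinguishable from random bytes:

    import math
    import os
    from collections import Counter

    def entropy_bits_per_byte(data: bytes) -> float:
        # Shannon entropy under an order-0 model (bytes treated as independent).
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

    print(entropy_bits_per_byte(os.urandom(1 << 20)))     # ~7.999 bits/byte
    print(entropy_bits_per_byte(b"hello world " * 1000))  # well below 8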
So yes, it is random, and 100 X's could appear in a row. But if your model can effectively compress that, then the model has to be wasting space for all the other sequences.
Let's imagine that you're doing some basic run-length encoding (gif! Or jif if you're wrong :) )
So 100 x's turns into (100, x), and you've compressed that part of the stream. Unfortunately everything else is likely (1, a), (1, b), etc., so every other symbol/byte increases in size. If we look at the stream probabilities we get something like P(run of n X's) = P(X)^n -- very rough, I haven't been at uni for a long time :). You can do a bunch of reasonably simple math to verify it, but you eventually end up with an average symbol size of sum{x in X} -P(x) * log2(P(x)) bits (the Shannon entropy), which for a uniform distribution P(x) = 1/|X| reduces to log2(|X|).
Where X is your set of possible input symbols, and the output symbols are (n, x) pairs with x in X.
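To make the (count, symbol) idea concrete, here's a toy RLE encoder (a sketch of my own, with the count capped at 255 so it fits in one byte):

    def rle_encode(data: bytes) -> bytes:
        out = bytearray()
        i = 0
        while i < len(data):
            j = i
            # extend the current run, capping the count at 255
            while j < len(data) and data[j] == data[i] and j - i < 255:
                j += 1
            out += bytes((j - i, data[i]))  # emit the (count, symbol) pair
            i = j
        return bytes(out)

    print(len(rle_encode(b"x" * 100)))  # 2  -- the long run compresses nicely
    print(len(rle_encode(b"abcdef")))   # 12 -- every run of 1 doubles in size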
For a given stream that is not encrypted you may save a lot of space on the runs, but each output pair (n, x) carries at /least/ 1 extra bit of information for the count. Overall you can still save space, because long runs are common enough to pay for that overhead.
But let's look at the encrypted data. Each input symbol is independent, so every sequence of a given length is equally likely. For a random bit stream you will typically have a run length of 1 (P = 0.5), 2 (P = 0.25), 3 (P = 0.125), ...
So 50% of your runs have length 1, and each of those gets at least one extra bit of information /added/ with nothing saved in return. This means your output is by necessity bigger than the input.
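If you'd rather not take my word for the geometric drop-off, here's a quick simulation (again just a sketch):

    import random
    from collections import Counter

    bits = [random.getrandbits(1) for _ in range(1_000_000)]
    runs = Counter()
    length = 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            length += 1
        else:
            runs[length] += 1
            length = 1
    runs[length] += 1  # close out the final run

    total = sum(runs.values())
    for n in range(1, 6):
        print(n, round(runs[n] / total, 3))  # ~0.5, 0.25, 0.125, ...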
I used RLE as an example here, but the same holds for all compression schemes. It's a necessary property for compression to work: if a scheme shrinks even one input, there must be some other input it expands (or some such, again, long time since uni). I believe Wikipedia has an article on the pigeonhole principle that probably explains this better than I can.
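The counting argument behind that is small enough to just print out: there are 2^n bit strings of length n but only 2^n - 1 strings strictly shorter, so no lossless scheme can shrink them all.

    for n in range(1, 9):
        inputs = 2 ** n                          # bit strings of length n
        shorter = sum(2 ** k for k in range(n))  # strictly shorter strings, == 2**n - 1
        print(f"n={n}: {inputs} inputs vs {shorter} shorter outputs")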