Suppose we are given a probability space S with elements 1, 2, ..., n, where element i is chosen with probability pi. Then the information entropy of the probability space is defined as:
E(S) = -p1*log2(p1) - ... - pn*log2(pn)
where log2 denotes the binary logarithm.
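The definition above translates directly into a few lines of Python; the function name `entropy` is our own choice, not from any library:

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    # Terms with p = 0 contribute nothing (the limit of p*log2(p) is 0),
    # so we skip them to avoid a math domain error.
    return -sum(p * log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))    # fair coin  → 1.0
print(entropy([0.25, 0.75]))  # biased coin, ≈ 0.8113
```

These two calls reproduce the coin-toss examples worked out below.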
So the information entropy of the probability space of unbiased coin throws is:
E = -0.5*log2(0.5) - 0.5*log2(0.5) = 0.5 + 0.5 = 1
When the coin is biased, with a 25% chance of heads and a 75% chance of tails, the information entropy of such a space is:
E = -0.25 * log2(0.25) - 0.75*log2(0.75) = 0.81127812445
which is less than 1. Thus, for example, if we had a large file in which about 25% of the bits are 0 and 75% are 1, with the bits independent of one another, a good compression tool should be able to compress it down to about 81.13% of its original size; the entropy is also a lower bound, so no lossless compressor can do better on such data.
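We can test this claim empirically with a general-purpose compressor. The sketch below packs 800,000 random bits (25% zeros, 75% ones) into bytes and compresses them with zlib; a real-world compressor only approaches the entropy bound, so the measured ratio comes out somewhat above 0.811 rather than exactly at it:

```python
import random
import zlib

random.seed(0)  # fixed seed so the experiment is repeatable

# Draw 800,000 independent bits: 0 with probability 0.25, 1 with probability 0.75.
bits = random.choices([0, 1], weights=[0.25, 0.75], k=800_000)

# Pack the bits into 100,000 bytes, 8 bits per byte.
data = bytes(
    sum(bit << j for j, bit in enumerate(bits[i:i + 8]))
    for i in range(0, len(bits), 8)
)

compressed = zlib.compress(data, 9)  # maximum compression level
ratio = len(compressed) / len(data)
print(f"compression ratio: {ratio:.3f}")  # somewhat above the 0.811 entropy bound
```

The gap between the measured ratio and 0.811 is the overhead of zlib's coding scheme; an entropy coder tuned to this exact distribution would get closer.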