Hacker News

I don’t think that’s the point being made by the parent comment. If what is being sent is some identifying information plus the extra information in the HTTP requests you send, then both users may be identifiable, but one user may still send more information than the other. The parent comment suggests that reducing this extra information should be a goal.


But the information has nothing to do with the size of the message; it is a measure of how unique the message is. If everyone on the internet submits the same 1 TB of data but you transmit a single byte that nobody else sends, then you are transmitting a lot of information while everyone else is transmitting almost none.

I don't see how your reasoning about sending more or fewer headers fits into this; but in any case the conclusion won't depend at all on the number or the length of the headers, only on how many people are sending the same headers as you. It is impossible to know how much information you are sending by looking only at the bytes you send. You need to know the worldwide distribution of headers to measure that.
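To make this concrete, here is a toy sketch of that idea: the self-information of a header value is -log2 of the fraction of users who send it. The header values and prevalence numbers below are invented for illustration; real figures would come from a worldwide survey of traffic.

```python
import math

# Hypothetical worldwide prevalence of a header value:
# the fraction of all users whose browsers send it.
header_prevalence = {
    "Accept-Language: en-US": 0.60,  # common value
    "Accept-Language: en-GB": 0.30,
    "Accept-Language: eo":    0.10,  # rare value
}

def self_information_bits(p):
    """Shannon self-information, in bits, of an event with probability p."""
    return -math.log2(p)

for value, p in header_prevalence.items():
    # Rarer values carry more identifying bits, regardless of byte length.
    print(f"{value}: {self_information_bits(p):.2f} bits")
```

Note that the byte length of the header never enters the calculation; only its prevalence does.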


It seems that the term "information" is being used in two different ways in this thread. The usual meaning of a bit of information is with regard to a probability distribution over messages which the user wants to send to a server. I don't think most people are used to thinking about bits in other contexts, so that's where the miscommunication is happening.

Your interpretation, which I think is correct in this context, seems to be with regard to the entropy of a probability distribution over internet users, and the mutual information between that and the distribution over messages. The actual length of the message is irrelevant to the math once you fix the joint probability distribution.

The argument others seem to be making is that the joint probability distribution is in fact not fixed, and that you can smear out the conditional probability over users given a message by shrinking the space of possible messages. In theory that seems possible, but I don't know enough to have any idea how well that would work in practice. If you shrank the message space to be small enough to be useful for this purpose, wouldn't that get in the way of usability?


> Your interpretation

This is not "my" interpretation, it's the standard definition of information content in computer science, as given [1] by Shannon in 1948 and used by everybody since.

[1] https://en.wikipedia.org/wiki/A_Mathematical_Theory_of_Commu...


Obviously I'm familiar with the definition. If you didn't get that, you should probably read my comment again. It seems like you've decided that people in this thread are arguing with you, but they're not. And anyway, it's a bit silly to get mad at people because they haven't studied information theory.

Define the random variables M for message content and U for the identity of the user. The interpretation of "bits of information" that most people will have is H(M). The correct interpretation in this context is H(U). You seem to be confused about why people are talking about H(M) instead of H(U). But I think people correctly intuit that those aren't independent, so the mutual information I(U;M) = H(U) - H(U|M) is positive. And obviously if you change P(M), you will also change the amount of mutual information. That's why talking about sending fewer headers makes sense.
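The point about changing P(M) can be sketched numerically. Below is a minimal, assumption-laden example with two users and two possible header sets: when each user sends distinct headers, observing the message identifies the user (I(U;M) = 1 bit); when the message space is shrunk so both users send identical headers, the mutual information drops to zero. The user names and header labels are invented for the example.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """I(U;M) in bits, from a joint distribution given as {(u, m): prob}."""
    pu = defaultdict(float)  # marginal P(U)
    pm = defaultdict(float)  # marginal P(M)
    for (u, m), p in joint.items():
        pu[u] += p
        pm[m] += p
    return sum(
        p * math.log2(p / (pu[u] * pm[m]))
        for (u, m), p in joint.items()
        if p > 0
    )

# Distinct header sets: the message fully identifies the user.
distinct = {("alice", "hdrA"): 0.5, ("bob", "hdrB"): 0.5}

# Shrunken message space: both users send the same headers.
uniform = {("alice", "hdr"): 0.5, ("bob", "hdr"): 0.5}

print(mutual_information(distinct))  # 1.0 bit
print(mutual_information(uniform))   # 0.0 bits
```

This is why sending fewer (or more common) headers matters: it reshapes P(M) so that the conditional distribution over users given a message is flatter, reducing I(U;M) even though H(U) is unchanged.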




