With 64 bits per clock component you can record a billion events per second (literally) for a very long time (as in billions of seconds) before needing to coordinate a clock reset. With 20,000 nodes and a MTU of 1250 bytes you'd only need 128 packets with a traditional vector clock -- contrast this to the (minimum) of 20,000 packets with the Bloom clock. That effect on the network is definitely significant.
Issues like dynamic node membership are interesting, but in practice many useful distributed algorithms that use vector clocks have to carefully control membership anyway. Membership changes are probably rarer than the events tracked by vector clocks. Having a special case for membership in most circumstances treats the network better than sending N messages per event.
EDIT: in other words, suppose you encode each of the N components of the vector clock with E bits. Suppose each packet can contain P bits. As long as (E/P) < 1, the number of messages sent in the system is lower with vector clocks.
Your argument about N messages per event is wrong though. You don't have to broadcast logical clocks to every other node necessarily, that's another issue. As long as every node in the system multicasts the logical clocks, the system-wide clocks will still "move forward" together.
You've only changed the bloom filter in K places corresponding to your K hash functions, so seems like you could send a list of the changes you made.
(Disclaimer: I haven't read the paper, so I might be way off here.)