*The reason it's useful to explicitly "break the rules" like this is because it'...

sillysaurus · on April 24, 2013

>   uint8_t foo[4]; *(uint32_t*)foo = 0;
>
> Besides, even without strict aliasing, the above is not at all guaranteed to work since not all architectures support unaligned loads.

So, the interesting thing about this example is that it does work. It's in fact very, very difficult to find a platform where that example won't work (i.e. crashes the program). For example, any C library involving image manipulation is likely going to have code similar to what you've described, and those libraries work on almost every platform.

Standards are a good and useful thing. All I'm saying is that it's important to know which rules you can safely violate.

__david__ · on April 24, 2013

> It's in fact very, very difficult to find a platform where that example won't work

No, it isn't. Many ARM processors will raise a bus error on that code if the address of foo isn't 4-byte aligned, i.e. if ((uintptr_t)foo & 3) != 0. I believe PowerPC doesn't do unaligned word reads either...

It quite often has to do with the memory controller and not with the particular processor, though I believe x86 has to support unaligned reads. I've certainly worked first hand with ARMs that did not support it.

sillysaurus · on April 24, 2013

That's interesting. What causes the bus error?

Would

  uint8_t foo[4];  *(uint32_t*)(&foo[0]) = 0;

also result in a bus error? Why?

__david__ · on April 24, 2013

That's the same thing, so yes, if foo is unaligned then it will cause a bus error. It happens because the compiler generates a store-word assembly instruction (as opposed to store-byte), and if the address is not aligned to 4 bytes then the memory controller hardware will raise a bus error.

Notice I keep saying "if the address is unaligned". The insidious part is that it probably will work for a while since it's likely that your "foo" array will happen to be aligned. But add one uint8_t variable to your structure or stack frame or wherever "foo" is defined and things could shift and suddenly it starts causing bus errors. It can be a very annoying type of heisenbug.
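To make the "things could shift" part concrete, here's a rough sketch. The struct names and the is_aligned_u32 helper are made up for illustration, and it assumes a platform where uint32_t wants 4-byte alignment and where casting a pointer to uintptr_t gives a plain address (true on the usual targets, but not guaranteed by the standard):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structs, for illustration only. */
struct before {
    uint8_t foo[4];   /* offset 0: aligned whenever the struct is */
};

struct after {
    uint8_t flag;     /* one new byte-sized member in front... */
    uint8_t foo[4];   /* ...and foo now sits at offset 1 */
};

/* Returns nonzero if p sits on a 4-byte boundary.
   Assumes uintptr_t holds a plain numeric address. */
static int is_aligned_u32(const void *p)
{
    return ((uintptr_t)p % sizeof(uint32_t)) == 0;
}
```

Since uint8_t has no alignment requirement, the compiler adds no padding before foo in struct after, so offsetof(struct after, foo) is 1. Even if the struct itself starts on a 4-byte boundary, the cast-and-store now hits an odd address.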

And bus errors are actually a good thing. I believe I've used hardware (an ARM or an SH2, can't remember) where the memory controller just ignored the last 2 bits of the address during whole-word reads and writes (which works fine as long as you only access aligned words). So if you run your code on that hardware it doesn't give you an error, it just subtly "corrupts" your data. Yay!
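A toy model of that behavior, in case it helps. This is only a simulation over a byte array, not how any particular controller is implemented, and store_word_masked is a made-up name:

```c
#include <stdint.h>
#include <string.h>

/* Toy model (an assumption, not a real controller): a word store
   that silently drops the low 2 bits of the address. */
static void store_word_masked(uint8_t *mem, size_t addr, uint32_t value)
{
    size_t aligned = addr & ~(size_t)3;           /* low 2 bits ignored */
    memcpy(mem + aligned, &value, sizeof value);  /* write lands here instead */
}
```

Ask it to store a word at address 5 and it writes bytes 4 through 7 instead of 5 through 8: byte 4 gets clobbered and byte 8 never gets written, with no error anywhere.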

brigade · on April 24, 2013

> any C library involving image manipulation is likely going to have code similar to what you've described

...which actually is exactly how I found out first-hand that it doesn't always work. If you only ever test on x86 you'll never catch it. You might not even catch it on ARM if you're lucky.

Which is the point - that compilers can and do make use of almost all undefined behavior of C for optimizations, which one developer might not catch because their current compiler happened to work. Then a new version is released that can find and exploit more undefined behavior. And strict aliasing is one of those rules you can't safely violate.
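For what it's worth, the usual way to write that kind of store without tripping either the alignment or the aliasing rule is to go through memcpy. A minimal sketch, where store_u32 and load_u32 are made-up helper names rather than anything from a library:

```c
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit value into a byte buffer at any alignment.
   The bytes land in the host's native byte order. */
static void store_u32(uint8_t *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}

/* Reads a 32-bit value back out of a byte buffer at any alignment. */
static uint32_t load_u32(const uint8_t *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Something like store_u32(foo, 0) then replaces the cast. Compilers typically turn a fixed-size memcpy like this into a single store where the target allows it, so it usually costs nothing on x86.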