Nice! There's a part two in which I rewrote the C. I got a 12x speedup :) https:...

sriku · on July 7, 2023

Wondering how res += (c=='s')-(c=='p') might do. I'm sure there is some C undefined behaviour relevant there. Curious but too lazy to check it myself!

Tempest1981 · on July 7, 2023

While `false` evaluates to 0, I'm not sure `true` always evaluates to 1 in C... maybe it's compiler dependent. Maybe add `? 1 : 0`

ladberg · on July 7, 2023

C doesn't even originally have true/false, I think you may be conflating the two concepts that "any nonzero int is truthy" and "boolean expressions evaluate to ints". The standard mandates that boolean expressions like equality always evaluate to 0/1.

loeg · on July 7, 2023

The `true` constant is always 1. C11 §7.18 (3):

> true which expands to the integer constant 1,

And equality yields a 1 or 0. C11 §6.5.9 (3):

> The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence. Each of the operators yields 1 if the specified relation is true and 0 if it is false.
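For anyone who wants to try it, here is a minimal sketch of the branchless tally suggested upthread. The function name `tally` and the NUL-terminated input are assumptions for illustration, not the article's actual code; the only thing it relies on is the 0/1 guarantee quoted above.

```c
#include <stddef.h>

/* Hypothetical helper: +1 for every 's', -1 for every 'p'.
   Relies on == yielding exactly 1 or 0 (C11 6.5.9 paragraph 3),
   so no "? 1 : 0" or "!!" is needed. */
int tally(const char *input)
{
    int res = 0;
    for (size_t i = 0; input[i] != '\0'; i++) {
        char c = input[i];
        res += (c == 's') - (c == 'p');
    }
    return res;
}
```

Calling `tally("ssp")` adds 1 twice and subtracts 1 once; any other character contributes 0 - 0.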

romnon · on July 7, 2023

I've seen people doing += !!(c=='s')-!!(c=='p') for that

sweetjuly · on July 7, 2023

I'm sure people do that (even though it's not necessary per the C standard), but generally the pattern is actually for converting things which are not already 0 or 1 into 0 or 1. For example, you might want to use it here:

    int num_empty_strings = !!(strlen(s1)) + !!(strlen(s2)) + !!(strlen(s3));

which is equivalent to:

    int num_empty_strings = (strlen(s1) != 0) + (strlen(s2) != 0) + (strlen(s3) != 0);

Which you use is really a matter of coding style.

stkdump · on July 7, 2023

If we are being cryptic already, why not

    int num_empty_strings = !!*s1 + !!*s2 + !!*s3;

gjm11 · on July 7, 2023

That isn't only more cryptic, it's also potentially a lot more efficient -- strlen takes time proportional to the length of the string, which of course you don't need to do if you only care whether or not the length is zero. You shouldn't use strlen for empty-string tests.
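A small sketch of the two empty-string tests being compared, assuming valid NUL-terminated strings. The helper names are made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: looks only at the first byte, so it is O(1). */
int is_empty_first_byte(const char *s)
{
    return *s == '\0';
}

/* Hypothetical helper: same answer, but as written it asks for the
   whole length, which is O(n) unless the compiler optimizes it away. */
int is_empty_strlen(const char *s)
{
    return strlen(s) == 0;
}
```

Both return 1 for `""` and 0 for anything else; the difference is only in how much work the unoptimized version does.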

LegionMammal978 · on July 7, 2023

In practice, GCC and Clang don't seem to have any issues inlining the necessary part of strlen at -O1 or higher (https://godbolt.org/z/rM198aYea). But MSVC inlines the empty-string case, while still calling out for nonempty strings, probably since it doesn't realize that the returned length will be nonzero.

stkdump · on July 8, 2023

I guess since strlen uses an unsigned size, which has specified overflow behavior, the compiler not only has to prove the initial iteration, but also all the ULLONG_MAX+1 multiples, which of course refer to the same memory address. But maybe it's harder for the optimizer to see.

areyousure · on July 7, 2023

Note that your code computes the number of nonempty strings.
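To spell out the difference, here is a sketch with hypothetical function names: `!!` maps a nonzero first byte to 1, so it counts nonempty strings, while a single `!` maps a zero first byte to 1 and counts the empty ones.

```c
/* Hypothetical helpers; each argument is assumed to be a valid
   NUL-terminated string. */

/* !!*s is 1 when the first byte is nonzero, i.e. the string is nonempty. */
int count_nonempty(const char *s1, const char *s2, const char *s3)
{
    return !!*s1 + !!*s2 + !!*s3;
}

/* !*s is 1 when the first byte is the terminator, i.e. the string is empty. */
int count_empty(const char *s1, const char *s2, const char *s3)
{
    return !*s1 + !*s2 + !*s3;
}
```

For three strings the two results always sum to 3.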

simonkagedal · on July 7, 2023

That is entirely unnecessary. An == expression will always evaluate to 1 or 0. The !!x trick can be useful in some other situations, though.

Here’s a thing you could do (but I don’t know why):

    += !(c-'s') - !(c-'p')

rajnathani · on July 8, 2023

Great post!

One thought: If the code is rewritten using bit arithmetic, then potentially the result could be even faster as there need not be a pointer look-up.

A bit arithmetic solution would have a mask created for the characters ‘p’ and ‘s’, and then the result could be AND-ed, and then with more bit arithmetic this all 1s value can be translated to a 1 if and only if all the bits are 1. Following which, there would be no conditional check, simply an add and a subtract operation, where the value to be added will only be 1 if the mask for ‘p’ matches and the value to be subtracted will only be 1 if the mask for ‘s’ matches. I’m not fully sure if this would necessarily be faster than the pointer look-up solution, but it would be interesting to try this version of the code and see how fast it performs.

Update: The bit arithmetic could also be done with an XOR on the mask, and following which the ‘popcnt’ x86 instruction could be used to figure out if all are 0 bits.
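One way the XOR idea might look, as an unbenchmarked sketch rather than a claim about what is fastest. To stay in portable C it swaps `popcnt` for a subtract-and-shift zero test, and the sign convention follows the `(c=='s')-(c=='p')` expression from earlier in the thread. The names `byte_eq` and `tally_bits` are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if c equals target, else 0, without a comparison operator.
   x is 0 only when the bytes match; x - 1 then wraps to 0xFFFFFFFF,
   whose top bit is set. For x in 1..255, x - 1 is at most 254 and the
   top bit stays clear. */
static uint32_t byte_eq(unsigned char c, unsigned char target)
{
    uint32_t x = (uint32_t)(c ^ target);
    return (x - 1u) >> 31;
}

/* +1 for every 's', -1 for every 'p', assuming NUL-terminated input. */
int tally_bits(const char *input)
{
    int res = 0;
    for (; *input != '\0'; input++) {
        unsigned char c = (unsigned char)*input;
        res += (int)byte_eq(c, 's') - (int)byte_eq(c, 'p');
    }
    return res;
}
```

A compiler may well turn a plain `c == 's'` into something similar on its own, so whether this helps at all is something only a measurement would show.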

JohnMakin · on July 7, 2023

Thank you for your post and reply but I fear with a post + title like this you may just be chumming the waters.

jdsalaro · on July 7, 2023

What do you mean by "chumming the waters" in this context?

bombcar · on July 7, 2023

People who just skim the headline and article will come away convinced that dropping to assembly is the “way to go fast” even if they never actually do it.

sh34r · on July 7, 2023

Anyone with a passing understanding of Assembly or compilers would find that idea laughable. As for the others, it turns out not knowing what you don’t know can be very expensive.

mayli · on July 7, 2023

"Clickbait"