Well, I decided to do a very quick and dirty grepping of statistics with the following hideous Bash two-liner:
/usr/bin $ for f in *; do objdump -d $f | sed -e 's/^ *[0-9a-f]*:[\t 0-9a-f]*[ \t]\([a-z][0-9a-z][0-9a-z][0-9a-z]*\)[ \t]\(.*\)$/\1/g' | grep '^[a-z0-9]*$' >> /tmp/instrs.txt; done
/usr/bin $ cat /tmp/instrs.txt | awk '/./ { arrs[$1] += 1 } END { for (val in arrs) { print arrs[val], val; sum += arrs[val] } print sum, "Total" }' | sort -n -r | head -n 50
For those of you who can't read bash, this basically amounts to:
1. Find every ELF binary in /usr/bin
2. Use objdump to disassemble the text sections
3. Grep for what the commands probably are (not 100% accurate, since some constant data (e.g., switch tables) gets inlined into the text section, and some nop sequences get mangled)
4. Build a histogram of the instructions, and then print the results
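For anyone who would rather not decode the sed expression, steps 3 and 4 can be sketched roughly in Python. This is not the script used above, just an approximation under a few assumptions: the input is tab-separated `objdump -d` output (address, raw bytes, instruction), the first token of the instruction field is taken as the mnemonic (so prefixes such as `rep` or `lock` would be counted on their own), and the `count_mnemonics` name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# A mnemonic is assumed to be a lowercase word of letters and digits,
# which filters out objdump's "(bad)" markers and similar noise.
MNEMONIC = re.compile(r"^[a-z][a-z0-9]*$")

def count_mnemonics(lines):
    """Build a histogram of mnemonics from objdump -d style lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Instruction lines look like "  addr:<TAB>bytes<TAB>mnemonic operands".
        # Symbol headers and byte-only continuation lines have fewer fields.
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue
        tokens = fields[2].split()
        if tokens and MNEMONIC.match(tokens[0]):
            counts[tokens[0]] += 1
    return counts

# Hypothetical sample input, not taken from a real binary.
sample = [
    "0000000000401000 <main>:",
    "  401000:\t55                   \tpush   %rbp",
    "  401001:\t48 89 e5             \tmov    %rsp,%rbp",
    "  401004:\t48 8d 04 f2          \tlea    (%rdx,%rsi,8),%rax",
    "  401008:\t48 89 c7             \tmov    %rax,%rdi",
    "  40100b:\tc3                   \tretq   ",
]

hist = count_mnemonics(sample)
for name, n in hist.most_common():
    print(n, name)
print(sum(hist.values()), "Total")
```

To run it on real binaries you would feed it the output of objdump instead of the sample list; like the original, it will still miscount any data that objdump happens to disassemble as code.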
I don't bother to try to handle the different sizes of types (particularly for nop, which can be written in several different ways to align the next instruction on the right boundary). The result is this list:
Since this is Debian Linux, nearly every binary is compiled with gcc. The push and pop instructions are therefore relatively rare (since they tend not to be used to set up call instructions, just mov's to the right position on rbp/rsp). jmpq and pushq are way overrepresented thanks to the PLT relocations (2 jmpq, 1 pushq per PLT entry).
mov's are very common because, well, they mean several different things in x86-64: set a register to a given constant value, copy from one register to another, load from memory into a register, store from a register into memory, store a constant into memory... Note, too, that several x86 instructions require particular operands to be in particular registers (e.g., rdx/rax for div, not to mention function calling conventions).
If you're curious, the rarest instructions I found were the AES and CRC32 instructions.
I am surprised to see lea so high on the list. Back when I last wrote x86 assembly code (early 2000s), it was considered slow, but perhaps that has changed.
And an older description, about amd64 in general (32-bit x86 is quite different from 64-bit):
"lea -0x30(%edx,%esi,8),%esi
Compute the address %edx+8*%esi-48, but don’t refer to the contents of memory. Instead, store the address itself into register %esi. This is the “load effective address” instruction: its binary coding is short, it doesn’t tie up the integer unit, and it doesn’t set the flags."
[ed: in general the purpose of lea is to get the address of things in C/Pascal arrays (pointer to array + offset) -- as I understand it. I don't know if optimizing C compilers use it much for other tricks -- but taking the address of an item in an array (say, a character in a C string) sounds like it would be fairly common. If lea is fast(er) on amd64, using the dedicated instruction would make sense.]
Yeah, despite its name, lea is an arithmetic instruction; it doesn't reference memory (although it was designed with computing memory addresses in mind). You can do some neat arithmetic tricks with lea, because it computes register1 + (register2 << shift) + offset, where the shift is 0 to 3 (i.e., a scale of 1, 2, 4, or 8).
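To make the arithmetic concrete, here is a small Python model of what lea computes. It is only a sketch: the `lea` helper is a made-up name, it assumes a 64-bit destination register (so the result wraps modulo 2^64), and it ignores everything else about the instruction, such as encoding and timing.

```python
MASK64 = (1 << 64) - 1  # assume a 64-bit destination register

def lea(base, index, scale, disp=0):
    """Model of lea disp(base,index,scale): base + index*scale + disp."""
    if scale not in (1, 2, 4, 8):
        # The scale is encoded as a 2-bit shift, so only these values exist.
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK64

# The quoted example, lea -0x30(%edx,%esi,8), with edx = 100 and esi = 3:
addr = lea(100, 3, 8, -0x30)   # 100 + 3*8 - 48

# Multiplying by small constants without mul, using the same register twice:
times5 = lea(7, 7, 4)          # x + 4*x
times9 = lea(7, 7, 8)          # x + 8*x

print(addr, times5, times9)
```

The multiply-by-5 and multiply-by-9 forms are the kind of trick meant above: one lea stands in for a shift plus an add, and it leaves the flags untouched.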
The callq/retq ratio means most functions have... apparently, about 8 calls to others, after inlining. It has nothing to do with libc per se; there's no reason in general to expect a 1-to-1 call/ret ratio.
I've been wondering what were the worst decisions in assigning single-byte opcodes; can you determine that from your data? I'm guessing that the decimal/ASCII adjust operations (DAA/DAS/AAA/AAS) have extremely low use considering they take up 4/256 of the opcode space.
Some of the 1-byte opcodes are specific subsets of instructions (e.g., 04 is ADD AL, Ib). Keeping in mind that many single-byte opcodes don't actually exist in 64-bit code (e.g., PUSH DS or DAS), the ones that do exist that aren't used:
ins, outs, lods, in, stc, cli, wait, pushf, popf, lahf