Hacker News

Well, I decided to do a very quick and dirty grepping of statistics with the following hideous Bash two-liner:

  /usr/bin $ for f in *; do objdump -d $f | sed -e 's/^ *[0-9a-f]*:[\t 0-9a-f]*[ \t]\([a-z][0-9a-z][0-9a-z][0-9a-z]*\)[ \t]\(.*\)$/\1/g' | grep '^[a-z0-9]*$' >> /tmp/instrs.txt; done
  /usr/bin $ cat /tmp/instrs.txt | awk '/./ { arrs[$1] += 1 } END { for (val in arrs) { print arrs[val], val; sum += arrs[val] } print sum, "Total" }' | sort -n -r | head -n 50
For those of you who can't read bash, this basically amounts to:

1. Find every ELF binary in /usr/bin

2. Use objdump to disassemble the text sections

3. Grep for what the commands probably are (not 100% accurate, since some constant data (e.g., switch tables) gets inlined into the text section, and some nop sequences get mangled)

4. Build a histogram of the instructions, and then print the results
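
Step 3 can be checked in isolation on a single fabricated objdump-style line (the sed expression is the one from the two-liner above, and assumes GNU sed, which accepts \t inside bracket expressions; the address and bytes are made up):

```shell
# One hand-written line in objdump's disassembly format:
#   address:  raw bytes   mnemonic  operands
# The sed expression from the pipeline keeps only the mnemonic.
printf '  401000:\t48 89 f8             \tmov    %%rdi,%%rax\n' |
sed -e 's/^ *[0-9a-f]*:[\t 0-9a-f]*[ \t]\([a-z][0-9a-z][0-9a-z][0-9a-z]*\)[ \t]\(.*\)$/\1/g'
```

This prints just "mov": the bracket expression eats the address and hex bytes, and the capture group grabs the first run of lowercase alphanumerics that follows.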

I don't bother to try to handle the different sizes of types (particularly for nop, which can be written in several different ways to align the next instruction on the right boundary). The result is this list:

  117336056 Total
  43795946 mov
  9899354 callq
  7128602 lea
  5258131 test
  4928828 cmp
  4463038 jmpq
  3879175 pop
  3628965 jne
  3519259 add
  3227188 xor
  2824699 push
  2361085 nopl
  2016346 sub
  1678131 and
  1551014 movq
  1311683 jmp
  1311476 retq
  1255216 nopw
  1239479 movzbl
  1234815 movl
  946434 movb
  663852 shr
  614367 shl
  581608 cmpb
  523398 movslq
  427116 pushq
  384794 cmpq
  376695 jbe
  348158 movsd
  341248 testb
  340718 sar
  338542 xchg
  311171 data16
  302296 jle
  266539 movzwl
  252872 cmpl
  210762 jae
  169823 lock
  152724 addq
  151186 sete
  147340 cmove
  146501 imul
  146386 setne
  145233 movabs
  142801 repz
  123489 cmovne
  123439 addl
  105156 pxor
  81745 cmpw
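
The histogram stage (step 4) can likewise be run on its own; here it is fed a few fabricated mnemonics instead of /tmp/instrs.txt, with the same awk program as above:

```shell
# Count each mnemonic, print a grand total, most frequent first.
printf 'mov\ncallq\nmov\nlea\nmov\n' |
awk '/./ { arrs[$1] += 1 } END { for (val in arrs) { print arrs[val], val; sum += arrs[val] } print sum, "Total" }' |
sort -n -r
```

On this sample it prints "5 Total" first, then "3 mov", then the two singletons.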

Since this is Debian Linux, nearly every binary is compiled with gcc. The push and pop instructions are therefore relatively rare (they tend not to be used to set up call instructions: arguments go in registers, and stack slots are written with mov's at the right offsets from rbp/rsp). jmpq and pushq are way overrepresented thanks to the PLT relocations (2 jmpq, 1 pushq per PLT entry).

mov's are very common because, well, they mean several different things in x86-64: set a register to a given constant value, copy from one register to another, load from memory into a register, store from a register into memory, store a constant into memory... Note, too, that several x86 instructions require particular operands to be in particular registers (e.g., rdx/rax for div, not to mention function calling conventions).
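
To make that concrete, all of these assemble to some form of mov (AT&T syntax, as in the objdump output; a hand-written sketch, not taken from any particular binary):

  mov    $0x2a,%eax       # constant into register
  mov    %rdi,%rax        # register to register
  mov    (%rdi),%rax      # load from memory into register
  mov    %rax,(%rdi)      # store register into memory
  movq   $0x2a,(%rdi)     # constant into memory (size spelled out)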

If you're curious, the rarest instructions I found were the AES and CRC32 instructions.



I am surprised to see lea so high on the list. Back when I last wrote x86 assembly code (early 2000s), it was considered slow, but perhaps that has changed.


I came across:

http://www.realworldtech.com/haswell-cpu/4/

And an older one, about amd64 in general (32-bit x86 is quite different from 64-bit):

"lea -0x30(%edx,%esi,8),%esi

Compute the address %edx+8*%esi-48, but don’t refer to the contents of memory. Instead, store the address itself into register %esi. This is the “load effective address” instruction: its binary coding is short, it doesn’t tie up the integer unit, and it doesn’t set the flags."

http://people.freebsd.org/~lstewart/references/amd64.pdf

[ed: in general the purpose of lea is to get the address of things in c/pascal arrays (pointer to array+offset) -- as I understand it. I don't know if it's used for other tricks by optimizing (c) compilers much -- but taking the address of an item in an array (say a character in a C string) sounds like it would be fairly common. If lea is fast(er) on amd64 using the dedicated operand would make sense.]


Yeah, despite its name, lea is an arithmetic instruction; it doesn't reference memory (although it was designed with computing memory addresses in mind). You can do some neat arithmetic tricks with lea, because it computes register1 + (register2 << shift) + offset, with a shift of 0 to 3 (i.e. a scale of 1, 2, 4, or 8).


'lea' is a fast way to multiply by 3, for instance. It's also faster than two adds.
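
For example (AT&T syntax; a sketch of the trick, with the scale factors 2, 4, 8 coming from the allowed shifts):

  lea    (%rax,%rax,2),%rax   # rax = rax + rax*2, i.e. multiply by 3
  lea    (%rax,%rax,4),%rax   # multiply by 5
  lea    (%rax,%rax,8),%rax   # multiply by 9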


I found the (top) instructions that deal with the stack in some form most interesting:

  9899354 callq
  3879175 pop
  2824699 push
  1311476 retq
I suppose many of the callq's go into the kernel/libc, so the corresponding ret/retq wouldn't be in the binaries?

Interesting that pop outnumbers push by about 1.4x. Perhaps this is due to many of the callq's leading to something interesting being left on the stack?


The callq/retq ratio means most functions make, apparently, about 8 calls to others, after inlining. It has nothing to do with libc, per se; there's no reason in general to expect a 1-to-1 call/ret ratio.
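
That figure is just the two counts from the histogram at the top divided:

```shell
# callq count / retq count from the histogram above
awk 'BEGIN { printf "%.1f\n", 9899354 / 1311476 }'
```

which prints 7.5.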


I've been wondering what were the worst decisions in assigning single-byte opcodes; can you determine that from your data? I'm guessing that the decimal/ASCII adjust operations (DAA/DAS/AAA/AAS) have extremely low use considering they take up 4/256 of the opcode space.


Some of the 1-byte opcodes are specific subsets of instructions (e.g., 04 is ADD AL, Ib). Keeping in mind that many single-byte opcodes don't actually exist in 64-bit code (e.g., PUSH DS or DAS), the ones that do exist that aren't used: ins, outs, lods, in, stc, cli, wait, pushf, popf, lahf


Interesting how you know so much shell but still uselessly use cat.


It reads better than the alternative, which would be to bury the input filename in the middle of that second line.


    < file od
for example works just as well.
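
All three forms below produce the same output; the only difference is the extra cat process. (The sample file and its name are made up here to keep the example self-contained.)

```shell
# Three equivalent ways of feeding a file to a command.
printf 'mov\nlea\n' > /tmp/uuoc_demo.txt
cat /tmp/uuoc_demo.txt | wc -l   # useless use of cat
wc -l < /tmp/uuoc_demo.txt       # redirect after the command
< /tmp/uuoc_demo.txt wc -l       # redirect up front, as in the comment above
```

Each of the three prints the same line count.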



