I don't want to belittle the author, but I am surprised, that people using a low...

mtlynch · on March 20, 2024

Author here.

Thanks for reading!

>I am surprised, that people using a low-level language on Linux wouldn't know ... that reading one byte per syscall is quite inefficient.

In my defense, it wasn't that I didn't realize one byte per syscall was inefficient; it was that I didn't realize that I was doing one syscall per byte read.

I'm coming back to low-level programming after 8ish years of Go/Python/JS, so I wasn't really registering that I'd forgotten to layer in a buffered reader on top of stdin's reader.
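
To make the difference concrete, here is a minimal sketch in Python rather than Zig. It pushes a small payload through a pipe and counts how many read calls it takes to drain it, one byte at a time versus in 4096-byte chunks. The helper names `count_reads` and `demo` and the chunk size are illustrative assumptions, not anything from the article's code; the payload is the hex string from the pipeline in the strace output further down the thread.

```python
import os

def count_reads(fd, chunk_size):
    """Read fd to EOF with os.read and return (bytes_read, read_calls)."""
    total = 0
    calls = 0
    while True:
        data = os.read(fd, chunk_size)  # each call issues one read on the descriptor
        calls += 1
        if not data:  # an empty result means EOF
            break
        total += len(data)
    return total, calls

def demo(payload, chunk_size):
    """Push payload through a pipe and count the reads needed to drain it."""
    r, w = os.pipe()
    os.write(w, payload)  # small payload, fits in the pipe buffer
    os.close(w)           # close the write end so the reader sees EOF
    try:
        return count_reads(r, chunk_size)
    finally:
        os.close(r)

if __name__ == "__main__":
    payload = b"60016000526001601ff3\n"
    print("byte at a time:", demo(payload, 1))
    print("4096-byte chunks:", demo(payload, 4096))
```

A buffered reader does roughly what the second call does: it fetches a large chunk once and then hands out bytes from memory, so a byte-oriented API on top of it costs no extra syscalls.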

Alex Kladov (matklad) made an interesting point on the Ziggit thread[0] that the Zig standard library could adjust the API to make this kind of mistake less likely:

>I’d say readByte is a flawed API to have on a Reader. While you technically can read a byte-at-time from something like TCP socket, it just doesn’t make sense. The reader should only allow reading into a slice.

>Byte-oriented API belongs to a buffered reader.

[0] https://ziggit.dev/t/zig-build-run-is-10x-faster-than-compil...

amluto · on March 20, 2024

As a very general tip:

> execution time: 438.059µs

That’s a rather short time. (It’s a lot of cycles, but there are plenty of things one might do on a computer that take time comparable to this, especially anything involving IO. It’s only a small fraction of a disk head seek if you’re using spinning disks, and it’s only a handful of non-overlapping random accesses even on NVMe.)

So, when you benchmark anything and get a time this short, you should make sure you’re benchmarking the right thing. Watch out for fixed costs or correct for them. Run in a loop and see how time varies with iteration count. Consider using a framework that can handle this type of benchmarking.

A result like “this program took half a millisecond to run” doesn’t really tell you much about how long any part of it took.
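
One way to act on the "run in a loop" advice, as a rough sketch: time the same work at several iteration counts and fit a straight line, so that the intercept approximates the fixed cost and the slope approximates the per-iteration cost. The function names and the stand-in workload below are hypothetical, and this only measures in-process overhead; process startup would need to be timed from outside.

```python
import time

def time_iterations(work, n):
    """Return wall-clock seconds taken to call work() n times."""
    start = time.perf_counter()
    for _ in range(n):
        work()
    return time.perf_counter() - start

def fit_fixed_and_per_iter(samples):
    """Least-squares fit of t = fixed + per_iter * n over (n, t) samples."""
    k = len(samples)
    mean_n = sum(n for n, _ in samples) / k
    mean_t = sum(t for _, t in samples) / k
    var = sum((n - mean_n) ** 2 for n, _ in samples)
    if var == 0:
        raise ValueError("need at least two distinct iteration counts")
    cov = sum((n - mean_n) * (t - mean_t) for n, t in samples)
    per_iter = cov / var
    fixed = mean_t - per_iter * mean_n
    return fixed, per_iter

if __name__ == "__main__":
    def work():
        # Hypothetical stand-in workload; replace with the code under test.
        sum(range(1000))

    samples = [(n, time_iterations(work, n)) for n in (10, 100, 1000, 10000)]
    fixed, per_iter = fit_fixed_and_per_iter(samples)
    print(f"fixed cost ~ {fixed:.6f}s, per iteration ~ {per_iter:.9f}s")
```

If the fitted fixed cost is a large share of a single run's time, a one-shot measurement mostly reflects overhead rather than the work itself.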

hawski · on March 20, 2024

Zig certainly needs more work. That part is more on familiarity with Zig and how intuitive it is or isn't.

In any case I would recommend anyone investigating things like that to run things through strace. It is often my first step in trying to understand what happens with anything - like a cryptic error "No such file or directory" without telling me what a thing tried to access. You would run:

$ strace -f -o strace.log sh -c 'your | pipeline | here'

You could then track things easily and see what is really happening.

Cheers!

mtlynch · on March 20, 2024

Thanks for the tip! I don't have experience with strace, and I'm wondering if I'm misunderstanding what you're saying.

I tried running your suggested command, and I got 2,800 lines of output like this:

    execve("/nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/sh", ["sh", "-c", "echo \"60016000526001601ff3\" | xx"...], 0x7fffffffcf38 /* 106 vars */) = 0
    brk(NULL)                               = 0x4ea000
    arch_prctl(0x3001 /* ARCH_??? */, 0x7fffffffcdd0) = -1 EINVAL (Invalid argument)
    access("/etc/ld-nix.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
    newfstatat(3, "", {st_mode=S_IFREG|0555, st_size=15688, ...}, AT_EMPTY_PATH) = 0
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fc2000
    mmap(NULL, 16400, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ffff7fbd000
    mmap(0x7ffff7fbe000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x7ffff7fbe000

Am I doing it wrong? Or is there more training involved before one could usefully integrate this into debugging? Because to me, the output is pretty inscrutable.

hawski · on March 20, 2024

There is a lot of output here, but you can grep around or filter with strace CLI. If you used -f option you should get PID numbers later on. Then you can look for all execve's to see how PIDs map to parts of the pipeline. For now maybe grep the log file with something like: "grep -e clone -e execve -e write -e read". You can do this with strace CLI, but I never remember the syntax and usually analyze the log extensively.

I think something like this could work:

  strace -f -e execve,clone,write,read -o strace.log sh -c '...'

Clone is fork, so a creation of a new process, before eventual execve (with echo there will probably be just clone).
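
If grepping gets unwieldy, the same filtering can be sketched in a few lines of Python. This assumes each log line starts with the syscall name, optionally preceded by a PID ("1234 " in a log written with -o, or "[pid 1234] " on stderr); that prefix handling is my assumption about the format, and the helper names are made up. The sample lines are copied from the output quoted above.

```python
import re
from collections import Counter

# Optional PID prefix, then the syscall name, then an opening parenthesis.
LINE_RE = re.compile(r"^\s*(?:\[pid\s+(\d+)\]\s+|(\d+)\s+)?([a-z_][a-z0-9_]*)\(")

def parse_line(line):
    """Return (pid_or_None, syscall_name), or None for non-syscall lines."""
    m = LINE_RE.match(line)
    if not m:
        return None
    pid = m.group(1) or m.group(2)
    return (int(pid) if pid else None, m.group(3))

def filter_syscalls(lines, wanted):
    """Keep only the lines whose syscall name is in the wanted set."""
    return [line for line in lines if (p := parse_line(line)) and p[1] in wanted]

def count_syscalls(lines):
    """Tally how many times each syscall appears."""
    return Counter(p[1] for line in lines if (p := parse_line(line)))

SAMPLE = [
    'brk(NULL)                               = 0x4ea000',
    'access("/etc/ld-nix.so.preload", R_OK)  = -1 ENOENT (No such file or directory)',
    r'read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832',
    'mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fc2000',
]

if __name__ == "__main__":
    for line in filter_syscalls(SAMPLE, {"execve", "clone", "write", "read"}):
        print(line)
    print(count_syscalls(SAMPLE).most_common())
```

On a real log, a very large count for read with a length argument of 1 would be the kind of pattern worth looking at.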

dgoldstein0 · on March 20, 2024

strace tells you every syscall the process under it makes. That makes it very helpful for understanding how a program interacts with the operating system, including I/O, since all I/O mechanisms are managed by the operating system.

As for how to filter this I'll leave that to the other comments, but I personally would look at the man page or Google around for tips

vlovich123 · on March 20, 2024

FWIW, the current version of the code (i.e. with the buffered reader), on my machine at least, runs identically fast with and without the tmp file.

Here's a possibly more detailed reason as to why https://news.ycombinator.com/item?id=39764287#39768022.

joelfried · on March 20, 2024

Thank you for sharing it!

Articles like this are how one learns the nuances of such things, and it's good for people to keep putting them out there.

cpuguy83 · on March 20, 2024

It's one of those things that you don't usually need to think about, so you don't.

Not too long ago I hit this same realization with pipes because my "grep ... file | sed > file" (or something of that nature) was racy.

I took the time to think about it and realized "oh I guess that's how pipes would _have_ to be implemented".
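
A rough Python sketch of why that pattern loses data: every stage of a pipeline starts at once, and the output redirect truncates the file when it is opened, which can happen before the first stage has read anything. The code below forces that worst-case ordering deterministically instead of racing real processes, and contrasts it with writing to a temporary file and renaming. Both function names are made up for illustration.

```python
import os
import tempfile

def rewrite_in_place_unsafely(path, transform):
    """Mimic '... file | ... > file' when the truncating open wins the race."""
    out = open(path, "w")        # like the shell's redirect: truncates immediately
    try:
        with open(path) as src:  # the reader now sees an empty file
            data = src.read()
        out.write(transform(data))
    finally:
        out.close()

def rewrite_in_place_safely(path, transform):
    """Read everything first, write to a temp file, then rename over the original."""
    with open(path) as src:
        data = src.read()
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as out:
        out.write(transform(data))
    os.replace(tmp, path)        # swap the finished file into place

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file")
        for rewrite in (rewrite_in_place_safely, rewrite_in_place_unsafely):
            with open(path, "w") as f:
                f.write("hello\n")
            rewrite(path, str.upper)
            with open(path) as f:
                print(rewrite.__name__, repr(f.read()))
```

In a real shell the outcome depends on scheduling, which is what makes it a race rather than a consistent failure.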

0x000xca0xfe · on March 20, 2024

There are many fundamental things People Should Know™, but we are not LLMs that have ingested entire libraries worth of books.

Exploratory programming and being curious about strange effects is a great way to learn the fundamentals. I already knew how pipes and processes work, but I don't know the Ethereum VM. The author now knows both.

nebulous1 · on March 20, 2024

I feel like you are underestimating how many fundamental misunderstandings people can have (including both of us) even though they have deep understanding of adjacent issues.

boffinAudio · on March 20, 2024

This deleterious effect is a factor in computing. We deal with it every few years: kids graduate, having ignored the history of their elders in order to focus on the new and cool - and once they hit industry, find that they really, really should have learned POSIX or whatever.

It's not uncommon. As a professional developer I have observed this obfuscation of prior technology countless times, especially with junior devs.

There is a lot to learn. Always. It doesn't ever stop.

franciscop · on March 20, 2024

This is exactly what surprised me as well. I'm literally now learning in depth WebStreams[1] in JS (vs the traditional Node Streams) and I've seen too many times the comparison of how "pipe() and pipeTo() behave just like Unix's pipes |". Reading this article makes me think this might not be the best comparison, especially since for many webdevs it's their first time approaching some CS concepts. OTOH, the vast majority of webdevs don't really need to learn WebStreams in-depth.

[1] https://exploringjs.com/nodejs-shell-scripting/ch_web-stream...

LAC-Tech · on March 20, 2024

Most people have gaps somewhere in their knowledge. I learned very early on, as a general superstition, to always try to batch things that dealt with the outside world, like file writes, allocations, network requests etc. But for years I had no idea what a syscall even was.