Or we can treat this as the straw that breaks the camel's back and finally draw a bit of inspiration from our French counterparts. The power these people wield is artificial, and we're capable of taking it away.
That works well for sorting/bucketing/etc. in a few places, but as a comparison it's prone to false negatives (your values really are close, but the epsilon test says they aren't), so you're restricted to algorithms that can tolerate that behavior.
My normal issue with floating-point epsilon shenanigans is that they don't usually pass the sniff test, suggesting something fundamentally wrong with the problem framing or its solution.
It's a classic, so let's take vector normalization as an example. Topologically, you're ripping a hole in the space, and that's what causes your issues. It manifests as NaN for length-zero vectors, weird precision problems for vectors too close to zero, etc., but no matter what you employ to try to fix it, you're never going to have a good time squishing N-D space onto the surface of an N-D sphere if you need the map to be continuous.
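A minimal sketch of what that hole looks like in practice (NumPy; the function name and test values are mine, not from the thread, and I use a plain sum-of-squares norm so the failure mode is obvious rather than whatever rescaling a library norm might do):

```python
import numpy as np

def naive_normalize(v):
    # Plain x / |x|: undefined at the origin, shaky whenever the
    # squared components underflow.
    return v / np.sqrt(np.sum(v * v))

print(naive_normalize(np.zeros(3)))   # [nan nan nan] -- 0/0, the hole itself

tiny = np.array([1e-30, 1e-30, 0.0], dtype=np.float32)
print(naive_normalize(tiny))          # inf/nan garbage: 1e-60 underflows to 0 in f32
```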
Some common subroutines where I see this:
1. You want to know the average direction of a bunch of objects and thus have to normalize each vector contributing to that average. Solution 1: That's almost never what you want. In any of the sciences, or anything loosely approximating the real world, you want to average the un-normalized vectors 99.999% of the time. Solution 2: Maybe you really do need directions for some reason (e.g., tracking where birds are looking in a game). Then don't rely on vectors for your in-band signaling: explicitly track direction and magnitude separately and observe the magic of never having direction-related precision errors.
2. You're doing some sort of lighting normalization and need to compute something involving the areas of potentially near-degenerate triangles, dividing by those values to weight contributions appropriately. Solution: Same as above; this is kind of an average-of-averages problem. It can make fuzzy, intuitive sense, but you'll get better results if you do your summing and averaging in an un-normalized space. If you really do need surface normals, store them explicitly, separate from magnitude.
3. You're doing some sort of ML voodoo to try to get better empirical results via some vague appeal to vanishing gradients or whatever. Solution: The core property you want is a somewhat strange constraint on your layer's Jacobian, and outside of like two papers nobody is willing to put up with the code complexity or runtime costs, even when they recognize it as the right thing to do. Everything you're doing is a hack anyway, so make your normalization term x/(|x|+eps) with eps > 0 rather than equal to zero like normal (see the sketch after this list). Choose eps much smaller than the magnitudes of most of the vectors you're normalizing this way and much bigger than zero. Something like 1e-3, 1e-20, and 1e-150 should be fine for f16, f32, and f64 respectively. You don't have to tune it, because it's a pretty weak constraint on the model and the model can learn around it.
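Here's a rough sketch of that eps-normalization in NumPy (the function name and the dtype lookup are my own framing; the eps values are the ones suggested in point 3):

```python
import numpy as np

# eps per dtype, per the suggestion above: much smaller than typical vector
# magnitudes, much bigger than zero, so it never really needs tuning.
EPS = {np.float16: 1e-3, np.float32: 1e-20, np.float64: 1e-150}

def soft_normalize(x, axis=-1):
    """x / (|x| + eps): continuous everywhere, effectively unit length for
    normal-sized vectors, and smoothly shrinks to zero instead of blowing
    up as |x| -> 0."""
    eps = EPS[x.dtype.type]
    norm = np.sqrt(np.sum(x * x, axis=axis, keepdims=True))
    return x / (norm + eps)

print(soft_normalize(np.zeros(4, dtype=np.float32)))  # [0. 0. 0. 0.], no NaN
print(soft_normalize(np.array([3.0, 4.0])))           # ~[0.6, 0.8], effectively unit length
```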
standup> My Coq experiments are promising, but it feels a little harder than it should be, and I'm worried it'll be too slow to deliver without some training and long practice sessions.
Sure, why not? Current US music revenue is $6/mo per US taxpayer. For less than half the cost of Spotify you could 5x the income going to musicians if you skipped the middleman and magically just paid them directly. That doesn't seem like a bad deal.
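Back-solving from the numbers in that comment (no outside figures, just making the implied arithmetic explicit):

```python
revenue_per_taxpayer_mo = 6.00  # stated: total US music revenue per taxpayer per month
direct_payment_mo       = 6.00  # pay that same $6 straight to musicians
implied_current_income  = direct_payment_mo / 5  # "5x the income" => musicians see ~$1.20/mo today
print(implied_current_income / revenue_per_taxpayer_mo)  # ~0.2: musicians keep roughly a fifth of revenue now
```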
I don't really know, but I have a story which might prompt some conversation about it.
At $WORK we had a system which, if you traced its logic, could not possibly experience the bug we were seeing in production. This was a userspace control module for an FPGA driver connected to some machinery you really don't want to fuck around with, and the bug had wasted something like three staff+ engineer-years by the time I got there.
Recognizing that the bug was impossible in the userspace code if the system worked as intended end-to-end, the engineers started diving into Verilog and driver code, trying to find the issue. People were suspecting miscompilations and all kinds of fun things.
Eventually, for unrelated reasons, I decided to clean up the userspace code (deleting and refactoring things unlocks additional deletion and refactoring opportunities, and when all was said and done I had deleted 80% of the project so that I had a better foundation for some features I had to add).
For one of those improvements, my observation was just that if I had to write the driver code to support the concurrency we were abusing I'd be swearing up a storm and trying to find any way I could to solve a simpler problem instead.
Long story short, I still don't know what the driver bug was, but the actual authors must've felt the same way, since when I opted for userspace code with simpler concurrency demands the bug disappeared.
Tying it back to AI and hacking, the white-box approach here literally didn't work, and the black-box approach easily illuminated that something was probably fucky. Given that AI can de-minify and otherwise spot patterns from fairly limited data, I wouldn't be shocked if black-box hacking were (at least sometimes) more token-efficient than white-box.
This seems to be extremely common. Been a very long time since I looked at Linux kernel stuff, but there were numerous drivers that disabled hardware acceleration or offloading features simply because they became unreliable if they were given heavy loads or deep queues.
Yes and no. Storage should cost in the ballpark of $200M/yr or less. Transcoding, networking, and delivery should be similar. Let's round up to $10B/yr just for fun.
YT makes $40B/yr (revenue, IIRC) across its 3B customers, or $1.11 per customer per month. $16/mo seems high by comparison, and it's very high against the cost estimate above, which works out to about $0.28 per customer per month. Nearly every other industry on the planet is jealous of margins like that.
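Quick arithmetic check on those per-user figures (the revenue, user count, and rounded-up cost are the estimates above, not audited numbers):

```python
revenue_yr = 40e9  # YT revenue estimate
users      = 3e9   # rough customer count
costs_yr   = 10e9  # deliberately rounded-up cost estimate from the previous paragraph

print(revenue_yr / users / 12)  # ~1.11 USD per customer per month
print(costs_yr / users / 12)    # ~0.28 USD per customer per month
```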
A normal counter-argument here is that they should be allowed to reap those profits till competition forces them to do otherwise. That's a little at odds with our normal view toward monopolies, especially when the monopoly engages in anti-competitive acts to preserve that edge, but whatever; now you're at least having a real debate about real facts and things you care about.
Another is that YT's expenses in practice are way higher than that because they need to hire a bunch of ML people or whatever to extract even more ad money out of you, and that's a point I disagree with pretty firmly. I'm not sure why my subscription needs to subsidize a company's other predatory tendencies.
On qutebrowser I type 2H instead of H, and it doesn't land on the most recent history item at all. Mostly, though, no: spammy websites still do this, and browsers haven't fixed it.
On the other side, why should one crazed/corrupt judge in some state which has nothing to do with me be able to infringe on my freedoms and make my life worse? Worse, why is it possible to jurisdiction shop for the single bad actor and impose your will on the entire country?
You're not wrong, but (like most issues in a 350M-person country) it's complicated. The system is tailored to some expected level and type of corruption and bad actors. If you expect that the government is basically fine, and that out of 50M people per region surely somebody will file suit if an issue is important, then the current system makes a lot of sense: you get judges with more knowledge and awareness of your local issues, anything important still gets addressed, and you're resilient to some number of random bad judges and bad actors. If those expectations are out of whack, then you get worse outcomes.
In reality, the world is complicated enough that even boiling the judges and whatnot down to that simple a description is misleading at best. Neither solution is anywhere near optimal by itself. So... what next?
Yeah, it's definitely a mixed bag, and maybe the solution is to require injunctions to be approved by at least a multi-judge panel at the circuit level before they go into effect. In effect, that basically already happened, though: the normal pattern was for injunctions to be stayed for a few weeks pending appeal, and the appeals court could extend that stay if it believed the injunction was flawed or unjustified. The characterization of "one crazed judge" doesn't really hold up against the pattern of their actual use, and where judges didn't put in a stay, the appeals court could do so as well.