More

SaberTail · 2026-04-02T18:16:55 1775153815

I said to a few friends that the recent trailer felt like it could be for a House of Leaves movie. Different overall setting, but the "found footage" aspects, and the narrative over it, felt like they could be right out of the Navidson Record.

I don't have any real proof for this, but it feels like House of Leaves inspired a lot of the people making "found footage" and "creepypasta" stuff one the internet in the 2000s and early 2010s (SCP, Marble Hornets, Slender Man), and then that stuff came together to inspire the Backrooms.

SaberTail · 2026-03-24T18:22:59 1774376579

The term I've heard for this sort of thing is "Physical Neural Networks" or "PNN"s. My impression is that one of the big things holding them back is that because we can't manufacture components to perfect tolerances, you can't train a single model and reuse it like you can with digital logic. Even if you can get close, every single circuit needs some amount of tuning. And we haven't worked out great ways to train them.

There's a lot of research going on in this space though, because yeah, nature can solve certain mathematical problems more efficiently than digital systems.

There's a decent review article that came out recently: https://www.nature.com/articles/s41586-025-09384-2 or https://arxiv.org/html/2406.03372v1

SaberTail · 2026-03-03T23:13:38 1772579618

The "figure out what you want to say" is key. I've started to think of LLMs, at least in a business setting, as misunderstanding amplifiers.

How many times at work have you been talking to someone else where they're using common words as jargon? Maybe it's something like "the online system" or "the platform". And it's perfectly clear to them what they mean, but everyone else in the company either doesn't know what that actually is, or they have a distorted idea based on the conventional definitions of the words. Even without LLMs in the mix, this can lead to people coming out of meetings with completely different understandings of what's going on.

My experience is few people are actually providing the relevant context to the LLM to explain what they mean in situations like this. Or they don't have the actual knowledge and are using the LLM in the hopes it'll fill in for their ignorance. The LLMs are RLHFed to sound confident, so they won't convey that they don't know what a piece of jargon means. Instead they'll use a combination of the common meaning and the rest of the context to invent something. When this gets copy/pasted and sent around, it causes everyone who isn't familiar to get the wrong idea. Hence "misunderstanding amplifier".

To the point of the article, this is soluble if people take the time to actually figure out what they are trying to convey. But if they did that, they wouldn't need the LLM in the first place.

LorenPechtel · 2026-03-04T00:12:15 1772583135

And that people and the systems actually know the relevant terms.

I recently was dealing with the Amazon robot--after correctly identifying the items in the order it then proceeded to use short terms which were wrong, but make sense as what a classifier might have spit out. Instead of understanding being a shared thing it falls entirely on the user. Sufficiently adept user, this is fine. But a lot of users aren't sufficient adept.

SaberTail · 2026-02-20T22:01:10 1771624870

This doesn't seem to be complete. It's missing the Waste Isolation Pilot Plant, for example, which should be southeast of Carlsbad, NM. It's a underground salt (metal/non-metal) mine, and MSHA definitely regulates it

greggsy · 2026-02-21T00:22:09 1771633329

The state numbers don’t seem to marry up, unless they’re indicative of something else?

snypher · 2026-02-21T00:48:59 1771634939

WIPP isn't really a mine, right? More like an Amazon warehouse.

SaberTail · 2026-02-21T18:28:52 1771698532

as far as MSHA is concerned it is. They take salt out of the ground to make room for the waste.

SaberTail · 2026-02-15T23:45:00 1771199100

The poetry you quoted is originally by Vladimir Nabokov in Pale Fire.

ghywertelling · 2026-02-16T02:07:49 1771207669

Pale Fire book is shown in the movie Blade Runner 2049

https://www.youtube.com/watch?v=OtLvtMqWNz8

Solving Nabokov's Pale Fire - A Deep Dive

https://www.youtube.com/watch?v=-8wEEaHUnkA

Pale Fire is what we call as Ergodic literature

Ergodic literature refers to texts requiring non-trivial effort from the reader to traverse, moving beyond linear, top-to-bottom reading to actively navigate complex, often nonlinear structures. Coined by Espen J. Aarseth (1997), it combines "ergon" (work) and "hodos" (path), encompassing print and electronic works that demand physical engagement, such as solving puzzles or following, navigating, or choosing paths.

Ergodic Literature: The Weirdest Book Genre

https://www.youtube.com/watch?v=tKX90LbnYd4

"House of Leaves" is another book from the same genre.

House of Leaves - A Place of Absence

https://www.youtube.com/watch?v=YJl7HpkotCE

Diving into House of Leaves Secrets and Connections | Video Essay

https://www.youtube.com/watch?v=du2R47kMuDE

The Book That Lies to You - House of Leaves Explained

https://www.youtube.com/watch?v=tCQJUUXnRIQ

I went into this rabbit hole few years ago.

zabzonk · 2026-02-16T00:31:07 1771201867

Pale Fire is brilliant - wonderfully written and very funny. The poem itself is pretty good too - one of my favourite bits:

How to locate in blackness, with a gasp,

Terra the Fair, an orbicle of jasp.

How to keep sane in spiral types of space.

Precautions to be taken in the case

Of freak reincarnation: what to do

On suddenly discovering that you

Are now a young and vulnerable toad

Plump in the middle of a busy road

SaberTail · 2026-02-05T16:34:19 1770309259

I was on a greenfield project late last year with a team that was very enthusiastic about coding agents. I would personally call it a failure, and the project is quietly being wound down after only a few months. It went in a few stages:

At first, it proceeded very quickly. Using agents, the team were able to generate a lot of code very fast, and so they were checking off requirements at an amazing pace. PRs were rubber stamped, and I found myself arguing with copy/pasted answers from an agent most of the time I tried to offer feedback.

As the components started to get more integrated, things started breaking. At first these were obvious things with easy fixes, like some code calling other code with wrong arguments, and the coding agents could handle those. But a lot of the code was written in the overly-defensive style agents were fond of, so there were a lot more subtle errors. Things like the agent adding code to substitute an invalid default value in instead of erroring out, far away from where that value was causing other errors.

At this point, the agents started making things strictly worse because they couldn't fit that much code in their context. Instead of actually fixing bugs, they'd catch any exceptions and substitute in more defaults. There was some manual work by some engineers to remove a lot of the defensive code, but they could not keep up with the agents. This is also about when the team discovered that most of the tests were effectively "assert true" because they mocked out so much.

We did ship the project, but it shipped in an incredibly buggy state, and also the performance was terrible. And, as I said, it's now being wound down. That's probably the right thing to do because it would be easier to restart from scratch than try to make sense of the mess we ended up with. Agents were used to write the documentation, and very little of it is comprehensible.

We did screw some things up. People were so enthusiastic about agents, and they produced so much code so fast, that code reviews were essentially non-existent. Instead of taking action on feedback in the reviews, a lot of the time there was some LLM-generated "won't do" response that sounded plausible enough that it could convince managers that the reviewers were slowing things down. We also didn't explicitly figure out things like how error-handling or logging should work ahead of time, and so what the agents did was all over the place depending on what was in their context.

Maybe the whole mess was a necessary learning as we figure out these new ways of working. Personally I'm still using the coding agents, but very selectively to "fill-in-the-blanks" on code where I know what it should look like, but don't need to write it all by hand myself.

abcde666777 · 2026-02-08T13:57:06 1770559026

I'd expect this kind of outcome to be common for anything complex, but there seem to be more claims of success stories than horror stories.

Various possibilities I suppose - I could just be overly skeptical, or failures might be kept on the down low, or perhaps most likely: many companies haven't actually reached the point where the hidden tech debt of using these things comes full circle.

Having been through it, what's your current impression of the success stories when you come across them?

SaberTail · 2026-02-08T15:37:14 1770565034

I'd speculate we had a few factors working against us that made us hit the "limit" sooner.

Several different engineering teams from different parts of the company had to come together for this, and the overall architecture was modular, so there was a lot of complexity before we had to start integrating. We have some company-wide standards and conventions, but they don't cover everything. To work on the code, you might need to know module A does something one way and module B does it in a different way because different teams were involved. That was implicit in how human engineers worked on it, and so it wasn't explicitly explained to the coding agents.

The project was in the life sciences space, and the quality of code in the training data has to be worse than something like a B2B SaaS app. A lot of code in the domain is written by scientists, not software engineers, and only needs to work long enough to publish the paper. So any code an LLM writes is going to look like that by default unless an engineer is paying attention.

I don't know that either of those would be insurmountable if the company were willing to burn more tokens, but I'd guess it's an order of magnitude more than we spent already.

There are politics as well. There have been other changes in the company, and it seems like the current leadership wants to free up resources to work on completely different things, so there's no will to throw more tokens at untangling the mess.

I don't disbelieve the success stories, but I think most of them are either at the level of following already successful patterns instead of doing much novel, or from companies with much bigger budgets for inference. If Anthropic burns a bunch of money to make a C compiler, they can make it back from increased investor hype, but most companies are not in that position.

SaberTail · 2025-12-06T05:54:48 1765000488

And then they'll start feeding in data like gaze tracking, and adjust the generated content in real time to personalize it to be maximally addictive for each viewer.

SaberTail · 2025-11-20T16:36:43 1763656603

It's most notes, and for EU and US notes (as well as some others), it's based on a certain pattern on the bills: https://en.wikipedia.org/wiki/EURion_constellation

SaberTail · 2025-10-16T12:46:46 1760618806

And meanwhile today you can get more power than the Cray-1 (or Cray-2) from a single chip a fraction of the size of that coin.

Very quickly:

  a dollar coin is about 550 mm^2 on a face
  the Cray-1 could do 160 MFLOPS
  an M1 chip has a die size of 120 mm^2
  an M1 chip can do over 1 TFLOPS

Mr_Minderbinder · 2025-10-16T23:00:16 1760655616

The “over 1 TFLOPS” claim for the M1 appears to be for single precision floats whereas FLOPS performance figures for supercomputers, including the one given for the CRAY-1, are almost always based on double precision (FP64) floats. The double precision FLOPS performance of the M1 would be lower, perhaps half of the single precision performance.

foobarian · 2025-10-16T13:40:49 1760622049

I had just gone down this rabbit hole for unrelated reasons (looking into yields). Nvidia's 5090 die is 750 mm^2, managing 419 TFLOPS on the FP16 benchmark.

SaberTail · 2025-09-30T03:46:29 1759203989

The last line about "simplifying approximations within the literature[...] applied outside of their intended context" makes me think the author has an issue with the way other theoreticians are using LIGO data in their analyses.