One of my experiments was to have Claude write a VM and then generate a verification harness for it (using a DSL) to ensure it was correct, the theory being that the same bug would have to exist in the test suite, the static verification, and the VM for it to sneak through. That turned up a few bugs in the verification library and some integer overflows in the VM; then it became too much for my poor little laptop to run without cutting some important corners.
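The cross-checking idea can be sketched as differential testing: generated random programs run through both the VM and an independently written reference model, so the same bug has to exist on both sides to slip past. Everything below (the toy opcodes, the 32-bit wrap) is illustrative, not the actual project:

```python
import random

MASK = 0xFFFFFFFF  # toy VM uses wrapping 32-bit arithmetic

def run_vm(prog):
    """Toy stack VM: the artifact under test. Masks at every step."""
    stack = []
    for op, arg in prog:
        if op == "push":
            stack.append(arg & MASK)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(((a + b) if op == "add" else (a * b)) & MASK)
    return stack

def run_reference(prog):
    """Independent model: unbounded ints, masked only at the end.
    Modular arithmetic folds the same way, so results must agree."""
    stack = []
    for op, arg in prog:
        if op == "push":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) if op == "add" else (a * b))
    return [v & MASK for v in stack]

def random_program(rng, length=20):
    """Generate a stack-safe random program; values deliberately
    exceed 32 bits so overflow handling gets exercised."""
    prog, depth = [], 0
    for _ in range(length):
        if depth >= 2 and rng.random() < 0.5:
            prog.append((rng.choice(["add", "mul"]), None))
            depth -= 1
        else:
            prog.append(("push", rng.randrange(0, 2**34)))
            depth += 1
    return prog

rng = random.Random(0)
for _ in range(1000):
    prog = random_program(rng)
    assert run_vm(prog) == run_reference(prog)
```

An integer-overflow bug in the VM only survives this if the reference model (and the generator's value ranges) share the exact same mistake.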
It's not an abstract thing they can't do, you just have to tell them to.
I find it an interesting experiment to probe the limits of what they can do.
Like, I've had it build a full APL interpreter, half an optimizer, and the start of a copy-and-patch JIT compiler, and it completely fails at "read the spec and make sure the test suite ensures compliance". Plus some additional artifacts which are genuinely useful on their own: I now have an Automated Yak Shaver™, which is where most of my projects ended up dying, as the yaks are a fun bunch to play with.
My project over the last week was to get the robots to train a neural net to learn the "303 thing"; it hasn't gone well at all.
The first attempt sounded like it was being played on a blown-out speaker after it got run over, and the second sounded like it was going through a $20 pawn-shop guitar pedal that got left in the rain, which led to the 'oh, you wanted the neural net to learn the 303's filter section? My bad, I just made some random stuff up as an approximation...'
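For anyone curious what "learn the 303's filter section" even targets: the real thing is an analog four-pole diode-ladder lowpass, and a training set would come from running audio through some reference model of it. As a crude, hedged stand-in (an RBJ two-pole biquad, nothing like the actual diode ladder), the kind of reference you could generate (input, cutoff) → output pairs from might look like:

```python
import math

def resonant_lowpass(x, cutoff_hz, q, fs=44100.0):
    """Crude stand-in for a 303-style resonant lowpass: an RBJ
    biquad (two poles; the real 303 is a four-pole diode ladder)."""
    w0 = 2 * math.pi * cutoff_hz / fs
    alpha = math.sin(w0) / (2 * q)
    cw = math.cos(w0)
    b0, b1, b2 = (1 - cw) / 2, 1 - cw, (1 - cw) / 2
    a0, a1, a2 = 1 + alpha, -2 * cw, 1 - alpha
    b0, b1, b2, a1, a2 = b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        y.append(yn)
        x1, x2, y1, y2 = xn, x1, yn, y1
    return y

# A sawtooth (the 303's oscillator shape) through the filter gives
# one (input, cutoff, q) -> output training example.
saw = [2 * ((n * 110 / 44100) % 1.0) - 1 for n in range(4096)]
target = resonant_lowpass(saw, cutoff_hz=1000.0, q=4.0)
```

The hard part the anecdote alludes to is that the filter is stateful and nonlinear, so a net has to learn a recurrence, not a pointwise mapping.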
The worst part is there are still compute credits left over from the initial ten bucks, so we just have to try again...
Yeah, back during Trump's first term I was hoping Congress would rein in executive power a bunch, since he is prone to do stuff like this; it didn't turn out that way, unfortunately...
Now the main constraint on executive power seems to be due process and habeas corpus.
The problem I run into is its propensity to cheat, so you can't trust the code it produces.
For example, I have this project where the idea is to use code verification to ensure the code is correct. The stated goal of the project is to produce verified software, and the daffy robot still can't seem to understand that the verification checks are the critical piece, so... it cheats on them so they pass. I had the newest Claude Code (4.6?) look over the tests on the day it was released, and the issues it found were really, really bad.
Now the newest plan is to produce a tool which generates the tests from a DSL, so they can't be made to pass and/or match buggy code instead of the clearly defined specification. Oh, I guess I didn't mention there's an actual spec for what we're trying to do, and it's very clear; in fact, it should be relatively trivial for some super-human coding machine to ensure the tests match it.
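The generate-tests-from-a-DSL idea might look something like this (made-up mini-syntax, hypothetical names throughout): test cases exist only as a mechanical function of the spec text, so the only way to "fix" a failing test is to edit the spec itself, not the test:

```python
# Hypothetical mini-DSL: each line reads "fn arg arg -> expected".
SPEC = """
add 2 3 -> 5
add -1 1 -> 0
mul 4 5 -> 20
"""

def parse_spec(text):
    """Turn DSL lines into (name, args, expected) triples."""
    cases = []
    for line in text.strip().splitlines():
        lhs, expected = line.split("->")
        name, *args = lhs.split()
        cases.append((name, [int(a) for a in args], int(expected)))
    return cases

def run_generated_tests(impls, spec_text):
    """Generate and run one check per spec line; returns failures.
    Tests are derived from the spec, so they can't drift toward
    whatever the buggy code happens to do."""
    failures = []
    for name, args, expected in parse_spec(spec_text):
        got = impls[name](*args)
        if got != expected:
            failures.append((name, args, expected, got))
    return failures

impls = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
assert run_generated_tests(impls, SPEC) == []

buggy = {"add": lambda a, b: a + b + 1, "mul": lambda a, b: a * b}
assert run_generated_tests(buggy, SPEC) != []
```

The model can still write a wrong implementation, but it can no longer make the suite green by editing assertions.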
All I hear about is how this is the 'shape of things to come' with regard to the AI bubble, while nobody seems to care that France just told all its gov't agencies to stop using their stuff.
Losing out on EU governmental contracts seems to me to be a pretty big deal, and the France thing may just be the first move in that direction.
They don't magically gain more privacy protection in public than your average person just because, as government employees, they clock out after a hard day of work.
They are constantly and consistently reminded that people have the right to record in public, and they choose to ignore that since there are no consequences when they violate the law. Or that people have a right to peacefully assemble. Or freedom of the press...
I agree they don't gain more privacy protection in public than the average person. I also agree they shouldn't gain more privacy protection in public than the average public employee, either!
I'm merely assuming that the license plates being listed are ones they use for their official work, since the rest of their info being listed is tied to what's available for any other public worker.
>> If someone ends up sniping a famous person, we go back in time and figure out who they are
Yeah, most civilians don't understand operational security at the functional level.
Though... most people doing these things probably want to be caught, because they just aren't quite right in the head and want people to tell them that what they did was 'justified' for whatever reason.
I'm going to go out on a limb and say because Rust didn't exist 30 years ago?
Anyhoo... seems interesting. I've been trying to convince Claude to produce a verified JavaCard VM implementation, just for the hell of it, and this probably has a bunch of information to help with that.