Hacker News | _pdp_'s comments

I mean this is basically the foundation of any cyberpunk novel. You don't need to read that far ... just look at the works of Gibson, Neal Stephenson, Philip K. Dick and Richard Morgan.

From Neuromancer and Snow Crash to Altered Carbon, the theme is that technology is not salvation but another axis of inequality.


Yes, we were warned. It turns out we are not good at heeding warnings, at all.

Yup

People will downvote you for talking about fiction; meanwhile, we're sleepwalking straight into the worst societies that fiction authors predicted.


Take some working code. Ask an LLM to fix bugs. Measure performance and test coverage. Feed the results back into the LLM. Repeat.

This has been the standard approach for more complex LLM deployments for a while now in our shop.

Using different models across iterations is also something I've found useful in my own experiments. It's like getting a fresh pair of eyes.
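The loop described above can be sketched in a few lines. Everything here is a stand-in (`call_llm` fakes a model that fixes the code, `run_tests` fakes a test harness); the point is only the shape of the feedback cycle:

```python
# Sketch of the fix-measure-feed-back loop. Both helpers are fakes:
# call_llm stands in for a real model API, run_tests for a real harness.

def run_tests(code: str) -> dict:
    # Pretend harness: the marker "bug" in the source means a failing suite.
    passed = "bug" not in code
    return {"passed": passed, "coverage": 1.0 if passed else 0.5}

def call_llm(code: str) -> str:
    # Pretend model: "fixes" the code by removing the bug marker.
    return code.replace("bug", "fix")

def repair_loop(code: str, max_iters: int = 5) -> str:
    """Run tests, feed the results back to the model, repeat."""
    for _ in range(max_iters):
        results = run_tests(code)
        if results["passed"]:
            break
        # In a real setup the next prompt would include the failures
        # and the coverage/performance measurements.
        code = call_llm(code)
    return code

print(repair_loop("bug in handler"))  # -> fix in handler
```

Swapping models between iterations, as suggested above, would just mean `call_llm` rotating through different backends on each pass.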


Can we modify this approach to get LLMs that are good at specific programming languages or frameworks? That seems to be where local LLMs could really shine.

Would love to have a small local model that only knows about rails and mvc web development

Alternatively, a modular model with multiple “experts” that I could mix and match for my specific stack

I don’t need the model to know all of the Internet plus 20 different human languages. I just want it to be really good with the stack of the project


It's just RL-everything.

Our company started migrating our tech stack from the USA to the EU. We are about 90% there, with a few small dependencies that could be resolved but that we have not yet tackled.

What are some of the biggest EU alternatives for US big tech?

When I Google this I find a lot of options, but I'm not sure which are actually mature tech companies vs. start-up hopefuls.


Could you summarize the easy and hard aspects? Have you had any unexpected benefits or downsides?

For me:

- SES was a big one. There was no affordable alternative at my (not big, not small) scale.

- I'm still weaning myself off Gmail. That dependency took decades to build, and it will take years for all ties to sever.


What is the difficulty in getting away from gmail?

I did it a few years ago and I simply signed up for Fastmail and had gmail forward all email there. It forwards to a specific e-mail address so I can see if there are still people/companies that use the old email address. The painful part was going through all my accounts to update the e-mail, but you can do it in stages if you follow the above.


Have you checked out ZeptoMail by Zoho? Not as low cost, but getting close. More basics built in.

Not EU.

> Zoho Corporation is an Indian multinational technology company that makes cloud-based office software.


Maybe this comment downthread helps for the email problem? https://news.ycombinator.com/item?id=47489711


> AI agents already read AGENTS.md (or CLAUDE.md, .cursorrules, etc.) as project instructions. This kernel uses that mechanism to teach the agent how to remember.

Dude, this is just prompts. It is as useful as asking Claude Code to write these files itself.
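For what it's worth, the files in question are just plain instructions the agent reads at the start of a session. A hypothetical AGENTS.md excerpt along these lines is all the "mechanism" amounts to (the file paths here are made up for illustration):

```markdown
## Memory
- At the start of each session, read notes/memory.md for prior context.
- After finishing a task, append a one-line summary of what changed
  to notes/memory.md.
```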


AI ultimately breaks the social contract.

Sure, people are not perfect, but there are established common values that we don't need to convey in a prompt.

With AI, despite its usefulness, you are never sure if it understands these values. That might be somewhat embedded in the training data, but we all know these properties are much more swayable and unpredictable than those of a human.

It was never about the LLM to begin with.

If Linus Torvalds makes a contribution to the Linux kernel without actually writing the code himself but assigns it to a coding assistant, for better or worse I will 100% accept it at face value. This is because I trust his judgment (and I accept that he is as fallible as any other human). But if an unknown contributor does the same, even if the code produced is ultimately high quality, I would think twice before merging.

I mean, we already see this in various GitHub projects. There are open-source solutions that whitelist known contributors and it appears that GitHub might be allowing you to control this too.

https://github.com/orgs/community/discussions/185387


Prioritizing or deferring to existing contributors happens in pretty much every human endeavor.

As you point out, this of course predates the age of LLMs; in many ways it's basic human tribal behavior.

This does have its own set of costs and limitations however. Judgement is hard to measure. Humans create sorting bonds that may optimize for prestige or personal ties over strict qualifications or ability. The tribe is useful, but it can also be ugly. Perhaps in a not too distant future, in some domains or projects these sorts of instincts will be rendered obsolete by projects willing to accept any contribution that satisfies enough constraints, thereby trading human judgement for the desired mix of velocity and safety. Perhaps as the agents themselves improve this tension becomes less an act of external constraint but an internal guide. And what would this be, if not a simulation of judgement itself?

You could also do it in stages, ie have a delegated agent promote people to some purgatory where there is at least some hope of human intervention to attain the same rights and privileges as pre-existing contributors, that is if said agent deems your attempt worthy enough. Or maybe to fight spam an earnest contributor will have to fork over some digital currency, to essentially pay the cost of requesting admission.

All of these scenarios are rather familiar in terms of the history of human social arrangements.

That is just to say, there is no destruction of the social contract here. Only another incremental evolution.


An agent is still attached to an accountable human. If it is not, ignore it.

How do you figure out which is the case, at scale?

You don't.

The problem is that it acts as an accountability sink even when it is attached.

I've had multiple coworkers over the past few months tell me obvious, verifiable untruths. Six months ago, I would have had a clear term for this: they lied to me. They told me something that wasn't true, that they could not possibly have thought was true, and they did it to manipulate me into doing what they wanted. I would have demanded, and their manager would have agreed, that they be given a severe talking-to.

But now I can't call it a lie, both in the sense that I've been instructed not to and in the sense that it subjectively wasn't. They honestly represented what the agent told them was the truth, and they honestly thought that asking an agent to do some exploration was the best way to give me accurate information.

What's the replacement norm that will prevent people from "flooding the zone" with false AI-generated claims shaped to get people to do what they want? Even if AI detection tools worked, which I emphasize that they do not, they wouldn't have stopped the incidents that involved human-generated summaries of false AI information.


I forgot to mention why I brought up the idea of who is making the contribution rather than how (i.e., through an LLM).

Right now, the biggest issue open-source maintainers are facing is an ever-increasing supply of PRs. Before coding assistants, many of those PRs never got pushed, not because they were never written (although obviously there were fewer of them), but because contributors were conscious of how their contributions might be perceived. In many cases, the changes never saw the light of day outside of the fork.

LLMs don't second-guess whether a change is worth submitting, and they certainly don't feel the social pressure of how their contribution might be received. The filter is completely absent.

So I don't think the question is whether machine-generated code is low quality at all, because that is hard to judge, and frankly coding assistants can certainly produce high-quality code (with guidance). The question is who made the contribution. With rising volumes, we will see an increasing amount of rejections.

By the way, we do this too internally. We have a script that deletes LLM-generated PRs automatically after some time. It is just easier and more cost-effective than reviewing the contribution. Also, PRs get rejected for the smallest of reasons.

If it doesn't pass the smell test moments after the link is opened, it gets deleted.
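A minimal sketch of the staleness filter such a script might apply. The label name, cutoff, and record shape are all assumptions here, and actually closing or deleting the PRs would go through the GitHub API:

```python
from datetime import datetime, timedelta, timezone

def stale_llm_prs(prs, max_age_days=14, label="llm-generated"):
    """Return numbers of PRs carrying the LLM label that sat past the cutoff."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [pr["number"] for pr in prs
            if label in pr["labels"] and pr["created_at"] < cutoff]

# Hypothetical PR records, as they might look after fetching from the API.
now = datetime.now(timezone.utc)
prs = [
    {"number": 1, "labels": ["llm-generated"], "created_at": now - timedelta(days=30)},
    {"number": 2, "labels": ["llm-generated"], "created_at": now - timedelta(days=2)},
    {"number": 3, "labels": ["bugfix"], "created_at": now - timedelta(days=30)},
]
print(stale_llm_prs(prs))  # -> [1]
```

Only PR 1 is both labelled and past the cutoff; PR 2 is too fresh and PR 3 is human-labelled, so neither gets touched.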


> LLMs don't second-guess whether a change is worth submitting, and they certainly don't feel the social pressure of how their contribution might be received. The filter is completely absent.

Of course you could have an agent on your side do this, so I take you to mean that an LLM that submits a PR and is not instructed to make such a reflection will not intrinsically make it as a human would, that is, as a necessary side effect of submitting in the first place (though one might be surprised).

It would be curious to have an API that perhaps attempts to validate some attestation about how the submitting LLM's contribution was derived, ie force that reflection at submission time with some reasonable guarantees of veracity even if it had yet to be considered. Perhaps some future API can enforce such a contract among the various LLMs.


> AI ultimately breaks the social contract

Business schools teach that breaking the social contract is a disruption opportunity for growth, not a negative.

The Hacker in Hacker News refers to "growth hacking" now, not hacking code


It depends who you ask.

You cannot say that breaking the social contract (the fabric of society, if you will) is generally a good thing, although I am sure some will find opportunities for growth.

After all, the phoenix must burn to emerge, but let's not romanticise the fire.


> You cannot say that breaking the social contract (the fabric of society, if you will) is generally a good thing

I am not saying it's a good thing, just that it's a common attitude here

I suppose it didn't come through in my original post, but I was trying to be critical


Generational churn breaks social contract.

You all using Latin and believing in the old Greek gods to honor the dead?

Muricans still owning slaves from Africa?

All ways in which old social contracts were broken at one point.

We are not VHS cassettes with an obligation to play out a fuzzy memory of history.


There are a lot of positive comments in this thread, so you are doing something right.

You asked for feedback though.

There are chat apps that basically incorporate all the features of this editor, so I am not really sure who this is for.

If this is for writing, in order to make it amazing I would personally focus on using models that are either built for that or fine-tuned specifically for writing.

Otherwise, what is the point?

Notion can do the same, perhaps more.


I welcome any specific critical feedback. What chat apps are you thinking of?

Notion can't quite format and export print-ready documents the same way. This is more akin to Google Docs than Notion. Revise can do proper word-processor-esque formatting, like page numbers, page layout, and page breaks. The agent can format a paper in APA 7 with a single prompt, for example.

I already have a growing user base and customer base. People use Revise for all sorts of things, from writing stories for pleasure, to processing reports for work.

Notion is an amazing tool, but not focused on the same type of documents that Revise is focused on.


Remember Deep Thought, the greatest computer ever built that spent 7.5 million years computing the Answer to the Ultimate Question of Life, the Universe, and Everything? The answer was 42, perfectly correct, utterly useless because nobody understood the question they were asking.

That's what happens when you hand everything to a machine without understanding the problem yourself.

AI can give you correct answers all day long, but if you don't understand what you're building, you'll end up just like the people of Magrathea, staring at 42 and wondering what to do with it.

True understanding is indistinguishable from doing.


The question to which 42 was the answer was, of course, "How many roads must a man walk down, before you call him a man?"

Well, yes, but AI can also give you wildly incorrect answers with alarming frequency.

I know, I know, "skill issue"/"you're holding it wrong". And maybe that's vacuously true, in that it's so hard to guess what will produce correct output, because LLMs are not an abstraction layer in the way that we're used to. Prior abstraction layers related input to output via a transparent homomorphism: the output produced for an input was knowable and relatively straightforward (even with exotic optimization flags). LLMs are not like that. Your input disappears into a maze of twisty little matmuls, all alike (a different maze per run, for the same input!) and you can't relate what comes out the other end in terms of the input except in terms of "vibes". So to get a particular output, you just have to guess how to prompt it, and it is not very helpful if you guess wrong except in providing a wrong (often very subtly so) response!

Back in the day, I had a very primitive, rinky-dink computer—a VIC-20. The VIC-20 came with one of the best "intro to programming" guides a kid could ask for. Regarding error messages it said something like this: "If your VIC-20 tells you something like ?SYNTAX ERROR, don't worry. You haven't broken it. Your VIC-20 is trying to help you correct your mistakes." 8-bit 6502 at 1 MHz. 5 KiB of RAM. And still more helpful than a frontier model when it comes to getting your shit right.


You are correct.

One minor note. The skill issue isn't about failing to prompt it correctly, but rather failing to understand what it actually does.

There's an entire crop of professionals who believe we can Harry Potter our way out of any situation with the right magic words.


> AI can give you correct answers all day long, but if you don't understand what you're building, you'll end up just like the people of Magrathea, staring at 42 and wondering what to do with it.

Yeah, it's so true. LLMs tell you what you want to hear based on the input you give them. If you know the domain, it's really powerful because it can output what you want at an incredibly fast rate. If you don't, you're basically rolling the dice on what it gives you. The idea that programmers are now useless because it can output something is hilariously wrong. The quality of the input has a direct impact on the quality of the output, meaning people with domain expertise (i.e. software developers) are a crucial component of its utility. In other words, vibe coding is useless unless the viber has the correct vibes. Or in other other words, being a good programmer is fundamental to the technology producing useful results. As such, we aren't going to see a total replacement of software developers, but rather good software developers increasing their productive output.


It is, but I thought security wasn't the point.

The point was to give it unlimited access to your entire digital life, and while I'd never use it that way myself, that's what many users are signing up for, for better or worse.

Obviously, OpenClaw doesn't advertise it like that, but that's what it is.

Needless to say, OpenClaw wasn't even the first to do this. There were already many products that let you connect an AI agent to Telegram, which you could then link to all your other accounts. We built software like that too.

OpenClaw just took the idea and brought it to the masses and that's the problem.


I don't know, I don't see the benefit in giving it that much freedom. I've given my agent very specific access and it does basically everything I want. I don't think I've ever thought "this needs more access, but I don't want to give it", and it's already very isolated. It runs in a bunch of containers that don't have access to any secrets or the host system.

I don't see what the extra benefit is that OpenClaw gets from being able to access everything.


There are secure alternatives but they are not making millions of dollars.

Which secure alternatives? I've not seen any yet.

Connecting Telegram to an agent with a bunch of skills and access to an isolated compute environment is largely a solved problem. I don't want to advertise here, but there are plenty of solutions to spin this up, including what we have built.

That it isn't secure is the issue: the more things you have it hooked up to, the more havoc it can cause. The environment being locked down doesn't help when you're giving it access to potentially destructive actions. And once you remove those actions, you've neutered it.

The OpenClaw security model is the equivalent of running as root, i.e. full access. If that is insecure, the inverse of it is running with no access by default and adding the things that you need.

This is pretty much standard security 101.

We don't need to reinvent the wheel.
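Default-deny in practice can be sketched with standard container flags; the image name and mount path here are hypothetical, and anything the agent genuinely needs gets added back explicitly:

```sh
# Start the agent sandbox with essentially nothing:
# no network, no Linux capabilities, a read-only root filesystem,
# and a single explicitly granted workspace mount.
docker run --rm \
  --network none \
  --cap-drop ALL \
  --read-only \
  --security-opt no-new-privileges \
  -v "$PWD/workspace:/workspace" \
  agent-image:latest
```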


The unsolved security challenge is how to give one of these agents access to private data while also enabling other features that could potentially leak data to an attacker (see the lethal trifecta.)

That's the product people want - they want to use a Claw with the ability to execute arbitrary code and also give it access to their private data.


But if it doesn’t have access to the network, then it’s just not very useful. And if it does, then it’s just a prompt injection away from exfiltrating your data, or doing something you didn’t expect (eg deleting all your emails).
