Hacker News | willydouhard's comments

We see developers build their own setup over SSH with success, so in that sense I agree with you.

However, once you want to trigger tasks from Slack, Linear, or GitHub issues, or onboard teammates who aren't comfortable wiring up LXC + tmux + agent forwarding, a managed layer is needed.

I think we're at a moment where builders with great setups like yours and products like ours are feeding each other good ideas. The patterns you figure out in your zshrc inform what we productize, and the workflows we ship give you new things to try. It's a virtuous circle. Everyone should use the right-sized solution for their situation.


Definitely the same category of product. The main difference from Cursor is that we reuse the raw harnesses from the AI labs (Claude Code, Codex), whereas Cursor rebuilds its own harness. We believe nothing will beat the "natural" harness of each model because of RL.

You are also free to swap/combine these harnesses as you please, which is something Anthropic can't do. For instance, Claude Code implements and Codex reviews.


By default each plan has a limit (https://twill.ai/pricing). Then you can manually set an overage limit if you want to.

The Ralph loop mode also has the concept of a budget per task.


The main difference is that you can pick/combine coding agent CLIs (Claude Code, Codex, OpenCode). There is no vendor lock-in.


That's literally just GitHub agents?


Yes, broadly. The main structural difference is that we’re agent-agnostic, so we can combine lab-native CLIs in one workflow. GitHub will likely struggle there because they have direct partnerships with Anthropic and OpenAI.

On the features themselves, we have a better UX across integrations, and more advanced features like video recording.


This seems like a weak argument. GitHub is already agent (not just model) agnostic, they have Copilot and Claude Code. I just don't see how this is a business, sorry.


This definitely feels like the end state. We need to improve models and agent/human UX, and make the transition from cloud work to local work seamless, to fully get there.


Cloud to local seamless transition is exactly where the real value lies. Local setups with zero API cost and full privacy make the biggest difference


Yes, or when you get good feedback or an idea while talking to someone, being able to spawn tasks from your phone makes everything much faster.


We still need to get an OAuth token to connect to GitHub. We started with the GitHub MCP but migrated to giving the gh CLI to the agent directly.

One thing we learned is that, most of the time, CLI > MCP.
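
A minimal sketch of what "giving the gh CLI to the agent" can look like, assuming the token is passed through the `GH_TOKEN` environment variable that gh reads. The helper names `gh_env` and `run_gh` are hypothetical and only for illustration; this is not necessarily how our setup is wired.

```python
import os
import subprocess
from typing import Dict, List, Optional


def gh_env(token: str, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with GH_TOKEN set for the gh CLI."""
    env = dict(os.environ if base_env is None else base_env)
    env["GH_TOKEN"] = token
    return env


def run_gh(args: List[str], token: str) -> str:
    """Run a gh subcommand with the given token and return its stdout.

    Raises subprocess.CalledProcessError if gh exits with a non-zero status.
    """
    completed = subprocess.run(
        ["gh", *args],
        env=gh_env(token),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

For example, `run_gh(["issue", "view", "123", "--json", "title,body"], token)` would fetch an issue as JSON (the issue number here is made up). The agent gets the whole CLI surface through one tool instead of one MCP tool per operation.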


There is room for plan adaptation but the agent has to justify and highlight it in the PR.

Defining the plan/acceptance criteria for long-running tasks is the hard part.

We recently added a Ralph loop mode in that spirit. Implementation won't start until the human and agent align on verifiable criteria, and a different agent judges whether the criteria are met at the end of each run.
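
A rough sketch of that kind of loop, under stated assumptions: `implement` and `judge` are hypothetical callables standing in for two different agents, and `budget` is a per-task spending cap in arbitrary units. It illustrates the idea only and is not the actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunResult:
    output: str
    cost: float  # cost of this run, in arbitrary budget units


def ralph_loop(
    criteria: List[str],
    implement: Callable[[List[str], str], RunResult],
    judge: Callable[[str, str], bool],
    budget: float,
) -> dict:
    """Repeat runs until a separate judge accepts every criterion
    or the per-task budget is used up."""
    if not criteria:
        # Work does not start without agreed, verifiable criteria.
        raise ValueError("agree on verifiable criteria before starting")

    spent = 0.0
    runs = 0
    feedback = ""
    unmet = list(criteria)
    while spent < budget:
        result = implement(criteria, feedback)
        spent += result.cost
        runs += 1
        # A different agent (the judge) checks each criterion after the run.
        unmet = [c for c in criteria if not judge(c, result.output)]
        if not unmet:
            break
        feedback = "Unmet criteria: " + "; ".join(unmet)
    return {"done": not unmet, "runs": runs, "spent": spent, "unmet": unmet}
```

The budget is checked before each run, so the last run can overshoot the cap slightly; a stricter variant would estimate the cost of a run up front.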

Overall, I think this problem is not yet completely solved, and improvements to both the UX and model judgment are needed.


This is very convenient but has limitations. GitHub Actions are not built to resume state (conversations, in our case) or to handle multiplayer experiences.

However, reusing the GitHub workflows out of the box feels really nice.


Thank you for the feedback, we are working on it!

