We see developers build their own setup over SSH with success, so in that sense I agree with you.
However, once you want to trigger tasks from Slack, Linear, or GitHub issues or onboard teammates who aren't comfortable wiring up LXC + tmux + agent forwarding, a managed layer is needed.
I think we're at a moment where builders with great setups like yours and products like ours are feeding each other good ideas. The patterns you figure out in your zshrc inform what we productize, and the workflows we ship give you new things to try. It's a virtuous circle. Everyone should use the right-sized solution for their situation.
Definitely the same category of product. The main difference from Cursor is that we reuse the raw harnesses from the AI labs (Claude Code, Codex), whereas Cursor rebuilds its own. We believe nothing will beat each model's "natural" harness, because the models are trained with RL against it.
You are also free to swap or combine these harnesses as you please, which is something Anthropic can't do. For instance, Claude Code implements and Codex reviews.
Yes, broadly. The main structural difference is that we’re agent-agnostic, so we can combine lab-native CLIs in one workflow. GitHub will likely struggle there because they have direct partnerships with Anthropic and OpenAI.
On the features themselves, we have a better UX across integrations, and more advanced features like video recording.
This seems like a weak argument. GitHub is already agent (not just model) agnostic, they have Copilot and Claude Code. I just don't see how this is a business, sorry.
This definitely feels like the end state. To fully get there, we need to improve the models and the agent/human UX, and make the transition from cloud work to local work seamless.
There is room for plan adaptation but the agent has to justify and highlight it in the PR.
Defining the plan and acceptance criteria for long-running tasks is the hard part.
We recently added a Ralph loop mode in that spirit. The implementation won't start until the human and the agent align on verifiable criteria, and a different agent judges whether the criteria are met at the end of each run.
Overall, I think this problem is not yet completely solved, and improvements to both the UX and model judgement are needed.
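To make the loop concrete, here is a rough sketch of the control flow described above. It is not our actual implementation: `Criterion`, `ralph_loop`, `independent_judge`, and the `max_runs` cap are all hypothetical names and assumptions, and the implementing and judging agents are stand-in callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Criterion:
    """A verifiable acceptance criterion (hypothetical shape)."""
    description: str
    check: Callable[[str], bool]  # returns True when the work satisfies it


def independent_judge(work: str, criteria: List[Criterion]) -> List[str]:
    """Stand-in for the separate judging agent: lists the unmet criteria."""
    return [c.description for c in criteria if not c.check(work)]


def ralph_loop(
    criteria: List[Criterion],
    implement: Callable[[str], str],
    judge: Callable[[str, List[Criterion]], List[str]],
    approved_by_human: bool,
    max_runs: int = 5,  # assumed safety cap, not from the original comment
) -> Tuple[Optional[int], str]:
    # Implementation must not start before human and agent agree on criteria.
    if not approved_by_human:
        raise ValueError("criteria must be agreed on before implementation starts")
    work = ""
    for run in range(1, max_runs + 1):
        work = implement(work)          # implementing agent does one run
        unmet = judge(work, criteria)   # a different agent judges the result
        if not unmet:
            return run, work            # all criteria met: stop looping
    return None, work                   # cap reached without meeting criteria
```

Passing the judge in as a separate callable keeps the implementing agent from grading its own work, which is the point of having a second agent decide when the loop ends.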
This is very convenient but has limitations. GitHub Actions is not built to resume state (conversations, in our case) or to handle multiplayer experiences.
That said, reusing GitHub workflows out of the box feels really nice.