Show HN: Bloop – Answer questions about your code with an LLM agent (github.com/bloopai)
118 points by louiskw on June 10, 2023 | 57 comments
Hi HN! We launched bloop 10 weeks ago (https://news.ycombinator.com/item?id=35236275) and received a huge amount of feedback (both positive + constructive). We've undertaken a rewrite of the core search framework, which now acts as an LLM agent, significantly improving the number of queries that can be successfully answered.

There's a bunch of hype surrounding LLM agents, but we're positive this is one of the first implementations of an agent that can deliver immediate value for engineers working on existing projects, especially larger ones. We'll do a full write up of how the agent works and the tools it can use soon, but we wanted to share our progress, now that we've got a stable release.

bloop is a developer assistant that uses GPT-4 to answer questions about your codebase. The agent searches both your local and remote repositories with natural language, regex and filtered queries.
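
To make "regex and filtered queries" concrete, here is a minimal Python sketch of a regex search restricted by file extension. The `search` function, its `lang` filter and the sample files are hypothetical illustrations, not bloop's actual query engine or syntax.

```python
import re

def search(files, pattern, lang=None):
    """Return (path, line_number, line) for every regex match.

    files: dict mapping a path to its text (a stand-in for a real index).
    lang:  optional file-extension filter, e.g. "rs" (hypothetical filter).
    """
    regex = re.compile(pattern)
    hits = []
    for path, text in files.items():
        if lang and not path.endswith("." + lang):
            continue  # filtered query: skip files in other languages
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line.strip()))
    return hits

# Hypothetical two-file "repository".
files = {
    "src/main.rs": "fn main() {\n    run_server();\n}",
    "web/app.ts": "function runServer() {}\n",
}
hits = search(files, r"run_?server", lang="rs")
print(hits)
```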

Some of the ways engineers use bloop to improve their efficiency when working on large codebases:

- Summarise how large files work and how multiple files work together

- Understand how to use open source libraries when documentation is lacking

- Identify the origin of errors

- Ask questions about English-language codebases in other languages

- Reduce code duplication by checking for existing functionality

- Write new code, taking into account existing codebase context (eg: "write a dockerfile for this project")

bloop runs as a free desktop app on Mac, Windows and Linux: https://github.com/bloopAI/bloop/releases/latest. On desktop, your code is indexed with a MiniLM embedding model and stored locally, meaning at index time your codebase stays private. 'Private' here means that no code is shared with us or OpenAI at index time, and when a search is made only relevant code snippets are shared to generate the response. (This is more or less the same data usage as Copilot).
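
As a rough illustration of the split between local indexing and query-time sharing, here is a minimal Python sketch. It swaps in a toy bag-of-words vector for the MiniLM embedding model so that it runs without any dependencies; the function names and sample snippets are assumptions, not bloop's code.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def build_index(snippets):
    """Index time: vectors are computed and kept on the local machine."""
    return {name: embed(code) for name, code in snippets.items()}

def top_snippets(index, query, k=1):
    """Query time: only the k best-matching snippets would be sent onwards."""
    q = embed(query)
    ranked = sorted(index, key=lambda name: cosine(index[name], q), reverse=True)
    return ranked[:k]

# Hypothetical snippets standing in for an indexed repository.
snippets = {
    "auth.py": "def login(user, password): return check_password(user, password)",
    "db.py": "def connect(url): return open_connection(url)",
}
index = build_index(snippets)
selected = top_snippets(index, "where is the password checked at login", k=1)
print(selected)
```

In this sketch only the contents of `selected` would leave the machine when a question is asked; the rest of the index stays local.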

We also have a paid cloud offering for teams ($45 per user per month). Members of the same organisation can search a shared index hosted by us and will get access to enterprise only features down the line (currently there's no feature gap between desktop and cloud).



Great product! I gave it a try and was really impressed!

One thing: could your team add an explanation somewhere on the home page to help users understand how their private code is handled (locally and when a chat request happens)?

The only place I could find that information is in the HN launch thread. It would be nice if it were available on the home page!


Yes, have been meaning to do this! We’ve been super focussed on the app itself often at the expense of the marketing site.

Glad to hear you found it helpful, I’m around if you have any issues.


I'm not currently working for a larger org... but I could certainly say with my last employer having this explicitly documented up front would be crucial for adoption.


This looks super interesting! You mentioned that you'll be sharing some more details on the approach. I'm looking forward to learning more about how you're using the user's query to select relevant code to share with GPT-4.

I have been working on a related problem for my open source GPT coding tool. I am more focused on creating a GPT chat experience that can act as a "junior developer" to write and edit code in your git repo. GPT is great at writing fresh, new self-contained code. I have been trying to improve its ability to make changes to a larger, complex repo.

I wrote up some notes on my current approach and some ideas for future work. I'll be curious to see how it compares to your approach, which seems more focused on search and code analysis.

https://aider.chat/docs/ctags.html


Very cool. I imagine your approach using files/folders + identifiers would work well for small repos, where everything can fit within an 8k prompt. An early prototype of our agent had something similar but with just files/folders no identifiers.

Our working thesis atm is the only approach to context that scales to support massive repos is embedding or text based retrieval.

Code search is going to need to be solved before code editing gets solved. You can't make changes if you can't find where the changes need to be made in a repo. Unless you're willing to make the UX concession that users have to manually select which files need to be edited.


Looks interesting. A couple of simple questions,

1. For people with only access to gpt3.5 API, does this still work?

2. Also does it work for repositories not hosted in github (ex: gitlab, bitbucket)?


1. The agent loop runs in GPT4, so the prebuilt binaries come with access to our own OpenAI key (via our hosted proxy)

2. It works with any git repos, you just have to clone them on your local machine first


This looks really interesting. I wonder if we can get to the point of having this all be local to your machine. The idea of using embeddings to search for relevant code snippets is maybe obvious to some, but as someone just getting into this stuff, it blew my mind!

I find that the hardest problems to find are interfaces between projects like a bad frontend call into a backend or backend to backend. I wonder if this could index separate projects & draw links between them


Semantic relationships between backend/frontend or microservices are super interesting.

We’re not far off. For example, if you index bloop itself with bloop and ask “What message does the backend send to indicate the frontend should close the eventsource?”, bloop will return a decent answer that takes into account the relevant frontend and backend code.

This is an active area of improvement.


What have been some of your learnings for getting agents to work?


Generate as few tokens as possible. GPT4 runs a few times to generate a single answer, and latency quickly becomes the biggest UX issue.

We abandoned most of the common thinking around chain of thought reasoning, finding it didn’t help accuracy much whilst increasing response times significantly.

Full write up to follow in next week or so.
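
Until that write-up exists, the general shape of such a loop can be sketched as follows. This is a generic Python illustration with a stubbed model, not bloop's agent: `run_agent`, `fake_model`, the `search` tool and the file name in its output are all hypothetical. The model replies with a short (action, argument) pair on each turn rather than long reasoning text, which keeps the number of generated tokens small.

```python
def run_agent(question, model, tools, max_steps=4):
    """Call the model repeatedly; each turn it either picks a tool or answers."""
    observations = []
    for _ in range(max_steps):
        action, arg = model(question, observations)
        if action == "answer":
            return arg
        # Run the chosen tool and feed its result into the next model call.
        observations.append(tools[action](arg))
    return "no answer within step budget"

# Hypothetical stand-ins so the sketch runs without any API access.
def fake_model(question, observations):
    if not observations:
        return ("search", "eventsource")
    return ("answer", f"Found in: {observations[-1]}")

tools = {"search": lambda query: f"frontend/stream.ts mentions {query}"}

result = run_agent("Where is the eventsource closed?", fake_model, tools)
print(result)
```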


Does this mean your queries are all one-shot instead of utilizing techniques like LangChain?


Exactly, you can see the prompt in this file [0]. I'm not sure how LangChain arrived at their default agent prompt, but you'll almost certainly want to write your own for performance reasons if you put something into production.

[0] https://github.com/BloopAI/bloop/blob/main/server/bleep/src/...


This is great that you got gpt-4 to explore the codebase using an agent approach. I tried this previously with gpt-3.5-turbo and have been meaning to revisit it since I got gpt-4 access.

I shared some notes on HN awhile back on a variety of experiments I did with gpt-3.5-turbo.

https://news.ycombinator.com/item?id=35441666


Does Bloop play in the same space as GitHub Copilot Chat? https://code.visualstudio.com/docs/editor/artificial-intelli...


I haven’t tried Copilot Chat but I imagine the key difference is the context. bloop’s tuned to answer questions from anywhere in a repo, copilot chat uses the context of what you’re looking at in-IDE.


> bloop’s tuned to answer questions from anywhere in a repo…

GitHub Copilot Chat purportedly (I'm also waiting for access) works across files in a workspace, which typically map to a repo.


In my experience it can only take into account files you have open.

So if you need it to call a function in another file that hasn’t been called in the current file, you’d have to go open the other file in a new tab then go back to the original and finally get the correct completion


Can it work without being granted access to the user's github? Getting authorization to private repos should not be necessary to work with random public repos on github or local code only.


GitHub oauth doesn't have a way to limit scope to specific repos, but the token is stored local to your device and the app's logic is that only repos you explicitly select are synced.

It's also a condition of many LLM providers that application end users are authenticated to prevent abuse, so GitHub auth helps with this.


When you say "your code", could I run this on anyone's GitHub project? What if I want to ask questions about how some code in an emulator works, or Doom, or about Vue?


This would be great, actually. I couldn't necessarily feed my company's code to this due to licensing concerns, but I'd love to point this at a Minecraft mod and ask how block X works, or if there's a console command to do Y, or how to construct a weapon with the most damage, etc.


Yes, I know that a decent proportion of the community uses bloop to understand open source repos. It can be especially helpful for repos that lack documentation.


> ... uses GPT4

when will people learn not to send their entire IP to Microsoft?


Microsoft already owns GitHub. This is probably near the bottom of things to be concerned about.


Are you implying that having your code on GitHub means Microsoft has access to your IP?


Yes, that's how they trained copilot and that's why they are currently on trial about it (but that damage has been done).


Where can I read about this? They're on trial for training copilot on private repos? This is huge.



Your questions in this thread aren't making a lot of sense. Microsoft owns GitHub, so obviously hosting your code there means "sending your intellectual property to Microsoft" in some sense.


Of course it does, but when I pay for a private repository, I expect it to be private. I've never seen or heard of any evidence that my private code is used to train ChatGPT-4 and/or Copilot. So with all due respect, you and the parent aren't making much sense.

It would be an incredible breach of trust if MS was found to be doing this.


We use OpenAI directly, not Azure


But isn’t OpenAI funded by Microsoft in the form of Azure compute?


That’s not how funding works, Microsoft can’t read your data.


Yeah ok, I was just pointing out that the data is ultimately being sent to Microsoft owned infra.


does microsoft own github?


I bet you could look that up


is this a rhetorical question?


I think the point being made by using "Microsoft" here was basically a stand in for "other random people".

I.e anything 'cloud'.


Love the idea of this - but it'd be great to have a list of the languages supported (I'd be wanting C# - it's not listed in the 'treesitter bindings' you've linked to, not sure if that means it isn't supported)

And of course, yes I'd need to be able to provide my gpt4 api key.


Even if C# isn't supported for code navigation, you should still be able to ask questions as the LLMs will have seen C#.

If you want to try it out today (not using your own API key), you could sync something open source.


I really like how the product looks, congrats! Curious how your "write a dockerfile" example would work -- would it write a dockerfile completely autonomously, or (more likely) involve multiple iterations of the agent + human?


It's one shot at the moment, but this isn't by design and may change, as generating code hasn't been our core focus.


V nice. Can you hook in your own LLM (eg Bloom, t5 etc?)


We’re experimenting with this, and the answer is yes and no.

GPT4 is the only model that can just about run the agent execution, mainly due to context length and quality.

We use our own model for the embedding based code retrieval, and will be replacing some of the GPT3.5 calls with fine tuned models over the coming months.


Would be great if the app allowed connecting with local LLMs like text generation webui. As for quality, it's up to the user to choose their LLM, so I don't see this as relevant.


Also interested. Would be really great to run this against a self hosted LLM agent.


Curious what was your rationale to do this in Rust vs Python? Would be instructive to understand the trade-offs you considered. Thanks


I'm sure the other engineers on the project will have their own opinions here, but for me there's the obviously visible parts to the project (the prompt, tools, ...) and the invisible parts (indexing, tokenising/chunking, parallelising, streaming, ...).

Building agents is an experimental process. You test an approach, maybe it works, and there's not always an obvious reason why certain experiments fail or succeed. We built three prototype agents in Python and JS, because those languages favour scrappy, fast iteration. This helped us quickly nail down our approach to the 'visible' parts.

Once we nailed down our approach, we rebuilt the agent in Rust because the speed and safety favoured all the 'invisible' parts of the project.
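
As an example of one of the 'invisible' parts, chunking a file before indexing might look roughly like the sketch below. It is written in Python for brevity rather than Rust, and the window size, overlap and `chunk_lines` name are assumptions, not bloop's actual chunking strategy.

```python
def chunk_lines(source, size=4, overlap=1):
    """Split text into overlapping line windows.

    Returns (first_line, last_line, text) tuples with 1-based line numbers,
    so a search result can point back at a line range in the file.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    lines = source.splitlines()
    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        end = min(start + size, len(lines))
        chunks.append((start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break  # the last window already reaches the end of the file
    return chunks

source = "\n".join(f"line {i}" for i in range(1, 11))
chunks = chunk_lines(source)
for first, last, _ in chunks:
    print(first, last)
```

The overlap means a definition that straddles a window boundary still appears whole in at least one chunk, at the cost of indexing some lines twice.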


This is a very interesting perspective, thanks for sharing


Interesting product! Looking forward to hearing about how the agent works. Curious about the embedding process.


Shameless plug - https://github.com/kesor/chatgpt-code-plugin - also does a very good job of telling you anything you'd like about your code. Shared it on ShowHN as well here https://news.ycombinator.com/item?id=36099507


No shame at all, very cool! We wanted to go beyond the chat interface, and we have a complicated history management system that ensures answers are reliable in long threads, which wouldn't work with ChatGPT as it removes messages chronologically
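
To show what an alternative to purely chronological truncation could look like, here is a hypothetical Python sketch that pins important messages and drops the oldest unpinned ones once a budget is exceeded. The `pinned` flag, the word-count budget and the `trim_history` name are illustrative assumptions; the comment above does not describe how bloop's history management works.

```python
def trim_history(messages, budget):
    """Drop the oldest unpinned messages until the history fits the budget.

    messages: list of dicts with "text" and an optional "pinned" flag.
    budget:   maximum total size, counted in whitespace-separated words
              (a crude proxy for tokens).
    """
    def cost(message):
        return len(message["text"].split())

    kept = list(messages)
    total = sum(cost(m) for m in kept)
    i = 0
    while total > budget and i < len(kept):
        if kept[i].get("pinned"):
            i += 1  # pinned messages survive even when they are old
            continue
        total -= cost(kept[i])
        del kept[i]
    return kept

history = [
    {"text": "How does auth work?", "pinned": True},
    {"text": "It uses a session token stored in a cookie"},
    {"text": "And where is it validated?"},
]
trimmed = trim_history(history, budget=10)
print([m["text"] for m in trimmed])
```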


Can we use our own GPT4 api access ?


Soon, we’re working on open sourcing our GPT proxy. As it’s not possible to self-serve sign up for a GPT4 API key, we haven’t prioritised it.


Cool. I downloaded and tried out the app but it just says "Loading code line ranges" on all the returned results.


Can you share a screenshot or, if the code is open source, the repo and query with louis at bloop dot ai?

I’ve seen a similar issue with the syntax highlighter and less common languages. Either way will debug and fix.



