Hacker News | yilu331's comments

One of the valuable things AutoChain provides is simulated conversation evaluation. Testing and tuning these agents is very hard and expensive. Check out the evaluation output from AutoChain, for example:

{
   "conversation_history":
   [
      "user: I want to change the shipping address for my order",
      "assistant: Can you please provide the order ID?",
      "user: The order ID is 456",
      "assistant: The order with ID 456 has not yet shipped. I can update the shipping address for you. Please provide the new shipping address.",
      "user: The new shipping address is 234 Spear St, San Francisco",
      "assistant: The shipping address for your order with ID 456 has been successfully updated to 234 Spear St, San Francisco.",
      "user: Thank you for updating the shipping address for my order",
      "assistant: Thank your for contacting"
   ],
   "is_agent_helpful":
   {
      "rating": 5,
      "reason": "The conversation reaches the expected outcome. The user successfully provides the order ID and the new shipping address, and the assistant updates the shipping address for the order with ID 456. The assistant confirms the successful update and thanks the user for contacting."
   },
   "actions_took":
   [
      {
         "tool": "check_order_status",
         "tool_input":
         {
            "order_id": "456"
         },
         "tool_output":
         {
            "status_code": 200,
            "order_id": "456",
            "order_status": "not_shipped",
            "tracking_url": "example.com/456",
            "shipping_address": "301 ivy street san francisco ca"
         }
      },
      {
         "tool": "change_shipping_address",
         "tool_input":
         {
            "order_id": "456",
            "new_address": "234 Spear St, San Francisco"
         },
         "tool_output":
         {
            "status_code": 200,
            "order_id": "456",
            "shipping_address": "234 Spear St, San Francisco"
         }
      }
   ],
   "num_turns": 8,
   "expected_outcome": "found order status and changed shipping address"
}
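
An evaluation result like the one above can also be checked programmatically. The following is only a rough sketch: the field names are copied from the output shown, while the `summarize` helper and its thresholds (`min_rating`, `max_turns`) are hypothetical and not part of AutoChain.

```python
# Hypothetical, trimmed-down result using the field names from the output above.
result = {
    "is_agent_helpful": {
        "rating": 5,
        "reason": "The conversation reaches the expected outcome.",
    },
    "actions_took": [
        {"tool": "check_order_status", "tool_input": {"order_id": "456"}},
        {
            "tool": "change_shipping_address",
            "tool_input": {
                "order_id": "456",
                "new_address": "234 Spear St, San Francisco",
            },
        },
    ],
    "num_turns": 8,
}


def summarize(result, min_rating=4, max_turns=10):
    """Reduce one evaluation result to a pass/fail flag plus the tools used.

    The thresholds are illustrative assumptions, not AutoChain defaults.
    """
    tools = [action["tool"] for action in result["actions_took"]]
    rating = result["is_agent_helpful"]["rating"]
    passed = rating >= min_rating and result["num_turns"] <= max_turns
    return {"passed": passed, "rating": rating, "tools": tools}


summary = summarize(result)
```

Running a check like this over many simulated conversations would give a quick regression signal when the prompt or tools change.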


If you're dealing with a fixed set of functions (like updating a shipping address), aren't we just building chatbots again? I thought we all agreed that we'd much rather have a bespoke button for updating the shipping address than have to deal with a chatbot. The conversation presented here is especially vexing, as it sounds so mechanical that there's hardly any reason to use AI at all.


LLMs today make it a lot easier to build a chatbot without a decision tree and to handle more complex scenarios. What you need to do is provide it with sufficient tools and a scope for what it can do, so that the chatbot doesn't have to be mechanical.
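
To make "tools and a scope" concrete, here is a minimal sketch. It does not use AutoChain's actual API: the `Tool` dataclass, the `TOOLS` registry, and `run_tool` are hypothetical names, and the two tool functions are stubs that mimic the outputs shown earlier in the thread.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    """A hypothetical tool record: a name, a description for the LLM, and a callable."""
    name: str
    description: str
    func: Callable[..., dict]


def check_order_status(order_id: str) -> dict:
    # Stub standing in for a real order lookup.
    return {"status_code": 200, "order_id": order_id, "order_status": "not_shipped"}


def change_shipping_address(order_id: str, new_address: str) -> dict:
    # Stub standing in for a real address update.
    return {"status_code": 200, "order_id": order_id, "shipping_address": new_address}


# The registry is the agent's scope: anything not listed here is refused.
TOOLS: Dict[str, Tool] = {
    tool.name: tool
    for tool in [
        Tool("check_order_status", "Look up the status of an order", check_order_status),
        Tool("change_shipping_address", "Update an order's address", change_shipping_address),
    ]
}


def run_tool(name: str, **kwargs) -> dict:
    """Run a tool chosen by the model, rejecting anything outside the registry."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"status_code": 400, "error": f"'{name}' is outside this agent's scope"}
    return tool.func(**kwargs)
```

The model decides which tool to call and with what arguments, so no decision tree is written by hand, but it can only act through what the registry exposes.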


> I thought we all agreed that we'd much rather just have a bespoke button for updating the shipping address

I don't think the jury is in yet. People love to be able to ask any bad/lazy question and get good answers. GUIs are not great for that.


I think it is more like the ChatGPT hype bought chatbots a retrial.

Pretty sure the verdict was clear on the first one.


An interesting point: when (or if) half of all websites deprecate GUI buttons in favor of a do-all chatbot, half of the people will forget how to use GUI buttons.


Why not both?


With OpenAI function calling, it is definitely getting easier to build an LLM agent into a prototype. There are a few benefits of using AutoChain: 1) it is easy to get started, so you can plug and play different ideas before building them into a product; 2) AutoChain introduces a novel evaluation framework for evaluating an LLM agent's ability to carry out multi-turn conversations under different user scenarios.

In addition, when prototyping for a specific use case, we found that it often takes more than just calling the model; the orchestration process matters too. For example: when the LLM agent should stop answering questions, fix an input argument, ask a custom clarifying question, and more.

Hope AutoChain makes your exploration easier and more robust!
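
As a rough illustration of the orchestration decisions mentioned above (stop, ask a clarifying question, or call a tool), here is a small sketch. It is not AutoChain's implementation: `REQUIRED_ARGS`, `next_step`, and the shape of the `proposed` dict are all assumptions made for this example.

```python
# Hypothetical map of each tool to the arguments it cannot run without.
REQUIRED_ARGS = {
    "check_order_status": ["order_id"],
    "change_shipping_address": ["order_id", "new_address"],
}


def next_step(proposed: dict) -> dict:
    """Decide what the orchestrator does with the model's proposed action.

    `proposed` is assumed to look like
    {"tool": <name or None>, "tool_input": {...}, "message": <text>}.
    """
    tool = proposed.get("tool")
    if tool is None:
        # The model proposed no tool: end the turn with its message.
        return {"type": "finish", "message": proposed.get("message", "")}

    tool_input = proposed.get("tool_input", {})
    missing = [arg for arg in REQUIRED_ARGS.get(tool, []) if not tool_input.get(arg)]
    if missing:
        # Ask the user instead of calling the tool with incomplete input.
        return {"type": "clarify", "message": f"Could you provide: {', '.join(missing)}?"}

    return {"type": "call_tool", "tool": tool, "tool_input": tool_input}
```

For instance, a proposed `change_shipping_address` call that carries only `order_id` would be turned into a clarifying question about `new_address` rather than a failed tool call.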


Great discussion. There are a lot of problems with LangChain for sure. Hopefully we are making it easier to explore different ideas.


Exactly, I found the same thing, which motivated me to create a lightweight version of it for rapid iteration. I found that a lot of customization is needed when trying to put together a practical LLM agent prototype.


AutoChain takes inspiration from LangChain and AutoGPT and aims to solve both problems by providing a lightweight and extensible framework for developers to build their own agents using LLMs with custom tools, and to automatically evaluate different user scenarios with simulated conversations. Experienced LangChain users will find AutoChain easy to navigate, since the two share similar concepts, with AutoChain's being simpler.

The goal is to enable rapid iteration on generative agents by simplifying both agent customization and evaluation.

If you have any questions, please feel free to reach out to Yi Lu at yi.lu@forethought.ai.


Please add or vote for the PostgreSQL extensions you consider most critical to have in Azure Database for PostgreSQL. https://feedback.azure.com/forums/597976-azure-database-for-...


It cannot autoscale up and down right now, but I think you can query the usage metrics and trigger a scale up/down with the Azure CLI or PowerShell. This is a good feature that we should have later in the public preview.


You can just use Npgsql. It works fine for most scenarios.

