Hacker News
AutoChain, lightweight and testable alternative to LangChain (github.com/forethought-technologies)
211 points by yilu331 on July 18, 2023 | 36 comments



AutoChain takes inspiration from LangChain and AutoGPT and aims to solve the problems with both by providing a lightweight and extensible framework for developers to build their own agents using LLMs with custom tools, and to automatically evaluate different user scenarios with simulated conversations. Experienced users of LangChain will find AutoChain easy to navigate, since the two share similar but simpler concepts.

The goal is to enable rapid iteration on generative agents, both by simplifying agent customization and evaluation.

If you have any questions, please feel free to reach out to Yi Lu yi.lu@forethought.ai


Why do people need LangChain or AutoChain? I don’t really get it in the first place. Genuinely want to get more information.

It seems to me that this can be easily added ad-hoc by programmers in their projects. But the biggest reason I see may be standardization of some kind? For interoperable building blocks maybe


I use Langchain for the slack/discord bots I built for my current company.

Langchain is a swiss army knife of tools. And while that can be a good thing, it's mostly a bad thing when you're trying to build something for production. It's also not very flexible... when you're trying to do anything beyond the pre-built stuff, you have to edit the package itself... which means maintaining your own forked version or submitting a PR with your changes. For example, I improved upon the main text splitter Langchain recommends (https://github.com/ShelbyJenkins/langchain/blob/ca14d3028a57...), but I just haven't gotten around to doing everything required for a PR to add the feature (tests, docs, notebooks). So now I have two repos to maintain...

But I also really don't want to take the time to set up a web scraper for a dozen different document sources when Langchain has tools that will work with a minimal amount of tweaking. So, I'm happy to use it for things like that, and happy to recommend it for its community-built pocket-knife tools!


This has been discussed ad nauseam in the comment section of anything pertaining to LangChain over the past few weeks or couple of months.

It’s simple: a bit of cargo culting, plus the real advantage LangChain provides: its interoperability with various other useful systems.



Great discussion. There are a lot of problems with LangChain, for sure. Hopefully we can make it easier to explore different ideas.


I used it to prototype an app, and it was nice to be able to get started quickly without worrying about writing prompts or writing a vector store integration. But once I settled on the components I was using it made more sense to use a custom implementation.


Exactly, I found the same thing, which motivated me to create a lightweight version of it for rapid iteration. There are a lot of customizations I found were needed when trying to build a practical LLM agent prototype.


Same reason javascript needs an is-even library


And a cant-even library.


With OpenAI function calling, it is definitely getting easier and easier to build an LLM agent into a prototype. There are a few benefits to using AutoChain: 1) get started easily, and plug and play different ideas before building them into a product; 2) AutoChain introduces a novel evaluation framework for evaluating an LLM agent's ability to perform multi-turn conversations under different user scenarios.
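To make the function-calling point concrete, here is a minimal, self-contained sketch of the pattern. The tool name, schema, and stubbed backend are all illustrative (not AutoChain's actual API); the real model call to `openai.ChatCompletion.create` is only referenced in a comment.

```python
import json

# Hypothetical tool the agent can call; name and fields are illustrative.
def check_order_status(order_id: str) -> dict:
    # Stubbed backend lookup instead of a real order database.
    return {"order_id": order_id, "order_status": "not_shipped"}

# JSON-schema description of the tool, in the shape OpenAI's
# function-calling API expects.
FUNCTIONS = [{
    "name": "check_order_status",
    "description": "Look up the status of an order by its ID.",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

TOOLS = {"check_order_status": check_order_status}

def dispatch(function_call: dict) -> dict:
    """Route a model-produced function_call to the matching tool."""
    fn = TOOLS[function_call["name"]]
    args = json.loads(function_call["arguments"])
    return fn(**args)

# In a real agent this dict would come back from the model, e.g. from
#   openai.ChatCompletion.create(..., functions=FUNCTIONS)
# Here we simulate the model asking for a tool call:
result = dispatch({"name": "check_order_status",
                   "arguments": '{"order_id": "456"}'})
print(result["order_status"])  # not_shipped
```

The orchestration question is then what to do with `result`: feed it back to the model as a function message, ask a clarifying question, or stop.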

In addition, when it comes to prototyping for a specific use case, we found it is often about more than just calling the model; the orchestration process also matters, for example, when the LLM agent should stop answering questions, fix an input argument, ask custom clarifying questions, and more.

Hope AutoChain makes your exploration easier and more robust!


LangChain is just too much. Personal solutions are great until you need to compare metrics or methodologies of prompt generation. Then the onus is on the n parties sharing their resources to ensure that all of them used the same templates, generated the same way, with the only diff being the models the prompts were run on.

So maybe a simpler library like Microsoft's Guidance (https://github.com/microsoft/guidance)? It does this really well.

Also, LangChain has a lot of integrations that pop up as soon as a new API for anything LLM-related appears, so that helps with new-user onboarding as well.


I wouldn't rely on guidance. I've been waiting for them to approve a commit for something as basic as token counts for OAI models for weeks now. It's just not actively developed enough for serious use cases.


I wouldn't count anything out as of yet, the trend is usually that people move towards simpler tools that are also widely used by others.

Kind of a chicken-and-egg problem (to be widely used, it helps to be simple, but not all simple-to-use products make it to being widely used). TensorFlow vs PyTorch anyone?

Now, which will be more likely to be widely adopted? No one knows; just keep using both, learning both idioms, and hopefully one of them gets enough traction and all your effort put into building something with these two doesn't go to waste, fingers crossed.


I do have some guidance code (running on a more universal wrapper I made for LLMs) sitting in my prompts directory. I'm not going to throw it out but I don't see the point in maintaining it.


I think a lot of these libraries are made by people who don't really design systems.

At their core, ChatGPT and the like are ways to transform natural language into structured data and vice versa. All you really need is an interface that lets you plug these APIs into _normal_ workflow engines that lots of companies already use every day to do work, both fully automated workflow engines like Airflow and the _exact same systems_ that human engineers use; it's perfectly possible to get ChatGPT to use Jira and submit PRs or whatever.


> are ways to transform natural language into structured data

I've seen quite the opposite, because at their core they are next word predictors. Telling it to structure an output biases the context into data that it saw in a structured format. From what I've seen, this always leads to lower quality reasoning/output.

I get much better results if I leave the data unstructured, to maximize reasoning, then add a later step to structure that output.


You mean like an intermediate "thoughts" or "observations" output step?


My use case has always been:

1. take some input

2. run it through the llm to get some output

3. make that output useful to my non-english-speaking software by structuring it.

Forcing step 3 to happen in step 2 means you're in a very unusual/unrealistic "head space"/context for next-word completion.

Leaving structuring as a later step means the human like output/reasoning can be maintained for step 2, with the inhuman like structuring being more of a reformatting than reasoning+reformatting.

Leaving the "context" as small and precise as possible (which might be large for a given task) gets the best output, from what I've seen. Being a next-word predictor, this makes sense to me.

I suppose this is just "chain of thought".

This is pretty easy to benchmark. Have it put the answer into json format, after producing the answer. Then, have it come up with an answer in json format, as one step. It's a night and day difference, at least for my tasks.
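The two-pass pipeline described above can be sketched in a few lines. Everything here is hypothetical: `call_llm` is a canned stub standing in for a real model client, and the prompts and field names are made up for illustration.

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns canned text so the
    sketch is runnable. Swap in your actual LLM client here."""
    if "Extract" in prompt:
        return '{"sentiment": "negative", "product": "headphones"}'
    return ("The customer is unhappy because the headphones arrived "
            "broken, so the sentiment is negative.")

def one_step(text: str) -> dict:
    # Reasoning and structuring forced into a single call.
    return json.loads(call_llm(f"Extract as JSON: {text}"))

def two_step(text: str) -> dict:
    # Step 1: free-form reasoning, with no format constraints.
    analysis = call_llm(f"Analyze this review: {text}")
    # Step 2: reformat the finished reasoning into JSON.
    return json.loads(call_llm(f"Extract as JSON: {analysis}"))

review = "My headphones arrived broken. Terrible."
print(two_step(review))  # {'sentiment': 'negative', 'product': 'headphones'}
```

With a real model, benchmarking is just running `one_step` and `two_step` over the same inputs and scoring which structured outputs are correct.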


Agreed that it feels like a cowboy show. We get 2 main uses out:

1. Inverting the control structure. Traditional orchestration frameworks black-box simple calls, like asking an LLM to analyze a patch. Chaining is about an LLM-based planning agent, plus ensuring all tools etc. are abstracted so the planner can orchestrate them. Airflow-era tools leave too many integration gaps here.

2. Plugin interfaces. Related to 1, it puts in patterns for keeping vendors easy to swap out. (In theory; practice takes more because of tuning, but at least the shape is right.) Orchestration frameworks and new LLM ones have a lot of plugin ground to cover here.

We had to do a lot of work around LangChain to make louie.ai production-ready, and just as we have been figuring out the patterns we need, it was clear LangChain has been too. Don't get me started on threading and eval..


How?


This is going to sound like a smart-ass answer, but "programming". Jira has an api, openai has an api.

You need some script that takes structured json and uses it to modify a jira ticket, you need some other script that asks openai to generate that json based on some prompt and context, and that's all there is to it. I wrote a script right after I got access to the API that could open and refine jira tickets based on conversations in a chat window.
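The "two scripts" glue described above is small. This is a hedged sketch: the ticket fields and model output are invented, and the HTTP layer is stubbed (a real version would PUT to Jira's REST API issue endpoint).

```python
import json

def update_jira_ticket(ticket_key: str, fields: dict) -> dict:
    """Stub for the HTTP layer. A real version would PUT this payload
    to Jira's REST API (e.g. /rest/api/2/issue/<key>); here we just
    echo what would be sent."""
    return {"url": f"/rest/api/2/issue/{ticket_key}",
            "payload": {"fields": fields}}

def apply_model_output(raw: str) -> dict:
    """Parse the JSON the model was asked to produce and turn it
    into a ticket update. Field names are illustrative."""
    data = json.loads(raw)
    return update_jira_ticket(data["ticket"],
                              {"summary": data["summary"],
                               "description": data["description"]})

# Imagine this string came back from the OpenAI API, given a chat
# transcript and a prompt like "emit a JSON ticket update":
model_output = ('{"ticket": "PROJ-42", "summary": "Fix login bug", '
                '"description": "Users report 500s on login since Tuesday."}')
print(apply_model_output(model_output)["url"])  # /rest/api/2/issue/PROJ-42
```

The other script is symmetric: it prompts the model with the conversation plus the JSON schema it must emit, then hands the result to `apply_model_output`.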


Is OData still a thing?


How about Haystack? It has existed for a long time and is much better architected than LangChain.


How's your experience with it, have you used it in anger successfully?


Have an LLM document-QA app in production with a local model and Haystack.

Excellent documentation, great support (on Discord, unfortunately). Exploring Agents now.


I'm the culprit for Discord. But come join us! We host office hours and have NLP chats pretty often :)


One of the valuable things AutoChain provides is simulated-conversation evaluation. Testing and tuning those agents is very hard and expensive. Check out the evaluation output from AutoChain, for example:

"conversation_history":

      "user: I want to change the shipping address for my order",
      "assistant: Can you please provide the order ID?",
      "user: The order ID is 456",
      "assistant: The order with ID 456 has not yet shipped. I can update the shipping address for you. Please provide the new shipping address.",
      "user: The new shipping address is 234 Spear St, San Francisco",
      "assistant: The shipping address for your order with ID 456 has been successfully updated to 234 Spear St, San Francisco.",
      "user: Thank you for updating the shipping address for my order",
      "assistant: Thank your for contacting"

   "is_agent_helpful":
   {
      "rating": 5,
      "reason": "The conversation reaches the expected outcome. The user successfully provides the order ID and the new shipping address, and the assistant updates the shipping address for the order with ID 456. The assistant confirms the successful update and thanks the user for contacting."
   },
   "actions_took":

      {
         "tool": "check_order_status",
         "tool_input":
         {
            "order_id": "456"
         },
         "tool_output":
         {
            "status_code": 200,
            "order_id": "456",
            "order_status": "not_shipped",
            "tracking_url": "example.com/456",
            "shipping_address": "301 ivy street san francisco ca"
         }
      },
      {
         "tool": "change_shipping_address",
         "tool_input":
         {
            "order_id": "456",
            "new_address": "234 Spear St, San Francisco"
         },
         "tool_output":
         {
            "status_code": 200,
            "order_id": "456",
            "shipping_address": "234 Spear St, San Francisco"
         }
      }

   "num_turns": 8,
   "expected_outcome": "found order status and changed shipping address"


If you're dealing with a fixed set of functions (like updating a shipping address), aren't we just building chatbots again? I thought we all agreed that we'd much rather just have a bespoke button for updating the shipping address than having to deal with a chatbot. The conversation presented here is especially vexing, as it sounds so mechanical that there's hardly any reason to use any AI at all.


LLMs today make it a lot easier to build chatbots without decision trees, and to handle more complex scenarios. What you need to do is provide sufficient tools to use and a scope for what it can do, so that the chatbot doesn't have to be mechanical.


> I thought we all agreed that we'd much rather just have a bespoke button for updating the shipping address

I don't think the jury is in yet. People love to be able to ask any bad/lazy question and get good answers. GUIs are not great for that.


I think it is more that the ChatGPT hype bought chatbots a retrial.

Pretty sure the verdict was clear on the first one.


An interesting point: when/if half of the websites deprecate GUI buttons in favor of a do-all chatbot, half of the people will forget how to use GUI buttons.


Why not both?


I’d really like libraries to put the language they’re for in the description.

Especially anything AI-related seems to just assume everybody uses Python.


Yeah, I'm getting bothered by all React libraries not saying what language it's for too!

Jokes aside, a quick glance at the language pane on GitHub shows which languages are being used.



