Gorilla: Large Language Model Connected with APIs

shishirpatil · on June 14, 2023

Hi HN, I'm one of the lead authors on the paper! Gorilla is an open source effort and we would love to hear from the community. Let us know if you have any questions or suggestions!!

gsharma · on June 15, 2023

Congratulations on shipping! This is pretty cool. Building a next-gen Zapier on top of this would be a great use case.

There is a finite number of public APIs (100K?) which keeps the problem manageable. IMO, adding support for custom/private APIs (something like OpenAI functions) will make this a very powerful tool.

shishirpatil · on June 15, 2023

Thanks for the kind words @gsharma! Private APIs are something we are definitely thinking about.

joshuanapoli · on June 15, 2023

Is Gorilla helpful for private APIs? Is it addressing the same need as OpenAI's new "function-calling" feature?

absk82 · on June 15, 2023

Your license says it can be used commercially by anyone, while Llama's license says it can only be used for research purposes. Isn't your license bound by the Llama usage license?

shishirpatil · on June 15, 2023

Yes, we have three sets of models. One is based on Llama, which, as you rightly point out, cannot be used commercially. We have two additional models based on MPT-7B base and Falcon-7B, which can be used commercially with no obligations!

Kerbonut · on June 15, 2023

There is Open Llama 7B available that is Apache 2.0 licensed. Would you consider fine-tuning that one as well, to allow commercial use of this with a Llama model?

shishirpatil · on June 17, 2023

Yes, when we released the initial set of models Open Llama was still at the 600B checkpoint and had not finished training yet. It's an easy port :)

danShumway · on June 15, 2023

> while Llama's license says it can only be used for research purpose

Minor nitpick, but we still don't have a clear legal answer on whether this would be binding on people who didn't sign that agreement, because we still don't have a clear legal answer on whether model weights are covered by copyright.

That being said, it is good for projects to point out that there's uncertainty over whether Llama can be used commercially; so I agree with the overall point.

az226 · on June 15, 2023

It’s further unclear whether a fine tuned model which has different weights counts as a copyright violation at all. Doesn’t stop a wealthy company from suing though.

rahimnathwani · on June 15, 2023

AIUI it uses the Llama architecture, but not Facebook's Llama weights. It uses MPT-7B, which was trained from scratch: https://www.mosaicml.com/blog/mpt-7b

swyx · on June 15, 2023

the code is open source but not the weights. as far as i can tell.

shishirpatil · on June 15, 2023

Hi swyx, the weights are also open sourced at https://huggingface.co/gorilla-llm Let us know if you are unable to access them.

sfriedr · on June 15, 2023

Congratulations, great paper! It should have been put on HN earlier ;)

I have a few questions:

* you say (page 4): "We then perform standard instruction finetuning on the base LLaMA-7B model" Could you perhaps provide a reference to the _exact_ finetuning approach you used? I'm afraid different groups of people have a different notion of "standard" (see for example pages 131-155 from https://arxiv.org/abs/2302.08575 for various fine-tuning approaches) and without knowing exactly how fine-tuning was carried out, it can be very difficult to reproduce your research and results exactly.

* the idea of using AST Sub-Tree Matching is nice. Could you please let me know which function in which file from your GitHub repository this is implemented in?

Again, great job on publishing this paper!

---

Best regards,

friederrr.org

shishirpatil · on June 17, 2023

Thanks @sfriedr! We generate self-instruct data and then fine-tune the base model with perplexity loss. The self-instruct data is at https://github.com/ShishirPatil/gorilla/tree/main/data/apibe...

Thank you! Yes, the code can be found here: https://github.com/ShishirPatil/gorilla/tree/main/eval/eval-...
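For readers who want a feel for the idea before opening the repository, here is a minimal sketch of AST sub-tree matching using Python's standard `ast` module. It is an illustration under stated assumptions, not the code in the Gorilla repository: the function name `matches_reference` and the matching rule (same dotted callee name, reference positional arguments as a prefix, reference keyword arguments as a subset) are hypothetical choices.

```python
import ast


def matches_reference(generated_code: str, reference_call: str) -> bool:
    """Return True if some call in generated_code matches reference_call.

    Hypothetical matching rule: the callee's dotted name must be equal, the
    reference's positional arguments must be a prefix of the generated call's
    positional arguments, and every reference keyword must appear unchanged.
    """
    ref = ast.parse(reference_call, mode="eval").body
    if not isinstance(ref, ast.Call):
        raise ValueError("reference_call must be a single call expression")

    ref_name = ast.unparse(ref.func)  # e.g. "torch.hub.load" (Python 3.9+)
    ref_args = [ast.dump(a) for a in ref.args]
    ref_kwargs = {k.arg: ast.dump(k.value) for k in ref.keywords}

    # Walk every node of the generated program and test each call site.
    for node in ast.walk(ast.parse(generated_code)):
        if not isinstance(node, ast.Call):
            continue
        if ast.unparse(node.func) != ref_name:
            continue
        args = [ast.dump(a) for a in node.args]
        kwargs = {k.arg: ast.dump(k.value) for k in node.keywords}
        if args[: len(ref_args)] == ref_args and all(
            kwargs.get(name) == value for name, value in ref_kwargs.items()
        ):
            return True
    return False


generated = (
    "import torch\n"
    "model = torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)\n"
    "model.eval()\n"
)
print(matches_reference(generated, "torch.hub.load('pytorch/vision', 'resnet50')"))
print(matches_reference(generated, "torch.hub.load('pytorch/vision', 'resnet18')"))
```

Comparing parsed trees rather than strings means that whitespace, quoting style, and extra optional arguments in the generated code do not cause a false mismatch.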

Hope this helps. Let me know if you have any follow-ups!

data_maan · on June 17, 2023

Awesome, thanks for letting me know!

I'm still not sure though about some nitpicky things:

* do you change all the weights, or just the ones from the last layer when fine-tuning?

* do you just train on the _code_ field from the JSON file with the self-instruct data, or do you also use the other fields to train (or do you use the other fields just for downstream evaluation purposes)?

I think it could be a major selling point of your paper if on Github (or in an appendix to your preprint, if you update it on arxiv), you had a section where you document the training process in detail

data_maan · on June 18, 2023

(whoops, these comments/questions should have been posted as an answer to your other comment @shishirpatil)

data_maan · on June 16, 2023

Seems @shishirpatil ran out of steam answering questions. Too bad.

data_maan · on June 16, 2023

(Or maybe the questions were too tricky and he wasn't able to answer, heh)

shishirpatil · on June 17, 2023

Haha, was busy yesterday! Or was I? :P

fareesh · on June 15, 2023

What's a good/affordable GPU to run these projects locally?

It seems like building anything on top of these runs into either a big GPU cost for yourself or a big compute cost if you scale for others.

Tostino · on June 15, 2023

3090 can run 30b parameter models, 2x can run 65b parameter models.

4090 can run the same, very slightly faster for much more money.
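A rough back-of-the-envelope check of these numbers, assuming 4-bit quantized weights (the comment does not say which precision is meant) and 24 GB of VRAM per 3090 or 4090. The sketch counts the weights only; the KV cache and activations need extra room on top. `weight_memory_gib` is a hypothetical helper written for this illustration.

```python
def weight_memory_gib(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


VRAM_PER_CARD_GIB = 24  # RTX 3090 and RTX 4090 both have 24 GB of VRAM

for params, cards in [(30, 1), (65, 2)]:
    for bits in (16, 4):  # fp16 versus an assumed 4-bit quantization
        need = weight_memory_gib(params, bits)
        fits = need < cards * VRAM_PER_CARD_GIB
        print(f"{params}B @ {bits}-bit: {need:.1f} GiB, fits on {cards} card(s): {fits}")
```

Under these assumptions the 16-bit weights do not fit in either configuration, while the 4-bit weights do, which is consistent with the figures quoted above.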

tianjunz · on June 15, 2023

We named the project Gorilla because it is a cute animal that uses tools!

tianjunz · on June 15, 2023

Hey everyone, I am one of the authors of the Gorilla project. Super excited to see how the project grows! We have released LLaMA based, MPT based (Apache 2.0) and Falcon based (Apache 2.0) models so far. Something cooler is coming soon!

arbuge · on June 15, 2023

In the colab example it appears you are using the openai python library but with the gorilla model instead of openai's models. That works? How do you set that up?

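  import openai  # pre-1.0 openai Python library

  # Query Gorilla server
  def get_gorilla_response(prompt="I would like to translate from English to French.", model="gorilla-7b-hf-v0"):
    try:
      # Same call shape as for OpenAI's own chat models
      completion = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
      )
      return completion.choices[0].message.content
    except Exception as e:
      # raise_issue is not defined in this excerpt (presumably a helper from the Colab)
      raise_issue(e, model, prompt)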

lt · on June 15, 2023

they point openai.api_base to their server that implements the same API
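To make the trick concrete: with the pre-1.0 `openai` Python library, the base URL is a module-level setting (`openai.api_base`), and every request path is appended to it. The sketch below reproduces only that URL and body construction with the standard library, so it runs without the `openai` package and without sending anything. The host `http://localhost:8000/v1` and the helper name `build_chat_request` are assumptions for illustration, not Gorilla's actual server address or code.

```python
import json
import urllib.request


def build_chat_request(api_base: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request against api_base."""
    url = api_base.rstrip("/") + "/chat/completions"
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same request shape, two different servers: only the base URL changes.
for base in ("https://api.openai.com/v1", "http://localhost:8000/v1"):
    req = build_chat_request(base, "gorilla-7b-hf-v0", "Translate English to French.")
    print(req.get_method(), req.full_url)
```

Because the client code never changes, any server that accepts this request and returns the same response shape can stand in for OpenAI's endpoint.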

OkGoDoIt · on June 15, 2023

That’s clever. Do other LLM APIs do that?

dygd · on June 15, 2023

Yesterday there was a "Launch HN" thread for credal.ai [0] and I noticed that they use the same openai.api_base trick [1].

[0] https://news.ycombinator.com/item?id=36326525 [1] https://credalai.notion.site/Drop-In-APIs-3a45d32405c347e8bf...

anonzzzies · on June 15, 2023

It would take you (or gpt) 3 seconds to write an openai compatible wrapper; the inference api is trivial for all LLMs.
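As a sketch of what such a wrapper boils down to, the function below turns a chat-completions request body into a response with the fields an OpenAI-style client reads (`choices[0].message.content`). The name `chat_completion_response` and the `generate` callback are hypothetical; a real wrapper would put this behind an HTTP route and plug an actual model into `generate`.

```python
import time
import uuid
from typing import Callable


def chat_completion_response(request: dict, generate: Callable[[str], str]) -> dict:
    """Answer an OpenAI-style chat request using any text-generation callback."""
    prompt = request["messages"][-1]["content"]  # use the latest message as the prompt
    text = generate(prompt)
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": request["model"],
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }


# Stand-in "model" that simply echoes the prompt in upper case.
response = chat_completion_response(
    {"model": "demo", "messages": [{"role": "user", "content": "hello"}]},
    generate=lambda prompt: prompt.upper(),
)
print(response["choices"][0]["message"]["content"])
```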

arbuge · on June 15, 2023

Ah, I missed that. Thanks.

lmeyerov · on June 15, 2023

At first I was excited -- this is the second time I'm seeing this advertised, and we are thinking through reliable API call-out strategies for louie.ai -- but then I got confused by the paper:

Is this really just tested against 95 API calls, and I'm guessing largely from just a small number of libraries like pytorch?

More importantly, if anywhere near true, is there any reason to (so far) use this for use cases like OpenAI's around calling generic OpenAPI style libs (zapier scenario), known specific tools, or random python libs not in that dataset?

I'm really thinking 3 scenarios for our users:

-- python libraries we know they'll want to use ahead of time, like pandas and pygraphistry

-- Same for CLI, like AWS and az, and OpenAPI from an index

-- Long-tail that we don't expect, esp in python + js, so on the fly, with limited time budget for inspecting GitHub/Google/etc

So far, we generally find auto approaches too unreliable for non-hobbyists, and have to tune a bunch for each tool and database we teach louie. This line of research is def interesting to us...

sublimefire · on June 15, 2023

My issue with this is that it needs to be retrained on a regular basis to make sure the latest APIs are included. There needs to be a long-term assessment to understand its viability in a commercial setting. Otherwise we'll jump in and after 6 months it will begin producing out-of-date suggestions for some edge cases. And then again, if you need to support an old API, how can you be sure it will produce results scoped to that version?

jarulraj · on June 15, 2023

Neat idea, @shishirpatil! We are developing EvaDB [1] for shipping simpler, faster, and cost-effective AI apps. Can you share your thoughts on transforming the output of the Gorilla LLM to functions in EvaDB apps -- like this function that uses the HuggingFace API -- https://evadb.readthedocs.io/en/stable/source/tutorials/07-o...?

[1] https://github.com/georgia-tech-db/eva

CyrsBel · on June 15, 2023

Outperforms ChatGPT and GPT-4 in generating api calls? Or in every type of querying? It's exciting that this is open, looking forward to trying all of this from end to end!

thisisit · on June 15, 2023

Previously discussed: https://news.ycombinator.com/item?id=36073241

sroussey · on June 14, 2023

Currently released weights:

https://huggingface.co/gorilla-llm

Aditya_Garg · on June 14, 2023

Heads up, your discord link is broken

jeron · on June 14, 2023

working for me...

edmundsauto · on June 14, 2023

How does this compare to LangChain?

TechBro8615 · on June 14, 2023

I don't know how its performance compares, but its architecture is completely different: LangChain is a "normal" software library, but Gorilla is itself an LLM:

> Gorilla is a LLM that can provide appropriate API calls. It is trained on three massive machine learning hub datasets: Torch Hub, TensorFlow Hub and HuggingFace. We are rapidly adding new domains, including Kubernetes, GCP, AWS, OpenAPI, and more. Zero-shot Gorilla outperforms GPT-4, Chat-GPT and Claude. Gorilla is extremely reliable, and significantly reduces hallucination errors.

My reading of that abstract is that it's an LLM that outputs API calls instead of natural language (or maybe it still outputs natural language, but it can use API calls during inference? I didn't read very far), whereas LangChain is simply a software library. In theory, you could probably get Gorilla to output LangChain "API" (function) calls...

csjh · on June 14, 2023

Still outputs natural language. Example from their colab:

Input: I would like to translate from English to Chinese.

Output:

<<<domain>>>: Natural Language Processing Text2Text Generation

<<<api_call>>>: M2M100ForConditionalGeneration.from_pretrained('facebook/m2m100_1.2B')

<<<api_provider>>>: Hugging Face Transformers

<<<explanation>>>: 1. Import M2M100ForConditionalGeneration and M2M100Tokenizer from the transformers library.

2. Load the pre-trained M2M100 model and tokenizer using the from_pretrained() method. The model is trained to translate text from English to Chinese, among other languages.

3. Encode the input text in English using the tokenizer.

4. Generate the translation using the model.generate() method.

5. Decode the output tokens using the tokenizer to obtain the translated text in Chinese.

6. Print the translated text.

shishirpatil · on June 15, 2023

Hi @csjh, we trained the model to also output additional context so it would be useful for a downstream task. We wrapped the API call with a special decorator so it's easier to just regex. Would you like to just have the API instead? Happy to release an API-only model if there is wider interest - it's a strictly easier task for Gorilla LLM :)
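A minimal sketch of that regex extraction, assuming the output always uses the `<<<field>>>:` markers shown in the Colab example above. The helper name `parse_gorilla_output` is hypothetical, and the sample text is shortened from that example.

```python
import json
import re

# Each field starts with a <<<name>>>: marker and runs until the next marker
# or the end of the text.
FIELD_PATTERN = re.compile(r"<<<(\w+)>>>:\s*(.*?)(?=<<<\w+>>>:|\Z)", re.DOTALL)


def parse_gorilla_output(text: str) -> dict:
    """Split a Gorilla-style response into a dict keyed by field name."""
    return {name: value.strip() for name, value in FIELD_PATTERN.findall(text)}


sample = """<<<domain>>>: Natural Language Processing Text2Text Generation
<<<api_call>>>: M2M100ForConditionalGeneration.from_pretrained('facebook/m2m100_1.2B')
<<<api_provider>>>: Hugging Face Transformers
<<<explanation>>>: 1. Import M2M100ForConditionalGeneration and M2M100Tokenizer.
2. Load the pre-trained M2M100 model and tokenizer.
"""

fields = parse_gorilla_output(sample)
print(fields["api_call"])
print(json.dumps(fields, indent=2))
```

The resulting dict serializes directly to JSON, with the API call under its own key and the remaining fields kept as context.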

jadbox · on June 15, 2023

Ideally I'd like the result to be in a clean json with a key that's strictly the result and other keys for context information. It would reduce needing to use regex everywhere.

zxexz · on June 15, 2023

I would love this!

valine · on June 14, 2023

"We release Gorilla, a finetuned LLaMA-based model that surpasses the performance of GPT-4 on writing API calls"

Sounds like it's another LLaMA variant specifically fine tuned for API calls.

shishirpatil · on June 15, 2023

Good point. This was the original release, we now also have Apache-2.0 licensed models finetuned on MPT-7B and Falcon-7B!

Kerbonut · on June 15, 2023

There is Open Llama 7B which is Apache 2.0 licensed, please consider checking it out https://github.com/openlm-research/open_llama

shishirpatil · on June 15, 2023

LangChain is a terrific project that tries to teach agents how to use tools using prompting. Our take on this is that prompting is not scalable if you want to pick between 1000s of APIs. So Gorilla is an LLM that can pick and write the semantically and syntactically correct API for you to call! A drop-in replacement in LangChain!

random5245 · on June 15, 2023

Is this model raw and uncensored?