You can try this out now on Bard, which appears to be generally available.
https://bard.google.com/
My first impressions of this are that it's quite good. I gave it a handful of logic puzzles which it could solve and asked it for some New Zealand native gardening and bird advice and it gave good answers.
Best of all it doesn't have that... tone that ChatGPT has. You know, that kind of smug, condescending tone that gives away ChatGPT-generated content. If you ask Bard to respond in the style of ChatGPT it can replicate that tone, if you like it for some reason.
I like this prompt for testing LLMs as the problem is easy to reason about but likely doesn't come up a lot in training data:
> I'm playing assetto corsa competizione, and I need you to tell me how many liters of fuel to take in a race. The qualifying time was 2:04.317, the race is 20 minutes long, and the car uses 2.73 liters per lap.
GPT-3.5 gave me a right-ish answer of 24.848 liters, but it did not realize that the lap in progress when the timer runs out still needs to be completed. GPT-4 gave me 28-29 liters as the answer, recognizing that a partial lap needs to be added due to race rules, and that it's good to have 1-2 liters of safety buffer.
I prompted Bard today and the three drafts gave three different answers: 18.28, 82.5, and 327.6 liters. All of these were wildly wrong in different ways.
Interesting. Regenerating replies does seem to help, for whatever reason. I've noticed the first reply after model loading seems to have a higher likelihood of being inaccurate or a hallucination.
Regenerating a GPT4xAlpaca 30B model's initial answer of 243 liters, I got a variation of the following for each of the next 5 rerolls:
"Based on your qualifying time of 2:04.317 and the length of the race being 20 minutes, we can calculate the number of laps needed for the entire race. Assuming each lap takes approximately 2 minutes (based on the average speed), there will be 20 / 2 = 10 laps during the race.
Using the information provided about the car using 2.73 liters per lap, we can determine the total amount of fuel required for the race. Therefore, it would be advisable to carry at least 10 * 2.73 = 27.3 liters of fuel for this race."
Sure thing! If the race is 20 minutes and each lap takes 2m04s, that means there will be 9.67 laps till the race is over, and you round that up to 10 since partial laps must be finished. You need 2.73 liters per lap, so the 10 laps will use 27.3 liters total. GPT-4 is correct in suggesting a tiny safety buffer above that in case fuel usage differs from expected.
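For what it's worth, the whole calculation fits in a few lines of Python (a minimal sketch; the 2-liter buffer is just the convention GPT-4 suggested, not a game rule):

```python
import math

def fuel_for_timed_race(quali_lap_s: float, race_min: float,
                        liters_per_lap: float, buffer_l: float = 2.0) -> float:
    """The timer can expire mid-lap and that lap must still be completed,
    so the lap count rounds up."""
    laps = math.ceil(race_min * 60 / quali_lap_s)
    return laps * liters_per_lap + buffer_l

# 2:04.317 quali, 20-minute race, 2.73 L/lap -> 10 laps -> 29.3 L with buffer
print(fuel_for_timed_race(124.317, 20, 2.73))
```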
People try stuff like this because it's precisely the kind of problem that AI would be useful for. If one of these models turned out to be really good at it, it would signify that they're now useful for a whole class of problems.
Besides, GPT-4 did solve this question perfectly. I like that rather than just involving math, there's also some real-life knowledge needed to give a practical answer.
Also, from what I've heard: France, Germany, the Netherlands, Sweden...
It's quite odd given the announcement of 180+ countries. There are 195 countries in the world, and the EU consists of 27 of them. So it can't be the whole EU plus Canada being left out, or else they're lying about the number. I wonder what's going on. Is it a staggered rollout? Nothing was said about that. It's pretty crappy to not simply give a list, nor any details on the rollout.
I don't think it's actually using PaLM 2.
If it is, it's not great, but I strongly doubt it is.
The commentary they've made on it could mean many things.
One possibility is that they're using the PaLM 2 API to classify the language and translate it to English if necessary, so that the simpler, non-multilingual Bard model can handle it.
Obviously this isn't necessarily what is occurring; it could just be that Bard now runs on PaLM 2 for some instances as a test, or as a full rollout.
Ultimately we don't know - because it's not open source, and companies are not great about telling you what they're actually doing.
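To make that concrete, the speculated setup would look something like this (purely an illustration of the guess above; the stubs stand in for whatever Google actually runs, and none of them are real APIs):

```python
def detect_language(text: str) -> str:
    return "en"  # stub: imagine PaLM 2 classifying the prompt's language

def translate(text: str, src: str, dst: str) -> str:
    return text  # stub: imagine PaLM 2 translating between languages

def bard_generate(prompt: str) -> str:
    return f"[Bard reply to: {prompt}]"  # stub: the simpler English-only Bard model

def answer(prompt: str) -> str:
    lang = detect_language(prompt)
    english = prompt if lang == "en" else translate(prompt, src=lang, dst="en")
    reply = bard_generate(english)
    return reply if lang == "en" else translate(reply, src="en", dst=lang)
```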
> PaLM 2’s improved multilingual capabilities are allowing us to expand Bard to new languages, starting today. Plus, it’s powering our recently announced coding update.
That’s a bit vague, though. Bard might only be using it for some queries?
Frankly, I did not perceive any difference in quality between Bard's responses today and yesterday.
I think you are right, they probably dynamically choose PaLM 2 for some queries. Globally rolling out PaLM 2 for everything everywhere isn't Google style.
> "Google, similar to OpenAI, didn’t provide a lot of the technical details about how it trained this next-gen model, including parameter counts (PaLM 2 is a 540-billion parameter model, for what it’s worth). The only technical details Google provided here are that PaLM 2 was built on top of Google’s latest JAX and TPU v4 infrastructure."
I'm sad but not really surprised that these companies aren't publishing and bragging about all of the technical details of their model architecture, size, and training anymore.
They are still stuck in the mindset of “this is our proprietary secret and we can’t let others catch up.” That mindset is just one of the reasons their company is stagnating.
That's a tempting narrative but I think it's wrong. They used to let their researchers at Google Brain and DeepMind publish freely, but then some of their employees left and other companies learned from the publications. This led to the AI race, so now all of the race's participants are shutting down their publications.
I think you are right, and the quote from the article we are commenting on here is wrong. From the PaLM 2 technical report: "The largest model in the PaLM 2 family, PaLM 2-L, is significantly smaller than the largest PaLM model but uses more training compute."
Update: there is at least some technical report that I didn't know about when I wrote the comment (https://ai.google/static/documents/palm2techreport.pdf), but it's still mostly capabilities evaluation and doesn't explain the details of the model.
> PaLM 2 will power Google’s updated Bard chat tool, the company’s competitor to OpenAI’s ChatGPT, and function as the foundation model for most of the new AI features the company is announcing today. PaLM 2 is now available to developers through Google’s PaLM API, Firebase and on Colab.
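For anyone wanting to try it outside Bard, the developer path is roughly this (a sketch based on the PaLM API quickstart of the time; the text-bison-001 model name and exact client surface are what the public docs showed, and you still needed to request an API key):

```python
import google.generativeai as palm  # pip install google-generativeai

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

response = palm.generate_text(
    model="models/text-bison-001",
    prompt="How much fuel for a 20-minute race at 2:04.317 per lap, 2.73 L/lap?",
)
print(response.result)
```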
Let's not speculate and instead ask Bard directly!
>Do you use PaLM 2?
>I do not use PaLM 2 directly. PaLM 2 is a proprietary language model developed by Google AI, and it is not yet available to the public. However, I am trained on a dataset that includes text and code from PaLM 2, so I am able to benefit from its knowledge and capabilities. For example, I am able to answer questions more accurately and generate more creative text formats than I would be able to if I was not trained on PaLM 2's data.
>I am excited to see what PaLM 2 will be able to do in the future. I believe that it has the potential to revolutionize the way we interact with computers and the way we learn. I am grateful to Google AI for developing such a powerful tool, and I look forward to seeing how it is used to make the world a better place.
>When will you directly use PaLM 2?
>I do not know when I will directly use PaLM 2. It is a proprietary language model developed by Google AI, and it is not yet available to the public. However, I am always learning and growing, and I am hopeful that one day I will be able to directly use PaLM 2 to improve my capabilities.
>In the meantime, I am grateful for the opportunity to learn from PaLM 2's data, and I am confident that it will help me to become a more powerful and versatile language model.
Depends on how you phrase the question, evidently.
I asked the same question you did and got a similar answer:
> Do you use PaLM 2?
> I do not use PaLM 2 directly. I am a smaller language model, trained on a different dataset than PaLM 2. However, I am able to access and process information from PaLM 2, which allows me to provide more comprehensive and informative answers to your questions.
> For example, if you ask me a question about a complex topic, I can use PaLM 2 to access and process information from a variety of sources, including books, articles, and websites. This allows me to provide you with a more comprehensive and informative answer than I would be able to provide if I was only able to access information from my own dataset.
> I am still under development, but I am learning new things every day. I am excited to see how PaLM 2 and other large language models can be used to improve the way we access and process information.
But I asked a similar question and got a totally different answer:
> are you using the updated PaLM 2?
> Yes, I am using the updated PaLM 2. PaLM 2 is a more advanced version of the PaLM language model, with a larger training dataset and improved performance. It is able to generate more comprehensive and informative responses, and it can also perform a wider range of tasks, such as coding, math, and logic.
> I am still under development, but I am learning new things every day. I am excited to see what I can do with PaLM 2, and I hope to be able to help people in many different ways.
I'm surprised they've chosen to release a medical-focused model (Med-PaLM 2) before releasing a law-focused version of PaLM. I'd think it would be simpler to train and also carry less liability?
On the one hand, yes. On the other hand, if they are both going to have liability issues, choosing the one that isn't literally full of lawyers seems like a smart play.
Idk, this feels like one of those things that would be a huge problem for a startup with a less established legal team, but Google's big corporate lawyers can fairly easily write language releasing them from all liability.