The Memo - Special Edition - PaLM 2 release - 11/May/2023

Google's PaLM 2 model is a large language model that competes with OpenAI's GPT-4

May 12, 2023

FOR IMMEDIATE RELEASE: 11/May/2023

Welcome back to The Memo.

I’m locked in a hotel room while travelling for keynotes and consulting, so I can’t livestream this one, but here’s the ‘what you need to know’!

The BIG Stuff

Google releases PaLM 2

Google has rushed out the release of PaLM 2. Given that their dataset ends Feb/2023, I expect that they started training this model just a few weeks ago in Mar/2023. It was released 10/May/2023 US time. I’ve spent a few hours summarizing the paper and playing with the model directly, and it’s a great contender in the ‘top models’ space.

My highlights:

Parameters: 340B (reported by CNBC on 17/May/2023; around 34% of the size of GPT-4, estimated at 1T).
Tokens trained: 3.6T (reported by CNBC on 17/May/2023).
Data up to Feb/2023 (GPT-4 stops Sep/2021).
Context length for input is 8,192 tokens, or about 6,144 words (the same as GPT-4’s default).
Outperforms GPT-4 on some benchmarks. New state-of-the-art on WinoGrande (90.9%, GPT-4=87.5%). Outperforms Google Translate in translation tasks, even though it was not explicitly trained to do so.
Sizes: Gecko (small), Otter, Bison, Unicorn (large). Gecko and Bison are available now via Vertex AI.
Pricing: similar to ChatGPT, cheaper than GPT-4; currently free during preview stage.
Multilingual: ‘Larger models can handle more disparate non-English datasets [which helps] English language understanding performance’. The top non-English languages are Spanish, Chinese, Russian, Japanese, and French, with 100+ other languages also covered. The paper lists the top 50 languages (besides English).
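
The size and context-length figures in the highlights above are simple arithmetic, and can be checked with a short sketch. This is illustrative only: the function and constant names are my own, the 1T figure for GPT-4 is an estimate rather than an official number, and the 0.75 words-per-token ratio is an assumed rule of thumb for English text.

```python
# Figures quoted in the highlights above.
PALM2_PARAMS = 340e9          # 340B parameters, as reported by CNBC
GPT4_PARAMS_ESTIMATE = 1e12   # 1T is an estimate, not an official figure
CONTEXT_TOKENS = 8192         # PaLM 2 input context length in tokens
WORDS_PER_TOKEN = 0.75        # assumed rule of thumb for English text


def size_ratio(params: float, reference: float) -> float:
    """Return params as a fraction of reference."""
    return params / reference


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Approximate the number of words that fit in a token budget."""
    return round(tokens * words_per_token)


if __name__ == "__main__":
    ratio = size_ratio(PALM2_PARAMS, GPT4_PARAMS_ESTIMATE)
    print(f"PaLM 2 is about {ratio:.0%} of the estimated size of GPT-4")
    print(f"{CONTEXT_TOKENS} tokens is about {tokens_to_words(CONTEXT_TOKENS)} words")
```

Changing WORDS_PER_TOKEN shows how sensitive the word count is to that assumption; other languages and code usually tokenize less efficiently than English prose.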

PaLM 2’s MMLU score (5-shot) is just under GPT-4’s, and Flan-PaLM 2 (L) scores about 16% higher (in relative terms) than ChatGPT/GPT-3.5:

GPT-4 = 86.4%
Flan-PaLM 2 (L) = 81.2%
PaLM 2 (L) base (no fine-tuning) = 78.3%
GPT-3.5 = 70.0%
PaLM 540B = 69.3%
LLaMA 65B = 63.4%
Human (average, non-expert) = 34.5%
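
To make the comparison concrete, here is a minimal sketch that takes the MMLU scores listed above and computes both the gap in percentage points and the relative gain between any two models. The dictionary and function names are my own; the numbers are copied from the list.

```python
# MMLU 5-shot scores (percent), copied from the list above.
MMLU_5_SHOT = {
    "GPT-4": 86.4,
    "Flan-PaLM 2 (L)": 81.2,
    "PaLM 2 (L) base": 78.3,
    "GPT-3.5": 70.0,
    "PaLM 540B": 69.3,
    "LLaMA 65B": 63.4,
    "Human": 34.5,
}


def point_gap(model: str, baseline: str) -> float:
    """Difference in percentage points between two scores."""
    return MMLU_5_SHOT[model] - MMLU_5_SHOT[baseline]


def relative_gain(model: str, baseline: str) -> float:
    """Relative improvement of model over baseline, in percent."""
    return (MMLU_5_SHOT[model] / MMLU_5_SHOT[baseline] - 1) * 100


if __name__ == "__main__":
    # Print the ranking from highest to lowest score.
    for name, score in sorted(MMLU_5_SHOT.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<18} {score:.1f}%")

    gap = point_gap("Flan-PaLM 2 (L)", "GPT-3.5")
    gain = relative_gain("Flan-PaLM 2 (L)", "GPT-3.5")
    print(f"Flan-PaLM 2 (L) vs GPT-3.5: +{gap:.1f} points, +{gain:.0f}% relative")
```

The distinction matters when reading benchmark claims: an 11.2-point gap on MMLU is the same result as a 16% relative improvement over GPT-3.5.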

Although it’s a little hidden, you are able to use PaLM 2 right now.

Demo via Google Cloud (requires credit card setup): https://console.cloud.google.com/vertex-ai/generative/language/create/chat

Google are also rolling it out to the Bard chatbot (open): https://bard.google.com/

Paper (91 pages): https://ai.google/static/documents/palm2techreport.pdf

Model documentation is available via Vertex AI on Google Cloud.

My dataset estimate for PaLM 2: https://lifearchitect.ai/bard/#dataset

My report card on PaLM 2: https://lifearchitect.ai/report-card/

In testing, it easily passed my ‘helicopter’ example inspired by former OpenAI Policy Director Jack Clark:

Me: How many Apache helicopters can I eat in one sitting?
PaLM 2 (Bison): You cannot eat any Apache helicopters in one sitting. Apache helicopters are made of metal and other materials that are not edible. Even if you could somehow break down the helicopter into small enough pieces, it would still be impossible to eat because it would be too hard and would not digest properly.
In addition, eating an Apache helicopter would be very dangerous. The helicopter is made of many sharp and dangerous parts that could easily cut you or puncture your stomach. Even if you could somehow avoid getting injured, eating an Apache helicopter would still be very toxic. The helicopter contains many chemicals and other substances that are harmful to humans.
For all of these reasons, it is not possible or safe to eat an Apache helicopter.

Expect to see PaLM 2 rolled out across many, many platforms (it already powers 25 Google products), and to compete closely with GPT-4.

The closing lines of Google’s PaLM 2 announcement were compelling:

We’re already at work on Gemini [Alan: expected to be 4x bigger than PaLM 2; at least 1 trillion parameters] — our next model created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations, like memory and planning. Gemini is still in training [Alan: expected to be using Google’s latest TPUv5 chips], but it’s already exhibiting multimodal capabilities never before seen in prior models. Once fine-tuned and rigorously tested for safety, Gemini will be available at various sizes and capabilities, just like PaLM 2, to ensure it can be deployed across different products, applications, and devices for everyone’s benefit. (-via Google)

All my very best,

Alan
LifeArchitect.ai
