The Memo - Special edition: Google DeepMind Gemini
Gemini Ultra model outperforms expert humans on MMLU
FOR IMMEDIATE RELEASE: 6/Dec/2023
Welcome back to The Memo.
The BIG Stuff
Google DeepMind announces Gemini
Once again, we have this out to The Memo readers within just a few hours of the model release (it is 3AM here).
Key points:
Four model sizes: Ultra, Pro, and the on-device models Nano-1 (1.8B parameters) and Nano-2 (3.25B parameters).
Dense architecture, not a sparse mixture-of-experts (MoE) model as GPT-4 is reported to be.
Ultra may be around 1T-2T parameters trained on 20T-40T tokens; Chinchilla scaling (about 20 tokens per parameter) is confirmed in the technical report. See the quick calculation after this list.
Multimodal (text, image, audio, and video as inputs): ‘Gemini can directly ingest audio signals at 16kHz from Universal Speech Model (USM) features. This enables the model to capture nuances that are typically lost when the audio is naively mapped to a text input.’ In other words, it may be able to hear voice tone and infer emotion…
Outperforms expert humans on MMLU. Gemini Ultra scored 90.04% (CoT@32); average humans are at 34.5%, expert humans at 89.8%, and GPT-4 at 86.4% (5-shot).
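To make the Chinchilla arithmetic above concrete, here is a minimal sketch. The 20:1 token-to-parameter ratio is from the technical report; the 1T-2T parameter counts are my estimates, not disclosed figures.

```python
# Chinchilla-optimal scaling (Hoffmann et al., 2022): roughly 20 training
# tokens per model parameter. Applying that ratio to my estimated range
# for Gemini Ultra (the 1T-2T parameter figures are estimates, not from
# the technical report).
TOKENS_PER_PARAM = 20

for params_t in (1.0, 2.0):  # estimated parameter count, in trillions
    tokens_t = params_t * TOKENS_PER_PARAM
    print(f"{params_t:.0f}T params -> ~{tokens_t:.0f}T training tokens")

# Output:
# 1T params -> ~20T training tokens
# 2T params -> ~40T training tokens
```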
This last point bumped up my conservative countdown to AGI from 56% → 61% today.
The text benchmark results are confronting, with direct comparisons to the biggest competitors: Gemini Ultra outperforms GPT-4 on most benchmarks, with HellaSwag a notable exception.
Read Google’s announcement and summary.
Read the technical report (60 pages, no architecture details).
Watch Google’s Gemini demo video.
See also my Gemini page, Models table, and Timeline.
Gemini Pro (not Ultra) is already available on Bard in the US and coming to Vertex AI by 13/Dec/2023.
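For readers planning to try Gemini Pro when it lands on Vertex AI, here is a minimal sketch using the preview Gemini interface in the Vertex AI Python SDK (google-cloud-aiplatform). The project ID is a placeholder, and the preview module path may change as the rollout completes.

```python
# Minimal sketch of calling Gemini Pro via the Vertex AI Python SDK.
# "my-project" is a placeholder; the preview module path may change
# once Gemini Pro is generally available on Vertex AI.
import vertexai
from vertexai.preview.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-pro")
response = model.generate_content(
    "Summarize the Gemini technical report in one sentence."
)
print(response.text)
```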
I’ve added a report card for Gemini.
My annotated paper for Gemini is now available to full subscribers of The Memo.
We’ll cover Gemini during the usual scheduled stream next week, and it will also feature in my end-of-year report, sent to full subscribers shortly.
If you are thinking about becoming a full subscriber, I’d like to invite you to join family offices, governments, and companies from around the world.
All my very best,
Alan
LifeArchitect.ai