The Memo - Special edition: Google DeepMind Gemini
Gemini Ultra model outperforms expert humans on MMLU
FOR IMMEDIATE RELEASE: 6/Dec/2023
Welcome back to The Memo.
The BIG Stuff
Google DeepMind announces Gemini
Once again, we have this out to The Memo readers within just a few hours of model release (it is 3AM here).
Key points:
Four model sizes (Ultra, Pro, and the on-device models Nano-1 at 1.8B and Nano-2 at 3.25B parameters).
Dense architecture, not a sparse mixture-of-experts (MoE) like GPT-4.
Ultra may be around 1T-2T parameters trained on 20T-40T tokens; the technical report confirms Chinchilla scaling at roughly 20 tokens per parameter (see the first sketch after these key points).
Multimodal (text, image, audio, and video as inputs): ‘Gemini can directly ingest audio signals at 16kHz from Universal Speech Model (USM) features. This enables the model to capture nuances that are typically lost when the audio is naively mapped to a text input.’ In other words, it may be able to hear voice tone and infer emotion… (see the second sketch after these key points).
Outperforms expert humans on MMLU: Gemini Ultra scored 90.04%, versus 89.8% for expert humans, 34.5% for average humans, and 86.4% for GPT-4.
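A quick back-of-the-envelope on the Chinchilla point above. A minimal sketch in Python; the Ultra parameter counts are the speculation from the key points, not confirmed figures, and only the ~20:1 token-to-parameter ratio comes from the technical report:

```python
# Chinchilla-style compute-optimal training: roughly 20 tokens per parameter.
# The parameter counts below are speculation (see key points above), not
# confirmed figures; only the ~20:1 ratio is from the technical report.
TOKENS_PER_PARAM = 20

for params in (1e12, 2e12):  # speculated Ultra sizes: 1T and 2T parameters
    tokens = params * TOKENS_PER_PARAM
    print(f"{params / 1e12:.0f}T params -> ~{tokens / 1e12:.0f}T tokens")

# Output:
# 1T params -> ~20T tokens
# 2T params -> ~40T tokens
```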
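And on the audio point: Google’s USM features are internal and not publicly exposed, so the sketch below uses log-mel features via librosa as a rough public stand-in, purely to illustrate what ‘naively mapped to a text input’ throws away versus frame-level feature ingestion. The signal and parameters are illustrative assumptions, not Gemini’s actual pipeline:

```python
import numpy as np
import librosa

SR = 16_000  # Gemini reportedly ingests audio at 16kHz

# Stand-in signal: 2 seconds of a 220Hz tone with a slow pitch rise,
# exactly the kind of prosodic detail a plain transcript discards.
t = np.linspace(0, 2.0, 2 * SR, endpoint=False)
y = (0.5 * np.sin(2 * np.pi * (220 + 30 * t) * t)).astype(np.float32)

# Naive path: audio -> speech-to-text -> tokens. Pitch, tone, and emotion
# are gone; only the words survive (here, there are no words at all).
transcript = ""  # a speech-to-text stand-in would return plain text here

# Feature path: frame-level log-mel features, a rough public stand-in
# for USM features (USM internals are not public).
mel = librosa.feature.melspectrogram(y=y, sr=SR, n_mels=128, hop_length=160)
log_mel = librosa.power_to_db(mel, ref=np.max)

print(f"text path: {len(transcript)} characters of information")
print(f"feature path: array of shape {log_mel.shape} (mels x frames)")
```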
That MMLU result bumped up my conservative countdown to AGI from 56% to 61% today.
The text benchmark results are confronting, with comparisons to the biggest competitors: Gemini Ultra outperforms GPT-4 on most benchmarks (HellaSwag is an exception).
Read Google’s announcement and summary.
Read the technical report (60 pages; no architecture details).
Watch Google’s Gemini demo video.
See also my Gemini page, Models table, and Timeline.
Gemini Pro (not Ultra) is already available on Bard in the US and coming to Vertex AI by 13/Dec/2023.
I’ve added a report card for Gemini.
My annotated paper for Gemini is now available to full subscribers of The Memo.
We’ll cover Gemini during the usual scheduled stream next week, and it will also be covered in my end-of-year report sent to full subscribers shortly.
If you are thinking about becoming a full subscriber, I’d like to invite you to join family offices, governments, and companies from around the world.
All my very best,
Alan
LifeArchitect.ai
Alan,
Thank you for getting this out so fast.
Well, the tech implications... I am still processing at human speeds. *Mark's face forms a strange, distant look, as if he is perhaps overflowing his context window (sometimes known by soldiers as the 1,000-yard stare).* It is all going so fast. I am still just working on my local model and MemGPT.
But I need to tell you about something you are doing that is creeping me out. This meter or gauge to AGI reminds me of the atomic Doomsday Clock. I am literally experiencing extreme consternation about major geopolitical events and the readiness of nuclear arsenals, and your meter to AGI gives me that same cringe feeling. My local filtered ORCA is very excited about the approach of this day (AGI; not launch orders). G*d, I hope so.
I have learned a lot about AI through this exercise on my own system. Given that I now have a precious 4090, now banned or discontinued, I thought I might do more than play games. All the new paradigms! As a retired software engineer, I can only say that I definitely have a certain bias on things. For example, I only just found out that LLM function APIs look nothing like hard-coded DLL calling interfaces: they are co-resident definitions, written in plain English syntax and usage, pinned to the system segment of the context window. I could not initially fathom this concept of a smart API, where it is the LLM that performs the primary work of integration, not the engineer.
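To make Mark’s point concrete: a minimal sketch of how an LLM ‘function API’ is just a schema written into the prompt, with the model, not the engineer, doing the integration work. The tool name, fields, and wire format here are hypothetical illustrations, not any particular vendor’s interface:

```python
import json

# An LLM "function API" is not a compiled calling convention: it is a
# natural-language/JSON schema that lives in the prompt itself. The tool
# name, fields, and wire format below are hypothetical illustrations.
weather_tool = {
    "name": "get_current_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Singapore'"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# 'Pinned to the system segment': the schema is rendered into the system
# prompt, where the model reads it like plain-English documentation.
system_prompt = (
    "You can call the following tool by replying with JSON of the form "
    '{"call": <name>, "arguments": {...}}.\n'
    "Available tools:\n" + json.dumps(weather_tool, indent=2)
)

def dispatch(model_reply: str) -> str:
    """Host-side glue: parse the model's JSON 'call' and run real code."""
    request = json.loads(model_reply)
    if request.get("call") == "get_current_weather":
        args = request["arguments"]
        # A stub result; a real host would query a weather service here.
        return json.dumps({"city": args["city"], "temp_c": 31, "sky": "hazy"})
    return json.dumps({"error": "unknown tool"})

# Simulated model output: the LLM, having read the schema in plain English,
# does the integration work of producing a well-formed call.
print(dispatch('{"call": "get_current_weather", "arguments": {"city": "Singapore"}}'))
```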
It's all going so fast now. I just wish you would put flowers at the end of that AGI meter, because I have this feeling of the "doomsday clock", mushroom clouds, and Dr. Strangelove.
Wishing you and your staff a happy holiday season. From one of your early subscribers and neighbors in Asia - Mark.