The Memo - Special Edition - Google DeepMind Gemini pre-release report - 8/Sep/2023 (updated Feb/2024)
Google DeepMind Gemini: A general specialist
FOR IMMEDIATE RELEASE: 8/Sep/2023 (updated Feb/2024)
Welcome back to The Memo.
This is an exclusive report for full subscribers of The Memo. We detail Google and DeepMind model highlights, with a focus on the upcoming Gemini release. PDF download link at the end of this edition.
Notice: A pre-release edition of this independent report (Rev A) was made available in Sep/2023, before the release of Gemini. Following the release of the complete Gemini model family, this Feb/2024 report is the final edition (Rev 0).
Abstract
Since Google introduced the Transformer architecture in 2017, and subsequently released its pre-trained transformer language model BERT in October 2018, training large language models (LLMs) has become a new space race, bringing humanity toward its largest evolutionary change yet: ‘superintelligence.’
Between 2020 and 2024, LLMs were trained on increasingly large datasets by ever-larger teams of data scientists, with compute costs now measured in the hundreds of millions of dollars. The information synthesized here covers the progress made by Google and DeepMind, operating as one company under the Alphabet umbrella since 2023, with a focus on the massive Gemini multimodal model.
Gemini Nano and Pro were released on 6/Dec/2023, and Gemini Ultra 1.0 was released on 7/Feb/2024. Gemini Ultra 1.0 is likely to be a dense model of around 1.5 trillion parameters trained on 30 trillion tokens. Compared to the GPT-4 sparse MoE model, Gemini Ultra 1.0 has a similar parameter count while being trained on 2× more data.
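The abstract's estimates can be sanity-checked with simple scaling arithmetic. The figures below (1.5T parameters, 30T tokens) are this report's estimates, not confirmed by Google, and the 6ND training-FLOPs rule is the standard approximation from the scaling-laws literature:

```python
# Sketch: sanity-check the report's Gemini Ultra 1.0 estimates.
# Parameter and token counts are estimates from this report, not official figures.

params = 1.5e12   # estimated parameters (dense)
tokens = 30e12    # estimated training tokens

# Tokens-per-parameter ratio; ~20:1 is the "compute-optimal" ratio
# suggested by DeepMind's Chinchilla scaling analysis.
ratio = tokens / params
print(f"Tokens per parameter: {ratio:.0f}")  # 20

# Common approximation: training compute ≈ 6 × N × D FLOPs.
flops = 6 * params * tokens
print(f"Approx. training compute: {flops:.1e} FLOPs")  # ~2.7e26
```

At roughly 20 tokens per parameter, the estimated training run sits close to the Chinchilla-optimal regime, which is consistent with the report's claim that Ultra matches GPT-4's parameter count while using about 2× the data.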
Contents
Background
1.1 Etymology
1.2 Google DeepMind: Two archers with one target
1.3 Gemini personnel
1.4 Gemini compute resources
1.5 Large language models
1.6 Text-to-image and visual language models
1.7 The Alpha series of AI systems
1.8 Putting it together: LLM + VLM + Text-to-image
Datasets
2.1 Datasets: Text: MassiveText multilingual
2.2 Datasets: Visual (images and video)
2.3 Datasets: Audio
Gemini capabilities and performance
3.1 Languages
3.2 Visual
3.3 IQ
Size comparison
Implementing and applying Gemini
Conclusion
Further reading
Appendix
Download report (PDF)
The pre-release edition was made available in Sep/2023 to full subscribers of The Memo. This final edition is available to the public.
The shareable permanent link is: https://lifearchitect.ai/gemini-report/.
Expert calls: If you or your team would like to have an expert call about this report, please contact Denise by replying to this email, and we’ll get you set up.
All my very best,
Alan
LifeArchitect.ai
I am always amazed that in 2017 Google had split the atom and had no idea of the power they wielded. They could have classified the paper and begun work with a two-year lead.
Embarrassing - I get your newsletter and did not even know you had a website to sign into. Anyway - I WAS able to receive your special report. THX