The Memo - 14/Jan/2024
First models for 2024, MosaicML scaling laws, Kepler K1, and much more!
To: US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From: Dr Alan D. Thompson <LifeArchitect.ai>
Sent: 14/Jan/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI: 64%
Welcome back to The Memo.
You’re joining full subscribers from Alibaba, BAAI, Baidu, Huawei, Tencent, Tsinghua University, and more…
What a spectacular start to 2024! Here’s an interesting thought: if the current pace of AI development were to stop right now, we would still have enough model capabilities to discover and new technology to unpack to power humanity for generations. From frontier models like GPT-4 and Gemini to open-source models like Llama 2 and Mixtral 8x7B, last year gave us tremendous coverage, and it will take us some time to sift through the possibilities of each model. Sidenote: I’m still seeing recent analysis of 2019 GPT-2’s capabilities! (1, 2, 3).
The early winner of The Who Moved My Cheese? AI Awards! for January 2024 is George Carlin’s daughter (‘No machine will ever replace [my Dad’s] genius. These AI generated products are clever attempts at trying to recreate a mind that will never exist again.’)
Not closely related to AI: the Vulcan rocket was successfully launched from Florida on 8/Jan/2024. Along with the ashes and DNA of Arthur C. Clarke (ABC, 8/Jan/2024), the rocket and lander carry a pop-sci version of my dissertation on human intelligence and intuition. The planned Moon landing now seems unlikely (see the official updates by Astrobotic), so perhaps it will just zoom through space for a while…
The BIG Stuff
Exclusive: First models for 2024 (Jan/2024)
The first large language models for 2024 are:
JPMorgan DocLLM (7B, paper): an LLM focused on the spatial layout structure of documents.
SUTD TinyLlama (1.1B, paper): out of Singapore, finally finished training after its Sep/2023 start. This model was deliberately overtrained using 2,727 tokens per parameter (see my explanation of Chinchilla data-optimal scaling, and Mosaic scaling later in this edition; a quick sketch of the arithmetic follows this list). The dataset was 1T tokens, and the run covered 3 epochs for 3T total tokens seen.
Tencent LLaMA Pro (8.3B, paper): presents expanded blocks, with fine-tuning (actually ‘a new post-pretraining method’) on 80B tokens using code and math data.
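For readers who like to see the arithmetic, here is a minimal sketch of where that 2,727:1 figure comes from (illustrative only):

```python
# Rough check of TinyLlama's overtraining ratio against the Chinchilla rule of thumb.
params = 1.1e9                # 1.1B parameters
tokens_seen = 3 * 1.0e12      # 1T-token dataset x 3 epochs = 3T tokens seen
print(f"TinyLlama: ~{tokens_seen / params:,.0f} tokens per parameter")  # ~2,727
print("Chinchilla data-optimal rule of thumb: ~20 tokens per parameter")
```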
Exclusive: Counting down to the release of GPT-4.5 (11/Jan/2024)
We’re counting down to the release of OpenAI’s GPT-4.5 model. Will it arrive some time in the second half of January 2024? Rumors are scarce, though I’m hoping to see another increase in intelligence, as measured by MMLU score. The time between model releases can be months or years, but each successive model has boasted a significant performance increase across this wide benchmark.
Read more or download viz: https://lifearchitect.ai/gpt-4-5/
My video version of this viz: https://youtu.be/AdEfU1B4IFk
The Interesting Stuff
OpenAI quietly deletes ban on using ChatGPT for ‘Military and Warfare’ (12/Jan/2024)
OpenAI has removed a prohibition on using its AI technology for military purposes from its usage policy, raising questions about potential military applications and the enforcement of ethical guidelines.
OpenAI appears to be silently weakening its stance against doing business with militaries. “I could imagine that the shift away from ‘military and warfare’ to ‘weapons’ leaves open a space for OpenAI to support operational infrastructures as long as the application doesn’t directly involve weapons development...”
Read more: https://theintercept.com/2024/01/12/open-ai-military-ban-chatgpt/
In The Memo edition 30/Apr/2023 we explored major defense contractors like Palantir using large language models for military applications. The models shown in their AIP military platform include:
EleutherAI GPT-J 6B.
Google FLAN-T5 XL 3B.
EleutherAI GPT-NeoX-20B.
Databricks Dolly 2.0 12B.
If you’re interested in this space, check out The Memo edition 30/Apr/2023 for Palantir AIP (watch my archived video on AIP from Apr/2023), and The Memo edition 9/Jul/2023 for coverage of Scale Donovan.
AI-generated book wins first literary award (16/Oct/2023)
This achievement is from a few months ago, but I’ve finally uncovered a copy of the original work in Chinese.
Land of Memories (Chinese: 机忆之地) is a Chinese science-fiction novel by Shen Yang (沈阳), a professor at Tsinghua University’s School of Journalism and Communication. (wiki)
The model used by Prof Shen Yang is unclear. I don’t think it was Tsinghua’s own GLM-130B, because that model doesn’t do images (HF). It couldn’t have been Baidu’s ERNIE 4.0 1T model, because that hadn’t yet been released. So it may have relied on 360’s Zhinao 100B (link, Chinese) or similar.
It is reported that this is the first time in the history of literature and the first time in the history of AI that AIGC [AI-generated content] works have participated in the competition together with humans and won awards.
Read the book (Chinese, 126 pages) via QQ.
Read the SCMP article source.
See my list of books by AI ending Mar/2022: https://lifearchitect.ai/books-by-ai/
Full subscribers have access to the first books written by GPT-3 and GPT-4.
MosaicML: Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws (31/Dec/2023)
Training is a once-off process in which an AI model is fed a lot of data to recognize patterns and make connections. Large models use a lot of compute during training over many months, but they only have to do so once.
Inference occurs after the model has finished training, every time a user asks the AI model a question (prompt) to produce an output (response).
For inference, popular models might generate trillions (and soon quadrillions) of words for millions of users over the model’s lifetime, using a lot of compute at all times. GPT-3 was generating 4.5 billion words per day back in Mar/2021, and in Dec/2022 I estimated ChatGPT’s output to be around [calcs corrected 16/Jan/2024]:
310 million words per minute;
446.4 billion words per day;
176 trillion words in the 13 months since launch (a quick arithmetic check follows this list).
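Here is the quick arithmetic behind those three figures; the per-minute estimate is the input, and the other two follow from it (assuming an average month of roughly 30.4 days):

```python
# Rough arithmetic behind the ChatGPT output estimates above.
words_per_minute = 310e6                                # estimated 310 million words/minute
words_per_day = words_per_minute * 60 * 24              # ~446.4 billion words/day
days_since_launch = 13 * 30.4                           # ~13 months at ~30.4 days/month
words_since_launch = words_per_day * days_since_launch  # ~1.76e14, i.e. ~176 trillion
print(f"{words_per_day:,.0f} words per day")
print(f"{words_since_launch:,.0f} words since launch")
```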
This month MosaicML released a paper proposing a new scaling law:
Accounting for both training and inference, how does one minimize the cost required to produce and serve a high quality model?
We conduct our analysis both in terms of a compute budget and real-world costs and find that LLM researchers expecting reasonably large inference demand (~1B requests) should train models smaller and longer than Chinchilla-optimal.
With a fixed compute budget (and for no real-world users, because it was locked in a lab!), the Chinchilla paper would train a 70B model for 1.4T tokens (20:1); it also lists a 70B model at 4.26T tokens (61:1).
Mosaic’s proposal would train a popular/inference-intensive 41.6B model for 7,920B tokens (190:1).
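To make the trade-off concrete, here is a minimal back-of-the-envelope sketch. This is not the paper’s cost model: it assumes the common approximations of ~6ND FLOPs for training and ~2N FLOPs per generated token for inference, plus a hypothetical lifetime demand of 100T generated tokens:

```python
# Rough, illustrative comparison of the two recipes above.
# Assumptions (not from the Mosaic paper): training ~ 6 * params * train_tokens FLOPs,
# inference ~ 2 * params FLOPs per generated token, 100T generated tokens over the model's life.

def train_flops(params, train_tokens):
    return 6 * params * train_tokens

def inference_flops(params, generated_tokens):
    return 2 * params * generated_tokens

LIFETIME_TOKENS = 100e12  # assumed lifetime inference demand

recipes = {
    "Chinchilla-style 70B @ 1.4T tokens": (70e9, 1.4e12),
    "Mosaic-style 41.6B @ 7.92T tokens": (41.6e9, 7.92e12),
}

for name, (params, train_tokens) in recipes.items():
    train = train_flops(params, train_tokens)
    serve = inference_flops(params, LIFETIME_TOKENS)
    print(f"{name}: train {train:.2e} FLOPs, serve {serve:.2e} FLOPs, total {train + serve:.2e} FLOPs")
```

Under those assumptions, the smaller, longer-trained model costs over 3x more to train but noticeably less to train and serve combined, which is the paper’s core argument.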
At its most basic, Mosaic is attempting to minimize the cost of serving a model. While we can be certain that OpenAI trained, optimized, and deployed the original ChatGPT (gpt-3.5-turbo) 20B model as efficiently as possible to serve more than 200 million people (most of them free users), the world was not yet accustomed to AI copilots, let alone paying for AI inference.
Consider the major frontier models, and the pricing for inference of one million tokens. This is around 750,000 words, which is about how much we speak every 46 days (2007), or most of the seven Harry Potter books (2017):

Evolutions of the original Kaplan (GPT-3) and Chinchilla scaling laws are expected and welcomed. I’m interested to see how this significant increase in training data—an unspoken conclusion of the Mosaic paper—will develop in the real world.
Read the paper: https://arxiv.org/abs/2401.00448
The largest datasets are listed in my 2023 AI report: https://lifearchitect.ai/the-sky-is-comforting/
Read my ‘plain English’ analysis of scaling laws: https://lifearchitect.ai/chinchilla/
ARK updates their AGI prediction based on Metaculus data (3/Jan/2024)

Artificial general intelligence (AGI) is a machine capable of understanding the world as well as—or better than—any human, in practically every field, including the ability to interact with the world via physical embodiment.
ARK Invest used crowd-sourced data from Metaculus (source with AGI definitions) to visualize AGI date predictions (as of 3/Jan/2024).
The two forecast AGI dates presented are:
~Dec/2026 (if Metaculus forecast error continues).
Sep/2031 (if Metaculus forecast is now well-tuned).
Source: https://twitter.com/wintonARK/status/1742979090725101983
Note that my conservative countdown to AGI remains at 64%, with one of the prediction charts now pointing to Jan/2025.
Read more: https://lifearchitect.ai/agi/
Volkswagen integrates ChatGPT into its vehicles (8/Jan/2024)
Volkswagen announced at CES 2024 the integration of ChatGPT into its vehicles, including the 2024 Tiguan, Passat, and Golf.
Volkswagen will be the first volume manufacturer to offer ChatGPT as a standard feature from the second quarter of 2024 in many production vehicles…
Enabled by Cerence Chat Pro, the integration of ChatGPT into the backend of the Volkswagen voice assistant… can be used to control the infotainment, navigation, and air conditioning, or to answer general knowledge questions… enriching conversations, clearing up questions, interacting in intuitive language, receiving vehicle-specific information, and much more.
Read more via Volkswagen Newsroom.
New version of Siri with generative AI again rumored for WWDC (4/Jan/2024)
Apple is rumored to unveil a new Siri with generative AI at WWDC [June 2024], featuring natural conversation and increased personalization across devices.
Apple has recently made progress with integrating generative AI into Siri using its Ajax-based model…
The new features are believed to be available across devices, suggesting that the new version of Siri will retain conversation information from one device to another. It is also said to feature a new "Apple-specific creational service," which might relate to the previously reported Siri-based Shortcuts capabilities rumored for iOS 18. Apple is purportedly working on linkages for the new version of Siri to connect to various external services, likely via an API.
Read more via MacRumors.
See Apple’s Ajax GPT on the Models Table: https://lifearchitect.ai/models-table/
This is another very long edition. Let’s look at a lot more AI: eight new robot pieces (including a clone of Boston Dynamics’ Spot now available on Amazon for $2,499), the US government buying ChatGPT licenses, and bleeding-edge toys like the new Midjourney styles, a new AI-generated film, and two new chat platforms.
Deloitte rolling out AI copilot to 75,000 of its employees (8/Jan/2024)