To: US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From: Dr Alan D. Thompson <LifeArchitect.ai>
Sent: 5/Jul/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI: 75%
OpenAI CEO (27/Jun/2024):
“There’s tonnes of wonderful things… What are our lives going to be like when it's not just that the computer understands us, gets to know us, and helps us do these things? We can say, 'Hey computer, discover all of physics,' and it can go off and do that. What does it mean when we can say, 'Hey, start and run a great company,' and it can go off and do that? That's a big change.”
July 2024! We’re in the second half of the year already. The first half—as presented in my mid-year report—was spectacular.

There’s been a bit of talk about an ‘AI downturn’. If the media can’t see the immense ‘economic benefits’ already apparent, perhaps it is to be expected from that industry. Some are saying we are in the ‘trough of disillusionment’ for AI, and while I like Gartner, their incessant need to map their ‘hype cycle’ graphic (wiki) to anything and everything is… justified by their business model.
There is no AI hype cycle. Like humanity, AI is much more than a tool, a productivity enhancement, an automation, or even an industry. We can’t map ‘evolution’ or ‘imagination’ to a hype cycle. Gartner and Goldman Sachs may want to stick to the knitting.
Contents
The BIG Stuff (Sonnet details, humor…)
The Interesting Stuff (first major app written with AI, acquisitions, dataset…)
Policy (ChatGPT publishing papers, IATSE…)
Toys to Play With (Free chat, running 1.5B/70B/405B models locally, UBI show…)
Flashback (Roadmap…)
Next (Roundtable…)
The BIG Stuff
Claude 3.5 Sonnet details (Jun/2024)
The Claude 3.5 Sonnet release has been significant, and new emergent properties are still being discovered. We first covered this model within a few hours of launch in The Memo edition 21/Jun/2024.
There is still no official information on Claude 3.5 Sonnet: no paper, no technical note, and no model card. There is a blog post with some benchmarks and pretty pictures, and that’s it.
In 2020, AI labs were proud to release hundred-page academic papers about their models. By 2023, this had shrunk to releasing short ‘technical notes’ or single-page ‘model cards’. Now—apparently—we have to trawl through mainstream media pieces to get just a glimpse of how models were trained.
Michael Gerstenhaber, head of product at Anthropic, was interviewed by two outlets in particular where he provided a little more detail on Claude 3.5 Sonnet. For Wired:
Claude 3.5 Sonnet model is larger than its predecessor but draws much of its new competence from innovations in training. For example, the model was given feedback designed to improve its logical reasoning skills. — 20/Jun/2024
For TechCrunch:
The improvements are the result of architectural tweaks and new training data, including AI-generated data. Which data specifically? Gerstenhaber wouldn’t disclose, but he implied that Claude 3.5 Sonnet draws much of its strength from these training sets. — 20/Jun/2024
I’ve been using the incredible Claude 3.5 Sonnet Artifacts component, and it really is amazing to see in real time (as we explored in the recent roundtable). Take a look at my video (link) and generated web page at https://lifearchitect.ai/distractions
I’ve also taken the time to highlight the full 2,800-word system prompt for Claude 3.5 Sonnet Artifacts. The biggest innovation here is Anthropic’s ‘<antThinking>’ mechanism, which allows Claude 3.5 to privately think and reason step-by-step. This is also known as chain-of-thought (CoT) reasoning.
Anthropic has documented the older <thinking> tag here: https://docs.anthropic.com/en/docs/build-with-claude/tool-use#chain-of-thought
In early July 2024, other researchers flagged the new antThinking hidden mechanism here and here.
Take a look at the Claude 3.5 Sonnet Artifacts system prompt.
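To make the hidden-thinking idea concrete, here is a minimal sketch of how an application could separate tagged reasoning from the text shown to the user. It assumes the reasoning is wrapped in <antThinking>…</antThinking> tags as described above; the function name split_thinking and the sample string are hypothetical, and this is not Anthropic's implementation.

```python
import re

# Assumption: private reasoning is wrapped in <antThinking>...</antThinking>.
THINKING_RE = re.compile(r"<antThinking>(.*?)</antThinking>", re.DOTALL)


def split_thinking(raw: str) -> tuple[list[str], str]:
    """Return (hidden thoughts, user-visible text) from a raw model reply."""
    thoughts = [t.strip() for t in THINKING_RE.findall(raw)]
    visible = THINKING_RE.sub("", raw)
    # Collapse the blank lines left behind where a tag block was removed.
    visible = re.sub(r"\n{3,}", "\n\n", visible).strip()
    return thoughts, visible


# Hypothetical raw reply, for illustration only.
raw_reply = (
    "<antThinking>A web page is a good candidate for an artifact.</antThinking>\n"
    "Here is your page."
)
thoughts, visible = split_thinking(raw_reply)
print(thoughts)
print(visible)
```

The user only ever sees the second value; the first can be logged or discarded.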
How funny is ChatGPT? A comparison of human- and A.I.-produced jokes (3/Jul/2024)
Last year I had a difference of opinion with Prof Jeremy Howard during a private discussion where I told ABC that AI would outperform humans in standup comedy (joke telling). Turns out I was qualitatively and quantitatively correct. A new study systematically tested ChatGPT 3.5’s humor production abilities against human participants. Results showed that ChatGPT 3.5-produced jokes were rated as equally funny or funnier than human-produced jokes, regardless of the comedic task.
ChatGPT outperformed the majority of our human humor producers on each task. ChatGPT 3.5 performed above 73% of human producers on the acronym task, 63% of human producers on the fill-in-the-blank task, and 87% of human producers on the roast joke task.
It is unfortunate that researchers keep using the smaller, lower-quality GPT-3.5 20B model rather than the much larger GPT-4 Classic 1.76T model. They are effectively testing something that is 88 times smaller (and perhaps 88 times worse) than the current state-of-the-art model.

In my time as a human intelligence researcher working alongside Mensa International and the Davidson Academy and many education systems, humor was widely accepted to be a strong indicator of exceptional intelligence (listen to my tribute to Prof Miraca Gross for GE where she talks about this). I await similar testing on GPT-4 (or the current SOTA, Claude 3.5!).
Read the new ChatGPT humor paper via PLOS ONE.
See it on my GPT Achievements Table.
This is another mammoth edition: around 4,000 words, featuring more than 10 new AI toys to play with…
The Interesting Stuff
Baidu ERNIE 4.0 Turbo (28/Jun/2024)
Baidu announced a new model called ERNIE 4.0 Turbo, with no detail besides the name. The company seems to be following OpenAI’s naming scheme and model design quite closely:
May/2020: OpenAI GPT-3 175B
(1½ year gap…)
Dec/2021: Baidu ERNIE 3.0 260B
Mar/2023: OpenAI GPT-4 1.76T
(7 month gap…)
Oct/2023: Baidu ERNIE 4.0 1T
Nov/2023: OpenAI GPT-4 Turbo
(7 month gap…)
Jun/2024: Baidu ERNIE 4.0 Turbo
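The gaps in the timeline above can be checked with a few lines of date arithmetic. This is a rough sketch that uses only the month and year of each release as listed; the helper name months_between is hypothetical.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Release months as listed in the timeline above.
pairs = [
    ("GPT-3 -> ERNIE 3.0", date(2020, 5, 1), date(2021, 12, 1)),
    ("GPT-4 -> ERNIE 4.0", date(2023, 3, 1), date(2023, 10, 1)),
    ("GPT-4 Turbo -> ERNIE 4.0 Turbo", date(2023, 11, 1), date(2024, 6, 1)),
]
gaps = {label: months_between(start, end) for label, start, end in pairs}
for label, gap in gaps.items():
    print(f"{label}: {gap} months")
```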
Baidu also revealed that ERNIE Bot has 300 million users, which would be about 2× more than OpenAI’s ChatGPT.
Read (not very much) more via Reuters.
The ERNIE Bot model playground is available to Chinese citizens with a Chinese mobile phone number here: https://yiyan.baidu.com/
New datasets: DCLM-Pool and DCLM-Baseline (20/Jun/2024)

A team of researchers from 23 labs (including University of Washington, Apple, and Toyota Research) have released the world’s largest dataset, using web data from Common Crawl.
The final dataset is 240 trillion tokens in 1PB (1,000TB or 1,000,000GB) uncompressed.
DCLM-Pool is the largest dataset to date, 8× larger than the previous Oct/2023 SOTA of RedPajama-Data-v2 with 30 trillion tokens in 125TB.
The full DCLM-Pool dataset is nearly useless though, as shown in the graphic above, and has to be filtered down to between 1% and 2% of its size to be useful for model training right now. The resulting dataset is called DCLM-Baseline, and is 4 trillion tokens in about 13,000GB uncompressed (about 1.7% of the tokens and 1.3% of the bytes).
Interestingly, the initial web rip is pretty similar to what we achieved in 2020. GPT-3’s initial CC download was 45TB, versus the DCLM-Pool dataset at 370TB compressed. (From the GPT-3 paper: "The CommonCrawl data was downloaded from 41 shards of monthly CommonCrawl covering 2016 to 2019, constituting 45TB of compressed plaintext before filtering".)
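For anyone who wants to check the dataset arithmetic, here is a short sketch using only the figures quoted above (tokens in trillions, sizes in TB). The variable names are mine.

```python
# Figures as quoted above: tokens in trillions, sizes in terabytes.
dclm_pool_tokens_t = 240
redpajama_v2_tokens_t = 30
dclm_baseline_tokens_t = 4
dclm_pool_tb = 1_000            # 1PB uncompressed
dclm_baseline_tb = 13           # about 13,000GB uncompressed
dclm_pool_compressed_tb = 370
gpt3_cc_compressed_tb = 45

# How much larger DCLM-Pool is than RedPajama-Data-v2, by token count.
pool_vs_redpajama = dclm_pool_tokens_t / redpajama_v2_tokens_t
# Share of DCLM-Pool that survives filtering into DCLM-Baseline.
kept_tokens_pct = 100 * dclm_baseline_tokens_t / dclm_pool_tokens_t
kept_bytes_pct = 100 * dclm_baseline_tb / dclm_pool_tb
# Compressed Common Crawl download: DCLM-Pool versus GPT-3.
download_ratio = dclm_pool_compressed_tb / gpt3_cc_compressed_tb

print(f"DCLM-Pool vs RedPajama-Data-v2: {pool_vs_redpajama:.1f}x")
print(f"Tokens kept after filtering: {kept_tokens_pct:.2f}%")
print(f"Bytes kept after filtering: {kept_bytes_pct:.2f}%")
print(f"Compressed download vs GPT-3: {download_ratio:.1f}x")
```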
Read my paper: ‘What’s in my AI?’: https://lifearchitect.ai/whats-in-my-ai/
Read the DCLM paper: https://arxiv.org/abs/2406.11794
See the project page: https://www.datacomp.ai/dclm/
See DCLM-Pool and DCLM-Baseline on my updated Datasets Table: https://lifearchitect.ai/datasets-table/
LetterDrop: First major app developed entirely with AI (Jun/2024)
Developer Dawei Ma has used GPT-4o to generate a newsletter app linked to Cloudflare:
I used the GPT-4o model to generate the code for LetterDrop. That means the code is generated by the AI model, and I only need to provide the prompts to the model. This approach is very efficient and can save a lot of time. I've also recorded a video to show how to create the LetterDrop project using the GPT-4o model.
That also means you can easily customize the code by changing the prompts. You can find the prompts in the CDDR file.
Take a look: https://github.com/i365dev/LetterDrop
OpenAI acquisition #1: Enterprise data startup ‘Rockset’ (21/Jun/2024)
OpenAI has acquired Rockset, an enterprise analytics startup, to enhance its retrieval infrastructure across various products. The terms of the acquisition were not disclosed, but Rockset has raised $105 million in funding to date. Some members of the Rockset team will join OpenAI, and Rockset will gradually transition its current customers off the platform.
Official announce: https://openai.com/index/openai-acquires-rockset/
Read more via Slashdot.
OpenAI acquisition #2: Remote collaboration platform 'Multi' (24/Jun/2024)
OpenAI has acquired Multi, a New York City-based startup specializing in screenshare and collaboration technologies for Mac users. Multi's team will join OpenAI's ChatGPT desktop team, enhancing capabilities for the ChatGPT for Mac desktop app. The current version of Multi’s software will be sunset on July 24, 2024, with all user data being deleted.
Official announce: https://multi.app/blog/multi-is-joining-openai
Read more via Slashdot.
Swallow this robot: Endiatx’s tiny pill examines your body with cameras, sensors (20/Jun/2024)
Endiatx is pioneering medical technology with its PillBot, a swallowable robotic capsule equipped with cameras and sensors for gastrointestinal examination. The company has raised US$7M to date and is currently in clinical trials, aiming for FDA approval and a commercial launch by early 2026. CEO Torrey Smith envisions AI playing a crucial role in making the technology widely accessible, potentially allowing fully autonomous operation in the future.
Read more via VentureBeat.
Morgan Stanley OpenAI-powered assistant to roll out for wealth advisors (26/Jun/2024)
Morgan Stanley is introducing an AI assistant named Debrief, built using OpenAI's GPT-4, aimed at automating note-taking and email drafting for its wealth advisors. This tool is expected to significantly reduce manual labor, allowing advisors to spend more time engaging with clients. The assistant will be available to approximately 15,000 advisors by early July.
Read more via CNBC.
AI is likely to displace more finance jobs than any other sector, Citi says (19/Jun/2024)
Citigroup reports that artificial intelligence is expected to displace more jobs in the banking industry than any other sector. The study highlights that approximately 54% of banking jobs have a high potential for automation, while an additional 12% could be augmented by AI.
Read more via Bloomberg.
Powering AI-driven research with Argonne's Aurora exascale supercomputer (31/May/2024)
This video highlights how Argonne National Laboratory’s Aurora exascale supercomputer is revolutionizing AI-driven research. Aurora’s immense computational power enables researchers to tackle complex problems in various fields, from climate modeling to drug discovery, by processing vast amounts of data with unprecedented speed and accuracy.
Find out how 60,000 GPUs are being used to train the 1T-parameter AuroraGPT, also known as ScienceGPT.
Watch the video: https://youtu.be/djEzdORj0F0
See it on the Models Table: https://lifearchitect.ai/models-table/
Figure status update - BMW full use case (1/Jul/2024)
This YouTube video provides a comprehensive update on the implementation of Figure's technology at BMW. The video showcases how BMW has integrated Figure's solutions to optimize various aspects of their operations, enhancing efficiency and productivity in their automotive manufacturing processes.
Watch the video (link):
Skeleton key, a new type of generative AI jailbreak technique (26/Jun/2024)
In generative AI, jailbreaks like Skeleton Key are malicious user inputs that attempt to circumvent an AI model’s intended behavior. This technique uses a multi-turn strategy to cause a model to ignore its guardrails, allowing the user to execute ordinarily forbidden actions. Microsoft has addressed this issue in Azure AI-managed models and shared findings with other AI providers.
Read more via Microsoft Security Blog.
Read more via Perplexity.
Exclusive: How Shake Shack used Reddit and AI to drive sales (1/Jul/2024)
Shake Shack utilized a generative AI chatbot [via LLM], The Big Letbotski, to analyze over 80,000 active subreddits for relevant conversations about chicken sandwiches. The campaign focused on phrases like ‘healthy lunch on Sunday’ and ‘gourmet fast food,’ leading to targeted ads across 30 subreddits and resulting in over 13,000 ad clicks, surpassing expectations by 31%.
Read more via Adweek.
The owner of Toys ‘R’ Us just used OpenAI’s Sora to animate the zombie brand (25/Jun/2024)
WHP Global, the owner of Toys ‘R’ Us, has created the first-ever brand film using OpenAI’s text-to-video tool, Sora. This partially AI-generated video premiered at the 2024 Cannes Lions Festival and is available on toysrus.com. The creative agency Native Foreign, which produced the video, mentioned that Sora completed about 80-85% of the work, with the rest being handled by human corrective VFX.
Read more via The Verge.
Watch the video (link):
Policy
The word ‘delve’ and how cheap, outsourced labour in Africa is shaping AI English (Apr/2024 and Jun/2024)
For many years now I’ve spoken about how fine-tuning on human preferences is a fool’s errand. Known as RLHF—reinforcement learning from human feedback—using humans to do AI’s work results in some horrible issues. You can read my thoughts here:
https://lifearchitect.ai/alignment/
Back in April, the Guardian jumped on a finding by Aussie researcher Prof Jeremy Nguyen (Tweet) who asked:
Are medical studies being written with ChatGPT? Well, we all know ChatGPT overuses the word "delve". Look below at how often the word 'delve' is used in papers on PubMed (2023 was the first full year of ChatGPT).
The Guardian investigated exploitation of African workers who are paid minimal wages to assist in the creation of chatbots, resulting in their language patterns being mirrored by AI systems. This has led to the emergence of ‘AI-ese,’ a distinct writing style used by AI assistants.
[The word] “delve” was overused by ChatGPT compared to the internet at large. But there’s one part of the internet where “delve” is a much more common word: the African web. In Nigeria, “delve” is much more frequently used in business English than it is in England or the US. So the workers training their systems provided examples of input and output that used the same language, eventually ending up with an AI system that writes slightly like an African.
And that’s the final indignity. If AI-ese sounds like African English, then African English sounds like AI-ese… how much worse will it get when a significant chunk of humanity sounds like the AI systems they were paid to train?
Read more via The Guardian.
Now, researchers in Germany have analyzed the language change rigorously.

We study vocabulary changes in 14 million PubMed abstracts from 2010-2024, and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words. Our analysis based on excess words usage suggests that at least 10% of 2024 abstracts were processed with LLMs. This lower bound differed across disciplines, countries, and journals, and was as high as 30% for some PubMed sub-corpora. We show that the appearance of LLM-based writing assistants has had an unprecedented impact in the scientific literature, surpassing the effect of major world events such as the Covid pandemic.
Read the paper: ‘Delving into ChatGPT usage in academic writing through excess vocabulary’: https://arxiv.org/abs/2406.07016
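The excess-vocabulary idea can be illustrated with a toy calculation: compare how often a word appears in abstracts before and after LLM writing assistants arrived. This is a simplified sketch, not the paper's exact method, and the abstracts below are made-up placeholders rather than real PubMed data. The function names are hypothetical.

```python
import re


def doc_frequency(abstracts: list[str], word: str) -> float:
    """Fraction of abstracts that contain `word` at least once."""
    if not abstracts:
        return 0.0
    target = word.lower()
    hits = sum(target in re.findall(r"[a-z]+", a.lower()) for a in abstracts)
    return hits / len(abstracts)


def excess_usage(baseline: list[str], current: list[str], word: str) -> tuple[float, float]:
    """Return (frequency gap, frequency ratio) of current versus baseline."""
    q = doc_frequency(baseline, word)   # expected frequency
    p = doc_frequency(current, word)    # observed frequency
    ratio = p / q if q > 0 else float("inf")
    return p - q, ratio


# Made-up toy abstracts, for illustration only.
baseline = [
    "We investigate the role of sleep in memory.",
    "This study examines insulin response in mice.",
    "We delve into the genetics of rare tumours.",
    "Results show reduced inflammation after treatment.",
]
current = [
    "We delve into sleep and memory consolidation.",
    "Here we delve into insulin signalling.",
    "We delve deeper into tumour genetics.",
    "Results show reduced inflammation after treatment.",
]
gap, ratio = excess_usage(baseline, current, "delve")
print(f"gap={gap:.2f}, ratio={ratio:.1f}")
```

A large gap or ratio for a style word, repeated across many words, is the kind of signal used to estimate a lower bound on LLM-processed text.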
IATSE agreement clears way to use artificial intelligence as a tool (30/Jun/2024)
The International Alliance of Theatrical Stage Employees (IATSE) is a labor union founded in 1893 representing over 168,000 behind-the-scenes workers (wiki).
I had a little to do with the organization in a previous life while working as Head of Sound for big theatre and events through Asia Pacific, with some projects in New York. And yes, IATSE can be pronounced ‘Yahtzee’ or ‘Eye-Yahtzee’!
Under the agreement struck last week between the major studios and the union representing crew, artificial intelligence can be used as a tool, with some limitations. The deal provides that workers may ask their employers for a ‘consultation’ about AI use, that a committee will be set up to offer AI skills training, and that AI use cannot be outsourced to non-union labor.
The Memorandum of Agreement is effective from 1 August 2024. Specific provisions related to the use of AI systems have been established, including employee protections and the formation of a committee to develop AI skills training programs.
The role of AI was a major theme of last year’s strikes by the Writers Guild of America and SAG-AFTRA. In the end, both unions got deals that give creators control over how they use AI — within company policies — and guarantees that AI use will be compensated.
Read analysis by Variety.
Download the report (PDF, 7 pages): https://iatse.net/wp-content/uploads/2024/06/2024-SUMMARY-OF-BASIC-AGREEMENT-NEGOTIATIONS_6.28.24-FINAL.pdf
Toys to Play With
From bare metal to a 70B model: infrastructure set-up and scripts (25/Jun/2024)
In a few months, a small team of researchers and engineers from Imbue trained a 70B parameter model on their own infrastructure, outperforming zero-shot GPT-4 on reasoning tasks. This end-to-end guide details the challenges and solutions in setting up the infrastructure, from initial cluster setup to error recovery. The team also released several infrastructure scripts to assist other teams in stabilising their model training environments.
Read more via Imbue.
Chrome running Gemini locally (24/Jun/2024)
Chrome Canary now runs a nano version of Gemini locally in the browser. I expect this to be expanded and released to the public edition of Chrome soon.
See more: https://x.com/mortenjust/status/1805190952358650251
Hugging Face Engineer Matthew Carrigan explains how to run Llama 3 405B locally (21/Jun/2024)
At some point this summer [US summer is around July, August, September], Meta AI will be releasing a LLaMA-3 model with 400B parameters. It will likely be the strongest open-source LLM ever released by a wide margin. This thread discusses how to run it locally, including hardware requirements and cost considerations.
Read more via Thread Reader App or source via X.
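As a back-of-envelope guide to the hardware question, here is a rough sketch of the memory needed just to hold the weights of a 405B-parameter model at different precisions. The precisions shown are my assumptions for illustration and are not taken from the thread. The estimate ignores the KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    return params_billion * bytes_per_param


# Assumption: 405B parameters, stored at 16-bit, 8-bit, or 4-bit precision.
estimates = {bits: weight_memory_gb(405, bits) for bits in (16, 8, 4)}
for bits, gb in estimates.items():
    print(f"{bits}-bit weights: about {gb:.1f} GB")
```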
llama.ttf (Jun/2024)
llama.ttf is both a font file and a large language model (LLM) inference engine. Utilizing the HarfBuzz font shaping engine, llama.ttf can execute arbitrary code, allowing it to function as an LLM within any application. This unique setup enables local text generation without waiting for vendor updates, essentially embedding AI capabilities directly into your text editor or email client.
Read more via llama.ttf.
Testing generative AI for circuit board design (21/Jun/2024)
A software team tested LLMs to figure out how helpful they are for designing a circuit board. They looked at the usability of frontier models (GPT-4o, Claude 3 Opus, Gemini 1.5) across a set of design tasks, to find where they are and are not useful.
Read more via Jitx Corporate Blog.
Synthesia’s hyperrealistic deepfakes will soon have full bodies (24/Jun/2024)
Startup Synthesia is advancing its AI-generated avatars, enabling them to have full-body movements and more expressive gestures. These new avatars, which can perform actions like singing and walking, aim to launch by the end of the year. While the technology still has minor imperfections, such as hand movements occasionally overlapping, it significantly enhances the realism of digital avatars, raising both opportunities and ethical concerns.
Read more via MIT Technology Review.
Take a moment of silence for Leta, based on Synthesia v1: https://lifearchitect.ai/leta/
Her most popular episode was Episode 30 with ~250,000 views: https://youtu.be/zJDx-y2tPFY
LiveBench benchmark by Abacus, NYU, NVIDIA, UMD, USC (26/Jun/2024)
LiveBench is a benchmark for LLMs designed with test set contamination and objective evaluation in mind. It is not publicly available; instead, you submit your model and the evaluators run private testing.
LiveBench has the following properties:
LiveBench is designed to limit potential contamination by releasing new questions monthly, as well as having questions based on recently-released datasets, arXiv papers, news articles, and IMDb movie synopses.
Each question has verifiable, objective ground-truth answers, allowing hard questions to be scored accurately and automatically, without the use of an LLM judge.
LiveBench currently contains a set of 17 diverse tasks across 6 categories, and will release new, harder tasks over time.
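Scoring against objective ground truth, without an LLM judge, can be as simple as a normalized exact-match comparison. The sketch below is my own illustration of that idea and is not LiveBench's scoring code; the function names, question IDs, and answers are hypothetical.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't count."""
    return " ".join(answer.strip().lower().split())


def accuracy(predictions: dict[str, str], ground_truth: dict[str, str]) -> float:
    """Fraction of questions whose prediction exactly matches the ground truth."""
    if not ground_truth:
        return 0.0
    correct = sum(
        normalize(predictions.get(qid, "")) == normalize(truth)
        for qid, truth in ground_truth.items()
    )
    return correct / len(ground_truth)


# Hypothetical questions and answers, for illustration only.
ground_truth = {"q1": "42", "q2": "Paris", "q3": "x = 3"}
predictions = {"q1": " 42 ", "q2": "paris", "q3": "x = 4"}
result = accuracy(predictions, ground_truth)
print(f"accuracy = {result:.3f}")
```

Because the comparison is deterministic, the same submission always gets the same score, which is the point of avoiding a model-based judge.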
Read more via LiveBench.
LibreChat (Jun/2024)
LibreChat is an open-source chat platform designed to provide flexible and customizable communication solutions. The platform allows users to create new chat rooms, manage conversations, and integrate various plugins for enhanced functionality. LibreChat emphasizes privacy and data security, making it a robust choice for both personal and professional use.
Read more via Slashdot.
Try it (free, login): https://librechat-librechat.hf.space/c/new
Slack AI (Feb/2024)
Anthropic Head of Sales and Partnerships, Kate Earle Jensen:
‘Our team loves how quickly they can find answers with Slack AI, which translates to faster decision-making and a greater focus on work that really drives an impact.’
Slack’s new generative AI features harness institutional knowledge so employees can get up to speed instantly. An internal analysis during the pilot found that customers such as SpotOn, Uber, and Anthropic could save an average of 97 minutes per user each week using Slack AI to find answers, distil knowledge and spark ideas.
Read the announce: https://slack.com/intl/en-au/blog/news/slack-ai-has-arrived
Take a look: https://slack.com/intl/en-au/features/ai
Artifacts-like Chrome extension for ChatGPT (1/Jul/2024)
The ChatGPT Code Preview Extension enhances the coding experience by allowing users to preview and interact with code snippets directly within the ChatGPT interface. Key features include live code preview, syntax highlighting, and the ability to copy or download code snippets.
View the repo: https://github.com/ykyritsis/ChatGPT-code-preview
This new Fox show is about a guy in a universal basic income program (29/Jun/2024)
Fox will release ‘Universal Basic Guys’, a new animated series, this fall [US fall is around October, November, December], which satirizes universal basic income. The show features two brothers who join a $3,000-a-month basic income program after their factory job is automated. The series explores themes around job loss due to AI and the societal impacts of universal basic income programs.
Read more via Business Insider.
Flashback
Two years ago, I published a paper called ‘Roadmap: AI’s next big steps in the world (AI that matters, as it happens, in plain English)’. It still seems relevant, though we’re further along the timeline of progress.
You are here.
You live in 2022.
You have a front row seat to the most exciting period in human history.
And perhaps most importantly, for some reason, you have unparalleled access to the inner workings of what’s going on. AI labs are extraordinarily and astonishingly open about their progress. You can read the AI papers at no charge, as they are released. You can play with many of the AI models, often for free. You are living in the future. And AI’s next big steps in the world are going to be groundbreaking.
Read it: https://lifearchitect.ai/roadmap/
Watch my video (link):
Next
There are many models in the pipeline that have finished training in the last few weeks but have not been fully released:
Grok 2 (text generation)
GPT-4o (image generation and voice components)
Imagen 3 (image generation)
Claude 3.5 Opus (text generation)
Sora (video generation)
GPT-5 (text generation)
and many more…
The next roundtable will be:
Life Architect - The Memo - Roundtable #14
Follows the Chatham House Rule (no recording, no outside discussion)
Saturday 13/Jul/2024 at 5PM Los Angeles
Saturday 13/Jul/2024 at 8PM New York
Sunday 14/Jul/2024 at 10AM Brisbane (new primary/reference time zone)
or check your timezone via Google.
You don’t need to do anything for this; there’s no registration or forms to fill in, I don’t want your email, you don’t even need to turn on your camera or give your real name!
All my very best,
Alan
LifeArchitect.ai