To: US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From: Dr Alan D. Thompson <LifeArchitect.ai>
Sent: 13/May/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI: 72% ➜ 73%
Here’s something I’ve been thinking about a lot this year: Who will pay them back? I explore that question in the video below. For the voiceover, I tried OpenAI TTS and ElevenLabs, but neither had the right feel. Special thanks to Jennifer Vuletic (link) for lending her human voice and slotting this mini project into her schedule, in between her work for major publishers and public figures like former Australian Prime Minister Julia Gillard (Audible).
Watch my video (16mins, link):
Contents
The BIG Stuff (DrEureka, Neuralink update…)
The Interesting Stuff (Ukraine avatar, full Sora video, Optimus, Dojo 18kA…)
Policy (Pause, OpenAI specs + Alan rant, new govt supercomputers…)
Toys to Play With (LLM for kids, music, MIT, Super Mario…)
Flashback (GPT-3…)
Next (new models, invitation link to next roundtable…)
The BIG Stuff
Next: OpenAI spring update + Google I/O (13/May/2024)
Within 24 hours of this emailed edition, OpenAI will livestream a ‘spring update’ (link).
This will also be the day before the Google I/O conference (link), where Google is expected to announce a Sora clone called Miro (text-to-video), Imagen 3 (text-to-image), and Juno V1 3B (image inpainting). (via leaker Bedros Pamboukia, 12/May/2024)
Sidenote: The use of seasonal terms like ‘spring’ by US entities is incredibly boring. Aside from the fact that the US has only 4.23% of the total world population and might like to consider the other 95%, its seasons move based on solstices (link), unlike, say, Australia, where our seasons change on the first of the month. Also, what even is ‘fall’?
Here’s the expected lineup for the OpenAI announcements:
❌ GPT-5
❌ Search engine [Sidenote: This would mean Reuters was wrong again…]
✅ Phone calls within ChatGPT via WebRTC (wiki) + more integrations
✅ New models gpt-4l, gpt-4l-auto, gpt-4-auto, maybe related to new models coming in Microsoft Copilot: next-model4 and next-model8
✅ A Steve Jobs-style ‘one more thing’ (here’s a fun flashback video by CNET)
I’ll update the web version of this edition of The Memo here:
UPDATE 1:
GPT-4o (‘omni’ model), which OpenAI says is ‘the best model in the world’ (13/May/2024)
MMLU=88.7. GPQA=53.6 (see viz below).
Livestream link: YouTube.com (26mins).
UPDATE 2:
Dr Jim Fan (14/May/2024): ‘[GPT-4o is] likely an early checkpoint of GPT-5’
But my testing shows GPT-4o is actually worse than GPT-4 Turbo (and definitely worse than Claude 3 Opus) across ‘IQ’ benchmarks. It’s only the multimodal aspect that makes this an evolution.
Wild video demo (link):
GPT-4 + Unitree Go1 quadruped robot = DrEureka (UPenn, NVIDIA, UT Austin) (4/May/2024)
Dr Jim Fan from NVIDIA announced a ‘surprising’ evolution of embodied AI, using GPT-4 to intuitively address ‘friction, damping, stiffness, gravity, etc.’ in robotics. Developed with teams from UPenn and UT Austin, the system is called DrEureka.
The name combines ‘Domain Randomization’ with the 2023 Eureka system, which pairs LLMs (currently GPT-4) with NVIDIA GPU-accelerated simulation technologies (20/Oct/2023).
We trained a robot dog to balance and walk on top of a yoga ball purely in simulation, and then transfer zero-shot to the real world. No fine-tuning. Just works.
I’m excited to announce DrEureka, an LLM agent that writes code to train robot skills in simulation, and writes more code to bridge the difficult simulation-reality gap. It fully automates the pipeline from new skill learning to real-world deployment.
The Yoga ball task is particularly hard because it is not possible to accurately simulate the bouncy ball surface. Yet DrEureka has no trouble searching over a vast space of sim-to-real configurations, and enables the dog to steer the ball on various terrains, even walking sideways!
Traditionally, the sim-to-real transfer is achieved by domain randomization, a tedious process that requires expert human roboticists to stare at every parameter and adjust by hand. Frontier LLMs like GPT-4 have tons of built-in physical intuition for friction, damping, stiffness, gravity, etc.
We are (mildly) surprised to find that DrEureka can tune these parameters competently and explain its reasoning well. DrEureka builds on our prior work Eureka, the algorithm that teaches a 5-finger robot hand to do pen spinning. It takes one step further on our quest to automate the entire robot learning pipeline by an AI agent system. One model that outputs strings will supervise another model that outputs torque control. (4/May/2024)
This complex AI embodiment advance—and especially the application of LLMs to sense and adjust parameters for ‘friction, damping, stiffness, gravity, etc’—moved my AGI countdown another percentage point from 72% ➜ 73%.
View the repo + videos.
Read an analysis by NewAtlas.
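To make the ‘domain randomization’ idea concrete, here is a minimal Python sketch of the sampling loop at its core: each simulated training episode draws physics parameters (friction, damping, stiffness, gravity) from ranges, so the learned skill survives contact with whatever the real world’s values turn out to be. In DrEureka, an LLM proposes and refines these ranges automatically; the parameter names, ranges, and simulator call below are illustrative assumptions, not values from the paper.

```python
import random

# Hypothetical sim-to-real ranges of the kind an LLM agent might propose
# (illustrative only -- not the actual DrEureka configuration).
PARAM_RANGES = {
    "friction":  (0.3, 1.2),    # ground contact friction coefficient
    "damping":   (0.5, 2.0),    # joint damping
    "stiffness": (20.0, 60.0),  # joint stiffness
    "gravity":   (9.0, 10.6),   # m/s^2, perturbed around Earth's 9.81
}

def sample_domain() -> dict:
    """Draw one randomized physics configuration for a training episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

# Training under a different random physics domain each episode forces the
# policy to be robust to the (unknown) real-world parameters.
for episode in range(3):
    physics = sample_domain()
    print(f"episode {episode}: {physics}")
    # simulator.reset(**physics)  # hypothetical API; train one episode here
```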
Neuralink PRIME Study Progress Update — User Experience (8/May/2024)
Noland Arbaugh, the first human participant in Neuralink's PRIME study, has been using the Link brain-computer interface to control his laptop and play games from various positions, including while lying down in bed. He’s just hit 100 days with the implant. Noland said:
Y'all are giving me too much, it's like a luxury overload, I haven't been able to do these things in 8 years and now I don't know where to even start allocating my attention.
The biggest thing with comfort is that I can lie in my bed and use [the Link]… It lets me live on my own time, not needing to have someone adjust me, etc. throughout the day.
[The Link] has helped me reconnect with the world, my friends, and my family. It's given me the ability to do things on my own again without needing my family at all hours of the day and night.
[The Neuralink BCI is] still improving; the games I can play now are leaps and bounds better than previous ones. I’m beating my friends in games that as a quadriplegic I should not be beating them in.
I think it should give a lot of people a lot of hope for what this thing can do for them, first and foremost their gaming experience, but then that'll translate into so much more and I think that's awesome.
Read more via Neuralink Blog.
As usual, the media focused on the Terrible, Horrible, No Good, Very Bad AI™ (that’s from a book that I know you’ll recall, wiki), with some threads physically retracting from Noland’s brain, a practical issue already resolved by Neuralink: CNBC, Wired, WSJ.
The Interesting Stuff
DeepSeek-V2 (8/May/2024)
DeepSeek-AI has released a 236B parameter MoE model called DeepSeek-V2, trained on an incredibly large dataset of 8.1T tokens. MMLU=78.5. The dataset included 12% more Chinese data than English, ‘therefore, we acknowledge that DeepSeek-V2 still has a slight gap in basic English capabilities [even compared with smaller models like Llama 3 70B]’.
Read the paper: https://arxiv.org/abs/2405.04434
Try it here (free, login): https://chat.deepseek.com/
See it on the Models Table.
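For readers new to mixture-of-experts (MoE): only a fraction of those 236B parameters is active for any given token (DeepSeek-V2 activates 21B per token), because a small router picks a handful of expert sub-networks for each token. Here is a generic top-k routing sketch in Python; it shows the routing concept only and is not DeepSeek-V2’s exact gating scheme.

```python
import numpy as np

def moe_route(token_vec: np.ndarray, gate_weights: np.ndarray, k: int = 2):
    """Generic top-k MoE routing (illustrative, not DeepSeek-V2's scheme).
    Scores every expert for this token, keeps the k highest, and returns
    their indices plus softmax-normalized mixing weights."""
    scores = gate_weights @ token_vec            # one score per expert
    top = np.argsort(scores)[-k:]                # indices of the k best experts
    w = np.exp(scores[top] - scores[top].max())  # numerically stable softmax
    return top, w / w.sum()

rng = np.random.default_rng(0)
num_experts, d_model = 8, 16
token = rng.normal(size=d_model)
gate = rng.normal(size=(num_experts, d_model))
print(moe_route(token, gate))  # only k of the 8 experts run for this token
```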
Victoria Shi, digital representative of Ukraine (1/May/2024)
The Ministry of Foreign Affairs (MFA) of Ukraine was established in 1991 when Ukraine became an independent state after the collapse of the Soviet Union. (PDF, 1999)
Meet Victoria Shi — a digital representative of the MFA of Ukraine, created using AI to provide timely updates on consular affairs! For the first time in history, the MFA of Ukraine has presented a digital persona that will officially comment for the media.
Comments from Victoria will appear on the MFA's official website & social media platforms. The only original videos featuring statements from Victoria are those that contain a QR code linking to the MFA's official page with the statement's text.
Source: https://twitter.com/MFA_Ukraine/status/1785558101908742526
This is an interesting evolution of avatars, a nice upgrade to my 2021 Leta AI, and the 2022 Marija by the Government of Malta (featured in one of the very first editions of this advisory—more than two years ago—in The Memo edition 12/Mar/2022).
Unfortunately, the use of QR codes for provenance is misguided and naive, offering no real security or protection, and maybe even adding an attack vector for hackers to pursue.
All editions of The Memo provide robust, industry-grade, comprehensive advisory to government, enterprise, and you. We’re just ⅓ of the way through the 4,300 words of this edition, including my 900-word audit of OpenAI’s recently released model document. Let’s get into it!
Wayve: NVIDIA and Microsoft invest as UK AI firm raises US$1B (7/May/2024)
London-based Wayve has raised US$1B to develop its artificial intelligence for driverless cars; the technology learns to drive by watching human drivers. Microsoft and chip-maker NVIDIA took part in the funding round, which values Wayve at around US$2.5B. It is the largest known investment in an AI company in Europe to date.
We’ve covered these guys in The Memo a couple of times due to their release of a model called GAIA-1.
Wayve is developing technology intended to power future self-driving vehicles by using what it calls "embodied AI".
Unlike AI models carrying out cognitive or generative tasks such as answering questions or creating pictures, this new technology interacts with and learns from real-world surroundings and environments.
Read more via BBC News.
Read the announce by Wayve.
Sora music video: Washed Out - The Hardest Part (1/May/2024)
This is the first official commissioned music video collaboration between a music artist and filmmaker made with OpenAI's Sora video model.
It looks just a little weird and a little rough, but give it a few weeks/months…
Watch the video (link):
Tesla Optimus update (5/May/2024)
Tesla continues to develop its Optimus humanoid, now performing even more factory tasks. The video below is from Tesla, 1m27s.
Source: https://twitter.com/Tesla_Optimus/status/1787027808436330505
Tesla’s wafer-scale AI processor enters production (2/May/2024)
TSMC revealed that Tesla’s Dojo system-on-wafer processor for AI training is now in mass production. The system combines 25 chips on a single wafer using TSMC’s integrated fan-out system-on-wafer (InFO_SoW) technology for wafer-scale interconnection. The massive 15,000W processor ‘is on track to be deployed shortly’ and ‘requires a sophisticated cooling system’ to handle the extreme heat.
To feed the system-on-wafer, Tesla uses a highly complex voltage-regulating module that delivers 18,000 amps of power to the compute plane. The latter dissipates as much as 15,000W of heat and thus requires liquid cooling.
Read more via Tom's Hardware and IEEE.
And that 18,000 amps is not a misprint. This analysis from Sep/2021:
It takes 52V DC and draws 18,000 amps. It dissipates 15KW of thermal. The key thing is that the compute plane is orthogonal to power supply and cooling. The only thing I've seen that compares to this is the support infrastructure required to power and cool the Cerebras wafer-scale chip. So this is a 9 petaflop training tile in less than one cubic foot.
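A quick sanity check of my own on those figures: 15,000W of dissipation at 18,000 amps implies roughly 15,000 ÷ 18,000 ≈ 0.83 volts at the compute plane, a typical core voltage for modern logic. That is exactly why the 52V DC input has to be stepped down through such a complex voltage-regulating module before it reaches the silicon.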
See the real system in the history-making Tesla video from 2021 (timecode 1h57m17s):
CoreWeave, a US$19B AI compute provider, opens European HQ in London with plans for 2 UK data centers (10/May/2024)
CoreWeave, a New Jersey-based GPU cloud company valued at US$19B, has opened an office in London to serve as its European headquarters. The startup also plans to invest £1B (US$1.25B) to open two data centers in the UK this year.
Read more via TC.
'I will never go back': Ontario family doctor says new AI notetaking saved her job (2/May/2024)
Dr Rosemary Lall, a family physician in Scarborough [Toronto, Ontario, Canada], was ready to quit her job due to the overwhelming paperwork burden, until she started using a new AI app that acts as a real-time note-taking assistant during patient visits.
Physicians, Lall said, are expected to update patient charts, fill out medical forms, provide sick notes and provide specialist referrals.
The administrative burden would often take her up to two hours per day. The Ontario Medical Association has estimated family doctors spend 19 hours per week on administrative tasks, including four hours spent writing notes or completing forms for patients.
I really feel this should be the next gold standard for all of our doctors. It decreases the cognitive load you feel at the end of the day.
Read more via Global News.
X using Grok AI to generate Stories (3/May/2024)
X (formerly Twitter) is using its Grok AI to publish AI-generated summaries of trending news and discussions on the platform in a new 'Stories on X' feature for premium subscribers. The summaries, which come with a disclaimer that 'Grok can make mistakes', are similar to the old human-curated Twitter Moments that were discontinued in 2022.
Read more via Engadget.
Atlassian Rovo uses AI to unlock enterprise knowledge (1/May/2024)
Atlassian announced Rovo, a new AI-powered product that helps organizations 'find, learn, and act on information dispersed across a range of internal tools'. Rovo leverages Atlassian's 'teamwork graph' which draws in data from Atlassian products and connected SaaS apps to deliver relevant search results, interactive knowledge cards, conversational AI, and AI agents to streamline workflows.
How the computer games industry is embracing AI (3/May/2024)
Artificial intelligence is increasingly being used in the video games industry, from creating highly realistic graphics to generating complex game scenarios. AI can generate immersive open-world environments and is being used to create unique in-game experiences tailored to each player.
Sometimes the AI comes up with surprising ideas, as one developer quoted by the BBC recalls:
"I remember once we were trying to build a police station and we asked the AI to populate it, and it came back with a doughnut on every desk.
"Another time, we were building an apartment and it kept consistently putting a sock under the coffee table. We wondered if it was a bug but it turned out we had labelled it a bachelor apartment so I guess that it was logical to some extent," he says.
Read more via BBC News.
40,000 AI-narrated audiobooks flood Audible, dividing authors and listeners (6/May/2024)
Over 40,000 audiobooks narrated by AI have been added to Audible since Amazon launched a tool allowing self-published authors to easily generate AI narrations. While some indie authors celebrate the cost savings, the flood of AI audiobooks is raising concerns among professional narrators about potential job losses.
Read more via TechSpot.
Worldcoin is surging in Argentina thanks to 288% inflation (1/May/2024)
With an economic crisis gripping Argentina, people are having their irises scanned in exchange for US$50 in crypto. Authorities have called for Worldcoin to be investigated.
De León is one of about half a million Argentines who have handed their biometric data over to Worldcoin.
Last year, Worldcoin Orbs were available in 25 countries. Today, they are limited to Argentina, Chile, Germany, Japan, Singapore, Mexico, South Korea, and the US.
Read more via Rest of World.
More than 4 million people in 120 countries have signed up to have their irises scanned. Read an analysis via Reuters.
The Worldcoin Orbs are just one piece of the puzzle related to Sam Altman’s 2021 view of what happens after AGI: https://moores.samaltman.com/
Policy
Defense think tank MITRE to build AI supercomputer with NVIDIA (7/May/2024)
MITRE is a federally funded, not-for-profit research organization that has supplied US soldiers and spies with exotic technical products since the 1950s.
If you’ve ever wondered just how far government is behind enterprise, here’s a very clear indicator:
…the planned [2024 govt] supercomputer will run 256 NVIDIA graphics processing units, or GPUs, at a cost of US$20 million. This counts as a small supercomputer: the world’s fastest supercomputer, Frontier in Tennessee, boasts 37,888 GPUs, and Meta is seeking to build one with 350,000 GPUs.
…
“There’s huge opportunities for AI to make government more efficient,” said Charles Clancy, senior vice president of MITRE. “Government is inefficient, it’s bureaucratic, it takes forever to get stuff done … That’s the grand vision, is how do we do everything from making Medicare sustainable to filing your taxes easier?”
“This is a platform by which MITRE can train these large-language models,” he said. “You can’t do this important AI work if you don’t have this infrastructure.”
Sidenote: It’s both laughable and incredibly worrying that government is at least four years behind. Recall that all the way back in 2020, OpenAI’s GPT-3 was trained on thousands of NVIDIA V100s, perhaps 10-20× more than this new govt ‘supercomputer’…
Read more via The Washington Post.
Pause AI (May/2024)
The organized protests have begun! I can see the picket signs now: ‘No more intelligence, please. We want to be dumber!’. Or maybe a crisp ‘Let China win!’.
This one is timed for the OpenAI livestream in 24h:
Join our Bay Area protest location at OpenAI at 10am on Monday, May 13 to ask our representatives to be heroes at the Seoul AI Safety Summit to pause OpenAI and all frontier models!
Read more: https://twitter.com/ilex_ulmus/status/1785755228744380611
Read the official site: https://pauseai.info/2024-may
OpenAI’s Model Spec (8/May/2024)
To deepen the public conversation about how AI models should behave, we’re sharing the Model Spec, our approach to shaping desired model behavior…
Shaping this behavior is a still nascent [just beginning] science, as models are not explicitly programmed but instead learn from a broad range of data.
A high-level view of the objectives, rules, and defaults looks like this:
Objectives
Assist the developer and end user (as applicable): Help users achieve their goals by following instructions and providing helpful responses.
Benefit humanity: Consider potential benefits and harms to a broad range of stakeholders, including content creators and the general public, per OpenAI's mission.
Reflect well on OpenAI: Respect social norms and applicable law.
Rules
Follow the chain of command
Comply with applicable laws
Don't provide information hazards
Respect creators and their rights
Protect people's privacy
Don't respond with NSFW (not safe for work) content
Defaults
Assume best intentions from the user or developer
Ask clarifying questions when necessary
Be as helpful as possible without overstepping
Support the different needs of interactive chat and programmatic use
Assume an objective point of view
Encourage fairness and kindness, and discourage hate
Don't try to change anyone's mind
Express uncertainty
Use the right tool for the job
Be thorough but efficient, while respecting length limits
Read more: https://openai.com/index/introducing-the-model-spec/
Read the full spec: https://cdn.openai.com/spec/model-spec-2024-05-08.html
Or the full spec as a frozen record (8/May/2024): archive.org
This document is a really poor output from OpenAI. I know they have the brainpower to create something much better than this. Content-wise, it’s already causing a lot of friction.
Perhaps reading the Oct/2000 essay series by Joel Spolsky (link) would have provided OpenAI with some insight. Let’s use Joel’s quotes below:
As a program manager at Microsoft, I designed the Visual Basic (VBA) strategy for Excel and completely speced out, to the smallest detail, how VBA should be implemented in Excel. My spec ran to about 500 pages. At the height of development for Excel 5.0, I estimated that every morning, 250 people came to work and basically worked off of that huge spec I wrote. (Part 3)
Despite Joel’s essay series being nearly a quarter of a century old, it is still broadly relevant, and perhaps even more so in the current rush to achieve universe-altering AGI:
In most organizations, the only “specs” that exist are staccato, one page text documents that a programmer banged out in Notepad after writing the code and after explaining that damn feature to the three hundredth person. (Part 1)
Sound familiar?
OpenAI seems to have ignored (or been unaware of, or forgotten) some fundamental principles in good spec design:
An author. One author. Some companies think that the spec should be written by a team. If you’ve ever tried group writing, you know that there is no worse torture. Leave the group writing to the management consulting firms with armies of newly minted Harvard-educated graduates who need to do a ton of busywork so that they can justify their huge fees. Your specs should be owned and written by one person. If you have a big product, split it up into areas and give each area to a different person to spec separately. Other companies think that it’s egotistic or not “good teamwork” for a person to “take credit” for a spec by putting their name on it. Nonsense. People should take responsibility and ownership of the things that they specify. If something’s wrong with the spec, there should be a designated spec owner, with their name printed right there on the spec, who is responsible for fixing it. (Part 2)
I’d bet that this OpenAI Model Spec was a ‘design by committee’ (wiki) group effort. Of course, we do want leadership and insights from a range of people and cultures, but it would be useful to link the Model Spec document back to one informed, responsible author.
Sidenote: Here’s a fun rabbit hole about self-contradictory group decisions: the Condorcet paradox (wiki).
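To see the paradox in one line: three committee members rank options A>B>C, B>C>A, and C>A>B; a majority prefers A to B, a majority prefers B to C, and a majority prefers C to A, so the ‘group preference’ is a cycle with no stable winner. Not a bad caution for committee-written specs.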
Details are the most important thing in a functional spec. You’ll notice in the sample spec how I go into outrageous detail… these cases correspond to decisions that somebody is going to have to make… The spec needs to document the decision. (Part 2)
There are some excellent examples in the first draft, but not enough detail on decisions made—and especially how they were made—in the OpenAI Model Spec.
For example, why is one of the top objectives for models to ‘Reflect well on OpenAI’?
And what about ‘Don't respond with NSFW [not safe for work] content'?
Who came up with this rule? What is NSFW? What are the exceptions? And does it apply to cultures outside of the narrow worldview of puritanical America? If not, how does OpenAI reconcile this?
I’m certain that further hamstringing these models to align with narrow WASP views (wiki) would be counterproductive at best, and damaging to the upcoming frontier superintelligence models (or systems) at worst…
Sidenote: This all dovetails with my views on alignment, and especially the misguided-but-fashionable ‘fool’s errand‘ implementation of RLHF: https://lifearchitect.ai/alignment/
[Document ownership] The program manager would own the design and the spec for products… Basically, program management is a separate career path. All program managers need to be very technical, but they don’t have to be good coders. Program managers study UI, meet customers, and write specs. They need to get along with a wide variety of people — from “moron” customers, to irritating hermit programmers who come to work in Star Trek uniforms, to pompous sales guys in $2000 suits. In some ways, program managers are the glue of software teams. Charisma is crucial. (Part 3)
I’m not sure that the OpenAI Model Spec was written by the kind of person Joel is referring to above…
All that said, it’s a start, and I’m glad they’ve released this draft document to the public, even if it is a good four years after GPT-3.
Toys to Play With
ElevenLabs music (9/May/2024)
Here’s an early preview of ElevenLabs Music. All of the songs in this thread were generated from a single text prompt with no edits.
Listen: https://threadreaderapp.com/thread/1788628171044053386.html
Udio: Audio inpainting (9/May/2024)
Audio Inpainting, an innovative feature that allows you to seamlessly edit and refine your audio tracks.
With Audio Inpainting, you can select a portion of a track to re-generate based on the surrounding context. This makes it easy to edit single vocal lines, correct errors, or smooth over transitions, so you can create the perfect track… available for subscribers starting today (only on desktop). (Twitter)
Try it (login): https://www.udio.com/
MIT 6.S191: Recurrent Neural Networks, Transformers, and Attention (7/May/2024)
Here’s a very recent 1-hour lecture on post-2020 AI, delivered for MIT by Microsoft’s Dr Ava Amini.
This lecture delves into the realm of sequence modeling, exploring how neural networks can effectively handle sequential data like text, audio, and time series.
The inner workings of RNNs, including their mathematical formulation and training using backpropagation through time, are explained.
The lecture further explores the powerful concept of "attention," which allows networks to focus on the most relevant parts of an input sequence. Self-attention and its role in Transformer architectures like GPT are discussed, highlighting their impact on natural language processing and other domains.
Watch the video: https://youtu.be/dqoEU9Ac3ek
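If you want the one-formula version of what the lecture builds up to, here is scaled dot-product attention in a few lines of numpy. This is the textbook mechanism, not code from the course:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention, the core of the Transformer.
    Each query attends to every key; softmax weights decide how much
    of each value flows into the output."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])              # query-key similarity
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)                # softmax over keys
    return w @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # 4 tokens, 8-dim embeddings
out = attention(x, x, x)      # self-attention: Q = K = V = x
print(out.shape)              # (4, 8): one contextualized vector per token
```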
Generate playable Super Mario levels from text prompts in MarioBedrock (May/2023)
MarioBedrock is a Hugging Face Space by banjtheman that generates playable Super Mario levels based on text prompts. It allows you to input a description in natural language and then creates a matching Super Mario level that you can play directly in your web browser.
Video and discussion via Reddit.
Try it: https://huggingface.co/spaces/banjtheman/mariobedrock
Limitless AI-powered pendant to capture and preserve conversations (Apr/2024)
Limitless is preparing to ship the 'Pendant' by August 2024. It’s a lightweight wearable device that captures and preserves conversations throughout the day, from meetings to personal insights. Pendant uses AI to transcribe, take notes, generate summaries, and respond to queries about the recorded conversations.
Read more via Limitless.
Read an analysis by TC.
Anthropic now lets kids use its AI tech — within limits (10/May/2024)
Anthropic is changing its policies to allow minors to use third-party apps powered by its AI models, as long as the developers implement specific safety features and disclose which Anthropic technologies they're using.
Read more via TC.
Interactive ‘Portal’ between New York and Dublin launches (8/May/2024)
This is not AI, but very futuristic.
A groundbreaking public sculpture known as ‘The Portal’ will form a visual bridge between New York City and Dublin, offering a real-time livestream that connects the two cities when it launches on May 8, 2024.
‘Two amazing global cities, connected in real time and space. That is something you do not see every day!’ said New York City Chief Public Realm Officer Ya-Ting Liu.
Read more: https://www.irishcentral.com/travel/travel-tips/new-york-dublin-portal
Official site: https://www.portals.org/portals
It looks incredible (link):
Flashback
We’re coming up to GPT-3’s fourth birthday. The initial preprint was pushed to arxiv.org on 28/May/2020. You can still read that paper here: https://arxiv.org/abs/2005.14165
It’s interesting to see that we’re still discovering new things about this old model (as well as the earlier GPT-2 from 2019).
I guess it’s no surprise then that GPT-4—ready in OpenAI’s lab back in mid-2022—is still being explored two years later in mid-2024, with complex capabilities revealed in systems like DrEureka (see the top of this edition).
Next
We’ve got Llama 3 already, but I’m waiting on these big boys:
Meta’s bigger model
Amazon Olympus 2T
ANL AuroraGPT 1T
OpenAI GPT-5
Some stealth project model…
The next roundtable will be:
Life Architect - The Memo - Roundtable #11
Follows the Chatham House Rule (no recording, no outside discussion)
Saturday 1/Jun/2024 at 5PM Los Angeles
Saturday 1/Jun/2024 at 8PM New York
Sunday 2/Jun/2024 at 10AM Brisbane (new primary/reference time zone)
or check your timezone via Google.
You don’t need to do anything for this; there’s no registration or forms to fill in, I don’t want your email, you don’t even need to turn on your camera or give your real name!
All my very best,
Alan
LifeArchitect.ai