To: US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From: Dr Alan D. Thompson <LifeArchitect.ai>
Sent: 13/May/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI: 72% ➜ 73%
Here’s something I’ve been thinking about a lot this year: Who will pay them back? For the voiceover, I tried OpenAI TTS and ElevenLabs, but neither had the right feel. Special thanks to Jennifer Vuletic (link) for her human voiceover, and for slotting this mini project into her schedule between her work for major publishers and dignitaries like former Australian Prime Minister Julia Gillard (Audible).
Watch my video (16mins, link):
Contents
The BIG Stuff (DrEureka, Neuralink update…)
The Interesting Stuff (Ukraine avatar, full Sora video, Optimus, Dojo 18kA…)
Policy (Pause, OpenAI specs + Alan rant, new govt supercomputers…)
Toys to Play With (LLM for kids, music, MIT, Super Mario…)
Flashback (GPT-3…)
Next (new models, invitation link to next roundtable…)
The BIG Stuff
Next: OpenAI spring update + Google I/O (13/May/2024)
Within 24 hours of this emailed edition, OpenAI will livestream a ‘spring update’ (link).
This will also be the day before the Google I/O conference (link) where Google is expected to announce a Sora clone called Miro (text-to-video), as well as Imagen 3 (text-to-image), and Juno V1 3B (inpainting images). (via leaker Bedros Pamboukia, 12/May/2024)
Sidenote: The use of seasonal terms like ‘spring’ by US entities is incredibly boring. Aside from the fact that the US has only 4.23% of the world’s population and might like to consider the other 95%, its seasons shift based on solstices (link), unlike, say, Australia, where our seasons change on the first of the month. Also, what even is ‘fall’?
Here’s the expected lineup for the OpenAI announcements:
❌ GPT-5
❌ Search engine [Sidenote: This would mean Reuters was wrong again…]
✅ Phone calls within ChatGPT via WebRTC (wiki) + more integrations
✅ New models gpt-4l, gpt-4l-auto, gpt-4-auto, maybe related to new models coming in Microsoft Copilot: next-model4 and next-model8
✅ A Steve Jobs-style ‘one more thing’ (here’s a fun flashback video by CNET)
I’ll update the web version of this edition of The Memo here:
UPDATE 1:
GPT-4o (omni model), which OpenAI says is ‘the best model in the world’ (13/May/2024)
MMLU=88.7. GPQA=53.6 (see viz below).
Livestream link: YouTube.com (26mins).
UPDATE 2:
Dr Jim Fan (14/May/2024): ‘[GPT-4o is] likely an early checkpoint of GPT-5’
But my testing shows GPT-4o actually performs worse than GPT-4 Turbo (and certainly worse than Claude 3 Opus) across ‘IQ’ benchmarks. Only the multimodal aspect makes this an evolution.
Wild video demo (link):
GPT-4 + Unitree Go1 quadruped robot = DrEureka (UPenn, NVIDIA, UT Austin) (4/May/2024)
Dr Jim Fan from NVIDIA announced a ‘surprising’ evolution of embodied AI: using GPT-4 to intuitively tune ‘friction, damping, stiffness, gravity, etc.’ in robotics. The system, built with a team from UPenn and UT Austin, is called DrEureka.
The name combines Domain Randomization (‘DR’) with the 2023 Eureka system, which paired LLMs (currently GPT-4) with NVIDIA GPU-accelerated simulation technologies (20/Oct/2023).
We trained a robot dog to balance and walk on top of a yoga ball purely in simulation, and then transfer zero-shot to the real world. No fine-tuning. Just works.
I’m excited to announce DrEureka, an LLM agent that writes code to train robot skills in simulation, and writes more code to bridge the difficult simulation-reality gap. It fully automates the pipeline from new skill learning to real-world deployment.
The Yoga ball task is particularly hard because it is not possible to accurately simulate the bouncy ball surface. Yet DrEureka has no trouble searching over a vast space of sim-to-real configurations, and enables the dog to steer the ball on various terrains, even walking sideways!
Traditionally, the sim-to-real transfer is achieved by domain randomization, a tedious process that requires expert human roboticists to stare at every parameter and adjust by hand. Frontier LLMs like GPT-4 have tons of built-in physical intuition for friction, damping, stiffness, gravity, etc.
We are (mildly) surprised to find that DrEureka can tune these parameters competently and explain its reasoning well. DrEureka builds on our prior work Eureka, the algorithm that teaches a 5-finger robot hand to do pen spinning. It takes one step further on our quest to automate the entire robot learning pipeline by an AI agent system. One model that outputs strings will supervise another model that outputs torque control. (4/May/2024)
This complex AI embodiment advance—and especially the application of LLMs to sense and adjust parameters for ‘friction, damping, stiffness, gravity, etc’—moved my AGI countdown another percentage point from 72% ➜ 73%.
View the repo + videos.
Read an analysis by NewAtlas.
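For readers new to the technique: domain randomization, which DrEureka automates with GPT-4, amounts to training a policy across many perturbed physics configurations so it survives the sim-to-real gap. Here is a minimal illustrative sketch; the parameter names come from the quote above, but the ranges are invented for illustration and are not the values DrEureka actually searches over:

```python
import random

# Hypothetical sim-to-real parameter ranges (illustration only --
# not the actual search space used by DrEureka).
PARAM_RANGES = {
    "friction":  (0.2, 1.5),
    "damping":   (0.01, 0.5),
    "stiffness": (10.0, 200.0),
    "gravity":   (9.0, 10.6),   # m/s^2, perturbed around Earth gravity
}

def sample_randomized_config(rng: random.Random) -> dict:
    """Draw one randomized physics configuration for a training episode."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

rng = random.Random(0)
configs = [sample_randomized_config(rng) for _ in range(3)]
for cfg in configs:
    assert all(PARAM_RANGES[k][0] <= v <= PARAM_RANGES[k][1] for k, v in cfg.items())
```

DrEureka’s contribution is that the LLM, rather than a human roboticist, proposes and tunes these ranges and explains its reasoning.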
Neuralink PRIME Study Progress Update — User Experience (8/May/2024)
Noland Arbaugh, the first human participant in Neuralink’s PRIME study, has been using the Link brain-computer interface to control his laptop and play games from various positions, including while lying in bed. He has just passed 100 days with the implant. Noland said:
Y'all are giving me too much, it's like a luxury overload, I haven't been able to do these things in 8 years and now I don't know where to even start allocating my attention.
The biggest thing with comfort is that I can lie in my bed and use [the Link]… It lets me live on my own time, not needing to have someone adjust me, etc. throughout the day.
[The Link] has helped me reconnect with the world, my friends, and my family. It's given me the ability to do things on my own again without needing my family at all hours of the day and night.
[The Neuralink BCI is] still improving; the games I can play now are leaps and bounds better than previous ones. I’m beating my friends in games that as a quadriplegic I should not be beating them in.
I think it should give a lot of people a lot of hope for what this thing can do for them, first and foremost their gaming experience, but then that'll translate into so much more and I think that's awesome.
Read more via Neuralink Blog.
As usual, the media focused on the Terrible, Horrible, No Good, Very Bad AI™ (that’s from a book that I know you’ll recall, wiki), after some of the implant’s threads physically retracted from Noland’s brain, a practical issue already resolved by Neuralink: CNBC, Wired, WSJ.
The Interesting Stuff
DeepSeek-V2 (8/May/2024)
DeepSeek-AI has released a 236B parameter MoE model called DeepSeek-V2, trained on an incredibly large dataset of 8.1T tokens. MMLU=78.5. The dataset included 12% Chinese, ‘therefore, we acknowledge that DeepSeek-V2 still has a slight gap in basic English capabilities [even compared with smaller models like Llama 3 70B]’.
Read the paper: https://arxiv.org/abs/2405.04434
Try it here (free, login): https://chat.deepseek.com/
See it on the Models Table.
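For context on the MoE (mixture-of-experts) design: only a small fraction of DeepSeek-V2’s 236B parameters is activated per token, because a gating network routes each token to its top-k experts. Here is a generic toy top-k router to illustrate the idea; this is not DeepSeek’s actual gating (their architecture also includes shared experts and device-limited routing):

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_logits, k=2):
    """Pick the top-k experts for one token and renormalize their weights."""
    topk = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    probs = softmax([gate_logits[i] for i in topk])
    return list(zip(topk, probs))

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(8)]  # 8 toy experts
selected = route(logits, k=2)
assert len(selected) == 2                                  # only 2 of 8 experts fire
assert abs(sum(w for _, w in selected) - 1.0) < 1e-9       # weights renormalized
```

This is why a 236B-parameter MoE can be far cheaper to run per token than a dense model of the same size.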
Victoria Shi, digital representative of Ukraine (1/May/2024)
The Ministry of Foreign Affairs (MFA) of Ukraine was established in 1991 when Ukraine became an independent state after the collapse of the Soviet Union. (PDF, 1999)
Meet Victoria Shi — a digital representative of the MFA of Ukraine, created using AI to provide timely updates on consular affairs! For the first time in history, the MFA of Ukraine has presented a digital persona that will officially comment for the media.
Comments from Victoria will appear on the MFA's official website & social media platforms. The only original videos featuring statements from Victoria are those that contain a QR code linking to the MFA's official page with the statement's text.
Source: https://twitter.com/MFA_Ukraine/status/1785558101908742526
This is an interesting evolution of avatars, a nice upgrade to my 2021 Leta AI, and the 2022 Marija by the Government of Malta (featured in one of the very first editions of this advisory—more than two years ago—in The Memo edition 12/Mar/2022).
Unfortunately, the use of QR codes for provenance is misguided and naive, offering no real security or protection, and maybe even adding an attack vector for hackers to pursue.
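To make the point concrete: a QR code merely encodes a URL, and anyone can generate one pointing at a spoofed page, so it proves nothing about a video’s origin. Cryptographic signing of the statement text is what provenance actually requires. A minimal sketch, using a keyed MAC as a stand-in (the key and statement here are invented; a real deployment would use public-key signatures so anyone could verify):

```python
import hashlib
import hmac

# Anyone can mint a QR code like this -- it is a pointer, not a proof.
fake_qr_payload = "https://mfa-ukraine.example-phishing.com/statement/123"

SECRET_KEY = b"hypothetical-mfa-signing-key"  # invented for this sketch

def sign(statement: str) -> str:
    """Produce an authentication tag over the statement text."""
    return hmac.new(SECRET_KEY, statement.encode(), hashlib.sha256).hexdigest()

def verify(statement: str, tag: str) -> bool:
    """Check the tag in constant time; any edit to the text fails."""
    return hmac.compare_digest(sign(statement), tag)

tag = sign("Consular services resume on Monday.")
assert verify("Consular services resume on Monday.", tag)
assert not verify("Consular services resume on Tuesday.", tag)
```

Unlike the QR code, the tag is bound to the statement’s exact content, not to wherever a link happens to point.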
All editions of The Memo provide robust, industry-grade, comprehensive advisory to government, enterprise, and you. We’re just ⅓ of the way through the 4,300 words of this edition, including my 900-word audit of OpenAI’s recently released model document. Let’s get into it!