To: US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From: Dr Alan D. Thompson <LifeArchitect.ai>
Sent: 27/Feb/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI: 70%
AI music generation is having a huge moment, improving dramatically over the last few weeks with the release of Suno v3 (also available via Microsoft Copilot). Listen to this new genre of ‘hyperpop’ prompted by Miles Oliver. Prompt: heavy glitchpunk witch house hyperpop hip hop, innovative production, avant-garde, demon-trap
Link: https://app.suno.ai/song/f5b2556d-811a-4cde-af20-2149230d5a61/
I had to give it a go myself, of course. Welcome ‘Rise above’ (watch out for the eight-second ‘general pause’ and strong downbeat!). Prompt: anthem ballad featuring soaring female vocals
Link: https://app.suno.ai/song/7331877d-5e40-44dd-b063-9ea5f804df73
This is the first time I’ve been rendered speechless by AI in a while (okay, in a few weeks), and I shared it with musician Reggie Watts in California (wiki). He asked ‘How is this app creating music? Is it just learning from a data set and then using the prompts to create it to the best of its ability?’
The best answer I’ve found is from Dr Ilya Sutskever, chief scientist and co-founder of OpenAI. When GPT-2 was released, he told the New Yorker (14/Oct/2019) that ‘this stuff is like alchemy!… Give it the compute, give it the data, and it will do amazing things.’ And that’s as close an explanation as we’re gonna need for now.
Try it here (free credits for v2 or $10/m for v3, login): https://app.suno.ai/create/
The Memo reader Josh recently queried my statement that I don’t listen to the majority of AI experts, and asked for the people I do listen to. They include Prof Scott Aaronson at OpenAI, and you may enjoy his recent talk summary, ‘The Problem of Human Specialness in the Age of AI’:
The surprise, with LLMs, is not merely that they exist, but the way they were created. Back in 1999, you would’ve been laughed out of the room if you’d said that all the ideas needed to build an AI that converses with you in English already existed, and that they’re basically just neural nets, backpropagation, and gradient descent. (With one small exception, a particular architecture for neural nets called the transformer [discovered by Google in 2017], but that probably just saves you a few years of scaling anyway.)
Read more: https://scottaaronson.blog/?p=7784
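Aaronson's point is that the core ingredients are decades old. As an illustration only (plain NumPy, nothing LLM-specific, entirely my own toy example), here are all three in a few dozen lines: a two-layer neural net trained on XOR with backpropagation and gradient descent.

```python
# Toy illustration of the three ingredients Aaronson names: a neural net,
# backpropagation, and gradient descent. Nothing here is specific to LLMs.
import numpy as np

rng = np.random.default_rng(0)

# XOR: a classic problem a single linear layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two-layer network: 2 -> 8 -> 1
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

lr = 0.5  # learning rate for plain gradient descent

for step in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)                 # hidden layer
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))   # sigmoid output
    loss = np.mean((out - y) ** 2)           # mean squared error

    # Backpropagation: apply the chain rule layer by layer
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    dW2 = h.T @ d_out
    db2 = d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(out.round(3).ravel())  # approaches [0, 1, 1, 0]
```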
The next roundtable is 16/Mar, details at the end of this edition.
The BIG Stuff
Dr Daniel Kokotajlo at OpenAI futures/governance (22/Feb/2024)
Once again, a stark warning to be very careful about who you listen to in this space. It seems that a large percentage of the population now has an opinion on artificial intelligence. Roughly 1% of these commentators are worth listening to.
In my analysis of Chinchilla (Feb/2023), I noted that ‘in 2010, Google estimated that there are only 130M unique published books in existence.’ In my keynotes, I generously estimate that the average human reads 700 books in their lifetime.
So, if we as biologically-limited humans can only absorb 0.00054% of the available information on Earth, please make sure it’s the highest quality 0.00054%!
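For the curious, the 0.00054% is simply the ratio of those two figures:

```python
# Back-of-the-envelope: 700 books read per lifetime vs ~130M unique
# published books (Google's 2010 estimate).
books_read = 700
books_published = 130_000_000
print(f"{books_read / books_published:.7%}")  # ≈ 0.0005385%, rounded to 0.00054%
```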
Here is Dr Daniel Kokotajlo from OpenAI futures and governance, commenting on the upcoming state of artificial general intelligence. Pay close attention to this!
Probably there will be AGI soon -- literally any year now.
Probably whoever controls AGI will be able to use it to get to ASI shortly thereafter -- maybe in another year, give or take a year.
Probably whoever controls ASI will have access to a spread of powerful skills/abilities and will be able to build and wield technologies that seem like magic to us, just as modern tech would seem like magic to medievals.
This will probably give them godlike powers over whoever doesn't control ASI.
In general there's a lot we don't understand about modern deep learning. Modern AIs are trained, not built/programmed. We can theorize that e.g. they are genuinely robustly helpful and honest instead of e.g. just biding their time, but we can't check.
…
See the policy section in this edition for more.
Read my page on alignment: https://lifearchitect.ai/alignment/
Mistral Large (26/Feb/2024)
French AI lab Mistral has released a new model to compete with GPT-4. It is called Mistral Large, and is only available via API. I’ve estimated this model at around 540B parameters trained on around 11T tokens.
The MMLU score is 81.2, the same as Google’s Flan-PaLM 2 340B, and higher than the non-finetuned PaLM 2 340B (MMLU=78.3). I’ve provided a ranking of major model MMLU scores later in this edition. Mistral Large has a 32k context window. My early testing shows that this is a frontier model with excellent performance.
See more about Microsoft’s recent investment in Mistral via The Verge.
Read about Mistral Large: https://mistral.ai/news/mistral-large/
See more Mistral benchmarks: https://docs.mistral.ai/platform/endpoints/
Try it (login, waitlist): https://chat.mistral.ai/chat
Try it on Poe.com (login, paid): https://poe.com/Mistral-Large
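For readers with API access, a minimal request looks something like the sketch below. The endpoint and the model name `mistral-large-latest` follow Mistral's documentation at the time of writing, but treat them as assumptions and verify against the current docs.

```python
# Minimal sketch of a Mistral Large API call via la Plateforme.
# Assumes a MISTRAL_API_KEY environment variable; model name and endpoint
# are taken from Mistral's docs at the time of writing — check before use.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-latest",
        "messages": [
            {"role": "user", "content": "Summarize the transformer architecture in two sentences."}
        ],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```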
Change in number of Upwork jobs since ChatGPT was released (16/Feb/2024)
Upwork is a freelance hiring website (from a merger of Elance and oDesk) with 14 million users in 180 countries and US$1B in annual freelancer billings (wiki).
Researcher Henley Wing analyzed 5M freelancing jobs to see what jobs are being replaced by AI. His viz is revealing:
I definitely expected writing jobs to decrease, as this is perhaps the most popular use case of ChatGPT, and that was reflected in the 33% decrease in writing jobs. Likewise, seeing customer service jobs decrease 16% wasn’t a huge surprise, as many companies are creating chatbots that can replace many customer service agents.
Read more via ‘Bloomberry’.
[Exclusive sidenote]
There is a disturbing irony here not identified by Henley. Upwork is a fantastic source for these statistics, because ChatGPT (via reinforcement learning from human feedback) was fine-tuned by contractors from Upwork. The InstructGPT paper (4/Mar/2022) by OpenAI points out:
To produce our demonstration and comparison data, and to conduct our main evaluations, we hired a team of about 40 contractors on Upwork… mostly English-speaking people living in the United States or Southeast Asia hired via Upwork... They disagree with each other on many examples; we found the inter-labeler agreement to be about 73%… We take prompts submitted to our API, and several model completions, and have labelers rank the completions by overall quality [to create InstructGPT and ChatGPT].
In other words, the contractors at Upwork took their own jobs…
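For readers wondering what those rankings are actually used for: per the InstructGPT paper, each labeler ranking of K completions is expanded into all pairwise comparisons, and a reward model is trained to score the higher-ranked completion above the lower-ranked one in every pair. A minimal sketch of that expansion (the prompt and completions below are made up by me, not OpenAI's data):

```python
# Sketch of how one labeler ranking becomes reward-model training pairs,
# per the InstructGPT recipe: a ranking of K completions yields K*(K-1)/2
# (preferred, rejected) comparisons. The example data is illustrative only.
from itertools import combinations

prompt = "Explain photosynthesis to a 6-year-old."
# Completions ranked best-first by a (hypothetical) human labeler.
ranked_completions = [
    "Plants eat sunlight to make their food.",             # best
    "Photosynthesis converts light energy into glucose.",
    "It is a biochemical process in chloroplasts.",         # worst
]

# Every higher-ranked completion is 'preferred' over every lower-ranked one.
comparison_pairs = [
    {"prompt": prompt, "preferred": better, "rejected": worse}
    for better, worse in combinations(ranked_completions, 2)
]

for pair in comparison_pairs:
    print(pair["preferred"], ">", pair["rejected"])
# 3 completions -> 3 pairs; the reward model is then trained so that
# reward(preferred) > reward(rejected) for each pair.
```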
Gartner predicts search engine volume will drop 25% by 2026, due to AI… (19/Feb/2024)
Gartner anticipates a 25% reduction in search engine volume by 2026, as chatbots and virtual agents powered by Generative AI are expected to handle many user queries.
This is a wildly conservative timeline, especially with the expert AGI summary by Dr Daniel Kokotajlo above!
Read more via Gartner.
The Interesting Stuff
Neuralink's first human patient able to control mouse through thinking, Musk says (20/Feb/2024)
Elon Musk announced that Neuralink's brain-chip interface has allowed a human patient to control a computer mouse with their thoughts. In a Spaces event on Twitter, Musk said:
Progress is good, and the patient seems to have made a full recovery, with no ill effects that we are aware of. Patient is able to move a mouse around the screen by just thinking.
Read more via Reuters.
Google DeepMind Gemma 7B (21/Feb/2024)
Google has released a pair of ‘lightweight, state-of-the-art open models built from the research and technology used to create Gemini models.’
The 7B-parameter model was trained on a dataset of 6 trillion tokens, three times the amount used for Llama 2, making its training dataset the largest in the world outside of the frontier models. I’ve found its performance to be poor, though okay for its size (especially compared to other open 7B models).
MMLU result highlights:
GPT-4 (@31) = 90.10
Gemini Ultra 1.0 (@32) = 90.04
Human expert baseline (ASI) = 89.8
GPT-4 1.76T = 86.4
Mistral Large = 81.2
Inflection-2 = 79.6
Claude 2 = 78.5
PaLM 2 340B = 78.3
Grok-1 = 73.0
Falcon 180B = 70.6
GPT-3.5 20B = 70.0
Llama 2 70B = 68.9
Google Gemma 7B = 64.3
Human average baseline (AGI) = 34.5
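For context, an MMLU score is simply an accuracy: the share of roughly 14,000 four-option multiple-choice questions, spread across 57 subjects, that the model answers correctly. A toy illustration with made-up answers:

```python
# MMLU scores above are plain accuracies over four-option multiple-choice
# questions across 57 subjects. Example answers here are invented.
model_answers   = ["B", "D", "A", "A", "C"]
correct_answers = ["B", "C", "A", "A", "C"]
accuracy = sum(m == c for m, c in zip(model_answers, correct_answers)) / len(correct_answers)
print(f"MMLU-style accuracy: {accuracy:.1%}")  # 80.0%
```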
Read the announce: https://blog.google/technology/developers/gemma-open-models/
Read the paper (PDF, 16 pages).
See it on the Models Table: https://lifearchitect.ai/models-table/
Gemma 7B is now available on Poe (login, free): https://poe.com/Gemma-7b-FW
Try it (free, no login) via Perplexity Labs: https://labs.pplx.ai/
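If you’d rather run Gemma locally, the weights are also on Hugging Face. A minimal sketch, assuming the instruction-tuned google/gemma-7b-it checkpoint, license acceptance on the Hub, a GPU with enough memory, and a transformers release with Gemma support (4.38 or later):

```python
# Minimal local run of Gemma 7B (instruction-tuned) via Hugging Face.
# Assumes you have accepted Google's license on the Hub and installed a
# transformers version with Gemma support (>= 4.38).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "Give me one surprising fact about tigers."}]
inputs = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```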
OpenAI’s former policy director and now co-founder of Anthropic, Jack Clark (I also listen to him), gave a sensational analogy for this model release (26/Feb/2024):
Why this matters and what Gemma feels like: Picture two giants towering above your head and fighting one another - now imagine that each time they land a punch their fists erupt in gold coins that shower down on you and everyone else watching the fight.
That’s what it feels like these days to watch the megacap technology companies duke it out for AI dominance as most of them are seeking to gain advantages by either
a) undercutting each other on pricing (see: all the price cuts across GPT, Claude, Gemini, etc), or
b) commoditizing their competitor and creating more top-of-funnel customer acquisition by releasing openly accessible models (see: Mistral, Facebook’s LLaMa models, and now Gemma).
Stable Diffusion 3 (22/Feb/2024)
Prompt: Resting on the kitchen table is an embroidered cloth with the text 'good night' and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic.
Stable Diffusion 3 is a text-to-image model based on a diffusion transformer architecture, the same approach underpinning OpenAI’s recent Sora model. It will also do video-to-video, 3D (sample), and more. It has 8B parameters, and the technical report will be available soon.
Stability AI’s founder has highlighted that control over composition is next (22/Feb/2024, video), much like Adobe’s implementation of text-to-image AI in Firefly and Photoshop.
Read the announce (no high-res images): https://stability.ai/news/stable-diffusion-3
Watch my livestream on this model: https://youtu.be/EQreowhLrnQ
OpenAI Feather (27/Feb/2024)
The trademark application for OpenAI Feather is a year old, but the platform is live, presumably for private corporate users. The filing covers, in part:
‘automated labeling and annotating of images, audio, video, text and other forms of electronic data;
for use in visualizing, transforming, manipulating, and modeling datasets;’
I could see Fortune 500 companies using this platform to make their internal intellectual property and knowledge bases searchable and accessible.
Trademark applications: USPTO & UK IPO
Live platform (wall): https://feather.openai.com/ & https://marabou.feather.openai.com/
This is another robust edition. Let’s look at more AI, including robots at a $250B market cap company, NEO’s hands, Google Genie, Meta, Apple, Reka, and more.
Alan’s keynote rehearsal for H1 2024 (26/Feb/2024)
Here’s a private video of my keynote rehearsal H1 2024 (first half), recorded this week and available to all full subscribers. Thanks for your ongoing support of The Memo.