Voice generated by genny.lovo.ai. No, that isn’t AI-Morgan Freeman, it’s AI-‘Bryan Lee Jr’!
FOR IMMEDIATE RELEASE: 30/May/2023
NVIDIA CEO Jensen Huang (27/May/2023):
Agile companies will take advantage of AI and boost their position.
Companies less so will perish…
In 40 years, we created the PC, Internet, mobile, cloud, and now the AI era. What will you create? Whatever it is, run after it like we did. Run, don’t walk.
Welcome back to The Memo.
The Policy section covers the consistent views of the world’s most powerful person (OpenAI’s CEO) from a private blog post in 2015 to their latest push for more governance. We also explore Microsoft’s latest regulatory guidance, as well as updates on the UK’s AI regulation.
In the Toys to play with section, we look at Stability AI’s latest image model, a fully 'offline' LLM for security-conscious people, podcasts generated by the latest AI, a GPT-4 backed news site, and much more!
The BIG Stuff
Brain-computer interfaces: Neuralink to test on humans (26/May/2023)
We are excited to share that we have received the FDA’s approval to launch our first-in-human clinical study!
This is the result of incredible work by the Neuralink team in close collaboration with the FDA and represents an important first step that will one day allow our technology to help many people.
Recruitment is not yet open for our clinical trial. We’ll announce more information on this soon! (-via Twitter)
As usual, I have a page dedicated to this subject. Read more: https://lifearchitect.ai/bmi/
Back in 2021, I documented the benefits of brain-computer interfaces as described by their top creators and inventors.
Voyager: A GPT-4 ‘lifelong learning’ agent in Minecraft (26/May/2023)
…we [NVIDIA, Caltech, UT Austin, Stanford, Arizona State University] introduce Voyager, the first LLM-powered embodied lifelong learning agent, which leverages GPT-4 to explore the world continuously, develop increasingly sophisticated skills, and make new discoveries consistently without human intervention.
This may look like just a silly game, but think further out… LLMs becoming agentic (the ability to act on their own: having goals, reasoning, and monitoring their own behaviour) is the equivalent of humans having created new life. And this new life is already optimizing skill development, and ‘lifelong learning’ within an environment.
It may be that this discovery is a big milestone in our history. Expect the next iteration of this concept to drive real use cases impacting our physical world—and all our lives.
Read the prompts used for GPT-4 (GitHub).
Read Dr Jim Fan’s summary of how this works.
Read the paper: https://arxiv.org/abs/2305.16291
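For readers curious about the mechanics: the paper describes three components, an automatic curriculum that proposes tasks, an executable skill library, and an iterative prompting loop with self-verification. Below is a minimal conceptual sketch of that loop shape in Python. The helper functions are hypothetical placeholders, not Voyager's actual code (see the GitHub prompts above for the real thing).

```python
# Conceptual sketch only: a minimal 'propose -> act -> critique -> store skill' loop.
# The helper functions are hypothetical placeholders, not Voyager's actual API.

def propose_next_task(skills: list[str]) -> str:
    """Stand-in for the LLM 'automatic curriculum' that suggests a new goal."""
    return f"task_{len(skills) + 1}"

def write_skill(task: str) -> str:
    """Stand-in for GPT-4 generating executable code for the task."""
    return f"def {task}(): pass  # generated skill"

def execute_and_critique(skill: str) -> bool:
    """Stand-in for running the skill in the environment and self-verifying."""
    return True  # pretend the skill succeeded

skill_library: list[str] = []
for _ in range(3):                      # a few iterations of 'lifelong learning'
    task = propose_next_task(skill_library)
    skill = write_skill(task)
    if execute_and_critique(skill):     # only keep skills that pass verification
        skill_library.append(skill)

print(f"Learned {len(skill_library)} skills")
```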
NVIDIA Announces DGX GH200 AI Supercomputer (28/May/2023)
New Class of AI Supercomputer Connects 256 Grace Hopper Superchips Into Massive, 1-Exaflop, 144TB GPU for Giant Models…
GH200 superchips eliminate the need for a traditional CPU-to-GPU PCIe connection by combining an Arm-based NVIDIA Grace™ CPU with an NVIDIA H100 Tensor Core GPU in the same package, using NVIDIA NVLink-C2C chip interconnects
Expect trillion-parameter models like OpenAI GPT-5 (my link), Anthropic Claude-Next (my link), and beyond to be trained with this groundbreaking hardware. Some have estimated that this could train language models up to 80 trillion parameters, which gets us closer to brain-scale.
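For a rough sense of scale, here is a back-of-envelope estimate. My assumption: 2 bytes per parameter for fp16/bf16 weights, ignoring optimizer states and activations, which lands in the same ballpark as the 80-trillion-parameter figure above.

```python
# Back-of-envelope only: parameters that fit in 144TB of unified memory,
# counting fp16/bf16 weights alone (optimizer states and activations
# would reduce this substantially in practice).
memory_bytes = 144e12        # 144TB of shared GPU memory on one DGX GH200
bytes_per_param = 2          # fp16/bf16 weight storage
max_params = memory_bytes / bytes_per_param
print(f"~{max_params / 1e12:.0f} trillion parameters")   # ~72 trillion
```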
The Interesting Stuff
GPT-4 annotated paper (26/May/2023)
I’m providing a version of the GPT-4 paper with my annotations and updated context. I’ll also provide an annotated version of Google’s PaLM 2 paper shortly.
Grab a copy: https://lifearchitect.ai/report-card/
Fine-tuning on human preferences is a fool’s errand (28/May/2023)
I’ve presented some background on why allowing models to be guided by humans directly after pre-training (a process called reinforcement learning from human feedback, or RLHF) is suboptimal and introduces new problems.
Read more: https://lifearchitect.ai/alignment/#rlhf
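For readers new to the term: RLHF typically trains a reward model on pairwise human preferences and then optimises the language model against that reward. The snippet below is an illustrative PyTorch sketch of the standard pairwise preference loss, included as background rather than anything from the linked page; the tensors are random stand-ins.

```python
# Illustrative sketch of the pairwise preference loss typically used to train the
# reward model in RLHF (the step where human rankings enter the pipeline).
# Tensors here are random stand-ins, not real model outputs.
import torch
import torch.nn.functional as F

reward_chosen = torch.randn(8)    # reward model scores for human-preferred responses
reward_rejected = torch.randn(8)  # scores for the responses humans ranked lower

# Maximise the margin between preferred and rejected responses.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss.item())
```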
PandaGPT 13B (24/May/2023)
While we wait for DeepMind to announce Gato 2, here is some interesting progress from researchers at Cambridge and Tencent. The PandaGPT model is one of the few examples of proto-AGI so far this year, built on the Vicuna 13B large language model combined with Meta’s ImageBind embedding model (9/May/2023). The researchers note that this is ‘the first foundation model capable of instruction-following data across six modalities’:
Text.
Image/video.
Audio.
Depth.
Thermal.
Inertial measurement units/accelerometer/gyroscope/compass.
Demo: https://huggingface.co/spaces/GMFTBY/PandaGPT
Project page: https://panda-gpt.github.io/
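Conceptually, ImageBind maps all six modalities into one shared embedding space, and a small projection layer feeds those embeddings into the language model alongside the text prompt. Here is a minimal, hypothetical PyTorch sketch of that wiring; the dimensions and function names are my assumptions for illustration, not the project's actual code.

```python
# Conceptual sketch of a PandaGPT-style architecture: a multimodal encoder
# (ImageBind) produces a shared embedding, a small projection layer maps it into
# the LLM's token-embedding space, and the result is prepended to the prompt.
# Names and dimensions here are illustrative, not the project's actual API.
import torch
import torch.nn as nn

IMAGEBIND_DIM = 1024   # assumed ImageBind embedding size
LLM_HIDDEN = 5120      # assumed Vicuna-13B hidden size

projection = nn.Linear(IMAGEBIND_DIM, LLM_HIDDEN)

def build_multimodal_prefix(modality_embedding: torch.Tensor) -> torch.Tensor:
    """Map a modality embedding (image, audio, depth, ...) into LLM space."""
    return projection(modality_embedding).unsqueeze(0)   # one 'soft token'

fake_audio_embedding = torch.randn(IMAGEBIND_DIM)        # stand-in for ImageBind output
prefix = build_multimodal_prefix(fake_audio_embedding)
print(prefix.shape)   # torch.Size([1, 5120]) -> prepended to the text prompt embeddings
```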
IndexGPT by JPMorgan Chase; trademark filing (May/2023)
I continue to provide AI consulting via expert calls, as well as longer-term advisory to big companies and major governments around the world, most of it under NDAs with ‘do not identify’ clauses. Besides government, banking and finance is my biggest sector, with spectacular advances being made. While I can’t reveal any of that work, I can certainly point out public documents, including JPMorgan Chase’s trademark filing for a product called IndexGPT earlier this month.
…JPMorgan may be the first financial incumbent aiming to release a GPT-like product directly to its customers…“It’s an A.I. program to select financial securities,” Gerben said. “This sounds to me like they’re trying to put my financial advisor out of business.”…
The bank, which employs 1,500 data scientists and machine-learning engineers, is testing “a number of use cases” for GPT technology, said global tech chief Lori Beer.
‘Laptop models’ are a waste of time (25/May/2023)
I have editorialised this title. The UC Berkeley paper is actually called “The False Promise of Imitating Proprietary LLMs”. It names several smaller ‘imitation’ models (Alpaca, Vicuna, Koala, GPT4All) that imitate (or ‘steal’) output examples from larger models (ChatGPT, GPT-3.5). The imitation model output ‘falls short in improving LMs across more challenging axes such as factuality, coding, and problem solving.’
Initially, we were surprised by the output quality of our imitation models—they appear far better at following instructions, and crowd workers rate their outputs as competitive with ChatGPT. However, when conducting more targeted automatic evaluations, we find that imitation models close little to none of the gap from the base LM to ChatGPT on tasks that are not heavily supported in the imitation data. We show that these performance discrepancies may slip past human raters because imitation models are adept at mimicking ChatGPT’s style but not its factuality. Overall, we conclude that model imitation is a false promise: there exists a substantial capabilities gap between open and closed LMs that, with current methods, can only be bridged using an unwieldy amount of imitation data or by using more capable base LMs. In turn, we argue that the highest leverage action for improving open-source models is to tackle the difficult challenge of developing better base LMs, rather than taking the shortcut of imitating proprietary systems.
Read the paper: https://arxiv.org/abs/2305.15717
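The ‘imitation’ recipe being critiqued is simple: collect a proprietary model’s responses to a set of prompts, then fine-tune an open base model on those pairs. Here is a minimal hypothetical sketch of the data-collection step; query_proprietary_model is a placeholder, not a real API call.

```python
# Minimal sketch of the 'imitation' recipe the paper critiques: collect responses
# from a proprietary model for a set of prompts, then fine-tune an open base LM
# on the resulting prompt/response pairs. query_proprietary_model is a hypothetical
# placeholder, not a real API call.
import json

def query_proprietary_model(prompt: str) -> str:
    """Placeholder for an API call to a closed model such as ChatGPT."""
    return f"[imitated response to: {prompt}]"

prompts = ["Explain photosynthesis.", "Write a haiku about rain."]
imitation_data = [
    {"instruction": p, "output": query_proprietary_model(p)} for p in prompts
]

with open("imitation_dataset.jsonl", "w") as f:
    for row in imitation_data:
        f.write(json.dumps(row) + "\n")
# This file would then be used to instruction-tune a smaller open model
# (the step the paper finds mimics style more than factuality).
```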
Uber + Waymo in Phoenix (24/May/2023)
I love catching Waymo self-driving cars in my second home of Phoenix. The feel of being alone in an AI-controlled vehicle speeding along the highway is… futuristic. This week, Waymo and Uber partnered up.
This integration will launch publicly later this year with a set number of Waymo vehicles across Waymo’s newly expanded operating territory in Phoenix, and will include local deliveries and ride-hailing trips. Uber users will be able to experience the safety and delight of the Waymo Driver on both the Uber and Uber Eats apps. Riders will also still be able to hail a Waymo vehicle directly through the Waymo One app. At over 180 square miles [467 km²], Waymo’s Phoenix operations are currently the largest fully autonomous service area in the world.
Pew: 58% of US adults have heard of ChatGPT; 14% have tried it (19/Mar/2023)
It should be noted that Pew Research ran this study from 13–19/Mar/2023, so the numbers would likely be higher two months later in May/2023.
Among the subset of Americans who have heard of ChatGPT, 19% say they have used it for entertainment and 14% have used it to learn something new. About one-in-ten adults who have heard of ChatGPT and are currently working for pay have used it at work.
ChatGPT iOS app: 500k downloads in 6 days (26/May/2023)
Despite being U.S.- and iOS-only ahead of today’s expansion to 11 more global markets, OpenAI’s ChatGPT app has been off to a stellar start. The app has already surpassed half a million downloads in its first six days since launch. (-via TC)
UW Guanaco 65B (25/May/2023)
It looks like a LLaMA, but a Guanaco is ‘one of two wild South American camelids, the other being the vicuña, which lives at higher elevations’ (wiki).
Researchers at the University of Washington developed QLoRA, which enables fine-tuning of large models on a single GPU, and used it to train a model called Guanaco. The largest Guanaco model has 65B parameters and achieves over 99% of ChatGPT’s performance, as evaluated by GPT-4 benchmarking.
They’ve even shoehorned the animal name into a technical name: Generative Universal Assistant for Natural-language Adaptive Context-aware Omnilingual outputs.
The big takeaway is the memory needed for fine-tuning:
LLaMA 65B needs more than 780GB of GPU RAM (standard 16-bit fine-tuning).
Guanaco 65B needs 48GB of GPU RAM (QLoRA).
Guanaco 7B needs 5GB of GPU RAM (QLoRA).
QLoRA will also enable privacy-preserving fine-tuning on your phone. We estimate that you can fine-tune 3 million words each night with an iPhone 12 Plus. This means, soon we will have LLMs on phones which are specialized for each individual app.
A year ago, it was a common sentiment that all important research is done in industrial AI labs. I think this is no longer true. (-via Twitter)
View the project page: https://guanaco-model.github.io/
Read the paper: https://arxiv.org/abs/2305.14314
Demo: https://huggingface.co/spaces/uwnlp/guanaco-playground-tgi
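For the technically inclined, here is an illustrative QLoRA-style setup using Hugging Face transformers, peft, and bitsandbytes. The base model name and hyperparameters are placeholders; the point is that the base weights are loaded in 4-bit NF4 and only small LoRA adapters are trained, which is what lets a 65B model fit on a single 48GB GPU.

```python
# Illustrative QLoRA-style setup with Hugging Face transformers + peft + bitsandbytes.
# The model name and hyperparameters are placeholders; this loads a base model in
# 4-bit NF4 and attaches small trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # NormalFloat4 quantisation from the QLoRA paper
    bnb_4bit_use_double_quant=True,     # quantise the quantisation constants too
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",              # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # illustrative choice of layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()      # only the LoRA adapters are trainable
```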
Abu Dhabi releases Falcon 40B (26/May/2023)
Falcon, a foundational large language model (LLM) with 40 billion parameters, trained on one trillion tokens, grants unprecedented access to researchers and small and medium-sized enterprise (SME) innovators alike. TII is providing access to the model’s weights as a more comprehensive open-source package [for commercial use]…
Falcon 40B is a breakthrough led by TII’s AI and Digital Science Research Center (AIDRC). The same team also launched NOOR, the world’s largest Arabic NLP model last year, and is on track to develop and announce Falcon 180B soon.
New models bubbles viz (27/May/2023)
This viz looks at 2023-2024 optimal language models: those using Chinchilla scaling or better, which means training on a lot of data. Where GPT-3 used only about 2 tokens (words) per parameter, Chinchilla scaling laws advise using around 20 tokens (words) per parameter.
As with all my visualizations, this is available for you to use in any reasonable venue.
Download original: https://lifearchitect.ai/models/
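As a quick illustration of the ‘20 tokens per parameter’ rule of thumb behind the viz (the exact ratio varies by paper and compute budget), here is the arithmetic in Python:

```python
# Quick arithmetic behind the viz: Chinchilla-style scaling suggests roughly
# 20 training tokens per parameter, versus roughly 2 for GPT-3.
def chinchilla_tokens(params: float, tokens_per_param: int = 20) -> float:
    """Return the suggested training-set size in tokens."""
    return params * tokens_per_param

for params in (1e9, 70e9, 175e9):
    tokens = chinchilla_tokens(params)
    print(f"{params / 1e9:.0f}B params -> {tokens / 1e9:,.0f}B training tokens")
# 1B -> 20B tokens, 70B -> 1,400B tokens, 175B -> 3,500B tokens
```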
New GPT-4 vs human viz (20/May/2023)
By request, here's a simplified version of this full GPT-4 vs human viz; easier to read on a big screen!
Read more: https://lifearchitect.ai/iq-testing-ai/
List of ChatGPT plugins (May/2023)
ChatGPT now has nearly 100 plugins, each performing a different online task. Here’s one of the best examples:
The Link Reader plugin for ChatGPT is a great choice. It can read the content of links such as webpages, PDFs, and images.
To use it, give it a link and ask for information. ChatGPT works together with Link Reader to give you a detailed answer, so if you want a quick summary, this plugin is perfect.
Prompts For Link Reader ChatGPT Plugin
“Summarize this article”
“Estimate reading time for these links”
View all 80 plugins with two example prompts each, by theinsaneapp (dark mode; I can’t read this white text on black background stuff).
View 78 plugins with use cases, by startuphub.
Short film clips using 2023 technology (May/2023)
Here’s the latest and greatest. With the exponential rate of change, we’re only a few months away from full feature-length films. Created by Caleb Ward, with the help of Tyler Ward and Shelby Ward.
Tech stack: ChatGPT for the script, ChatGPT for the shot list, Midjourney v5.1 for images, ElevenLabs for voices, D-ID for face animations. A human then handled some masking and background removal (Adobe/Canva/other) and added the titles.
Here’s some behind-the-scenes info:
Mid-year AI report in progress (May/2023)
Every six months, I release a ‘plain English’ AI report. Previous reports are:
2022 retrospective: The sky is infinite.
Mid-2022 retrospective: The sky is bigger.
2021 retrospective: The sky is on fire.
A quick reminder that full subscribers to The Memo will get early access to my mid-2023 AI report, covering progress between January and the end of June 2023. The report is called