The Memo - 20/Jun/2023
AGI @ 50%, Meta AI MusicGen 3.3B, McKinsey: generative AI adding $4.4T/y to economy, Harvard cracks truthfulness, and much more!
FOR IMMEDIATE RELEASE: 20/Jun/2023
Welcome back to The Memo.
The Who Moved My Cheese? AI Awards! for June has four winners, each more preposterous than the last…
The 153-year-old science journal Nature is banning AI-generated illustrations. Read the announce. Read the report by Ars.
The Grammy awards are banning AI-generated music, ‘A work that contains no human authorship is not eligible in any categories’. Read more via Reuters.
Nikon is freaking out about AI replacing photography: “Millions of people around the world are generating surreal images just by entering a few keywords on a website, which is directly affecting photographers” Read more via PetaPixel.
Belgian ad agency Impact think that they’re immune to the AI revolution because they’re in construction:
AI can do a lot. But AI can’t finish this building on the Keyserlei in Antwerp. AI can't fix a leak or install a heating system either. Crafts(wo)men are here to stay, and they deserve to be recognized. Their skills are simply irreplaceable.
Good luck with that. Give it a few months…
In the Policy section, I cover the top 5 use case areas for AI, a clean literature review within a new policy document courtesy of Australia, the latest update on Terence Tao and GPT-4, and more…
In the Toys to play with section, we look at creating an AI replica of yourself that uses your old iMessages, Geoffrey Hinton’s latest, and new (and confronting) AI-generated short films.
The BIG Stuff
Exclusive: Harvard doubles truthfulness in LLMs using new approach (8/Jun/2023)
Once again, I’m not sure why the peanut gallery is focusing on old tech. Or worse, misguided tech like Microsoft’s imitation model Orca (I am not a fan of the unnecessary hype around this small model).
Anyway, Harvard’s latest research introduces a concept called ‘inference-time intervention’ (ITI).
Our findings suggest that LLMs may have an internal representation of the likelihood of something being true, even as they produce falsehoods on the surface… At a high level, we first identify a sparse set of attention heads with high linear probing accuracy for truthfulness. Then, during inference, we shift activations along these truth-correlated directions. We repeat the same intervention autoregressively until the whole answer is generated.
Read the paper: https://arxiv.org/abs/2306.03341
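At a high level, the intervention can be sketched as follows. This is a toy illustration only: a simple mean-difference probe stands in for the paper’s linear probes, and the shift is applied to a bare activation vector rather than inside a real transformer.

```python
import numpy as np

def probe_truth_direction(activations, labels):
    """Approximate a 'truthfulness' direction for one attention head.

    activations: (n_samples, head_dim) array of head outputs
    labels: 1 for truthful answers, 0 for untruthful ones
    The paper fits linear probes; a mean-difference probe is a crude stand-in.
    """
    direction = activations[labels == 1].mean(0) - activations[labels == 0].mean(0)
    return direction / np.linalg.norm(direction)

def intervene(head_activation, direction, alpha=5.0, sigma=1.0):
    """Shift a head's activation along the truth-correlated direction.

    In ITI this shift is repeated autoregressively at every decoding step;
    alpha scales the intervention strength, sigma the activation std-dev.
    """
    return head_activation + alpha * sigma * direction
```

In the full method, only a sparse set of heads (those whose probes score highest on truthfulness) receive the shift, which is what keeps the intervention minimally invasive.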
My AGI countdown needed truthfulness to be solved before reaching 50%, and this new approach achieves that: https://lifearchitect.ai/agi/
AGI countdown at 50% (16/Jun/2023)
I stand by seeing AGI achieved in the next few months (not the next few years), sometime between now and 2025ish. That doesn’t mean we’ll all have it in our lounge rooms, but that in the lab certain groups will have real AI on par with all human capabilities; probably Google DeepMind post-Gemini (my link) or OpenAI post-GPT-5 (my link), both with full physical embodiment.
McKinsey: AI to add $4.4T annually (14/Jun/2023)
Our latest research estimates that generative AI could add the equivalent of $2.6 trillion to $4.4 trillion annually across the 63 use cases we analyzed—by comparison, the United Kingdom’s entire GDP in 2021 was $3.1 trillion. [I had to check; Australia’s GDP was $1.5T, so this would be 3x Australia’s GDP just from post-2020 AI]
My table with other major AI economic analyses: https://lifearchitect.ai/economics/
92% of software developers are using AI now (13/Jun/2023)
Back in Oct/2022 when I presented to 4,000 Microsoft, Google, and IBM developers in Belgium, only around 50% of software developers had used AI coding tools.
(Watch that keynote video with transcript; see the timecode with the question about AI use.)
Just eight months later, that number has changed significantly! GitHub reports:
Almost all developers have used AI coding tools—92% of those we surveyed say they have used them either at work or in their personal time. We expect this number to increase in the months to come.
New version of GPT-4 and more updates (13/Jun/2023)
new function calling capability in the Chat Completions API
updated and more steerable versions of gpt-4 and gpt-3.5-turbo [gpt-4-0613 and gpt-3.5-turbo-0613]
new 16k context [≈12,000 words] version of gpt-3.5-turbo (vs the standard 4k version)
75% cost reduction on our state-of-the-art embeddings model [text-embedding-ada-002, see my viz of the GPT-3 family]
25% cost reduction on input tokens for gpt-3.5-turbo
Read the announce: https://openai.com/blog/function-calling-and-other-api-updates
Here’s an example using function calls with Stable Diffusion, LangChain, & DeepLake.
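For readers who haven’t tried the new functions parameter, here’s a minimal sketch of the shape of a function schema and a local dispatcher. The get_weather function and its schema are hypothetical examples of mine; the actual API call (which returns a function_call object with a name and JSON-encoded arguments) is omitted.

```python
import json

# Hypothetical local function the model may choose to call.
def get_weather(location, unit="celsius"):
    return {"location": location, "temperature": 22, "unit": unit}

# JSON Schema description passed in the `functions` field of a
# Chat Completions request (new in gpt-4-0613 / gpt-3.5-turbo-0613).
functions = [{
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}]

def dispatch(function_call):
    """Execute the function the model asked for.

    The API returns the chosen function's name plus a JSON string of
    arguments; we decode the arguments and call the matching function.
    """
    args = json.loads(function_call["arguments"])
    return {"get_weather": get_weather}[function_call["name"]](**args)
```

The result of the dispatched call is then sent back to the model in a follow-up message so it can compose a natural-language answer.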
The Interesting Stuff
Exclusive: 46% of crowd workers using LLMs to write (13/Jun/2023)
I’ve been talking about the ‘AI-zation’ of data for a couple of years, comparing it with pre- and post- war steel.
On 16/Jul/1945, the US detonated the first nuclear bomb in New Mexico. The bomb, referred to as ‘Gadget,’ was the same implosion design as the ‘Fat Man’ bomb dropped over Nagasaki a few weeks later; the test was codenamed Trinity, part of the Manhattan Project directed by Oppenheimer. Both detonations marked the beginning of an increased number of radioactive particles in Earth’s atmosphere, and these particles made their way into steel because air is drawn in during the steel production process.
In plain English: beginning in 1945, the steel we produce is slightly radioactive. Since those first bombs, our air has carried radionuclides like cobalt-60, which are deposited into the steel and give it a weak radioactive signature. Medical laboratories source pre-war steel (primarily from shipwrecks) to get ‘pure’ steel (‘low-background steel’) with effectively no radiation.
Similarly, there may be three data points for pre- and post- large language model data:
14/Feb/2019: OpenAI GPT-2 paper released; 26/May/2019: GPT-2 subreddit simulator using GPT-2 345M launched; 20/Aug/2019: OpenAI GPT-2 774M publicly released; 5/Nov/2019: OpenAI GPT-2 1.5B publicly released.
28/May/2020: OpenAI GPT-3 175B paper released; 18/Nov/2021: API completely public.
21/Mar/2021: EleutherAI GPT-Neo 2.7B (a GPT-2/3 clone) publicly released.
At some point, using any of these milestones (and I’d lean toward the first date of 14/Feb/2019), the data available on the web, in your email, on your social media, and even in new books (my link), started becoming ‘contaminated’ with AI-generated text.
Interestingly, the first article published using text from GPT-2—in The Verge on 14/Feb/2019 (link)—actually used screenshots of GPT-2-generated text, perhaps to minimize contamination…
This is now coming to a head in 2023, as even people tasked with writing pure text—Amazon Turk or Upwork workers hired to write new content—are using ChatGPT and other large language models to do their work for them… about 46% of the time.
In plain English: beginning in 2019, new AI models were increasingly trained on a significant percentage of AI-generated text (blog posts written by GPT-3 rather than, say, books written by humans). This percentage will get higher and higher.
This is shocking, and a significant change for humanity. Consider the implications, especially in sourcing human-generated data from the web for training new AI models. Labs like OpenAI could choose to remove any documents containing terms like ‘GPT’ or ‘AI’ (although this would be unwise), but it is incredibly difficult (perhaps impossible) to detect AI-generated content.
This means that our next large language models will be increasingly trained on AI-generated text rather than human-generated text. Like Ouroboros, the snake that eats itself.
[Note: I don’t actually have a problem with this, as I promote the concept of ‘integrated AI’. I just think it’s an interesting and significant point in humanity’s timeline!]
Read the paper: https://arxiv.org/abs/2306.07899
ChatGPT in healthcare (15/Jun/2023)
I’m doing a fair bit of AI consulting in the healthcare and medicine space this year, as well as two upcoming keynotes for doctors. I found this practical report particularly interesting.
I’ve taken to using ChatGPT to help empathically explain specific medical scenarios to patients and their loved ones. It’s become an invaluable resource for the frequent situations where my ER ward is too busy or short-staffed for explaining complex medical diagnoses in a way that is accurate but easy to understand.
Read: I’m an ER doctor. Here’s how I’m already using ChatGPT to help treat patients.
OpenAI: 4,500 enterprise + government clients via Microsoft (May-Jun/2023)
OpenAI has its own business development arm, but additionally benefits from its $10B investment partner, Microsoft, bringing in billion-dollar (and trillion-dollar) clients.
[Microsoft] Customers including IKEA and Volvo are leveraging this [OpenAI service] feature to discover business insights at scale and improve end-user journeys. (23/May/2023)
[Microsoft] Customers are already benefitting from Azure OpenAI Service today, including DocuSign, Volvo, Ikea, Crayon, and 4,500 others. (23/May/2023)
The Defense Department, the Energy Department and NASA are among the federal government customers of Azure Government… Federal, state and local government customers can access OpenAI’s GPT-4 and GPT-3 models for tasks such as generating answers to research questions, producing computer code and summarizing field reports
ChatGPT helps design an accumulator, part of a CPU (May/2023)
…two hardware engineers “talked” in standard English with ChatGPT-4 – a Large Language Model (LLM) built to understand and generate human-like text – to design a new type of microprocessor architecture. The researchers then sent the designs to manufacture.
Read the paper: https://arxiv.org/abs/2305.13243
Read my full list of ChatGPT achievements, from accounting to quantum computing.
NYT: Silicon Valley Confronts the Idea That the ‘Singularity’ Is Here (11/Jun/2023)
The innovation that feeds today’s Singularity debate is the large language model, the type of A.I. system that powers chatbots. Start a conversation with one of these L.L.M.s and it can spit back answers speedily, coherently and often with a fair degree of illumination. “When you ask a question, these models interpret what it means, determine what its response should mean, then translate that back into words — if that’s not a definition of general intelligence, what is?”
Meta BlenderBot 3x (7/Jun/2023)
Before ChatGPT, Meta AI’s BlenderBot was one of the front-runners for web-enabled chatbots using LLMs. ‘BB3X’ is the latest version of BB3, and while it is only 175B parameters on 300B tokens (2:1), it is still easily equivalent to the power of GPT-3 plus some extra smarts.
Read the announce: https://parl.ai/projects/bb3x/
Read the paper: https://arxiv.org/abs/2306.04707
Meta MusicGen 3.3B (12/Jun/2023)
Read the paper: https://arxiv.org/pdf/2306.05284.pdf
View the repo: https://github.com/facebookresearch/audiocraft
Playground: https://huggingface.co/spaces/facebook/MusicGen
Playground with looper: https://replicate.com/andreasjansson/musicgen-looper
Meta Voicebox (16/Jun/2023)
Man, those guys are busy. This is Meta’s latest text-to-speech model, which can clone any voice from just a few seconds of audio.
RAPHAEL text-to-image model (29/May/2023)
To me, outputs like the one above from this latest text-to-image model are better than real life…
[Compared with] Stable Diffusion XL, DeepFloyd, DALL-E 2, and ERNIE-ViLG 2.0… previous models often fail to preserve the desired concepts… only the RAPHAEL-generated images precisely reflect the prompts such as "pearl earring, Vermeer", "playing soccer", "five cars", "black high-waisted trouser", "white hair, manga, moon", and "sign, RAPHAEL", while other models generate compromised results.
Read the paper: https://arxiv.org/abs/2305.18295
Browse the project page: https://raphael-painter.github.io/
Watch my video:
AMD reveals MI300X (13/Jun/2023)
The MI300X can use up to 192GB of memory, which means it can fit even bigger AI models than other chips. Nvidia’s rival H100 only supports 120GB of memory, for example.
Large language models for generative AI applications use lots of memory because they run an increasing number of calculations. AMD demoed the MI300x running a 40 billion parameter model called Falcon.
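As a rough rule of thumb (my back-of-envelope sketch, not AMD’s figures), the weights alone of a model stored at 16-bit precision take about two bytes per parameter, which is why the memory ceiling matters:

```python
def model_memory_gb(n_params_billion, bytes_per_param=2):
    """Approximate accelerator memory needed just to hold the weights.

    fp16/bf16 = 2 bytes per parameter; excludes the KV cache,
    activations, and optimizer state, which add substantially more.
    """
    return n_params_billion * 1e9 * bytes_per_param / 1e9

falcon_40b_gb = model_memory_gb(40)  # ~80 GB of weights alone
```

By this estimate, Falcon 40B’s weights fit comfortably on a single 192GB MI300X, while larger models would need to be split across multiple smaller-memory chips.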
Andromeda supercomputer with 10 exaflops for startups (14/Jun/2023)
2,512 H100s on 314 nodes interlinked with 3.2Tbps infiniband
Available for experiments, training runs, and inference
You can queue training runs that use the entire cluster, or part of it, or just ssh in
No minimum duration and superb pricing
Big enough to train llama 65B in ~10 days
Total mass of 3,291 kg (GPUs only; not counting chassis, system, rack)
For use by startup investments of Nat Friedman and Daniel Gross
Read (not very much) more: https://andromedacluster.com/
GPT-4 Outperforms Humans in Pitch Deck Effectiveness Among Investors and Business Owners (Jun/2023)
…investors and business owners were 3x more likely to invest after reading a GPT-4 pitch deck than after reading a human one.
Read more: https://clarifycapital.com/the-future-of-investment-pitching
Leta AI’s avatar creator, Synthesia, valued at $1B (12/Jun/2023)
Since 2021, I’ve been using these guys to drive Leta AI, Una AI, and various other avatars. They recently hit a $1B valuation.
Synthesia, a digital media platform that lets users create artificial intelligence-generated videos, has raked in $90 million from investors — including U.S. chip giant Nvidia…
Read more via CNBC (exclusive).
Google PaLI-X 55B visual language model (29/May/2023)
Our visual backbone is scaled to 22B parameters, as introduced by [ViT-22B], the largest dense ViT model to date. To equip the model with a variety of complex vision-language tasks, we specifically focus on its OCR capabilities… The encoder-decoder backbone is initialized from a variant of the UL2 encoder-decoder model that uses 32B parameters.
Read the paper: https://arxiv.org/abs/2305.18565
Read my summary of all Pathways models: https://lifearchitect.ai/pathways/
Read about the largest visual language model, GPT-4: https://lifearchitect.ai/gpt-4/
Can a chatbot preach a good sermon? Hundreds attend church service generated by ChatGPT to find out (10/Jun/2023)
The 40-minute service — including the sermon, prayers and music — was created by ChatGPT…
Indeed, the believers in the church listened attentively as the artificial intelligence preached about leaving the past behind, focusing on the challenges of the present, overcoming fear of death, and never losing trust…
The entire service was “led” by four different avatars on the screen: two young women and two young men.
My coworkers are GPT-4 bots, and we all hang out on Slack (25/May/2023)
Most of our bot usage has been as filler – while the humans are talking, our bots will interject and share their thoughts and opinions. It has made the work environment incredibly entertaining.
But this is still GPT-4, the model that passes the bar exam, so you have full access to all its capabilities. We’ve been using Diana for general programming questions and brainstorming and Lucas for product-related stuff. He usually writes cards for us, fleshing them out with detail, acceptance criteria, and testing guidance, all in the correct format. He has also assisted us with creating product ideas and coming up with names, taglines, descriptions, etc. that might take a human quite some time to think up. I want 10 two-syllable product name options to choose from? Off Lucas goes. Need 20 more? Just ask him!
The Secret Sauce behind 100K context window in LLMs: all tricks in one place (16/May/2023)
Techniques to speed up training and inference of LLMs and enable context windows of up to 100K input tokens: ALiBi positional embeddings, sparse attention, FlashAttention, multi-query attention, conditional computation, and 80GB A100 GPUs.
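As one example from that list, ALiBi replaces learned positional embeddings with a per-head linear penalty on attention scores, which lets models extrapolate beyond their training context length. A simplified (symmetric, non-causal) sketch of the bias matrix:

```python
import numpy as np

def alibi_bias(seq_len, n_heads):
    """Build the ALiBi attention bias: -slope_h * |i - j| per head.

    Each head gets a geometrically decreasing slope (as in the ALiBi
    paper, for n_heads a power of 2); the bias is added to the raw
    attention scores before softmax, penalising distant tokens.
    """
    slopes = np.array([2 ** (-8 * (h + 1) / n_heads) for h in range(n_heads)])
    distance = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    # Shape: (n_heads, seq_len, seq_len); zero on the diagonal.
    return -slopes[:, None, None] * np.abs(distance)[None, :, :]
```

Because no position embedding is baked into the weights, the same bias formula simply extends to longer sequences at inference time.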
Policy
The smartest person in the world now co-chairing US generative AI working group (May-Jun/2023)
I’m really enjoying seeing former child prodigy (and Aussie native) Terence Tao continue to embrace post-2020 AI. Terence is measurably the smartest man in the world, and I covered his use of GPT-4 in The Memo edition 20/Apr/2023. More recently, he wrote:
As part of my duties on the [US] Presidential Council of Advisors on Science and Technology (PCAST), I am co-chairing (with Laura Greene) a working group studying the impacts of generative artificial intelligence technology (which includes popular text-based large language models such as ChatGPT or diffusion model image generators such as DALL-E 2 or Midjourney…
Read the source via Terry’s blog.
Read Terry’s latest post via Microsoft (Jun/2023).
Read Terry and GPT-4 writing essays (Jun/2023).
OpenAI lobbies EU (20/Jun/2023)