FOR IMMEDIATE RELEASE: 21/Oct/2022
Welcome back to The Memo.
I’ve released my second Devoxx session via YouTube at:
What’s in my AI? (VIMA, Gato, GPT-2, GPT-3, PaLM, Chinchilla)
A subset of this, covering OpenAI Whisper, is available as a 5-min video:
OpenAI Whisper... GPT-4? (short clip from 'What's in my AI?' presentation)
The BIG Stuff
OpenAI stock sales and new funding (21/Oct/2022)
After the original $1B Microsoft invested in OpenAI back in 2019, the massive AI lab is back for more. As of Jul/2022, OpenAI has even made their user numbers public (‘GPT-3, GitHub Copilot, and DALL-E 2 each have more than 1 million signups!’):

I often cite OpenAI’s Mar/2021 announcement (a good 1.5 years ago now) that the GPT-3 model alone was ‘typing’ 3.1M wpm, 24x7 (they said 4.5B words per day). To put that in perspective, that’s new AI-generated content from just one model, equivalent to the following (sanity-checked just after this list):
One new book every second.
One new US public library every day.
As much new text as all 300 million active users on Twitter every day.
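These figures check out on the back of an envelope. A quick sketch in Python (the ~50,000-word novel length is my own assumption, not OpenAI’s):

# Back-of-envelope check of OpenAI's Mar/2021 figure of 4.5B words per day.
words_per_day = 4.5e9

words_per_minute = words_per_day / (24 * 60)
print(f"{words_per_minute / 1e6:.1f}M words per minute")  # ~3.1M wpm, as cited

# Assuming a short novel of ~50,000 words (my assumption):
novel_words = 50_000
books_per_second = words_per_day / 86_400 / novel_words
print(f"{books_per_second:.1f} books per second")  # ~1.0, i.e. one new book every second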
That figure dates from months before they opened GPT-3 to the public in Nov/2021.
Today, Oct/2022, we could probably multiply that output by at least 10x.
Also today, a few new interesting details have been revealed:
OpenAI, whose text-generating & image-generating artificial intelligence has become a mainstream hit, is in advanced talks to raise more funding from Microsoft, which previously backed the startup with capital that includes credits to use Microsoft’s Azure cloud computing services to develop its technology, according to a person with knowledge of the discussions. A new deal could help Microsoft grow Azure usage, one of its top priorities, while keeping OpenAI’s business away from rivals including Amazon Web Services and Google Cloud.
The talks follow a previously undisclosed sale of OpenAI stock by existing shareholders last year to investors including Sequoia Capital, Tiger Global Management, Bedrock Capital and Andreessen Horowitz. In that deal, the price of the shares implied a valuation of nearly $20 billion for the seven-year-old startup, said several people with knowledge of the deal.
...A person with direct knowledge of OpenAI’s finances implied the company was on track to generate revenue in the low tens of millions of dollars this year. That means OpenAI’s valuation last year likely was between 500 and 800 times the revenue it projected in 2022.
…Jasper AI, which helps marketers produce text for blog posts or advertisements, has targeted annualized revenue of $80 million by the end of this year, up from around $30 million in annualized revenue a year earlier, the first year it started generating revenue. - paywalled, exclusive via The Information (20/Oct/2022)
The Interesting Stuff
Prof David Chalmers: Are Large Language Models Sentient? (13/Oct/2022)
My Aussie colleague Prof David Chalmers is well-known in the philosophy field (he also founded PhilPapers), but has more recently been cited in some of the big LLM/Transformer papers, including an acknowledgment in the recent 200-page Stanford report on LLMs (Jul/2022). He also shows up in a lot of Kurzweil’s work on the Singularity (book and documentary).
He has also previously spoken out about GPT-3, including this quote:
...GPT-3 is instantly one of the most interesting and important AI systems ever produced. This is not just because of its impressive conversational and writing abilities. It was certainly disconcerting to have GPT-3 produce a plausible-looking interview with me. GPT-3 seems to be closer to passing the Turing test than any other system to date (although “closer” does not mean “close”)... More remarkably, GPT-3 is showing hints of general intelligence.
- https://dailynous.com/2020/07/30/philosophers-gpt-3/#chalmers
More on David: https://en.wikipedia.org/wiki/David_Chalmers
This talk is from NYU this week (Oct/2022). You may also recognize my Aug/2022 LLM chart at 11:25 in the video; the chart is already out of date within a few weeks! (You can always find the latest chart at the top of LifeArchitect.ai/models.)
WeChat AI releases WeLM 10B (16/Oct/2022)
WeChat have emulated GPT-3, but used the compute-optimal recommendations from Chinchilla. The result is a 10B-parameter model trained on >300B tokens. Its training data is roughly 13% English and 87% Chinese, so it may not be as useful for English-only tasks.
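For context, Chinchilla’s compute-optimal heuristic works out to roughly 20 training tokens per parameter. A quick sketch (the 20:1 ratio is the commonly cited approximation from the Chinchilla paper, not a figure from the WeLM paper):

# Chinchilla's compute-optimal heuristic: ~20 training tokens per parameter.
params = 10e9          # WeLM: 10B parameters
tokens_per_param = 20  # common approximation from the Chinchilla paper

optimal_tokens = params * tokens_per_param
print(f"Compute-optimal tokens: ~{optimal_tokens / 1e9:.0f}B")  # ~200B

# WeLM trained on >300B tokens, comfortably past the compute-optimal point.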
Read the paper: https://arxiv.org/abs/2209.10372
Use the playground: https://welm.weixin.qq.com/docs/playground/
Text-to-image for consumers: Microsoft/DALL-E 2 vs Canva/SD (13/Oct/2022)
We are quickly seeing the ramp-up of post-2020 AI (language models and text-to-image models) being integrated into major apps. The first examples that are top of mind for me are Grammarly (Transformer, 30M users) and Replika (GPT-2, 20M users).
Now, Microsoft and Canva have built the latest AI models into their consumer-facing platforms.
Microsoft Designer uses DALL-E 2: https://designer.microsoft.com/
Canva now uses Stable Diffusion: https://www.canva.com/apps/text-to-image-(beta)
Some further reading: https://www.cnbc.com/2022/10/12/microsoft-launches-designer-its-answer-to-highly-valued-startup-canva.html
Google advances Instruct models with Flan (20/Oct/2022)
Google has applied an instruction-finetuning procedure they call ‘Flan’ (Finetuned LAnguage Net) on top of several models: PaLM, U-PaLM, and T5 (the Flan-T5 models are publicly released).
Read the paper: https://arxiv.org/abs/2210.11416
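The Flan-T5 checkpoints are available on Hugging Face. A minimal sketch of running one with the transformers library (the checkpoint name and prompt are my choices, not from the paper):

# Minimal sketch: instruction-following with a public Flan-T5 checkpoint.
# pip install transformers torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

prompt = "Answer the following question. What is the capital of Australia?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))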
LAION-5B paper updated (12/Oct/2022)
Have a read: https://openreview.net/forum?id=M3Y74vmsMcY
Google continues with robotics (13/Oct/2022)
We present a framework for building interactive, real-time, natural language-instructable robots in the real world, and we open source related assets (dataset, environment, benchmark, and policies). Trained with behavioral cloning on a dataset of hundreds of thousands of language-annotated trajectories, a produced policy can proficiently execute an order of magnitude more commands than previous works: specifically we estimate a 93.5% success rate on a set of 87,000 unique natural language strings specifying raw end-to-end visuolinguo-motor skills in the real world. We find that the same policy is capable of being guided by a human via real-time language to address a wide range of precise long-horizon rearrangement goals, e.g. "make a smiley face out of blocks". The dataset we release comprises nearly 600,000 language-labeled trajectories…
Check it out: https://interactive-language.github.io/
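The training recipe itself is plain behavioral cloning: a policy conditioned on a language instruction learns to imitate demonstrated actions. A toy sketch of one such training step (the architecture and dimensions are illustrative stand-ins, not Google’s):

# Toy behavioral cloning step: predict the demonstrated action from
# (observation, language instruction). Dimensions are illustrative only.
import torch
import torch.nn as nn

obs_dim, lang_dim, act_dim = 64, 32, 8
policy = nn.Sequential(
    nn.Linear(obs_dim + lang_dim, 128),
    nn.ReLU(),
    nn.Linear(128, act_dim),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# One batch from a language-annotated trajectory dataset (random stand-ins).
obs = torch.randn(16, obs_dim)            # robot observations
lang = torch.randn(16, lang_dim)          # embedded instructions, e.g. "make a smiley face"
expert_action = torch.randn(16, act_dim)  # demonstrated actions

pred_action = policy(torch.cat([obs, lang], dim=-1))
loss = nn.functional.mse_loss(pred_action, expert_action)  # imitate the demonstration
loss.backward()
optimizer.step()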
Developer combines Stable Diffusion, Whisper and GPT-3 for a futuristic design assistant (14/Oct/2022)

Read the article and watch the video: https://the-decoder.com/developer-combines-stable-diffusion-whisper-and-gpt-3-for-a-futuristic-design-assistant/
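The pipeline is straightforward to reproduce: transcribe speech with Whisper, expand it into a detailed image prompt with GPT-3, then render with Stable Diffusion. A rough sketch (model choices and prompt wording are my assumptions; the developer’s own code may differ):

# Rough sketch of the speech -> prompt -> image pipeline.
# pip install openai-whisper openai diffusers transformers torch
import whisper
import openai
from diffusers import StableDiffusionPipeline

# 1. Speech to text with Whisper.
speech = whisper.load_model("base").transcribe("request.wav")["text"]

# 2. Expand the spoken request into a detailed image prompt with GPT-3.
completion = openai.Completion.create(
    model="text-davinci-002",
    prompt=f"Rewrite this as a detailed image-generation prompt: {speech}",
    max_tokens=64,
)
image_prompt = completion.choices[0].text.strip()

# 3. Render the prompt with Stable Diffusion.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4").to("cuda")
pipe(image_prompt).images[0].save("design.png")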
BioGPT 347M (20/Oct/2022)
This is an interesting concept: training the older GPT-2 model architecture on PubMed documents.
Params: 'We adopt the GPT-2 model architecture as the backbone of our BioGPT, which is a Transformer decoder. Currently we cannot follow the GPT-3 setting due to its extremely large model with 175 billion parameters. ...our BioGPT has 347M parameters...'
Dataset: 'We collected all the PubMed items that were updated before 2021 from the official site using the wget tool. We then filtered out all the empty items with only title but no abstract. We used the left 15M items (each with both title and abstract) as our pre-training dataset.'
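The 347M figure is consistent with a GPT-2-medium-sized decoder. A rough parameter count (the layer, width, and vocabulary values are my estimates of the configuration, not quoted from the paper):

# Rough decoder parameter count for a GPT-2-medium-sized model.
# Estimates: 24 layers, d_model=1024, ~42K BPE vocabulary.
n_layers, d_model, vocab = 24, 1024, 42_000

per_layer = 12 * d_model**2   # attention + MLP parameters per layer (approx.)
embeddings = vocab * d_model  # token embedding matrix

total = n_layers * per_layer + embeddings
print(f"~{total / 1e6:.0f}M parameters")  # ~345M, close to the reported 347M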
Read the paper: https://arxiv.org/abs/2210.10341
Google UniTune (18/Oct/2022)
UniTune is a fine-tune of the Google Imagen text-to-image model, using just one image and a simple text prompt. It differs from Google DreamBooth in both approach (1 image vs DreamBooth’s 3-5 images) and focus. UniTune balances:
Fidelity (faithfulness to the input photo).
Expressiveness (faithfulness to the given edit prompt).
I pulled two important quotes out of the paper:
“This makes UniTune useful by casual users e.g. by speaking to a mobile device.”
“Our work raises interesting questions even more broadly, beyond image generation, on whether we could use similar techniques to imbue large models in other domains (e.g. GPT) with preferences by fine tuning on a single example.”
That second quote sounds a lot to me like fine-tuning language models (PaLM, GPT-3, Chinchilla) on human value-centred data. I go into this a little bit in the video below.
Read the paper: https://arxiv.org/abs/2210.09477
Watch my video:
Toys to Play With
My top two AI apps (Oct/2022)
I recently updated the ‘welcome email’ to new paid subscribers of The Memo. I realized that all of the original members wouldn’t receive the welcome email, so here is a copy of the most useful bit, my top two AI apps…
I use modern (2022ish) AI apps every day. These are the ones I like the most. They are both free.
Language model using Megatron-11B online. Free; no login required; works great on mobile, too. You can do really clever things with this.
Try this prompt, and then click ‘Generate Text’:
This is a poem about the lucky country.
Text-to-image using Stable Diffusion online. Free; no login required; works great on mobile, too. Enhance/upscale to 2048x2048 (and consider printing your results on canvas!).
Try this prompt, and then click ‘Enter’:
australian aboriginal art, dot paintings, blue, very intricate, oil on canvas, photorealistic
GAN vs Diffusion (13/Oct/2022)
Nyx AI uses older GAN models plus latent diffusion models (via Stable Diffusion) to produce what look like completely realistic photos. These compare with current state-of-the-art Google text-to-image outputs in resolution and clarity.
Take a look: https://nyx.gallery/
Twitch livestream using GPT-3 + Unity avatars (21/Oct/2022)
I can’t believe it’s taken streamers 2.5 years to catch on to the power of language models and avatars! Anyway, this show is interesting in its spontaneity and combination of GPT-2 (Replika), GPT-3, and avatars in Unity.
Live stream (sometimes online).
For those that enjoy reading technical literature…
The Memo by LifeArchitect.ai takes its name from the MIT AI Memos (1959-2004). I enjoyed taking a trip through this part of humanity’s history, with this quote from AI pioneer Prof Marvin Minsky’s introduction to the 1983 volumes:
[I find it strange] when entering students ask “what attracted you to AI” or “how did you get interested in computers?” To them it seems such things were always there; to us it seems they’ve barely yet arrived.
So now I’d caution students: “are you sure it’s good to be so interested in computers? Shouldn’t you try to start to work on what will come after computers?”
Of course I’d just pretend to be surprised when they’re surprised, because I haven’t yet myself imagined quite what such a thing might be. (Well, nothing like a present-day computer, but probably some sort of active-memory semantic network, and surely made of solid optics or something, because those 2-D “chips” waste too much space and therefore will not last too long.)
Download PDF of Minsky’s introduction to the 1983 volumes.
Browse the entire archive of the MIT AI Memos, 1959-2004.
Next
The open-source team at EleutherAI (under their child org, ‘CarperAI’, with support from other labs including Hugging Face) are looking to train their own Instruct model. Instruct models are tuned to follow instructions; OpenAI’s default InstructGPT uses Reinforcement Learning from Human Feedback (RLHF), while Google’s Flan models described above use supervised instruction finetuning. Unusually, EleutherAI has issued a press release about their plans, even though it may be well into 2023 before a trained model is possible…
Read the release: https://carper.ai/instruct-gpt-announcement/
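At the heart of RLHF is a reward model trained on pairs of human-ranked responses. A minimal sketch of the standard pairwise (Bradley-Terry) preference loss, not CarperAI’s actual code:

# Minimal sketch of the pairwise preference loss used to train RLHF
# reward models: push r(chosen) above r(rejected).
import torch
import torch.nn as nn

reward_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Stand-in embeddings for a batch of human-ranked response pairs.
chosen = torch.randn(8, 128)    # preferred responses
rejected = torch.randn(8, 128)  # dispreferred responses

r_chosen, r_rejected = reward_model(chosen), reward_model(rejected)
loss = -nn.functional.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()
optimizer.step()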
All my very best,
Alan
LifeArchitect.ai
Housekeeping…
Unsubscribe:
If you subscribed before 17/Jul/2022, please use the older interface or just reply to this email and we’ll stop your payments and take you off the list!
If you subscribed from 17/Jul/2022 onwards, please use Substack as usual.
Note that the subscription fee for new subscribers will increase from 1/Jan/2023. If you’re a current subscriber, you’ll always be on your old/original rate while you’re subbed.
Gift a subscription to a friend or colleague for the holiday season: