The Memo - 13/Mar/2023
Midjourney v5, PaLM-E, Microsoft VALL-E X (and Elon's gifted school), and much more!
FOR IMMEDIATE RELEASE: 13/Mar/2023
Welcome back to The Memo.
I’m pushing this edition out before the rumoured release of GPT-4 this Thursday 16/Mar/2023. If that announcement happens, you’re welcome to join me for a livestream discussion a few hours after Microsoft’s AI event ‘Reinventing productivity: The future of work with AI’.
Livestream link/notify me: (link)
Our Policy section remains in this edition for the latest ARK and US Govt AI report releases.
In the Toys to play with section, we look at Designer, an open-source ChatGPT model, an option for running LLaMA on your own laptop or Raspberry Pi(!), the return of the 24/7 AI series, a ChatGPT trivia competition, and much more…
The BIG Stuff
Midjourney v5 incoming (10/Mar/2023)
As of mid-March 2023, Midjourney v4 is easily the best text-to-image generator available, outperforming everything else I’ve tried (DALL-E 2, SD2.1, and Craiyon) as well as the published image examples from NUWA-Infinity, Parti, and Imagen.
Midjourney version 5 is moments away, featuring a higher 1024×1024 default resolution, and a fine-tuned model with a wider range of styles.
Remember, each image below is uniquely ‘conceptualised’ from scratch by AI, and didn’t exist until someone typed a few words into the prompt for the AI to ‘complete’…
View more images using the #midjourneyV5 hashtag on Twitter.
Read (not very much) more via their Discord announcement.
Google PaLM-E (6/Mar/2023)
More embodied language models! We mentioned Microsoft’s ChatGPT embodiment using drones and robot arms in The Memo 25/Feb/2023 edition. This week, Google brought us PaLM-E. PaLM-E stands for Pathways Language Model - Embodied (read my full paper on Google Pathways). PaLM-E has 562B parameters: 540B from PaLM for language, and 22B from ViT for vision.
See the videos: https://palm-e.github.io/
Read the paper: https://palm-e.github.io/assets/palm-e.pdf
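For readers who like to check the sums, here is a minimal sketch of how the PaLM-E parameter count breaks down. It uses only the two figures quoted above (540B for PaLM and 22B for ViT); the variable names are my own and purely illustrative, not anything from Google's paper or code.

```python
# Parameter counts in billions, as quoted in the text above.
# The dictionary keys and variable names are illustrative assumptions.
components = {
    "PaLM (language)": 540,
    "ViT (vision)": 22,
}

# Total model size is simply the sum of the two components.
total_b = sum(components.values())

# Show each component's share of the combined model.
for name, params_b in components.items():
    share = params_b / total_b
    print(f"{name}: {params_b}B ({share:.1%} of total)")

print(f"PaLM-E total: {total_b}B")
```

The takeaway is that the language model accounts for the overwhelming majority of the parameters, with the vision component adding a comparatively small slice on top.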
The Interesting Stuff
Microsoft VALL-E X: ‘Speak Foreign Languages with Your Own Voice’ (9/Mar/2023)
All the way back in 2017, I wrote a report on Elon Musk’s gifted school, Ad Astra in California. My discoveries and insights into their process were documented and distributed by the media. But even back then (just slightly ahead of Google’s Transformer release), there was not much in the way of tangible outputs: AI models that could actually do this stuff nicely. I wrote in my 2017 paper:
The future of gifted education assumes integrated artificial intelligence, so is more focused on the human side of life… Some Australian state education departments have recently enforced mandatory second language teaching (for example, Mandarin Chinese, Spanish, French) in classrooms. Second languages are not taught at Ad Astra. Yes, learning an additional language has been shown to be beneficial in supporting a child’s brain development and understanding of other cultures. Elon’s involvement in Neuralink (an American neurotechnology company developing implantable brain-computer interfaces) gives some indication of why learning languages is part of the past, not part of the future. (via LifeArchitect or PDF)
Microsoft have this week unveiled their (closed) research model VALL-E X, which ‘can generate high-quality speech in the target language via just one speech utterance in the source language as a prompt while preserving the unseen speaker's voice, emotion, and acoustic environment.’
Read the paper: https://arxiv.org/abs/2303.03926
Listen to the examples: https://vallex-demo.github.io/
MIT Technology Review: Exclusive interview with ChatGPT creator John Schulman + team (3/Mar/2023)
Jan Leike: ‘In one sense you can understand ChatGPT as a version of an AI system that we’ve had for a while. It’s not a fundamentally more capable model than what we had previously. The same basic models had been available on the API for almost a year before ChatGPT came out. In another sense, we made it more aligned with what humans want to do with it. It talks to you in dialogue, it’s easily accessible in a chat interface, it tries to be helpful. That’s amazing progress, and I think that’s what people are realizing.’
Read the whole interview from MIT Technology Review.
Humane: ex-Apple staff have a quarter billion to spend on AI (9/Mar/2023)
You heard it here first: Humane AI, or hu.ma.ne (not to be confused with the EU-funded HumanE-AI-Net project), is made up of former members of Apple’s iPhone design team. The company is now partnered with OpenAI, and I have a very strong feeling that we’re going to be hearing a lot more about it soon…
The company, founded in 2018 by Imran Chaudhri and Bethany Bongiorno, has now raised $241 million but has yet to disclose what it is building, saying only that it is a "software platform and consumer device built from the ground up for artificial intelligence."
A video posted by the company and patent filings suggest that a wearable device will project information onto the real world and allow users to manipulate that information with their hands. (—via Reuters)
‘Google’s Plan to Catch ChatGPT Is to Stuff AI Into Everything’ (8/Mar/2023)
…one current employee puts it: “There is an unhealthy combination of abnormally high expectations and great insecurity about any AI-related initiative.”
…The effort has Pichai reliving his days as a product manager, as he’s taken to weighing in directly on the details of product features, a task that would usually fall far below his pay grade, according to one former employee. Google co-founders Larry Page and Sergey Brin have also gotten more involved in the company than they’ve been in years, with Brin even submitting code changes to Bard, Google’s ChatGPT-esque chatbot.
Between ChatGPT launch and Bard launch, there is now a lot of AI in the browser (Mar/2023)
7/Feb/2023: Microsoft launches Bing Chat (GPT-3.5) in Edge.
10/Feb/2023: Opera adds ChatGPT model to browser.
28/Feb/2023: Microsoft adds Bing Chat (GPT-3.5) to Windows 11 taskbar.
2/Mar/2023: Brave adds Brave AI model (BART 140M/DeBERTa 1.5B fine-tuned to search results).
8/Mar/2023: DuckDuckGo launches DuckAssist, using OpenAI + Anthropic models to answer from Wikipedia.
9/Mar/2023: Discord builds chat (GPT-3.5) into their platform.
Read more: https://lifearchitect.ai/bard/#timeline
Countdown to AGI (12/Mar/2023)
With Microsoft’s ChatGPT + drones, and Google’s PaLM-E (above), I think we’re now around 42% of the way to artificial general intelligence, as of this month.
Read more: https://lifearchitect.ai/agi/
Policy
I’ve released a ‘special edition’ of The Memo to step through ARK’s latest report on AI.
Title: ARK Invest: Big Ideas 2023!
Chapter: Artificial Intelligence: Creating The Assembly Line For Knowledge Workers
By: Will Summerlin, Frank Downing
Date: 31/Jan/2023
Read the special edition of The Memo: ARK’s report with my comments.