The Memo - 26/Jul/2022
AI roadmap, Microsoft NUWA-Infinity, BAAI mid-2022 report, and much more!
FOR IMMEDIATE RELEASE: 26/Jul/2022
Welcome back to The Memo.
The Memo is supposed to go out monthly, but the accelerating pace of change means these updates are a little more frequent right now, maybe just for this quarter!
Let’s start here. Woj (Wojciech Zaremba) is one of the founders of OpenAI, and he is currently working on GPT-4, a large language model predicted to have 5-10 trillion parameters (see my video on the AI report card). His comment from 21/Jul/2022 is very telling: ‘I see future AI being incredible in connecting with people and providing emotional support and care.’
The BIG Stuff
Roadmap: AI’s next big steps in the world (26/Jul/2022)
My latest article is being released soon.
Title: Roadmap: AI’s next big steps in the world (AI that matters, as it happens, in plain English).
Length: 3,000 words.
Subtitles include:
Brain-machine interfaces are racing ahead.
Education is over.
Distributed income is next, and it is urgently necessary.
Jobs are for robots, high unemployment is a good thing.
Mental wellness will be the new normal.
AI is virtually ready right now, but it will take a while to flow through to all the people who want to use it.
The next step will be giant.
As a subscriber to The Memo, you have exclusive access to the article for the next little while. (You are still free to share though.)
https://lifearchitect.ai/roadmap/
Quick addendum to my ‘use cases’ article (25/Jul/2022)
A few weeks ago, I published a report on the many different use cases for LLMs like GPT-3. I finally found the missing one that was buried in my notes. And it’s a good one…
#11.5: Game design as a use case for large language models. Check out this incredible video from back in Sep/2021, where the creator sends voice commands to GPT-3, which then helps him create and design objects in augmented reality:
BCI/BMI progress
Synchron has implanted its brain-machine devices in four patients in Australia, who can now send WhatsApp messages and make online purchases via thought. In Feb/2022, the ex-President of Neuralink, Max Hodak, invested in Synchron and joined its advisory board. Synchron will implant 16x more devices this year.
My interest in this space is the upcoming integration of very advanced AI models (Google PaLM, the imminent GPT-4 release) directly with our brains, for those who want it…
The Interesting Stuff
LLMs used for medical questions (18/Jul/2022)
Teams from MIT, IBM Watson, and various universities are working on using LLMs to help answer doctors’ questions.
…their initial goal: building a model that can automatically answer physicians’ questions in an electronic health record (EHR). For the next step, they will use their dataset to train a machine-learning model that can automatically generate thousands or millions of good clinical questions, which can then be used to train a new model for automatic question answering.
https://news.mit.edu/2022/teaching-ai-ask-clinical-questions
Improving truthfulness in models (16/Jul/2022)
Anthropic (ex-OpenAI staff) continues to publish papers, the latest being a look at truthfulness in models. They have integrated a mechanism where a model (like GPT-3) can also output a truth probability, indicating whether it ‘knows’ or ‘doesn’t know’ that its own response is honest and factual. Fascinating!
https://arxiv.org/abs/2207.05221
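One way a model could report a truth probability like this is to show it its own proposed answer, ask whether that answer is True or False, and compare the probabilities it assigns to those two options. Here is a minimal sketch of that last step. The function name and the log-probability numbers are made up for illustration, and this is not necessarily Anthropic's exact setup.

```python
import math


def p_true(logprob_true: float, logprob_false: float) -> float:
    """Normalise two log-probabilities into a single P(True) value."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logprob_true, logprob_false)
    e_true = math.exp(logprob_true - m)
    e_false = math.exp(logprob_false - m)
    return e_true / (e_true + e_false)


# Hypothetical log-probabilities a model might assign to the tokens
# "True" and "False" when asked to judge its own proposed answer.
confident = p_true(logprob_true=-0.2, logprob_false=-1.8)
unsure = p_true(logprob_true=-1.0, logprob_false=-1.0)

print(f"Confident answer: P(True) = {confident:.2f}")
print(f"Unsure answer:    P(True) = {unsure:.2f}")
```

A value near 1 would suggest the model 'knows' its answer is right, while a value near 0.5 would suggest it 'doesn't know'.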
More physical embodiment via WHIRL from Carnegie Mellon (24/Jul/2022)
Researchers at Carnegie Mellon University are continuing to progress their human-to-robot imitation work. The current iteration can copy a human action video directly (same task, same camera angle, and same environment). One of the scientists notes that: “For the "improvement by exploration" phase, we use pre-trained deep visual representations trained from passive internet data to compute the distance between human and robot frames. So, the distance is robust to small changes in the camera, etc. The teaser video… has a few examples (see 0:46 onwards). That being said, human is still acting in the same environment. Our follow-up work to be released soon aims to upgrade WHIRL to learn from human interaction videos from entirely different scenes (let's say even a human video from YouTube).”
https://human2robot.github.io/
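To make the 'distance between human and robot frames' idea a little more concrete, here is a rough sketch: each video frame is turned into a feature vector by a pre-trained visual model, and the robot's attempt is scored by how far its vectors are from the human's. The feature vectors below are invented, and cosine distance is my assumption; the quote above does not say which representation or distance measure WHIRL uses.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def video_distance(human_feats, robot_feats):
    """Mean per-frame distance between two equally long feature sequences."""
    if len(human_feats) != len(robot_feats):
        raise ValueError("both videos need the same number of frames")
    total = sum(cosine_distance(h, r) for h, r in zip(human_feats, robot_feats))
    return total / len(human_feats)


# Made-up features for a two-frame human demo and two robot attempts.
human = [[1.0, 0.0, 0.5], [0.8, 0.2, 0.4]]
close_attempt = [[0.9, 0.1, 0.5], [0.8, 0.3, 0.4]]
far_attempt = [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]

print("close attempt:", video_distance(human, close_attempt))
print("far attempt:  ", video_distance(human, far_attempt))
```

The smaller the number, the closer the robot's attempt looks to the human demo, which is the kind of signal an 'improvement by exploration' loop could try to minimise.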
Toys to Play With
Microsoft NUWA-Infinity text-to-image/video model (demo 21/Jul/2022)
This thing is beautiful! The largest example image is 38,912x2048. For reference, 4K resolution is 3840x2160, so this model is generating images that are roughly 10x the width of a 4K frame (at slightly less than 4K height), and apparently it is ‘infinite’…
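Here is the quick arithmetic behind that comparison, using only the two resolutions quoted above (the variable names are mine):

```python
# Dimensions quoted above: largest NUWA-Infinity example vs a 4K frame.
nuwa_width, nuwa_height = 38_912, 2_048
uhd_width, uhd_height = 3_840, 2_160

# How many 4K frames wide the image is.
width_ratio = nuwa_width / uhd_width

# How many times more pixels it has than a single 4K frame.
pixel_ratio = (nuwa_width * nuwa_height) / (uhd_width * uhd_height)

print(f"Width vs 4K: {width_ratio:.1f}x")          # Width vs 4K: 10.1x
print(f"Total pixels vs 4K: {pixel_ratio:.1f}x")   # Total pixels vs 4K: 9.6x
```

So the '10x' figure holds for width; by total pixel count it is a little under 10x because the image is slightly shorter than a 4K frame.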
Limited demo: https://nuwa-infinity.microsoft.com/
Watch my video:
Microsoft’s other new text-to-image model (demo 7/Jul/2022)
Try your own keyword or phrase in Microsoft’s competitor to DALL-E 2: