The Memo - 15/Sep/2022
Google Pathways PaLI, Andi as a replacement for Google search, Adept ACT-1, and much more!
FOR IMMEDIATE RELEASE: 15/Sep/2022
Welcome back to The Memo.
The BIG Stuff
Google Pathways PaLI: Pathways Language and Image model (15/Sep/2022)
Google continues to build on the massive Pathways architecture, announcing a new visual language model today. It is similar to DeepMind Flamingo (which builds on the Chinchilla language model), but is trained on 10B+ images (5x bigger than LAION and competitors), paired with the mT5 large language model, for a total of 17B parameters. As a reminder, visual language models like this one take images and text as input, but output text only. You can see this in action with my videos on DeepMind Flamingo: the Flamingo video - Part 1, and the model joking about a photo of former President Obama in the Flamingo video - Part 2.
Read the paper: https://arxiv.org/abs/2209.06794
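For intuition, here is a minimal sketch of the interface a visual language model like PaLI or Flamingo exposes: images and text go in, and only text comes out. None of this is PaLI's real code; the function names and stub bodies are hypothetical placeholders (per the paper, the 17B-parameter PaLI pairs a roughly 4B-parameter ViT image encoder with the 13B-parameter mT5-XXL).

# Toy sketch of the visual-language-model interface. All names are
# hypothetical; this is not PaLI's actual code.

def encode_image(image_pixels) -> list[float]:
    # Stand-in for the ViT image encoder: pixels -> a sequence of embeddings.
    return [0.1, 0.2, 0.3]

def decode_text(visual_embeddings: list[float], prompt: str) -> str:
    # Stand-in for the mT5-style text model, conditioned on both the visual
    # embeddings and the text prompt. The output is always text.
    return f"[answer conditioned on {len(visual_embeddings)} visual tokens + {prompt!r}]"

embeddings = encode_image(image_pixels=None)
print(decode_text(embeddings, "What is shown in this photo?"))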
I stand by my assertion that Google Pathways is a sleeper hit, and its philosophy of designing a single model architecture that does everything is groundbreaking. For a look at the state-of-play back in Aug/2022(!), download my independent report. Note that two new models have been added to the family in the ~4 weeks since then: PaLM-SayCan for embodiment/robotics, and this PaLI vision model:
Read my original Pathways report (Aug/2022): https://lifearchitect.ai/pathways/
Watch the video (Aug/2022):
New viz: Code Generation models (Sep/2022)
I looked at the major code generation models, including those bigger than GitHub Copilot (OpenAI Codex). Notably, Salesforce's CodeGen model is open, and far bigger than Copilot. It should also be noted that even smaller models, like the internal model Google trained on its monorepo, are seeing extraordinary results.
https://lifearchitect.ai/models/#code
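If you want to try an open code model yourself, Salesforce has published CodeGen checkpoints on Hugging Face. A minimal sketch, assuming the transformers library is installed and you have enough memory for the 2B-parameter Python-specialised variant (smaller and larger checkpoints also exist):

# Run Salesforce's open CodeGen locally via Hugging Face transformers.
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "Salesforce/codegen-2B-mono"  # "mono" = Python-specialised
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))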
Adept’s ACT-1 Transformer model (14/Sep/2022)
Headed up by Dr Ashish Vaswani, a co-creator of the Transformer at Google, and other researchers coaxed away from DeepMind, Google Brain, and OpenAI, plus more than $65M in funding, Adept have finally detailed something amazingly powerful. The ACT-1 model looks like it brings a dramatic improvement to human efficiency in browser and application use.
ACT-1 is a large-scale Transformer trained to use digital tools — among other things, we recently taught it how to use a web browser. Right now, it’s hooked up to a Chrome extension which allows ACT-1 to observe what’s happening in the browser and take certain actions, like clicking, typing, and scrolling, etc.
You have to see it to believe it.
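Adept hasn't said how ACT-1 works internally, but the description above implies a simple observe-act loop between the model and the browser extension. Here is a minimal sketch of such a loop; every name and stub in it is hypothetical, for intuition only.

# Hypothetical observe-act loop for a tool-using model. Not Adept's code.
import time

def observe_browser() -> str:
    # Stand-in for the Chrome extension serialising the current page.
    return "<button id='submit'>Search</button> ..."

def next_action(observation: str, goal: str) -> dict:
    # Stand-in for the Transformer mapping (observation, goal) -> an action.
    return {"type": "click", "target": "#submit"}

def execute(action: dict) -> None:
    # Stand-in for the extension performing clicks, typing, and scrolling.
    print(f"executing {action['type']} on {action.get('target')}")

goal = "Search for a 3-bedroom house"
for _ in range(3):                           # a few iterations of the loop
    observation = observe_browser()          # 1. observe what's on screen
    action = next_action(observation, goal)  # 2. model picks an action
    execute(action)                          # 3. extension carries it out
    time.sleep(0.1)                          # wait for the page to update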