The Memo - 24/Nov/2022
Stable Diffusion 2.0, Meta Galactica 120B, Microsoft/NVIDIA H100 supercomputer, and much more!
FOR IMMEDIATE RELEASE: 24/Nov/2022
Welcome back to The Memo.
This edition features a bunch of exclusive content, including Chinese AI-generated songs, one of which has 100M views. We also hear some reggae music via Jukebox, and play with GPT-3 in Roblox (based on Google SayCan) and a “human or AI” GPT-3 game!
I’ve been experimenting with livestreams to allow more interaction and Q&A during videos. You’re welcome to join the next one. You can click the ‘notify’ button to be pinged when a new livestream begins.
The BIG Stuff
Stable Diffusion 2.0 released (24/Nov/2022)
The new Stable Diffusion 2.0 base model ("SD 2.0") was released two hours ago. It was trained from scratch using the OpenCLIP-ViT/H text encoder, and generates 512×512 images, with improvements over previous releases (better FID and CLIP-g scores).
It also features upscaling to 2048×2048 and beyond!
Read the release notes: https://github.com/Stability-AI/StableDiffusion
There is a demo at HF, or wait for the update to hit mage.space and the official DreamStudio.
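For readers who want to try SD 2.0 locally, here is a minimal sketch using Hugging Face’s diffusers library. The repo ids (stabilityai/stable-diffusion-2-base and stabilityai/stable-diffusion-x4-upscaler) and the hardware assumptions (a CUDA GPU, fp16) are my own; treat this as a starting point rather than official usage.

```python
# Minimal text-to-image sketch for Stable Diffusion 2.0 (base model, 512x512).
# Assumes: pip install diffusers transformers accelerate, plus a CUDA GPU.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionUpscalePipeline

base_id = "stabilityai/stable-diffusion-2-base"           # assumed repo id
upscaler_id = "stabilityai/stable-diffusion-x4-upscaler"  # assumed repo id

pipe = StableDiffusionPipeline.from_pretrained(base_id, torch_dtype=torch.float16).to("cuda")
prompt = "a photograph of an astronaut riding a horse"
image = pipe(prompt).images[0]   # 512x512 PIL image
image.save("astronaut_512.png")

# Optional: the 4x upscaler takes a 512x512 output towards 2048x2048
# (this step is VRAM-hungry; smaller inputs or tiling may be needed).
upscaler = StableDiffusionUpscalePipeline.from_pretrained(upscaler_id, torch_dtype=torch.float16).to("cuda")
upscaled = upscaler(prompt=prompt, image=image).images[0]
upscaled.save("astronaut_2048.png")
```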
The Interesting Stuff
Meta Galactica 120B (16/Nov/2022)
Meta AI has released Galactica, a 120B-parameter model specializing in scientific data. Meta have hit on some very interesting innovations here: training on prompts is fascinating, and so is maintaining full reference data. From the paper:
- “Chinchilla scaling laws”… did not take into account fresh versus repeated tokens. In this work, we show that we can improve upstream and downstream performance by training on repeated tokens.
- Our corpus consists of 106 billion tokens from papers, reference material, encyclopedias and other scientific sources.
- We train the models for 450 billion tokens.
- For inference, Galactica 120B requires a single A100 node.
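A quick bit of arithmetic from those two figures: 450B training tokens over a 106B-token corpus is roughly 4.25 passes (epochs) over the data, which is exactly the repeated-token regime the first point describes. If you want to poke at Galactica yourself, here is a minimal sketch using Hugging Face transformers; the checkpoint name (facebook/galactica-1.3b) and the use of OPTForCausalLM are assumptions based on the public release, and the 120B variant is the one that needs a full A100 node.

```python
# Minimal Galactica sketch via Hugging Face transformers.
# Assumes: pip install transformers accelerate, a CUDA GPU, and the
# facebook/galactica-* checkpoints on the Hub (1.3B fits on a consumer GPU;
# the 120B model needs a full A100 node, per the paper).
import torch
from transformers import AutoTokenizer, OPTForCausalLM

model_id = "facebook/galactica-1.3b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OPTForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# Galactica is trained on prompts and reference data; special tokens such as
# [START_REF] nudge it to generate citations.
prompt = "The Transformer architecture [START_REF]"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(out[0]))
```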
See my report card: https://lifearchitect.ai/report-card/
Read the paper: https://galactica.org/static/paper.pdf
Play with the demo: https://galactica.org/
Note: The slick demo site was pulled within 72 hours, seemingly for political/sensitivity reasons. Read the update by MIT Technology Review: https://www.technologyreview.com/2022/11/18/1063487/meta-large-language-model-ai-only-survived-three-days-gpt-3-science/
It was reinstated by an HF user, without the nice interface.
Watch my 1-hour livestream of the model release, recorded a few hours before the demo was suspended:
VectorFusion by UC Berkeley (21/Nov/2022)
Text-to-image for vectors (SVG exports).
Prompt: the Sydney Opera House. minimal flat 2d vector icon. lineal color. on a white background. trending on artstation
MagicVideo by Bytedance (22/Nov/2022)
Efficient text-to-video generation by the Chinese company ByteDance.
Read the paper: https://arxiv.org/abs/2211.11018
View the gallery: https://magicvideo.github.io/
SceneComposer by Johns Hopkins & Adobe (22/Nov/2022)
Text-to-image generation by researchers at Johns Hopkins University and Adobe.
Read the paper: https://arxiv.org/abs/2211.11742
View the gallery: https://zengyu.me/scenec/
Andromeda: Cerebras’ supercomputer (14/Nov/2022)
Andromeda delivers 13.5 million AI cores and near-perfect linear scaling across the largest language models. It is not really comparable to a standard GPU-based supercomputer. Andromeda is deployed in Santa Clara, California.
Read a related article by The Verge.
NVIDIA & Microsoft building a supercomputer based on the H100 (16/Nov/2022)
Back in the Jul/2022 edition of The Memo, we talked about NVIDIA’s newest H100 Hopper chips: the fastest AI-specific GPUs yet, designed for (and partly by) AI!