FOR IMMEDIATE RELEASE: 17/Sep/2023
Dr Sébastien Bubeck, Microsoft Research (7/Apr/2023):
There is some intelligence in this [GPT-4] system… Beware of the trillion-dimensional space. It's something which is very, very hard for us as human beings to grasp. There is a lot that you can do with a trillion parameters…
It could absolutely build an internal representation of the world, and act on it as the processing progresses through the layers and through the sentence temporally… We shouldn't think about those neural networks as learning a simple concept like ‘Paris is the capital of France.’ It's doing much more, like learning operators, it’s learning algorithms. It's not just retrieving information, not at all. It has built internal representations that allow it to reproduce the data that it has seen succinctly… Yes, it was trained just to predict the next word. But what emerged out of this is a lot more than just a statistical pattern-matching object.
Welcome back to The Memo.
You’re joining full subscribers from Harvard, Rice, Columbia, MIT, Cornell, Stanford, Brown, UC Berkeley, FSU, Princeton, and more…
This is another long edition. I don’t usually announce my keynotes here (nearly all of them are for private bodies), but I’m really looking forward to opening the next public Devoxx event in Ukraine, ‘AI – Friend or Foe?’ My keynote is called ‘Superintelligence: No one is smart enough’ and the recording will be made available to full subscribers. The next roundtable is on 23/Sep/2023, details at the end of this edition.
In the Toys to play with section, we look at a new prediction viz for GPT, a brilliant application of image models for your own family, and a new vision-language-action model for self-driving vehicles.
The BIG Stuff
Andreessen Horowitz: How Are Consumers Using Generative AI? (13/Sep/2023)
ChatGPT: estimated 200M monthly users, 1.6B monthly visits (Jun/2023).
ChatGPT: 24th most visited website globally.
80% of the top 50 AI products didn’t exist a year ago.
Read more: https://a16z.com/how-are-consumers-using-generative-ai/
Falcon 180B is the largest open(-ish) dense model right now (6/Sep/2023)
TII in Abu Dhabi released a 180B-parameter version of Falcon, trained on 3.5T tokens (roughly 20 tokens per parameter). It is the largest and highest-performing open dense model in the world right now, about 2.5 times the size of Llama 2 (70B). Note that ‘open’ here does not extend to full commercial use.
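As a quick sanity check on that tokens-per-parameter figure, here’s a minimal Python sketch using the numbers from the announcement; the ~20-tokens-per-parameter ‘Chinchilla’ guideline is included only as a comparison point, not something TII stated.

```python
# Tokens-per-parameter check for Falcon 180B, using the announced figures.
params = 180e9    # 180B parameters
tokens = 3.5e12   # 3.5T training tokens

print(f"Tokens per parameter: {tokens / params:.1f}:1")  # ~19.4:1, i.e. roughly 20:1

# For comparison, the 'Chinchilla' compute-optimal guideline of ~20 tokens/parameter
# would suggest about this many training tokens for a 180B model:
print(f"Chinchilla-style target: {20 * params / 1e12:.1f}T tokens")
```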
Read more: https://huggingface.co/blog/falcon-180b
Try the demo: https://huggingface.co/spaces/tiiuae/falcon-180b-demo
See Falcon 180B on the Models Table and Timeline.
Watch a replay of my Falcon 180B livestream.
Apple UniLM 34M + Apple Ajax 200B (7/Sep/2023)
Hacker Jack Cook discovered Apple’s newest LLM, a tiny Transformer model used for on-device predictive text in the next versions of macOS (Sonoma 14.0) and iOS (iOS 17).
From my calculations based on sizes of each layer, Apple’s predictive text model appears to have about 34 million parameters, and it has a hidden size of 512 units. This makes it much smaller than even the smallest version of GPT-2.
This one is so interesting to me because Apple has 2B+ active devices out there (Feb/2023).
UniLM may be the first Transformer model shipped on-device at this scale, and it’s hitting a huge slice of the population.
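If you’re curious how a parameter estimate like Jack’s is put together, here is a rough back-of-the-envelope sketch for a small GPT-2-style decoder. Only the hidden size of 512 comes from his write-up; the layer count and vocabulary size below are illustrative assumptions, not figures from the post.

```python
# Back-of-the-envelope parameter count for a small GPT-2-style decoder.
# Only the hidden size (512) comes from Jack Cook's write-up; the layer count
# and vocabulary size are illustrative assumptions.
d_model = 512        # hidden size reported in the write-up
n_layers = 6         # assumed number of transformer blocks
vocab = 15_000       # assumed (sub)word vocabulary size

per_block = (
    4 * d_model * d_model            # attention: Q, K, V, output projections
    + 2 * d_model * (4 * d_model)    # MLP: up- and down-projections (4x expansion)
)
embeddings = vocab * d_model         # token embedding table (often tied with the output head)

total = n_layers * per_block + embeddings
print(f"~{total / 1e6:.0f}M parameters")  # tens of millions, the same ballpark as the ~34M estimate
```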
Read more: https://jackcook.com/2023/09/08/predictive-text.html
Browse the repo: https://github.com/jackcook/predictive-spy
Apple’s most advanced LLM, known internally as Ajax GPT, has been trained on “more than 200 billion parameters” and is more powerful than OpenAI’s GPT-3.5…
See UniLM on the Models Table.
We covered some of this (UniLM and Ajax) in my recent livestream (replay).
Bonus: There were 9 major model announcements in the first two weeks of Sep/2023.
TinyLlama
Falcon 180B
FLM-101B
Persimmon-8B
UniLM
phi-1.5
NExT-GPT
MoLM
DeciLM
Read more (including playground and paper) on the Models Table and Timeline.
GPT-4 hits 99th percentile in creativity testing (25/Aug/2023)
The gold standard for testing creativity is the Torrance Tests of Creative Thinking, TTCT (wiki). The TTCT was designed to measure six sub-constructs of creativity (fluency, flexibility, originality, elaboration, titles, and closure), plus an overall measure of creative strength.
GPT-4 scored in the top 1% of test-takers for the originality of its ideas… Scholastic Testing Service is a private company and does not share its prompts with the public. This ensured that GPT-4 could not have scraped the internet for past prompts and their responses.
Read more: https://theconversation.com/ai-scores-in-the-top-percentile-of-creative-thinking-211598
I’ve updated my viz to include this important benchmark:
Stability AI releases Stable Audio (13/Sep/2023)
I keep thinking back to when we didn't have Stability AI, and it was just Google and Meta teasing us with mouth watering papers, but never letting us touch them. I'm so thankful Stability exists. (HN user, 13/Sep/2023)
Stable Audio is a latent diffusion model (like Stable Diffusion) trained on >800k [studio-quality] sound files. A 907M-parameter U-Net powers Stable Audio.
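For anyone who wants the intuition behind ‘latent diffusion’ here, the sketch below is a generic, heavily simplified DDPM-style sampling loop over a latent, written in plain NumPy. It is not Stability AI’s code: predict_noise and decode_audio are placeholders standing in for the trained 907M-parameter U-Net and the audio decoder, and the noise schedule is an assumption.

```python
import numpy as np

# Generic DDPM-style sampling over a latent, as a conceptual illustration of latent
# diffusion. NOT Stability AI's implementation: `predict_noise` and `decode_audio`
# are placeholders for the trained U-Net and audio decoder.
rng = np.random.default_rng(0)

T = 50
betas = np.linspace(1e-4, 0.02, T)   # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(latent, t):
    # Placeholder: a real U-Net predicts the noise present in `latent` at step `t`.
    return 0.1 * latent

def decode_audio(latent):
    # Placeholder: a real decoder maps the denoised latent back to a waveform.
    return latent.flatten()

latent = rng.standard_normal((1, 64, 1024))  # start from pure Gaussian noise in latent space

for t in reversed(range(T)):
    eps = predict_noise(latent, t)
    # Standard DDPM posterior-mean update (Ho et al., 2020)
    latent = (latent - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:
        latent += np.sqrt(betas[t]) * rng.standard_normal(latent.shape)

print(decode_audio(latent).shape)
```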
Read release: https://stability.ai/research/stable-audio-efficient-timing-latent-diffusion
Try it out: https://stableaudio.com/
Exclusive: Microsoft argues for data quality with phi-1.5 (12/Sep/2023)
Microsoft’s latest LLM is a 1.3B-parameter model trained on 150B tokens, with performance comparable to models 5x larger.
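To put those numbers in perspective, here is a quick training-compute comparison using the standard FLOPs ≈ 6 × parameters × tokens approximation. The comparison point (a hypothetical 5x-larger model trained at a Chinchilla-style ~20 tokens per parameter) is my assumption, not something from the paper.

```python
# Rough training-compute comparison using the standard C ~= 6 * N * D approximation.
def train_flops(params, tokens):
    return 6 * params * tokens

phi_flops = train_flops(1.3e9, 150e9)                # phi-1.5: 1.3B params, 150B tokens
big_flops = train_flops(5 * 1.3e9, 20 * 5 * 1.3e9)   # hypothetical 6.5B model at ~20 tokens/param

print(f"phi-1.5 training compute: {phi_flops:.2e} FLOPs")  # ~1.2e21
print(f"Hypothetical 6.5B model:  {big_flops:.2e} FLOPs")  # ~5.1e21
print(f"Ratio: ~{big_flops / phi_flops:.1f}x")             # phi-1.5 uses ~4x less training compute
```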
The most interesting part of this entire piece is that Microsoft is arguing against ever-larger datasets and proposing that dataset quality is more important than sheer quantity.
It reminds me of my exploration of this topic more than two years ago, back in Jun/2021, with my paper Integrated AI: Dataset quality vs quantity via bonum (GPT-4 and beyond). Microsoft came to the same conclusion, finding that (I’m gonna bold the whole thing; it’s important!):
…the creation of a robust and comprehensive dataset demands more than raw computational power: It requires intricate iterations, strategic topic selection, and a deep understanding of knowledge gaps to ensure quality and diversity of the data. We speculate that the creation of synthetic datasets will become, in the near future, an important technical skill and a central topic of research in AI.
Read the paper: https://arxiv.org/abs/2309.05463
Watch the related video by Microsoft’s Dr Sébastien Bubeck.
See phi-1.5 on the Models Table.
Exclusive: Inflection ready to train a 1,000T parameter model (1/Sep/2023)