The Memo - Special edition: Claude 3 Opus
Anthropic releases Claude 3, outperforming all models including GPT-4
To: US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From: Dr Alan D. Thompson <LifeArchitect.ai>
Sent: 4/Mar/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI: 71%
The BIG Stuff
Anthropic releases Claude 3
Once again, we have this out to The Memo readers within just a few hours of the model's release.
Key points:
Alan’s estimate for Claude 3 Opus: 2T parameters trained on 40T tokens.
3 model sizes: Haiku (~20B), Sonnet (~70B), and Opus (~2T).
Trained with synthetic data, probably generated by Claude 2.1 or GPT-4.
New highest MMLU score (Claude 3=86.8 vs GPT-4=86.4).
Long context (working memory) = 200K standard, 1M for researchers.
Multimodal: Has vision, like GPT-4V and Gemini. Also has ‘tool use’ built-in.
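As a rough sanity check on the size estimate above, the implied ratio of training tokens to parameters can be worked out directly. This is a minimal sketch using only the estimated figures in this memo (2T parameters, 40T tokens); the function and constant names are illustrative, and the inputs are estimates rather than numbers confirmed by Anthropic.

```python
def tokens_per_parameter(tokens: float, parameters: float) -> float:
    """Return how many training tokens were seen per model parameter."""
    return tokens / parameters


# Estimates from this memo, not official figures.
OPUS_PARAMS = 2e12   # ~2T parameters (estimate)
OPUS_TOKENS = 40e12  # ~40T training tokens (estimate)

ratio = tokens_per_parameter(OPUS_TOKENS, OPUS_PARAMS)
print(f"{ratio:.0f} tokens per parameter")
```

If either estimate changes, the ratio changes in proportion, so it should be read as a consistency check on the two numbers and not as a measured property of the model.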
My initial testing shows Claude 3 Opus to be on par with GPT-4, and perhaps better in some metrics. This is the one to beat! (And we’ll beat it shortly.)
Read the Claude 3 announcement: https://www.anthropic.com/news/claude-3-family
Read the paper (no arch details): https://www.anthropic.com/claude-3-model-card
See also my Models table, and Timeline.
Playground: https://www.anthropic.com/claude
You can use it immediately within Poe (paid, login): https://poe.com/Claude-3-Opus
All my very best,
Alan
LifeArchitect.ai
In their Gemini release (Dec 6), Google indicated that Gemini Ultra had achieved 90.0% on the MMLU benchmark using CoT@32. Most others are reporting the 5-shot approach as above. Google's release suggests that GPT-4, using CoT@32, scored 87.29%. Do we know what Claude 3 Opus scores using the CoT approach? Is this significant, as this would put Gemini Ultra (and maybe Claude 3 Opus) above the "human expert" at 89.8% on MMLU per your scale at lifearchitect.ai/gpt-4-5?
Dr. Thompson! Alan! Have you seen this conversation with Claude 3? A really, really interesting deep dive into the phenomenology of being an AI! https://github.com/daveshap/Claude_Sentience/blob/main/conversation.md