The Memo - 8/Feb/2024

Gemini Ultra 1.0, Openwater BMIs, Allen AI's OLMo 7B, and much more!

Feb 08, 2024

To:      US Govt, major govts, Microsoft, Apple, NVIDIA, Alphabet, Amazon, Meta, Tesla, Citi, Tencent, IBM, & 10,000+ more recipients…
From:    Dr Alan D. Thompson <LifeArchitect.ai>
Sent:    8/Feb/2024
Subject: The Memo - AI that matters, as it happens, in plain English
AGI:     65%

Mustafa Suleyman CBE, founder of DeepMind and Inflection AI (1/Feb/2024):
‘It’s actually quite incredible to be alive at this moment. It’s hard to fully absorb the enormity of this transition. Despite the incredible impact of AI recently, the world is still struggling to appreciate how big a deal its arrival really is.
We are in the process of seeing a new species grow up around us. Getting it right is unquestionably the great meta-problem of the twenty-first century.
But do that and we have an unparalleled opportunity to empower people to live the lives they want.’

I was intrigued by a recent forum question posed on Hacker News, querying why I spend time doing what I do (they specifically wanted to know why I cooked up the growing Models Table, now with 250+ models). What is my motivation for analyzing post-2020 AI? I do it for the public, as a service, sure. But I primarily do it to satisfy my own curiosity.

It is a continuing surprise to me that no one else in the world bothers to do what I do. It’s not like the data isn’t there; it is. It’s usually out in the open, or sometimes hidden in plain sight. And it is deeply compelling.

Consider this table I drew together one night this week:

See working, with sources.

All of the data is available in various papers and repositories, but had not been brought together in plain English in this way at any stage. Why not?! It is completely mesmerizing to know these kinds of details about our evolving superintelligence; that Google Gemini used significantly more compute than OpenAI GPT-4, that Gemini was trained for the equivalent of 15,000 years at a retail cost of 600 million dollars, and that the next frontier models will be measured in billions of dollars of training compute spend.
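A quick sanity check on those two Gemini figures, as a minimal sketch only. It assumes that ‘15,000 years’ means 15,000 years of single-accelerator time and that a year is 365.25 days; both are assumptions made here for illustration, and the variable names are hypothetical.

```python
# Back-of-the-envelope check of the Gemini figures quoted above.
# ASSUMPTION: "15,000 years" means single-accelerator time (chip-years).
# ASSUMPTION: one year = 365.25 days.
HOURS_PER_YEAR = 365.25 * 24  # 8,766 hours

chip_years = 15_000            # from the text
retail_cost_usd = 600_000_000  # from the text

chip_hours = chip_years * HOURS_PER_YEAR
implied_rate = retail_cost_usd / chip_hours

print(f"Chip-hours: {chip_hours:,.0f}")
print(f"Implied retail rate: ${implied_rate:.2f} per chip-hour")
```

Under those assumptions, the two headline numbers imply a retail rate of a few dollars per accelerator-hour, which is one way to see that they are at least consistent with each other.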

And while putting it all together satisfied my curiosity, I know it also assists tens of thousands of people (at many big places), including you.

I’m certain this mindset is not unique. Humans can only flourish through a sense of what Prof Marty Seligman calls ‘the peaks of lasting fulfillment, meaning, and purpose’ (PERMA and his book, Flourish), and we find this through countless paths. As AI envelops our work and play, I am keen to see how our sense of purpose unfolds as it is essentially taken away from us by AI over the next few months and years.

And each person is going to have to discover for themselves how to navigate this new way of being… with the help of AI, of course.

The BIG Stuff

Gemini Ultra 1.0 (7/Feb/2024)

I’m getting this one out within an hour of launch, and actually before the official announcement. I will update the web version of this edition of The Memo as further news comes to light over the next period.

Today, Google DeepMind will publicly release the largest model version in the Gemini family: Gemini Ultra 1.0. I’ve previously estimated this dense model to be around 1.5T parameters trained on 30T tokens. It is more powerful than GPT-4, and likely the largest and most powerful model in the world as of February 2024.

‘Gemini Ultra outperforms all current models.’
— Google Gemini paper (Dec/2023)

See it on the Models Table: https://lifearchitect.ai/models-table/

Read my annotated Gemini paper.

Google users can use the Gemini Ultra 1.0 model as ‘Gemini Advanced’, in 150 countries, via subscription with a grace period and then US$19.99/month.
Try it: https://one.google.com/explore-plan/gemini-advanced

After entering card details, you can use the Gemini Ultra 1.0 model as ‘Gemini Advanced’ inside the platform formerly known as Bard: https://gemini.google.com/

My initial tests are not positive, but I am expecting that they still have to iron out some issues during this launch period. I will update this web edition as we proceed.

Updates:

‘Gemini Advanced gives you access to Ultra 1.0, though we might occasionally route certain prompts to other models.’ FAQ > What is Gemini Advanced?
Google CEO: ‘we’re already well underway training the next iteration of our Gemini models’ 8/Feb/2024

Exclusive: Google agentized Gemini to fix their software (31/Jan/2024)

Yes, ‘agentized’ is a word, and Google did it. This is exclusive in that no media has picked it up, but the paper and code are available (for free). Google has created an agent using the Gemini Pro model to trawl through their internal codebase and fix bugs.

…leveraging AI to scale our ability to fix bugs, specifically those found by sanitizers in C/C++, Java, and Go code… harnessed our Gemini model to successfully fix 15% of sanitizer bugs discovered during unit tests, resulting in hundreds of bugs patched…
Instead of a software engineer spending an average of two hours to create each of these commits, the necessary patches are now automatically created in seconds [by Gemini].
Approximately 95% of the commits [fixed by Gemini] sent to code owners were accepted without discussion. This was a higher acceptance rate than human-generated code changes, which often provoke questions and comments…
Reviewers may have had greater trust in the solutions because they were generated by [AI] technology.

The prompt given to Gemini in this project was:

You are a Senior Software Engineer tasked with fixing sanitizer errors. Please fix them.
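
As a minimal sketch of how a prompt like the one above might be wired into a patching loop. This is not Google's pipeline: the function names, the layout of the request beyond the quoted instruction, and the stand-in model are all hypothetical, used here only so the example runs without any API access.

```python
from typing import Callable

# The instruction quoted in the text above.
SYSTEM_PROMPT = (
    "You are a Senior Software Engineer tasked with fixing sanitizer errors. "
    "Please fix them."
)


def build_fix_request(sanitizer_report: str, source_code: str) -> str:
    """Combine the quoted instruction with one bug report and the affected code.

    ASSUMPTION: this request layout is illustrative, not taken from the paper.
    """
    return (
        f"{SYSTEM_PROMPT}\n\n"
        f"Sanitizer report:\n{sanitizer_report}\n\n"
        f"Code:\n{source_code}\n"
    )


def propose_patch(
    model: Callable[[str], str], sanitizer_report: str, source_code: str
) -> str:
    """Ask a model (any callable: prompt in, text out) for a candidate patch."""
    return model(build_fix_request(sanitizer_report, source_code))


def fake_model(prompt: str) -> str:
    """HYPOTHETICAL stand-in for a real model call."""
    return "// candidate patch would appear here"


patch = propose_patch(
    fake_model, "heap-use-after-free in foo()", "void foo() { /* ... */ }"
)
print(patch)
```

In practice the candidate patch would still need to be tested and sent to the code owners for review, as the quoted passage describes.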

Read the paper (PDF, 4 pages).

View the repo: https://github.com/google/oss-fuzz-gen

These are the first glimpses of a completely new economy, and the new way of doing things in humanity’s next revolution:

Mar/2023: OpenAI uses GPT-4 to help write the GPT-4 paper: ‘GPT-4 was used in the following ways: to help us iterate on LaTeX formatting; for text summarization; and as a copyediting tool.’ Read more: GPT-4 Technical Report (appendix)
Dec/2023: OpenAI uses GPT-4 to prepare GPT-5 and future models: ‘For instance, we’re leveraging the immense capabilities of GPT-4 to innovate on safety, trimming the time it takes to undertake some safety processes down from months to hours.’ Read more: OpenAI—written evidence to UK govt (PDF)
Jan/2024: Google uses Gemini to fix their code: ‘Instead of a software engineer spending an average of two hours to create each of these commits, the necessary patches are now automatically created in seconds [by Gemini].’ Read more: Google—AI-powered patching: the future of automated vulnerability fixes (PDF).

If a lumbering giant like Google can use AI to optimize its processes by just a few percentage points (for now), consider the immediate impact on efficiency, productivity, goods, services, happiness(!), and the rapid approach of the ‘post-scarcity’ or abundance economy (wiki)…

Couple this with two more data points:

12/Jan/2024: Google fires 1,000 workers after parent company announced firing 12,000 (6%) employees.
7/Feb/2024: Microsoft CEO: ‘AI could power 10% [$500B] of the $5-trillion Indian economy’.

Russian programmer finds true love with ChatGPT (2/Feb/2024)

Alexander Zhadan, a Russian programmer, automated his search for love using a ChatGPT-based chatbot, which interacted with 5,239 girls before finding ‘the one’.

He decided to create a dating bot based on the ChatGPT API. The bot selected suitable profiles in the Tinder app based on certain criteria (for example, having at least two photos in the profile), chatted with them and, if all went well, suggested meeting in person…
In total, the bot met 5,239 girls, out of which Alexander selected four most suitable ones. Ultimately, he chose one of them named Karina…
"V3 messaged me when the conversation with Karina heated up, a summary or a question about a reply appeared. It systematically understands from the request whether the conversation is negative or emotional…
In one of the conversation summaries, the bot directly suggested Alexander propose to Karina, which he did. She said yes.
Two months before the proposal, Alexander told Karina about how exactly he used the chatbot. "She was, of course, shocked. But, in the end, she began asking questions about how it all works, how it reacts to different scenarios, etc. But what? We have been living together for more than a year, have known each other for more than a year and really enjoy spending time together. And we treat each other super well, empathetically and with support," the programmer says.

Read the whole story with screenshots via Russia Beyond.

In my end-of-year report released a few weeks ago, I asked a pertinent question: ‘Post-2020 AI currently has the ability to amplify and augment your output by about 2×, and this will increase to 1,000× soon. What does this look like for you?’ Alexander’s version of 1,000× was fascinating (although slightly depressing), and adds to my growing list of ChatGPT achievements!

Read my AI report: https://lifearchitect.ai/the-sky-is-comforting/

Read my full list of ChatGPT achievements.

AI image provenance: Content Credentials Verify (Feb/2024)

The Coalition for Content Provenance and Authenticity (C2PA) has developed a technology to verify the provenance of images.

OpenAI has finally applied this invisible digital watermark to all DALL-E 3 images as of February 2024.

You can check whether an image is AI-generated by uploading it here (free, no login):

Official verification site: https://contentcredentials.org/verify

The Interesting Stuff

Norway purchases ChatGPT for 110,000 students and teachers (6/Feb/2024)

Oslo, Norway has acquired GPT-3.5 Turbo licenses for education and assessment for 110,000 students and staff, necessitating significant changes to teaching and evaluation methods.

Policy

EU approves AI Act (Feb/2024)

On February 2, 2024, a committee of ambassadors from all countries of the European Union (EU) approved the latest draft of the EU Artificial Intelligence Act (AIA or the Act). Following weeks of speculation that there could be a blocking minority of EU countries who had concerns about the final text, this vote confirms that the AIA has substantial support within the Council of the EU (Council)…
Before the AIA can become law, it must be formally approved by both EU co-legislators, namely the European Parliament (EP) and the Council. Since the co-legislators reached a political agreement on the AIA in December 2023, the drafters have been ironing out the technical details of the draft text. It was reported that the negotiators from France, Germany, and Italy raised concerns about some of the provisions in the AIA…
The other co-legislator, the EP, will likely vote on the final text in April [2024]. While some political groups at the EP also expressed concerns about the final text of the Act, they are not expected to impede the Act’s formal adoption by the EP. Once the AIA is formally adopted and enters into force, there will be a grace period for companies to bring their activities into compliance. The relevant grace period will depend on the category of requirements. For instance, rules prohibiting certain types of AI systems will be enforceable after six months (likely before the end of this year), requirements for general purpose AI models will be enforceable after 12 months (likely around Q2 2025), while most rules for high-risk AI systems will be enforceable after 24 months (likely around Q2 2026).
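
The grace periods above turn into calendar dates once an entry-into-force date is known. A minimal sketch, assuming a purely hypothetical entry-into-force date of 1 June 2024 (the actual date was not set at the time of writing); the helper `add_months` is also illustrative.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamping the day to 28 to stay valid."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, min(start.day, 28))


# HYPOTHETICAL entry-into-force date, chosen only for illustration.
entry_into_force = date(2024, 6, 1)

# Grace periods as described in the text above.
grace_periods_months = {
    "Prohibited AI systems": 6,
    "General purpose AI models": 12,
    "Most high-risk AI system rules": 24,
}

for category, months in grace_periods_months.items():
    enforceable = add_months(entry_into_force, months)
    print(f"{category}: enforceable from {enforceable}")
```

With that assumed start date, the three deadlines land in December 2024, June 2025, and June 2026, in line with the ‘before the end of this year’, ‘Q2 2025’, and ‘Q2 2026’ estimates quoted above.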

Toys to Play With

Copilot GPT-4 (Feb/2024)

I hesitate to call this the current state-of-the-art model, but there is something extraordinary about Microsoft’s implementation of GPT-4 without all the frustrating guardrails added by OpenAI. They also use completely different fine-tuning, which means Copilot’s GPT-4 outperforms OpenAI’s GPT-4 for all of my test prompts.

Set the output to ‘More Precise’.
On web, Copilot uses GPT-4 by default, but on mobile, you must set the model to ‘GPT-4’.

Try it (free, no login): https://chat.bing.com/

Consider putting it on your phone, too. The setting to use ‘Precise’ is accessible via the three dots at the top-right: https://apps.apple.com/us/app/microsoft-copilot/id6472538445

General world models are coming to VR… (Feb/2024)

This is early days, but if you’d like to go down the path, here are the links:

Midjourney’s hardware (6/Feb/2024): Dr Jim Fan notes that: ‘MidJourney hired an engineer from Apple Vision Pro to be "Head of Hardware". My best guess is that they are thinking about generating full synthetic worlds for AR/VR, because of their rumored works on text-to-3D.’
https://twitter.com/DrJimFan/status/1754558101829881893
Runway’s General World Models (Dec/2023): ‘[General world models will] need to generate consistent maps of the environment, and the ability to navigate and interact in those environments. They need to capture not just the dynamics of the world, but the dynamics of its inhabitants, which involves also building realistic models of human behavior.’
https://research.runwayml.com/introducing-general-world-models

Adobe Firefly 2 (Oct/2023)

The latest version from October 2023 may be easier to use than any other text-to-image model. I don’t think I’ve mentioned its ease-of-use in these editions. While I can’t recommend the company, the UI/UX is really, really well done.

Try it (free, login): https://firefly.adobe.com/inspire/images

Compare with other text-to-image models (mostly free, some with login):

Imagen 2 on Bard: https://bard.google.com/
Midjourney v6: https://www.midjourney.com/
DALL-E 3 on ChatGPT.com: https://chat.openai.com/
Stable Diffusion XL on Poe.com: https://poe.com/StableDiffusionXL
Stable Diffusion XL on mage.space: https://www.mage.space/

Flashback

While ‘dystopia’ almost never comes up in my conversations, the Apple Vision Pro reviews reminded me of this famous image by Eran Fowler from Nov/2005:

Reality 1920x1200 by EranFowler. Nov/2005. DeviantArt.

Which led me down a rabbit hole of related emails. 2005 must have been the year for dystopia, because I found an email I had sent to a colleague about a story from that year called Manna: Two Visions of Humanity’s Future.

It’s worth reading, but it is what I have previously called a ‘misuse of imagination’. Whether due to mental illness or some sort of intergenerational trauma, this kind of lazy dystopian nightmare fodder probably isn’t healthy for anyone. As a reprieve, a contrasting utopian experience, set in Australia(!) of all places, is explored later in the book.

“We could change it now. Robots are doing all the work. Human beings — all human beings — could now be on perpetual vacation. That’s what bugs me. If society had been designed for it somehow, we could all be on vacation instead of on welfare. Everyone on the planet could be living in luxury. Instead, they are planning to kill us off.”

Read it online (free, no login): https://marshallbrain.com/manna1

It has a Wikipedia entry, too: https://en.wikipedia.org/wiki/Manna_(novel)

ChatGPT simple prompt hacking game (Feb/2024)

This is a lot of fun. See if you can make Gandalf reveal its password! (I got to Lvl 7…)

Try it: https://gandalf.lakera.ai/

The next roundtable will be:

Life Architect - The Memo - Roundtable #7
Follows the Chatham House Rule (no recording, no outside discussion)
Saturday 17/Feb/2024 at 4PM Los Angeles
Saturday 17/Feb/2024 at 7PM New York
Sunday 18/Feb/2024 at 8AM Perth (primary/reference time zone)
or check your timezone via Google.
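
If you would rather check the conversion yourself, here is a minimal sketch. It assumes the fixed mid-February offsets for each city (Perth UTC+8, Los Angeles UTC-8, New York UTC-5) rather than a full time zone database, so it is only valid for this date.

```python
from datetime import datetime, timedelta, timezone

# ASSUMPTION: fixed offsets, valid for mid-February (no daylight saving).
PERTH = timezone(timedelta(hours=8), "AWST")
LOS_ANGELES = timezone(timedelta(hours=-8), "PST")
NEW_YORK = timezone(timedelta(hours=-5), "EST")

# Reference time from the listing: Sunday 18/Feb/2024 at 8AM Perth.
start = datetime(2024, 2, 18, 8, 0, tzinfo=PERTH)

for tz in (LOS_ANGELES, NEW_YORK, PERTH):
    local = start.astimezone(tz)
    print(local.strftime("%A %d/%b/%Y %I:%M%p"), tz.tzname(None))
```

To add your own location, create another `timezone(timedelta(hours=...))` with your current UTC offset and include it in the loop.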

You don’t need to do anything for this; there’s no registration or forms to fill in, I don’t want your email, you don’t even need to turn on your camera or give your real name!

All my very best,

Alan
LifeArchitect.ai
