The Memo - 30/Apr/2023

Stability/DeepFloyd IF, RedPajama dataset, 'Laptop model' comparisons (Alpaca, Chimera), and much more!

Apr 29, 2023

FOR IMMEDIATE RELEASE: 30/Apr/2023

Welcome back to The Memo.

Thanks for being a part of history!

In the Policy section we look at what the US Government is doing with LLMs for Congress, take an inside look at how the US is implementing several post-2022 LLMs for military use, review the EU’s latest updates to its Draft AI Act, and dive into an English copy of Japan’s latest AI white paper. (By the way, it is often next to impossible to find some of these resources. I should know; it’s my job to source them for you and put them here in The Memo!)

In the Toys to play with section, we look at some huge new dialogue models and interfaces, a free new iPhone app for AI video, a new hands-on tutorial by OpenAI, and much more…

Special: As promised, I’m also providing my recent 2-hour private keynote/workshop that was part of a well-produced multi-camera live and streamed paid event in Sydney (I believe tickets were $4,999). Links provided at the end of this edition for paid readers.
Part I: Large language models (hands-on with ChatGPT for business).
Part II: AI art (hands-on with Midjourney v5 including live audience examples).

The BIG Stuff

Exclusive: Users spent 13.9 billion minutes interacting with ChatGPT in March (Apr/2023)

1.6 billion visits × 8m44s per visit (≈8.73 minutes) ≈ 13.968 billion minutes
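
The estimate above can be reproduced in a few lines of Python. This is a minimal sketch of the arithmetic only: the variable names are illustrative, and the inputs are the Similarweb figures quoted above.

```python
# Inputs quoted above (Similarweb, Mar/2023); variable names are illustrative.
visits = 1.6e9                    # total visits in the month
avg_visit_minutes = 8 + 44 / 60   # 8m44s expressed in minutes (about 8.73)

# Total time is simply visits multiplied by the average visit duration.
total_minutes = visits * avg_visit_minutes
print(f"{total_minutes / 1e9:.2f} billion minutes")
```

Using the unrounded 8m44s gives roughly 13.97 billion minutes; rounding the duration to 8.73 first gives the 13.968 billion shown above.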

[corrected]

Data points: https://www.similarweb.com/website/chat.openai.com/#overview

AI via ChatGPT is now ‘better’ and 980% ‘more empathetic’ than a doctor (28/Apr/2023)

[This study used ChatGPT.] Chatbot responses were rated of significantly higher quality than physician responses…9.8 times higher prevalence of empathetic or very empathetic responses for the chatbot… If more patients’ questions are answered quickly, with empathy, and to a high standard, it might reduce unnecessary clinical visits, freeing up resources for those who need them… High-quality responses might also improve patient outcomes… responsive messaging may collaterally affect health behaviors, including medication adherence, compliance (eg, diet), and fewer missed appointments.

Download paper ‘Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum’ (PDF).

Read my table of other ChatGPT achievements.

Google Brain merges with DeepMind (20/Apr/2023)

As referenced several times in the last few editions of The Memo, DeepMind and Google have formally announced the combining of the two organizations.

The research advances from the phenomenal Brain and DeepMind teams laid much of the foundations of the current AI industry, from Deep Reinforcement Learning to Transformers, and the work we are going to be doing now as part of this new combined unit will create the next wave of world-changing breakthroughs.

Read the announcement via Demis/DeepMind.

Bloomberg: LLMs increase productivity by 14% (24/Apr/2023)

Customer service workers at a Fortune 500 software firm who were given access to generative artificial intelligence tools became 14% more productive on average than those who were not, with the least-skilled workers reaping the most benefit.
That’s according to a new study by researchers at Stanford University and the Massachusetts Institute of Technology who tested the impact of generative AI tools on productivity at the company over the course of a year.
The research marks the first time the impact of generative AI tools on work has been measured outside the lab.

The Interesting Stuff

Datasets: training on code leads to reasoning ability (2022-2023)

Datasets are the big buckets of words used by AI labs to train models. Generally, they consist of web pages, books, articles, and Wikipedia. (Watch my 2-min video on this, or a longer version ‘for humans’.)

Researchers at Allen AI (with review by Google Brain) have noticed that including code in training datasets may be the cause of models learning to reason, especially via chain-of-thought (CoT).

We have concluded:
The ability to perform complex reasoning is likely to be from training on code.

If we consider the logic required to work through a programming language, it’s easy to see how beneficial this might be when applied to both human thought and daily living. For example, it may be that having a model step through a dataset with even a simple program in Pascal or BASIC or C (with its various functions and references) imitates some parts of the routines in our daily lives.

The paper is at a very early draft (notes/outline) stage.

Read the draft outline here.

Datasets: Bigger and bigger (Apr/2023)

In a recent video I made an off-the-cuff remark that I’ve been ‘obsessed’ with datasets for a long time… it’s true!

My Mar/2022 paper What’s in my AI? A Comprehensive Analysis of Datasets Used to Train GPT-1, GPT-2, GPT-3, GPT-NeoX-20B, Megatron-11B, MT-NLG, and Gopher was well-received in academic, corporate, government, and intergovernmental circles.

Since that paper’s release, there have been a few more datasets to analyze.

Feb/2023: Meta AI’s LLaMA dataset with its 4TB of Common Crawl.
Mar/2023: OpenAI’s GPT-4 dataset designed by a team of 35 staff. No other information is available, but I’ve put together my best estimates on this dataset.
Apr/2023: Stability AI’s version of The Pile dataset, announced with their StableLM models, but no info has yet been released.
Apr/2023: Together AI’s RedPajama dataset.

This most recent one is interesting, and clones the LLaMA dataset almost exactly:

RedPajama dataset analysis by Alan. https://lifearchitect.ai/models/#redpajama

RedPajama is more than double the size of the GPT-3 dataset (celebrating its 3rd anniversary in May/2023), but less than 10% of the size of GPT-4’s dataset using my estimates.
There is a seeming ‘duplication’ of web crawl data, using both a standard Common Crawl and Google’s filtered version of the Common Crawl, C4. In effect, this means they have 20% of ‘clean’ Common Crawl (C4), and are then adding another 80% with ‘work to be done’ in the unfiltered Common Crawl. Work includes removing boilerplate and repeated text like footers, stripping out HTML, and more.
This open-source dataset contains 200GB of GitHub code (40% less than LLaMA), plus another 67GB of StackExchange code discussion. As highlighted above, allowing models to ‘see’ code during training may support complex reasoning abilities.
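
To make the ‘work to be done’ on unfiltered Common Crawl more concrete, here is a minimal sketch of two of the cleaning steps mentioned above: stripping HTML, and dropping lines (like footers) that repeat across many pages. It is an illustration only, not the RedPajama or C4 pipeline; the function names, the `max_share` threshold, and the sample pages are all assumptions made for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores the tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(page: str) -> str:
    """Return the text content of an HTML page (tags removed)."""
    parser = _TextExtractor()
    parser.feed(page)
    parser.close()
    return "".join(parser.parts)


def clean_pages(pages, max_share=0.5):
    """Strip HTML, then drop any line found on more than max_share of pages.

    max_share is a hypothetical threshold chosen for this sketch.
    """
    texts = [strip_html(p) for p in pages]
    # Count each distinct line once per page so long pages don't dominate.
    line_sets = [{ln.strip() for ln in t.splitlines() if ln.strip()} for t in texts]
    counts = Counter(ln for lines in line_sets for ln in lines)
    limit = max_share * len(pages)
    cleaned = []
    for t in texts:
        kept = [ln.strip() for ln in t.splitlines()
                if ln.strip() and counts[ln.strip()] <= limit]
        cleaned.append("\n".join(kept))
    return cleaned


# Made-up sample pages sharing one footer line.
pages = [
    "<p>Cats purr.</p>\n<footer>Copyright Example Inc.</footer>",
    "<p>Dogs bark.</p>\n<footer>Copyright Example Inc.</footer>",
    "<p>Birds sing.</p>\n<footer>Copyright Example Inc.</footer>",
]
print(clean_pages(pages))
```

The shared footer appears on all three sample pages, so it is dropped, while the unique sentence on each page is kept. A real pipeline would also need to handle script and style content, near-duplicates, and language filtering.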

The dataset fits in nicely with the other recent releases, though it’s interesting to note just how much bigger the GPT-4 dataset is!

See my new table of big datasets.

If you’ve wondered what exactly ChatGPT (and other models) know about you, you can search for your own name or other interesting data in this searchable index here:
https://c4-search.apps.allenai.org/

Embodiment: Boston Dynamics Spot + ChatGPT (25/Apr/2023)

‘We integrated ChatGPT with our [Boston Dynamics Spot] robots.’

Watch: https://twitter.com/svpino/status/1650832349008125952

JPMorgan’s ChatGPT fine-tune rates 25 Years of Fed speeches (27/Apr/2023)

…Fed statements and central-banker speeches going back 25 years, the firm’s economists including Joseph Lupton employed a ChatGPT-based language model to detect the tenor of policy signals, effectively rating them on a scale from easy to restrictive in what JPMorgan is calling the Hawk-Dove Score.

Read more via Bloomberg: https://archive.is/npdCR

Amazon: working on a new LLM for Alexa (29/Apr/2023)

Amazon is building a more “generalized and capable” large language model (LLM) to power Alexa, said Amazon CEO Andy Jassy during the company’s first-quarter earnings call yesterday.

Policy

US Congress begins using ChatGPT with 40x licenses (26/Apr/2023)

A few months ago in The Memo 27/Jan/2023 edition, we talked about ChatGPT being used to craft two speeches and two bills for the US government. Now, Congress has implemented ChatGPT for their new AI working group.

Private email to US Congress AI working group

The House recently created a new AI working group for staff to test and share new AI tools in the congressional office environment and now the House of Representatives’ digital service has obtained 40 licenses for ChatGPT Plus, which were distributed earlier this month…
“Everything from making it easier to come [up] with ideas, to summarizing information, to draft letters or documents and handle some aspects of constituent engagement. Ultimately it will allow Congressional staff to scale up more quickly regarding the demands placed on them,” said Schuman, who has played a key role in drafting and enacting tech and accountability related legislation in Congress including the DATA Act, FOIA modernization, and dozens of House rules changes.

Palantir’s new LLM platform using various open-source models (26/Apr/2023)

Palantir's solutions are deployed across nearly every Army mission area, ensuring data is accessible across all echelons for fast, agile decision-making that allows the warfighter to out-think and out-pace the adversary. (-via Palantir)

When I hear others say that the US government must have LLM technology beyond what is currently out there from Google and OpenAI, I shake my head. Until recently, they really didn’t have much; they had been caught out. Peter Thiel’s company, Palantir, has integrated a bunch of open-source models into a platform called ‘AIP’ for defence. I wonder if that fits into the original model licenses. It almost definitely does not fit into the spirit of the original model licenses. The models shown in the military platform are:

EleutherAI GPT-J 6B.
Google FLAN-T5 XL 3B.
EleutherAI GPT-NeoX-20B.
Databricks Dolly 2.0 12B.

Expecting this video to be pulled, I have hosted a backup of Palantir’s video for The Memo readers here:

EU Draft AI Act will require dataset detail (28/Apr/2023)

In The Memo edition 15/Sep/2022, I described the EU’s overreach into AI policy as ‘a true abomination’, adding:

The EU is wasting time trying to regulate a revolution that is in its fledgling stages, hamstringing humanity in the process. Their first order of business is blocking open-source models, and limiting visibility of AI models outside of big corporations. I have nothing good to say, and nothing more to add.

(Sidenote: LAION this week 28/Apr/2023 published a letter to the EU addressing this and other concerns with the Draft AI Act.)

Well, it turns out that the EU has more to add—and good stuff!—as this week they asked for AI labs to detail their datasets. Changes were analyzed by the WSJ:

Under the new provisions being added to the EU’s AI bill, developers of generative AI models will have to publish a “sufficiently detailed summary” of the copyright materials they used as part of their creation, the draft says.

Toys to Play With

Microsoft Designer now open (29/Apr/2023)

Microsoft has removed the waitlist for Designer. You’ll still need a Microsoft or Skype login, but use is free. My tests showed this is slightly different to DALL-E 2 (the model under the hood), as Microsoft are trying to steer more towards publishing.

Try it (free, login): https://designer.microsoft.com/

Hugging Face releases HuggingChat 30B (Apr/2023)

Based on OpenAssistant, a fine-tuned version of the mid-sized LLaMA 30B (not the 65B version).

In my testing, it’s decidedly poor. Sorry, guys!

Try it (free, no login): https://huggingface.co/chat/

Stability AI releases StableVicuna 13B (28/Apr/2023)

StableVicuna is the first large-scale open-source chatbot trained via reinforcement learning from human feedback (RLHF). StableVicuna is a further instruction fine-tuned and RLHF-trained version of Vicuna 1.0 13B, which is an instruction fine-tuned LLaMA 13B model!

Try it (free, no login): https://huggingface.co/spaces/CarperAI/StableVicuna

RunwayML releases iPhone app (25/Apr/2023)

Rather than trying to describe this model and app, watch the incredible trailer:

Download now: https://apps.apple.com/app/apple-store/id1665024375

Unrecord: Unreal Engine 5 gameplay (Apr/2023)

‘Unreal Engine 5 is just insane. This isn’t real. This is a video game being made by an indie dev.’

Download game soon: https://store.steampowered.com/app/2381520/Unrecord/

Tiny language model (Apr/2023)

Tiny Language Model (TLM) is a functional language model based on a small neural network that runs in your browser. It has the capability to learn and generate responses based on a six word customizable vocabulary. While very limited, it can offer insights into vastly more complex language models like ChatGPT.
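
To get a feel for what a six-word language model can do, here is a minimal sketch in Python. Note the swap: TLM uses a small neural network, whereas this example uses a much simpler count-based bigram model (each word predicts the word that most often followed it in training). The vocabulary, the training sentences, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical six-word vocabulary (the full stop counts as a word).
VOCAB = {"the", "a", "cat", "dog", "sees", "."}

# Made-up training sentences using only the vocabulary above.
CORPUS = [
    "the cat sees a dog .",
    "the cat sees a dog .",
    "a dog sees the cat .",
]


def train(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        unknown = [w for w in words if w not in VOCAB]
        if unknown:
            raise ValueError(f"words outside the vocabulary: {unknown}")
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_words=6):
    """Greedily extend `start` with the most frequent next word."""
    out = [start]
    # Stop at max_words, or when the last word was never followed by anything.
    while len(out) < max_words and counts[out[-1]]:
        next_word = counts[out[-1]].most_common(1)[0][0]
        out.append(next_word)
    return " ".join(out)


model = train(CORPUS)
print(generate(model, "the"))
```

With these three training sentences, starting from "the" produces "the cat sees a dog ." because each step picks the most common follower seen in training. Larger models replace the counting table with a neural network, but the core task of predicting the next word is the same.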

The Memo by LifeArchitect.ai