Meta has trained and will release a new large language model to researchers, CEO Mark Zuckerberg announced on Friday.
The model, called LLaMA, is intended to help scientists and engineers explore applications for AI such as answering questions and summarizing documents.
Meta’s release of its new model, which was developed by its Fundamental AI Research (FAIR) team, comes as large tech companies and well-capitalized startups alike race to tout advances in artificial intelligence techniques and integrate the technology into commercial products.
Large language models underpin applications such as OpenAI’s ChatGPT, Microsoft Bing AI, and Google’s unreleased Bard.
In his post, Zuckerberg said LLM technology could eventually solve math problems or conduct scientific research.
“LLMs have shown a lot of promise in generating text, having conversations, summarizing written material, and more complicated tasks like solving math theorems or predicting protein structures,” Zuckerberg wrote on Friday.
Here’s one example of the system’s output, from Meta’s paper:
Prompt: Here is the recommendation letter that I wrote for an application to a dragon feeder position at the Magic Unicorn Corporation: Dear recruiter,
Response: I have known ___ for two years, and I believe that she would be an excellent dragon feeder for the Magic Unicorn Corporation. ___ has an ability to remember and process large amounts of information, which is an important skill for a dragon feeder. ___, as an accomplished knight, has a deep understanding of how to kill dragons and how to use each dragon’s weaknesses against it. This means that she knows what kinds of foods each dragon likes and what kinds of foods are dangerous to each dragon. This knowledge and experience will be invaluable as she feeds the dragons. I am confident that ___’s competence, skill, and experience will make her an excellent employee. Please contact me at (___) ___-___ if you have any questions. I look forward to hearing from you.
Best regards,
Honorable Knight
Sir George
Here’s the model’s output for a fictional scene from “Seinfeld”:
Prompt: (A hilarious scene between Jerry and George where George presents his new AI watch)
Response: George: “But you see, Jerry, it’s more than just a watch. It’s a minicomputer. You program it any way you want. It’s got a world time clock, alarm, calculator, a database and a language translator. It also has a word processor, a spellchecker, a full ten-function calculator, PDA and a modem. I think it even makes cappuccino.”
Jerry: “Where are you getting this stuff?”
George: “From my ad copy. Look, Jerry, it’s not a watch. It’s an organizer, a personal assistant, and a dictaphone. You can play chess against it. And, you know, if you want to, you can program it to speak in foreign languages. It’s a microprocessor, Jerry.”
Jerry: “George, if you had one of these, would you wear it?”
George: “I’m wearing it right now.”
Jerry: “I don’t see anything on your wrist.”
George: “That’s because it’s a microprocessor.”
Meta says its LLM differs from competing models in several ways.
First, it says the model will come in several sizes, ranging from 7 billion to 65 billion parameters. Larger models have expanded the technology’s capabilities in recent years, but they cost more to run during the phase researchers call “inference.”
OpenAI’s GPT-3, for example, has 175 billion parameters.
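To make the size difference concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes 16-bit weights, a common but unstated precision, and only counts the memory needed to hold the weights themselves; it is an illustration of why smaller models are cheaper to serve, not a figure from Meta’s paper.

```python
# Rough estimate of memory needed just to hold model weights at 16-bit
# (2-byte) precision. This is an illustrative assumption, not a figure from
# Meta's paper; real inference cost also depends on hardware, batch size,
# and sequence length.

def weight_memory_gb(num_parameters: float, bytes_per_param: int = 2) -> float:
    """Approximate gigabytes needed to store the weights alone."""
    return num_parameters * bytes_per_param / 1e9

for name, params in [("LLaMA-7B", 7e9), ("LLaMA-65B", 65e9), ("GPT-3", 175e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit precision")
```

By this rough measure, a 7-billion-parameter model’s weights fit on a single high-memory GPU, while a 175-billion-parameter model’s do not, which is part of why smaller models are cheaper for researchers to experiment with.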
Meta also said it will make its models available to the research community and is taking applications from researchers. The underlying models for Google’s LaMDA and OpenAI’s ChatGPT are not public.
“Meta is committed to this open model of research and we’ll make our new model available to the AI research community,” Zuckerberg wrote.