The AI race heats up: Google announces PaLM 2, its answer to GPT-4

On Wednesday, Google introduced PaLM 2, a family of foundational language models comparable to OpenAI’s GPT-4. At its Google I/O event in Mountain View, California, Google revealed that it already uses PaLM 2 to power 25 products, including its Bard conversational AI assistant.

As a family of large language models (LLMs), PaLM 2 has been trained on an enormous volume of data and performs next-word prediction, meaning it outputs the most likely text to follow a prompt supplied by a human. PaLM stands for “Pathways Language Model,” and “Pathways” is a machine-learning technique created at Google. PaLM 2 follows up on the original PaLM, which Google announced in April 2022.
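
To make next-word prediction concrete, here is a minimal sketch of the idea using simple word-pair counts. It is a toy illustration only, not how PaLM 2 works internally: real LLMs use neural networks trained on vastly more data, and the function names and sample corpus below are hypothetical, chosen purely for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text; a real model would see far more data.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)

print(predict_next_word(model, "the"))  # "cat" follows "the" most often here
print(predict_next_word(model, "dog"))  # None, because "dog" never appears
```

The toy model can only repeat word pairs it has already seen, which hints at why the volume and variety of training data matter so much for the real thing.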

According to Google, PaLM 2 supports over 100 languages and can perform “reasoning,” code generation, and multi-lingual translation. During his 2023 Google I/O keynote, Google CEO Sundar Pichai said that PaLM 2 comes in four sizes: Gecko, Otter, Bison, and Unicorn. Gecko is the smallest and can reportedly run on a mobile device. Aside from Bard, PaLM 2 is behind AI features in Docs, Sheets, and Slides.

A Google-provided example of PaLM 2 "reasoning."

All that is well and good, but how does PaLM 2 stack up to GPT-4? In the PaLM 2 Technical Report, PaLM 2 appears to beat GPT-4 in some mathematical, translation, and reasoning tasks. But reality might not match Google’s benchmarks. In a cursory evaluation of the PaLM 2 version of Bard, Ethan Mollick, a Wharton professor who often writes about AI, found that PaLM 2’s performance appears worse than that of GPT-4 and Bing on various informal language tests, which he detailed in a Twitter thread.

Until recently, the PaLM family of language models has been an internal Google Research product with no consumer exposure, but Google began offering limited API access in March. Still, the first PaLM was notable for its massive size: 540 billion parameters. Parameters are numerical variables that serve as the learned “knowledge” of the model, enabling it to make predictions and generate text based on the input it receives.
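
For a rough sense of what a parameter count measures, the sketch below tallies the weights and biases in a tiny, hypothetical network of fully connected layers, then estimates how much storage 540 billion parameters would need. The layer sizes are invented for illustration, real LLMs use more elaborate Transformer layers, and the two-bytes-per-parameter figure is an assumption (16-bit numbers), not something Google has stated.

```python
def dense_layer_parameters(inputs, outputs):
    """A fully connected layer has one weight per input-output pair, plus one bias per output."""
    return inputs * outputs + outputs


def count_parameters(layer_sizes):
    """Sum the parameters of each consecutive pair of layers in a simple dense network."""
    return sum(
        dense_layer_parameters(a, b)
        for a, b in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical toy network: 4 inputs, 8 hidden units, 2 outputs.
toy_network = [4, 8, 2]
print(count_parameters(toy_network))  # (4*8 + 8) + (8*2 + 2) = 58

# Storage estimate for the original PaLM's 540 billion parameters,
# ASSUMING each parameter is stored as a 16-bit (2-byte) number.
palm_parameters = 540_000_000_000
bytes_per_parameter = 2  # assumption, not a published figure
storage_gb = palm_parameters * bytes_per_parameter / 1e9
print(storage_gb)  # 1080.0 gigabytes under that assumption
```

Under that assumption, the weights alone would take up roughly a terabyte, which helps explain why a model of that size cannot run on a phone.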

A Google-provided example of PaLM 2 translating languages.

More parameters roughly means more complexity, but there’s no guarantee they are used efficiently. By comparison, OpenAI’s GPT-3 (from 2020) has 175 billion parameters. OpenAI has never disclosed the number of parameters in GPT-4.

So that leads to the big question: Just how “large” is PaLM 2 in terms of parameter count? Google doesn’t say, which has frustrated some industry experts who often fight for more transparency in what makes AI models tick.

That’s not the only property of PaLM 2 that Google has been quiet about. The company says that PaLM 2 has been trained on “a diverse set of sources: web documents, books, code, mathematics, and conversational data,” but does not go into detail about what exactly that data is.

As with other large language model datasets, the PaLM 2 dataset likely includes a wide variety of copyrighted material used without permission and potentially harmful material scraped from the Internet. Training data decisively influences the output of any AI model, so some experts have been advocating the use of open data sets that can provide opportunities for scientific reproducibility and ethical scrutiny.

A Google-provided example of PaLM 2 writing program code.

“Now that LLMs are products (not just research), we are at a turning point: for-profit companies will become less and less transparent *specifically* about the components that are most important,” tweeted Jesse Dodge, a research scientist at the Allen Institute for AI. “Only if the open source community can organize together can we keep up!”

So far, criticism of Google for hiding its secret sauce hasn’t stopped the company from pursuing wide deployment of AI models, despite a tendency in all LLMs to just make things up out of thin air. During Google I/O, company reps demoed AI features in many of Google’s major products, which means a broad swath of the public could be battling AI confabulations soon.

And as far as LLMs go, PaLM 2 is far from the end of the story: In the I/O keynote, Pichai mentioned that a newer multimodal AI model called “Gemini” was currently in training. As the race for AI dominance continues, Google users in the US and 180 other countries (oddly excluding Canada and mainland Europe) can try PaLM 2 themselves as part of Google Bard, the experimental AI assistant.