Google | AGI Progress Tracker

Major

Gemini 3.1 Flash-Lite Released

2026-03-03

Google released Gemini 3.1 Flash-Lite as its fastest and most cost-efficient Gemini 3 series model. The preview focused on high-volume developer workloads where low latency, lower cost, and good quality all matter at scale.

Fastest Gemini 3 series model
Cost-efficient high-volume workload focus
Preview availability in Gemini API and Vertex AI
Built for responsive real-time experiences

google model-release gemini llm productivity

Sources

Gemini 3.1 Flash-Lite: Built for intelligence at scale

Major

Gemini 3.1 Pro Released

2026-02-19

Google released Gemini 3.1 Pro as an upgraded core intelligence model for consumer and developer products. The release emphasized improved reasoning on complex tasks and broader availability across Gemini, Gemini API, Vertex AI, Gemini CLI, and Google Antigravity.

Upgraded core intelligence for Gemini 3
Better performance on complex reasoning tasks
Available across consumer and developer surfaces
Preview access for developers and enterprises

google model-release gemini llm reasoning

Sources

Gemini 3.1 Pro: A smarter model for your most complex tasks

Major

Gemini 3 Released

2025-11-18

Google released Gemini 3, its most intelligent Gemini model to date. The launch highlighted stronger reasoning, improved multimodal understanding, and wider access across the Gemini app, AI Studio, and Vertex AI.

Gemini 3 Pro for general model access
Improved reasoning and coding performance
Stronger multimodal capabilities
Available across Gemini app, AI Studio, and Vertex AI
Deep Think mode announced for advanced use

google model-release gemini llm multimodal

Sources

Gemini 3

Major

Gemini 2.5 Flash and Pro Generally Available

2025-06-17

Google made Gemini 2.5 Flash and Gemini 2.5 Pro generally available and introduced Gemini 2.5 Flash-Lite in preview. The release framed Gemini 2.5 as a hybrid reasoning family balancing performance, speed, and cost for production use.

Gemini 2.5 Flash and Pro reached GA
Gemini 2.5 Flash-Lite entered preview
Hybrid reasoning family for production use
Balance of quality, speed, and cost

google model-release gemini llm reasoning

Sources

We’re expanding our Gemini 2.5 family of models

Major

Google Gemini 2.5 Pro Released

2025-03-25

Google released Gemini 2.5 Pro, its most advanced model at launch, with stronger reasoning capabilities and a 1 million token context window. The model topped the LM Arena leaderboard and offered enhanced coding, complex problem-solving, and multimodal reasoning abilities.

1 million token context window
Native multimodal reasoning
Top of LM Arena leaderboard
Enhanced coding capabilities
Complex multimodal problem-solving

google model-release gemini llm reasoning multimodal

Sources

Gemini 2.5 Pro

Major

Google Gemini 2.0 Flash Released

2024-12-11

Google released Gemini 2.0 Flash with native multimodal output spanning text, images, and audio. The model supported a 1M token context window and native tool use, and it significantly improved on 1.5 Flash in both speed and quality, a major step toward AI agent capabilities.

Native multimodal output (text, image, audio)
1M token context window
Native tool use
2× faster than 1.5 Pro
AI agent capabilities

google model-release gemini llm multimodal

Sources

Gemini 2.0 Flash

Major

Google NotebookLM with Audio Overviews

2024-09-19

Google launched NotebookLM with Audio Overviews, an AI research assistant that could generate podcast-style audio summaries of documents. The feature created natural-sounding conversations between two AI hosts discussing user-uploaded content, making research consumption more accessible.

AI-generated podcast summaries
Two AI hosts in conversation
Natural audio generation
Makes research more accessible
Custom sources and documents

google product-launch notebooklm productivity llm

Sources

NotebookLM with Audio Overviews

Major

Google Gemini 1.5 Flash Released

2024-08-20

Google released Gemini 1.5 Flash, a lightweight model optimized for speed and efficiency while maintaining strong performance. It featured the same 1M+ token context window as Pro but at much lower latency and cost, making long-context applications more practical.

Lightweight and efficient
1M+ token context window
Much lower latency than Pro
Cost-effective for production
Strong multimodal performance

google model-release gemini llm

Sources

Gemini 1.5 Flash

Major

Google Veo Video Generation Model

2024-05-14

Google announced Veo, its most capable video generation model, at Google I/O 2024. Veo could generate high-quality 1080p videos from text prompts, with coherent storytelling and realistic physics. It offered features like cinematic styles, camera controls, and extended clip generation, positioning Google as a major competitor in AI video generation.

1080p video generation
Coherent storytelling across clips
Cinematic visual styles
Camera angle and movement control
Realistic physics simulation

google model-release video-generation generative-ai multimodal

Sources

Google Veo Announcement

Major

Gemma Open Models Released

2024-02-21

Google released Gemma, a family of open lightweight language models built from the same research and technology used to create Gemini. Available in 2B and 7B parameter sizes, Gemma models were state-of-the-art for their size class, outperforming Llama-2 7B and Mistral 7B on key benchmarks. The models were released with responsible AI safeguards and commercial-friendly terms.

2B and 7B parameter models
Built on Gemini research and technology
State-of-the-art for size class
Outperformed Llama-2 and Mistral 7B
Commercial-friendly open license

google model-release open-source gemma llm

Sources

Gemma: Introducing New State-of-the-Art Open Models

Major

Google Gemini 1.5 Pro

2024-02-15

Google released Gemini 1.5 Pro with an unprecedented 1 million token context window (later expanded to 2M). It featured Mixture-of-Experts architecture and dramatically improved performance across benchmarks, including 'needle in a haystack' retrieval at 99% accuracy over 1M tokens.

1 million token context window (2M later)
Mixture-of-Experts (MoE) architecture
99% accuracy on needle retrieval
Strong multilingual capabilities
Long video understanding

google model-release gemini llm long-context multimodal

Sources

Gemini 1.5 Pro Announcement
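
The "needle in a haystack" retrieval test mentioned in the Gemini 1.5 Pro entry can be illustrated with a small harness. This is only a sketch of the general idea, not Google's evaluation code: `build_haystack`, `retrieval_accuracy`, and the `answer_fn` callable (a stand-in for a model call) are hypothetical names, and the substring check is an assumed scoring rule.

```python
def build_haystack(filler_sentences, needle, depth):
    """Insert `needle` at a relative depth (0.0 = start, 1.0 = end) of the filler text."""
    position = round(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences)


def retrieval_accuracy(answer_fn, filler_sentences, needle, secret, depths):
    """Fraction of insertion depths at which the answer contains the secret.

    answer_fn is a hypothetical callable that takes the full context string
    and returns the model's answer as a string.
    """
    hits = 0
    for depth in depths:
        context = build_haystack(filler_sentences, needle, depth)
        if secret in answer_fn(context):
            hits += 1
    return hits / len(depths)


# Stand-in "model" that simply echoes its context, so every needle is found.
filler = ["The sky was clear.", "Traffic was light.", "Lunch was late.", "It rained at dusk."]
score = retrieval_accuracy(
    lambda context: context,
    filler,
    needle="The access code is 7421.",
    secret="7421",
    depths=[0.0, 0.5, 1.0],
)
```

A real run would replace the echo function with a call to a long-context model and use far more filler text, sweeping both context length and needle depth.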

Major

Google Gemini 1.0 Pro Public Launch

2024-02-01

Google officially launched Gemini 1.0 Pro, its most capable model yet, to the public through Bard. Gemini was built from the ground up to be multimodal: it was trained jointly across text, code, image, and video, giving it native multimodal capabilities rather than relying on stitched-together models.

Native multimodal architecture
Trained on text, code, image, video
Available through Bard
32K token context window
Outperformed GPT-3.5 on most benchmarks

google model-release gemini llm multimodal

Sources

Gemini 1.0 Announcement

Major

Google PaLM 2 and Bard Updates

2023-05-10

Google announced PaLM 2, its next-generation large language model, along with major updates to Bard. PaLM 2 featured improved multilingual, reasoning, and coding capabilities. This marked Google's serious entry into the chatbot wars against OpenAI's ChatGPT.

Successor to PaLM (540B)
Improved multilingual support
Better reasoning and coding
Bard available globally
Google's response to ChatGPT

google model-release palm llm bard

Sources

Google I/O 2023: PaLM 2

Major

Switch Transformers: Scaling to Trillion Parameter Models

2021-01-11

Google introduced Switch Transformers, a sparse Mixture of Experts (MoE) architecture scaled up to 1.6 trillion parameters. Each token is routed to a single expert per layer, so the compute per token stays roughly constant despite the massive parameter count. The paper reported up to 7x faster pre-training than dense models given the same computational resources.

1.6 trillion parameters
Mixture of Experts architecture
Sparse activation: one expert per token
Up to 7x faster pre-training than dense models
Pathways architecture foundation

research-paper mixture-of-experts scaling transformer

Sources

Switch Transformers Paper
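
The sparse routing described in the Switch Transformers entry can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the router is a plain dot product per expert, the experts are simple Python functions, and load balancing and capacity limits are left out. All names here are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def switch_route(token, router_weights):
    """Pick a single expert for one token (top-1 routing).

    token: list of floats (the token representation).
    router_weights: one weight vector per expert (toy router parameters).
    Returns (expert_index, gate_probability).
    """
    logits = [sum(w * x for w, x in zip(weights, token)) for weights in router_weights]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


def switch_layer(tokens, router_weights, experts):
    """Run each token through its chosen expert and scale the output by the gate value."""
    outputs = []
    for token in tokens:
        index, gate = switch_route(token, router_weights)
        expert_out = experts[index](token)
        outputs.append([gate * v for v in expert_out])
    return outputs


# Toy setup: two experts, each a simple elementwise function.
experts = [
    lambda t: [2.0 * v for v in t],
    lambda t: [v + 1.0 for v in t],
]
router_weights = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[3.0, 0.0], [0.0, 3.0]]
routed = [switch_route(t, router_weights)[0] for t in tokens]  # [0, 1]
layer_out = switch_layer(tokens, router_weights, experts)
```

Only one expert function runs per token, which is why the total parameter count can grow with the number of experts while the work per token stays about the same.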

Notable

Improved Guarantees and a Multiple-Descent Curve for Column Subset Selection and the Nyström Method

2020-02-24

This paper gave improved theoretical guarantees for column subset selection and the Nyström method and described a multiple-descent curve phenomenon. Although highly technical, the work improved understanding of low-rank approximation methods commonly used in scalable machine learning algorithms.

Improved theoretical bounds
Multiple-descent curve phenomenon
Column subset selection analysis
Nyström method improvements

research-paper optimization machine-learning low-rank

Sources

Column Subset Selection Paper

Major

T5: Text-to-Text Transfer Transformer

2019-10-23

Google introduced T5, treating every NLP task as a text-to-text problem. This unified approach achieved state-of-the-art results across many benchmarks. The paper also introduced the famous C4 (Colossal Clean Crawled Corpus) dataset for pre-training.

Unified text-to-text framework
All NLP tasks treated as generation
C4 dataset (750GB of clean text)
11B parameter model (T5-11B)
State-of-the-art on multiple benchmarks

google research-paper t5 llm transformer nlp

Sources

Exploring the Limits of Transfer Learning
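
The text-to-text framing in the T5 entry means every task becomes a pair of strings: an input with a task prefix and a target to generate. The sketch below assumes prefixes in the style of the examples shown in the T5 paper. The function name and task keys are hypothetical and are not part of any T5 library.

```python
def to_text_to_text(task, **fields):
    """Render one task example as an (input_text, target_text) pair of strings."""
    if task == "translate_en_de":
        return f"translate English to German: {fields['source']}", fields["target"]
    if task == "summarize":
        return f"summarize: {fields['document']}", fields["summary"]
    if task == "cola":
        # Classification labels are emitted as literal text, not class indices.
        label = "acceptable" if fields["acceptable"] else "not acceptable"
        return f"cola sentence: {fields['sentence']}", label
    raise ValueError(f"unknown task: {task}")


pair = to_text_to_text("translate_en_de", source="That is good.", target="Das ist gut.")
```

Because translation, summarization, and classification all share this string-in, string-out shape, one model and one training objective can cover them.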

Landmark

BERT: Pre-training of Deep Bidirectional Transformers

2018-10-11

Google introduced BERT (Bidirectional Encoder Representations from Transformers), achieving state-of-the-art results on 11 NLP tasks. Unlike previous models that read text left-to-right, BERT uses bidirectional training, reading entire word sequences simultaneously.

Bidirectional training (vs. left-to-right)
340M parameters (BERT-large)
State-of-the-art on 11 NLP tasks
Introduced masked language modeling
Foundation for many downstream applications

google research-paper bert llm transformer nlp

Sources

BERT Paper
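
The masked language modeling objective listed in the BERT entry can be sketched as a corruption step over a token list. The 15% selection rate and the 80/10/10 split between `[MASK]`, a random token, and the unchanged token follow the BERT paper. The per-position coin flip, the toy vocabulary, and the function name are simplifying assumptions.

```python
import random


def mask_tokens(tokens, vocab, mask_rate=0.15, seed=0):
    """Return (corrupted_tokens, labels) for masked language modeling.

    labels[i] holds the original token at positions the model must predict
    and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_rate:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = "[MASK]"
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)
        # Otherwise the original token is kept unchanged.
    return corrupted, labels


sentence = "the cat sat on the mat".split()
corrupted, labels = mask_tokens(sentence, vocab=["dog", "ran", "hat"], mask_rate=0.5)
```

The model sees `corrupted` and is trained to recover the tokens stored in `labels`, which it can only do by using context on both sides of each masked position.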

Major

Dynamic Routing Between Capsules

2017-10-26

Sara Sabour, Nicholas Frosst, and Geoffrey Hinton introduced Capsule Networks, a new architecture that preserves spatial hierarchies between features. Unlike CNNs, capsules output vectors representing instantiation parameters (pose, orientation, etc.) and use dynamic routing. The paper achieved state-of-the-art on MNIST with far fewer parameters and showed superior performance on overlapping digits.

Vector outputs preserve spatial hierarchies
Dynamic routing between capsules
Equivariance to transformations
State-of-the-art on MNIST
Superior to CNNs on overlapping objects

research-paper capsule-networks vision hinton

Sources

Capsule Networks Paper (NIPS 2017)
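
The vector outputs and dynamic routing named in the capsule entry can be illustrated with a small pure-Python sketch. It follows the squash nonlinearity and routing-by-agreement loop as described in the paper, but the list-based data layout, the function names, and the toy sizes are assumptions made for readability.

```python
import math


def squash(vector):
    """Shrink a vector's length into [0, 1) while keeping its direction."""
    norm_sq = sum(v * v for v in vector)
    if norm_sq == 0.0:
        return [0.0 for _ in vector]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * v for v in vector]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(predictions, iterations=3):
    """Route lower-level capsule predictions to higher-level capsules.

    predictions[i][j] is the vector that lower capsule i predicts for upper capsule j.
    Returns one output vector per upper capsule.
    """
    num_lower = len(predictions)
    num_upper = len(predictions[0])
    dim = len(predictions[0][0])
    logits = [[0.0] * num_upper for _ in range(num_lower)]
    outputs = []
    for _ in range(iterations):
        # Each lower capsule spreads its vote across the upper capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(num_upper):
            total = [0.0] * dim
            for i in range(num_lower):
                for d in range(dim):
                    total[d] += coupling[i][j] * predictions[i][j][d]
            outputs.append(squash(total))
        # Raise the logit wherever a prediction agrees with the resulting output.
        for i in range(num_lower):
            for j in range(num_upper):
                agreement = sum(p * o for p, o in zip(predictions[i][j], outputs[j]))
                logits[i][j] += agreement
    return outputs
```

The length of each output vector acts as the probability that the entity it represents is present, which is why `squash` keeps lengths below 1.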

Landmark

Attention Is All You Need: The Transformer

2017-06-12

Google researchers published the Transformer architecture, completely revolutionizing NLP by replacing recurrence with attention mechanisms. This paper became the foundation for GPT, BERT, and virtually all modern large language models, enabling parallel training and better long-range dependencies.

Introduced self-attention mechanism
Eliminated recurrent connections
Enabled parallel training (much faster)
Foundation for all modern LLMs
GPT, BERT, T5 all derived from this

google research-paper nlp transformer attention

Sources

Attention Is All You Need
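
The attention mechanism at the center of the Transformer entry is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. The sketch below computes it for small lists of row vectors. It omits multiple heads, masking, and learned projections, and the function names are illustrative only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(keys[0])
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim_v)]
        )
    return outputs


# Identical keys give uniform weights, so the output is the mean of the values.
attended = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 1.0], [1.0, 1.0]],
    values=[[0.0], [2.0]],
)
```

Each query is handled independently of the others, which is what lets the whole sequence be processed in parallel instead of step by step as in a recurrent network.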

Major

Google Neural Machine Translation (GNMT)

2016-09-26

Google introduced GNMT, a neural machine translation system that reduced translation errors by 60% compared to phrase-based systems. The system used a deep LSTM network with 8 encoder and 8 decoder layers with attention mechanisms. It was deployed to Google Translate, translating billions of words daily across 100+ languages.

60% reduction in translation errors
Deep LSTM with attention mechanism
Deployed to Google Translate
100+ language pairs supported
Billions of words translated daily

research-paper nlp translation sequence-to-sequence

Sources

GNMT Paper (arXiv 2016)

Landmark

TensorFlow Released by Google

2015-11-09

Google released TensorFlow as an open-source machine learning framework that would become a dominant platform for deep learning research and production. It provided a flexible ecosystem of tools, libraries, and community resources.

Open-source successor to DistBelief
Flexible architecture for various platforms
Support for deep neural networks and ML
Became industry standard framework
Enabled rapid AI development globally

google open-source product-launch framework deep-learning

Sources

TensorFlow Announcement

Major

Batch Normalization: Accelerating Deep Network Training

2015-02-11

Google researchers introduced Batch Normalization, a technique that normalizes layer inputs to reduce internal covariate shift. This allowed much higher learning rates, reduced training time, and acted as a regularizer. It became essential for training very deep networks like ResNet and is now standard in virtually all deep learning architectures.

Normalizes activations within mini-batches
Allows higher learning rates (up to 100x)
Reduces training time significantly
Acts as regularization
Enables training of deeper networks

research-paper normalization deep-learning training

Sources

Batch Normalization Paper (ICML 2015)
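
The normalization step in the Batch Normalization entry can be written out directly: subtract the per-feature mini-batch mean, divide by the mini-batch standard deviation, then apply a learned scale and shift. This is a training-mode sketch only. The running statistics used at inference time are omitted, and the function name and list-of-lists layout are assumptions.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch: list of examples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned parameters in a real network).
    """
    n = len(batch)
    num_features = len(batch[0])
    means = [sum(row[f] for row in batch) / n for f in range(num_features)]
    variances = [
        sum((row[f] - means[f]) ** 2 for row in batch) / n
        for f in range(num_features)
    ]
    return [
        [
            gamma[f] * (row[f] - means[f]) / math.sqrt(variances[f] + eps) + beta[f]
            for f in range(num_features)
        ]
        for row in batch
    ]


normalized = batch_norm([[1.0, 10.0], [3.0, 30.0]], gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

After this step both features have roughly zero mean and unit variance within the batch, regardless of their original scale. `gamma` and `beta` let the network undo the normalization where that helps.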

Major

Word2Vec: Efficient Estimation of Word Representations

2013-01-16

Google researchers introduced Word2Vec, a revolutionary neural network architecture for creating word embeddings that captured semantic meaning. This marked a major breakthrough in natural language processing by showing that neural networks could learn meaningful representations of language from raw text.

Introduced skip-gram and CBOW architectures
Enabled semantic word relationships (king - man + woman = queen)
Trained on 100 billion words from Google News
Became foundational for modern NLP

google research-paper nlp word-embeddings deep-learning

Sources

Word2Vec Research Paper
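
Two ideas from the Word2Vec entry can be sketched briefly: how skip-gram turns raw text into (center, context) training pairs, and how the "king - man + woman = queen" analogy is answered by vector arithmetic. The 2-D vectors below are hand-made for illustration (real embeddings are learned and have hundreds of dimensions), and all function names are hypothetical.

```python
import math


def skipgram_pairs(tokens, window=2):
    """List (center, context) pairs for every token and its neighbours within `window`."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hand-made 2-D vectors (axes: royalty, gender). Real embeddings are learned from text.
toy_vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}
answer = analogy(toy_vectors, "king", "man", "woman")  # "queen"
```

In the real model, training on pairs like those from `skipgram_pairs` is what pushes words that appear in similar contexts toward similar vectors.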