Organization page

Google

A company-specific timeline showing the most important milestones for Google.

  • 23milestones
  • 2013-01-16-2026-03-03range

Major

Gemini 3.1 Flash-Lite Released

Google released Gemini 3.1 Flash-Lite as its fastest and most cost-efficient Gemini 3 series model. The preview focused on high-volume developer workloads where low latency, lower cost, and good quality all matter at scale.

  • Fastest Gemini 3 series model
  • Cost-efficient high-volume workload focus
  • Preview availability in Gemini API and Vertex AI
  • Built for responsive real-time experiences
googlemodel-releasegeminillmproductivity

Sources

Major

Gemini 3.1 Pro Released

Google released Gemini 3.1 Pro as an upgraded core intelligence model for consumer and developer products. The release emphasized improved reasoning on complex tasks and broader availability across Gemini, Gemini API, Vertex AI, Gemini CLI, and Google Antigravity.

  • Upgraded core intelligence for Gemini 3
  • Better performance on complex reasoning tasks
  • Available across consumer and developer surfaces
  • Preview access for developers and enterprises
googlemodel-releasegeminillmreasoning

Sources

Major

Gemini 3 Released

Google released Gemini 3, its most intelligent Gemini model to date. The launch highlighted stronger reasoning, improved multimodal understanding, and wider access across the Gemini app, AI Studio, and Vertex AI.

  • Gemini 3 Pro for general model access
  • Improved reasoning and coding performance
  • Stronger multimodal capabilities
  • Available across Gemini app, AI Studio, and Vertex AI
  • Deep Think mode announced for advanced use
googlemodel-releasegeminillmmultimodal

Sources

Major

Gemini 2.5 Flash and Pro Generally Available

Google made Gemini 2.5 Flash and Gemini 2.5 Pro generally available and introduced Gemini 2.5 Flash-Lite in preview. The release framed Gemini 2.5 as a hybrid reasoning family balancing performance, speed, and cost for production use.

  • Gemini 2.5 Flash and Pro reached GA
  • Gemini 2.5 Flash-Lite entered preview
  • Hybrid reasoning family for production use
  • Balance of quality, speed, and cost
googlemodel-releasegeminillmreasoning

Sources

Major

Google Gemini 2.5 Pro Released

Google released Gemini 2.5 Pro, their most advanced model with breakthrough reasoning capabilities and a 1 million token context window. The model topped the LM Arena leaderboard, featuring enhanced coding, complex problem-solving, and multimodal reasoning abilities.

  • 1 million token context window
  • Native multimodal reasoning
  • Top of LM Arena leaderboard
  • Enhanced coding capabilities
  • Complex multimodal problem-solving
googlemodel-releasegeminillmreasoningmultimodal

Sources

Major

Google Gemini 2.0 Flash Released

Google released Gemini 2.0 Flash, featuring native multimodal output including text, images, and native audio. The model supported 1M token context windows, tool use, and significantly improved speed and quality over 1.5 Flash, representing a major step in AI agent capabilities.

  • Native multimodal output (text, image, audio)
  • 1M token context window
  • Native tool use
  • 2× faster than 1.5 Pro
  • AI agent capabilities
googlemodel-releasegeminillmmultimodal

Sources

Major

Google NotebookLM with Audio Overviews

Google launched NotebookLM with Audio Overviews, an AI research assistant that could generate podcast-style audio summaries of documents. The feature created natural-sounding conversations between two AI hosts discussing user-uploaded content, making research consumption more accessible.

  • AI-generated podcast summaries
  • Two AI hosts in conversation
  • Natural audio generation
  • Makes research more accessible
  • Custom sources and documents
googleproduct-launchnotebooklmproductivityllm

Sources

Major

Google Gemini 1.5 Flash Released

Google released Gemini 1.5 Flash, a lightweight model optimized for speed and efficiency while maintaining strong performance. It featured the same 1M+ token context window as Pro but at much lower latency and cost, making long-context applications more practical.

  • Lightweight and efficient
  • 1M+ token context window
  • Much lower latency than Pro
  • Cost-effective for production
  • Strong multimodal performance
googlemodel-releasegeminillm

Sources

Major

Google Veo Video Generation Model

Google announced Veo, its most capable video generation model, at Google I/O 2024. Veo could generate high-quality 1080p videos from text prompts, with coherent storytelling and realistic physics. It offered features like cinematic styles, camera controls, and extended clip generation, positioning Google as a major competitor in AI video generation.

  • 1080p video generation
  • Coherent storytelling across clips
  • Cinematic visual styles
  • Camera angle and movement control
  • Realistic physics simulation
googlemodel-releasevideo-generationgenerative-aimultimodal

Sources

Major

Gemma Open Models Released

Google released Gemma, a family of open lightweight language models built from the same research and technology used to create Gemini. Available in 2B and 7B parameter sizes, Gemma models were state-of-the-art for their size class, outperforming Llama-2 7B and Mistral 7B on key benchmarks. The models were released with responsible AI safeguards and commercial-friendly terms.

  • 2B and 7B parameter models
  • Built on Gemini research and technology
  • State-of-the-art for size class
  • Outperformed Llama-2 and Mistral 7B
  • Commercial-friendly open license
googlemodel-releaseopen-sourcegemmallm

Sources

Major

Google Gemini 1.5 Pro

Google released Gemini 1.5 Pro with an unprecedented 1 million token context window (later expanded to 2M). It featured Mixture-of-Experts architecture and dramatically improved performance across benchmarks, including 'needle in a haystack' retrieval at 99% accuracy over 1M tokens.

  • 1 million token context window (2M later)
  • Mixture-of-Experts (MoE) architecture
  • 99% accuracy on needle retrieval
  • Strong multilingual capabilities
  • Long video understanding
googlemodel-releasegeminillmlong-contextmultimodal

Sources

Major

Google Gemini 1.0 Pro Public Launch

Google officially launched Gemini 1.0 Pro to the public through Bard, their most capable model yet. Gemini was built from the ground up to be multimodal, trained jointly across text, code, image, and video, with native multimodal capabilities rather than stitched-together models.

  • Native multimodal architecture
  • Trained on text, code, image, video
  • Available through Bard
  • 1M token context in Ultra
  • Outperformed GPT-3.5 on most benchmarks
googlemodel-releasegeminillmmultimodal

Sources

Major

Google PaLM 2 and Bard Updates

Google announced PaLM 2, their next-generation large language model, along with major updates to Bard. PaLM 2 featured improved multilingual, reasoning, and coding capabilities. This marked Google's serious entry into the chatbot wars against OpenAI's ChatGPT.

  • Successor to PaLM (540B)
  • Improved multilingual support
  • Better reasoning and coding
  • Bard available globally
  • Google's response to ChatGPT
googlemodel-releasepalmllmbard

Sources

Major

Switch Transformers: Scaling to Trillion Parameter Models

Google introduced Switch Transformers, a sparse expert model with 1.6 trillion parameters that uses Mixture of Experts (MoE) architecture. Each token routes to only a subset of experts, making inference efficient despite massive parameter count. This architecture enabled training models 7x larger than dense models with same compute budget.

  • 1.6 trillion parameters
  • Mixture of Experts architecture
  • Sparse activation - efficient inference
  • 7x faster training than dense models
  • Pathways architecture foundation
research-papermixture-of-expertsscalingtransformer

Sources

Notable

Improved Guarantees and a Multiple-Descent Curve for Column Subset Selection and the Nyström Method

Research on column subset selection and the Nyström method with improved theoretical guarantees. While technical, this work improved understanding of low-rank approximation methods commonly used in scalable machine learning algorithms.

  • Improved theoretical bounds
  • Multiple-descent curve phenomenon
  • Column subset selection analysis
  • Nyström method improvements
research-paperoptimizationmachine-learninglow-rank

Sources

Major

T5: Text-to-Text Transfer Transformer

Google introduced T5, treating every NLP task as a text-to-text problem. This unified approach achieved state-of-the-art results across many benchmarks. The paper also introduced the famous C4 (Colossal Clean Crawled Corpus) dataset for pre-training.

  • Unified text-to-text framework
  • All NLP tasks treated as generation
  • C4 dataset (750GB of clean text)
  • 11B parameter model (T5-11B)
  • State-of-the-art on multiple benchmarks
googleresearch-papert5llmtransformernlp

Sources

Landmark

BERT: Pre-training of Deep Bidirectional Transformers

Google introduced BERT (Bidirectional Encoder Representations from Transformers), achieving state-of-the-art results on 11 NLP tasks. Unlike previous models that read text left-to-right, BERT uses bidirectional training, reading entire word sequences simultaneously.

  • Bidirectional training (vs. left-to-right)
  • 340M parameters (BERT-large)
  • State-of-the-art on 11 NLP tasks
  • Introduced masked language modeling
  • Foundation for many downstream applications
googleresearch-paperbertllmtransformernlp

Sources

Major

Dynamic Routing Between Capsules

Geoffrey Hinton and Sara Sabour introduced Capsule Networks, a new architecture that preserves spatial hierarchies between features. Unlike CNNs, capsules output vectors representing instantiation parameters (pose, orientation, etc.) and use dynamic routing. The paper achieved state-of-the-art on MNIST with far fewer parameters and showed superior performance on overlapping digits.

  • Vector outputs preserve spatial hierarchies
  • Dynamic routing between capsules
  • Equivariance to transformations
  • State-of-the-art on MNIST
  • Superior to CNNs on overlapping objects
research-papercapsule-networksvisionhinton

Sources

Landmark

Attention Is All You Need: The Transformer

Google researchers published the Transformer architecture, completely revolutionizing NLP by replacing recurrence with attention mechanisms. This paper became the foundation for GPT, BERT, and virtually all modern large language models, enabling parallel training and better long-range dependencies.

  • Introduced self-attention mechanism
  • Eliminated recurrent connections
  • Enabled parallel training (much faster)
  • Foundation for all modern LLMs
  • GPT, BERT, T5 all derived from this
googleresearch-papernlptransformerattention

Sources

Major

Google Neural Machine Translation (GNMT)

Google introduced GNMT, a neural machine translation system that reduced translation errors by 60% compared to phrase-based systems. The system used a deep LSTM network with 8 encoder and 8 decoder layers with attention mechanisms. It was deployed to Google Translate, translating billions of words daily across 100+ languages.

  • 60% reduction in translation errors
  • Deep LSTM with attention mechanism
  • Deployed to Google Translate
  • 100+ language pairs supported
  • Billions of words translated daily
research-papernlptranslationsequence-to-sequence

Sources

Landmark

TensorFlow Released by Google

Google open-sourced TensorFlow, an open-source machine learning framework that would become the dominant platform for deep learning research and production. It provided a flexible ecosystem of tools, libraries, and community resources.

  • Open-source successor to DistBelief
  • Flexible architecture for various platforms
  • Support for deep neural networks and ML
  • Became industry standard framework
  • Enabled rapid AI development globally
googleopen-sourceproduct-launchframeworkdeep-learning

Sources

Major

Batch Normalization: Accelerating Deep Network Training

Google researchers introduced Batch Normalization, a technique that normalizes layer inputs to reduce internal covariate shift. This allowed much higher learning rates, reduced training time, and acted as a regularizer. It became essential for training very deep networks like ResNet and is now standard in virtually all deep learning architectures.

  • Normalizes activations within mini-batches
  • Allows higher learning rates (up to 100x)
  • Reduces training time significantly
  • Acts as regularization
  • Enables training of deeper networks
research-papernormalizationdeep-learningtraining

Sources

Major

Word2Vec: Efficient Estimation of Word Representations

Google researchers introduced Word2Vec, a revolutionary neural network architecture for creating word embeddings that captured semantic meaning. This marked a major breakthrough in natural language processing by showing that neural networks could learn meaningful representations of language from raw text.

  • Introduced skip-gram and CBOW architectures
  • Enabled semantic word relationships (king - man + woman = queen)
  • Trained on 100 billion words from Google News
  • Became foundational for modern NLP
googleresearch-papernlpword-embeddingsdeep-learning

Sources