Google just made a big move at I/O 2026. On May 19, the company officially launched the Gemini 3.5 model family — and the first model out of the gate is Gemini 3.5 Flash.
This is not a routine upgrade. Gemini 3.5 Flash is Google’s strongest model yet in the Flash line, and it arrives with a clear purpose: to be the engine that powers real-world AI agents, systems that do not just answer questions but actually get complex, multi-step work done.
This article covers everything that is currently known about Gemini 3.5 Flash: what it does, who built it, how it performs on benchmarks, which companies are already using it, and where you can access it today.
What Is Gemini 3.5 Flash?
Gemini 3.5 Flash is the first model in Google’s new Gemini 3.5 series, announced by four of the most senior researchers at Google DeepMind — Koray Kavukcuoglu (CTO and Chief AI Architect), Jeff Dean (Chief Scientist), Oriol Vinyals (Vice President), and Noam Shazeer (Vice President). That authorship alone signals how seriously Google is positioning this release.
The model is built around a specific design goal: combining frontier-level intelligence with the ability to take action. In other words, it is not only smarter than previous Flash models — it is built to work inside agent systems that execute long chains of tasks, manage subagents, and deliver real outcomes at the speed that production environments demand.
Google describes it as the company’s strongest agentic and coding model to date.
What Makes It Different From Previous Gemini Models?
The Gemini Flash line has always been positioned as the speed tier — faster than Pro models, useful for real-time applications, and cheaper to run. What changed with 3.5 Flash is that it no longer accepts a significant intelligence trade-off in exchange for that speed.
According to Google’s official benchmark data, Gemini 3.5 Flash outperforms Gemini 3.1 Pro, the previous flagship, on demanding coding and agentic evaluation tasks. That is a notable shift: a Flash-tier model beating a Pro-tier predecessor on these tasks suggests that the trade-off between speed and capability is narrowing quickly.

On the output speed side, Google reports that 3.5 Flash generates tokens at roughly four times the rate of comparable frontier models from other providers. That combination — frontier reasoning at Flash speeds — is what the Artificial Analysis index reflects, placing 3.5 Flash in the top-right quadrant when charting intelligence against output speed.
Benchmark Performance: The Numbers
Google shared specific benchmark figures for Gemini 3.5 Flash at launch. Here is what the model achieved across key evaluation tasks:
| Benchmark | Score |
|---|---|
| Terminal-Bench 2.1 (agentic coding) | 76.2% |
| GDPval-AA (general agent tasks) | 1656 Elo |
| MCP Atlas (tool use and multi-step tasks) | 83.6% |
| CharXiv Reasoning (multimodal understanding) | 84.2% |
These are not trivial benchmarks. Terminal-Bench and MCP Atlas specifically test the model’s ability to handle multi-step, real-world agentic workloads — the kind of tasks that involve using tools, writing and running code, navigating errors, and sustaining context across many steps. Scoring above 76% and 83% on these tasks puts 3.5 Flash in direct competition with much larger and costlier models.
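The GDPval-AA figure is an Elo rating rather than a percentage, so it only carries meaning relative to another model's rating. The sketch below assumes the benchmark uses the conventional Elo scale, where a 400-point gap corresponds to 10:1 odds; the announcement as summarized here does not confirm that. The comparison rating of 1556 is made up purely for illustration.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    # Conventional Elo expectation: probability that A is preferred over B.
    # Assumption: GDPval-AA follows the standard 400-point scale.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    flash = 1656          # GDPval-AA rating quoted in the table above
    hypothetical = 1556   # invented comparison rating, not a real model score
    print(f"{expected_win_rate(flash, hypothetical):.2f}")
```

Under that assumption, a 100-point lead translates to being preferred in roughly 64% of head-to-head comparisons, which is why the gap to other models matters more than the absolute number.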
The Agentic Focus: Why This Matters Now
The word “agentic” appears throughout Google’s announcement, and it is worth understanding what it means in practical terms.
A standard AI model answers prompts. An agentic AI system plans, executes, monitors, adjusts, and completes goals — often across many steps, tools, and time. Think of the difference between asking someone a question and hiring someone to run a project for you.
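The plan, execute, monitor, adjust cycle described above can be sketched as a small loop. This is a minimal illustration, not Google's implementation: `plan`, `execute_step`, and the retry limit are hypothetical stand-ins, and a real agent would call a model and external tools where the stubs are.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    step: str
    ok: bool
    output: str


@dataclass
class AgentRun:
    goal: str
    history: list = field(default_factory=list)


def plan(goal: str) -> list[str]:
    # Hypothetical planner: a real agent would ask a model to break the goal down.
    return [f"{goal}: gather inputs", f"{goal}: do the work", f"{goal}: verify"]


def execute_step(step: str, attempt: int) -> StepResult:
    # Hypothetical tool call. The "do the work" step fails on its first attempt
    # so the loop has something to recover from.
    failed = "do the work" in step and attempt == 0
    return StepResult(step, ok=not failed, output="error" if failed else "done")


def run_agent(goal: str, max_retries: int = 2) -> AgentRun:
    run = AgentRun(goal)
    for step in plan(goal):                        # plan
        for attempt in range(max_retries + 1):
            result = execute_step(step, attempt)   # execute
            run.history.append(result)             # monitor
            if result.ok:
                break                              # step complete, move on
            # adjust: this sketch simply retries; a real agent might replan
        else:
            raise RuntimeError(f"step failed after retries: {step}")
    return run


if __name__ == "__main__":
    for r in run_agent("monthly report").history:
        print(r.ok, r.step)
```

The point of the sketch is the control flow: the loop keeps a record of every attempt and decides for itself whether to continue, which is what separates an agent from a single prompt-and-response call.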
Gemini 3.5 Flash is designed for the second kind of work. According to Google, tasks that would previously take a developer several days or an auditor multiple weeks can now be completed in a fraction of that time, at less than half the cost of other frontier models running the same workloads.
This is made possible in part by a new platform Google announced alongside the model: Google Antigravity. Antigravity is described as an agent-first development environment that lets developers deploy multiple subagents — specialized AI workers — coordinated by 3.5 Flash as the orchestrating intelligence. These subagents can run in parallel, retain context across multi-turn tool calls, and collaborate to solve problems that are too large or complex for any single model interaction.
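The orchestration pattern described above, one coordinator fanning work out to parallel subagents that share context, can be sketched with Python's `asyncio`. This does not use the Antigravity API, which is not documented in this article; `subagent` and `orchestrate` are hypothetical names, and the sleep call stands in for real model or tool latency.

```python
import asyncio


async def subagent(name: str, task: str, shared_context: dict) -> dict:
    # Hypothetical specialized worker; a real one would call a model and tools.
    await asyncio.sleep(0.01)  # stands in for model or tool latency
    return {
        "agent": name,
        "task": task,
        "project": shared_context["project"],
        "result": f"{name} finished {task}",
    }


async def orchestrate(project: str, assignments: dict[str, str]) -> list[dict]:
    # The orchestrator owns the shared context and fans the work out.
    shared_context = {"project": project}
    jobs = [subagent(name, task, shared_context) for name, task in assignments.items()]
    # gather runs the subagents concurrently and returns results in input order.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    results = asyncio.run(orchestrate(
        "invoice audit",
        {"extractor": "read invoices", "matcher": "match payments", "writer": "draft summary"},
    ))
    for r in results:
        print(r["result"])
```

Because the subagents run concurrently, total wall-clock time is close to that of the slowest subagent rather than the sum of all of them, which is the practical appeal of the pattern.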
Real Companies Using It Right Now
One of the strongest credibility signals in Google’s announcement is the list of enterprise partners already applying Gemini 3.5 Flash to real business problems. Most are described as running it in production workflows, while Macquarie Bank is described as testing it.
Shopify is running parallel subagents to analyze complex merchant data across long time horizons, improving the accuracy of growth forecasts at a global scale.
Macquarie Bank is testing the model’s ability to reason through documents exceeding 100 pages — the kind of dense customer onboarding paperwork that typically requires significant manual review — extracting relevant information and generating reliable recommendations with low latency.
Salesforce is integrating 3.5 Flash into its Agentforce platform, using it to automate complicated enterprise workflows through multiple subagents that maintain context across complex, multi-turn interactions.
Ramp, a financial technology company, is using the model’s multimodal understanding to improve invoice processing — combining visual comprehension of invoice documents with reasoning over historical transaction patterns to deliver more accurate OCR results.
Xero is deploying it to manage multi-week autonomous workflows, including the identification of suppliers and compilation of tax documentation — administrative work that typically consumes significant staff time.
Databricks is using it inside agentic monitoring systems that watch real-time data streams, diagnose pipeline issues, and surface proposed fixes to data science teams.
The breadth of these use cases — from financial document processing to merchant analytics to tax automation — illustrates that the model’s agentic capabilities are genuinely horizontal. It is not specialized for one domain.
Where Gemini 3.5 Flash Is Available

As of May 19, 2026, Gemini 3.5 Flash is accessible across multiple platforms:
For general users, it is now the default model powering the Gemini app and AI Mode in Google Search globally. Most people interacting with Google’s AI products are already using it whether they realize it or not.
For developers, it is available through the Gemini API in Google AI Studio, Android Studio, and the new Google Antigravity platform. Developers building agent systems, coding tools, or multimodal applications can access it directly via the API.
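For developers who want a sense of what API access looks like, here is a minimal sketch that builds a request without sending it. It assumes the `generateContent` REST method and request shape documented for earlier Gemini models still apply, and the model identifier `gemini-3.5-flash` is a guess rather than something taken from the announcement. Check Google's current API reference before relying on either.

```python
import json
import os
import urllib.request

# Assumption: this model identifier is a guess, not confirmed by the announcement.
MODEL_ID = "gemini-3.5-flash"
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    # Body shape follows the generateContent method documented for earlier models.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY", "YOUR_KEY")
    req = build_request("Summarize this changelog.", key)
    # Nothing is sent here; pass req to urllib.request.urlopen to make the call.
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to inspect or log the payload first, and to swap in the correct model identifier once it is confirmed.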
For enterprise customers, it is available through the Gemini Enterprise Agent Platform and Gemini Enterprise products.
Google has also noted that Gemini 3.5 Pro is in active internal use and is expected to become publicly available the following month.
Gemini Spark: The Personal AI Agent
One of the more forward-looking announcements tied to Gemini 3.5 Flash is Gemini Spark — a personal AI agent that runs on the model.
Spark is described as a continuous AI assistant that operates around the clock, taking action on behalf of users while remaining under their direction. It is designed to handle the kind of ongoing digital tasks — managing information, acting on instructions, navigating workflows — that currently require constant human attention.
Google is rolling out Spark to trusted testers first, with a Beta release planned for Google AI Ultra subscribers in the United States in the week following the I/O announcement.
Spark represents a meaningful shift in what consumer AI looks like. Rather than a chatbot that responds to individual prompts, it is a persistent agent that works continuously in the background — with 3.5 Flash as its core reasoning engine.
Safety and Responsibility
Google developed Gemini 3.5 in accordance with its Frontier Safety Framework, the company’s internal protocol for evaluating and mitigating risks in its most capable models.
For this release, Google reports that the model has strengthened safeguards specifically in the areas of cybersecurity and CBRN (chemical, biological, radiological, and nuclear) risk categories. The improvements work in two directions: the model is less likely to produce harmful content, and it is also less likely to wrongly refuse legitimate, safe requests — a balance that has historically been difficult to achieve.
A notable technical detail in Google’s announcement is the use of interpretability tools that allow researchers to examine and verify the model’s internal reasoning process before it generates a response. This represents a meaningful step toward AI systems that are not only well-behaved on the surface, but whose decision-making can be inspected and understood at a deeper level.
Multimodal and Graphics Capabilities
Beyond the agentic and coding focus, Gemini 3.5 Flash builds on the multimodal foundation established in Gemini 3 and extends it toward richer interactive output.
Google demonstrated the model creating interactive web animations for research papers, converting plain-text descriptions into working interactive hardware interfaces, generating multiple branding concepts in parallel, and producing different UX approaches for a checkout flow — all within 60 seconds.
The CharXiv Reasoning benchmark score of 84.2% reflects the model’s strength in understanding complex charts, diagrams, and visual data — a capability that feeds directly into its usefulness for document analysis, scientific content, and data-rich applications.
Who Should Pay Attention to Gemini 3.5 Flash?
Developers building agent systems will find the most immediate value. The combination of strong tool-use benchmarks, the Antigravity platform, and cost efficiency below competing frontier models makes 3.5 Flash a serious option for production agentic applications.
Enterprise teams with document-heavy workflows, such as legal, financial, compliance, and procurement, should look at how Macquarie Bank and Ramp are using it. Long-document reasoning and multimodal invoice understanding are the capabilities those two companies are putting to work.
AI product builders working on consumer applications will benefit from the fact that 3.5 Flash is already the default in the Gemini app, which suggests Google is confident in its output quality at global consumer scale.
Researchers and data scientists working in environments like Databricks will find the real-time monitoring and diagnostic agent patterns relevant to complex data pipeline management.
If your work involves any kind of multi-step task automation, document analysis, code maintenance, or building AI products that need to act rather than just answer — Gemini 3.5 Flash is worth evaluating today.
What Comes Next in the Gemini 3.5 Family?

Gemini 3.5 Flash is the opening model in the series. Google has confirmed that Gemini 3.5 Pro is already deployed internally and is expected to reach public availability in the month following the I/O announcement — likely June 2026.
The Pro model in previous Gemini generations has typically offered higher capability ceilings for the most complex tasks, with more compute and slower generation speeds than Flash. If 3.5 Flash is already outpacing 3.1 Pro on agentic benchmarks, the 3.5 Pro release will be a significant moment to watch.
Separately, Google also announced Gemini Omni at I/O 2026, which appears to be a distinct branch of the model family with its own capability focus. Details on that model are covered separately.
Key Takeaways
Gemini 3.5 Flash is a genuinely significant release — not because of marketing language, but because of what the numbers and the production deployments show.
According to Google’s benchmark data, it is a Flash-tier model that outperforms the previous Pro-tier flagship, Gemini 3.1 Pro, on hard agentic tasks. Google reports that it generates output at roughly four times the speed of comparable frontier models. It is already in use at major enterprises across banking, e-commerce, fintech, and data infrastructure. And it powers the new Gemini Spark personal agent that Google is positioning as a preview of where consumer AI is heading.
For anyone building AI products or managing workflows where the goal is getting real work done — not just generating text — Gemini 3.5 Flash is the most immediately practical model Google has released to date.
We are currently testing Gemini 3.5 Flash in real-world scenarios to see how it performs across speed, accuracy, and everyday use cases. A detailed hands-on review with our full analysis and findings will be published soon.