Thursday, May 7, 2026
Technology

Google Trains AI Model on 40 Years of Dolphin Research in Bid to Decode Animal Communication



Original source: Google for Developers
This article is an editorial summary and interpretation of that content. The ideas belong to the original authors; the selection and writing are by Streamed.News.


This video from Google for Developers covered a lot of ground. Six segments stood out as worth your time. Each item below links directly to its timestamp in the original video.

Scientists have spent decades trying to understand what dolphins are actually saying to each other. An AI model trained on 40 years of recordings may finally give them the tools to find out.


Google Trains AI Model on 40 Years of Dolphin Research in Bid to Decode Animal Communication

Google has released DolphinGemma, described as the world's first large language model built specifically to analyse and generate dolphin vocalisations. The model was trained on four decades of underwater field recordings collected by the Wild Dolphin Project, in collaboration with Georgia Tech researchers. It can now synthesise novel dolphin sounds in minutes — work that previously took scientists days — and those sounds are being played back to dolphins in open water using purpose-built underwater speakers.

The project represents an unusual frontier for AI: rather than processing human language, the model attempts to find structure in animal communication, with the long-term goal of enabling rudimentary two-way interaction. Researchers say the model's ability to generate vast quantities of synthetic vocalisations dramatically accelerates a research programme that has been running for 15 years, potentially compressing years of decoding work into a much shorter timeframe.

"The true breakthrough is that I can generate so many sounds — days worth of work in minutes."

▶ Watch this segment — 1:03:07


Google's Gemma 3n Brings Multimodal AI to Devices with as Little as 2GB of RAM

Google has announced Gemma 3n, an open AI model engineered to run on smartphones and other low-memory devices with as little as 2 gigabytes of RAM. The model adds native audio understanding, making it fully multimodal — capable of processing text, images, and sound — while running faster and more efficiently than its predecessor. It will be available through Google AI Studio and distributed via platforms including Hugging Face, Ollama, and Unsloth at launch.

Alongside Gemma 3n, Google introduced MedGemma, a separate suite of open models designed to interpret medical images and clinical text. Together, the announcements extend the reach of capable open AI models into two distinct domains: resource-constrained consumer hardware and specialised healthcare applications. Putting a multimodal model on a low-end phone matters because it removes the need to send sensitive personal data to a remote server.
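
For a sense of what running the model locally looks like, here is a minimal sketch that queries a Gemma 3n model served by Ollama over its local REST API. The model tag is an assumption; check the name your installation actually reports.

```typescript
// Minimal sketch: querying a locally served Gemma 3n model through
// Ollama's REST API (http://localhost:11434 by default). The model tag
// below is an assumption; run `ollama list` to see the exact name.
async function askLocalGemma(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "gemma3n", // assumed tag; substitute what `ollama list` shows
      prompt,
      stream: false, // one JSON object back instead of a token stream
    }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response; // Ollama places the generated text here
}

askLocalGemma("In one sentence, what is multimodal AI?").then(console.log);
```

Because everything runs against localhost, the prompt and the response never leave the device, which is exactly the privacy benefit the low-memory footprint makes practical.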

▶ Watch this segment — 56:24


Chrome 137 Lets Gemini Diagnose and Rewrite CSS Bugs Directly in Developer Source Files

Google's Chrome 137 introduces AI assistance directly into the browser's developer tools, allowing engineers to describe a visual bug in plain English and have Gemini diagnose the underlying CSS problem and propose a fix. More significantly, the update allows those fixes to be written back into the developer's local source files without leaving the browser — collapsing a workflow that previously required switching between the browser, a text editor, and the original codebase. A redesigned Performance panel adds a parallel capability, using AI to explain the cause of layout shifts — the jarring visual jumps that degrade user experience — and suggest remedies.

For the roughly 20 million professional web developers globally, debugging layout and styling issues is among the most time-consuming routine tasks. Embedding AI assistance that understands both the rendered page and the underlying code, and can act on both simultaneously, moves Chrome DevTools closer to functioning as an autonomous coding assistant rather than a passive inspection tool.
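
The layout shifts the redesigned Performance panel explains are also exposed to page scripts through the browser's Layout Instability API, the same signal the panel visualises. A minimal sketch for Chromium browsers:

```typescript
// Minimal sketch: observing the layout shifts that Chrome's Performance
// panel explains, via the Layout Instability API. The interface below
// types the fields this sketch reads; it is not part of lib.dom.d.ts.
interface LayoutShiftEntry extends PerformanceEntry {
  value: number; // shift score contributed by this entry
  hadRecentInput: boolean; // shifts right after user input don't count toward CLS
}

let cumulativeShift = 0;
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as LayoutShiftEntry[]) {
    if (!entry.hadRecentInput) {
      cumulativeShift += entry.value;
      console.log(
        `Layout shift ${entry.value.toFixed(4)}; running total ${cumulativeShift.toFixed(4)}`
      );
    }
  }
});
// buffered: true replays shifts that happened before the observer started
observer.observe({ type: "layout-shift", buffered: true });
```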

"They don't just diagnose. They help you understand what to do next without leaving your workflow."

▶ Watch this segment — 39:38


Pinterest Cuts Over 2,000 Lines of Carousel JavaScript to About 200 with Native CSS APIs

Pinterest became an early adopter of new CSS carousel APIs built into modern browsers, replacing over 2,000 lines of custom JavaScript with roughly 200 lines of standards-based code — a reduction of about 90 percent. The switch also delivered a measurable performance gain: product pin load times improved by 15 percent. Previously, building and maintaining a performant image carousel required extensive bespoke JavaScript, a common burden for large consumer web platforms.

The results illustrate a broader shift in web development, where capabilities once requiring heavy JavaScript frameworks are increasingly handled natively by browsers. When browsers absorb that complexity, sites become faster and simpler to maintain — a benefit that flows directly to users through quicker page loads and to developers through reduced maintenance overhead.
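
A sketch of how a site might adopt these primitives progressively. The ::scroll-button pseudo-element is one of the new Chromium carousel features; the fallback module and its initCarousel export are hypothetical names used for illustration.

```typescript
// Minimal sketch: use the native CSS carousel features where the browser
// supports them, otherwise fall back to a legacy JavaScript carousel.
// Verify the ::scroll-button syntax against your target browsers.
const supportsNativeCarousel =
  typeof CSS !== "undefined" &&
  CSS.supports("selector(::scroll-button(right))");

if (supportsNativeCarousel) {
  // A stylesheet hook: let pure CSS render the scroll buttons and markers.
  document.documentElement.classList.add("native-carousel");
} else {
  // Hypothetical legacy module, named for illustration only.
  import("./legacy-carousel.js").then((mod) => mod.initCarousel());
}
```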

▶ Watch this segment — 35:03


Firebase Studio Gains Figma Import and Auto-Provisioned Backends for Full-Stack AI App Generation

Google's Firebase Studio, a cloud-based development workspace, now allows developers to import designs directly from Figma and convert them into working application code using an integration with Builder.io. In a live demonstration, Gemini 2.5 Pro constructed a product detail page, complete with a component architecture, sample data, and an Add to Cart feature, from a single structured prompt, building each element sequentially rather than dumping undifferentiated code into one file. Google also announced automatic backend provisioning: the platform will detect when an app needs a database or authentication system and configure them without manual setup.

The combination of design-to-code import and instant backend scaffolding compresses what has traditionally been days of setup work into minutes. For independent developers and small teams in particular, the barrier between having an idea and having a deployable prototype shrinks considerably — though the quality and security of auto-generated backend configurations will require scrutiny as the feature matures.
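
The single-structured-prompt approach is easy to approximate outside Firebase Studio. A minimal sketch using the public @google/genai SDK; the prompt wording is illustrative, not the one from the demonstration.

```typescript
import { GoogleGenAI } from "@google/genai";

// Minimal sketch: a structured, component-by-component prompt of the kind
// the demo used, sent through the public Gemini API rather than Firebase
// Studio itself. Assumes GEMINI_API_KEY is set in the environment.
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const prompt = [
  "Build a product detail page as separate components, in this order:",
  "1. ProductGallery: an image carousel with thumbnails",
  "2. ProductInfo: title, price, and description, driven by sample data",
  "3. AddToCartButton: updates a cart count held in local state",
  "Emit each component as its own file, one at a time.",
].join("\n");

const response = await ai.models.generateContent({
  model: "gemini-2.5-pro",
  contents: prompt,
});
console.log(response.text); // the generated component code, in sequence
```

Structuring the prompt as an ordered list is what nudges the model toward building each element sequentially instead of emitting one undifferentiated file.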

▶ Watch this segment — 48:47


Gemini 2.5 Flash Adds Native Audio and URL-Grounded Responses to Google's Live API

Google has released Gemini 2.5 Flash with native audio capabilities inside its Live API, the real-time interaction layer available through Google AI Studio. The model now processes and responds in audio natively across 24 languages, rather than relying on separate speech-to-text and text-to-speech conversion steps. Google also introduced a URL Context tool, which allows the model to retrieve and reason over the content of specific web pages, grounding its responses in live, external information rather than solely in its training data.

These additions matter because low-latency, voice-first AI interfaces require a model that understands speech as a first-class input — not as text with an audio wrapper. Supporting 24 languages natively broadens the potential user base significantly, while URL grounding addresses one of the most persistent limitations of large language models: the inability to reliably reference current, specific online sources during a conversation.
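
A minimal sketch of the URL Context tool through the @google/genai SDK; the URL is a placeholder.

```typescript
import { GoogleGenAI } from "@google/genai";

// Minimal sketch: grounding a Gemini 2.5 Flash response in the live
// content of a specific page via the URL Context tool. The URL is a
// placeholder; assumes GEMINI_API_KEY is set in the environment.
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents:
    "Summarise the release notes at https://example.com/release-notes " +
    "in three bullet points.",
  config: {
    tools: [{ urlContext: {} }], // allow the model to fetch and read the URL
  },
});
console.log(response.text);
```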

▶ Watch this segment — 3:48




Summarised from Google for Developers · 1:10:03. All credit belongs to the original creators. Streamed.News summarises publicly available video content.

Streamed.News

This publication is generated automatically from YouTube.
