This publication runs on Streamed.News.

Thursday, May 7, 2026 · streamed.news
Technology

AI Agents Gain Natural Language Control Over Storage Operations with Data Ops Toolkit Evolution

Original source: NetApp
This article is an editorial summary and interpretation of that content. The ideas belong to the original authors; the selection and writing are by Streamed.News.


This video from NetApp covered a lot of ground. Five segments stood out as worth your time. Each summary below links directly to its timestamp in the original video.

Discover how to use natural language to command your storage infrastructure. Learn the specific prompts and integrations that empower AI agents like GitHub Copilot Chat to automate tasks such as creating volumes, cloning data sets, and provisioning Jupyter Lab workspaces on your NetApp systems.


AI Agents Gain Natural Language Control Over Storage Operations with Data Ops Toolkit Evolution

The NetApp Data Ops Toolkit is gaining a Model Context Protocol (MCP) server, enabling AI agents such as GitHub Copilot Chat to manage storage operations through natural language. A user can prompt an agent in VS Code to create a new volume, clone an existing data set, take a snapshot for traceability, or provision a Jupyter Lab workspace. MCP, a protocol created by Anthropic, establishes a common framework for AI agents to discover and invoke defined tools. The capability extends beyond chat interactions and paves the way for autonomous agent workflows. For instance, an agent could respond to an internal request, such as a Salesforce ticket, by identifying the need for a new volume, invoking the Data Ops Toolkit via its MCP server to create it, and then fulfilling the ticket without human intervention. This integration turns routine storage administration into an automated, event-driven process.

"It's a common framework for defining tools and exposing them to AI agents. So there's a bunch of client applications out there, agentic AI tools that support MCP. So I mentioned GitHub Copilot Chat, Copilot Studio. There's agent SDK frameworks like Nvidia Agent IQ, LangGraph, and others that support MCP."

▶ Watch this segment — 20:48


NetApp Enables KV Cache Offloading to ONTAP Systems Using Standard NFS

NetApp now supports Key-Value (KV) cache offloading to ONTAP storage systems using standard industry protocols, a significant development for optimizing Large Language Model (LLM) inference. The capability is delivered through LMCache, a plugin for the vLLM inference framework that can offload KV cache data to CPU memory or to GPUDirect Storage (GDS) backends. GDS, an NVIDIA framework, lets GPUs communicate directly with external storage via RDMA, bypassing the CPU to reduce latency. Crucially, NetApp's implementation of GDS over NFS requires no proprietary client: the standard Linux NFS client works with the appropriate mount options and ONTAP settings. The KV cache stores attention values so they need not be recomputed during inference; offloading it to external ONTAP storage frees up scarce GPU memory, enabling more efficient LLM deployment and mitigating the challenges of GPU scarcity.

"What's interesting for us here at NetApp is it supports any GPU direct storage file system that's mounted on the client. And at NetApp, we support GPU direct storage over NFS without any proprietary client... You can use Linux, you can use the standard NFS client."

▶ Watch this segment — 32:37
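As a rough illustration of the "standard NFS client" point, a GDS-capable mount on Linux looks something like the following. The hostname, export path, and option values are placeholders, and the exact options (RDMA transport, port, transfer sizes) vary by ONTAP release and network setup; consult NetApp's and NVIDIA's GDS documentation rather than copying this verbatim.

```shell
# Illustrative only: mount an ONTAP NFS export for GPUDirect Storage
# using the stock Linux NFS client. All names and values are placeholders.
sudo mount -t nfs \
    -o vers=3,proto=rdma,port=20049,rsize=1048576,wsize=1048576 \
    ontap-svm.example.com:/kv_cache /mnt/kv_cache
```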


Agentic AI Empowers LLMs to Access Private Structured Data Through Specialized Tools

Agentic AI gives Large Language Models (LLMs) a robust way to access and leverage private structured data, such as information stored in databases, data lakehouses, or enterprise systems like Salesforce and SAP. The approach wraps an LLM inside an AI agent equipped with a suite of specialized tools: text-to-SQL converters that generate database queries from natural language prompts, SQL query executors, or API callers that interact directly with structured data sources. The agent receives a user prompt, employs a relevant tool to query the designated data source, and retrieves the results; those results are then fed back into the LLM alongside the original prompt, enabling the model to generate more informed, contextually rich responses. This agentic tool-use model represents a significant evolution in augmenting LLMs, moving beyond purely unstructured text augmentation (like Retrieval Augmented Generation, RAG) to integrate diverse, structured enterprise data.

"This new model where you have AI agents which are essentially LLMs with a wrapper that have a set of tools available to them... So there's this third agentic tool use model that's becoming the standard for augmenting LLMs with private structured data."

▶ Watch this segment — 8:29
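The prompt → tool → query → augmented-response loop can be sketched with an in-memory SQLite table standing in for the enterprise data source. Everything here — the table, the canned text-to-SQL rule, the function names — is hypothetical; a real agent would call an LLM-backed text-to-SQL tool or an API caller instead of the stub below.

```python
import sqlite3

# Illustrative agent loop: a stub "text-to-SQL" tool maps a prompt to a
# query, a query-executor tool runs it, and the rows would then be fed
# back into the LLM alongside the original prompt.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tickets (id INTEGER, status TEXT)")
db.executemany("INSERT INTO tickets VALUES (?, ?)",
               [(1, "open"), (2, "closed"), (3, "open")])

def text_to_sql(prompt: str) -> str:
    """Stand-in for an LLM-backed text-to-SQL converter."""
    if "open tickets" in prompt:
        return "SELECT COUNT(*) FROM tickets WHERE status = 'open'"
    raise ValueError("no tool matches this prompt")

def run_query(sql: str):
    """SQL-executor tool: query the structured data source."""
    return db.execute(sql).fetchall()

def answer(prompt: str) -> str:
    rows = run_query(text_to_sql(prompt))
    # In a real agent, prompt + rows go back to the LLM for the final answer.
    return f"{prompt} -> {rows[0][0]}"
```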


Enterprises Adopt Retrieval Augmented Generation (RAG) to Integrate Private Data with LLMs

Enterprises are widely adopting Retrieval Augmented Generation (RAG) as a practical method to augment Large Language Models (LLMs) with their private, unstructured data, such as internal codebases, wiki pages, or knowledge base entries. Unlike fine-tuning, which demands significant GPU resources and specialized AI research skills, RAG offers a more approachable solution. The process involves vectorizing a corpus of private text using an embedding model and storing these numerical representations in a vector database. When an LLM receives a user prompt, RAG vectorizes that prompt, conducts a similarity search within the vector database, and retrieves the most relevant private data snippets. These retrieved snippets then augment the original prompt before it is processed by the LLM, ensuring responses are informed by internal company knowledge. While implementing RAG requires some DevOps and platform engineering expertise to set up and maintain data pipelines for vector embeddings, it is significantly less complex than continuous fine-tuning and can be deployed using standard tools like vLLM or Nvidia NIMs.

"What most enterprises, private organizations are doing is implementing RAG or Retrieval Augmented Generation... Basically you take a corpus of source data... and you vectorize it... and then anytime a prompt comes in to the LLM, you can take that prompt, vectorize it, perform a similarity search on the vector database, return any similar vectors and essentially append them to the end of your prompt."

▶ Watch this segment — 4:45
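The vectorize → similarity-search → augment pipeline can be shown end to end with a toy bag-of-words "embedding" and a linear scan standing in for a real embedding model and vector database. The corpus snippets are invented for the example.

```python
import math
from collections import Counter

# Toy RAG sketch: word counts stand in for embedding vectors, and a
# linear scan over an in-memory list stands in for a vector database.

CORPUS = [
    "Trident is the CSI driver for Kubernetes volumes",
    "FlexClone creates near-instant copies of datasets",
    "Snapshots provide point-in-time traceability",
]

def embed(text: str) -> Counter:
    """Stand-in for an embedding model: bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

INDEX = [(embed(doc), doc) for doc in CORPUS]  # the "vector database"

def retrieve(prompt: str) -> str:
    """Similarity search: return the most relevant private snippet."""
    q = embed(prompt)
    return max(INDEX, key=lambda pair: cosine(q, pair[0]))[1]

def augment(prompt: str) -> str:
    # Append the retrieved snippet before sending the prompt to the LLM.
    return f"{prompt}\n\nContext: {retrieve(prompt)}"
```

A production pipeline swaps in a real embedding model and a vector database, but the control flow — embed the prompt, search, append the hits — is exactly what the segment describes.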


NetApp Data Ops Toolkit Simplifies ONTAP Storage Management and Jupyter Lab Provisioning

The NetApp Data Ops Toolkit is an open-source Python module designed to simplify storage management and application provisioning for data scientists and platform engineers. It is installable via pip from PyPI in two packages: netapp-dataops-traditional for general environments and netapp-dataops-k8s for Kubernetes. The toolkit provides straightforward Python functions and a command-line interface (CLI) for key ONTAP capabilities, such as creating snapshots, provisioning new volumes, and near-instantaneously cloning massive datasets with NetApp's FlexClone technology, without requiring extensive storage administration expertise. In Kubernetes environments the toolkit integrates with NetApp Trident, NetApp's CSI driver, to create Kubernetes-native persistent volumes and persistent volume claims. A notable feature of the Kubernetes package is provisioning Jupyter Lab workspaces backed by NetApp volumes: a user can request, say, a 500-terabyte workspace and receive a URL to access it within seconds. This empowers data professionals to easily provision and manage their own interactive data science environments, with all data stored and managed on NetApp infrastructure.

"Basically what you get when you install it is a library with a set of simple functions that you could import into a Jupyter Notebook... These are just super simple Python functions that make it easy to consume some of our key capabilities in ONTAP."

▶ Watch this segment — 16:14
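To give a flavor of the CLI, invocations along these lines create a volume, snapshot it, clone it, and provision a Jupyter Lab workspace. Command and flag names are a best-effort reconstruction from the toolkit's documented interface, not taken from the video; verify them against the project's README on GitHub (NetApp/netapp-dataops-toolkit) before use.

```shell
# Hedged examples; check exact command and flag names in the README.

# Traditional (non-Kubernetes) package:
netapp_dataops_cli.py create volume --name=project1 --size=10TB
netapp_dataops_cli.py create snapshot --volume=project1
netapp_dataops_cli.py clone volume --name=project1-clone --source-volume=project1

# Kubernetes package: a Jupyter Lab workspace backed by a NetApp volume
netapp_dataops_k8s_cli.py create jupyterlab --workspace-name=demo --size=500Ti
```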



Summarised from NetApp · 47:35. All credit belongs to the original creators. Streamed.News summarises publicly available video content.

Streamed.News

This publication is generated automatically from YouTube.
