Building Intelligent Automation with Spring AI and ByteChef

TL;DR: ByteChef's AI Agent is a visual, no-code building block for agentic workflows, powered by Spring AI under the hood. It's composed of five cluster elements: a Model, RAG, Memory, Tools, and Guardrails. Together, they give you everything needed to build production-grade AI agents without writing your own AI infrastructure.
AI tooling is now available in nearly every programming language. With the right tools, every developer can embed intelligent behavior directly into their workflows and automations. ByteChef's AI Agent component makes this possible by integrating deeply with Spring AI, the leading Java framework for building AI-powered applications, and exposing it through a visual, no-code/low-code interface.
In this post, we'll walk through how ByteChef's AI Agent is structured, what each of its cluster elements does, and how Spring AI powers it all under the hood.
What Is the AI Agent Component?
The AI Agent is ByteChef's core building block for agentic workflows. Rather than making a single call to a language model, an AI Agent can reason, retrieve context, remember past interactions, call external tools, and even delegate to other agents, which lets it handle complex, multi-step tasks.

The AI Agent is composed of a set of cluster elements: configurable sub-components that each handle a specific aspect of agentic behavior. These are:
- Model - the language model powering the agent
- RAG - retrieval-augmented generation for grounding responses in your data
- Memory - persistence of conversation history across turns
- Tools - actions the agent can perform in external systems
- Guardrails - filters for safe and appropriate responses
Let's explore each one.
Model
The model is the brain of the AI Agent. It determines which LLM receives prompts, reasons over them, and generates responses. Spring AI provides a unified ChatModel abstraction that normalizes communication across many different LLM providers, so ByteChef can support a wide range of models without changing the underlying agent logic.
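To make the idea of a provider-neutral abstraction concrete, here is a minimal sketch in plain Java. The `SimpleChatModel` interface, the `ModelRegistrySketch` class, and the toy providers are hypothetical names invented for illustration; they are not Spring AI's ChatModel API or ByteChef's implementation. The point is only that the agent logic depends on one interface, so switching providers becomes a configuration choice.

```java
import java.util.Map;

// Hypothetical stand-in for a provider-neutral chat interface; NOT Spring AI's ChatModel.
interface SimpleChatModel {
    String call(String prompt);
}

final class ModelRegistrySketch {
    // Toy "providers" that only tag the prompt; real ones would call a remote API.
    static final Map<String, SimpleChatModel> PROVIDERS = Map.of(
        "openai", prompt -> "[openai] " + prompt,
        "ollama", prompt -> "[ollama] " + prompt);

    // The agent logic depends only on the interface, never on a concrete provider.
    static String runAgent(String providerName, String userInput) {
        SimpleChatModel model = PROVIDERS.get(providerName);
        if (model == null) {
            throw new IllegalArgumentException("Unknown provider: " + providerName);
        }
        return model.call("Answer briefly: " + userInput);
    }
}
```

Calling `ModelRegistrySketch.runAgent("ollama", "hi")` runs the same agent code path as the "openai" variant; only the looked-up provider differs.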
ByteChef currently supports the following models, all integrated through Spring AI:
- Amazon Bedrock Converse - access to AWS-hosted models including Anthropic Claude, Meta Llama, and more via a unified AWS API
- Anthropic - Claude models, known for their strong instruction-following and reasoning
- Azure OpenAI - OpenAI models deployed on Microsoft Azure infrastructure
- DeepSeek - high-performance models with strong coding and reasoning capabilities
- Vertex Gemini - Google's Gemini models via Google Cloud's Vertex AI platform
- Groq - ultra-fast inference for open-source models
- Mistral AI - efficient, open-weight European models
- NVIDIA - models served via NVIDIA's NIM inference platform
- Ollama - run open-source models locally with no cloud dependency
- Perplexity - models with built-in web search and citation capabilities
- OpenAI - OpenAI models served directly through OpenAI's own API
In addition to these Spring AI-backed providers, ByteChef also supports OpenRouter, a gateway that aggregates hundreds of models from dozens of providers under a single API. This means that even if your preferred model isn't in the list above, there's a very good chance you can still connect to it through OpenRouter. This makes ByteChef's AI Agent one of the most model-versatile automation platforms available.
RAG (Retrieval-Augmented Generation)
Language models are powerful, but they only know what they were trained on. If you want your agent to answer questions about your internal documents, product catalog, support tickets, or any proprietary data, you need Retrieval-Augmented Generation (RAG).
RAG works by searching a knowledge source for documents relevant to the user's query, then injecting that context into the prompt before the model generates a response. Spring AI provides a rich, modular RAG architecture that ByteChef exposes directly in the AI Agent.
Vector Store Providers
To perform semantic search, documents are embedded into high-dimensional vectors and stored in a vector database. ByteChef supports the following vector stores:
- Couchbase
- MariaDB
- Milvus
- Neo4j
- Oracle
- PostgreSQL (pgvector)
- Pinecone
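The semantic search these stores perform can be illustrated with a small sketch. It assumes cosine similarity as the distance measure and uses tiny hand-written two-dimensional vectors in place of real embeddings; the class name `VectorSearchSketch` is hypothetical, and production vector databases use approximate indexes rather than this linear scan.

```java
import java.util.Comparator;
import java.util.Map;

final class VectorSearchSketch {
    // Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Linear scan: returns the key of the stored vector most similar to the query.
    static String nearest(Map<String, double[]> store, double[] query) {
        return store.entrySet().stream()
            .max(Comparator.comparingDouble(e -> cosine(e.getValue(), query)))
            .map(Map.Entry::getKey)
            .orElseThrow();
    }
}
```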
Knowledge Base
Don't want to set up your own vector database? ByteChef also offers a built-in Knowledge Base — an internal, managed knowledge store where you can upload documents (PDFs, text files, and more) directly. ByteChef handles the chunking, embedding, and storage automatically, so you can start building RAG-powered agents without configuring any external infrastructure.
RAG Strategies
Spring AI supports two RAG approaches:
QuestionAnswerAdvisor is Spring AI's out-of-the-box RAG implementation. When a query comes in, it performs a similarity search against the vector store, retrieves the most relevant documents, and appends them to the prompt as context before the model responds. It supports configurable similarity thresholds, top-K result limits, and dynamic filter expressions so you can scope searches to specific subsets of your data.
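The threshold, top-K, and prompt-augmentation steps described above can be sketched as follows. This is an illustration of the general flow under stated assumptions, not QuestionAnswerAdvisor's actual code: the `QaAdvisorSketch` class, the `Scored` record, and the prompt layout are hypothetical, and the similarity scores are assumed to come from the vector store.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class QaAdvisorSketch {
    // A retrieved document together with the similarity score assigned by the store.
    record Scored(String text, double score) {}

    // Keep documents at or above the threshold, best first, at most topK of them.
    static List<String> select(List<Scored> hits, double threshold, int topK) {
        return hits.stream()
            .filter(h -> h.score() >= threshold)
            .sorted(Comparator.comparingDouble(Scored::score).reversed())
            .limit(topK)
            .map(Scored::text)
            .collect(Collectors.toList());
    }

    // Append the selected context to the user's question before calling the model.
    static String augment(String question, List<String> context) {
        return question + "\n\nContext:\n" + String.join("\n", context);
    }
}
```

Raising the threshold trades recall for precision, while top-K caps how much of the model's context window the retrieved documents can consume.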
Modular RAG is based on Spring AI's RetrievalAugmentationAdvisor and inspired by the research paper "Modular RAG: Transforming RAG Systems into LEGO-like Reconfigurable Frameworks." Instead of a fixed pipeline, it lets you assemble a RAG flow from individual building blocks, each responsible for one well-defined step. ByteChef exposes the following modules:
- Query Transformers - applied before retrieval to reshape the user's query into something that retrieves better results:
  - Compression - condenses a long conversation history and a follow-up question into a single standalone query, so the retriever receives focused input rather than a wall of chat context.
  - Rewrite - rewrites verbose, ambiguous, or poorly structured queries into a cleaner form that maps more accurately to the content in your knowledge source.
  - Translation - translates the query into the language of your documents, enabling cross-lingual retrieval without requiring your data to be multilingual.
- Multi Query Expander - uses a language model to expand the original query into multiple semantically diverse variations, each capturing a different angle or phrasing of the user's intent. Documents are retrieved for all variations in parallel, increasing the chances of surfacing relevant results that a single query might miss. Any model available in ByteChef can be used to power the expansion.
- Document Retriever - the step where documents are actually fetched from a vector store using semantic similarity search. You can select any of the vector stores ByteChef supports (Couchbase, MariaDB, Milvus, Neo4j, Oracle, PostgreSQL, or Pinecone), or point it at the built-in Knowledge Base.
- Document Joiner - when multiple queries or multiple data sources are involved, this module merges all retrieved document sets into a single, deduplicated collection. Duplicate documents are resolved by keeping the first occurrence; relevance scores are preserved as-is from the retriever.
- Contextual Query Augmenter - enriches the user's query with contextual information extracted from the retrieved documents before it is sent to the model. This helps the model produce more grounded, contextually aware responses.
Together, these modules let you design a RAG pipeline tailored to your data and use case, from a simple single-retriever setup to a multi-source, multi-query flow with query rewriting and context augmentation, without writing any retrieval infrastructure yourself.
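The Document Joiner's rule (merge every result set, keep the first occurrence of a duplicate, leave scores untouched) can be sketched in a few lines. The `DocumentJoinerSketch` class and `Doc` record are hypothetical names for illustration, and identifying duplicates by an `id` field is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DocumentJoinerSketch {
    record Doc(String id, String text, double score) {}

    // Concatenate result sets in order; on a duplicate id, keep the first
    // occurrence together with the score its retriever assigned.
    static List<Doc> join(List<List<Doc>> resultSets) {
        Map<String, Doc> byId = new LinkedHashMap<>(); // preserves insertion order
        for (List<Doc> set : resultSets) {
            for (Doc doc : set) {
                byId.putIfAbsent(doc.id(), doc);
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```

If two query variations both return the same document with scores 0.5 and 0.8, the joined list contains it once, with whichever score arrived first.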
Memory
A single question-and-answer interaction is useful, but many real-world use cases require the agent to maintain context across a conversation: remembering what was said earlier, tracking user preferences, or picking up where a previous session left off. That is what Memory provides.
Spring AI's ChatMemory abstraction handles storing and retrieving conversation history. ByteChef exposes multiple memory backend options:
External memory providers - for durable, production-grade memory that persists across sessions and scales with your application:
- Cassandra
- Cosmos DB
- MongoDB
- Neo4j
- Redis
- MySQL
- Oracle
- PostgreSQL
Additionally, all vector stores supported for RAG (Couchbase, MariaDB, Milvus, Neo4j, Oracle, PostgreSQL, Pinecone) can also serve as memory backends, enabling semantic retrieval of past conversation turns rather than just chronological lookups.
InMemory Chat Memory is a lightweight option that stores conversation history in a simple HashMap in application memory. It requires no external setup and works great for development, testing, or short-lived sessions — but the history is wiped when the chat session ends.
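A map-backed memory of this kind might look like the sketch below. It is a simplified illustration, not Spring AI's or ByteChef's code: the `InMemoryChatMemorySketch` class is hypothetical, messages are plain strings, and the sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InMemoryChatMemorySketch {
    // Conversation id -> ordered messages. Everything lives on the heap,
    // so the history disappears along with this object.
    private final Map<String, List<String>> history = new HashMap<>();

    void add(String conversationId, String message) {
        history.computeIfAbsent(conversationId, id -> new ArrayList<>()).add(message);
    }

    // Returns an immutable snapshot so callers cannot alter stored history.
    List<String> get(String conversationId) {
        return List.copyOf(history.getOrDefault(conversationId, List.of()));
    }

    void clear(String conversationId) {
        history.remove(conversationId);
    }
}
```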
Chat Memory (ByteChef's internal store) is the managed alternative to external providers. Like the Knowledge Base for RAG, it lets you persist conversation history without configuring a separate database. ByteChef handles the storage backend for you.
Tools
One of the defining features of an AI agent is its ability to act. Tools let the AI Agent go beyond generating text and actually interact with external systems: querying databases, sending emails, creating records, calling APIs, and more.
ByteChef's tool support is one of its most powerful differentiators, and it comes in several forms:
Component Actions - ByteChef integrates with over 200 applications and services through its component library (think Slack, GitHub, Salesforce, Google Sheets, HubSpot, and many more). Any action within any component can be exposed to the AI Agent as a tool. When configuring a tool, you choose which properties the AI should determine dynamically based on context, and which ones are fixed constants — so you stay in full control of what the agent can and cannot change.
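The split between AI-determined properties and fixed constants can be pictured as a merge step before each tool call. The sketch below is one hypothetical way to express it; `ToolBindingSketch`, its `bind` method, and the property names used in the example are assumptions for illustration, not ByteChef's internal API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class ToolBindingSketch {
    // Build the arguments for a tool call. The model may set only the properties
    // marked dynamic; every other value comes from constants fixed at configuration time.
    static Map<String, Object> bind(Map<String, Object> fixed,
                                    Set<String> dynamicProps,
                                    Map<String, Object> fromModel) {
        Map<String, Object> args = new HashMap<>(fixed);
        for (Map.Entry<String, Object> entry : fromModel.entrySet()) {
            if (dynamicProps.contains(entry.getKey())) {
                args.put(entry.getKey(), entry.getValue());
            }
            // Anything else the model proposes is silently ignored.
        }
        return args;
    }
}
```

For example, with a fixed `channel` of `#support` and a dynamic `text` property, a model that tries to set `channel` to `#general` still posts to `#support`; only its `text` value is used.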
MCP Tool - ByteChef supports the Model Context Protocol (MCP), an emerging open standard for exposing tools to AI models. The MCP Tool cluster element lets the agent connect to any compatible MCP server and use its tools, opening up the ecosystem beyond ByteChef's built-in integrations.
Skills Tool - ByteChef supports the concept of Skills: reusable, importable automation scripts or workflows that can be shared and composed. The Skills Tool cluster element lets the agent invoke any Skill that has been imported into your ByteChef workspace, whether written by your team or sourced from the community.
AI Agent as a Tool - Perhaps the most powerful option: you can configure another AI Agent — complete with its own model, RAG, memory, and tools — as a tool for the current agent. This is the foundation for building agentic patterns such as orchestrator/subagent hierarchies, where a supervisor agent delegates specific tasks to specialized sub-agents. This recursive composability makes ByteChef's AI Agent a genuine platform for multi-agent systems.
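The recursive composition comes down to one structural idea: an agent exposes the same interface as any other tool. Here is a minimal sketch of that idea, with hypothetical names (`AgentAsToolSketch`, `Tool`, `Agent`) and a deliberately naive routing rule: a "tool:payload" prefix stands in for the model deciding which tool to call.

```java
import java.util.Map;

final class AgentAsToolSketch {
    // Anything an agent can call: a plain action or another agent.
    interface Tool {
        String invoke(String input);
    }

    // An agent is itself a Tool, which is what makes the composition recursive.
    record Agent(String name, Map<String, Tool> tools) implements Tool {
        @Override
        public String invoke(String input) {
            // Stand-in for model-driven tool selection: route on a "tool:payload" prefix.
            int sep = input.indexOf(':');
            if (sep < 0) {
                return name + " answered: " + input;
            }
            Tool tool = tools.get(input.substring(0, sep));
            if (tool == null) {
                return name + " has no tool for: " + input;
            }
            return tool.invoke(input.substring(sep + 1));
        }
    }
}
```

A supervisor built as `new Agent("supervisor", Map.of("research", researcher))` hands "research:find docs" to the `researcher` agent, which could in turn hold its own tools and sub-agents.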
Guardrails
Guardrails are ByteChef's own layer of control on top of the Spring AI-powered capabilities. While the other cluster elements are about making the agent smarter and more capable, Guardrails are about keeping it appropriate and safe.
Guardrails can be configured to inspect both incoming requests and outgoing responses. Common use cases include:
- Content filtering - blocking or censoring sensitive, inappropriate, or offensive words and phrases
- Topic restrictions - preventing the agent from discussing subjects outside its intended scope
- Compliance controls - ensuring responses don't contain regulated or legally sensitive information
Unlike the other cluster elements, Guardrails are a ByteChef-native feature, not part of Spring AI. They sit as a wrapper around the agent interaction, giving you a transparent enforcement layer regardless of which model, RAG strategy, or memory backend you've chosen.
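The wrapper idea can be sketched as a function that decorates any agent call and checks both directions. This is an illustrative simplification covering only word-level content filtering; `GuardrailSketch`, its method names, and the `***` replacement are assumptions, not ByteChef's Guardrails implementation.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

final class GuardrailSketch {
    // Replace each blocked word (whole word, case-insensitive) with asterisks.
    static String censor(String text, List<String> blocked) {
        String result = text;
        for (String word : blocked) {
            Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b", Pattern.CASE_INSENSITIVE);
            result = pattern.matcher(result).replaceAll("***");
        }
        return result;
    }

    // Wrap an agent so the incoming request and the outgoing response are both filtered,
    // independent of which model, RAG strategy, or memory backend the agent uses.
    static UnaryOperator<String> guard(UnaryOperator<String> agent, List<String> blocked) {
        return request -> censor(agent.apply(censor(request, blocked)), blocked);
    }
}
```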
Putting It All Together
The power of ByteChef's AI Agent comes from how these cluster elements combine. A production-grade agent might use:

- OpenAI GPT-4o as the model for strong reasoning
- Modular RAG with a Pinecone vector store to ground answers in internal documentation
- PostgreSQL memory to remember past conversations per user
- Component actions to create CRM records, send notifications, or update spreadsheets
- Guardrails to ensure every response is appropriate for the audience
And because ByteChef is built on Spring AI, you benefit from a well-maintained, actively developed foundation that keeps pace with the rapidly evolving AI ecosystem: new models, new vector stores, and new capabilities are integrated continuously.
Whether you're building a customer support agent, an internal knowledge assistant, a data processing pipeline, or a complex multi-agent system, ByteChef's AI Agent gives you the building blocks to do it without needing to write your own AI infrastructure from scratch.
Ready to try building with the AI Agent in ByteChef for yourself?