Learn how to implement persistent chat memory with Spring AI and Apache Cassandra. This guide walks through code examples for building robust, scalable conversational applications.
1. Introduction
Imagine building an AI assistant that remembers your previous conversations. Rather than starting from scratch each time, it recalls earlier discussions, preferences, and context. This ability to maintain context over time is crucial for creating truly useful conversational applications.
In earlier posts, we laid the groundwork for chat memory in Spring AI:
- Spring AI Chat Memory Basics
  We explored what chat memory is, why it’s essential for building coherent, context‑aware bots, and how to implement it in memory. (📖 Read the guide)
- Spring AI JDBC Chat Memory
  We then advanced to persistent stores using relational databases (PostgreSQL and MariaDB), leveraging Spring AI’s JDBC chat memory module for durability across restarts. (📖 Read the guide)
In this blog, we’re taking the next step with Apache Cassandra – a highly scalable NoSQL database perfectly suited for chat applications that need to handle massive amounts of conversation data.
2. Implementing Spring AI Chat Memory with Cassandra
Let’s build two chat applications demonstrating Spring AI Cassandra chat memory implementation. We’ll create:
- Single-User Chat Application: A simple chatbot where all users share the same conversation history. All conversations are stored against a default conversation ID.
- Multi-User Chat Application: An advanced chatbot that maintains separate conversation histories for different users. Each user will have their own conversation history stored separately.
This allows our application to scale from personal assistants to enterprise chat applications serving thousands of users simultaneously. Cassandra’s distributed architecture makes it an excellent choice for such scenarios.
2.1. Setting Up Project
Let’s start by setting up our project with the required dependencies.
Step 1: Add Required Dependencies
Add the required dependencies in pom.xml
<dependencies>
    <!-- Spring Boot Web for building RESTful web services -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- OpenAI Model Support – configurable for various AI providers (e.g. OpenAI, Google Gemini) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
    </dependency>
    <!-- Cassandra-backed Chat Memory Support -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-model-chat-memory-cassandra</artifactId>
    </dependency>
    <!-- Spring Data Cassandra for interacting with a Cassandra database -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-cassandra</artifactId>
    </dependency>
</dependencies>

<dependencyManagement>
    <dependencies>
        <!-- Spring AI bill of materials to align all spring-ai versions -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
In this configuration:
- spring-boot-starter-web: Enables us to build a web application with REST endpoints.
- spring-ai-starter-model-openai: Provides integration with OpenAI’s API (though we’ll configure it for Google Gemini).
- spring-ai-model-chat-memory-cassandra: Offers a Cassandra-backed memory store to save chat history. This lets your application “remember” past messages by keeping context in a Cassandra database.
- spring-boot-starter-data-cassandra: Adds Spring Data support for Cassandra, making it easy to read and write data in Cassandra tables using repositories and entity classes.
- spring-ai-bom: The dependencyManagement section imports Spring AI’s Bill of Materials (BOM). With the BOM in place, you don’t need to specify a version for each Spring AI artifact, which keeps the Spring AI components compatible and prevents version conflicts.
Step 2: Configure Application Properties
Now, let’s configure our application to connect to Cassandra using application.yml:
spring:
  application:
    name: spring-boot-ai-chat-memory-cassandra
  cassandra:
    # The data center name for your Cassandra cluster (used for load balancing and topology)
    local-datacenter: datacenter1
  # Note: Spring Boot 3.x binds Cassandra connection settings under spring.cassandra.*;
  # spring.data.cassandra.* is the Spring Boot 2.x prefix, so verify these keys against your Boot version.
  data:
    cassandra:
      # Host or IP address of your Cassandra cluster
      contact-points: 127.0.0.1
      # Port on which Cassandra is listening
      port: 9042
      # Username for authenticating with Cassandra
      username: ${DB_USERNAME}
      # Password for authenticating with Cassandra
      password: ${DB_PASSWORD}
  # AI configurations
  ai:
    openai:
      api-key: ${GEMINI_API_KEY}
      base-url: https://generativelanguage.googleapis.com/v1beta/openai
      chat:
        completions-path: /chat/completions
        options:
          model: gemini-2.0-flash-exp
📄 Configuration Overview
This configuration focuses on two main areas—Cassandra for chat memory storage and AI integration with Google’s Gemini model via the Spring AI OpenAI starter:
👉 Cassandra Settings
- local‑datacenter: Specifies which data center your application should prefer when reading or writing. Keeping traffic within the same DC reduces latency.
- contact‑points: The IP address or hostname of your Cassandra cluster nodes.
- port: The TCP port Cassandra listens on (default 9042).
- username & password: Credentials the Spring Data Cassandra driver uses to authenticate with your cluster.
👉 AI (OpenAI Starter) Settings
- api‑key: Your secret key for authenticating with the AI service. Keep this safe and out of source control.
- base‑url: Overrides the default OpenAI endpoint so requests go to Google’s Gemini API instead.
- completions‑path: The REST path for chat-based completions—appended to the base URL when making requests.
- model: Chooses which AI model to call (e.g. gemini-2.0-flash-exp). This determines the capabilities and response style you’ll get back.
Together, these settings ensure your app can store conversation history in Cassandra and send chat prompts to Gemini, all through Spring’s familiar configuration style.
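To make the relationship between base-url and completions-path concrete, here is a minimal sketch of how the two values combine into the final request URL. The class and method names are hypothetical and this is not Spring AI's internal code; it only illustrates the "path appended to base URL" behaviour described above.

```java
import java.net.URI;

// Hypothetical helper for illustration only; Spring AI performs this join internally.
class GeminiEndpointSketch {

    // Joins a base URL and a path without doubling or dropping the slash between them.
    static URI resolveEndpoint(String baseUrl, String completionsPath) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String path = completionsPath.startsWith("/")
                ? completionsPath
                : "/" + completionsPath;
        return URI.create(base + path);
    }

    public static void main(String[] args) {
        // Values taken from the application.yml shown above.
        URI endpoint = resolveEndpoint(
                "https://generativelanguage.googleapis.com/v1beta/openai",
                "/chat/completions");
        System.out.println(endpoint);
        // prints https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
    }
}
```

If requests fail with a 404, printing the effective URL like this is a quick way to check for a missing or duplicated path segment.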
🤖 Google Gemini APIs suit proof-of-concept (POC) projects because they offer limited usage without requiring payment. For more details, check out our blog post on how Google Gemini works with the OpenAI API format and how to configure it in a Spring AI application.
Step 3: Configuring the Chat Client with Cassandra Chat Memory
We need to create a configuration class to set up our ChatClient with memory capabilities.
import com.datastax.oss.driver.api.core.CqlSession;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.cassandra.CassandraChatMemory;
import org.springframework.ai.chat.memory.cassandra.CassandraChatMemoryConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatClientConfig {

    @Bean
    public ChatClient chatClient(ChatClient.Builder chatClientBuilder, ChatMemory chatMemory) {
        return chatClientBuilder.defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory)).build();
    }

    @Bean
    public ChatMemory cassandraChatMemory(CqlSession cqlSession) {
        // Creates a CassandraChatMemory instance using a configuration builder.
        // Needs the active CqlSession to interact with the database.
        return CassandraChatMemory.create(CassandraChatMemoryConfig.builder()
                // Specify the keyspace where the chat memory table should reside or be created
                .withKeyspaceName("my_local_app_keyspace")
                // Provide the active CqlSession used for interacting with the Cassandra database
                .withCqlSession(cqlSession)
                .build());
    }
}
Let’s break down exactly what this configuration is doing:
- Defining two Spring beans: cassandraChatMemory and chatClient
  - The cassandraChatMemory bean uses CassandraChatMemory.create() under the hood. It:
    - Takes an active cqlSession (your live Cassandra connection)
    - Specifies the keyspace where the chat table lives (and will be auto-created if missing)
  - This bean gives you a durable, Cassandra-backed store for every chat message and AI response, one that survives restarts and scales across nodes.
- Building the chatClient bean with memory advisor
  - We inject the ChatClient.Builder and the cassandraChatMemory into chatClient(...)
  - We attach a MessageChatMemoryAdvisor to the builder, which:
    - Automatically intercepts each outgoing user prompt and incoming AI reply
    - Persists them into the provided CassandraChatMemory instance
    - Retrieves past conversation history on every new request so that the AI can see previous context
  - The final ChatClient you get is “memory-aware” out of the box: you don’t have to write any code to save or load messages, because everything is handled by the advisor and your Cassandra store.
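The load, call, and persist cycle that the advisor performs can be sketched in plain Java. This is a simplified illustration, not Spring AI's implementation: the class name, the in-memory map standing in for Cassandra, and the Function standing in for the LLM are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified stand-in for what a chat memory advisor does.
class MemoryAdvisorSketch {

    // Stand-in for the Cassandra-backed store: conversation ID -> ordered messages.
    private final Map<String, List<String>> store = new HashMap<>();

    // Stand-in for the LLM call: receives the full prompt (history + new message).
    private final Function<List<String>, String> model;

    MemoryAdvisorSketch(Function<List<String>, String> model) {
        this.model = model;
    }

    String chat(String conversationId, String userMessage) {
        List<String> history = store.computeIfAbsent(conversationId, id -> new ArrayList<>());

        // 1. Build the prompt from stored history plus the new user message.
        List<String> prompt = new ArrayList<>(history);
        prompt.add("user: " + userMessage);

        // 2. Call the model with the full context.
        String reply = model.apply(prompt);

        // 3. Persist both sides of the exchange for the next request.
        history.add("user: " + userMessage);
        history.add("assistant: " + reply);
        return reply;
    }

    List<String> history(String conversationId) {
        return List.copyOf(store.getOrDefault(conversationId, List.of()));
    }

    public static void main(String[] args) {
        // Fake model that reports how many messages it was shown.
        MemoryAdvisorSketch advisor =
                new MemoryAdvisorSketch(prompt -> "I can see " + prompt.size() + " message(s)");
        System.out.println(advisor.chat("default", "Hello"));        // prints I can see 1 message(s)
        System.out.println(advisor.chat("default", "Still there?")); // prints I can see 3 message(s)
    }
}
```

The second call sees three messages because the first user message and its reply were stored and replayed, which is the behaviour the real advisor provides against Cassandra.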
2.2. Single-User Chat Application
Our first application is a simple chatbot where all users share the same conversation context. This is useful for:
- Educational resources where context should be preserved
- Public information kiosks
- FAQ chatbots on websites
- Shared assistant systems in team environments
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.web.bind.annotation.DeleteMapping;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

@RestController
@RequestMapping("/api/chat")
public class SingleUserChatMemoryController {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;

    public SingleUserChatMemoryController(ChatClient chatClient, ChatMemory chatMemory) {
        this.chatClient = chatClient;
        this.chatMemory = chatMemory;
    }

    // Endpoint to send messages and get responses
    @PostMapping
    public String chat(@RequestBody String request) {
        return chatClient.prompt()
                .user(request)
                .call()
                .content();
    }

    // Endpoint to view conversation history
    @GetMapping("/history")
    public List<Message> getHistory() {
        return chatMemory.get("default", 100);
    }

    // Endpoint to clear conversation history
    @DeleteMapping("/history")
    public String clearHistory() {
        chatMemory.clear("default");
        return "Conversation history cleared";
    }
}
This controller exposes three REST endpoints that allow users to interact with an AI model using a conversational interface. It uses Spring AI’s ChatClient to send user messages and get AI responses, and ChatMemory to persist conversation history using Cassandra.
Endpoints Summary:
- POST /api/chat – To send a message and get a response
- GET /api/chat/history – To view the conversation history
- DELETE /api/chat/history – To clear the conversation history
Notice the use of the “default” conversation ID when reading and clearing memory. The memory advisor stores messages under this ID whenever a request does not specify a conversation ID, so the history endpoints must use the same value.
🧩 Chat Memory in Action: Key Endpoints Explained
a. Send Message – POST /api/chat
How It Works: When a user sends a message via the /api/chat endpoint:
- The chatClient.prompt().user(request).call() method is triggered.
- It takes the user’s input message from the request.
- The MessageChatMemoryAdvisor (configured behind the scenes in the ChatClient) automatically retrieves any previous messages stored in memory for the “default” conversation.
- It sends both the new user message and the previous context to the configured AI model (e.g., Google Gemini).
- The AI processes the full conversation context and returns a response.
- Both the user’s input and the AI’s response are stored in the database-backed memory (using Cassandra), ensuring that context is preserved for future interactions.
b. View Chat History – GET /api/chat/history
How It Works: When a user calls the /api/chat/history endpoint with the GET method:
- The chatMemory.get("default", 100) method retrieves up to 100 of the most recent messages stored in memory.
- These messages include both user inputs and AI responses from the “default” conversation.
- This helps you understand the full conversation context currently stored in Cassandra memory, which is useful for debugging or displaying chat history in the UI.
c. Clear Chat History – DELETE /api/chat/history
How It Works: When the /api/chat/history endpoint is called with the DELETE method:
- The chatMemory.clear("default") method wipes all stored messages for the “default” conversation.
- This removes any context previously stored—essentially resetting the chat memory.
- The next time the user sends a message, it will be treated as a completely new conversation.
- This is especially useful when the user wants to start fresh or if you want to reset the AI’s memory programmatically after a certain point in the flow.
🖥️ Verify the output
Here’s a detailed walkthrough of testing the single-user chat memory implementation:
- Send an initial message
  - 💬 POST /api/chat with: "Hello, my name is Bunty Raghani"
  - 🤖 LLM responds with a greeting and remembers the name.
  - 💾 Behind the scenes: A new record is inserted into ai_chat_memory with session_id = 'default', storing the message "Hello, my name is Bunty Raghani". This allows the system to remember the name and context for future conversations.
- Ask a follow-up
  - 💬 POST /api/chat with: "what's my name?"
  - 🤖 LLM replies: "Your name is Bunty Raghani." (thanks to memory)
  - 🔍 Behind the scenes: The system retrieves the stored message for session_id = 'default' from ai_chat_memory, recognizing the name “Bunty Raghani” and providing a relevant response based on the context.
- Check memory
  - 💬 GET /api/chat/history
  - 🤖 Returns full conversation so far.
  - 🔍 Behind the scenes: The system fetches all records associated with session_id = 'default' from ai_chat_memory, returning a list of user inputs and AI responses from the current session. This helps track the context of the ongoing conversation.
- Clear the memory
  - 🧹 DELETE /api/chat/history
  - 🤖 Returns: "Conversation history cleared"
  - 🗑️ Behind the scenes: The system executes a query to delete all records where session_id = 'default', removing all previously stored conversation data. This effectively resets the memory, clearing any prior context.
- Test again after clearing
  - 💬 POST /api/chat with: "what's my name?"
  - 🤖 LLM responds: "As a large language model, I don't have access to personal information, including your name. You haven't told me your name, so I don't know it."
  - 🔍 Behind the scenes: Since the memory has been cleared for session_id = 'default', there are no stored records to refer to. The LLM responds with a default reply indicating that no previous context is available, allowing the user to share their name again if they choose.
This confirms that the chat memory is working as intended — storing, retrieving, and resetting context effectively.
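The walkthrough above can also be scripted. The following sketch builds the five requests with the JDK's built-in HTTP client API; it assumes the application runs at http://localhost:8080 (an assumption, adjust as needed), and the class name is hypothetical. The requests are only constructed and printed here so the example runs without a server.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

// Hypothetical client-side sketch of the single-user test walkthrough.
class SingleUserWalkthroughSketch {

    // Assumed local address of the running application.
    static final String BASE = "http://localhost:8080/api/chat";

    static HttpRequest send(String message) {
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(message))
                .build();
    }

    static HttpRequest history() {
        return HttpRequest.newBuilder(URI.create(BASE + "/history")).GET().build();
    }

    static HttpRequest clear() {
        return HttpRequest.newBuilder(URI.create(BASE + "/history")).DELETE().build();
    }

    // The five steps of the walkthrough, in order.
    static List<HttpRequest> walkthrough() {
        return List.of(
                send("Hello, my name is Bunty Raghani"),
                send("what's my name?"),
                history(),
                clear(),
                send("what's my name?"));
    }

    public static void main(String[] args) {
        for (HttpRequest request : walkthrough()) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

To execute a request against a running instance, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and print the response body.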
The single-user approach works well for many applications, but what if we need to support multiple users with individual conversation histories?
2.3. Multi-User Chat Application
When your application needs to handle multiple users, each with their own conversation history, storing all chats in a single memory won’t work. Instead, we need to isolate conversations by a unique identity like a user ID.
Our second application maintains separate conversation histories for different users. This is essential for:
- Personal assistant applications
- Customer support systems
- Personalized learning platforms
- Any application where user-specific context matters
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.web.bind.annotation.DeleteMapping;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY;

@RestController
@RequestMapping("/api/users")
public class MultiUserChatMemoryController {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;

    public MultiUserChatMemoryController(ChatClient chatClient, ChatMemory chatMemory) {
        this.chatClient = chatClient;
        this.chatMemory = chatMemory;
    }

    // Endpoint to send messages and get responses
    @PostMapping("/{userId}/chat")
    public String chat(
            @PathVariable String userId,
            @RequestBody String request) {
        return chatClient
                .prompt()
                // Use the userId path variable as the conversation ID for this request
                .advisors(advisorSpec -> advisorSpec.param(CHAT_MEMORY_CONVERSATION_ID_KEY, userId))
                .user(request)
                .call().content();
    }

    // Endpoint to view conversation history
    @GetMapping("/{userId}/history")
    public List<Message> getHistory(@PathVariable String userId) {
        return chatMemory.get(userId, 100);
    }

    // Endpoint to clear conversation history
    @DeleteMapping("/{userId}/history")
    public String clearHistory(@PathVariable String userId) {
        chatMemory.clear(userId);
        return "Conversation history cleared for user: " + userId;
    }
}
The key difference here is how we’re using the userId as a conversation identifier:
- Dynamic Conversation IDs: Each user gets their own conversation ID (the user ID).
- Isolated Contexts: Users only see responses informed by their own previous interactions.
- User-Specific Endpoints: All endpoints include the user ID in the path.
- Conversation ID Parameter: We explicitly set the conversation ID when prompting the LLM. In the chat endpoint, we use .advisors(advisorSpec -> advisorSpec.param(CHAT_MEMORY_CONVERSATION_ID_KEY, userId)) to tell the MessageChatMemoryAdvisor which conversation ID to use. In our case, the userId path variable acts as the conversation ID.
- The history and clear endpoints now work with user-specific conversation IDs instead of the default one.
This design pattern is perfect for applications requiring personalized conversation experiences. The beauty is that we’re using the same underlying CassandraChatMemory implementation; we’re just using different conversation IDs.
🖥️ Verify the Output
Let’s test how chat memory works when handling multiple users. We’ll walk through two separate conversations — one for Alice and one for Bob — and see how their chat histories are independently managed.
🧑💻 Alice’s Conversation Flow
- Send a message
  💬 Alice says: “Hi, I’m Alice”
  🤖 The LLM replies: “Hi Alice! It’s nice to meet you. How can I help you today?”
  💾 Behind the scenes: A record is inserted into ai_chat_memory with session_id = 'Alice'
- Ask for name
  💬 Alice says: “What’s my name?”
  🤖 The LLM responds using memory: “Your name is Alice.”
  🔍 Behind the scenes: The system retrieves the previous messages for session_id = 'Alice' to maintain context
🧑💻 Bob’s Conversation Flow
- Send a message
  💬 Bob says: “Hey, I’m Bob”
  🤖 The LLM replies: “Hello Bob! It’s nice to meet you. How can I help you today?”
  💾 Behind the scenes: A new set of records is created in ai_chat_memory with session_id = 'Bob'
- Clear Bob’s memory
  🧹 Call DELETE /api/users/Bob/history
  🤖 Response: “Conversation history cleared for user: Bob”
  🗑️ Behind the scenes: Deletes all chat memory entries associated with Bob’s session from the ai_chat_memory table
- Ask for name
  💬 Bob says: “What’s my name?”
  🤖 Since Bob’s memory was cleared, the LLM says: “As a large language model, I don’t have access to personal information about you. Therefore, I do not know your name.”
  ❌ No previous context is available for session_id = 'Bob'
🔁 Alice’s Memory Is Still Intact
- Ask again
💬 Alice says: “What’s my name?”
🤖 The LLM responds correctly: “Your name is Alice.”
🔍 The system successfully retrieves Alice’s conversation context and responds accordingly.
This confirms that Alice’s conversation memory remains intact and unaffected by Bob’s actions. Each user has a dedicated, isolated conversation history in the database, which makes this approach well suited to personalized assistant applications.
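The Alice and Bob flow can be mimicked with a small in-memory model of a store keyed by conversation ID. This is only a sketch of the partitioning idea: the class name and methods are hypothetical, and a HashMap stands in for the Cassandra table partitioned by session_id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of per-conversation isolation.
class ConversationIsolationSketch {

    // conversation ID -> messages in insertion order
    private final Map<String, List<String>> messagesByConversation = new HashMap<>();

    void add(String conversationId, String message) {
        messagesByConversation
                .computeIfAbsent(conversationId, id -> new ArrayList<>())
                .add(message);
    }

    // Returns at most lastN of the most recent messages for one conversation.
    List<String> get(String conversationId, int lastN) {
        List<String> all = messagesByConversation.getOrDefault(conversationId, List.of());
        int from = Math.max(0, all.size() - lastN);
        return List.copyOf(all.subList(from, all.size()));
    }

    // Removes only the given conversation; others are untouched.
    void clear(String conversationId) {
        messagesByConversation.remove(conversationId);
    }

    public static void main(String[] args) {
        ConversationIsolationSketch memory = new ConversationIsolationSketch();
        memory.add("Alice", "Hi, I'm Alice");
        memory.add("Bob", "Hey, I'm Bob");

        memory.clear("Bob");

        System.out.println(memory.get("Alice", 100)); // prints [Hi, I'm Alice]
        System.out.println(memory.get("Bob", 100));   // prints []
    }
}
```

Because every operation takes the conversation ID as its key, clearing one user's history cannot touch another's, which mirrors what the DELETE query scoped to a single session_id does in Cassandra.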
3. Cassandra Chat Memory Customization Options
3.1. Keyspace and Table Name Customization
By default, Spring AI uses a keyspace named springframework and a table named ai_chat_memory. However, you can customize these to fit your application’s needs:
@Bean
public ChatMemory cassandraChatMemory(CqlSession cqlSession) {
    // Creates a CassandraChatMemory instance using a configuration builder.
    // Needs the active CqlSession to interact with the database.
    return CassandraChatMemory.create(CassandraChatMemoryConfig.builder()
            // Specifies the keyspace where the chat memory table should reside or be created.
            // By default, CassandraChatMemory uses a keyspace named "springframework"
            // If the keyspace doesn't exist, Spring AI will attempt to create it.
            .withKeyspaceName("my_local_app_keyspace")
            // Provide the active CqlSession used for interacting with the Cassandra database
            .withCqlSession(cqlSession)
            // Specify a custom table name. Defaults to "ai_chat_memory" if not set.
            .withTableName("my_custom_table")
            .build());
}
3.2. Setting Time-To-Live (TTL)
Cassandra offers a powerful feature called Time-To-Live (TTL) that allows data to automatically expire after a specified duration. This is particularly useful for chat applications where you might want to automatically purge older conversations.
// Requires: import java.time.Duration;
@Bean
public ChatMemory cassandraChatMemory(CqlSession cqlSession) {
    // Creates a CassandraChatMemory instance using a configuration builder.
    // Needs the active CqlSession to interact with the database.
    return CassandraChatMemory.create(CassandraChatMemoryConfig.builder()
            // Specify the keyspace where the chat memory table should reside or be created.
            .withKeyspaceName("my_local_app_keyspace")
            // Provide the active CqlSession used for interacting with the Cassandra database
            .withCqlSession(cqlSession)
            // Set a Time-To-Live (TTL) for chat entries.
            // TTL is applied per entry
            .withTimeToLive(Duration.ofDays(30))
            .build());
}
It’s important to understand that TTL in Cassandra is applied per entry, not to the entire table or conversation. This means:
- Each message inserted gets its own individual expiration timestamp
- Messages expire independently based on when they were inserted
- Newer messages in a conversation will remain even after older ones expire
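A short simulation makes the per-entry behaviour easier to see. The sketch below is plain Java with hypothetical names; it does not talk to Cassandra and simply computes which messages would still be alive at a given moment if each one expired a fixed duration after it was inserted.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of per-entry TTL; Cassandra does this expiry itself.
class PerEntryTtlSketch {

    // One stored message plus the moment it was written.
    static final class Entry {
        final String message;
        final Instant insertedAt;

        Entry(String message, Instant insertedAt) {
            this.message = message;
            this.insertedAt = insertedAt;
        }
    }

    // Returns the messages whose individual TTL has not yet elapsed at 'now'.
    static List<String> liveMessages(List<Entry> entries, Duration ttl, Instant now) {
        List<String> live = new ArrayList<>();
        for (Entry entry : entries) {
            Instant expiresAt = entry.insertedAt.plus(ttl);
            if (now.isBefore(expiresAt)) {
                live.add(entry.message);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2025-01-01T00:00:00Z");
        List<Entry> conversation = List.of(
                new Entry("Hello, my name is Bunty Raghani", start),
                new Entry("what's my name?", start.plus(Duration.ofDays(20))));

        // Day 35 with a 30-day TTL: the first message expired on day 30,
        // the second one lives until day 50.
        Instant day35 = start.plus(Duration.ofDays(35));
        System.out.println(liveMessages(conversation, Duration.ofDays(30), day35));
        // prints [what's my name?]
    }
}
```

Note the practical consequence: the message that introduced the user's name expires first, so a long-running conversation can lose early context while later messages remain.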
4. Enable Logging
To enable logging for Cassandra operations, we need to configure the logging settings. The DataStax Java driver for Cassandra provides comprehensive logging capabilities to track and monitor database interactions. Let’s take a look at how we can set up the logging configuration in our main class:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringBootAiChatMemoryCassandraApplication {

    public static void main(String[] args) {
        // --- Configure Advanced DataStax Driver Logging via System Properties ---
        // These properties are set *before* Spring Boot initializes the Cassandra driver,
        // ensuring the driver picks them up early. They provide fine-grained control
        // over how the driver logs query executions, which is useful for debugging.

        // Explicitly tell the driver to use the RequestLogger class for tracking.
        // Using the index '.0' as it expects a list via the newer 'classes' property.
        System.setProperty("datastax-java-driver.advanced.request-tracker.classes.0", "RequestLogger");
        // Enable logging for successful queries.
        System.setProperty("datastax-java-driver.advanced.request-tracker.logs.success.enabled", "true");
        // Enable logging for queries considered "slow".
        System.setProperty("datastax-java-driver.advanced.request-tracker.logs.slow.enabled", "true");
        // Enable logging for queries that result in errors.
        System.setProperty("datastax-java-driver.advanced.request-tracker.logs.error.enabled", "true");
        // IMPORTANT: Enable logging of query parameter values (e.g., values bound to '?').
        // Very useful for debugging, but can be verbose. Use with caution in production.
        System.setProperty("datastax-java-driver.advanced.request-tracker.logs.show-values", "true");
        // Limit the length of logged parameter values to avoid excessively long log lines.
        System.setProperty("datastax-java-driver.advanced.request-tracker.logs.max-value-length", "200");
        // Limit the number of parameter values logged per query.
        System.setProperty("datastax-java-driver.advanced.request-tracker.logs.max-values", "200");
        // --- End of Driver Logging Configuration ---

        // Run the Spring Boot application.
        SpringApplication.run(SpringBootAiChatMemoryCassandraApplication.class, args);
    }
}
👉 Let’s explain what each setting does:
- RequestLogger: This is the class responsible for tracking and logging Cassandra queries. Setting this property tells the driver to use this logger.
- success.enabled: When set to true, all successful queries are logged. This is useful during development but might generate too much noise in production.
- slow.enabled: When enabled, queries that take longer than a configurable threshold are logged as “slow queries.” This helps identify performance bottlenecks.
- error.enabled: Logs queries that result in errors, which is essential for debugging failed operations.
- show-values: When enabled, the actual parameter values bound to query placeholders are logged. This is incredibly helpful for debugging, as you can see exactly what values were being used in the query.
- max-value-length: Limits how much of each value is logged, preventing overly long log entries.
- max-values: Limits the total number of values logged per query, again to keep logs manageable.
🖥️ With this configuration, you’ll see detailed logs like:
// ================================================
// 1. User sends initial message: "Hello, my name is Bunty Raghani"
// Operations:
// - Retrieve last 100 messages from ai_chat_memory table (context fetch)
// - Insert new user message into ai_chat_memory
// - Insert AI response into ai_chat_memory
// ================================================
2025-05-10T01:56:15.496+05:30 INFO 6295 --- [spring-boot-ai-chat-memory-cassandra] [ s0-io-3] c.d.o.d.i.core.tracker.RequestLogger : [s0|510897510][Node(endPoint=/127.0.0.1:9042, hostId=abc, hashCode=123)] Success (23 ms) [2 values] SELECT * FROM my_local_app_keyspace.ai_chat_memory WHERE session_id=:session_id LIMIT :lastn [session_id='default', lastn=100]
2025-05-10T01:56:15.507+05:30 INFO 6295 --- [spring-boot-ai-chat-memory-cassandra] [ s0-io-3] c.d.o.d.i.core.tracker.RequestLogger : [s0|1563492878][Node(endPoint=/127.0.0.1:9042, hostId=abc, hashCode=123)] Success (10 ms) [3 values] INSERT INTO my_local_app_keyspace.ai_chat_memory (session_id,message_timestamp,user) VALUES (:session_id,:message_timestamp,:message) [session_id='default', message_timestamp='2025-05-10T01:56:15.496+05:30', message='Hello, my name is Bunty Raghani']
2025-05-10T01:56:16.622+05:30 INFO 6295 --- [spring-boot-ai-chat-memory-cassandra] [ s0-io-3] c.d.o.d.i.core.tracker.RequestLogger : [s0|2046870720][Node(endPoint=/127.0.0.1:9042, hostId=abc, hashCode=123)] Success (5 ms) [3 values] INSERT INTO my_local_app_keyspace.ai_chat_memory (session_id,message_timestamp,assistant) VALUES (:session_id,:message_timestamp,:message) [session_id='default', message_timestamp='2025-05-10T01:56:16.616+05:30', message='Hello Bunty Raghani, it''s nice to meet you! How can I help you today?']
// ================================================
// 2. User checks conversation history
// Operation:
// - Fetch the last 100 messages for session_id='default' from Cassandra
// ================================================
2025-05-10T01:56:23.189+05:30 INFO 6295 --- [spring-boot-ai-chat-memory-cassandra] [ s0-io-3] c.d.o.d.i.core.tracker.RequestLogger : [s0|1331473124][Node(endPoint=/127.0.0.1:9042, hostId=abc, hashCode=123)] Success (22 ms) [2 values] SELECT * FROM my_local_app_keyspace.ai_chat_memory WHERE session_id=:session_id LIMIT :lastn [session_id='default', lastn=100]
// ================================================
// 3. User clears conversation history
// Operation:
// - Delete all records for session_id='default' from Cassandra
// ================================================
2025-05-10T01:56:29.079+05:30 INFO 6295 --- [spring-boot-ai-chat-memory-cassandra] [ s0-io-3] c.d.o.d.i.core.tracker.RequestLogger : [s0|592986705][Node(endPoint=/127.0.0.1:9042, hostId=0a60731e-970b-4c3b-8d6f-7a436cbe265d, hashCode=123)] Success (19 ms) [1 values] DELETE FROM my_local_app_keyspace.ai_chat_memory WHERE session_id=:session_id [session_id='default']
What you’ll see: INSERT, SELECT, and DELETE statements in your console, which is great for understanding and troubleshooting.
This detailed logging helps you:
- Verify that the correct queries are being executed
- Debug issues with parameter binding
- Understand the timing and sequence of database operations
- Identify potential performance bottlenecks
5. Video Tutorial
If you prefer visual learning, check out our step-by-step video tutorial demonstrating the complete Spring AI Chat Memory Cassandra implementation for single and multi-user scenarios.
📺 Watch on YouTube:
6. Source Code
For the complete working example of the Cassandra implementation discussed in this blog, check out our GitHub repository:
🔗 Spring Boot AI Chat Memory with Cassandra Example: https://github.com/BootcampToProd/spring-boot-ai-chat-memory-cassandra
7. Things to Consider
When implementing chat memory in your Spring AI applications, consider these important factors:
- TTL Strategy: Decide whether to apply TTL at all, and if so, what duration makes sense for your application. Remember that TTL is per-entry, so conversations will “fade” gradually rather than disappear all at once.
- Conversation ID Management: In production applications, consider using secure, unique identifiers for conversation IDs. For multi-user applications, you might want to add authentication to ensure users can only access their own conversations.
- Backup Strategy: Implement regular backups of your Cassandra data to prevent conversation loss.
- Monitoring: Set up monitoring for your Cassandra cluster to ensure it’s performing optimally. Watch for slow queries and optimize as necessary.
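As one possible approach to the conversation ID point above, the sketch below issues random, hard-to-guess conversation IDs and records which user owns each one, so an endpoint can check ownership before reading or clearing history. The class and method names are hypothetical, the registry is kept in memory purely for illustration, and the user ID is assumed to come from an authentication layer that is not shown here.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical ownership registry; a real application would persist this mapping.
class ConversationRegistrySketch {

    // conversation ID -> ID of the authenticated user who owns it
    private final Map<String, String> ownerByConversation = new ConcurrentHashMap<>();

    // Creates a new random conversation ID and records its owner.
    String open(String userId) {
        String conversationId = UUID.randomUUID().toString();
        ownerByConversation.put(conversationId, userId);
        return conversationId;
    }

    // True only if the conversation exists and belongs to the given user.
    boolean mayAccess(String userId, String conversationId) {
        return userId.equals(ownerByConversation.get(conversationId));
    }

    public static void main(String[] args) {
        ConversationRegistrySketch registry = new ConversationRegistrySketch();
        String aliceConversation = registry.open("alice");

        System.out.println(registry.mayAccess("alice", aliceConversation)); // prints true
        System.out.println(registry.mayAccess("bob", aliceConversation));   // prints false
    }
}
```

A controller could call mayAccess before passing the conversation ID to the chat memory, instead of trusting a user ID taken directly from the URL path.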
8. FAQs
Why use Cassandra instead of a traditional SQL database for chat memory?
Cassandra is designed for high write throughput and horizontal scalability, making it ideal for chat applications that need to handle many concurrent users and conversations. It excels at storing time-series data like conversation histories.
How does Spring AI handle message ordering in conversations?
Spring AI uses timestamps to ensure proper message ordering. Each message gets a timestamp metadata entry, which Cassandra uses as a clustering key to maintain chronological order.
Do I need to manually create the Cassandra table?
No. Given an existing keyspace, Spring AI will auto‑create ai_chat_memory (or your custom table).
How can I limit the number of messages returned per conversation?
When calling chatMemory.get(conversationId, limit), the limit parameter controls how many of the most recent messages are fetched. For example, chatMemory.get("Alice", 50) returns the last 50 chat entries.
9. Conclusion
You now have everything you need to build a chat application that “remembers” users and conversations—even after restarts—by combining Spring AI with Cassandra. Whether you want a quick single-user bot or a full multi-user system, you can easily set up your keyspace and table, add automatic expiration (TTL), and rely on Spring Boot’s auto-configuration to handle the details. This approach scales from one user to millions, keeps your data safe and persistent, and gives you the flexibility to tweak settings as you grow—so you can focus on creating truly conversational AI experiences. Happy coding!
10. Learn More
Interested in learning more?
Spring AI JDBC Chat Memory: Building Persistent Conversational Applications with PostgreSQL and MariaDB