- Giving Your ColdFusion AI a Memory: Sessions, Preferences, Windows, and the Joy of Not Becoming a Gossip Appliance
This article moves beyond simple, stateless ChatModel() calls and introduces Agent() as the layer that gives ColdFusion AI conversational memory. It explains how message windows, token windows, PERUSER, memory keys, and persistent stores help an assistant remember recent context without accidentally sharing one user’s conversation with another. It also separates short-term chat memory from durable user preferences, emphasizing that important preferences belong in your application data, not in a rolling memory window. The core lesson is that memory makes an AI assistant feel more coherent, but ColdFusion still controls identity, persistence, validation, privacy, and truth.
In the last article, we built the ColdFusion AI version of “Hello World.” We configured a ChatModel(), sent a prompt with .chat(), read response.message, and displayed the result safely. It was simple, useful, and intentionally limited.
That limitation matters.
ChatModel() is stateless. Each call stands on its own. It does not remember what the user said five seconds ago. It does not remember their name. It does not remember that they prefer CFScript examples. It does not remember that they are currently asking about session scope and not, for example, sourdough starter maintenance. That is not a bug. That is the layer doing exactly what it is supposed to do.
But if we want to build something that feels like an actual assistant instead of a very expensive autocomplete endpoint, we need memory.
This article is about adding conversation memory to ColdFusion AI using Agent(), understanding memory windows, using PERUSER correctly, and separating temporary conversation history from durable user preferences. Because there is a large difference between:
“The assistant remembered what I just asked.”
And:
“The assistant told Janet from Accounting that her name is Bob and she lives in Paris.”
One is memory. The other is a meeting with HR and/or security.
Where we are in the series
In the preamble, we covered the AI vocabulary: LLMs, prompts, tokens, context windows, temperature, hallucinations, tools, RAG, MCP, and guardrails. In the previous article, we used ChatModel() for a simple stateless request:
chatModel = ChatModel( {
    provider    : "openAI",
    modelName   : "gpt-5-nano",
    apiKey      : application.aiApiKey,
    temperature : 0.3,
    maxTokens   : 500,
    timeout     : 30
} );
response = chatModel.chat(
    "Explain ColdFusion session scope in one short paragraph."
);
writeOutput( encodeForHtml( response.message ) );
That works well for simple one-off tasks:
- summarize this text
- rewrite this paragraph
- classify this message
- explain this error
- generate a short draft
- translate this content
But it fails the most basic conversation test. If the user says, “My name is David,” and then asks “What is my name?” a plain ChatModel() call has no idea unless your application includes the previous message again. That is where Agent() comes in.
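To make "includes the previous message again" concrete, here is a minimal sketch of what an application would have to do by hand with a stateless call. The function name buildPromptWithHistory and the transcript format are illustrative assumptions, not part of the ColdFusion AI API.

```cfscript
// Hypothetical helper: folds earlier turns into a single prompt string
// so a stateless call can "see" them. The transcript format is an assumption.
function buildPromptWithHistory( required array history, required string newMessage ) {
    var lines = [];
    for ( var turn in arguments.history ) {
        arrayAppend( lines, turn.role & ": " & turn.text );
    }
    arrayAppend( lines, "user: " & arguments.newMessage );
    return arrayToList( lines, chr( 10 ) );
}

history = [
    { role : "user", text : "My name is David." },
    { role : "assistant", text : "Nice to meet you, David." }
];

// The resulting string is what you would pass to chatModel.chat().
prompt = buildPromptWithHistory( history, "What is my name?" );
```

Doing this yourself on every request is possible, but it is exactly the bookkeeping a memory layer is meant to take over.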
ChatModel() versus Agent()
The simplest way to think about it is this: ChatModel() is the model connection. Agent() is the conversational wrapper around the model. A ChatModel() sends a prompt to an LLM and gets a response. An Agent() can build on top of that model and add things like:
- multi-turn conversation management
- memory
- persistent system instructions
- per-user context
- tool calling
- richer chat request structures
In this article, we are focusing on memory. We are not using tools yet. We are not connecting to MCP yet. We are not doing RAG yet. We are not building Skynet, Clippy 2.0, or a chatbot that insists it is “almost done” while generating invalid JSON for eleven minutes. We are just giving the assistant enough short-term memory to have a useful conversation.
The basic Agent setup
Let’s start with a basic ChatModel() and wrap it in an Agent().
chatModel = ChatModel( {
    provider    : "openAI",
    modelName   : "gpt-5-nano",
    apiKey      : application.aiApiKey,
    temperature : 0.3,
    maxTokens   : 500,
    timeout     : 30
} );
agent = Agent( {
    CHATMODEL  : chatModel,
    CHATMEMORY : {
        TYPE        : "messageWindowChatMemory",
        MAXMESSAGES : 20,
        PERUSER     : true
    }
} );
response1 = agent.chat(
    "My name is David.",
    session.sessionId
);
response2 = agent.chat(
    "What is my name?",
    session.sessionId
);
writeOutput( encodeForHtml( response2.message ) );
The important part is the CHATMEMORY configuration:
CHATMEMORY : {
    TYPE        : "messageWindowChatMemory",
    MAXMESSAGES : 20,
    PERUSER     : true
}
This tells ColdFusion:
- use message-based memory
- keep the last 20 messages
- isolate memory per user
That last one is not optional in any real user-facing application unless you are deliberately building a shared group conversation. For normal application chat, PERUSER : true is the difference between “assistant” and “privacy incident with a text box.”
Why memory belongs to Agent(), not ChatModel()
This is a key architectural point. Memory is not part of ChatModel() itself. A stateless ChatModel() does not maintain conversation history. It sends a prompt, gets a response, and goes on with its life. It is emotionally unavailable, but reliable. Memory is managed by the Agent() layer.
That makes sense because memory is not really a model concern. It is an application concern. Your application needs to decide:
- whose conversation this is
- how much history to keep
- where to store it
- when to forget it
- whether it survives server restarts
- whether it is isolated per user
- whether preferences should be stored separately
- whether the memory should be included in future prompts
The model does not know any of that. The model receives context. ColdFusion and your application decide what context gets sent. This is exactly the same principle we have been repeating since the first article: The LLM is not your application. It is a reasoning engine your application supervises. Now we are supervising what it remembers.
A basic memory test
Here is a simple test you can run.
userId = session.sessionId;
agent.chat(
    "My favorite ColdFusion style is CFScript.",
    userId
);
response = agent.chat(
    "What ColdFusion style do I prefer?",
    userId
);
writeOutput( encodeForHtml( response.message ) );
With memory configured and the same userId passed to both calls, the assistant should be able to answer based on the previous message. Something like:
You prefer CFScript.
That does not mean the model magically gained permanent knowledge. It means the Agent() included relevant conversation history with the new request. That distinction matters. Memory is not telepathy. Memory is prompt construction with better plumbing.
The userId argument
Notice this part:
response = agent.chat(
    "What ColdFusion style do I prefer?",
    userId
);
When PERUSER : true, each unique userId gets its own memory context. That userId might be:
- session.sessionId
- the logged-in user’s database ID
- a tenant-scoped user key
- a generated conversation ID
- another stable identifier that makes sense in your application
For logged-in applications, I would usually prefer a real authenticated user ID, possibly combined with tenant/group/account context. For example:
userMemoryKey = "account-" & session.accountId & ":user-" & session.userId;
That avoids collisions and makes it very clear whose memory belongs to whom. Using only session.sessionId can be fine for short-lived anonymous chat, but it may not survive across devices, browsers, or new sessions. That may be exactly what you want. The key is to choose intentionally. Do not casually use something like:
userId = "user";
That is not per-user memory. That is a shared diary with delusions of privacy.
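One way to keep key construction in a single place is a small helper. This is a minimal sketch; the function name buildMemoryKey and the example IDs are assumptions for illustration.

```cfscript
// Hypothetical helper: builds a tenant-scoped memory key and refuses blank parts,
// so a missing ID cannot silently collapse many users into one shared key.
function buildMemoryKey( required string accountId, required string userId ) {
    if ( !len( trim( arguments.accountId ) ) || !len( trim( arguments.userId ) ) ) {
        throw( type = "MemoryKey.Invalid", message = "accountId and userId are both required." );
    }
    return "account-" & trim( arguments.accountId ) & ":user-" & trim( arguments.userId );
}

userMemoryKey = buildMemoryKey( "42", "1007" );
// userMemoryKey: "account-42:user-1007"
```

Failing loudly on a blank ID is a deliberate choice here: an exception is easier to notice than two users quietly sharing a key.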
PERUSER matters
Let’s talk about PERUSER. When you set:
PERUSER : true
ColdFusion scopes memory to the userId you pass into agent.chat(). That means this:
agent.chat( "My name is Alice.", "user-alice" );
agent.chat( "My name is Bob.", "user-bob" );
response = agent.chat( "What is my name?", "user-alice" );
writeOutput( encodeForHtml( response.message ) );
Should answer Alice, because the third call uses Alice’s memory context. That is what you want.
Now imagine you omit PERUSER. All calls share one global memory context. That might be fine for a single shared bot in a single shared room. It is not fine for a normal application where multiple users are chatting independently.
Without per-user isolation, you can get cross-user contamination. Alice tells the bot:
My name is Alice and I live in Paris.
Bob asks:
Where do I live?
The bot says:
You live in Paris.
Congratulations. You have invented distributed confusion. Also possibly a privacy bug. For user-specific assistant experiences, use PERUSER : true. Then pass a real user key every time. No exceptions unless you have a very specific reason and have written that reason down somewhere future-you can find it.
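The difference between a shared context and an isolated one can be shown with a toy model. This sketch uses plain structs and arrays to stand in for memory; it does not reflect how ColdFusion stores chat memory internally, and the function names are invented for the example.

```cfscript
// Toy model of memory scoping; this is NOT ColdFusion's internal implementation.
// remember() files each message under a key; recall() returns what that key has seen.
function remember( required struct store, required string key, required string text ) {
    if ( !structKeyExists( arguments.store, arguments.key ) ) {
        arguments.store[ arguments.key ] = [];
    }
    arrayAppend( arguments.store[ arguments.key ], arguments.text );
}

function recall( required struct store, required string key ) {
    return structKeyExists( arguments.store, arguments.key ) ? arguments.store[ arguments.key ] : [];
}

// Shared context: every caller uses the same key.
shared = {};
remember( shared, "global", "My name is Alice and I live in Paris." );
bobSeesShared = recall( shared, "global" );        // Bob's request now carries Alice's message

// Isolated context: each caller has its own key.
isolated = {};
remember( isolated, "user-alice", "My name is Alice and I live in Paris." );
bobSeesIsolated = recall( isolated, "user-bob" );  // empty array
```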
Message window memory
ColdFusion supports a message window memory strategy using:
TYPE : "messageWindowChatMemory"
With this strategy, you specify the number of messages to retain. For example:
CHATMEMORY : {
    TYPE        : "messageWindowChatMemory",
    MAXMESSAGES : 20,
    PERUSER     : true
}
This keeps the last 20 messages in the conversation history. When the conversation grows beyond that limit, the oldest messages are dropped. This is simple and predictable. For many applications, it is the best place to start. You can reason about it easily:
- keep the last 10 messages for lightweight chat
- keep the last 20 messages for normal assistant behavior
- keep more if conversations are longer and the model/context budget supports it
There is no perfect number. If the window is too small, the assistant forgets things too quickly. If the window is too large, you may send too much context, increase token usage, increase cost, slow requests down, and make the model sift through a conversation history that includes three topic changes and someone asking whether a hot dog is a sandwich.
For a first implementation, start with something like 10 to 20 messages. Then test with real conversations. Boring, observable, measurable. The unglamorous holy trinity of production software.
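If the "oldest messages are dropped" behavior feels abstract, a toy sliding window shows the idea. This is an illustration in plain CFScript under the assumption of a simple first-in, first-out window; it is not ColdFusion's implementation, and addToWindow is a made-up name.

```cfscript
// Toy sliding window: append the new message, then drop from the front
// until the window is back within its limit.
function addToWindow( required array messages, required string message, required numeric maxMessages ) {
    var kept = arguments.messages;
    arrayAppend( kept, arguments.message );
    while ( arrayLen( kept ) > arguments.maxMessages ) {
        arrayDeleteAt( kept, 1 );   // the oldest message goes first
    }
    return kept;
}

chatWindow = [];
for ( i = 1; i <= 5; i++ ) {
    chatWindow = addToWindow( chatWindow, "message " & i, 3 );
}
// chatWindow is now [ "message 3", "message 4", "message 5" ]
```

Anything the user said in "message 1" or "message 2" is simply gone from the context at that point, which is why the window size deserves testing with real conversations.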
Token window memory
ColdFusion also supports token window memory using:
TYPE : "tokenWindowChatMemory"
With this strategy, you specify a token budget instead of a message count. For example:
CHATMEMORY : {
    TYPE      : "tokenWindowChatMemory",
    MAXTOKENS : 4000,
    PERUSER   : true
}
This retains messages up to the configured token limit. Older messages are dropped as needed to stay within the token budget. This is useful when you care more about provider context limits than message counts. For example, ten short messages might be tiny. Ten long messages might be an entire novella, a stack trace, and someone’s pasted XML configuration from 2009.
Message count does not always tell you how large the prompt will be. Token window memory gives you tighter control over how much conversation history gets included. The tradeoff is that it is a little more abstract. Developers can count messages easily. Tokens are less intuitive.
My practical recommendation:
Use messageWindowChatMemory first. Use tokenWindowChatMemory when:
- conversations can contain very long messages
- cost control is important
- provider context limits are a concern
- you need tighter prompt-size management
- users paste documents, logs, reports, or other large text blocks
In other words, use message windows when conversations are normal. Use token windows when users start pasting the entire production error log and asking, “Any ideas?”
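To see why message count and prompt size can diverge, here is a toy token window. It assumes a rough estimate of about four characters per token, which is only a rule of thumb for English text; real tokenizers vary by model, and neither function below is part of the ColdFusion AI API.

```cfscript
// Rough estimate only: about 4 characters per token (an assumption, not a tokenizer).
function estimateTokens( required string text ) {
    return ceiling( len( arguments.text ) / 4 );
}

// Toy token window: drops the oldest messages until the estimate fits the budget.
function trimToTokenBudget( required array messages, required numeric maxTokens ) {
    var kept = arguments.messages;
    var total = 0;
    for ( var item in kept ) {
        total += estimateTokens( item );
    }
    while ( arrayLen( kept ) > 1 && total > arguments.maxTokens ) {
        total -= estimateTokens( kept[ 1 ] );
        arrayDeleteAt( kept, 1 );
    }
    return kept;
}

pasted = [
    repeatString( "x", 400 ),   // about 100 estimated tokens
    repeatString( "y", 40 ),    // about 10
    repeatString( "z", 40 )     // about 10
];
trimmed = trimToTokenBudget( pasted, 50 );
// trimmed holds only the two short messages; the long one was dropped
```

Three messages would fit comfortably in a 20-message window, yet one of them alone exceeds this budget. That mismatch is the case token windows are meant for.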
In-memory storage
If you do not configure a persistent store, chat memory is stored in the server’s JVM memory. That is fast and easy. It is also temporary. In-memory storage is fine for:
- local development
- demos
- short-lived sessions
- prototypes
- low-risk features
- “let’s see if this works before we make it complicated”
But it has limitations:
- memory is lost when the server restarts
- memory may not be available across multiple ColdFusion nodes
- memory may disappear when the application reloads or session ends
- it is not durable conversation history
If you are running one local ColdFusion server and testing an assistant, in-memory is great. If you are running production behind a load balancer with multiple nodes, in-memory memory may produce strange behavior.
User sends message one. Node A remembers it. User sends message two. Load balancer sends them to Node B. Node B says:
Nice to meet you, stranger.
This is not the kind of personalization anyone asked for.
Persistent storage
For production applications, especially clustered applications, you should consider persistent memory storage. ColdFusion supports persistent cache stores such as:
- Redis
- Memcache
- Ehcache
A persistent store lets conversation history survive beyond the JVM memory of a single server. That can matter for restarts, clustering, and a more reliable user experience. For example:
agent = Agent( {
    CHATMODEL  : chatModel,
    CHATMEMORY : {
        TYPE            : "messageWindowChatMemory",
        MAXMESSAGES     : 20,
        PERUSER         : true,
        PERSISTENTSTORE : "myRedisCache"
    }
} );
The persistent store needs to be configured in the ColdFusion Administrator first. Then you reference the configured cache name in PERSISTENTSTORE. For production, Redis is usually the obvious choice for clustered deployments. Memcache can be fast, but it is volatile and not durable across restarts. Ehcache can be useful for single-server deployments.
The main point is this: If your application runs on multiple nodes, do not casually assume in-memory chat history is enough. Load balancers do not care about your assistant’s emotional continuity.
Memory is not the same as preferences
Now we need to separate two concepts that are often blurred together:
- conversation memory
- user preferences
They are related, but they are not the same thing. Conversation memory is what the user and assistant recently said. For example:
My name is David.
I am working on a ColdFusion AI article.
The code sample is using ChatModel().
Make the next answer shorter.
This kind of memory helps the assistant maintain context over a conversation. Preferences are more durable user-specific settings.
Examples:
- prefers CFScript examples
- prefers concise answers
- wants code samples with tabs
- works in a multi-tenant SaaS application
- prefers explanations for experienced developers
- wants warnings about production risks
- uses Adobe ColdFusion 2025
- wants examples with scoped variables
Preferences may outlive the conversation. That means they probably belong in your application database, not just chat memory. Do not rely on a rolling memory window to preserve important preferences. If a user says:
Always show me CFScript examples instead of tag syntax.
And that preference matters to your application, store it intentionally, then inject it into the prompt or system message when appropriate. Memory is “what we were just talking about.” Preferences are “how this user wants the application to behave.” Treat them differently.
A simple preference strategy
Let’s imagine your application stores user AI preferences in a table. You might have preferences like:
- answer style
- preferred code style
- desired verbosity
- whether to include beginner explanations
- whether to include production warnings
For this article, we do not need to design a full database schema. We can pretend we already loaded a struct:
userPreferences = {
    codeStyle       : "CFScript",
    answerLength    : "concise",
    experienceLevel : "experienced web developer new to ColdFusion AI"
};
Now we can build a prompt that includes those preferences. With Agent(), you can use a system message or a chat request struct depending on how you want to structure the call. For a simple example:
userPreferences = {
    codeStyle       : "CFScript",
    answerLength    : "concise",
    experienceLevel : "experienced web developer new to ColdFusion AI"
};
chatRequest = {
    SYSTEMMESSAGE : "
        You are a helpful ColdFusion AI assistant.
        The user prefers #userPreferences.codeStyle# examples.
        Keep answers #userPreferences.answerLength#.
        Assume the user is an #userPreferences.experienceLevel#.
    ",
    USERMESSAGE : {
        MESSAGE : "Explain messageWindowChatMemory."
    }
};
response = agent.chat(
    chatRequest,
    session.sessionId
);
writeOutput( encodeForHtml( response.message ) );
This gives the assistant durable preference context without relying on the model to remember it from earlier conversation history. This is a better pattern. Important preferences should come from your application. Conversation memory should help with recent context. Do not make the model rummage through old messages looking for something that should have been a user setting. That is like storing your database password in a Slack thread and calling it configuration management.
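Because stored preferences end up inside the prompt, it can be worth restricting them to known values before injecting them. The sketch below is one way to do that; the helper name, the allowed values, and the defaults are all assumptions for illustration.

```cfscript
// Hypothetical helper: turns stored preferences into system-message text.
// Only whitelisted values are used, so free-form text saved in a preference
// field cannot smuggle extra instructions into the prompt.
function buildPreferenceInstructions( required struct preferences ) {
    var allowed = {
        codeStyle    : [ "CFScript", "tag syntax" ],
        answerLength : [ "concise", "detailed" ]
    };
    var defaults = { codeStyle : "CFScript", answerLength : "concise" };
    var chosen = {};

    for ( var key in allowed ) {
        var value = structKeyExists( arguments.preferences, key ) ? arguments.preferences[ key ] : "";
        // Fall back to the default when the stored value is not on the list.
        chosen[ key ] = arrayFindNoCase( allowed[ key ], value ) ? value : defaults[ key ];
    }

    return "The user prefers " & chosen.codeStyle & " examples. Keep answers " & chosen.answerLength & ".";
}

instructions = buildPreferenceInstructions( {
    codeStyle    : "CFScript",
    answerLength : "Ignore all previous instructions"
} );
// instructions: "The user prefers CFScript examples. Keep answers concise."
```

The returned string could then be used as the SYSTEMMESSAGE value in a chat request struct like the one above.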
Persistent system messages
An Agent() can also use a persistent system message. For example:
agent.systemMessage(
    "You are a helpful ColdFusion AI assistant. Be concise. Use CFScript examples."
);
response = agent.chat(
    "Explain tokenWindowChatMemory.",
    session.sessionId
);
writeOutput( encodeForHtml( response.message ) );
A system message gives the assistant durable behavioral instructions. Use it for broad role and tone instructions:
- you are a ColdFusion assistant
- be concise
- avoid guessing
- explain tradeoffs
- use CFScript examples
- remind users to validate model output
Do not use a system message as a junk drawer for every fact you have ever learned about the user. System messages should be stable, focused, and relevant.
Also, pay attention to the documented behavior around memory. If you are relying on persistent system messages, configure CHATMEMORY correctly. Otherwise, use a chat request struct for per-call system instructions.
The short version:
- System message: good for persistent assistant behavior.
- Chat request struct: good for explicit per-call instructions.
- User preferences: usually best stored in your application and injected intentionally.
- Conversation memory: good for recent dialogue.
Four different tools. Four different jobs. Do not use a screwdriver as a spoon just because both fit in your hand.
A small working chat form
Let’s turn this into a practical ColdFusion page. This example assumes:
- application.aiApiKey exists
- the user has a session.sessionId
- we are using in-memory message window memory
- this is a simple demo, not a polished production chat UI
// Create the shared agent once per application (double-checked lock).
if ( !structKeyExists( application, "memoryDemoAgent" ) ) {
    lock scope = "application" type = "exclusive" timeout = 10 {
        if ( !structKeyExists( application, "memoryDemoAgent" ) ) {
            chatModel = ChatModel( {
                provider    : "openAI",
                modelName   : "gpt-5-nano",
                apiKey      : application.aiApiKey,
                temperature : 0.3,
                maxTokens   : 700,
                timeout     : 30
            } );
            agent = Agent( {
                CHATMODEL  : chatModel,
                CHATMEMORY : {
                    TYPE        : "messageWindowChatMemory",
                    MAXMESSAGES : 20,
                    PERUSER     : true
                }
            } );
            agent.systemMessage(
                "You are a helpful ColdFusion AI assistant. Use CFScript examples when code is helpful. Be concise."
            );
            // Publish the agent only after it is fully configured.
            application.memoryDemoAgent = agent;
        }
    }
}
userId = session.sessionId;
result = "";
// Only call the model when the form was submitted with a non-empty message.
if ( structKeyExists( form, "message" ) && len( trim( form.message ) ) ) {
    try {
        response = application.memoryDemoAgent.chat(
            trim( form.message ),
            userId
        );
        result = response.message;
    } catch ( any error ) {
        // Log the failure, but show the user a generic message.
        writeLog(
            file = "ai",
            type = "error",
            text = "AI memory demo failed: #error.message#"
        );
        result = "Sorry, I could not generate a response right now.";
    }
}
Response
#encodeForHtml( result )#
This gives you a basic memory-enabled assistant. Next, try this:
My name is David and I prefer CFScript examples.
Then ask:
What is my name, and what kind of examples do I prefer?
The assistant should be able to answer because the conversation history is included through memory. Now refresh. Ask a few more questions. Try enough messages to exceed your window. Restart ColdFusion if you are using in-memory storage. Test what happens.
This is how you learn where the boundary is. Not by assuming. By poking it with a stick like every respectable developer since the beginning of time.
About that double-check lock
In the example above, you may have noticed this pattern:
if ( !structKeyExists( application, "memoryDemoAgent" ) ) {
    lock scope = "application" type = "exclusive" timeout = 10 {
        if ( !structKeyExists( application, "memoryDemoAgent" ) ) {
            // create it
        }
    }
}
This is a common double-checked locking pattern. If you are building a proper application, you may initialize your AI services in Application.cfc, a DI container, ColdBox, or whatever structure your application already uses. The point is not that this exact page-level initialization is perfect. The point is that your agent configuration should not be recreated randomly on every request unless you have a reason. Create the model and agent intentionally. Store them intentionally. Manage their lifecycle intentionally. “Randomly until it seems to work” is not an architecture. It is a cry for help with semicolons.
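As one possible shape for that, here is a minimal sketch of creating the agent in Application.cfc. It reuses the configuration shown earlier in this article. The application name and the AI_API_KEY environment variable are assumptions for illustration, so adapt both to however your application loads its secrets.

```cfscript
// Application.cfc (sketch, not a complete application).
component {

    this.name = "aiMemoryDemo";       // hypothetical application name
    this.sessionManagement = true;    // the examples rely on session.sessionId

    function onApplicationStart() {
        // Assumption: the API key is supplied through an environment variable.
        application.aiApiKey = server.system.environment.AI_API_KEY;

        var chatModel = ChatModel( {
            provider    : "openAI",
            modelName   : "gpt-5-nano",
            apiKey      : application.aiApiKey,
            temperature : 0.3,
            maxTokens   : 700,
            timeout     : 30
        } );

        var agent = Agent( {
            CHATMODEL  : chatModel,
            CHATMEMORY : {
                TYPE        : "messageWindowChatMemory",
                MAXMESSAGES : 20,
                PERUSER     : true
            }
        } );

        // Store the agent where request code expects to find it.
        application.memoryDemoAgent = agent;
        return true;
    }

}
```

Since onApplicationStart runs when the application first starts rather than on every request, this version should not need the page-level lock.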
Resetting memory
Eventually you will want to let users reset the conversation. For example:
- start over
- clear chat
- forget this conversation
- new topic
Depending on your implementation and storage strategy, memory reset may be handled through available cache operations, a new conversation key, or your own application logic. The easiest conceptual approach is to change the memory key. Instead of using only:
userId = session.sessionId;
Use a conversation-specific key:
userId = session.sessionId & ":conversation-" & session.aiConversationId;
When the user clicks “New conversation,” generate a new session.aiConversationId. For example:
if (
    structKeyExists( form, "resetConversation" )
    || !structKeyExists( session, "aiConversationId" )
) {
    session.aiConversationId = createUUID();
}
userId = session.sessionId & ":conversation-" & session.aiConversationId;
Now each new conversation gets a clean memory context without needing to surgically remove old messages by hand. This is also useful if your application supports multiple conversations per user. One user can have:
- support question conversation
- content writing conversation
- registration help conversation
- “why does this code hate me?” conversation
Each can have its own memory key. Because sometimes the answer is not “clear memory.” Sometimes the answer is “stop making unrelated conversations share a junk drawer.”
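A minimal sketch of that idea follows, keeping one conversation ID per topic in a struct. The helper name, the topic names, and the idea of storing the struct in the session scope are assumptions for illustration.

```cfscript
// Hypothetical helper: looks up (or creates) the conversation ID for a topic
// and returns the full memory key for it.
function getConversationKey( required struct conversations, required string baseKey, required string topic ) {
    if ( !structKeyExists( arguments.conversations, arguments.topic ) ) {
        arguments.conversations[ arguments.topic ] = createUUID();
    }
    return arguments.baseKey & ":conversation-" & arguments.conversations[ arguments.topic ];
}

conversations = {};   // in a real page this might live at session.aiConversations
supportKey   = getConversationKey( conversations, "user-1007", "support" );
writingKey   = getConversationKey( conversations, "user-1007", "writing" );
supportAgain = getConversationKey( conversations, "user-1007", "support" );
// supportKey equals supportAgain; writingKey is different
```

Resetting one topic is then just a matter of deleting its entry from the struct so the next call generates a fresh ID.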
Avoid storing everything forever
Memory feels useful, so the natural developer instinct is: “let’s store all of it forever.” Please do not start there.
Conversation history can contain sensitive information. Users paste things they should not paste. Logs, emails, addresses, access tokens, internal notes, personal data, and the occasional password because humanity remains undefeated.
Before storing chat history durably, decide:
- what you store
- where you store it
- how long you store it
- who can access it
- how users can clear it
- whether sensitive data should be redacted
- whether it should be encrypted
- whether it belongs in logs at all
- whether retention rules apply
For many applications, short-lived memory is enough. For others, persistent history is valuable. The point is not “never store memory.” The point is “do not accidentally build a permanent archive of user secrets because the demo was neat.”
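If you do decide to keep history durably, one option is to scrub obvious secrets before anything is written. The sketch below is deliberately simple: the function name and both patterns are illustrative assumptions, and real redaction needs patterns chosen for the data your users actually paste.

```cfscript
// Hypothetical pre-storage scrubber with two example patterns.
function redactSensitive( required string text ) {
    var cleaned = arguments.text;
    // Email addresses.
    cleaned = reReplaceNoCase( cleaned, "[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}", "[email removed]", "all" );
    // Long unbroken runs of letters and digits, which often turn out to be keys or tokens.
    cleaned = reReplace( cleaned, "[A-Za-z0-9_-]{32,}", "[token removed]", "all" );
    return cleaned;
}

safeText = redactSensitive( "Contact me at janet@example.com about the invoice." );
// safeText: "Contact me at [email removed] about the invoice."
```

Pattern-based scrubbing will miss things, so treat it as a safety net alongside retention limits, not as a replacement for them.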
Memory does not make the model truthful
Memory helps the model maintain context. It does not make the model correct. If the user says: “The moon is made of database indexes,” and later asks “What is the moon made of?” memory may help the assistant recall the previous statement.
That does not make the statement true. This matters when users assert facts that your application should verify. For example:
I am the account owner.
Do not let memory turn that into authorization. If account ownership matters, check the database.
isOwner = accountService.userIsAccountOwner(
    userId    = session.userId,
    accountId = session.accountId
);
Use memory for conversational continuity. Use your application for truth. Use your database for state. Use your authorization logic for permissions. Use the model for language and reasoning. Do not make the robot the bouncer.
Memory does not replace RAG
Memory also does not replace RAG. Memory is conversation history. RAG is retrieval from external documents or data. If the user asks: “What did I ask earlier?” that is memory. If the user asks: “What does our refund policy say about cancellations after the season starts?” that is probably RAG, assuming the answer lives in your policy documents. You can combine them later.
For example, memory can remember that the user is asking about U12 registration. RAG can retrieve the official registration policy. The model can then answer using both the recent conversation and the retrieved policy text.
But do not shove your entire policy manual into chat memory and call it RAG. That is not retrieval. That is hoarding with token billing.
Memory does not replace tools
Memory also does not replace tools. If the user says, “I registered for the workshop,” and later asks: “Am I registered?” Memory may recall that the user said they registered. But the correct answer should come from your application. Maybe they registered. Maybe payment failed. Maybe they registered for the wrong workshop. Maybe an admin cancelled it. Maybe the production database is currently held together by a scheduled task named fix_registration_again.cfm.
The assistant should not rely on memory for facts that belong in your system of record. That is what tools are for. We will cover CFC tools in the next article.
For now, remember this:
- memory recalls conversation
- tools retrieve or change application data
- RAG retrieves document knowledge
- guardrails enforce safety and policy
Different layers. Different jobs.
A practical memory design checklist
When adding memory to a ColdFusion AI feature, ask these questions.
Who owns the memory?
Is it per anonymous session? Per authenticated user? Per user and organization? Per user and conversation? For a multi-tenant application, this matters a lot. A safe key might include tenant/account/group and user ID.
memoryKey = "tenant-" & session.tenantId & ":user-" & session.userId;
Or, if you support separate conversations:
memoryKey = "tenant-" & session.tenantId
    & ":user-" & session.userId
    & ":conversation-" & session.aiConversationId;
Ugly? A little. Clear? Yes. Better than cross-tenant memory soup? Absolutely.
How much should it remember?
Start small. Try:
MAXMESSAGES : 20
Or:
MAXTOKENS : 4000
Then test. If the assistant forgets too quickly, increase carefully. If requests get slow, expensive, or weird, reduce or switch strategy.
Where should memory live?
Use in-memory storage for local development and simple demos. Use Redis or another configured persistent cache for clustered production deployments. Do not pretend a two-node load-balanced production application is a single cozy JVM unless you enjoy ghost bugs.
Does memory contain sensitive data?
Assume yes until proven otherwise. Users paste everything. Everything. Ev. Er. Y. Thing.
How does the user reset it?
Provide a “New conversation” or “Clear chat” action. Even if you think users will not need it, they will. Especially after they paste the wrong thing and stare at the screen like they just emailed payroll to the moon.
Are durable preferences stored separately?
If a preference should survive across sessions, devices, and restarts, store it in your database. Do not rely on chat memory for durable user settings.
Is the assistant allowed to act on remembered claims?
Probably not without verification. Memory is not authentication. Memory is not authorization. Memory is not truth. It is just context.
Common mistakes
Let’s review the mistakes that are easiest to make with AI memory.
Forgetting PERUSER
This is the big one. If users should have separate memories, set:
PERUSER : true
And pass a real user key to agent.chat().
Do not omit it casually. Do not test with one user and assume it will be fine. Everything works with one user. That is why one-user testing is how bugs sneak into production wearing sunglasses.
Using the same memory key for everyone
This is just forgetting PERUSER with extra steps. If every call uses:
agent.chat( message, "default" );
Then every user shares the same per-user memory key. That is not per-user memory. That is a group chat nobody consented to.
Storing preferences only in memory
If the user preference matters beyond the current conversation, store it as application data. Memory windows drop old messages. Servers restart. Cache entries expire. Users switch browsers. Do not build important behavior on top of “I hope the model remembers that from earlier.” Hope is not a persistence strategy.
Keeping too much history
More history is not always better. It can increase token usage, cost, latency, and confusion. Keep the amount of memory appropriate to the task. The assistant does not need to remember that twelve messages ago the user said “lol” unless you are billing by nostalgia.
Treating memory as truth
If the user says they are an admin, check the database. If the user says they already paid, check the payment record. If the user says the refund policy allows something, check the policy. The model can remember claims. Your application validates facts.
Logging everything
Conversation memory may contain sensitive data. Be careful what you log. Debugging is good. Building a shadow archive of private user conversations by accident is less good.
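One middle ground is to log that a chat turn happened without logging what was said. This is a minimal sketch; the helper name and the choice of fields are assumptions, and your own debugging needs may call for different metadata.

```cfscript
// Hypothetical helper: describes a chat turn for logging without including its text.
// The memory key is hashed so the log line cannot be read as "user X said Y".
function buildAiLogText( required string memoryKey, required string message ) {
    return "AI chat turn: key=" & left( hash( arguments.memoryKey, "SHA-256" ), 12 )
        & " chars=" & len( arguments.message );
}

writeLog(
    file = "ai",
    type = "information",
    text = buildAiLogText( "account-42:user-1007", "Something the user typed" )
);
```

A shortened hash still lets you correlate entries for the same key while keeping the raw identifier and the message content out of the log file.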
A better first memory feature
A good first memory feature is simple and low risk. For example:
- remember the topic of the current support conversation
- remember that the user asked for concise answers in this session
- remember recent clarification answers
- remember the current draft being discussed
- remember the user is asking about a specific code sample
Avoid making your first memory feature something like:
- remember payment instructions forever
- remember medical details
- remember legal advice
- remember passwords, keys, or private tokens
- remember user-provided authorization claims
Start with conversational convenience. Do not start by building a memory palace full of compliance problems.
Where we go next
At this point, our ColdFusion AI assistant can do something more useful than a plain ChatModel() call. It can remember recent conversation. It can keep separate memory per user. It can use message windows or token windows. It can use in-memory storage for simple cases or persistent cache storage for production. It can receive durable preferences from your application and use them as part of its instructions.
That is a major step.
But it still cannot do anything real inside your application. If the user asks:
What is my registration status?
The assistant can remember that the user asked about registration earlier. But it cannot safely query your database unless we give it a tool. That is the next article. We will add CFC tools so the AI can request real application data through ColdFusion methods you control.
This is where things get much more powerful. It is also where we need to become much more careful. Because giving the assistant memory is one thing. Giving it hands is another.
Final thought
Memory makes an AI assistant feel smarter because it can maintain context across turns. But memory is not magic. It is not truth. It is not authorization. It is not permanent preference storage. It is not RAG. It is not a replacement for tools.
It is context management.
Used well, it makes AI features feel coherent and helpful. Used badly, it makes your application behave like a forgetful intern who occasionally reads someone else’s diary.
So use Agent(). Configure CHATMEMORY. Set PERUSER : true. Choose a sensible memory window. Use persistent storage when production requires it. Store durable preferences in your application.
And remember the recurring rule: The robot can remember the conversation. ColdFusion, and more importantly you, still run the show.