AI Conversation threads saving
Valid from Datafari 7.0
Introduction
The Datafari Assistant chatbot is able to save and store users’ conversations history in order to:
preserve chat continuity for logged users
retrieve previous chat messages
attach documents extracted text to a conversation
Conversations, messages and attached documents are persisted in Datafari’s PostgreSQL
Data model
Global view
users
└── conversation
└── messages
└── search_results (JSONB)
└── docsbasket
A conversation belongs to a authenticated user.
A conversation can hold multiple messages.
A conversation can hold multiple documents (called “docs basket”)
One message can hold:
one text content
OR search results, stored as a String “JSONB”.
PostgreSQL tables
conversation
CREATE TABLE conversation (
id UUID PRIMARY KEY,
title TEXT,
username VARCHAR(255) NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
last_refresh TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT conversation_user_fk
FOREIGN KEY (username) REFERENCES users(username) ON DELETE CASCADE
);Field | Description |
|---|---|
| Unique identifier of the conversation |
| Title of the conversation (currently auto generated, based on the creation date) |
| Owner’s username |
| Creation date |
| Last update date. If the user selects a conversation, the last_refresh is set to the current date to mark it as the most recently used conversation. |
messages
CREATE TABLE messages (
id UUID PRIMARY KEY,
conversation_id UUID NOT NULL,
role VARCHAR(32) NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
search_results JSONB,
CONSTRAINT messages_conversation_fk
FOREIGN KEY (conversation_id) REFERENCES conversation(id) ON DELETE CASCADE,
CONSTRAINT messages_role_chk
CHECK (role IN ('user', 'assistant', 'system', 'tool'))
);Field | Description |
|---|---|
| ID of the message |
| ID of the associated conversation |
| Message role ( |
| Message text content. Can be empty if search_result is not. |
| If the message contains search results, information about retrieved documents are stored here. The JSONB is converted into a JSONArray in the Example of JSON: [
{
"docId": "9bde77d1455c9252f40632d67ab3261d79fb4e9423601ea6b032ffff86b2e912",
"title": "23f003%20cctp%20tvx%20videosurveillance.docx",
"url": "[url...]",
"content": "[truncated content...]"
},
{
"docId": "dbbb9089fdde83c2cf45a13db1f98c6f044b2ae6a1453cef06cba06f1e48a385",
"title": "23f003%20bpu%20dded%20videosurveillance.xlsx",
"url": "[url...]",
"content": "[truncated content...]"
},
{
"docId": "77ce58b02aabe03a8e358bfdac3eb2ef77b25fd09cc676b06005c091d2731796",
"title": "Lapresse-journal-web.pdf",,
"url": "[url...]",
"content": "[truncated content...]"
}
] |
| Message emission date |
docsbasket
CREATE TABLE docsbasket (
id UUID PRIMARY KEY,
conversation_id UUID NOT NULL,
document_id TEXT NOT NULL,
document_title TEXT,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT docsbasket_conversation_fk
FOREIGN KEY (conversation_id) REFERENCES conversation(id) ON DELETE CASCADE
);Field | Description |
|---|---|
| ID of the docsbasket entry |
| ID of the associated conversation |
|
|
| Title of the document. |
| Entry creation date. |
Indexes
The following indexes have been created in order to optimize common queries:
CREATE INDEX conversation_username_idx ON conversation(username);
CREATE INDEX conversation_username_created_idx ON conversation(username, created_at DESC);
CREATE INDEX messages_conversation_created_idx ON messages(conversation_id, created_at DESC);
CREATE INDEX docsbasket_conversation_idx ON docsbasket(conversation_id);
CREATE INDEX docsbasket_document_id_idx ON docsbasket(document_id);
CREATE UNIQUE INDEX docsbasket_unique_doc_per_conv ON docsbasket(conversation_id, document_id);Cascade deletion
The user has the possibility to delete a conversation. If that happens, the conversation entry, as well as all associated messages and docsbasket is automatically deleted.
Security
Queries, as defined in ConversationDataService.java, are built in a way that prevent users to create, read or edit conversation that they don’t own. In consequence, an anonymous user’s will not be saved in database.
API and services
A service class provides a collection of methods that interact with the COLLECTIONS, MESSAGES and DOCSBASKET tables.
On API’s side, a collection of endpoints can be called be the UI to GET information (such as user’s collection), POST data (add a documents to a docsbasket), or DELETE entries (delete conversations, remove documents…).
For more information about API endpoints, see https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1672871937 .