AI Conversation threads saving

AI Conversation threads saving

Valid from Datafari 7.0

Introduction

The Datafari Assistant chatbot is able to save and store users’ conversations history in order to:

  • preserve chat continuity for logged users

  • retrieve previous chat messages

  • attach documents extracted text to a conversation

Conversations, messages and attached documents are persisted in Datafari’s PostgreSQL

Data model

Global view

users └── conversation └── messages └── search_results (JSONB) └── docsbasket
  • A conversation belongs to a authenticated user.

  • A conversation can hold multiple messages.

  • A conversation can hold multiple documents (called “docs basket”)

  • One message can hold:

    • one text content

    • OR search results, stored as a String “JSONB”.

PostgreSQL tables

conversation

CREATE TABLE conversation ( id UUID PRIMARY KEY, title TEXT, username VARCHAR(255) NOT NULL, created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, last_refresh TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, CONSTRAINT conversation_user_fk FOREIGN KEY (username) REFERENCES users(username) ON DELETE CASCADE );

Field

Description

Field

Description

id

Unique identifier of the conversation

title

Title of the conversation (currently auto generated, based on the creation date)

username

Owner’s username

created_at

Creation date

last_refresh

Last update date. If the user selects a conversation, the last_refresh is set to the current date to mark it as the most recently used conversation.

messages

CREATE TABLE messages ( id UUID PRIMARY KEY, conversation_id UUID NOT NULL, role VARCHAR(32) NOT NULL, content TEXT NOT NULL, created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, search_results JSONB, CONSTRAINT messages_conversation_fk FOREIGN KEY (conversation_id) REFERENCES conversation(id) ON DELETE CASCADE, CONSTRAINT messages_role_chk CHECK (role IN ('user', 'assistant', 'system', 'tool')) );

Field

Description

Field

Description

id

ID of the message

conversation_id

ID of the associated conversation

role

Message role (user, assistant, etc.)

content

Message text content. Can be empty if search_result is not.

search_results

If the message contains search results, information about retrieved documents are stored here.
This field is optional. It contains a String JSON (JSONB).

The JSONB is converted into a JSONArray in the propertiesToJson() method, in Assistant.java.

Example of JSON:

[ { "docId": "9bde77d1455c9252f40632d67ab3261d79fb4e9423601ea6b032ffff86b2e912", "title": "23f003%20cctp%20tvx%20videosurveillance.docx", "url": "[url...]", "content": "[truncated content...]" }, { "docId": "dbbb9089fdde83c2cf45a13db1f98c6f044b2ae6a1453cef06cba06f1e48a385", "title": "23f003%20bpu%20dded%20videosurveillance.xlsx", "url": "[url...]", "content": "[truncated content...]" }, { "docId": "77ce58b02aabe03a8e358bfdac3eb2ef77b25fd09cc676b06005c091d2731796", "title": "Lapresse-journal-web.pdf",, "url": "[url...]", "content": "[truncated content...]" } ]

created_at

Message emission date

docsbasket

CREATE TABLE docsbasket ( id UUID PRIMARY KEY, conversation_id UUID NOT NULL, document_id TEXT NOT NULL, document_title TEXT, created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, CONSTRAINT docsbasket_conversation_fk FOREIGN KEY (conversation_id) REFERENCES conversation(id) ON DELETE CASCADE );

Field

Description

Field

Description

id

ID of the docsbasket entry

conversation_id

ID of the associated conversation

document_id

docId of the document

document_title

Title of the document.

created_at

Entry creation date.

Indexes

The following indexes have been created in order to optimize common queries:

CREATE INDEX conversation_username_idx ON conversation(username); CREATE INDEX conversation_username_created_idx ON conversation(username, created_at DESC); CREATE INDEX messages_conversation_created_idx ON messages(conversation_id, created_at DESC); CREATE INDEX docsbasket_conversation_idx ON docsbasket(conversation_id); CREATE INDEX docsbasket_document_id_idx ON docsbasket(document_id); CREATE UNIQUE INDEX docsbasket_unique_doc_per_conv ON docsbasket(conversation_id, document_id);

Cascade deletion

The user has the possibility to delete a conversation. If that happens, the conversation entry, as well as all associated messages and docsbasket is automatically deleted.

Security

Queries, as defined in ConversationDataService.java, are built in a way that prevent users to create, read or edit conversation that they don’t own. In consequence, an anonymous user’s will not be saved in database.

API and services

A service class provides a collection of methods that interact with the COLLECTIONS, MESSAGES and DOCSBASKET tables.

On API’s side, a collection of endpoints can be called be the UI to GET information (such as user’s collection), POST data (add a documents to a docsbasket), or DELETE entries (delete conversations, remove documents…).

For more information about API endpoints, see https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1672871937 .