AI Powered Datafari API
Valid from Datafari 7.0
Introduction
This feature is part of the available Datafari APIs, and it may be added later to the Datafari API page.
This part of the API contains all the AI-related endpoints that may be called by the Datafari Chat Bot.
Configuration
Before using the AiPowered endpoint, your Datafari needs to be properly configured. Use the “RAG & AI configuration” AdminUI to set up your environment. You will also need an AI web service that can run Large Language Models.
API endpoints
Every endpoint in this API is prefixed by /rest/v2.0/.
Also remember to prepend the path of your Datafari web app (https://[datafaridomain]/Datafari/rest/v2.0/ai/[endpoint]).
Currently, the AiPowered API provides two endpoints.
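As an illustration, the sketch below builds the endpoint URL and posts a JSON body to the no-stream endpoint using only the Python standard library. The function names `build_ai_url` and `call_ai` are hypothetical helpers, not part of Datafari, and the timeout value is an arbitrary assumption.

```python
import json
import urllib.request

# Path of the AI API inside the Datafari web app, as described above.
API_PATH = "/Datafari/rest/v2.0/ai"


def build_ai_url(datafari_domain: str, stream: bool = False) -> str:
    """Return the URL of the no-stream (/ai) or stream (/ai/stream) endpoint."""
    suffix = "/stream" if stream else ""
    return f"https://{datafari_domain}{API_PATH}{suffix}"


def call_ai(datafari_domain: str, payload: dict, timeout: float = 60.0) -> dict:
    """POST a JSON payload to the no-stream endpoint and return the parsed response."""
    request = urllib.request.Request(
        build_ai_url(datafari_domain),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Unlike `curl -k`, this sketch keeps TLS certificate verification enabled, so a test server with a self-signed certificate would need its certificate to be trusted first.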
| POST /ai | POST /ai/stream |
|---|---|---|
Description | The “no-stream” endpoint takes a JSON request body and returns a full JSON response once the process is done. | The “stream” endpoint takes a JSON request body and streams “NDJSON” (Newline-Delimited JSON) events to the client, allowing progressive rendering of the messages and transparency about the progress of the process in the Chat Bot. |
Payload structure | The endpoint takes a JSON input body. Each parameter is detailed below. Some may be optional or required depending on the action. {
"query": "[user_query]",
"action": "[rag|summarize|agentic]",
"id": "[any_solr_document_id]",
"agent": "[agent_name]",
"lang": "[language_code]",
"history": [
[chat_history]
],
"filters": {
"id": [ [list_of_ids] ],
"fq": [ [list_of_solr_filter_queries] ]
},
"conversationId": "[conversationId]"
} | The endpoint takes a JSON input body. Each parameter is detailed below. Some may be optional or required depending on the action. {
"query": "[user_query]",
"action": "[rag|summarize|agentic]",
"id": "[any_solr_document_id]",
"agent": "[agent_name]",
"lang": "[language_code]",
"history": [
[chat_history]
],
"filters": {
"id": [ [list_of_ids] ],
"fq": [ [list_of_solr_filter_queries] ]
},
"conversationId": "[conversationId]"
} |
Response structure | The returned response is a standard Datafari API JSON response, with a status (OK or ERROR) and a content (that can contain an “error” object). {
"status":"OK|ERROR",
"content": {
"message":"[ai_generated_response]",
"conversationId": "[conversationId]",
"sources": [
[list_of_retrieved_sources]
],
"docs": [
[search_results]
],
"error":{
"code": "[error_code]",
"label":"[error_label]",
"message": "[error_userfriendly_message]",
"reason": "[technical_error_desc]"
}
}
} | The server progressively streams events in the form of Newline-Delimited JSON (NDJSON):
{"type":"[event_type]","data":{ [event_payload] },"ts":[timestamp]}
The different event types and their associated payloads are detailed in the “Streaming events” section. The connection can be closed on the client side when an “error” or “stream.completed” event is received.
{"type":"stream.completed","data":{"status":"OK"},"ts":1759406565419}
{"type":"error","data":{"code":"[err_code]","label":"[err_label]",...},"ts":1759406565419} |
In case of error, the API provides a default response in English that can be displayed via a chatbot. However, since this “error.message” is not localized, we recommend using “error.label” as an i18n translation key to render the message in the proper language.
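The recommendation above could be applied on the client side as in the following sketch, which looks up “error.label” in a translation table and falls back to the English “error.message”. The `TRANSLATIONS` table, its French sentence, and the function name are hypothetical and only serve as an illustration.

```python
# Hypothetical translation table: keys are language codes, then error labels.
TRANSLATIONS = {
    "fr": {
        "ragNoFileFound": "Désolé, je n'ai trouvé aucun document pertinent pour répondre à votre demande.",
    },
}


def localized_error_message(error: dict, lang: str, translations: dict = TRANSLATIONS) -> str:
    """Return the translated message for error.label, or the default English message."""
    label = error.get("label", "")
    translated = translations.get(lang, {}).get(label)
    if translated:
        return translated
    # No translation available: fall back to the non-localized default message.
    return error.get("message", "")
```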
Request body fields description
Field | Required for | Optional for | Description |
|---|---|---|---|
action | SUMMARIZE, AGENTIC, SEARCH, SYNTHESIZE | RAG | The action that must be processed. The available actions are listed in the “Available services” section.
|
query | RAG, AGENTIC, SEARCH | SUMMARIZE, SYNTHESIZE | The user query, for RAG, AGENTIC or SEARCH. For SUMMARIZE or SYNTHESIZE, an optional query can be provided to set the content of the message stored in the database. |
id | SUMMARIZE | RAG | The ID of a Solr document.
|
filters | SYNTHESIZE | RAG, AGENTIC | Add one or more filters to restrict document retrieval in RAG and agentic processes. Currently allowed values are “id” (a list of document IDs) and “fq” (a list of Solr filter queries).
You can provide search parameters other than “fq” to customize retrieval operations. However, we highly recommend sticking to the “fq” to avoid search failures. The LLM will not be able to use excluded documents. For SYNTHESIZE action, the Example: "filters": {
"id": ["docIdNumber1", "docIdNumber2", "docIdNumber6"],
"fq": ["repo_source:FileShare", "{!tag%3Dlanguage}(language%3A\"en\")"]
}
In the example above, any document whose ID is not in the list, that does not belong to the “FileShare” repository, or that is not in English will be excluded. |
|
agent | AGENTIC | Only for AGENTIC. Select one of the available agents (default: Currently, only the “rag” agent is available. More agents may be added in the future. |
|
lang | ALL SERVICES | A two-letter language code to specify the expected language of the response ( If empty, the server will try to retrieve the user’s favorite language in the database. English is used by default. |
conversationId | ALL SERVICES |
| The ID of the conversation.
|
|
history | RAG, AGENTIC | Include the chat history in the request. If enabled in the RAG configuration, the LLM can use this history during query rewriting and response generation. This field is a list of ChatMessage objects, each with a “role” and a “message”: [
{
"role": "user",
"message": "what is enron ?"
},
{
"role": "assistant",
"message": "ENRON is a multinational (...)"
}
] |
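To make the request body fields more concrete, here is a minimal sketch that assembles a payload and only includes the fields that were actually provided. The helpers `chat_message` and `build_payload` are hypothetical names; the sketch does not check which fields are required for a given action, since that depends on the table above.

```python
from typing import Optional


def chat_message(role: str, message: str) -> dict:
    """Build one ChatMessage entry for the "history" field."""
    return {"role": role, "message": message}


def build_payload(
    action: str,
    query: Optional[str] = None,
    doc_id: Optional[str] = None,
    agent: Optional[str] = None,
    lang: Optional[str] = None,
    history: Optional[list] = None,
    filters: Optional[dict] = None,
    conversation_id: Optional[str] = None,
) -> dict:
    """Return a request body containing only the fields that are not None."""
    candidates = {
        "action": action,
        "query": query,
        "id": doc_id,
        "agent": agent,
        "lang": lang,
        "history": history,
        "filters": filters,
        "conversationId": conversation_id,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```

For example, `build_payload("summarize", doc_id="file://///localhost/enron/ELPASO.pdf", lang="en")` yields a body with only the “action”, “id” and “lang” fields.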
Error messages and labels
Below is the list of the existing error labels, with their associated default message (in English):
ragErrorNotEnabled: Sorry, it seems the feature is not enabled.
ragNoFileFound: Sorry, I couldn't find any relevant document to answer your request.
ragTechnicalError: Sorry, I met a technical issue. Please try again later, and if the problem remains, contact an administrator.
ragNoValidAnswer: Sorry, I could not find an answer to your question.
ragBadRequest: Sorry, It appears there is an issue with the request. Please try again later, and if the problem remains, contact an administrator.
summarizationErrorNotEnabled: Sorry, it seems the feature is not enabled.
summarizationNoFileFound: Sorry, I cannot find this document.
summarizationTechnicalError: Sorry, I met a technical issue. Please try again later, and if the problem remains, contact an administrator.
summarizationBadRequest: Sorry, It appears there is an issue with the request. Please try again later, and if the problem remains, contact an administrator.
summarizationEmptyFile: Sorry, I am unable to generate a summary, since the file has no content.
synthesisErrorNotEnabled: Sorry, it seems the feature is not enabled.
synthesisNoFileContent: Sorry, I am unable to generate a synthesis of those documents, since the files are missing or have no available content.
synthesisTechnicalError: Sorry, I met a technical issue. Please try again later, and if the problem remains, contact an administrator.
Available services
WORK IN PROGRESS
Both endpoints (/ai and /ai/stream) can be used to call any available AI service, by providing the action parameter.
Service | Value of “action” field | Description | Parameters |
|---|---|---|---|
RAG | rag | The service runs a search in Datafari to retrieve relevant sources, and provides them to the LLM so it can answer the user query or question. More information about RAG here: Retrieval-Augmented Generation (RAG) | |
SUMMARIZE | summarize | The service retrieves or generates the summary of a document. | |
SYNTHESIZE | | The service generates a synthesis of multiple documents, based on individual summaries. | |
AGENTIC | agentic | The service calls an “agent”, an AI-augmented program that is able to use various tools at its disposal (such as search, entity extraction, RAG, summarization…) to answer the user query. Currently, only the “rag” agent is available. More agents may be added in the future. | |
SEARCH | search | Runs a simple search in Datafari. The search results are stored in the “docs” section of the API response or stream event. | |
No-stream responses
Response structure
The POST /ai endpoint returns a JSON with the following structure:
{
"status":"OK|ERROR",
"content": {
"message": "[ai_generated_response]",
"conversationId": "[conversationId]",
"sources": [
[list_of_retrieved_sources]
],
"docs": [
[list_of_search_results]
],
"error":{
"code": "[error_code]",
"label": "[error_label]",
"message": "[error_userfriendly_message]",
"reason": "[technical_error_desc]"
}
}
}
This structure is always the same, whatever the action is.
Field | Description | Application in the Chatbot |
|---|---|---|
status | “ERROR” or “OK”. The status indicates if the process was successful, or if it met an error. | - |
content.message | The AI generated message. Can be null or empty in case of error. | If the message is neither null nor empty, it must be rendered in the chatbot, and added to the chat history as an “assistant” message. |
content.conversationId | The conversation ID. It is only present for logged-in users. | Only for logged-in users. |
content.docs | Used for search results. If provided, it contains a list of documents, using the following structure: "docs": [
{
"url": "...", // The URL of the document
"docId": "...", // The Solr ID of the document
"title": "...", // The first title of the document
"content": "..." // A truncated part of the document
},
{
"url": "...",
"docId": "...",
"title": "...",
"content": "..."
},
...
] | When handling a message with search results, the chatbot renders a formatted assistant message instead of a regular message. |
content.sources | A list of sources retrieved during the process. Sources use the following structure: "sources": [
{
"url": "...", // The URL of the document
"id": "...", // The Solr ID of the document
"title": "...", // The first title of the document
"content": "..." // A truncated part of the document
},
{
"url": "...",
"id": "...",
"title": "...",
"content": "..."
},
...
]
This field can be null or empty. | The sources must be displayed as clickable links, as long as the URL is not empty. <a target="_blank" rel="noopener noreferrer" href="${url}">${title}</a>
|
content.error | Only present in case of error or exception. Is null if the process is successful. | - |
content.error.code | An HTTP error code of the error (ex: | - |
content.error.label | A code that can be used as an i18n key for translation, when rendering a message in the chatbot. | If |
content.error.message | An English, user-friendly message that can be displayed in the chatbot if no translation is available. | If |
content.error.reason | A technical description of the error. | - |
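The chatbot behaviour described in the table above could be sketched as follows: the assistant message is added to the chat history when it is not empty, and sources with a non-empty URL are rendered as links. The function names are hypothetical, and the HTML escaping is an added precaution that the table does not mention.

```python
import html
from typing import List, Optional


def render_sources(sources: Optional[list]) -> List[str]:
    """Render each source that has a non-empty URL as a clickable link."""
    links = []
    for source in sources or []:
        url = source.get("url")
        if not url:
            continue  # sources without a URL are not displayed as links
        title = source.get("title") or url
        links.append(
            '<a target="_blank" rel="noopener noreferrer" '
            f'href="{html.escape(url, quote=True)}">{html.escape(title)}</a>'
        )
    return links


def handle_response(response: dict, history: list) -> Optional[str]:
    """Append a non-empty AI message to the chat history and return it."""
    content = response.get("content") or {}
    message = content.get("message")
    if message:
        history.append({"role": "assistant", "message": message})
        return message
    return None
```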
Examples
Request | Response (OK) | Response (ERROR) |
|---|---|---|
RAG curl -k -X POST "https://localhost/Datafari/rest/v2.0/ai" \
-H "Content-Type: application/json" \
-d '{
"query": "What is enron ?",
"action": "rag",
"lang": "fr",
"history": []
}' | Success {
"content": {
"message": "Enron est l'une des principales entreprises mondiales dans le secteur de l'énergie (...) Enron a été reconnue par le magazine Fortune comme \"l'entreprise la plus innovante d'Amérique\" pendant six années consécutives.",
"sources": [
{
"content": "Enron Offgrid\n\tEnron Financial(...) \nRenewable Power Desk\nEnron North A…",
"id": "file://///localhost/enron/ElliotRPD%20Overview%205_18_012.ppt",
"title": "ElliotRPD Overview 5_18_012.ppt",
"url": "file://///localhost/enron/ElliotRPD%20Overview%205_18_012.ppt"
},
{
"content": "ENRON RESERVATION PRICE\t\tCOUNTDOWN (...) AUCTION NEWS\n\t\t\tHOW TO SUBMIT A BID OR OFFE…",
"id": "file://///localhost/enron/Emissions%20Auction%20SiteText.doc",
"title": "Emissions Auction SiteText.doc",
"url": "file://///localhost/enron/Emissions%20Auction%20SiteText.doc"
},
(...)
{
"content": "113\n7\nContinental Power\n(Millions MWh)(...) and Physical Settled Volumes\n1999\n200…",
"id": "file://///localhost/enron/EGM_Final.ppt",
"title": "EGM_Final.ppt",
"url": "file://///localhost/enron/EGM_Final.ppt"
}
]
},
"status": "OK",
"conversationId": "ag9dfoilb-bdun2hh5j-15vio77fz"
} | Error: No document retrieved. {
"content": {
"error": {
"code": "428",
"label": "ragNoFileFound",
"message": "Sorry, I couldn't find any relevant document to answer your request.",
"reason": "The query cannot be answered because no associated documents were found."
},
"sources": [
]
},
"status": "ERROR"
} |
SUMMARIZE curl -k -X POST "https://localhost/Datafari/rest/v2.0/ai" \
-H "Content-Type: application/json" \
-d '{
"id": "file://///localhost/enron/ELPASO.pdf",
"action": "summarize",
"lang": "en",
"conversationId": "ag9dfoilb-bdun2hh5j-15vio77fz"
}' | Success {
"content": {
"message": "The document provides a detailed report on gas capacity and deliveries (...) useful for analyzing trends in the energy sector.",
"sources": [
],
"conversationId": "ag9dfoilb-bdun2hh5j-15vio77fz"
},
"status": "OK"
} | Error: Invalid document ID {
"content": {
"error": {
"code": "422",
"label": "summarizationNoFileFound",
"message": "The document cannot be retrieved.",
"reason": "Index 0 out of bounds for length 0"
},
"sources": [
],
"conversationId": "ag9dfoilb-bdun2hh5j-15vio77fz"
},
"status": "ERROR"
} |
AGENTIC curl -k -X POST "https://localhost/Datafari/rest/v2.0/ai" \
-H "Content-Type: application/json" \
-d '{
"query": "Cite moi le nom de trois employés féminins de ENRON",
"action": "agentic",
"agent": "rag",
"lang": "fr"
}' | Success {
"content": {
"message": "Trois employés féminins de Enron sont :\n\n1. Cindy Olson - Vice-présidente des ressources humaines.\n2. Wanda Curry - Vice-présidente.\n3. Peggy Fowler - Vice-présidente et conseillère générale.",
"sources": [
{
"content": "Cindy Olson is Enron’s EEO officer. As Executive Vice President (...) we achiev…",
"id": "file://///localhost/enron/EEO.doc",
"title": "EEO.doc",
"url": "file://///localhost/enron/EEO.doc"
},
{
"content": "ENRON\nENRON ENERGY SERVICES (...) Wanda Curry\nVice Presi…",
"id": "file://///localhost/enron/EES%20Org%20Chart.ppt",
"title": "EES Org Chart.ppt",
"url": "file://///localhost/enron/EES%20Org%20Chart.ppt"
},
(...)
{
"content": "ENRON EMPLOYEE REFERRAL INCENTIVE PROGRAM\n\nEnron Corp.,(...) To provide incentives for cu…",
"id": "file://///localhost/enron/Employee%20Referral.doc",
"title": "Employee Referral.doc",
"url": "file://///localhost/enron/Employee%20Referral.doc"
}
]
},
"status": "OK"
} | Error: Disabled feature {
"content": {
"error": {
"code": "422",
"label": "ragErrorNotEnabled",
"message": "Sorry, it seems the feature is not enabled.",
"reason": "Agentic service is disabled in configuration."
},
"sources": [
]
},
"status": "ERROR"
} |
SEARCH curl -k -X POST "https://localhost/Datafari/rest/v2.0/ai" \
-H "Content-Type: application/json" \
-d '{
"query": "enron",
"action": "search"
}' | {
"status": "OK",
"content": {
"message": "",
"sources": [
{
"id": "83e03438969c5702091caf38e9e49b5ccd868a68f2689c24dac46561b2315115",
"title": "McConnell.Shankman%20List.doc",
"url": "file://///fileshare.datafari.com/share/McConnell.Shankman%20List.doc",
"content": "[\"Director \/ Officer Positions, etc... \\nNovember 26, 2001\\n\\nMichael S. McConnell \\n\\tCompany\/Title\\n\\t\\n\\t\\n\\t\\n\\n\\nECT Overseas Holding Corp.\\n\\n\\tDirector\\n\\t\\n\\t\\n\\n\\tChairman and Preside…"
},
{
"id": "cc34c60dcdb23f401267349e621bf59705751cff2b473200ecf0cbe3d23e49b0",
"title": "eNovateFinSvcsAgrmt.doc",
"url": "file://///fileshare.datafari.com/share/eNovateFinSvcsAgrmt.doc",
"content":"[\"FINANCIAL SERVICES AGREEMENT\\n\\n\\nTHIS FINANCIAL SERVICES AGREEMENT (this \“Agreement\”), dated as of April ___, 2001, by and between ENRON CORP., an Oregon corporation (\“Enron\”), …"
}
(...)
],
"docs": [
{
"docId": "83e03438969c5702091caf38e9e49b5ccd868a68f2689c24dac46561b2315115",
"title": "McConnell.Shankman%20List.doc",
"url": "file://///fileshare.datafari.com/share/McConnell.Shankman%20List.doc",
"content": "Director / Officer Positions, etc... \nNovember 26, 2001\n\nMichael S. McConnell \n\tCompany/Title\n\t\n\t\n\t\n\n\nECT Overseas Holding Corp.\n\n\tDirector\n\t\n\t\n\n\tChairman and President\n\t\n\t\n\n\nEI Global Fuels Ltd.\n\n\tDirector\n\t\n\t\n\n\tChairman\n\t\n\t\n\n\nEnron (Bermuda) Limited\n\n\tDirector\n\t\n\t\n\n\tChairman\n\t\n\t\n\n\nEnron A..."
},
{
"docId": "cc34c60dcdb23f401267349e621bf59705751cff2b473200ecf0cbe3d23e49b0",
"title": "eNovateFinSvcsAgrmt.doc",
"url": "file://///fileshare.datafari.com/share/eNovateFinSvcsAgrmt.doc",
"content": "FINANCIAL SERVICES AGREEMENT\n\n\nTHIS FINANCIAL SERVICES AGREEMENT (this “Agreement”), dated as of April ___, 2001, by and between ENRON CORP., an Oregon corporation (“Enron”), and ENRON SUB [insert name of the Enron entity that is the managing member of enovate, LLC], a ____________ (“Enron Sub”)..."
},
(...)
{
"docId": "5c96241ada7160ebbad05f4cbbaf903b5757a9f8484123396f155a4fb1af8f92",
"title": "form%20financial%20services.doc",
"url": "file://///fileshare.datafari.com/share/form%20financial%20services.doc",
"content": "FINANCIAL SERVICES AGREEMENT\n\n\nTHIS FINANCIAL SERVICES AGREEMENT (this “Agreement”), dated as of April ___, 2001, by and between ENRON CORP., an Oregon corporation (“Enron”), and ENRON MW, L.L.C., a Delaware limited liability company (“Enron MW”).\n\nW I T N E S S E T H:\n\nWHEREAS, Enron MW is the m..."
}
]
}
} |
|
Streaming events
Traditional API calls return the full response only once the processing is complete. When using a slow model API, this can lead to noticeable latency, especially when heavy pipelines or Agents are involved.
Streaming solves this problem by:
Reducing perceived latency: tokens appear as they are generated, and the progress of the process is visible to the user.
Increasing transparency: intermediate events (tool calls, source retrieval) are visible as they happen.
Improving UX: users receive immediate feedback and understand what the system is doing.
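A client can consume such a stream line by line. The sketch below parses NDJSON lines, stops at the first “error” or “stream.completed” event, and rebuilds the assistant message from “message.delta” fragments or from a “message.final” event. The function names are hypothetical, and the input is any iterable of text lines (for instance the decoded lines of an HTTP response body).

```python
import json
from typing import Iterable, Iterator

# Events after which the client may close the connection.
TERMINAL_EVENTS = {"error", "stream.completed"}


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed events, stopping after the first terminal event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between events
        event = json.loads(line)
        yield event
        if event.get("type") in TERMINAL_EVENTS:
            break


def collect_message(lines: Iterable[str]) -> str:
    """Return the full response, preferring message.final over joined deltas."""
    fragments = []
    final_text = None
    for event in iter_events(lines):
        data = event.get("data") or {}
        if event.get("type") == "message.delta":
            fragments.append(data.get("text", ""))
        elif event.get("type") == "message.final":
            final_text = data.get("text", "")
    return final_text if final_text is not None else "".join(fragments)
```

In a real chatbot each fragment would be rendered as soon as it arrives rather than collected at the end; the collecting version is used here only to keep the example short.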
Event types
The response stream is composed of a sequence of events. Each event has:
type: the type of message (string)
data: the associated payload (object)
{"type":"<TYPE>","data":"<DATA>"}
The following events can be sent by any streaming chatbot-related endpoint (RAG, agentic, summarization…). More types may be implemented in the future.
type | data | Description | Application in chatbot |
| {} | The connection is open. | The “Source section” is cleared. |
| {
"name": "<phase>"
}
| Indicates the phase of the current process. The data contains a single String. Below is a non-exhaustive list of example phase names.
| The phase name is displayed in the “Phase indicator”. |
message.delta | {
"text": "<text_fragment>"
} | Adds a response fragment (token), so that the response can be displayed progressively, token by token. | The token is appended to the assistant’s message in the chat window. Due to the recursive algorithms used in our processes, the summarize and RAG services do not support progressive response writing and do not send message.delta events. The generated response is sent in a message.final event. |
message.final | {
"text": "<full_response>"
} |