Differences between the Community and Enterprise Editions of Datafari

 

Valid from 5.4

The documentation below is valid starting from Datafari 5.4

The conceptual difference between the Community and Enterprise Editions relates to the following themes: security, big data, exploitation and relevance. Whenever we provide advanced functionalities along one of these axes, we tend to keep them for the Enterprise Edition rather than the Community Edition, in order to maintain a strong interest for potential customers to get the Enterprise Edition. We do not put any particular constraints on the Community Edition, so you remain free to implement yourself any functionality that we provide only in the Enterprise Edition. However, there is a high chance that we will not accept such work as a contribution to the Community Edition, as it would conflict with our own functionality.

Drilling down into these axes, the functionalities available only in our Enterprise Edition are:

  1. Security

    1. Graphical AD configuration: the possibility to set up the connection to your AD from the Admin UI;

    2. Multiple AD OU: the technical capacity for Datafari to manage several AD organisational units;

    3. SSO: integration with Kerberos and SAMLv2;

    4. Caching mechanism for user roles: caching of user connections to lower the load.

  2. Big Data

    1. Multiserver clustering: preconfigured setups for multiserver environments (including the SolrCloud capability);

    2. Full componentisation via Docker containers: all the components of Datafari (Cassandra, PostgreSQL, MCF, Solr ...) are packaged within their own container, allowing for a "micro-service" approach;

    3. Enhanced Data Extraction Server: preconfigured, with a faster indexing phase.

  3. Exploitation

    1. Backup scripts: automatically back up your MCF configuration and your Solr index. Scripts and documentation to reload the backups are provided.

    2. Advanced Monitoring: configuration of Glances, logs downloadable per component from the admin UI, and a cron job watching the processes and restarting them automatically.

    3. Monitoring Dashboards: dashboards (based on Apache Zeppelin) to analyse the technical components and to quickly find file indexing problems (such as missing files reported by an employee).

  4. Relevance

    1. Automatic Relevancy Algorithm optimizer: graphical tool that automatically calculates a locally optimal solution for a given number of fields as part of the relevancy algorithm (leveraging the Golden Query Retriever functionality).

    2. Golden Query retriever: graphical tool in the search results, to compile the golden queries, each consisting of a query and its relevant documents.

    3. Machine Learning for documents real time ranking:

    4. Query time concept extractor:

    5. Indexing time concept extractor:

    6. On the fly semantic annotator: Able to connect asynchronously to 3rd party tools, such as AI tools from Google or Amazon, to extract/analyse documents before indexation. For instance, images of cars would be enriched with metadata giving the brand and color of the car; videos would get back the transcription text.
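
To make the idea behind the Automatic Relevancy Algorithm optimizer more concrete, here is a minimal sketch of how golden queries can drive a local search over field weights. It is not Datafari's implementation: the toy corpus, the field names (`title`, `content`), the scoring function and the choice of mean reciprocal rank as the metric are all assumptions made for illustration.

```python
# Toy corpus: each document has two text fields (hypothetical field names).
DOCS = {
    "d1": {"title": "solr backup guide", "content": "how to restore an index"},
    "d2": {"title": "release notes", "content": "solr backup backup scripts changed"},
    "d3": {"title": "connector setup", "content": "configure the solr connector"},
}

# Golden queries: query text -> ids of the documents judged relevant.
GOLDEN = {
    "solr backup": {"d1"},
    "connector": {"d3"},
}


def score(doc, query, weights):
    """Weighted count of query-term occurrences in each field."""
    total = 0.0
    for field, weight in weights.items():
        tokens = doc[field].split()
        total += weight * sum(tokens.count(term) for term in query.split())
    return total


def mean_reciprocal_rank(weights):
    """Average of 1/rank of the first relevant document over all golden queries."""
    rr_sum = 0.0
    for query, relevant in GOLDEN.items():
        # Highest score first; ties are broken by document id.
        ranked = sorted(DOCS, key=lambda d: (-score(DOCS[d], query, weights), d))
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                rr_sum += 1.0 / rank
                break
    return rr_sum / len(GOLDEN)


def hill_climb(weights, step=0.5, max_rounds=20):
    """Greedy local search: nudge one weight at a time while the metric improves."""
    best = dict(weights)
    best_mrr = mean_reciprocal_rank(best)
    for _ in range(max_rounds):
        improved = False
        for field in list(weights):
            for delta in (step, -step):
                candidate = dict(best)
                candidate[field] = max(0.0, candidate[field] + delta)
                mrr = mean_reciprocal_rank(candidate)
                if mrr > best_mrr:
                    best, best_mrr, improved = candidate, mrr, True
        if not improved:
            break  # no neighbour is better: a local optimum has been reached
    return best, best_mrr


start = {"title": 1.0, "content": 1.0}
best_weights, best_mrr = hill_climb(start)
print(mean_reciprocal_rank(start), best_weights, best_mrr)
```

The search stops as soon as no single-weight change improves the metric, which is why the result is only a local optimum, in line with the description above.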

This separation means that at France Labs, by default (but there can be exceptions), work done on these axes goes only into the Enterprise Edition, not the Community Edition. It also means that work done on other axes goes by default into both editions.

Our vision is that the Community Edition should provide a smooth experience for users: easy installation, easy configuration (via the simplified connectors), and fast search. There should be no particular difference on that point between the Community and Enterprise Editions.

As for connectors, the decision is made on a case-by-case basis.

We compiled below a comparison matrix of both versions, which we try to keep as exhaustive as possible:

 

 

| Theme | Feature | Datafari Community Edition | Datafari Enterprise Edition |
|---|---|---|---|
| Security | Graphical AD Configuration | NO | YES |
| | Multiple AD Org Units | NO | YES |
| | HTTPS (from web browser to Datafari proxy) | YES | YES |
| | Caching Mechanism for users roles | NO | YES |
| | Single Sign On | NO | YES |
| | OAUTH, SAML | NO | YES |
| | Datafari roles assignment | YES | YES |
| | SMB1 and SMB2 | YES | YES |
| | Optimised SMB2 | NO | YES |
| | Dedicated Apache Proxy | YES | YES |
| | Intercomponents SSL/TLS Security (optional) | NO | YES |
| Big Data | Multiservers clusterisation preconfigured | NO | YES |
| | Full dockerisation (not released yet in 6.0) | NO | YES |
| | Enhanced Data Extraction Server (External Tika) | NO | YES |
| | Zookeeper for the Connectors Framework (MCF) | YES | YES |
| | Multisource multiformats crawling connectors | YES | YES |
| | Search Aggregator to federate several Datafaris | YES | YES |
| | Third Party Search integration | NO | YES |
| | Connector for Solr source w/o security | YES | YES |
| | Connector for Solr source incl. security | NO | YES |
| | Connector for Drupal w/o security | YES | YES |
| | Connector for Drupal incl. security | NO | YES |
| | Connector for Web sources | YES | YES |
| | Connector for O365 incl. security | NO | YES |
| | Connector for Sharepoint 2010-2019 / Online w/o security | YES | YES |
| | Connector for Sharepoint 2010-2019 / Online incl. security | NO | YES |
| | Connector for Tuleap incl. security | NO | YES |
| | Connector for Documentum w/o security | YES | YES |
| | Connector for Documentum incl. security | NO | YES |
| | Connector for file share w/o security | YES | YES |
| | Connector for file share incl. security with smart incremental features | NO | YES |
| | Connector for Alfresco w/o security | YES | YES |
| | Connector for Alfresco incl. security | NO | YES |
| | Connector for DB (JDBC) w/o security | YES | YES |
| | Connector for DB (JDBC) incl. security | NO | YES |
| | Connector for Jira and Confluence w/o security | YES | YES |
| | Connector for Jira and Confluence incl. security | NO | YES |
| | Connector for XWiki incl. security | NO | YES |
| Exploitation | Backup scripts | NO | YES |
| | Basic backup and restore MCF | YES | NO |
| | Advanced backup and restore for MCF | NO | YES |
| | Advanced Monitoring | NO | YES |
| | Direct logs download | NO | YES |
| | Dashboards for exploitation monitoring | NO | YES |
| | Graphical reinitialisation functionality for MCF | NO | YES |
| | Automatic HTTPS Certificates update | NO | YES |
| | Scriptable installation and deployment (ready for Puppet or Chef) | NO | YES |
| | Email alerts to the Datafari admin | NO | YES |
| | Docker version | YES | YES |
| | VM OVA version | YES | YES |
| | Step by step graphical install procedure | YES | YES |
| Relevance | Automatic Relevancy Algorithm Optimiser | NO | YES |
| | Golden Queries Retriever | NO | YES |
| | Machine Learning for smart re-ranking | NO | YES |
| | Query time Concept Extractor | NO | YES |
| | Simple indexing time Concept Extractor | YES | YES |
| | Advanced indexing time Concept Extractor (STT based NER) | NO | YES |
| | On the fly Semantic Annotator | NO | YES |
| | Document boost | YES | YES |
| | Thesaurus | YES | YES |
| | Advanced search behavior analytics | YES | YES |
| | Configurable weights for the ranking algorithm | YES | YES |
| | Autocomplete with Entities Detection | NO | YES |
| | Department based custom search | NO | YES |
| | Tag cloud facet | NO | YES |
| Search functionalities | Saving search | YES | YES |
| | Search based Alerts | YES | YES |
| | Advanced Search | YES | YES |
| | Standard Autocomplete | YES | YES |
| | Spellchecker | YES | YES |
| | Favorites | YES | YES |
| | Search in metadata and full content | YES | YES |
| | Facets | YES | YES |
| | Exact Search | YES | YES |
| | Synonyms and stopwords configuration | YES | YES |
| | Document Preview Page | YES | YES |
| Others | Optical Character Recognition (OCR) | YES | YES |
| | Indexed corpora analytics | YES | YES |
| | Promolinks | YES | YES |
| | Duplicates detection | YES | YES |
| | GDPR Compliance tools | NO | YES |
| | Support by the makers of Datafari | NO | YES |
| | Patches and upgrades with detailed procedures | NO | YES |
| | Internationalised for indexing and search UI | YES | YES |
| | Detailed documentation | YES | YES |


 

Valid from 5.0

The documentation below is valid starting from Datafari 5.0 up to 5.3 included

The conceptual difference between the Community and Enterprise Editions relates to the following themes: security, big data, exploitation and relevance. Whenever we provide advanced functionalities along one of these axes, we tend to keep them for the Enterprise Edition rather than the Community Edition, in order to maintain a strong interest for potential customers to get the Enterprise Edition. We do not put any particular constraints on the Community Edition, so you remain free to implement yourself any functionality that we provide only in the Enterprise Edition. However, there is a high chance that we will not accept such work as a contribution to the Community Edition, as it would conflict with our own functionality.

Drilling down into these axes, the functionalities available only in our Enterprise Edition are:

  1. Security

    1. Graphical AD configuration: the possibility to set up the connection to your AD from the Admin UI;

    2. Multiple AD OU: the technical capacity for Datafari to manage several AD organisational units;

    3. SSO: integration with Kerberos and SAMLv2;

    4. Caching mechanism for user roles: caching of user connections to lower the load.

  2. Big Data

    1. Multiserver clustering: preconfigured setups for multiserver environments (including the SolrCloud capability);

    2. Full componentisation via Docker containers: all the components of Datafari (Cassandra, PostgreSQL, MCF, Solr ...) are packaged within their own container, allowing for a "micro-service" approach;

    3. Enhanced Data Extraction Server: preconfigured, with a faster indexing phase.

  3. Exploitation

    1. Backup scripts: automatically back up your MCF configuration and your Solr index. Scripts and documentation to reload the backups are provided.

    2. Advanced Monitoring: configuration of Glances, logs downloadable per component from the admin UI, and a cron job watching the processes and restarting them automatically.

    3. Monitoring Dashboards: dashboards (based on Apache Zeppelin) to analyse the technical components and to quickly find file indexing problems (such as missing files reported by an employee).

  4. Relevance

    1. Automatic Relevancy Algorithm optimizer: graphical tool that automatically calculates a locally optimal solution for a given number of fields as part of the relevancy algorithm (leveraging the Golden Query Retriever functionality).

    2. Golden Query retriever: graphical tool in the search results, to compile the golden queries, each consisting of a query and its relevant documents.

    3. Machine Learning for documents real time ranking:

    4. Query time concept extractor:

    5. Indexing time concept extractor:

    6. On the fly semantic annotator: Able to connect asynchronously to 3rd party tools, such as AI tools from Google or Amazon, to extract/analyse documents before indexation. For instance, images of cars would be enriched with metadata giving the brand and color of the car; videos would get back the transcription text.
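
As an illustration of the caching mechanism for user roles listed under Security, the sketch below shows a simple time-based cache placed in front of a directory lookup. It only illustrates the general technique and does not reflect how Datafari implements it: the `RoleCache` class, the `fetch_roles` callable and the 60-second lifetime are hypothetical.

```python
import time


class RoleCache:
    """Keeps each user's roles for ttl_seconds before asking the directory again."""

    def __init__(self, fetch_roles, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch_roles = fetch_roles  # hypothetical callable: username -> list of roles
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested without sleeping
        self._entries = {}               # username -> (expiry_time, roles)

    def get(self, username):
        now = self._clock()
        entry = self._entries.get(username)
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit: the directory is not contacted
        roles = self._fetch_roles(username)
        self._entries[username] = (now + self._ttl, roles)
        return roles


# Usage with a stand-in for the directory (AD/LDAP) lookup.
calls = []


def fake_directory_lookup(username):
    calls.append(username)
    return ["SearchUser"] if username == "alice" else []


fake_now = [0.0]
cache = RoleCache(fake_directory_lookup, ttl_seconds=60.0, clock=lambda: fake_now[0])
cache.get("alice")   # first call reaches the directory
cache.get("alice")   # served from the cache
fake_now[0] = 61.0
cache.get("alice")   # entry expired, so the directory is queried again
print(len(calls))    # 2
```

The trade-off of such a cache is that a role change in the directory only becomes visible once the cached entry expires.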

This separation means that at France Labs, by default (but there can be exceptions), work done on these axes goes only into the Enterprise Edition, not the Community Edition. It also means that work done on other axes goes by default into both editions.

Our vision is that the Community Edition should provide a smooth experience for users: easy installation, easy configuration (via the simplified connectors), and fast search. There should be no particular difference on that point between the Community and Enterprise Editions.

As for connectors, the decision is made on a case-by-case basis.

We compiled below a comparison matrix of both versions, which we try to keep as exhaustive as possible:





| Theme | Feature | Datafari Community Edition | Datafari Enterprise Edition |
|---|---|---|---|
| Security | Graphical AD Configuration | NO | YES |
| | Multiple AD Org Units | NO | YES |
| | HTTPS (from web browser to Datafari proxy) | YES | YES |
| | Caching Mechanism for users roles | NO | YES |
| | Single Sign On | NO | YES |
| | OAUTH, SAML | NO | YES |
| | Datafari roles assignment | YES | YES |
| | SMB1 and SMB2 | YES | YES |
| | Optimised SMB2 | NO | YES |
| | Dedicated Apache Proxy | YES | YES |
| | Intercomponents SSL/TLS Security (optional) | NO | YES |
| Big Data | Multiservers clusterisation preconfigured | NO | YES |
| | Full dockerisation (not released yet in 5.0) | NO | YES |
| | Enhanced Data Extraction Server (External Tika) | NO | YES |
| | Zookeeper for the Connectors Framework (MCF) | YES | YES |
| | Multisource multiformats crawling connectors | YES | YES |
| | Search Aggregator to federate several Datafaris | YES | YES |
| | Third Party Search integration | NO | YES |
| | Connector for Solr source w/o security | YES | YES |
| | Connector for Solr source incl. security | NO | YES |
| | Connector for Drupal w/o security | YES | YES |
| | Connector for Drupal incl. security | NO | YES |
| | Connector for Web sources | YES | YES |
| | Connector for O365 incl. security | NO | YES |
| | Connector for Sharepoint 2010-2019 / Online w/o security | YES | YES |
| | Connector for Sharepoint 2010-2019 / Online incl. security | NO | YES |
| | Connector for Tuleap incl. security | NO | YES |
| | Connector for Documentum w/o security | YES | YES |
| | Connector for Documentum incl. security | NO | YES |
| | Connector for file share w/o security | YES | YES |
| | Connector for file share incl. security with smart incremental features | NO | YES |
| | Connector for Alfresco w/o security | YES | YES |
| | Connector for Alfresco incl. security | NO | YES |
| | Connector for DB (JDBC) w/o security | YES | YES |
| | Connector for DB (JDBC) incl. security | NO | YES |
| | Connector for Jira and Confluence w/o security | YES | YES |
| | Connector for Jira and Confluence incl. security | NO | YES |
| | Connector for XWiki incl. security | NO | YES |
| Exploitation | Backup scripts | NO | YES |
| | Basic backup and restore MCF | YES | NO |
| | Advanced backup and restore for MCF | NO | YES |
| | Advanced Monitoring | NO | YES |
| | Direct logs download | NO | YES |
| | Dashboards for exploitation monitoring | NO | YES |
| | Graphical reinitialisation functionality for MCF | NO | YES |
| | Automatic HTTPS Certificates update | NO | YES |
| | Scriptable installation and deployment (ready for Puppet or Chef) | NO | YES |
| | Email alerts to the Datafari admin | NO | YES |
| | Docker version | YES | YES |
| | VM OVA version | YES | YES |
| | Step by step graphical install procedure | YES | YES |
| Relevance | Automatic Relevancy Algorithm Optimiser | NO | YES |
| | Golden Queries Retriever | NO | YES |
| | Machine Learning for smart re-ranking | NO | YES |
| | Query time Concept Extractor | NO | YES |
| | Simple indexing time Concept Extractor | YES | YES |
| | Advanced indexing time Concept Extractor (STT based NER) | NO | YES |
| | On the fly Semantic Annotator | NO | YES |
| | Document boost | YES | YES |
| | Thesaurus | YES | YES |
| | Advanced search behavior analytics | YES | YES |
| | Configurable weights for the ranking algorithm | YES | YES |
| | Autocomplete with Entities Detection | NO | YES |
| | Department based custom search | NO | YES |
| | Tag cloud facet | NO | YES |
| Search functionalities | Saving search | YES | YES |
| | Search based Alerts | YES | YES |
| | Advanced Search | YES | YES |
| | Standard Autocomplete | YES | YES |
| | Spellchecker | YES | YES |
| | Favorites | YES | YES |
| | Search in metadata and full content | YES | YES |
| | Facets | YES | YES |
| | Exact Search | YES | YES |
| | Synonyms and stopwords configuration | YES | YES |
| | Document Preview Page | YES | YES |
| Others | Optical Character Recognition (OCR) | YES | YES |
| | Indexed corpora analytics | YES | YES |
| | Promolinks | YES | YES |
| | Duplicates detection | NO | YES |
| | GDPR Compliance tools | NO | YES |
| | Support by the makers of Datafari | NO | YES |
| | Patches and upgrades with detailed procedures | NO | YES |
| | Internationalised for indexing and search UI | YES | YES |
| | Detailed documentation | YES | YES |


Valid up to 4.6

The documentation below is valid up to Datafari EE version 4.6

The conceptual difference between the Community and Enterprise Editions relates to the following themes: security, big data, exploitation and relevance. Whenever we provide advanced functionalities along one of these axes, we tend to keep them for the Enterprise Edition rather than the Community Edition, in order to maintain a strong interest for potential customers to get the Enterprise Edition. We do not put any particular constraints on the Community Edition, so you remain free to implement yourself any functionality that we provide only in the Enterprise Edition. However, there is a high chance that we will not accept such work as a contribution to the Community Edition, as it would conflict with our own functionality.

Drilling down into these axes, the functionalities available only in our Enterprise Edition are:

  1. Security

    1. Graphical AD configuration: the possibility to set up the connection to your AD from the Admin UI;

    2. Multiple AD OU: the technical capacity for Datafari to manage several AD organisational units;

    3. SSO: integration with Kerberos and SAMLv2;

    4. Caching mechanism for user roles: caching of user connections to lower the load.

  2. Big Data

    1. Multiserver clustering: preconfigured setups for multiserver environments (including the SolrCloud capability);

    2. Full componentisation via Docker containers: all the components of Datafari (Cassandra, PostgreSQL, MCF, Solr ...) are packaged within their own container, allowing for a "micro-service" approach;

    3. Enhanced Data Extraction Server: preconfigured, with a faster indexing phase.

    4. Dedicated Zookeeper for the Crawling Connectors Framework (MCF): allows for better performance, since the Zookeeper used by Solr no longer has to answer the frequent calls from MCF.

  3. Exploitation

    1. Backup scripts: automatically back up your MCF configuration and your Solr index. Scripts and documentation to reload the backups are provided.

    2. Advanced Monitoring: configuration of Glances, logs downloadable per component from the admin UI, and a cron job watching the processes and restarting them automatically.

    3. Monitoring Dashboards: ELK-based dashboards to analyse the technical components and to quickly find file indexing problems (such as missing files reported by an employee).

  4. Relevance

    1. Automatic Relevancy Algorithm optimizer: graphical tool that automatically calculates a locally optimal solution for a given number of fields as part of the relevancy algorithm (leveraging the Golden Query Retriever functionality).

    2. Golden Query retriever: graphical tool in the search results, to compile the golden queries, each consisting of a query and its relevant documents.

    3. Machine Learning for documents real time ranking:

    4. Query time concept extractor:

    5. Indexing time concept extractor:

    6. On the fly semantic annotator: Able to connect asynchronously to 3rd party tools, such as AI tools from Google or Amazon, to extract/analyse documents before indexation. For instance, images of cars would be enriched with metadata giving the brand and color of the car; videos would get back the transcription text.
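
The asynchronous enrichment described for the on the fly semantic annotator can be sketched as follows. This is only an illustration under stated assumptions, not the actual annotator: the two `fake_*` coroutines stand in for third-party services, and the document structure and metadata keys are invented for the example.

```python
import asyncio


async def fake_image_tagger(doc):
    """Stand-in for a remote image-analysis service (hypothetical)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"brand": "ExampleBrand", "color": "red"} if doc["type"] == "image" else {}


async def fake_transcriber(doc):
    """Stand-in for a remote speech-to-text service (hypothetical)."""
    await asyncio.sleep(0.01)
    return {"transcript": "hello world"} if doc["type"] == "video" else {}


async def annotate(doc, annotators):
    """Runs all annotators concurrently and merges their metadata into a copy of doc."""
    results = await asyncio.gather(*(annotator(doc) for annotator in annotators))
    enriched = dict(doc)
    for metadata in results:
        enriched.update(metadata)
    return enriched


async def annotate_all(docs):
    """Enriches every document before it would be handed over for indexing."""
    annotators = [fake_image_tagger, fake_transcriber]
    return await asyncio.gather(*(annotate(doc, annotators) for doc in docs))


docs = [{"id": "1", "type": "image"}, {"id": "2", "type": "video"}]
enriched_docs = asyncio.run(annotate_all(docs))
for doc in enriched_docs:
    print(doc)
```

Because the calls are awaited concurrently, the total waiting time is driven by the slowest service rather than by the sum of all calls.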


This separation means that at France Labs, by default (but there can be exceptions), work done on these axes should go only into the Enterprise Edition, not the Community Edition. It also means that work done on other axes goes by default into both editions.

Our vision is that the Community Edition should provide a smooth experience for users: easy installation, easy configuration (via the simplified connectors), and fast search. There should be no particular difference on that point between the Community and Enterprise Editions.

As for connectors, the decision is made on a case-by-case basis.

We compiled below a comparison matrix of both versions, which we try to keep as exhaustive as possible:





| Theme | Feature | Datafari Community Edition | Datafari Enterprise Edition |
|---|---|---|---|
| Security | Graphical AD Configuration | NO | YES |
| | Multiple AD Org Units | NO | YES |
| | HTTPS | YES | YES |
| | Caching Mechanism for users roles | | |
| | Single Sign On | NO | YES |
| | Datafari roles assignment | YES | YES |
| | SMB1 and SMB2 | YES | YES |
| | Optimised SMB2 | NO | YES |
| | Dedicated Apache Proxy | NO | YES |
| Big Data | Multiservers clusterisation preconfigured | NO | YES |
| | Full dockerisation (not released yet in 4.2) | NO | YES |
| | Enhanced Data Extraction Server | NO | YES |
| | Zookeeper for the Connectors Framework (MCF) | NO | YES |
| | Multisource multiformats crawling connectors | YES | YES |
| | Cross Data Center Replication | NO | YES |
| Exploitation | Backup scripts | NO | YES |
| | Backup and restore MCF | NO | YES |
| | Advanced Monitoring | NO | YES |
| | Direct logs download | NO | YES |
| | Dashboards for monitoring | NO | YES |
| Relevance | Automatic Relevancy Algorithm Optimiser | NO | YES |
| | Golden Queries Retriever | NO | YES |
| | Machine Learning for smart re-ranking | NO | YES |
| | Query time Concept Extractor (in progress) | NO | YES |
| | Indexing time Concept Extractor | YES | YES |
| | On the fly Semantic Annotator | NO | YES |
| | Document boost | YES | YES |
| | Thesaurus | YES | YES |
| | Standard search behavior analytics | YES | YES |
| | Advanced search behavior analytics | NO | YES |
| | Configurable weights for the ranking algorithm | YES | YES |
| | Autocomplete with Entities Detection | NO | YES |
| | Department based custom search | NO | YES |
| Search functionalities | Saving search | NO | YES |
| | Search based Alerts | YES | YES |
| | Advanced Search | YES | YES |
| | Standard Autocomplete | YES | YES |
| | Spellchecker | YES | YES |
| | Favorites | YES | YES |
| | Search in metadata and full content | YES | YES |
| | Facets | YES | YES |
| | Exact Search | YES | YES |
| | Synonyms and stopwords configuration | YES | YES |
| Others | Optical Character Recognition (OCR) | YES | YES |
| | Indexed corpora analytics | YES | YES |
| | Expert based support | NO | YES |
| | Patches and upgrades | NO | YES |
| | Promolinks | YES | YES |
| | Internationalised for indexing and search UI | YES | YES |
| | Detailed documentation | YES | YES |