Solr configuration in Datafari

Valid from Datafari X.X

Datafari interacts with Solr in different ways depending on the feature used in the admin UI. Either it pushes the configuration files to ZooKeeper and then reloads the Solr collection, or it calls one of the Solr APIs.

This page details, for each Datafari feature, how it is integrated with Solr, and gives some technical background on these options.

Datafari use cases

| Feature | Solr API / Files | Needs | File (located in $DATAFARI_HOME/solr/solrcloud/$MAIN_COLLECTION/conf/) |
| --- | --- | --- | --- |
| Highlight size | Request Parameters API | Automatically taken into account | params.json |
| Autocomplete threshold | Config API | Automatically taken into account | configoverlay.json |
| Document boost | File modified locally on the Datafari server | Automatically taken into account: the elevate.xml file is uploaded to ZooKeeper and the collection is reloaded | elevate.xml |
| Synonyms | File modified locally on the Datafari server | Automatically taken into account: the synonyms_xx.txt file is uploaded to ZooKeeper and the collection is reloaded | synonyms_xx.txt |
| Stopwords | File modified locally on the Datafari server | Automatically taken into account: the stopwords_xx.txt file is uploaded to ZooKeeper and the collection is reloaded | stopwords_xx.txt |
| Protwords | File modified locally on the Datafari server | Automatically taken into account: the protwords.txt file is uploaded to ZooKeeper and the collection is reloaded | protwords.txt |
| Field weight | Request Parameters API | Automatically taken into account | params.json |
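To illustrate the Request Parameters API rows of the table, here is a minimal sketch of the kind of request that ends up in params.json. It only builds the request and does not send it. The Solr base URL, the collection name `FileShare`, and the paramset name `mySearch` are assumptions made for the example, not values taken from Datafari; `hl.fragsize` is the standard Solr highlighting parameter for the fragment size.

```python
import json
import urllib.request


def build_paramset_request(solr_url, collection, paramset, params):
    """Build (without sending) a Request Parameters API call that sets a paramset."""
    url = f"{solr_url.rstrip('/')}/{collection}/config/params"
    body = json.dumps({"set": {paramset: params}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All values below are illustrative assumptions, not Datafari's real settings.
request = build_paramset_request(
    "http://localhost:8983/solr",  # assumed Solr base URL
    "FileShare",                   # assumed main collection name
    "mySearch",                    # hypothetical paramset name
    {"hl.fragsize": 200},          # highlight fragment size, in characters
)
print(request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Because Solr stores paramsets in params.json itself, no file upload or collection reload is needed for this kind of change.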

List of Solr APIs

| API name | Type | Description | Use in Datafari | URL |
| --- | --- | --- | --- | --- |
| Schema API | Schema | Schema editing. Uses the managed-schema file. | Yes | https://lucene.apache.org/solr/guide/7_5/schema-api.html |
| Config API | Configuration API | The Config API enables manipulating various aspects of your solrconfig.xml using REST-like API calls. | Yes | https://lucene.apache.org/solr/guide/7_5/config-api.html |
| Request Parameters API | Configuration API | The Request Parameters API allows creating parameter sets, a.k.a. paramsets, that can override or take the place of parameters defined in solrconfig.xml. | Yes | https://lucene.apache.org/solr/guide/7_5/request-parameters-api.html |
| Blob Store API | Configuration API | The Blob Store REST API provides REST methods to store, retrieve or list files in a Lucene index. | No | https://lucene.apache.org/solr/guide/7_5/blob-store-api.html |
| Managed Resources | Configuration API | Managed resources expose a REST API endpoint for performing Create-Read-Update-Delete (CRUD) operations on a Solr object. Only supports StopWords and Synonyms (since 2014...) | No | https://lucene.apache.org/solr/guide/7_5/managed-resources.html |
| Collections API | SolrCloud | The Collections API is used to create, remove, or reload collections. | Yes | https://lucene.apache.org/solr/guide/7_5/collections-api.html |
| Configsets API | SolrCloud | The Configsets API enables you to upload new configsets to ZooKeeper, create, and delete configsets when Solr is running SolrCloud mode. | Yes | https://lucene.apache.org/solr/guide/7_5/configsets-api.html |
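As an illustration of the Config API entry, the sketch below builds a `set-user-property` command, one of the Config API commands whose result Solr persists in configoverlay.json. It is not a description of the exact command Datafari sends: the property name `autocomplete.threshold`, its value, the base URL, and the collection name `FileShare` are all hypothetical.

```python
import json
import urllib.request


def config_api_url(solr_url, collection):
    """Return the Config API endpoint of a collection."""
    return f"{solr_url.rstrip('/')}/{collection}/config"


def build_user_property_request(solr_url, collection, name, value):
    """Build (without sending) a Config API call that sets one user property."""
    body = json.dumps({"set-user-property": {name: value}}).encode("utf-8")
    return urllib.request.Request(
        config_api_url(solr_url, collection),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values chosen for the example only.
base_url = "http://localhost:8983/solr"
config_request = build_user_property_request(
    base_url, "FileShare", "autocomplete.threshold", "0.005"
)
# The overlay endpoint returns only what is stored in configoverlay.json.
overlay_url = config_api_url(base_url, "FileShare") + "/overlay"

print(config_request.full_url)
print(config_request.data.decode("utf-8"))
print(overlay_url)
```

Reading the overlay URL after a change is a simple way to check what the Config API has recorded.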

Use of ZooKeeper to manage configuration files

As explained above, some Datafari configuration files are stored locally, then pushed to ZooKeeper. The collection is then reloaded so that the changes are taken into account.

For example, for synonyms:

1) The user adds synonyms for the French language in the Synonyms admin UI.
2) The user clicks the confirm button.
3) The modifications are stored in the file $DATAFARI_HOME/solr/solrcloud/$MAIN_COLLECTION/conf/synonyms_fr.txt.
4) The file synonyms_fr.txt is pushed to ZooKeeper.
5) The Solr collection is reloaded.

https://lucene.apache.org/solr/guide/7_5/using-zookeeper-to-manage-configuration-files.html
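Steps 4 and 5 above can be reproduced by hand with standard Solr tooling. The sketch below only builds the `bin/solr zk cp` command and the Collections API RELOAD URL; it does not claim to be what Datafari runs internally. The install path `/opt/datafari`, the configset and collection name `FileShare`, and the ZooKeeper and Solr addresses are assumptions for the example.

```python
import posixpath
import urllib.parse


def zk_copy_command(solr_bin, local_file, config_name, zk_host):
    """Return the `bin/solr zk cp` command that uploads one config file."""
    file_name = posixpath.basename(local_file)
    return [
        solr_bin, "zk", "cp",
        f"file:{local_file}",
        f"zk:/configs/{config_name}/{file_name}",
        "-z", zk_host,
    ]


def reload_url(solr_url, collection):
    """Return the Collections API URL that reloads a collection."""
    query = urllib.parse.urlencode({"action": "RELOAD", "name": collection})
    return f"{solr_url.rstrip('/')}/admin/collections?{query}"


# Assumed locations and names; adapt them to the actual installation.
command = zk_copy_command(
    "/opt/datafari/solr/bin/solr",
    "/opt/datafari/solr/solrcloud/FileShare/conf/synonyms_fr.txt",
    "FileShare",
    "localhost:2181",
)
print(" ".join(command))
print(reload_url("http://localhost:8983/solr", "FileShare"))
# To run for real: subprocess.run(command, check=True), then open the reload URL.
```

The reload is what makes Solr re-read the configuration from ZooKeeper, which is why the upload alone is not enough.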