Info

Valid from Datafari X.X

The goal of this connector is to ingest data from Solr.

...

The Solr ingester connector is therefore useful because it lets us control the field mappings between the source and the target, control the indexing frequency and the incremental indexing, and rely on an MCF authority connector to manage security.

User documentation

  • Configure the Solr ingester repository connector

Connection type : choose Solr ingester

In the Solr ingester tab, you have the following parameters :

-URL Solr : the URL of the Solr instance you want to index (for example : http://localhost:8983/solr)

-Connection timeout : 60 000 ms by default

-Socket timeout : 180 000 ms by default

...

  • Configure the job related to the Solr ingester repository connector

The tabs that apply specifically to this connector are :

...

-Security:
In this tab you can choose whether to take security into account. If you check the "security activated" checkbox, you must fill in the "Security field" textbox with the name of the field in the source Solr that holds the security information. Its value is then stored in the MCF-related field allow_token_document.
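
As a rough illustration of the behaviour described above, the following Python sketch copies the values of the configured security field into allow_token_document. The function name `attach_allow_tokens` and the sample field name `acl` are hypothetical, and the connector itself is an MCF component, so this only mirrors the idea and is not its actual code.

```python
def attach_allow_tokens(solr_doc, security_field):
    """Return a copy of solr_doc with allow_token_document filled in.

    Assumption: the security field holds either one token (a string)
    or a list of tokens. A missing field yields an empty token list.
    """
    values = solr_doc.get(security_field, [])
    if isinstance(values, str):
        values = [values]
    output = dict(solr_doc)  # leave the source document untouched
    output["allow_token_document"] = list(values)
    return output


# Hypothetical source document whose "acl" field holds the security tokens.
doc = attach_allow_tokens({"id": "1", "acl": ["group1", "group2"]}, "acl")
```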

...

-Parameters

Collection name (mandatory) : enter the name of the collection you want to index

Field mappings (optional) : indicate the mappings you want between the fields extracted from the source Solr and the fields of your output repository connector.

ID field (mandatory) : the unique key field of the source Solr

Date field (mandatory) : the field that stores the date (NB : it is used for the incremental indexing. If your source Solr has no date field, add one to Solr and give it a default value such as "NOW")

Content field (mandatory) : the field that contains the "content"

Filter condition (optional) : you can filter the documents that you want to crawl from the source Solr. For now only one condition is supported, and the syntax must be "field:condition", for example "inStock:true"
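
To make the parameters above more concrete, here is a minimal Python sketch of how they could be turned into a Solr request and how field mappings could be applied to a returned document. The function names (`parse_filter_condition`, `build_query_params`, `apply_field_mappings`) are hypothetical, and restricting the returned fields with `fl` is an assumption made for the example. Only the standard Solr parameters `q`, `fq` and `fl` are relied on; this is not the connector's own implementation.

```python
def parse_filter_condition(condition):
    """Split a single "field:condition" filter into (field, value)."""
    field, sep, value = condition.partition(":")
    if not sep or not field.strip() or not value.strip():
        raise ValueError('filter must look like "field:condition"')
    return field.strip(), value.strip()


def build_query_params(id_field, date_field, content_field, filter_condition=None):
    """Build Solr select parameters from the job parameters."""
    params = {
        "q": "*:*",
        # Assumption: only the three mandatory fields are requested here.
        "fl": ",".join([id_field, date_field, content_field]),
    }
    if filter_condition:
        field, value = parse_filter_condition(filter_condition)
        params["fq"] = f"{field}:{value}"
    return params


def apply_field_mappings(doc, mappings):
    """Rename source fields to target fields; unmapped fields keep their name."""
    return {mappings.get(name, name): value for name, value in doc.items()}


params = build_query_params("id", "last_modified", "content", "inStock:true")
mapped = apply_field_mappings({"id": "1", "name": "Solr"}, {"name": "title"})
```

The collection name is not part of these parameters: with a standard Solr setup it would appear in the request path, after the URL configured in the repository connector.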

...

Technical documentation

  • AddSeedDocuments

In the addSeedDocuments method, we perform a global query on the source Solr (using a cursorMark query). It gathers all the document IDs, optionally filtered by the condition entered by the user.
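
The cursorMark pattern mentioned above can be sketched as follows in Python. This is an illustration of Solr's cursor-based paging, not the connector's code: `collect_ids` and the `fetch_page` callable are hypothetical, and `fetch_page` stands in for whatever performs the HTTP request to the select handler and returns the parsed JSON response. A cursor query starts with `cursorMark=*`, must sort on the unique key, and is finished when Solr returns the same `nextCursorMark` that was sent.

```python
def collect_ids(fetch_page, id_field="id", rows=100, filter_query=None):
    """Gather every document ID from a Solr collection using cursorMark paging.

    fetch_page (hypothetical) takes a dict of Solr parameters and returns
    the decoded JSON response as a dict.
    """
    ids = []
    cursor = "*"  # initial cursor value defined by Solr
    while True:
        params = {
            "q": "*:*",
            "fl": id_field,
            "sort": f"{id_field} asc",  # cursors require a sort on the unique key
            "rows": rows,
            "cursorMark": cursor,
        }
        if filter_query:
            params["fq"] = filter_query
        response = fetch_page(params)
        ids.extend(doc[id_field] for doc in response["response"]["docs"])
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor means no more results
            break
        cursor = next_cursor
    return ids


def fake_fetch_page(params):
    """In-memory stand-in for Solr, used only to exercise the loop."""
    pages = {"*": (["a", "b"], "c1"), "c1": (["c"], "c2"), "c2": ([], "c2")}
    docs, next_cursor = pages[params["cursorMark"]]
    return {
        "response": {"docs": [{"id": d} for d in docs]},
        "nextCursorMark": next_cursor,
    }


all_ids = collect_ids(fake_fetch_page, rows=2)
```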

  • ProcessDocuments

For each document identifier, we run a Solr query and check whether the document is present in the MCF database.

...