...

Valid from 4.0.0

The documentation below applies to Datafari v4.0.0 and later.

By default, the 'DatafariSolr' output connector, which Datafari pre-configures in MCF, sends all documents to Solr's /update/extract handler. This handler uses an embedded Tika to parse each incoming document before indexing it, even if the document has already been parsed by a Tika connector or a Tika service connector that you may have configured in the crawl job. This second parsing can alter the content of the document, for example for XML, CSV or JSON files, and it consumes resources and processing time that could be avoided.
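To make the default path concrete, here is a minimal sketch of the kind of URL a client targets when it posts a raw document to Solr's /update/extract handler. It is an illustration only, not the MCF connector's code: the base URL, the collection name 'FileShare', the document id and the helper name extract_handler_url are all assumptions chosen for the example. The literal.id parameter is the standard Solr Cell way of setting a field value directly.

```python
from urllib.parse import urlencode


def extract_handler_url(solr_base, collection, doc_id):
    """Build the URL used to post a raw document to Solr's /update/extract handler.

    Hypothetical helper: the base URL and collection are example values,
    not settings read from a Datafari installation.
    """
    # literal.<field> sets a field value as-is; the posted file body itself
    # is still parsed by the Tika embedded in the handler.
    params = urlencode({"literal.id": doc_id})
    return f"{solr_base.rstrip('/')}/{collection}/update/extract?{params}"


# Example values (assumptions): a local Solr and a collection named 'FileShare'.
url = extract_handler_url(
    "http://localhost:8983/solr", "FileShare", "file://share/report.xml"
)
print(url)
```

Any document posted to a URL of this shape goes through the handler's embedded Tika, whether or not it was already parsed upstream in the crawl job.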

...

The handler Java classes are:

  • com.francelabs.datafari.handler.parsed.ParsedContentHandler

  • com.francelabs.datafari.handler.parsed.ParsedDocumentLoader

  • com.francelabs.datafari.handler.parsed.ParsedRequestHandler

They are located in the 'datafari-handler' module of the Datafari GitHub project.

...