...
For users of the Datafari Community Edition: one of the options for Datafari is to use the Tika embedded in ManifoldCF. Technically, Datafari uses the Apache Solr Extracting Request Handler (aka Solr Cell), which leverages the Apache Tika embedded in Solr to extract the content to be indexed from the crawled files. Still, to limit resource consumption (especially network usage if ManifoldCF is installed on an external server), it is possible to use Tika directly in ManifoldCF. In this case, the content is extracted directly in ManifoldCF, and only the content that should be indexed is sent to Apache Solr. A Tika Transformation Connection is configured by default in Datafari. To use it, you simply have to add it to your crawling job in ManifoldCF:
...