Info |
---|
Valid for 4.0: The documentation below is valid from Datafari 4.0 upwards |
When indexing documents, Datafari tries to identify the type of each file so that it can register the correct extension with the document. The registered extension powers the file type facet, which lets users filter the results of their query by file type. This page describes how Datafari performs this detection.
...
Warning | ||
---|---|---|
Be aware that using the default behavior on websites that have URLs like:
will result in a php type for this document (and for all other documents retrieved through the same script). For web crawling in general, it is advised to use the alternative configuration, which gives priority to the type extracted by Tika. |
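The pitfall described in this warning can be illustrated with a minimal Python sketch. It is not Datafari's actual implementation: the URL, the function names, and the MIME type mapping are all hypothetical and only serve to show why an extension taken from the URL yields php, and how giving priority to a detected content type (such as the one Tika extracts) avoids it.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from a detected MIME type to a file extension.
# Illustrative only; it does not reflect Datafari's or Tika's internal tables.
MIME_TO_EXTENSION = {
    "application/pdf": "pdf",
    "text/html": "html",
}


def extension_from_url(url: str) -> str:
    """Return the extension of the URL path, ignoring the query string."""
    path = urlparse(url).path
    return PurePosixPath(path).suffix.lstrip(".").lower()


def extension_with_content_type_priority(url: str, mime_type: Optional[str]) -> str:
    """Prefer the detected content type; fall back to the URL extension."""
    if mime_type in MIME_TO_EXTENSION:
        return MIME_TO_EXTENSION[mime_type]
    return extension_from_url(url)


# Hypothetical URL: a PHP script that serves a PDF document.
url = "https://example.com/download.php?file=report.pdf"

url_based = extension_from_url(url)
content_based = extension_with_content_type_priority(url, "application/pdf")
print(url_based, content_based)
```

With the URL-only approach, every document served by the same script ends up with the script's extension, whatever its real content is. Falling back to the URL extension when no content type is known keeps the behavior unchanged for ordinary file names.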
Alternative Configuration
...
This parameter is under the "datafari" updateRequestProcessorChain (near line 1180, though the exact line number may vary between versions).
Please refer to Manage Solr configuration with Zookeeper System Configuration Manager (Zookeeper) to find out where the configuration files are located and how to reload the configuration correctly.
...