Datafari lets you link an ontology to it, so that your documents are enriched with ontology data at indexing time and this additional information can then be used at search time.
To do so, first add the OntologyUpdateProcessor as a custom UpdateProcessor in the file {Datafari_Home}/solr/solr_home/FileShare/conf/customs_solrconfig/custom_update_processors.xml.
For each document inserted into Solr, the OntologyUpdateProcessor indexes the labels of the corresponding node in the specified ontology, as well as the labels and URIs of its parents and children.
It has several parameters that can be set:
annotationField : [REQUIRED] the field in the input document that contains the annotation URI used as the reference in the ontology.
Note that you must add this field to the document yourself: Datafari does not provide a plugin or any other mechanism to do it. You may want to modify the DatafariUpdateProcessor or write your own UpdateProcessor for that purpose.
ontologyURI : [REQUIRED] the location of the ontology, such as http://francelabs.com/ontology/owl.rdf or file:///ontology/owl.rdf
Datafari only supports the OWL ontology format for now, so the ontologyURI parameter must reference an OWL-formatted ontology.
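For illustration, a minimal OWL ontology in RDF/XML might look like the sketch below. The example.org namespace, the class names and the labels are all made up. The sketch also assumes that labels are expressed with rdfs:label and parent/child relations with rdfs:subClassOf, which this page does not confirm for Datafari.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical ontology: namespace, classes and labels are illustrative only -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/ontology"/>

  <!-- Parent node -->
  <owl:Class rdf:about="http://example.org/ontology#Vehicle">
    <rdfs:label xml:lang="en">Vehicle</rdfs:label>
    <rdfs:label xml:lang="fr">Véhicule</rdfs:label>
  </owl:Class>

  <!-- Child node, linked to its parent with rdfs:subClassOf (assumption) -->
  <owl:Class rdf:about="http://example.org/ontology#Car">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Vehicle"/>
    <rdfs:label xml:lang="en">Car</rdfs:label>
    <rdfs:label xml:lang="fr">Voiture</rdfs:label>
  </owl:Class>
</rdf:RDF>
```

Under these assumptions, a document annotated with http://example.org/ontology#Car would have the node http://example.org/ontology#Vehicle as its parent.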
If you decide not to use the default values for the fields, do not forget to add them to the appropriate custom schema file(s), which are located in {Datafari_Home}/solr/solr_home/FileShare/conf/customs_schema
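As a sketch, custom fields could be declared as follows in one of the custom schema files. The field names are the ones used as examples in the search configuration further down this page. The text_general type and the attribute values are assumptions that you should adapt to your own schema.

```xml
<!-- Hypothetical field declarations: adapt names, type and attributes to your setup -->
<field name="ontology_labels" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="ontology_children_labels" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="ontology_parents_labels" type="text_general" indexed="true" stored="true" multiValued="true"/>
```

The fields are declared as multiValued here because a node may carry several labels and have several parents or children.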
To implement the OntologyUpdateProcessor, follow these steps:
open the 'custom_libs.xml' file in {Datafari_installation_folder}/solr/solr_home/FileShare/conf/customs_solrconfig/
add the following line (replace the '<to_remove_tag>'):
<lib dir="./lib/jena"/>
Datafari uses Apache Jena to load the ontology.
add the following lines to the custom_update_processors.xml file mentioned above (again, replace the '<to_remove_tag>'):
<processor class="com.francelabs.datafari.updateprocessor.OntologyUpdateProcessorFactory">
  <bool name="enabled">true</bool>
  <str name="annotationField">ontology_annotation</str>
  <!-- Location of the ontology -->
  <str name="ontologyURI">file:///owl.rdf</str>
</processor>
Of course, you have to replace these values with your own, and add any of the optional parameters described above that you need.
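To make the role of annotationField concrete, here is what a document carrying the annotation could look like in Solr's XML update format. This is only a sketch: the id value and the annotation URI are hypothetical, and in a real Datafari setup the field would be filled in by your own UpdateProcessor rather than posted by hand.

```xml
<add>
  <doc>
    <!-- Hypothetical document identifier -->
    <field name="id">doc-1</field>
    <!-- Annotation URI (hypothetical) that the processor looks up in the ontology;
         the field name matches the annotationField value configured above -->
    <field name="ontology_annotation">http://example.org/ontology#Car</field>
  </doc>
</add>
```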
Then, you can configure Datafari to use the new ontology information in search requests:
Modify the following lines: set the ontologyEnabled property to true and adjust the other properties to match your ontology configuration:
ontologyEnabled=true
ontologyLanguageSelection=true
ontologyNodeLabels=ontology_labels
ontologyChildrenLabels=ontology_children_labels
ontologyParentsLabels=ontology_parents_labels
ontologyEnabled : (boolean) enables or disables the use of the ontology
ontologyLanguageSelection : (boolean) whether to use the language selected in Datafari for the ontology labels. If true, make sure that the OntologyUpdateProcessor was used with the parameter userLanguages set to true while MCF was crawling the documents
ontologyNodeLabels : the field in your schema that contains the annotation's label(s)
ontologyChildrenLabels : the field in your schema that contains the labels of the children
ontologyParentsLabels : the field in your schema that contains the labels of the parents
Your Datafari is now ready to use the additional ontology fields in search queries, thanks to the OntologySuggestion.widget