This documentation explains how to add entity recognition to your Datafari using an external FastAPI service (which, in our case, gives access to spaCy models) to extract entities. It covers the whole chain, from adding entity extraction to the indexing pipeline to using the extracted entities in an autocomplete component that helps users with their searches.
Setting Up the FastAPI Server
Follow the documentation Setting up a server to host Spacy for Named Entity Recognition to set up this service.
Adding Entity Extraction to a Job
In this step, you need to create an instance of the transformation connector, add this transformation to your job, and then run the job to extract the entities. These steps are explained in https://datafari.atlassian.net/wiki/pages/resumedraft.action?draftId=2469920769.
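Once the job has run, it can be useful to check that entities were actually indexed before configuring autocomplete. The sketch below is one possible way to do so, by faceting on the field that stores the entities with the standard Solr select parameters. It assumes the field is entity_keyword (the name used in the next section); the Solr base URL and the collection name shown in the usage note are placeholders to adapt to your installation, and the helper names are purely illustrative.

```python
from urllib.parse import urlencode


def entity_facet_url(base_url: str, collection: str,
                     field: str = "entity_keyword", limit: int = 10) -> str:
    """Build a Solr select URL that facets on the entity field.

    base_url and collection are placeholders (assumptions), not values
    taken from the Datafari documentation.
    """
    params = {
        "q": "*:*",            # match every document
        "rows": 0,             # we only want facet counts, not documents
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.mincount": 1,   # hide values with no document
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


def top_entities(response: dict, field: str = "entity_keyword") -> dict:
    """Turn Solr's flat [value, count, value, count, ...] facet list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[::2], flat[1::2]))
```

For example, `entity_facet_url("http://localhost:8983/solr", "FileShare")` returns a URL you can open in a browser or fetch with any HTTP client (host, port and collection name are assumptions here). If `top_entities` returns an empty dict for the JSON response, the job probably did not populate the field.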
Adding a Search Component and Search Handler in Solr for Autocomplete
In this section, we assume that the field storing the entities you want autocomplete on is entity_keyword.
...
Our search component and search handler should now be ready to go. Next, we will configure Datafari to serve the new suggest endpoints and add an autocomplete suggester to DatafariUI.
Configuring Datafari to Serve the New Suggest Request Handler
Because the request handler we added is used for suggestions and autocomplete, we need to modify the file ${DATAFARI_HOME}/tomcat/conf/entity-autocomplete.properties.
...
Save the file once you are done. You should not need to restart anything for that change to be effective.
Adding an Autocomplete Suggester in DatafariUI
Now that everything is ready, we can add an autocomplete suggester in DatafariUI. To do so, open the ${DATAFARI_HOME}/www/ui-config.json file and modify the suggester array in the searchBar object:
...
As you can see, we added a suggester of type ENTITY which refers to the field and the different components we need. We recommend leaving asFacet set to false unless you know the exact behavior of this option. You can modify the maximum number of suggestions shown, the title and the subtitle as you see fit. Title and subtitle text can also be added to the translation files if you need to display them in different languages (use the text you put in the ui-config file as the key, and the translated text as the value, in each language file you need).
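To illustrate the key and value convention for translations, here is a minimal sketch of the lookup, not DatafariUI's actual code. The title text "Entities suggested", the language codes and the `translate` helper are hypothetical, and the fallback to the untranslated text when no entry exists is an assumption about how the UI behaves.

```python
# Hypothetical content of the language files: one dict per language.
# Key   = the exact text written in ui-config.json (title or subtitle).
# Value = the text to display in that language.
TRANSLATIONS = {
    "fr": {"Entities suggested": "Entités suggérées"},
    "en": {},  # no entry needed when the key is already the English text
}


def translate(text: str, lang: str) -> str:
    """Return the translation of text for lang, or text itself if none exists."""
    return TRANSLATIONS.get(lang, {}).get(text, text)


print(translate("Entities suggested", "fr"))
print(translate("Entities suggested", "en"))
```

The practical consequence is that the key must match the ui-config text exactly, including case and spacing; otherwise the lookup misses and, under the fallback assumed here, the original text is shown.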
Result
That’s it: you now have an autocomplete section covering your entity type in DatafariUI.
...