Info

Valid from Datafari 4.0

Protwords, or protected words, are a list of words that are protected from the stemmers.

As a reminder of how stemming works in Solr, consider the following example: suppose you send these documents to Solr, in the field content_en of your main Solr collection:

  1. The man ran to the beach
  2. The man is running to the beach.
  3. The man will run to the beach.
  4. The man runs to the beach every day.
  5. The man wants to be a runner.

With no stemming on the field, if the user searches for the term 'run', only document 3 will appear in the results list. On the other hand, if an aggressive stemmer is in place, most of the documents will appear in the results list (different stemmers exist, and the choice of stemmer influences the results list).

In document 2, 'running' will be transformed into the 'run' prefix; in document 4, 'runs' will be transformed into the 'run' prefix.

If we put the term 'running' in the protected words file, the term will be protected from the stemmers in the content_xx fields of Datafari (xx being the language). The term 'running' will therefore not be turned into the 'run' prefix by the stemmer.

Concretely, it means that the user will have to enter the exact term 'running' to retrieve the Solr document that contains this term.
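The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_stem`, `analyze` and `search` are hypothetical helpers written for this example, and the naive suffix stripper stands in for a real Solr stemmer (it does not, for instance, relate 'ran' or 'runner' to 'run').

```python
import re

# The five example documents, keyed by document number.
DOCS = {
    1: "The man ran to the beach",
    2: "The man is running to the beach.",
    3: "The man will run to the beach.",
    4: "The man runs to the beach every day.",
    5: "The man wants to be a runner.",
}


def toy_stem(token):
    """Naive suffix stripper, for illustration only (not a Solr stemmer)."""
    if token.endswith("ing") and len(token) > 5:
        stem = token[:-3]
        # "running" -> "runn" -> "run": drop a doubled final consonant.
        if len(stem) >= 2 and stem[-1] == stem[-2]:
            stem = stem[:-1]
        return stem
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text, stem=True, protected=frozenset()):
    """Lowercase and tokenize, then stem every token that is not protected."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not stem:
        return set(tokens)
    return {t if t in protected else toy_stem(t) for t in tokens}


def search(query, stem=True, protected=frozenset()):
    """Return the numbers of the documents containing every query term."""
    terms = analyze(query, stem, protected)
    return sorted(
        number
        for number, text in DOCS.items()
        if terms <= analyze(text, stem, protected)
    )


print(search("run", stem=False))                   # no stemming
print(search("run"))                               # toy stemming
print(search("run", protected={"running"}))        # 'running' protected
print(search("running", protected={"running"}))    # exact term required
```

With the toy stemmer, protecting 'running' removes document 2 from the results for 'run', and only the exact query 'running' brings it back. The same analysis is applied to the documents and to the query, which is why the protected term has to be typed exactly.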

Search Expert: managing protwords


Protwords are not language specific. You need to select "ALL" in the language selection. Once this is done, you get an interface allowing you to edit the protwords list. Note that only one search expert at a time can edit this file. Any other simultaneous attempt will end with an error message on the screen.

Here you can delete/add protwords directly by editing the text file. Simply enter one protword per line.
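As a rough sketch of the one-protword-per-line format, the snippet below reads such a list in Python. The function name `parse_protwords` and the sample content are hypothetical. Skipping blank lines and lines starting with '#' is an assumption modelled on the usual Solr word-list convention, not something stated on this page.

```python
def parse_protwords(text):
    """Return the protected words listed in `text`, one per line."""
    words = []
    for line in text.splitlines():
        word = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not word or word.startswith("#"):
            continue
        words.append(word)
    return words


# Hypothetical file content, for illustration only.
example = """# words protected from the stemmers
running
runner
"""

print(parse_protwords(example))
```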

Once you are ok with your modifications, click on the 'Confirm' button. The modifications are immediately taken into account with no further action (the file is sent to Zookeeper, then a reload of the Solr collection is performed).