Valid from 5.0

The documentation below is valid from Datafari 5.0 upwards

For now, the Datafari user interface (menus and functionalities) is preconfigured for English, French, German, Italian, Arabic, Brazilian Portuguese and Russian. These are not to be confused with the languages that can be analysed at the indexing phase: see further below for the default languages that can be analysed at the indexing and search phases.

Step-by-step guide for an existing Datafari

For the internationalization of the user interface :

  1. The i18n config files are stored in this folder : /opt/datafari/tomcat/webapps/Datafari/resources/js/AjaxFranceLabs/locale. Add a file for your new language there. For example, to add a German translation, create a file named de.json.
    Also add an entry for your new language to all the existing i18n files. For example, to add Spanish, we added the following entry to all the i18n files :

    "es_locale" : "Español" 

  2. For now, the new language must also be declared in a Java class, so you need to download the Datafari source code, modify the class and upload the compiled result to your Datafari server. The class to modify is com.francelabs.datafari.utils.LanguageUtils.java, located in the datafari-webapp module. Add your language to the following line :

    public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru");

    Then recompile the source code. When you are done, copy the file $YOUR_DATAFARI_PROJECT/datafari-webapp/target/classes/com/francelabs/datafari/utils/LanguageUtils.class to this location on your Datafari server : /opt/datafari/tomcat/webapps/Datafari/WEB-INF/classes/com/francelabs/datafari/utils.
    Then restart Datafari to load your changes.

  3. Add the new language in /opt/datafari/tomcat/webapps/Datafari/resources/js/AjaxFranceLabs/i18njs.js :

    availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ],


  4. For Datafari Enterprise Edition only : create a file named after the new locale (LOCALE.json, for example es.json) in the folder /opt/datafari/tomcat/webapps/Datafari/resources/customs/i18n
    Add this content to it :

    {}
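
Adding the "xx_locale" entry to every existing i18n file (step 1 above) is repetitive, so it can be scripted. The following is only a minimal sketch: the function name add_locale_label is hypothetical, and it assumes that each locale file is a flat JSON object encoded in UTF-8, which should be checked against your own files first.

```python
import json
from pathlib import Path


def add_locale_label(locale_dir, lang_code, label):
    """Add the "<lang_code>_locale" entry to every *.json file in locale_dir.

    Assumption: each file holds one flat JSON object encoded in UTF-8.
    Returns the names of the files that were actually modified.
    """
    key = f"{lang_code}_locale"
    changed = []
    for path in sorted(Path(locale_dir).glob("*.json")):
        entries = json.loads(path.read_text(encoding="utf-8"))
        if entries.get(key) == label:
            continue  # entry already present, leave the file untouched
        entries[key] = label
        # ensure_ascii=False keeps characters such as "ñ" readable in the file
        path.write_text(
            json.dumps(entries, ensure_ascii=False, indent=2) + "\n",
            encoding="utf-8",
        )
        changed.append(path.name)
    return changed
```

For Spanish, the call would be add_locale_label("/opt/datafari/tomcat/webapps/Datafari/resources/js/AjaxFranceLabs/locale", "es", "Español"). The script rewrites the formatting of each file it touches, so keep a backup of the folder before running it.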

For the internationalization of the language detection, indexing and search by the Datafari Solr engine :

  1. Modify the Solr schema to handle the new language

    1. Add the stopwords file specific to your language. To do so, download the Solr release used by the current version of Datafari here : https://archive.apache.org/dist/lucene/solr/8.5.2/solr-8.5.2.tgz then open the folder solr-8.5.2/server/solr/configsets/_default/conf/lang and look for the stopwords file corresponding to your language.
      Then copy it to this location on your Datafari server : /opt/datafari/solr/solrcloud/FileShare/conf/lang/
      Then push the file to Zookeeper. To do so, go to the admin UI of Datafari, open the Search Engine Configuration / System Configuration manager menu, click on the push my modifications button, then on apply my modifications (see the page System Configuration Manager (Zookeeper) for more explanations).

    2. Add the fieldType of your language. To do so, download the Solr release used by the current version of Datafari here : https://archive.apache.org/dist/lucene/solr/8.5.2/solr-8.5.2.tgz then open the file solr-8.5.2/server/solr/configsets/_default/conf/managed-schema and look for the fieldType corresponding to your language.
      For example, to add the Russian language, the Solr configuration is this one :

       <!-- Russian -->
          <fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
            <analyzer> 
              <tokenizer class="solr.StandardTokenizerFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
              <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" />
              <filter class="solr.SnowballPorterFilterFactory" language="Russian"/>
              <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> -->
            </analyzer>
          </fieldType>

      This fieldType must be added to the Datafari server. Edit the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fieldTypes.incl and add the previous configuration to it, converted to JSON (the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fieldTypes.incl.example can help you) :

      {
           "name":"text_ru",
           "class":"solr.TextField",
           "positionIncrementGap":"100",
           "analyzer" : {
              "tokenizer":{
                 "class":"solr.StandardTokenizerFactory" },
              "filters":[{
                 "class":"solr.LowerCaseFilterFactory" },
                 { "class":"solr.StopFilterFactory","ignoreCase":"true","words":"lang/stopwords_ru.txt","format":"snowball" },
                 { "class":"solr.SnowballPorterFilterFactory","language":"Russian"
                  }]
            }
      }

      Add the new fieldType to Solr by launching the script addCustomSchemaInfo.sh located in /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema :

      cd /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema
      bash addCustomSchemaInfo.sh
    3. Add fields specific to your new language :
      Two Datafari fields are language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr", "title_de", "content_de". You need to create specific "content_xy" and "title_xy" fields for your new language and configure them with the fieldType you just added. They must be indexed, stored and multiValued.
      These fields must be added to the Datafari server. Edit the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fields.incl and create the two fields as shown below (the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fields.incl.example can help you). For example, here is the configuration for the Russian language :

      {
            "name":"title_ru",
            "type":"text_ru",
            "stored":true,
            "multiValued":"true",
            "indexed":"true"
      }
      &&
      {
            "name":"content_ru",
            "type":"text_ru",
            "stored":true,
            "multiValued":"true",
            "indexed":"true"
      }

      Add the new fields to Solr by launching the script addCustomSchemaInfo.sh located in /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema :

      cd /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema
      bash addCustomSchemaInfo.sh
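
The conversion of the XML fieldType from managed-schema into the JSON form expected in custom_fieldTypes.incl (step 2 above) can be done by hand, but it can also be scripted. The sketch below is only an illustration: the names fieldtype_xml_to_json and RUSSIAN_FIELDTYPE_XML are hypothetical, and it only handles the simple case shown on this page, that is a single analyzer with one tokenizer and a list of filters.

```python
import xml.etree.ElementTree as ET

# The Russian fieldType copied from the Solr _default configset
RUSSIAN_FIELDTYPE_XML = """
<fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Russian"/>
    <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> -->
  </analyzer>
</fieldType>
"""


def fieldtype_xml_to_json(xml_text):
    """Convert a simple <fieldType> XML element into a JSON-ready dict.

    Limitation (assumption): exactly one <analyzer> without a "type"
    attribute, one <tokenizer>, any number of <filter> elements and no
    <charFilter>. XML comments are ignored by the parser.
    """
    root = ET.fromstring(xml_text)
    analyzers = root.findall("analyzer")
    if len(analyzers) != 1 or analyzers[0].get("type") is not None:
        raise ValueError("expected a single <analyzer> without a type attribute")
    analyzer = analyzers[0]
    tokenizer = analyzer.find("tokenizer")
    if tokenizer is None:
        raise ValueError("the analyzer has no <tokenizer>")
    field_type = dict(root.attrib)  # name, class, positionIncrementGap
    field_type["analyzer"] = {
        "tokenizer": dict(tokenizer.attrib),
        "filters": [dict(f.attrib) for f in analyzer.findall("filter")],
    }
    return field_type
```

Passing the result to json.dumps(..., indent=2) from the standard json module gives text that can be pasted into custom_fieldTypes.incl. Compare it with custom_fieldTypes.incl.example before running addCustomSchemaInfo.sh.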

  2. Modify the dedicated Solr update processor that detects languages based on the content and title fields. It is declared in /opt/datafari/solr/solrcloud/FileShare/conf/solrconfig.xml, in updateRequestProcessorChainDatafari. By default, we use English, French and German. To add a new language, add it to the element "langid.whitelist" :

    <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
          <str name="langid.fl">content,title</str>
          <str name="langid.langField">language</str>
          <str name="langid.map">true</str>
          <str name="langid.whitelist">en,fr,de</str>
          <str name="langid.fallback">en</str>
    </processor>

    Then push your changes to Zookeeper. To do so, go to the admin UI of Datafari, open the Search Engine Configuration / System Configuration manager menu, click on the push my modifications button, then on apply my modifications (see the page System Configuration Manager (Zookeeper) for more explanations).

  3. Finally, change the index fields relevancy weights configuration : your new fields title_xy and content_xy must be added to the search algorithm. You can do it directly from the Datafari admin UI : go to Search Engine Configuration → Fields Weight, then click on the add a field button to add the two fields. For example, for Russian we add :
    title_ru 50
    content_ru 10
    (more explanations here : Index fields relevancy weights Configuration)
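
For the language detection step above, the only value to change is the comma-separated list in "langid.whitelist". As a rough sketch, the change can be expressed as follows on the processor element shown in step 2; the function name add_language_to_whitelist and the constant PROCESSOR_XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# The language detection processor as shown in step 2 above
PROCESSOR_XML = """
<processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
  <str name="langid.fl">content,title</str>
  <str name="langid.langField">language</str>
  <str name="langid.map">true</str>
  <str name="langid.whitelist">en,fr,de</str>
  <str name="langid.fallback">en</str>
</processor>
"""


def add_language_to_whitelist(processor, lang_code):
    """Append lang_code to langid.whitelist unless it is already listed.

    processor is the parsed <processor> element. Returns the new value.
    """
    node = processor.find("str[@name='langid.whitelist']")
    if node is None:
        raise ValueError("no langid.whitelist parameter in this processor")
    languages = [code.strip() for code in (node.text or "").split(",") if code.strip()]
    if lang_code not in languages:
        languages.append(lang_code)
    node.text = ",".join(languages)
    return node.text
```

With processor = ET.fromstring(PROCESSOR_XML), calling add_language_to_whitelist(processor, "ru") sets the whitelist to en,fr,de,ru. Rewriting the whole solrconfig.xml with ElementTree would drop its comments by default, so for the real file it is usually safer to edit this single line by hand and use the sketch only as a check.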



Valid from 4.4

The documentation below is valid from Datafari 4.4 upwards

For now, the Datafari user interface (menus and functionalities) is preconfigured for English, French, German, Italian, Arabic, Brazilian Portuguese and Russian. These are not to be confused with the languages that can be analysed at the indexing phase: see further below for the default languages that can be analysed at the indexing and search phases.

Step-by-step guide

For the internationalization of the user interface, we use the i18n java library:

  1. The i18n config files are stored in this folder : datafari/WebContent/js/AjaxFranceLabs/locale. Add a file for your new language there. For example, to add a German translation, create a file named de.json.
    Also add an entry for your new language to all the existing i18n files. For example, to add Spanish, we added the following entry to all the i18n files :

    "es_locale" : "Español" 


  2. Add the new language in the Java class com.francelabs.datafari.utils.LanguageUtils.java :

    public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru");


  3. Add the new language in WebContent/js/AjaxFranceLabs/i18njs.js :

    availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ],


  4. For Datafari Enterprise Edition only : create a file named after the new locale (LOCALE.json, for example es.json) in the folder /opt/datafari/tomcat/webapps/Datafari/customs/i18n
    Add this content to it :

    {}

  5. Finally, run the Ant script datafari-dev.xml so that Datafari takes the modifications into account (if you are in development mode).
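
Once a new locale file such as de.json exists, it helps to check that it translates every key of an existing locale file. The following is a minimal sketch under the assumption that locale files are flat JSON objects; the function name missing_translation_keys is hypothetical.

```python
import json
from pathlib import Path


def missing_translation_keys(reference_file, new_file):
    """Return the keys present in reference_file but absent from new_file.

    Assumption: both files hold one flat JSON object encoded in UTF-8.
    """
    reference = json.loads(Path(reference_file).read_text(encoding="utf-8"))
    candidate = json.loads(Path(new_file).read_text(encoding="utf-8"))
    return sorted(set(reference) - set(candidate))
```

For example, missing_translation_keys("en.json", "de.json") run from the locale folder lists the entries that still need a German translation; an empty list means the new file covers the reference file.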

For the internationalization of the language detection, indexing and search by the Datafari Solr engine, follow these steps:

  1. Modify the dedicated Solr update processor that detects languages based on the content and title fields. It is declared in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml, in updateRequestProcessorChainDatafari. By default, we use English and French. To add a new language, add it to the element "langid.whitelist".

  2. Modify the Solr schema, which you will find at DATAFARI_HOME/solr/solr_home/FileShare/conf/schema.xml, to handle the new language. Two fields are already language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr". You need to create specific "content_xy" and "title_xy" fields for your new language.

  3. Modify the search request handler named select in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml. There, change the parameters qf and pf : add the new fields title_xy and content_xy to the existing chain of parameters.

  4. Now you can restart your Datafari for the changes to be taken into account.
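
The qf and pf parameters of the select handler are whitespace-separated lists of field^weight entries, so adding the two new fields amounts to appending two entries. The sketch below illustrates this on a plain string; the function name append_language_fields is hypothetical, and the default weights 50 and 10 are only illustrative values to adapt to the weights already used in your solrconfig.xml.

```python
def append_language_fields(param_value, lang_code, title_weight=50, content_weight=10):
    """Append title_<lang> and content_<lang> to a qf or pf parameter value.

    param_value is the existing value, for example "title_en^50 content_en^10".
    Fields that are already listed are left unchanged.
    """
    entries = param_value.split()
    known_fields = {entry.split("^")[0] for entry in entries}
    new_fields = (
        (f"title_{lang_code}", title_weight),
        (f"content_{lang_code}", content_weight),
    )
    for field, weight in new_fields:
        if field not in known_fields:
            entries.append(f"{field}^{weight}")
    return " ".join(entries)
```

The returned string is what would go back into the qf or pf element of the select handler. The function does not touch solrconfig.xml itself.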



Valid from 4.0

The documentation below is valid from Datafari 4.0 upwards

For now, Datafari is preconfigured for English, French, Italian, Arabic, Brazilian Portuguese and Russian. Still, its aim is to have a global reach, so the steps to enable additional languages are rather straightforward and are described below. If you add a language, please contact us so that we can integrate it in the next releases of Datafari!

Step-by-step guide

For the internationalization of the user interface, we use the i18n java library:

  1. The i18n config files are stored in this folder : datafari/WebContent/js/AjaxFranceLabs/locale. Add a file for your new language there. For example, to add a German translation, create a file named de.json.
    Also add an entry for your new language to all the existing i18n files. For example, to add Spanish, we added the following entry to all the i18n files :

    "es_locale" : "Español" 


  2. Add the new language in the Java class com.francelabs.datafari.utils.LanguageUtils.java :

    public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru");


  3. Add the new language in WebContent/js/AjaxFranceLabs/i18njs.js :

    availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ],


  4. For Datafari Enterprise Edition only : create a file named after the new locale (LOCALE.json, for example es.json) in the folder /opt/datafari/tomcat/webapps/Datafari/customs/i18n
    Add this content to it :

    {}


  5. Finally, run the Ant script datafari-dev.xml so that Datafari takes the modifications into account (if you are in development mode).


For the internationalization of the language detection, indexing and search by the Datafari Solr engine, follow these steps:

  1. Modify the dedicated Solr update processor that detects languages based on the content and title fields. It is declared in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml, in updateRequestProcessorChainDatafari. By default, we use English and French. To add a new language, add it to the element "langid.whitelist".

  2. Modify the Solr schema, which you will find at DATAFARI_HOME/solr/solr_home/FileShare/conf/schema.xml, to handle the new language. Two fields are already language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr". You need to create specific "content_xy" and "title_xy" fields for your new language.

  3. Modify the search request handler named select in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml. There, change the parameters qf and pf : add the new fields title_xy and content_xy to the existing chain of parameters.

  4. Now you can restart your Datafari for the changes to be taken into account.


You can send us your new language either directly or through GitHub; either way is fine by us, as the modifications are small. We will send you a cool Datafari T-shirt if you share it with us, so that you can show the community you are a real Datafarian.



Valid from 3.2

The documentation below is valid from Datafari 3.2 upwards

For now, Datafari is preconfigured for English, French, Italian, Arabic, Brazilian Portuguese and Russian. Still, its aim is to have a global reach, so the steps to enable additional languages are rather straightforward and are described below. If you add a language, please contact us so that we can integrate it in the next releases of Datafari!

Step-by-step guide

For the internationalization of the user interface, we use the i18n java library:

  1. The i18n config files are stored in this folder : datafari/WebContent/js/AjaxFranceLabs/locale. Add a file for your new language there. For example, to add a German translation, create a file named de.json.

  2. Add the new language in the Java class com.francelabs.datafari.utils.LanguageUtils.java :

    public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru");


  3. Add the new language in WebContent/js/AjaxFranceLabs/i18njs.js :

    availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ],


  4. Add the new language in WebContent/js/parameters.js :

    availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ],


  5. Finally, run the Ant script datafari-dev.xml so that Datafari takes the modifications into account (if you are in development mode).
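
Since the language list is declared in three places in this version (LanguageUtils.java, i18njs.js and parameters.js), it is easy to forget one of them. The following sketch extracts the availableLanguages list from each source text and reports the files where a given language is missing. The function names are hypothetical, and the regular expression only targets the two declaration styles shown above.

```python
import re

# Matches both "availableLanguages = Arrays.asList(...)" (Java)
# and "availableLanguages : [ ... ]" (JavaScript)
_LIST_PATTERN = re.compile(r"availableLanguages\s*[:=][^\[(]*[\[(]([^\])]*)[\])]")
_CODE_PATTERN = re.compile(r"""["']([^"']+)["']""")


def declared_languages(source_text):
    """Return the language codes of the availableLanguages declaration."""
    match = _LIST_PATTERN.search(source_text)
    if match is None:
        raise ValueError("no availableLanguages declaration found")
    return _CODE_PATTERN.findall(match.group(1))


def sources_missing_language(sources, lang_code):
    """Return the names of the sources that do not declare lang_code.

    sources maps a file name to the text content of that file.
    """
    return [
        name
        for name, text in sources.items()
        if lang_code not in declared_languages(text)
    ]
```

In practice, the texts would be read from the three files (for instance with pathlib.Path.read_text) and the check run with the code of the language you just added, such as "de".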


For the internationalization of the language detection, indexing and search by the Datafari Solr engine, follow these steps:

  1. Modify the dedicated Solr update processor that detects languages based on the content and title fields. It is declared in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml, in updateRequestProcessorChainDatafari. By default, we use English and French. To add a new language, add it to the element "langid.whitelist".

  2. Modify the Solr schema, which you will find at DATAFARI_HOME/solr/solr_home/FileShare/conf/schema.xml, to handle the new language. Two fields are already language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr". You need to create specific "content_xy" and "title_xy" fields for your new language.

  3. Modify the search request handler named select in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml. There, change the parameters qf and pf : add the new fields title_xy and content_xy to the existing chain of parameters.

  4. Now you can restart your Datafari for the changes to be taken into account.


You can send us your new language either directly or through GitHub; either way is fine by us, as the modifications are small. We will send you a cool Datafari T-shirt if you share it with us, so that you can show the community you are a real Datafarian.

