...
...
...
...
Info | Valid from 5.0 |
---|---|
The documentation below is valid from Datafari 5.0 upwards |
For now, Datafari is preconfigured to display its menus and functionalities in English, French, German, Italian, Arabic, Brazilian Portuguese and Russian. These user interface languages are not to be confused with the languages that can be analysed at the indexing phase: see further below for the default languages that can be analysed at the indexing and search phases.
Step-by-step guide for an existing Datafari
For the internationalization of the user interface:
The folder that stores the i18n config files is /opt/datafari/tomcat/webapps/Datafari/resources/js/AjaxFranceLabs/locale. Open it and add a file for your new language. For example, if you add a German translation, put a file named de.json there.
Also add a new entry in all the i18n files corresponding to your new language. For example to add Spanish language, we added the following entry in all the i18n files :
Code Block |
---|
"es_locale" : "Español" |
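Adding the same entry to every i18n file by hand is error-prone, so here is a minimal Python sketch of how it could be scripted. It is not part of Datafari: the function name `add_locale_entry` is hypothetical, and the sketch assumes that each i18n file is a single flat JSON object encoded in UTF-8. Note that it rewrites the files with a 2-space indentation, which may differ from their original layout.

```python
import json
from pathlib import Path


def add_locale_entry(locale_dir, code, label):
    """Add the entry "<code>_locale": label to every *.json file in locale_dir.

    Returns the names of the files that were actually modified.
    Assumption: every file holds one flat JSON object (UTF-8).
    """
    key = f"{code}_locale"
    updated = []
    for path in sorted(Path(locale_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        if data.get(key) != label:
            data[key] = label
            # ensure_ascii=False keeps labels such as "Español" readable.
            path.write_text(
                json.dumps(data, ensure_ascii=False, indent=2) + "\n",
                encoding="utf-8",
            )
            updated.append(path.name)
    return updated
```

Calling `add_locale_entry(locale_dir, "es", "Español")`, where `locale_dir` is your locale folder, would add the Spanish entry wherever it is missing. Running it a second time changes nothing.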
...
For now, the new language also needs to be added in a Java class. So you will need to download the source code of Datafari, modify the class and upload it to your Datafari server. The class that needs to be modified is com.francelabs.datafari.utils.LanguageUtils.java, located in the datafari-webapp module. Modify the following line by adding your language:
Code Block |
---|
public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru"); |
Add the new language in WebContent/js/AjaxFranceLabs/i18njs.js :
Code Block |
---|
availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ], |
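Since the language list lives in two places (the Java class and i18njs.js), it is easy to update one and forget the other. The following Python sketch shows one possible way to compare the two lines. It is only an illustration: the helper `language_codes` is hypothetical, and the Java line used here already contains an extra "es" code purely for the sake of the example.

```python
import re


def language_codes(source_line):
    """Return the quoted language codes found in one line of source code."""
    return re.findall(r"""["']([A-Za-z_-]+)["']""", source_line)


# Example inputs: the Java line was edited to add "es", the JavaScript line was not.
java_line = 'public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru", "es");'
js_line = "availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ],"

# Codes declared in Java but missing from the JavaScript list.
missing_in_js = sorted(set(language_codes(java_line)) - set(language_codes(js_line)))
print(missing_in_js)  # ['es']
```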
Then recompile the source code. When you are done, copy the file $YOUR_DATAFARI_PROJECT/datafari-webapp/target/classes/com/francelabs/datafari/utils/LanguageUtils.class to this location on your Datafari server: /opt/datafari/tomcat/webapps/Datafari/WEB-INF/classes/com/francelabs/datafari/utils.
Then you need to restart Datafari to load your changes.
Add the new language in /opt/datafari/tomcat/webapps/Datafari/resources/
...
{}
...
Finally, launch the Ant script datafari-dev.xml to apply the modifications to Datafari (if you are in development mode).
For the internationalization of the language detection, indexing and search by the Datafari Solr engine, follow these steps:
- Modify the dedicated Solr update processor, which detects the language based on the content and title fields. It is declared in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml, where you can find it at updateRequestProcessorChainDatafari. By default, we use English and French. In order to add a new language, add it to the element "langid.whitelist".
- Modify the Solr schema to handle the new language. You will find it at DATAFARI_HOME/solr/solr_home/FileShare/conf/schema.xml. You will notice that two fields are already language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr". You need to create your specific "content_xy" and "title_xy" fields for your new language.
- Modify the search request handler named select in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml. There, change the parameters qf and pf: add the new fields title_xy and content_xy to the existing chain of parameters.
- Now you can restart your Datafari for the changes to be taken into account.
Info |
---|
The documentation below is valid from Datafari 4.4 upwards |
For now, Datafari is preconfigured to display its menus and functionalities in English, French, German, Italian, Arabic, Brazilian Portuguese and Russian. These user interface languages are not to be confused with the languages that can be analysed at the indexing phase: see further below for the default languages that can be analysed at the indexing and search phases.
Step-by-step guide
For the internationalization of the user interface, we use the i18n java library:
The folder to store the i18n config files is here : datafari/WebContent/js/AjaxFranceLabs/locale. Just open it and add your new language. For example if you add German translation, put here a file named de.json.
Also add a new entry in all the i18n files corresponding to your new language. For example to add Spanish language, we added the following entry in all the i18n files :
Code Block |
---|
"es_locale" : "Español" |
Add the new language in the Java class com.francelabs.datafari.utils.LanguageUtils.java :
Code Block |
---|
public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru"); |
Add the new language in WebContent/js/AjaxFranceLabs/i18njs.js :
Code Block |
---|
availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ], |
...
For Datafari Enterprise Edition only : create a file with the new language : LOCALE.json , for example es.json into the folder /opt/datafari/tomcat/webapps/Datafari/customs/i18n
Add this content into it :
{}
...
Finally, launch the Ant script datafari-dev.xml to apply the modifications to Datafari (if you are in development mode).
For the internationalization of the language detection, indexing and search by the Datafari Solr engine, follow these steps:
- Modify the dedicated Solr update processor, which detects the language based on the content and title fields. It is declared in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml, where you can find it at updateRequestProcessorChainDatafari. By default, we use English and French. In order to add a new language, add it to the element "langid.whitelist".
- Modify the Solr schema to handle the new language. You will find it at DATAFARI_HOME/solr/solr_home/FileShare/conf/schema.xml. You will notice that two fields are already language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr". You need to create your specific "content_xy" and "title_xy" fields for your new language.
- Modify the search request handler named select in DATAFARI_HOME/solr/solr_home/FileShare/conf/solrconfig.xml. There, change the parameters qf and pf: add the new fields title_xy and content_xy to the existing chain of parameters.
- Now you can restart your Datafari for the changes to be taken into account.
Info |
---|
The documentation below is valid from Datafari 4.0 upwards |
For now, Datafari is preconfigured for English, French, Italian, Arabic, Brazilian Portuguese and Russian. Still, its aim is to have a global reach, so the steps to enable additional languages are rather straightforward and can be found here. If you add a language, please contact us so that we can integrate it in the next releases of Datafari!
Step-by-step guide
For the internationalization of the user interface, we use the i18n java library:
The folder to store the i18n config files is here : datafari/WebContent/js/AjaxFranceLabs/locale. Just open it and add your new language. For example if you add German translation, put here a file named de.json.
Also add a new entry in all the i18n files corresponding to your new language. For example to add Spanish language, we added the following entry in all the i18n files :
Code Block |
---|
"es_locale" : "Español" |
Add the new language in the Java class com.francelabs.datafari.utils.LanguageUtils.java :
Code Block |
---|
public static final List<String> availableLanguages = Arrays.asList("en", "fr", "it", "ar", "ru"); |
Add the new language in WebContent/js/AjaxFranceLabs/i18njs.js :
Code Block |
---|
availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ], |
For Datafari Enterprise Edition only : create a file with the new language : LOCALE.json , for example es.json into the folder /opt/datafari/tomcat/webapps/Datafari/customs/i18n
Add this content into it :
Code Block |
---|
{} |
...
Finally, launch the Ant script datafari-dev.xml to apply the modifications to Datafari (if you are in development mode).
For the internationalization of the language detection, indexing and search by the Datafari Solr engine, follow these steps:
...
js/AjaxFranceLabs/i18njs.js :
Code Block |
---|
availableLanguages : [ 'en', 'fr', 'it', 'ar', 'ru' ], |
Create a file with the new language : LOCALE.json , for example es.json into the folder /opt/datafari/tomcat/webapps/Datafari/resources/customs/i18n
Add this content into it :
Code Block |
---|
{} |
NB: this is not a mistake. The content of the file really is only: {}
For the internationalization of the language detection, indexing and search by the Datafari Solr engine :
Modify the Solr schema to handle the new language
Add the stopwords file specific to your language. In order to do so, download the Solr release used by the current version of Datafari here: https://archive.apache.org/dist/lucene/solr/8.5.2/solr-8.5.2.tgz then open the folder solr-8.5.2/server/solr/configsets/_default/conf/lang and look for the stopwords file corresponding to your language.
Then add it to this location into your Datafari server : /opt/datafari/solr/solrcloud/FileShare/conf/lang/
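The copy step can be sketched in Python as follows. This is a hedged illustration, not a Datafari tool: the function name `copy_stopwords` is hypothetical, and the sketch assumes that the stopwords file follows the `stopwords_<code>.txt` naming seen for Russian (`stopwords_ru.txt`), which you should verify for your language in the extracted Solr archive.

```python
import shutil
from pathlib import Path


def copy_stopwords(solr_lang_dir, datafari_lang_dir, code):
    """Copy stopwords_<code>.txt from the extracted Solr lang folder
    into the Datafari conf/lang folder and return the new file path.

    Assumption: the file is named stopwords_<code>.txt.
    """
    source = Path(solr_lang_dir) / f"stopwords_{code}.txt"
    if not source.is_file():
        raise FileNotFoundError(
            f"no stopwords file for '{code}' in {solr_lang_dir}"
        )
    # copy2 also preserves the file timestamps.
    return Path(shutil.copy2(source, Path(datafari_lang_dir) / source.name))
```

For Russian, this would be called with the folder solr-8.5.2/server/solr/configsets/_default/conf/lang as the source and /opt/datafari/solr/solrcloud/FileShare/conf/lang/ as the destination.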
Then you need to push the file to Zookeeper. To do so, go to the admin UI of Datafari and click on the Search Engine Configuration / System Configuration manager menu, then click on the push my modifications button, then on apply my modifications (see the page System Configuration Manager (Zookeeper) for more explanations).
Add the fieldType of your language. In order to do so, download the Solr release used by the current version of Datafari here: https://archive.apache.org/dist/lucene/solr/8.5.2/solr-8.5.2.tgz then open the file solr-8.5.2/server/solr/configsets/_default/conf/managed-schema and look for the fieldType corresponding to your language.
For example, if we want to add Russian language, the Solr configuration is this one :
Code Block |
---|
<!-- Russian -->
<fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" />
    <filter class="solr.SnowballPorterFilterFactory" language="Russian"/>
    <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> -->
  </analyzer>
</fieldType>
We have to add it to our Datafari server. Edit the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fieldTypes.incl and put the previous definition into it, adapted to JSON (you can look at the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fieldTypes.incl.example to help you) :
Code Block |
---|
{
  "name":"text_ru",
  "class":"solr.TextField",
  "positionIncrementGap":"100",
  "analyzer" : {
    "tokenizer":{ "class":"solr.StandardTokenizerFactory" },
    "filters":[
      { "class":"solr.LowerCaseFilterFactory" },
      { "class":"solr.StopFilterFactory","ignoreCase":"true","words":"lang/stopwords_ru.txt","format":"snowball" },
      { "class":"solr.SnowballPorterFilterFactory","language":"Russian" }
    ]
  }
}
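The adaptation from the XML fieldType to its JSON form is mechanical, so it can be sketched in Python. This is a minimal illustration under stated assumptions, not a Datafari utility: the function name `field_type_to_json` is hypothetical, and the sketch only handles a fieldType with a single analyzer made of one tokenizer and a list of filters (no charFilter, no separate index and query analyzers). XML comments, such as the commented-out light stemmer, are ignored by the parser.

```python
import json
import xml.etree.ElementTree as ET


def field_type_to_json(xml_text):
    """Convert a simple Solr <fieldType> XML element into the JSON-style
    dictionary layout used in the example above.

    Assumption: one <analyzer> with one <tokenizer> and <filter> children.
    """
    root = ET.fromstring(xml_text)
    result = dict(root.attrib)  # name, class, positionIncrementGap, ...
    analyzer = root.find("analyzer")
    if analyzer is not None:
        converted = {}
        tokenizer = analyzer.find("tokenizer")
        if tokenizer is not None:
            converted["tokenizer"] = dict(tokenizer.attrib)
        filters = [dict(node.attrib) for node in analyzer.findall("filter")]
        if filters:
            converted["filters"] = filters
        result["analyzer"] = converted
    return result


russian_xml = """
<fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" />
    <filter class="solr.SnowballPorterFilterFactory" language="Russian"/>
    <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> -->
  </analyzer>
</fieldType>
"""

print(json.dumps(field_type_to_json(russian_xml), indent=2))
```

The printed result should still be compared with custom_fieldTypes.incl.example before it is pasted into custom_fieldTypes.incl.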
Add the new fieldType to Solr by launching the script addCustomSchemaInfo.sh located into /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema :
Code Block |
---|
cd /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema
bash addCustomSchemaInfo.sh
Add fields specific to your new language :
You will notice that two fields are language specific, namely content and title. Therefore, we have "content_en", "title_en", "content_fr", "title_fr", "title_de", "content_de". You need to create your specific "content_xy" and "title_xy" fields for your new language and configure them with the fieldType you just added. They must be indexed, stored and multiValued.
We have to add them to our Datafari server. Edit the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fields.incl and create the two fields like below (you can look at the file /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema/custom_fields.incl.example to help you). For example, here is the config for Russian language :
Code Block |
---|
{
  "name":"title_ru",
  "type":"text_ru",
  "stored":true,
  "multiValued":"true",
  "indexed":"true"
}
&&
{
  "name":"content_ru",
  "type":"text_ru",
  "stored":true,
  "multiValued":"true",
  "indexed":"true"
}
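Because the two field definitions differ only by their name, they can be generated from the language code. The Python sketch below is a hypothetical helper (`language_fields` is not part of Datafari) that assumes the fieldType is named `text_<code>`, as in the Russian example. It uses JSON booleans for the three flags. It only prints the definitions and leaves the exact layout of custom_fields.incl to the .example file.

```python
import json


def language_fields(code):
    """Build the title_<code> and content_<code> field definitions.

    Assumption: the matching fieldType is named text_<code>.
    """
    return [
        {
            "name": f"{prefix}_{code}",
            "type": f"text_{code}",
            "indexed": True,
            "stored": True,
            "multiValued": True,
        }
        for prefix in ("title", "content")
    ]


for definition in language_fields("ru"):
    print(json.dumps(definition, indent=2))
```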
Add the new fields to Solr by launching the script addCustomSchemaInfo.sh located in /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema :
Code Block |
---|
cd /opt/datafari/solr/solrcloud/FileShare/conf/customs_schema
bash addCustomSchemaInfo.sh
Modify the dedicated Solr update processor, which detects the language based on the content and title fields. It is declared in /opt/datafari/solr/solrcloud/FileShare/conf/solrconfig.xml, where you can find it at updateRequestProcessorChainDatafari. By default, we use English, French and German. In order to add a new language, add it to the element "langid.whitelist":
Code Block |
---|
<processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
  <str name="langid.fl">content,title</str>
  <str name="langid.langField">language</str>
  <str name="langid.map">true</str>
  <str name="langid.whitelist">en,fr,de</str>
  <str name="langid.fallback">en</str>
</processor>
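To make the whitelist change concrete, here is a small Python sketch that appends a language code to the `langid.whitelist` value of a processor snippet. The function name `add_to_whitelist` is hypothetical. The sketch is meant to be run on a copied snippet rather than on the whole solrconfig.xml, because round-tripping a full file through ElementTree drops its XML comments.

```python
import xml.etree.ElementTree as ET


def add_to_whitelist(processor_xml, code):
    """Return the processor XML with `code` appended to langid.whitelist
    (unchanged if the code is already listed)."""
    root = ET.fromstring(processor_xml)
    node = root.find("./str[@name='langid.whitelist']")
    if node is None:
        raise ValueError("no langid.whitelist parameter found")
    codes = [c.strip() for c in (node.text or "").split(",") if c.strip()]
    if code not in codes:
        codes.append(code)
    node.text = ",".join(codes)
    return ET.tostring(root, encoding="unicode")


snippet = """<processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
  <str name="langid.fl">content,title</str>
  <str name="langid.whitelist">en,fr,de</str>
  <str name="langid.fallback">en</str>
</processor>"""

print(add_to_whitelist(snippet, "ru"))
```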
Then you need to push your changes to Zookeeper. To do so, go to the admin UI of Datafari and click on Search Engine Configuration / System Configuration manager menu then click on push my modifications button then apply my modifications (see the page System Configuration Manager (Zookeeper) for more explanations)
Finally, change the index fields relevancy weights configuration: you need to add your new fields, title_xy and content_xy, into the search algorithm. You can do it directly from the Datafari Admin UI: go to Search Engine Configuration → Fields Weight, then click on the button add a field to add the two fields. For example, for Russian we add:
title_ru 50
content_ru 10
(more explanations here: Index fields relevancy weights Configuration)
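For readers who want to see what these weights correspond to on the Solr side, the Python sketch below appends boosted fields to a qf string using Solr's `field^boost` syntax. It is only an illustration: the helper `add_boosted_fields` and the starting qf value are hypothetical, and in practice the Admin UI performs this change for you.

```python
def add_boosted_fields(qf, boosts):
    """Append `field^boost` terms to a Solr qf string, skipping fields
    that are already present."""
    terms = qf.split()
    existing = {term.split("^")[0] for term in terms}
    extra = [
        f"{field}^{boost}"
        for field, boost in boosts.items()
        if field not in existing
    ]
    return " ".join(terms + extra)


# Hypothetical starting value, used only for the example.
current_qf = "title_en^50 content_en^10"
print(add_boosted_fields(current_qf, {"title_ru": 50, "content_ru": 10}))
# title_en^50 content_en^10 title_ru^50 content_ru^10
```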
...
Info |
---|
You can either send us your new language either directly or using github, either way is fine by us as the modifications are not huge. We will send you a cool Datafari T-Shirt if you share that with us, so that you can show the community you are a real Datafarian |