Software requirements

Valid from Datafari 6.0

Looking for hardware requirements?

See https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1662451718

This page details the software requirements for the machines used in your Datafari setup. They may vary depending on your setup.

Note that you can EITHER use the script below to install Java and all the required dependencies OR install Java and the dependencies manually, as explained on this page.

OPTION 1 - USING THE SCRIPT

Download script to install Java and dependencies

You can download an init script (.sh) for Datafari:

  • Then launch the script with this command:

source init_datafari_6_dependencies.sh

This script installs Java and all the required dependencies, and sets the open files limit configuration. OPTION 2 below details everything the script does.

OPTION 2 - DOING IT MANUALLY (RISKIER)

OS requirements

  • Debian 12 (Bookworm) or Ubuntu 22.04 (Jammy), 64-bit environment (a Docker image is available if you are on a Windows environment) (available for both Datafari CE and Datafari EE)

  • CentOS or RedHat 8 or 9, and CentOS Stream 8 or 9 (available only for Datafari EE)

Java requirements

  • Java 11 must be installed on your environment; it is the only supported version. You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH (see the annex below for how to do it)
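As a quick sanity check of the Java requirement above, the following sketch inspects the `java -version` banner and derives a candidate JAVA_HOME from the location of the java executable. The function names (`java_major_version`, `java_home_from_binary`, `check_java`) are illustrative helpers, not part of Datafari, and the sketch assumes a Linux system with GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with Datafari.

# Print the major Java version found in a `java -version` banner line.
# Handles the legacy "1.8.0_292" scheme as well as "11.0.21".
java_major_version() {
  local banner="$1" version
  version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$version" in
    1.*) printf '%s\n' "$version" | cut -d. -f2 ;;
    *)   printf '%s\n' "$version" | cut -d. -f1 ;;
  esac
}

# Print the directory two levels above the resolved java binary,
# which matches the usual <JAVA_HOME>/bin/java layout.
java_home_from_binary() {
  local binary
  binary=$(readlink -f "$1") || return 1
  dirname "$(dirname "$binary")"
}

# Report whether the java found in PATH is version 11 and whether JAVA_HOME is set.
check_java() {
  local java_bin major
  java_bin=$(command -v java) || { echo "java not found in PATH"; return 1; }
  major=$(java_major_version "$("$java_bin" -version 2>&1 | grep -m 1 'version "')")
  [ "$major" = "11" ] || { echo "Java 11 expected, found: ${major:-unknown}"; return 1; }
  [ -n "${JAVA_HOME:-}" ] || {
    echo "JAVA_HOME is not set; candidate: $(java_home_from_binary "$java_bin")"
    return 1
  }
  echo "Java 11 found, JAVA_HOME=$JAVA_HOME"
}
```

Running `check_java` after sourcing the file prints a one-line diagnosis. The candidate path it suggests is only a hint: confirm it before exporting it as JAVA_HOME for all users.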

Dependency requirements

The list depends on your OS.

  • For Debian/Ubuntu users:

apt-get install bash curl debconf unzip sudo libc6-dev jq lsof apache2 libapache2-mod-jk iptables iptables-persistent zip iputils-ping systemd procps bc netcat-openbsd -y -q

  • For AlmaLinux/CentOS users:

yum install -y jq lsof curl perl-Test-Simple perl-version httpd mod_ssl nano iptables-services nc iputils unzip gcc python3-devel bc
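Once the packages are installed, a small check like the one below can confirm that the expected command-line tools are reachable. This is a minimal sketch: `missing_commands` is a hypothetical helper, and the mapping from package names to command names (for example, the unzip package providing `unzip`) is an assumption.

```shell
#!/usr/bin/env bash
# Print, one per line, each given command name that is not found in PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage, with a few tools present in both package lists above:
#   missing_commands curl unzip jq lsof bc
```

An empty output means every listed command was found. Libraries and services (such as libc6-dev or the Apache modules) do not expose a command and need to be checked with the package manager instead.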

Other requirements

  • Increase the open files limit

    Change the setting in /etc/security/limits.conf:

    Then reboot the machine to apply these parameters.

  • Set the locale (the example uses fr_FR, which you can replace with your own locale):

If you get errors, see the annex below.
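To verify the open files limit step above after the reboot, a sketch along these lines can list the nofile entries of a limits.conf-style file and compare the limit of the current shell with a threshold. Both function names are hypothetical, and the threshold is whatever value you configured; this page does not prescribe one here.

```shell
#!/usr/bin/env bash
# Print "<domain> <type> <value>" for every active nofile entry in a
# limits.conf-style file (columns: domain, type, item, value).
nofile_entries() {
  local file="${1:-/etc/security/limits.conf}"
  awk '!/^[[:space:]]*#/ && $3 == "nofile" { print $1, $2, $4 }' "$file"
}

# Succeed when the soft open files limit of the current shell is at least
# the given value. The threshold is supplied by the caller.
open_files_ok() {
  local required="$1" current
  current=$(ulimit -Sn)
  [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]
}

# Usage:
#   nofile_entries
#   open_files_ok <value you configured> && echo "limit is high enough"
```

Note that `ulimit` reports the limit of the session that runs it, so run the check as the user that starts Datafari.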

Optional requirements

  • Optional: To start Datafari, you need a user who is a member of the sudo group (or the root user):

  • Optional: Make sure that the machine's clock is always up to date (in particular for the email alerts scheduler), for example by enabling ntpdate:
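For the sudo group requirement above, membership can be checked with a short helper such as the following. `user_in_group` is a hypothetical name. The group is called sudo on Debian/Ubuntu; on RedHat-family systems the equivalent administrative group is usually wheel, which is an assumption to verify on your machine.

```shell
#!/usr/bin/env bash
# Succeed when the given user belongs to the given group.
user_in_group() {
  local user="$1" group="$2"
  id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}

# Usage:
#   user_in_group "$(id -un)" sudo && echo "current user is in the sudo group"
```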

ANNEXES

  • Set JAVA_HOME variable

  • Troubleshooting for setting the locale
    If you still get an error like "perl: warning: Setting locale failed.",
    follow these additional steps:
    Check the sshd configuration:

    and comment out the line:

    Then restart the sshd service:

    Add the locale variables to the .bashrc configuration file:

    Add the lines:

    For PostgreSQL:
    PostgreSQL needs the variables LANG and LC_* to be set. To check them, run the command:

    If LC_ALL and LC_CTYPE are not set, enter this (for the English language):

    You can then also add the line to /etc/profile.
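The PostgreSQL check described above can be sketched as a helper that reports which locale variables are unset or empty. `missing_locale_vars` is a hypothetical name, and the sketch relies on bash indirect expansion.

```shell
#!/usr/bin/env bash
# Print the name of each locale variable that is unset or empty.
# Checks LANG, LC_ALL and LC_CTYPE by default; pass other names to override.
missing_locale_vars() {
  local name
  [ "$#" -gt 0 ] || set -- LANG LC_ALL LC_CTYPE
  for name in "$@"; do
    [ -n "${!name:-}" ] || printf '%s\n' "$name"
  done
  return 0
}

# Usage:
#   missing_locale_vars
```

For each name it prints, an export such as `export LC_ALL=en_US.UTF-8` fills the gap for the current session. The value en_US.UTF-8 is an assumption for an English setup, and the locale must exist on the system.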


This page details the software requirements for the machines used in your Datafari setup. They may vary depending on your setup.

For all installations of Datafari:

  • OS requirements:

    • Debian 11 (Bullseye) or 12, or Ubuntu 20.04 (Focal) or 22.04 (Jammy), 64-bit environment (a Docker image is available if you are on a Windows environment) (available for both Datafari CE and Datafari EE)

    • CentOS or RedHat 8 or 9, and CentOS Stream 8 or 9 (available only for Datafari EE)

    • Recommended versions are Debian 12 or Ubuntu Jammy

Download dependencies scripts

You can download an init script (.sh) for Datafari:

  • For Debian or Ubuntu users:

  • For RedHat or CentOS users:

You can launch the script with this command:

=> It installs all the required dependencies and increases the open files limit.

Debian/Ubuntu installation

This part of the documentation is for Debian/Ubuntu users. If you are on CentOS, please go to the next section.

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

    For now, Java 11 is the only version of Java supported by Datafari.

On Debian 9, you have to add this prior to the installation of Java:

Mandatory dependencies:

Python is supported in version 2.7+ or 3+.

CentOS/RedHat installation

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

In Datafari 5.0, on CentOS only, Python 2.7 is the only supported version.

For all Operating Systems

  • To start Datafari, you need a user who is a member of the sudo group (or the root user):

  • Increase the open files limit

    Change the setting in /etc/security/limits.conf:

    Then reboot the machine to apply these parameters.

  • Make sure that the machine is always up to date (in particular for the email alerts scheduler), by enabling:

  • Set the locale (the example uses fr_FR, which you can replace with your own locale):

    If you still get an error like "perl: warning: Setting locale failed.",
    follow these additional steps:
    Check the sshd configuration:

    and comment out the line:

    Then restart the sshd service:

    Add the locale variables to the .bashrc configuration file:

    Add the lines:

    For PostgreSQL:
    PostgreSQL needs the variables LANG and LC_* to be set. To check them, run the command:

    If LC_ALL and LC_CTYPE are not set, enter this (for the English language):

    You can then also add the line to /etc/profile.

    For OS X users: if you get the message "Cannot set LC_CTYPE to default locale: No such file or directory" when you run the locale command, it might be because your terminal automatically sets environment variables when you log in from a Mac to a Linux server: see https://askubuntu.com/a/778672 . To turn it off, uncheck the checkbox here:

    In iTerm, it is in the profile -> Terminal tab.

    In Terminal, it is in the Terminal -> Preferences -> Profiles -> Advanced tab.

    • Set the timezone if needed:
      Check the local time zone that matches your region in /usr/share/zoneinfo. Then create a symlink from /etc/localtime (example here with Paris time):
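The timezone step above can be sketched as a small function that validates the zone name before creating the symlink. `link_timezone` and its optional parameters are hypothetical; the defaults follow the paths named in the text, and writing to /etc/localtime requires root.

```shell
#!/usr/bin/env bash
# Point TARGET (default /etc/localtime) at the zone file for ZONE.
# ZONEINFO_DIR defaults to /usr/share/zoneinfo.
link_timezone() {
  local zone="$1"
  local target="${2:-/etc/localtime}"
  local zoneinfo_dir="${3:-/usr/share/zoneinfo}"
  if [ ! -f "$zoneinfo_dir/$zone" ]; then
    echo "unknown time zone: $zone" >&2
    return 1
  fi
  ln -sfn "$zoneinfo_dir/$zone" "$target"
}

# Usage (as root):
#   link_timezone Europe/Paris
```

Checking that the zone file exists first avoids leaving /etc/localtime as a dangling symlink when the zone name is mistyped.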


This page details the software requirements for the machines used in your Datafari setup. They may vary depending on your setup.

For all installations of Datafari:

  • OS requirements:

    • Debian 10 (Buster) or 11 (Bullseye), or Ubuntu 20.04 (Focal) or 22.04 (Jammy), 64-bit environment (a Docker image is available if you are on a Windows environment) (available for both Datafari CE and Datafari EE)

    • CentOS or RedHat 7, 8 or 9, and CentOS Stream 7, 8 or 9 (available only for Datafari EE)

    • Recommended versions are Debian 11 or Ubuntu Jammy

Download dependencies scripts

You can download an init script (.sh) for Datafari:

  • For Debian or Ubuntu users:

  • For RedHat or CentOS users:

You can launch the script with this command:

=> It installs all the required dependencies and increases the open files limit.

Debian/Ubuntu installation

This part of the documentation is for Debian/Ubuntu users. If you are on CentOS, please go to the next section.

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

    For now, Java 11 is the only version of Java supported by Datafari.

On Debian 9, you have to add this prior to the installation of Java:

Mandatory dependencies:

Python is supported in version 2.7+ or 3+.

CentOS/RedHat installation

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

In Datafari 5.0, on CentOS only, Python 2.7 is the only supported version.

For all Operating Systems

  • To start Datafari, you need a user who is a member of the sudo group (or the root user):

  • Increase the open files limit

    Change the setting in /etc/security/limits.conf:

    Then reboot the machine to apply these parameters.

  • Make sure that the machine is always up to date (in particular for the email alerts scheduler), by enabling:

  • Set the locale (the example uses fr_FR, which you can replace with your own locale):

    If you still get an error like "perl: warning: Setting locale failed.",
    follow these additional steps:
    Check the sshd configuration:

    and comment out the line:

    Then restart the sshd service:

    Add the locale variables to the .bashrc configuration file:

    Add the lines:

    For PostgreSQL:
    PostgreSQL needs the variables LANG and LC_* to be set. To check them, run the command:

    If LC_ALL and LC_CTYPE are not set, enter this (for the English language):

    You can then also add the line to /etc/profile.

    For OS X users: if you get the message "Cannot set LC_CTYPE to default locale: No such file or directory" when you run the locale command, it might be because your terminal automatically sets environment variables when you log in from a Mac to a Linux server: see https://askubuntu.com/a/778672 . To turn it off, uncheck the checkbox here:

    In iTerm, it is in the profile -> Terminal tab.

    In Terminal, it is in the Terminal -> Preferences -> Profiles -> Advanced tab.

    • Set the timezone if needed:
      Check the local time zone that matches your region in /usr/share/zoneinfo. Then create a symlink from /etc/localtime (example here with Paris time):


 

This page details the software requirements for the machines used in your Datafari setup. They may vary depending on your setup.

For all installations of Datafari:

  • OS requirements:

    • Debian 10 (Buster) or 11 (Bullseye), or Ubuntu 20.04 (Focal) or 22.04 (Jammy), 64-bit environment (a Docker image is available if you are on a Windows environment)

    • CentOS or RedHat 7 and CentOS Stream 8 (only for Datafari EE)

    • Recommended versions are Debian 11 or Ubuntu Jammy

Download dependencies scripts

You can download an init script (.sh) for Datafari:

  • For Debian or Ubuntu users:

  • For RedHat or CentOS users:

You can launch the script with this command:

=> It installs all the required dependencies and increases the open files limit.

Debian/Ubuntu installation

This part of the documentation is for Debian/Ubuntu users. If you are on CentOS, please go to the next section.

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

    For now, Java 11 is the only version of Java supported by Datafari.

On Debian 9, you have to add this prior to the installation of Java:

Mandatory dependencies:

Python is supported in version 2.7+ or 3+.

CentOS/RedHat installation

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

In Datafari 5.0, on CentOS only, Python 2.7 is the only supported version.

For all Operating Systems

  • To start Datafari, you need a user who is a member of the sudo group (or the root user):

  • Increase the open files limit

    Change the setting in /etc/security/limits.conf:

    Then reboot the machine to apply these parameters.

  • Make sure that the machine is always up to date (in particular for the email alerts scheduler), by enabling:

  • Set the locale (the example uses fr_FR, which you can replace with your own locale):

    If you still get an error like "perl: warning: Setting locale failed.",
    follow these additional steps:
    Check the sshd configuration:

    and comment out the line:

    Then restart the sshd service:

    Add the locale variables to the .bashrc configuration file:

    Add the lines:

    For PostgreSQL:
    PostgreSQL needs the variables LANG and LC_* to be set. To check them, run the command:

    If LC_ALL and LC_CTYPE are not set, enter this (for the English language):

    You can then also add the line to /etc/profile.

    For OS X users: if you get the message "Cannot set LC_CTYPE to default locale: No such file or directory" when you run the locale command, it might be because your terminal automatically sets environment variables when you log in from a Mac to a Linux server: see https://askubuntu.com/a/778672 . To turn it off, uncheck the checkbox here:

    In iTerm, it is in the profile -> Terminal tab.

    In Terminal, it is in the Terminal -> Preferences -> Profiles -> Advanced tab.

    • Set the timezone if needed:
      Check the local time zone that matches your region in /usr/share/zoneinfo. Then create a symlink from /etc/localtime (example here with Paris time):


This page details the software requirements for the machines used in your Datafari setup. They may vary depending on your setup.

For all installations of Datafari:

  • OS requirements:

    • Debian 9+ or Ubuntu 16+, 64-bit environment (a Docker image is available if you are on a Windows environment)

    • CentOS or RedHat 7.0+ (only for Datafari EE)

    • Recommended version is Debian 10

Debian/Ubuntu installation

This part of the documentation is for Debian/Ubuntu users. If you are on CentOS, please go to the next section.

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

    For now, Java 11 is the only version of Java supported by Datafari.

On Debian 9, you have to add this prior to the installation of Java:

Mandatory dependencies:

Python is supported in version 2.7+ or 3+.

CentOS/RedHat installation

  • You need Java JDK 11 installed on your environment (the JDK is mandatory if you use ELK; otherwise a JRE is sufficient). You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

In Datafari 5.0, on CentOS only, Python 2.7 is the only supported version.

For all Operating Systems

  • To start Datafari, you need a user who is a member of the sudo group (or the root user):

  • Increase the open files limit

    Change the setting in /etc/security/limits.conf:

    Then reboot the machine to apply these parameters.

  • Make sure that the machine is always up to date (in particular for the email alerts scheduler), by enabling:

  • Set the locale (the example uses fr_FR, which you can replace with your own locale):

    If you still get an error like "perl: warning: Setting locale failed.",
    follow these additional steps:
    Check the sshd configuration:

    and comment out the line:

    Then restart the sshd service:

    Add the locale variables to the .bashrc configuration file:

    Add the lines:

    For PostgreSQL:
    PostgreSQL needs the variables LANG and LC_* to be set. To check them, run the command:

    If LC_ALL and LC_CTYPE are not set, enter this (for the English language):

    You can then also add the line to /etc/profile.

    For OS X users: if you get the message "Cannot set LC_CTYPE to default locale: No such file or directory" when you run the locale command, it might be because your terminal automatically sets environment variables when you log in from a Mac to a Linux server: see https://askubuntu.com/a/778672. To turn it off, uncheck the checkbox here:

    In iTerm, it is in the profile -> Terminal tab.

    In Terminal, it is in the Terminal -> Preferences -> Profiles -> Advanced tab.

    • Set the timezone if needed:
      Check the local time zone that matches your region in /usr/share/zoneinfo. Then create a symlink from /etc/localtime (example here with Paris time):






This page details the software requirements for the machines used in your Datafari setup. They may vary depending on your setup.

For all installations of Datafari:

    • OS requirements:

      • Debian 7+ or Ubuntu 16+, 64-bit environment (a Docker image is available if you are on a Windows environment)

      • CentOS or RedHat 7.0+ (only for Datafari EE)

    • Recommended versions are Debian 9 and Debian 8 (if you are on Debian 7, you will need to add the testing repo in /etc/apt/sources.list)

    • Starting from Datafari 4.1, Datafari does not embed its own Java component. You need Java JRE 8 (or JDK 8) installed on your environment. You also need to set the JAVA_HOME variable for all users and have the java executable in the PATH. We recommend setting JAVA_HOME in /etc/profile:

      For now, Java 8 is the only version of Java supported by Datafari.

    • Debian/Ubuntu environment: requires unzip, sudo, libc6-dev, jq, lsof

    • Datafari needs Python 2.7.x. If you only have Python 3, please install Python 2 (for example, on Ubuntu 16.04, install the package python-minimal)

    • To start Datafari, you need a user who is a member of the sudo group (or the root user):

    • Make sure that the machine is always up to date (in particular for the email alerts scheduler), by enabling:

    • Set the locale (the example uses fr_FR, which you can replace with your own locale):

      If you still get an error like "perl: warning: Setting locale failed.",
      follow these additional steps:
      Check the sshd configuration:

      and comment out the line:

      Then restart the sshd service:

      Add the locale variables to the .bashrc configuration file:

      Add the lines:

      For PostgreSQL:
      PostgreSQL needs the variables LANG and LC_* to be set. To check them, run the command:

      If LC_ALL and LC_CTYPE are not set, enter this (for the English language):

      You can then also add the line to /etc/profile.

      For OS X users: if you get the message "Cannot set LC_CTYPE to default locale: No such file or directory" when you run the locale command, it might be because your terminal automatically sets environment variables when you log in from a Mac to a Linux server: see https://askubuntu.com/a/778672. To turn it off, uncheck the checkbox here:

      In iTerm, it's in the profile -> Terminal tab.

      In Terminal, it's in the Terminal -> Preferences -> Profiles -> Advanced tab.

    • Increase the open files limit

      For Debian, change the setting in /etc/security/limits.conf:

      Then reboot the machine to apply these parameters.

    • Set the timezone if needed:
      Check the local time zone that matches your region in /usr/share/zoneinfo. Then create a symlink from /etc/localtime (example here with Paris time):



    • For RedHat/CentOS:

      Install Java:

      Install dependencies:

For a distributed Datafari:

  • ManifoldCF: check the Apache ManifoldCF website recommendations and the above recommendations

  • ELK: check the Elastic website recommendations

  • Datafari main server: check the above recommendations

  • SolrCloud servers: check the above recommendations