Set up a development environment with Eclipse (Linux) for Datafari 4.x

Pre-requisites

  • Have a JDK installed on your PC (Java 8)

  • Have an IDE installed on your PC (this guide uses Eclipse) and be able to launch it as root

  • Have git installed on your PC

  • Have maven and ant installed on your PC
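The command-line pre-requisites above can be checked before going further. The following is only a minimal sketch: the binary names java, git, mvn and ant are assumptions about how the tools are installed on your PC, and missing_tools is a hypothetical helper, not part of Datafari.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on the PATH.
# (missing_tools is a hypothetical helper written for this guide.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools named in the pre-requisites list (binary names are assumptions).
missing=$(missing_tools java git mvn ant)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All command-line pre-requisites found."
    # For Java 8 the reported version should start with 1.8
    java -version 2>&1 | head -n 1
fi
```

The check only looks at the PATH of the current user; since Eclipse is launched as root in this guide, it may be worth running it as root as well.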

Clone repository

Open a terminal and navigate to the folder where you want to check out the Datafari source code (usually your workspace folder).

Run

git clone https://github.com/francelabs/datafari.git

to check out the code. The root directory is named datafari.

Build Datafari for the development environment

Navigate to the datafari folder, referred to below as DATAFARI_SRC.

Run 'mvn install' in DATAFARI_SRC.

Run 'ant all' in DATAFARI_SRC/debian7.
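The two build steps above can be chained so that the ant build only starts if the Maven build succeeded. This is a sketch under assumptions: the default value of DATAFARI_SRC is a made-up workspace path, and run and build_datafari are hypothetical helpers. Setting DRY_RUN=1 only prints the commands instead of running them.

```shell
#!/bin/sh
# Location of the cloned repository; this default path is an assumption.
DATAFARI_SRC="${DATAFARI_SRC:-$HOME/workspace/datafari}"

# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the Maven modules, then the Debian package.
# The body runs in a subshell so the caller's working directory is unchanged,
# and the && chain stops at the first failing step.
build_datafari() (
    run cd "$DATAFARI_SRC" &&
    run mvn install &&
    run cd "$DATAFARI_SRC/debian7" &&
    run ant all
)
```

Call DRY_RUN=1 build_datafari first to preview the commands, then build_datafari to run the real build.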

Install Datafari

Install the dependencies:

apt-get install curl debconf unzip sudo libc6-dev jq lsof

Run 'dpkg -i DATAFARI_SRC/debian7/installer/dist/datafari.deb' and enter an admin password when prompted.

Open the project in Eclipse

In Eclipse, go to File -> Import..., type maven and select « Existing Maven Projects ».

Select the DATAFARI_SRC folder as root directory and click Finish. 

Change the permissions of the following folders:

chmod -R 777 /opt/datafari

chmod -R 700 /opt/datafari/pgsql

This gives every user access rights on the Datafari installation folder, while the pgsql folder stays restricted to its owner. Be careful: this should be done only for the development environment and must be avoided for a production deployment!
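To confirm that the chmod commands above had the intended effect, the modes of the top-level folders can be read back. This is a sketch only: mode_of and check_mode are hypothetical helpers, and stat -c is the GNU coreutils form found on Linux.

```shell
#!/bin/sh
# Print the octal permission bits of a path (GNU coreutils stat, as on Linux).
mode_of() {
    stat -c '%a' "$1"
}

# Compare a path's mode with the expected value and report the result.
# Returns non-zero when the path cannot be read or the mode differs.
check_mode() {
    path=$1
    expected=$2
    actual=$(mode_of "$path") || return 1
    if [ "$actual" = "$expected" ]; then
        echo "OK   $path ($actual)"
    else
        echo "DIFF $path (found $actual, expected $expected)"
        return 1
    fi
}
```

For this guide the calls would be check_mode /opt/datafari 777 and check_mode /opt/datafari/pgsql 700. Only the two top-level folders are inspected, not every file below them.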

Setup the Tomcat server in Eclipse

Go to Window -> Preferences -> Server -> Runtime Environments -> Add… and select Apache Tomcat v9.0.

Select DATAFARI_SRC/tomcat-dev for the Tomcat Installation directory and Finish.

If Eclipse does not accept the Tomcat version, stop Eclipse, download the patch from https://bugs.eclipse.org/bugs/attachment.cgi?id=262418&action=edit and copy it into the plugins directory of your Eclipse installation.

In the Servers view (for example in the Debug perspective), click on "add a server" and select a "tomcat 9.0 server". Click on Next and add the Datafari webapp.

Click Finish.

Run Datafari

Start Datafari in normal mode: go to /opt/datafari/bin and run ./start-datafari.sh.

Start the Tomcat server in Eclipse: right-click the server and choose Start.

Go to the Datafari URL (http://localhost:9080/Datafari)
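Tomcat can take a while to deploy the webapp, so the Datafari URL may not answer immediately. The sketch below polls the URL with curl (installed with the dependencies earlier) until it answers; wait_for_url is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```shell
#!/bin/sh
# Poll a URL until it answers with a successful HTTP status, or give up.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 2).
wait_for_url() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: fail on HTTP errors, -s: silent, -L: follow redirects
        if curl -fsL --max-time 5 -o /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no answer from $url after $attempts attempts" >&2
    return 1
}
```

With the server started from Eclipse, wait_for_url http://localhost:9080/Datafari returns as soon as the page is reachable.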

Update the Solr configuration:

If you change the Solr configuration in DATAFARI_SRC/datafari-solr/solr_home, you have to run the script updateSolrConfig.sh in DATAFARI_SRC/datafari-solr to push the updated configuration to SolrCloud.

Configure connectors

Start the Datafari Tomcat server. Go to the Datafari URL (http://localhost:9080/Datafari). Log in as administrator (admin/admin).

In the left menu, click on Connectors; the Apache ManifoldCF login page shows up. Enter admin/admin to log in (click on the Login button, as the Enter key may not work).

Click on « List Output Connections » in the Output section, and add a new « Output Connection ».
In the Name tab, set the name to « DatafariSolr ».
In the Type tab, set the connection type to Solr and then click on the Continue button.
In the Server tab, set the port to « 8983 », the web application name to « solr » and the « Core/Collection name » to FileShare.

You can now save the connection.
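To double-check the values typed in the Server tab, it can help to query Solr directly. The sketch below only assembles the URL from those values. It assumes that Solr runs on localhost over plain http and that the FileShare collection exposes Solr's standard /admin/ping handler. solr_ping_url is a hypothetical helper.

```shell
#!/bin/sh
# Build a Solr ping URL from host, port, web application name and collection.
solr_ping_url() {
    host=$1
    port=$2
    webapp=$3
    collection=$4
    printf 'http://%s:%s/%s/%s/admin/ping\n' "$host" "$port" "$webapp" "$collection"
}

# Host and scheme are assumptions; port, webapp and collection come from the guide.
url=$(solr_ping_url localhost 8983 solr FileShare)
echo "$url"
# With Datafari running, the URL can then be queried with: curl -fs "$url"
```

If the request fails while Datafari is running, one of the port, web application name or collection values is probably different on your installation.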

Your development environment is now ready.

For more information about ManifoldCF configuration, see the Crawling page.

Note that if the build stops working after a later git pull, the cause may be stale Maven dependencies. In that case, delete your local Maven repository, which is the .m2 folder in your home directory (/your-user/.m2/), and build again.
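The cleanup can be sketched as follows. Note the deliberate difference from the instruction above: this sketch removes only the repository subfolder, which is where Maven stores downloaded artifacts by default, so that a settings.xml kept in .m2 survives. purge_maven_repo is a hypothetical helper.

```shell
#!/bin/sh
# Remove the cached artifacts under a Maven directory (default: ~/.m2),
# leaving any settings.xml in place.
purge_maven_repo() {
    m2_dir=${1:-$HOME/.m2}
    if [ -d "$m2_dir/repository" ]; then
        rm -rf -- "$m2_dir/repository"
        echo "removed $m2_dir/repository"
    else
        echo "nothing to remove in $m2_dir"
    fi
}
```

Run purge_maven_repo as the same user that runs the build, then run 'mvn install' in DATAFARI_SRC again; Maven downloads the dependencies it needs on that next build.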