...
And that's it, your Repository connector is created with the appropriate default values:
...
, as you can see in the following screenshot:
...
Info |
---|
Note: the Authority Group is not currently used (as of October 2023). |
Create the corresponding CSV job with at least the following information (as seen in the screenshot below):
...
In the connection Pipeline (as seen in the screenshot above), you do not need to add a Tika connector, since a CSV file is a plain text format. You do need at least the Repository in stage 1 and the Output in stage 2.
...
For the CSV file paths parameter (as seen in the screenshot above), you must specify the CSV files to be used. The syntax required is that of local files, so if you need to access remote files, you can use Samba, for example, to mount the folder containing your CSV files. Here is an example for a remote file exposed via SMB, mounted locally: /mounted_remote_smb_share/folder1/…/foldern/filename.csv
For the Separator character parameter, enter the separator you want to use. By default, it is “,”. Note that it also works with separators made of multiple characters, but we have not tested it with “typical” escape characters such as “\”.
Info |
---|
Note: this connector does not handle cases where the CSV content itself contains the separator character, so you need to clean up such content beforehand. |
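One possible way to do this cleanup is sketched below in Python. It is only an illustration, built on several assumptions that do not come from the connector itself: the input is a standard comma-delimited CSV that uses double quotes around fields containing commas or line breaks, and the function name `to_plain_separated` and its `replacement` parameter are hypothetical. The sketch parses the quoted input, replaces every occurrence of the target separator (and any embedded line break) inside a field with a space, and writes the rows back without quoting.

```python
import csv
import io


def to_plain_separated(src_text, separator=",", replacement=" "):
    """Return CSV text in which no field contains `separator`.

    Assumption: `src_text` is a standard comma-delimited, double-quoted CSV.
    The output uses `separator` between fields and no quoting at all.
    """
    # newline="" lets the csv module handle line breaks inside quoted fields.
    reader = csv.reader(io.StringIO(src_text, newline=""))
    out_lines = []
    for row in reader:
        cleaned = []
        for field in row:
            # Remove the separator from the content itself.
            field = field.replace(separator, replacement)
            # Keep each record on a single line.
            field = field.replace("\r", " ").replace("\n", " ")
            cleaned.append(field)
        out_lines.append(separator.join(cleaned))
    return "\n".join(out_lines) + "\n"
```

For example, calling `to_plain_separated(text, separator=";")` on a quoted export turns it into a file that can be split safely on “;”. Replacing the separator with a space changes the content slightly, so choose a replacement that suits your data.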
The Content Column Label parameter maps a column of your CSV file to the Solr field Content. So if your CSV file contains a column named “mon_contenu” and you want to map it to that Solr field, enter “mon_contenu” for this parameter.
The Id Column Label parameter maps a column of your CSV file to the Solr field Id. So if your CSV file contains a column named “mon_id” and you want to map it to that Solr field, enter “mon_id” for this parameter.
You cannot graphically define any other mapping between a CSV column name and a Solr field. For all other columns, the CSV column name must exactly match an existing Datafari Solr field name. As a consequence, CSV columns whose names do not correspond to any Solr field will not be used. Note that the matching is case sensitive.
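The matching rule above can be checked before running the job. The following Python sketch is not part of Datafari: the function name `classify_columns` is hypothetical, and the set of Solr field names must be supplied by you from your own Solr schema (the names used in the usage note are made up). It assumes that the header row is split on the separator with no trimming and that names are compared exactly, as described above.

```python
def classify_columns(csv_text, solr_fields, id_label, content_label, separator=","):
    """Report which CSV columns would be mapped and which would be ignored.

    `solr_fields` is a set of Solr field names provided by the caller
    (an assumption: take the real names from your own Solr schema).
    """
    lines = csv_text.splitlines()
    if not lines:
        raise ValueError("the CSV file is empty: a header row is mandatory")
    # The first row holds the column labels.
    columns = lines[0].split(separator)
    # The id and content columns are mapped through their job parameters;
    # every other column needs an exact, case-sensitive match with a Solr field.
    explicit = {id_label, content_label}
    mapped = [c for c in columns if c in explicit or c in solr_fields]
    ignored = [c for c in columns if c not in explicit and c not in solr_fields]
    return {"mapped": mapped, "ignored": ignored}
```

With a header row `mon_id;mon_contenu;title;Title;auteur`, a separator “;” and a hypothetical field set `{"title", "url"}`, the columns `Title` and `auteur` are reported as ignored: the first because of the case difference, the second because no field with that name exists in the set.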
Info |
---|
Note: your CSV file MUST have a first row that contains the labels of its columns. Otherwise the CSV connector will not work. |
...