...

Select the source type you want to crawl. In my case, I want to crawl an SMB server, so I will choose “Create a Job Filer”. Before doing so, follow the “Add the JCIFS-NG Connector to Datafari - Community Edition” documentation to install the JCIFS-NG library, which is not pre-installed because it is distributed under an LGPL licence.

  • Server: the URL of your SMB file system.

  • User & password: credentials to access your file system.

  • Path: path to the repository you want to crawl.

...

  • Model: You can leave this blank to use the default model, which is configured in the model.json file. As of November 2023, the default model is en_core_web_trf.

  • Endpoint: use the default one, “/split_detect_and_process/”.

  • Prefix: This defines how your metadata will be named. In our example, we will set it to “entity_” as recommended, so we can use the existing field “entity_person”.
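To illustrate how the prefix shapes the metadata names, here is a minimal Python sketch. It is not Datafari's implementation: the function names `metadata_field_name` and `group_entities` are hypothetical, and it assumes that the entity label returned by the model (for example “PERSON”) is lowercased and appended to the prefix, which is consistent with the “entity_person” field mentioned above.

```python
from typing import Dict, List, Tuple


def metadata_field_name(prefix: str, entity_label: str) -> str:
    """Build the metadata field name for one entity label.

    Assumption (not confirmed by the Datafari docs): the label is
    lowercased and appended to the prefix.
    """
    return f"{prefix}{entity_label.lower()}"


def group_entities(prefix: str, entities: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (label, text) pairs into a mapping of field name -> values."""
    fields: Dict[str, List[str]] = {}
    for label, text in entities:
        fields.setdefault(metadata_field_name(prefix, label), []).append(text)
    return fields


# Hypothetical entities, made up for this illustration only.
detected = [
    ("PERSON", "Ada Lovelace"),
    ("ORG", "Example Corp"),
    ("PERSON", "Alan Turing"),
]
fields = group_entities("entity_", detected)
print(fields)
```

With the prefix “entity_”, both person names end up under “entity_person”. Choosing a different prefix would produce field names that do not match the existing field.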

...

Note

Warning: remember that the one you want to launch is the one having “NER” in its name.

This operation may take some time, depending on the size of your file set and your server performance. Documents should start appearing in the Datafari Search UI.

...