Concepts of our crawling framework

Everything starts with a repository connector, which defines the type of data source you want to index. If you only want to index a few documents quickly, you can use the simplified connector mechanism directly from the Datafari admin UI.

Optionally, you can create a job to instantiate your repository connector and define exactly what you want to index.

Within this job, you can also define an indexing pipeline that uses transformation connectors.
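To make the relationship between these three concepts concrete, here is a minimal conceptual sketch in Python. It is not the Datafari or Apache ManifoldCF API: the names RepositoryConnector, Job, lowercase_title and tag_source, as well as the example URL, are hypothetical and only illustrate how a job could bind a connector to a scope and an ordered pipeline of transformations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model for illustration only; not the Datafari or ManifoldCF API.
Document = Dict[str, str]
Transformation = Callable[[Document], Document]


@dataclass
class RepositoryConnector:
    """Describes the type of data source to index (web, file share, ...)."""
    name: str
    source_type: str


@dataclass
class Job:
    """Binds a repository connector to a scope and an ordered pipeline."""
    connector: RepositoryConnector
    scope: List[str]  # what to index, e.g. seed URLs or paths
    pipeline: List[Transformation] = field(default_factory=list)

    def process(self, document: Document) -> Document:
        # Apply each transformation step in the order it was declared.
        for transform in self.pipeline:
            document = transform(document)
        return document


def lowercase_title(document: Document) -> Document:
    # Example transformation: normalize the title field.
    return {**document, "title": document.get("title", "").lower()}


def tag_source(document: Document) -> Document:
    # Example transformation: record where the document came from.
    return {**document, "source": "web"}


web = RepositoryConnector(name="intranet", source_type="web")
job = Job(
    connector=web,
    scope=["https://example.com/"],
    pipeline=[lowercase_title, tag_source],
)
result = job.process({"title": "Hello World"})
```

In this sketch the order of the pipeline list matters, since each transformation receives the output of the previous one. That mirrors the idea of chaining transformation connectors inside a job.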

The following illustration summarizes these concepts.

Advanced Crawling website

Please consult the Apache ManifoldCF Connectors documentation for further details.