Concepts of our crawling framework
Everything starts with a repository connector, which defines the type of data source you want to index. If you want to index some documents quickly, you can use the simplified connector mechanism directly from the Datafari admin UI.
Optionally, you can create a job to instantiate your repository connector and define exactly what you want to index.
Within this job, you can also define an indexing pipeline that uses transformation connectors.
We provide an illustration to show how these pieces fit together.
Please consult the Apache ManifoldCF connectors documentation for further details.
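To make the relationship between these concepts more concrete, here is a minimal conceptual sketch in Python. It is not the Datafari or Apache ManifoldCF API: every name in it (RepositoryConnector, Job, fetch, path_prefix, lowercase_title, tag_as_crawled) is hypothetical and chosen only for illustration. It assumes a repository connector that reads one type of data source, a job that instantiates the connector and restricts what is indexed, and an optional pipeline of transformation steps applied to each document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# Hypothetical types for illustration only; not part of Datafari or ManifoldCF.
Document = Dict[str, str]
Transformation = Callable[[Document], Document]


@dataclass
class RepositoryConnector:
    """Stand-in for a repository connector: knows how to read one type of data source."""
    source_type: str
    documents: List[Document]

    def fetch(self, path_prefix: str) -> Iterable[Document]:
        # Return only the documents located under the requested path.
        return (d for d in self.documents if d["path"].startswith(path_prefix))


@dataclass
class Job:
    """Stand-in for a job: binds a connector, a scope, and an optional pipeline."""
    connector: RepositoryConnector
    path_prefix: str  # defines exactly what to index
    pipeline: List[Transformation] = field(default_factory=list)

    def run(self) -> List[Document]:
        indexed = []
        for doc in self.connector.fetch(self.path_prefix):
            # Apply each transformation step in order before indexing.
            for transform in self.pipeline:
                doc = transform(doc)
            indexed.append(doc)
        return indexed


def lowercase_title(doc: Document) -> Document:
    # Example transformation: normalize the title.
    return {**doc, "title": doc["title"].lower()}


def tag_as_crawled(doc: Document) -> Document:
    # Example transformation: add a metadata field.
    return {**doc, "crawled": "yes"}


connector = RepositoryConnector(
    source_type="file share",
    documents=[
        {"path": "/docs/a.txt", "title": "Report A"},
        {"path": "/tmp/b.txt", "title": "Scratch B"},
    ],
)
job = Job(connector, path_prefix="/docs/", pipeline=[lowercase_title, tag_as_crawled])
result = job.run()
print(result)
```

In this sketch only the document under /docs/ is indexed, and it passes through both transformation steps in order. In Datafari, connectors, jobs, and pipelines are configured through the admin UI rather than written as code.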