Valid as of Datafari 4.6
It can sometimes be useful to force a job to reindex all of its documents in MCF. When we do this, the documents remain in Solr; they are not deleted. For a filer job, MCF normally compares each document with the version stored in its internal database before fetching the content of the file. If they are the same, the document is not fetched because it is unchanged.
For other jobs, such as a web job, MCF does a full indexing every time the job starts, because it cannot compare a web page to crawl with a web page already crawled, so this process is not useful. For other sources, if you are not sure how the crawler behaves, you can apply this process.
For jobs that do skip unchanged documents, a modification made to the job itself, such as adding a metadata field, is not visible in the index, because the unchanged documents are not reindexed.
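To illustrate why unchanged documents are skipped, the comparison can be sketched as follows. This is a simplified model and not MCF's actual implementation: the function name `should_fetch`, the dictionary standing in for the internal database, and the format of the version string are all hypothetical.

```python
def should_fetch(doc_id, current_version, stored_versions):
    """Return True when the document must be fetched and reindexed."""
    # A document is skipped only if a version is recorded and it matches.
    return stored_versions.get(doc_id) != current_version


# Hypothetical record kept from a previous crawl.
stored_versions = {"share/report.pdf": "modified=2020-01-15;size=2048"}
unchanged = "modified=2020-01-15;size=2048"

# Same version as last time: the file is skipped.
skipped = not should_fetch("share/report.pdf", unchanged, stored_versions)

# Forgetting the records makes every document look new again.
stored_versions.clear()
refetched = should_fetch("share/report.pdf", unchanged, stored_versions)

print(skipped, refetched)  # prints: True True
```

In this model, a change to the job configuration does not alter the version of the file, which is why the document is skipped until the stored records are forgotten.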
The process has 2 steps, both done in the MCF admin UI:
First, reset the seeding of the job. Go to Jobs → List all jobs and click ‘View’ next to the job whose documents must be reindexed. At the bottom of the job page, click the yellow button: “Reset seeding”.
Second, remove the records of the output connection. Go to Outputs → List Output Connections, then click ‘View’ next to the output: DatafariSolrNoTika. Then click the yellow button: “Remove all associated records”.
You can now start your job, and all the documents will be indexed as if the job were being launched for the first time.
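If this procedure has to be scripted, the same sequence could in principle be expressed against the ManifoldCF REST API. The following is only a sketch under several assumptions: the base URL and the job ID are hypothetical placeholders, and the resource paths `reseed/<job_id>`, `reset/outputconnections/<name>` and `start/<job_id>` are assumed to correspond to “Reset seeding”, “Remove all associated records” and the job start. Check them against the API documentation of your MCF version before relying on them. Authentication is not handled, and the code only builds the list of requests without sending anything.

```python
from urllib.parse import quote

# Hypothetical base URL; adjust to where the MCF API service is exposed.
MCF_API_BASE = "http://localhost:8345/mcf-api-service/json"


def reindex_plan(job_id, output_connection="DatafariSolrNoTika", base=MCF_API_BASE):
    """Return the (method, url) pairs for a forced reindex, in order.

    Assumed mapping to the UI steps (verify for your MCF version):
      reseed/<job_id>                -> "Reset seeding"
      reset/outputconnections/<name> -> "Remove all associated records"
      start/<job_id>                 -> start the job
    """
    # Plain URL-encoding only; names containing '/' or '.' may need the
    # extra escaping described in the MCF API documentation.
    name = quote(output_connection, safe="")
    job = quote(str(job_id), safe="")
    return [
        ("PUT", f"{base}/reseed/{job}"),
        ("PUT", f"{base}/reset/outputconnections/{name}"),
        ("PUT", f"{base}/start/{job}"),
    ]


if __name__ == "__main__":
    # "1234567890123" is a made-up job ID used for illustration.
    for method, url in reindex_plan("1234567890123"):
        print(method, url)
```

Building the plan separately from sending it makes it possible to review the exact requests first, which seems prudent given that the second request discards the indexing records of the output connection.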