...
Once you are satisfied with the configuration of the Spacy NER job, schedule it in a time window different from that of the original job if you plan to run both jobs on the same MCF node. The best option is to create the NER job on a dedicated MCF node, so that you can run it at any time you choose. In either case, make sure that the crawling time window of your Spacy job occurs AFTER the crawling time window of the corresponding non-Spacy job. Otherwise, your Spacy-extracted entities will be deleted by the non-Spacy job crawl.
Note also that if you run two jobs at the same time on one MCF node, they will interfere with each other, because MCF has only one processing queue for documents. MCF will therefore queue documents from the standard job and the Spacy job in no particular order, which lengthens the processing time of both jobs. More importantly, some documents may be processed by the Spacy job BEFORE the standard job, in which case the Spacy-extracted entities WILL BE LOST, because the last version of the document to be indexed will be the one without the extracted entities.
The recommended method is to create the Spacy job on a dedicated MCF node, while still making sure that its time window is AFTER the time window of its corresponding non-Spacy job (which is on another MCF node).
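The ordering rule above can be checked before the schedules are saved. The following is only a minimal sketch and not part of MCF or Spacy: the `CrawlWindow` class, the `spacy_window_is_safe` function, and the example hours are hypothetical names and values chosen for illustration, and the sketch assumes both windows are daily and do not cross midnight.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class CrawlWindow:
    """A daily crawl window (hypothetical helper; assumed not to cross midnight)."""
    start: time
    end: time

    def __post_init__(self):
        # Reject windows that end before (or exactly when) they start.
        if self.start >= self.end:
            raise ValueError("window must start before it ends on the same day")


def spacy_window_is_safe(standard: CrawlWindow, spacy: CrawlWindow) -> bool:
    """Return True when the Spacy window starts only after the standard window has ended."""
    return spacy.start >= standard.end


# Illustrative schedules only: standard job from 01:00 to 03:00, Spacy job from 03:00 to 05:00.
standard_job = CrawlWindow(start=time(1, 0), end=time(3, 0))
spacy_job = CrawlWindow(start=time(3, 0), end=time(5, 0))
print(spacy_window_is_safe(standard_job, spacy_job))
```

A check like this only compares the configured windows. It does not verify that the standard job actually finishes within its window, so leaving some margin between the two windows is a reasonable precaution.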