...
For example, you can use it to extract e-mail addresses, phone numbers, or product numbers from documents. You can also extract words that appear in a predefined list and later display them as facets.
How to use it
Add the Regex Entity Connector to the job pipeline in the Connection tab of MCF.
...
You can add as many combinations of destination metadata, regular expression, and source metadata as you want by clicking the Add button.
Some examples of useful regular expressions:
Ignore case: (?i)searched_word matches “searched_word” regardless of letter case.
Retrieve the whole line containing a word: .*searched_word.*
Match a literal dot: \. (the backslash “\” is the escape character).
Spaces are significant: searching for “word1 word2” matches only that exact sequence, including the single space, in the content.
E-mail addresses (top-level domain of 2 to 5 letters): ([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})
Phone numbers (international format, a leading “+” is required): (\+)([\s.\(\)]*\d{1}){8,13}(-)?(\d{1,5})
Match “word1” or “word2”: word1|word2
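Before entering an expression in the connector, it can help to try it on sample text. The following is a minimal sketch in Python, not part of the connector itself. It assumes that these particular patterns behave the same way in Python's re module as in the regex engine used by the connector, which is reasonable for such simple expressions but should be confirmed on real documents. The names PATTERNS and extract are hypothetical helpers introduced only for this illustration.

```python
import re

# Patterns copied verbatim from the examples above.
PATTERNS = {
    "ignore_case": r"(?i)searched_word",
    "whole_line": r".*searched_word.*",
    "literal_dot": r"\.",
    "exact_phrase": r"word1 word2",
    "email": r"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})",
    "phone": r"(\+)([\s.\(\)]*\d{1}){8,13}(-)?(\d{1,5})",
    "either": r"word1|word2",
}


def extract(name, text):
    """Return the full text of every match of the named pattern in text."""
    return [m.group(0) for m in re.finditer(PATTERNS[name], text)]


# Case-insensitive match.
print(extract("ignore_case", "A Searched_Word here"))
# ['Searched_Word']

# "." does not match a newline by default, so only the matching line is returned.
print(extract("whole_line", "first line\nhas searched_word inside\nlast line"))
# ['has searched_word inside']

# Two spaces between the words do not match the single-space pattern.
print(extract("exact_phrase", "word1  word2 and word1 word2"))
# ['word1 word2']

print(extract("email", "Contact: jane.doe@example.com"))
# ['jane.doe@example.com']

# The phone pattern requires a leading "+".
print(extract("phone", "Call +33 1 23 45 67 89 today"))
# ['+33 1 23 45 67 89']
print(extract("phone", "Call 01 23 45 67 89 today"))
# []

print(extract("either", "word2 then word1"))
# ['word2', 'word1']
```

The second phone example shows a limitation worth knowing before deployment: numbers written without an international prefix are not extracted by this expression.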
...