
Valid for Datafari 6.0 upwards

Datafari_CE_6_0_Architecture_mono.png

...

Valid for 5.2

...

Valid for V4.0

...

Valid for V2.0

Datafari follows a typical search engine architecture, built on the triptych of crawling, indexing and search. Crawling relies on Apache ManifoldCF. Another open source connector framework exists under the Apache licence v2: the Google Connector Framework. However, it is supported and developed by Google alone, so it seemed more reasonable to us to build on ManifoldCF, which is hosted by the Apache Software Foundation and benefits from the support of several committers from different entities.

The indexing and search parts both use Apache Solr. Here too, another popular indexing and search engine is available under the Apache licence v2: Elasticsearch. But like the Google Connector Framework, it is led by the eponymous company, and as such does not offer the same guarantee of longevity as Apache Solr.

Although ManifoldCF is independent from Apache Solr, it was conceived from the start as Solr’s connector framework. The two are therefore conceptually a natural fit.
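To make the crawling, indexing and search triptych concrete, here is a minimal in-memory sketch in Python. It is only an illustration of how the three stages hand off to one another: it does not use ManifoldCF or Solr, and every name in it (`Document`, `crawl`, `ToyIndex`, the sample repository) is hypothetical and not part of Datafari.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A crawled item: an identifier plus its extracted text."""
    doc_id: str
    text: str


def crawl(repository):
    """Crawling stage (the role ManifoldCF plays): walk a source and emit documents."""
    for doc_id, text in repository.items():
        yield Document(doc_id, text)


class ToyIndex:
    """Indexing and search stages (the role Solr plays): a tiny inverted index."""

    def __init__(self):
        # Maps each lowercase token to the set of document ids containing it.
        self._postings = defaultdict(set)

    def add(self, doc):
        """Indexing: record which document each token appears in."""
        for token in doc.text.lower().split():
            self._postings[token].add(doc.doc_id)

    def search(self, query):
        """Search: return the sorted ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


# Hypothetical source content standing in for a real repository.
repository = {
    "a.txt": "open source search engine",
    "b.txt": "connector framework for search",
}

index = ToyIndex()
for doc in crawl(repository):
    index.add(doc)

print(index.search("search"))       # documents containing "search"
print(index.search("open search"))  # documents containing both terms
```

The crawler knows nothing about how documents are indexed, and the index knows nothing about where documents come from. That separation is what allows the crawling framework and the indexing and search engine to be chosen independently.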

The figure below illustrates the system architecture that we are using for Datafari. Version 2 has grown rather large in terms of components, so the diagram is deliberately high level and omits some connections and components for the sake of clarity.


...