[DEPRECATED] Usage Analytics - Enterprise Edition

Deprecated

This documentation is deprecated as of Datafari version 5.2. Kibana has been replaced by Apache Zeppelin and the new “dashboards” are presented in the following documentation: https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/2736095237

Valid from 4.0

The documentation below is valid from Datafari v4.0 upwards

This dashboard is more advanced than the one available in the Community Edition (Usage Analytics [DEPRECATED]). It is based on the statistics logs generated by Datafari, which Logstash inserts into Elasticsearch under the "statistic/logs" index.

It shows various information about the behaviour of users on Datafari.

In the first part of this dashboard, many visualizations are available:

  • Top 20 User Queries With No Search Results: Lists the most frequent user queries for which the search engine did not find any relevant result. This may indicate that a data source that is not yet indexed should be added to the crawling scope.

  • Top 20 User Queries With Results but No Clicks: Lists the most frequent user queries for which the search engine proposed a list of results but the user did not click on any of them. This may indicate that the ranking was not satisfactory. The search expert may investigate personally or ask colleagues about it.

  • Top 10 Query Terms: Shows the 10 most searched terms in Datafari. For each term, the chart shows the total number of times it was queried, split into the number of queries where users clicked on a result and the number of queries where they did not.

  • Hits x Zero Hits: Shows the distribution between requests that had at least one hit and those that had none.

  • Clicked x Not Clicked: Shows the distribution between queries where users clicked on at least one result and those where they did not.

  • Top 10 User Searches: Lists the 10 most frequent user searches. Unlike the Top 10 Query Terms, this list may contain phrases or expressions instead of just single terms.
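The "no results" and "results but no clicks" rankings, as well as the Hits x Zero Hits distribution, come down to counting log records by category. The sketch below illustrates that counting logic in plain Python. The record layout (`query`, `num_found`, `clicked`) is a hypothetical simplification chosen for the example; it is not the actual field naming of the Datafari statistics logs, and the dashboard itself computes these figures through Kibana visualizations rather than through code like this.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout (NOT the real Datafari statistics schema):
# {"query": str, "num_found": int, "clicked": bool}
Record = Dict[str, object]


def top_queries(
    records: Iterable[Record], *, zero_hits: bool, top_n: int = 20
) -> List[Tuple[str, int]]:
    """Return the most frequent queries in one of two categories.

    zero_hits=True  -> queries that returned no result at all
    zero_hits=False -> queries that returned results but got no click
    """
    counts: Counter = Counter()
    for rec in records:
        has_hits = int(rec["num_found"]) > 0
        if zero_hits and not has_hits:
            counts[str(rec["query"])] += 1
        elif not zero_hits and has_hits and not rec["clicked"]:
            counts[str(rec["query"])] += 1
    return counts.most_common(top_n)


def hits_distribution(records: Iterable[Record]) -> Dict[str, int]:
    """Count requests with at least one hit versus requests with none."""
    dist = {"hits": 0, "zero_hits": 0}
    for rec in records:
        key = "hits" if int(rec["num_found"]) > 0 else "zero_hits"
        dist[key] += 1
    return dist
```

Calling `top_queries(records, zero_hits=True)` mirrors the first visualization, and `top_queries(records, zero_hits=False)` mirrors the second.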

In the second part of this dashboard, we have:

  • History table: Lists the complete history of user queries, with a notion of session tracking. Date corresponds to the query date. Consulted pages corresponds to the pagination. In the screenshot, read it as follows: 1,0,0,1,0 means that the user was on the first page (first 10 results), then the first 0 means that the user used a facet, as does the second 0, and the following 1 means that the user either reloaded the page or clicked a document.

Consulted pages column

Understanding the Consulted Pages column may be a bit complex. Here is how to interpret it. When a number greater than 0 is displayed, it corresponds to the display of a results list, following the pagination: 1 corresponds to the first 10 results, 4 corresponds to results 31 to 40, and so on. If the number is greater than 1 and, somewhere further in the sequence, it goes back to 1, there is an ambiguity: either the user selected a facet (the most likely case) or the user went back to the first page (rarer). When a 0 is displayed, it corresponds to a click on a result. In the screenshot above, it means that for the "performance" query, the user saw the first page (1), clicked on a result (0), then on a second result (0), then selected a facet (1), and clicked on a third result (0).
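The interpretation rules above can be sketched as a small decoder. This is only an illustration of the reading described in this section, assuming 10 results per page as stated; the function names are invented for the example and are not part of Datafari. The decoder does not try to resolve the facet versus first-page ambiguity: any value greater than 0 is simply reported as a results list display.

```python
from typing import List, Tuple

RESULTS_PER_PAGE = 10  # page size implied by the text (page 1 = first 10 results)


def page_range(page: int, per_page: int = RESULTS_PER_PAGE) -> Tuple[int, int]:
    """Return the (first, last) result ranks shown on a 1-based results page."""
    if page < 1:
        raise ValueError("page must be >= 1")
    first = (page - 1) * per_page + 1
    return first, page * per_page


def decode_consulted_pages(sequence: str) -> List[str]:
    """Turn a 'Consulted pages' value such as '1,0,0,1,0' into readable events."""
    events = []
    for token in sequence.split(","):
        value = int(token.strip())
        if value == 0:
            # 0 stands for a click on a result
            events.append("click on a result")
        else:
            # any value > 0 stands for the display of a results list
            first, last = page_range(value)
            events.append(f"results list, page {value} (results {first}-{last})")
    return events
```

For instance, `page_range(4)` gives `(31, 40)`, which matches the "results 31 to 40" case mentioned above.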

  • Query explore: Lists the complete history of Solr queries. You can drill down into each of them to get the complete dump.

In the last part of this dashboard, we have:

  • Requests: Shows the number of requests over time, along with the distribution between requests that had at least one hit and those that had none.

  • Average Response Time (ms): Shows the average response time of requests, in milliseconds, over time.

This dashboard is built from the data corresponding to the selected time frame, which is 24 hours by default.
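To make the effect of the time frame concrete, here is a minimal sketch of how an average response time per hour could be computed over a 24-hour window. The input format (pairs of timestamp and response time in milliseconds) and the function name are assumptions made for the example; the actual dashboard relies on Kibana's own time filter and aggregations.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def average_response_time_by_hour(
    records: Iterable[Tuple[datetime, float]],
    now: datetime,
    window: timedelta = timedelta(hours=24),  # default time frame from the text
) -> Dict[datetime, float]:
    """Average response time (ms) per hour for records inside the time frame."""
    start = now - window
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, response_ms in records:
        if start <= timestamp <= now:
            # Truncate the timestamp to the start of its hour to form the bucket
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            buckets[hour].append(response_ms)
    return {hour: sum(v) / len(v) for hour, v in sorted(buckets.items())}
```

Records older than the window are ignored, which is why changing the selected time frame changes every chart of the dashboard.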

You can create more charts, based on other data, by learning how Kibana works and how our ELK layout is organized.