Datafari generates a statistic log each time a query is performed by a user.
Here is what a statistic log looks like:
2015-11-05 15:24:45 STAT StatsPusher:95 - 508e9b3e-fdb0-4831-8fab-2acad81f1cb5|2015-11-05T14:21:29.740+0100|engine|0|4|1|7|2|1|[engine//////4///7///1//////, engine///(extension:docx )///2///6///1//////, ///////////////file:/home/youp/Downloads/doc/Alertes.docx///2]|||file:/home/youp/Downloads/doc/Alertes.docx
It follows a specific format:
[log4j_timestamp] [log_level] [logger_class]:[line_in_class] - [query_id] | [query_timestamp] | [query] | [noHits] | [numFound] | [numClicks] | [QTime] | [positionClickTot] | [click] | [history] | [spell] | [suggest] | [url]
Let us explain each field:
- [log4j_timestamp] : is the timestamp set by the log4j API
- [log_level] : is literally the log level
- [logger_class] : is the name of the class which has generated the log
- [line_in_class] : is the line number in the class that has generated the log
- [query_id] : is the id of the query. This id is used to keep track of the user behaviour.
For example, if the user searches for the word "engine", a query id is generated. Then, if the user clicks on the facet "Language=>fr" in the query results, the facet "sub-query" keeps the id of the "engine" query but enriches its history.
- [query_timestamp] : is the full timestamp of the query, measured by Datafari
- [query] : is the literal query performed by the user
- [noHits]
- [numFound]
- [numClicks]
- [QTime]
- [positionClickTot]
- [click]
- [history]
- [spell]
- [suggest]
- [url]
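To make the format above concrete, here is a minimal Python sketch that splits a statistic log line into its named fields. It is not part of Datafari: the names parse_stat_line, FIELDS and PREFIX are hypothetical, and the sketch assumes that the log4j prefix has the same layout as in the sample line and that no field (in particular the query) contains a "|" character. All values are kept as strings, since their exact types are not specified here.

```python
import re

# Field names, in the order given by the documented format.
FIELDS = [
    "query_id", "query_timestamp", "query", "noHits", "numFound",
    "numClicks", "QTime", "positionClickTot", "click", "history",
    "spell", "suggest", "url",
]

# Assumed log4j prefix layout: "<date> <time> <level> <class>:<line> - <payload>"
PREFIX = re.compile(
    r"^(?P<log4j_timestamp>\S+ \S+) (?P<log_level>\S+) "
    r"(?P<logger_class>[^:\s]+):(?P<line_in_class>\d+) - (?P<payload>.*)$"
)


def parse_stat_line(line):
    """Return a dict of field name -> string value, or None if the line does not match."""
    match = PREFIX.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Assumption: "|" only appears as the field separator.
    values = record.pop("payload").split("|")
    if len(values) != len(FIELDS):
        return None
    record.update(zip(FIELDS, values))
    return record


# The sample line shown earlier on this page.
SAMPLE = (
    "2015-11-05 15:24:45 STAT StatsPusher:95 - "
    "508e9b3e-fdb0-4831-8fab-2acad81f1cb5|2015-11-05T14:21:29.740+0100|engine|0|4|1|7|2|1|"
    "[engine//////4///7///1//////, engine///(extension:docx )///2///6///1//////, "
    "///////////////file:/home/youp/Downloads/doc/Alertes.docx///2]|||"
    "file:/home/youp/Downloads/doc/Alertes.docx"
)

record = parse_stat_line(SAMPLE)
print(record["query_id"], record["query"], record["numFound"], record["url"])
```

Lines that do not match the assumed layout return None instead of raising an error, so the function can be applied to a whole log file that also contains other entries.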
By default, these logs are not displayed in the console but are written to specific log files.
The configuration of the log files (path, size, number, etc.) can be set in the log4j properties file located at tomcat/lib/log4j.properties.