Posts

Showing posts from July, 2016

ElasticSearch pipeline bucket selector aggregation

ElasticSearch supports bucket selection over the buckets produced by an aggregation. This works as a pipeline: an aggregation first generates buckets, and a bucket selector then filters out the buckets that do not meet a condition. We have an ElasticSearch index 'daily_reports', where each row represents a particular version of a report. When a report is created, a new row is inserted into the index with a new 'reportId' field value and a 'publishDate' field holding the UNIX timestamp. Each row has several other fields representing properties of the report, e.g., 'title', 'activity', 'reportStatus', 'reportLevel', etc. When the report is edited or deleted, a new row is inserted into the index with the same 'reportId' but a different '_id', 'publishDate', 'reportLevel', etc. Now, if a user wants to get the latest version of each report matching a particular filter criterion (reportLevel = Monitoring AND repor
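
The two-stage idea described here (bucket first, then select buckets) can be illustrated in memory with plain Python. This is only a sketch of the logic, not ElasticSearch's query DSL and not necessarily the approach the post goes on to use. The function name `latest_matching_reports`, the sample rows, and the 'Audit' level are hypothetical, and only the `reportLevel` condition is shown because the rest of the criterion is cut off in the text.

```python
from collections import defaultdict


def latest_matching_reports(rows, predicate):
    """Return {reportId: latest row} for reports whose latest version passes predicate."""
    # Stage 1: group all versions by reportId (the "bucket" step).
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["reportId"]].append(row)

    # Stage 2: keep a bucket only if its newest version satisfies the filter
    # (the "bucket selection" step).
    selected = {}
    for report_id, versions in buckets.items():
        latest = max(versions, key=lambda r: r["publishDate"])
        if predicate(latest):
            selected[report_id] = latest
    return selected


# Hypothetical sample data: report r1 was edited and its level changed.
rows = [
    {"_id": "a1", "reportId": "r1", "publishDate": 1467331200, "reportLevel": "Monitoring"},
    {"_id": "a2", "reportId": "r1", "publishDate": 1467417600, "reportLevel": "Audit"},
    {"_id": "b1", "reportId": "r2", "publishDate": 1467331200, "reportLevel": "Monitoring"},
]

result = latest_matching_reports(rows, lambda r: r["reportLevel"] == "Monitoring")
print(sorted(result))
```

Filtering the rows before grouping would wrongly return the older 'a1' version of report r1; applying the condition to the newest row of each bucket drops r1 entirely, which is the behaviour the scenario asks for.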