Showing posts from 2016

Kafka MirrorMaker in Kafka

Check MirrorMaker.scala for more details.
Target cluster setup

Download and install Kafka (target cluster). Select the appropriate version and download its tgz from the Kafka Downloads page.

tar -zxf kafka_2.11-
cd kafka_2.11-

Configure Target Kafka cluster's ZooKeeper

vim ./config/

# the directory where the snapshot is stored.
dataDir=/work/kafka_2.11-
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0

Start Target Kafka cluster's ZooKeeper

./bin/ config/

Configure Target Kafka cluster's Server

vim ./config/

# The id of the broker. This must be set to a unique integer for each broker.
# The number of threads handling network requests
# The number of threads doing disk I/O
# The send buffer (SO_…

Generating Multi-Domain (SAN) Certificates

The Subject Alternative Name field lets you specify additional host names (sites, IP addresses, common names, etc.) to be protected by a single SSL Certificate, such as a Multi-Domain (SAN) or Extended Validation Multi-Domain Certificate.
Secure Host Names on Different Base Domains in One SSL Certificate: A Wildcard Certificate can protect all first-level subdomains on an entire domain, such as * However, a Wildcard Certificate cannot protect both and

Host Multiple SSL Sites on a Single IP Address: Hosting multiple SSL-enabled sites on a single server typically requires a unique IP address per site, but a Multi-Domain (SAN) Certificate with Subject Alternative Names can solve this problem. Microsoft IIS and Apache are both able to Virtual Host HTTPS sites using Multi-Domain (SAN) Certificates.

Greatly Simplify Your Server's SSL Configuration: Using a Multi-Domain (SAN) Certificate saves you the hassle and time involved in co…
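
The difference between wildcard coverage and a SAN list can be illustrated with a small Python sketch. It uses a simplified matching rule (the '*' stands in for exactly one left-most label), and the example.com / example.net names are placeholders chosen for illustration, not names from this post. Real TLS clients apply additional checks, so treat this as a model of the idea rather than a validator.

```python
from typing import Iterable


def name_covers(pattern: str, hostname: str) -> bool:
    """Return True if one certificate name covers the given hostname.

    Simplified rule (an assumption of this sketch): a leading '*' matches
    exactly one left-most DNS label; every other label must match exactly.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def cert_covers(names: Iterable[str], hostname: str) -> bool:
    """A certificate covers a hostname if any of its names covers it."""
    return any(name_covers(name, hostname) for name in names)


# Placeholder certificates: one wildcard, one with a SAN list.
wildcard_cert = ["*.example.com"]
san_cert = ["www.example.com", "www.example.net"]

print(cert_covers(wildcard_cert, "shop.example.com"))  # first-level subdomain
print(cert_covers(wildcard_cert, "www.example.net"))   # different base domain
print(cert_covers(san_cert, "www.example.net"))        # listed explicitly as a SAN
```

Under this rule the wildcard certificate covers only hosts one level below its base domain, while the SAN list can mix names from unrelated base domains.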

ElasticSearch pipeline bucket selector aggregation

ElasticSearch supports bucket selection on the buckets generated by an aggregation.
This works as a pipeline: a first aggregation generates buckets, and a bucket selector then filters out the buckets that do not satisfy a condition.
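
As a rough sketch of the two stages, the request body below is built as a Python dict and printed as JSON. The field names ('category', 'price'), the aggregation names and the threshold are hypothetical, chosen only for illustration. The script uses the `params.` prefix expected by newer ElasticSearch versions (Painless); older 2.x releases referenced the variable directly, so adjust it to the version in use.

```python
import json

# Hypothetical fields and threshold; only the overall shape matters here.
request_body = {
    "size": 0,  # return aggregation results only, no hits
    "aggs": {
        "by_category": {
            # Stage 1: the terms aggregation creates one bucket per distinct value.
            "terms": {"field": "category"},
            "aggs": {
                # A metric computed inside each bucket.
                "total_price": {"sum": {"field": "price"}},
                # Stage 2: bucket_selector keeps a bucket only if the script is true.
                "only_large": {
                    "bucket_selector": {
                        # Map a script variable to the metric defined above.
                        "buckets_path": {"total": "total_price"},
                        "script": "params.total > 1000",
                    }
                },
            },
        }
    },
}

print(json.dumps(request_body, indent=2))
```

The selector is declared as a sibling of the metric it reads, inside the aggregation whose buckets it filters.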

We have an ElasticSearch index 'daily_reports', where each row represents a particular version of a report.
When a report is created, a new row is inserted into the index with a new 'reportId' field value and a 'publishDate' field holding the UNIX timestamp.
Each report/row has several other fields representing properties of the report, e.g., 'title', 'activity', 'reportStatus', 'reportLevel', etc.
When the report is edited or deleted, a new row is inserted into the index with the same 'reportId' but a different '_id', 'publishDate', 'reportLevel', etc.
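
Under this append-only model, the current state of a report is the row with the greatest 'publishDate' among the rows sharing its 'reportId'. The plain-Python sketch below makes that concrete with made-up rows; the '_id' values, timestamps and the 'Alert' level are invented for illustration, and the code only models the data, it does not query ElasticSearch.

```python
# Made-up rows; field names follow the index description above.
rows = [
    {"_id": "a1", "reportId": 1, "publishDate": 1460000000,
     "reportLevel": "Monitoring", "reportStatus": 1},
    {"_id": "a2", "reportId": 1, "publishDate": 1460003600,
     "reportLevel": "Alert", "reportStatus": 1},  # later edit of report 1
    {"_id": "b1", "reportId": 2, "publishDate": 1460001000,
     "reportLevel": "Monitoring", "reportStatus": 1},
]


def latest_versions(rows):
    """Return a dict mapping each reportId to its row with the largest publishDate."""
    latest = {}
    for row in rows:
        current = latest.get(row["reportId"])
        if current is None or row["publishDate"] > current["publishDate"]:
            latest[row["reportId"]] = row
    return latest


latest = latest_versions(rows)
print({report_id: row["_id"] for report_id, row in latest.items()})
# prints {1: 'a2', 2: 'b1'}
```

Note that report 1 matched 'Monitoring' in its first version but not in its latest one, which is why a plain filter on 'reportLevel' is not enough to answer questions about the latest version.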

Now, if a user wants to get the latest version of each report matching a particular filter criterion (reportLevel = Monitoring AND reportStatus = 1), we can get …