metron-dev mailing list archives

From Charles Joynt <>
Subject Writing enrichment data directly from NiFi with PutHBaseJSON
Date Fri, 01 Jun 2018 10:26:45 GMT

I work as a Dev/Ops Data Engineer within the security team at a company in London where we
are in the process of implementing Metron. I have been tasked with implementing feeds of network
environment data into HBase so that this data can be used as enrichment sources for our security
events. First off, I wanted to pull in DNS data for an internal domain.

I am assuming that I need to write data into HBase in such a way that it exactly matches what
I would get from the script. A colleague of mine has already loaded some
DNS data using that script, so I am using that as a reference.

I have implemented a flow in NiFi which takes JSON data from an HTTP listener and routes it
to a PutHBaseJSON processor. The flow is working, in the sense that data is successfully written
to HBase, but despite (naively) specifying "Row Identifier Encoding Strategy = Binary", the
results in HBase don't look correct. Comparing the output from HBase scan commands, I see that the script produced:

ROW:      \xFF\xFE\xCB\xB8\xEF\x92\xA3\xD9#xC\xF9\xAC\x0Ap\x1E\x00\x05whois\x00\x0E192.168.0.198
CELL: column=data:v, timestamp=1516896203840, value={"clientname":"server.domain.local","clientip":""}

PutHBaseJSON produced:

ROW:  server.domain.local
CELL: column=dns:v, timestamp=1527778603783, value={"name":"server.domain.local","type":"A","data":""}

From source JSON:


I know that there are some differences in column family / field names, but my worry is the
ROW id. Presumably I need to encode my row key, "k" in the JSON data, in a way that matches
how the script did it.
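
Reading the script's row key byte by byte, it looks like 16 opaque bytes followed by length-prefixed strings: "\x00\x05whois" reads as a two-byte big-endian length (5) and then the text, and the same pattern appears to repeat for the final field. The sketch below only reproduces that apparent layout. It is an inference from the single row shown, not a statement of how Metron builds its keys; the function names are hypothetical, and I don't know how the 16-byte prefix is derived, so the code takes it as an input and uses the bytes copied from the scan output purely as a placeholder.

```python
import struct


def length_prefixed(text: str) -> bytes:
    """Encode text as a 2-byte big-endian length followed by its UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def build_row_key(prefix: bytes, enrichment_type: str, indicator: str) -> bytes:
    """Assemble prefix + type + indicator in the layout seen in the scan output.

    Assumption: the prefix is 16 bytes computed elsewhere (how is unknown here).
    """
    if len(prefix) != 16:
        raise ValueError("expected a 16-byte prefix")
    return prefix + length_prefixed(enrichment_type) + length_prefixed(indicator)


# Placeholder prefix: the 16 leading bytes of the row shown above, as hex.
# A real prefix would presumably differ for every key.
PREFIX = bytes.fromhex("fffecbb8ef92a3d9237843f9ac0a701e")

key = build_row_key(PREFIX, "whois", "server.domain.local")
print(key)
```

If this reading is right, a plain-text row key such as "server.domain.local" would never match a key built this way, which is why the ROW values above look so different.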

Can anyone explain how I might convert my Id to the correct format?
Does this matter, or can Metron use the human-readable ROW ids?

Charlie Joynt

