ranger-dev mailing list archives

From "Ramesh Mani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (RANGER-1310) Ranger Audit framework enhancement to provide an option to allow audit records to be spooled to local disk first before sending it to destinations
Date Fri, 20 Jan 2017 02:28:26 GMT

    [ https://issues.apache.org/jira/browse/RANGER-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831042#comment-15831042

Ramesh Mani commented on RANGER-1310:

Thanks [~bosco] for your detailed explanation.

The AuditFileCacheProvider I referred to here is the same as the FileQueue you are mentioning.

Here is what I was thinking:
1) We will have one FileQueue which will store the audits in a file first using a FileSpooler.
This FileQueue will be synchronous and will replace the AsyncBatchQueue. There will be only
one FileQueue for all the destinations.
2) The FileSpooler in the FileQueue will periodically take the files which are closed (each
closed file is a batch here) and send them to the AsyncBatchQueue. The AsyncBatchQueue here is
the existing one which sends to multiple destinations; it will keep the existing spooling /
backing for each of its destinations.
3) If Summary is enabled, the FileSpooler in the FileQueue will send to the AsyncSummaryBatchQueue,
which is also the existing one, and from there the summary will be sent to multiple destinations.
Summary is done per file.
4) The flow rate in this case would be the same across destinations (based on the time period
in the FileQueue to close and open an audit file). E.g. Solr will get data every 5 minutes if
the file rollover time is 5 minutes. HDFS will also get the data at the same rate, flushed to
the hdfs cache.
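The flow in steps 1) to 4) could be sketched roughly as below. This is a minimal illustration
only, not Ranger code: the class and method names are hypothetical, and the downstream consumer
stands in for the existing AsyncBatchQueue.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

/**
 * Sketch of the proposed synchronous FileQueue: every audit record is
 * written to a local spool file first; when the rollover period elapses,
 * the closed file (one batch) is handed to a downstream consumer that
 * stands in for the existing AsyncBatchQueue. Names are illustrative.
 */
public class SpoolingFileQueue {
    private final Path spoolDir;
    private final long rolloverMillis;
    private final Consumer<Path> downstream;   // e.g. feeds AsyncBatchQueue
    private Path currentFile;
    private long fileOpenedAt;
    private int fileSeq = 0;

    public SpoolingFileQueue(Path spoolDir, long rolloverMillis,
                             Consumer<Path> downstream) throws IOException {
        this.spoolDir = spoolDir;
        this.rolloverMillis = rolloverMillis;
        this.downstream = downstream;
        Files.createDirectories(spoolDir);
    }

    /** Synchronously append one audit record; roll the file first if due. */
    public synchronized void log(String auditRecord, long nowMillis) throws IOException {
        if (currentFile == null) {
            currentFile = spoolDir.resolve("audit_spool_" + (fileSeq++) + ".log");
            fileOpenedAt = nowMillis;
        } else if (nowMillis - fileOpenedAt >= rolloverMillis) {
            rollover(nowMillis);
        }
        Files.write(currentFile, (auditRecord + "\n").getBytes(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Close the current spool file and hand it downstream as one batch. */
    public synchronized void rollover(long nowMillis) {
        if (currentFile != null) {
            downstream.accept(currentFile);      // batch = one closed file
            currentFile = spoolDir.resolve("audit_spool_" + (fileSeq++) + ".log");
            fileOpenedAt = nowMillis;
        }
    }
}
```

With a 5-minute `rolloverMillis`, every destination behind the downstream consumer would see
batches at the same 5-minute cadence, which is the uniform flow rate described in point 4).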

Regarding Point A "Would we have one FileQueue per Destination or each Destination choose
the reliability level. E.g. Only HDFSDestination needs reliability"
When you say reliability requirement, do you mean that each destination should have its own
FileQueue so it can send to its destination at a different rate? Or that one destination
(say hdfs) will use the FileQueue while another keeps using the existing auditing process
without the FileQueue, based on its reliability requirement? Or altogether a new destination
with High Availability, like KAFKA, which will cater the audit to HDFS / SOLR etc.?

Regarding the data loss, what I found is:
1) In the case of the HDFS Plugin sending audit to HDFS, when the NameNode gets restarted,
the existing reference to an open file in hdfs is lost. HDFS periodically flushes data, but
in some cases, when this has not happened yet, we see a 0-byte dangling file. Surely this is
an issue with the closing of the file. Also, when the NameNode is restarted, the data in the
memory buffer of the AsyncBatchQueue is lost.
2) In the case of, say, the HiveServer2 Plugin sending audit to HDFS, when HiveServer2 is
restarted the data in the AsyncBatchQueue memory queue is lost. In the case of the NameNode
getting restarted while a stream of audit is going into an hdfs file, I see hdfs files getting
closed with partial data, i.e. the audit framework had sent the data to HDFS and it was getting
committed, but due to the abrupt NameNode restart partial records are present (truncated
records).
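One way a spool-file replay could guard against the truncated-record problem described in 2)
is to treat only terminator-delimited records as complete before resending. This is an
illustrative sketch under the assumption of newline-delimited spool records, not Ranger code:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: when replaying a spool file, only newline-terminated lines are
 * treated as complete audit records. A truncated tail (e.g. a write cut
 * short by an abrupt restart) is detected and held back rather than being
 * sent to the destination as a partial record. Illustrative only.
 */
public class SpoolReplay {
    public static List<String> completeRecords(Path spoolFile) throws Exception {
        String text = new String(Files.readAllBytes(spoolFile), StandardCharsets.UTF_8);
        List<String> complete = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                complete.add(text.substring(start, i));
                start = i + 1;
            }
        }
        // Anything after the last '\n' is an incomplete record: skip it
        // here and let the writer repair or re-append it on restart.
        return complete;
    }
}
```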


> Ranger Audit framework enhancement to provide an option to allow audit records to be
spooled to local disk first before sending it to destinations
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>                 Key: RANGER-1310
>                 URL: https://issues.apache.org/jira/browse/RANGER-1310
>             Project: Ranger
>          Issue Type: Bug
>            Reporter: Ramesh Mani
>            Assignee: Ramesh Mani
> Ranger Audit framework enhancement to provide an option to allow audit records to be
spooled to local disk first before sending it to destinations. 
> xasecure.audit.provider.filecache.is.enabled = true ==> This will enable this functionality
of AuditFileCacheProvider to log the audits locally in a file.
> xasecure.audit.provider.filecache.filespool.file.rollover.sec = \{rollover time - default
is 1 day\} ==> this provides time to send the audit records from local to destination and
flush the pipe. 
> xasecure.audit.provider.filecache.filespool.dir=/var/log/hadoop/hdfs/audit/spool ==>
provides the directory where the Audit FileSpool cache is present.
> This helps in avoiding missing / partial audit records in the hdfs destination which
may happen randomly due to restart of respective plugin components. 
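The three properties above could be consumed along these lines. This is a sketch only: the
helper class is hypothetical, and only the property names, the spool directory value, and the
one-day rollover default come from the description above.

```java
import java.util.Properties;

/**
 * Sketch of reading the proposed AuditFileCacheProvider properties with
 * the defaults described in the issue. The class itself is illustrative,
 * not part of Ranger.
 */
public class FileCacheConfig {
    static final String PREFIX = "xasecure.audit.provider.filecache";

    final boolean enabled;
    final long rolloverSec;
    final String spoolDir;

    FileCacheConfig(Properties props) {
        enabled = Boolean.parseBoolean(
                props.getProperty(PREFIX + ".is.enabled", "false"));
        // Default rollover is 1 day, per the issue description.
        rolloverSec = Long.parseLong(
                props.getProperty(PREFIX + ".filespool.file.rollover.sec",
                                  String.valueOf(24 * 60 * 60)));
        spoolDir = props.getProperty(PREFIX + ".filespool.dir",
                                     "/var/log/hadoop/hdfs/audit/spool");
    }
}
```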

This message was sent by Atlassian JIRA
