chukwa-dev mailing list archives

From Luangsay Sourygna
Subject Creating a new adaptor: FileTailingAdaptor that would not cut lines
Date Thu, 18 Apr 2013 18:33:30 GMT
Hi all,

FileTailingAdaptor is great for tailing log files and sending them to Hadoop.
However, the last line of a chunk is usually cut, which leads to errors.

I know that we can use CharFileTailingAdaptorUTF8 to solve this problem.
Nonetheless, that adaptor causes MapProcessor.process() to be called for every
line in each chunk, which slows down the Demux phase a lot.

I suggest creating a new adaptor that would mix the benefits of the two
adaptors: the (Demux) speed of FileTailingAdaptor and
the preservation of lines from CharFileTailingAdaptorUTF8.

The implementation of extractRecords() would be:
- loop over the buffer, starting from the end of the buffer and going
  backwards
- when we find a separator, save its offset and exit the loop
- the rest of the method would be similar to CharFileTailingAdaptorUTF8.
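
To make the idea concrete, here is a minimal sketch of the backward scan in
plain Java. It does not use the Chukwa API: the class LineBoundary and the
methods lastSeparatorOffset() and completeLines() are hypothetical names made
up for illustration, and '\n' is assumed to be the separator. A real
extractRecords() would also have to handle offsets and chunk creation the
way the existing adaptors do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Chukwa: finds where the last complete
// line ends in a buffer so that only whole lines are sent as one chunk.
public class LineBoundary {

    // Scans buf[0..length) from the end towards the start and returns the
    // index of the last separator byte, or -1 if there is none.
    static int lastSeparatorOffset(byte[] buf, int length, byte separator) {
        for (int i = length - 1; i >= 0; i--) {
            if (buf[i] == separator) {
                return i;
            }
        }
        return -1;
    }

    // Returns the prefix of buf that ends with the last separator
    // (separator included). The result is empty if no separator was found,
    // in which case the caller would wait for more data.
    static byte[] completeLines(byte[] buf, int length) {
        int offset = lastSeparatorOffset(buf, length, (byte) '\n');
        return Arrays.copyOfRange(buf, 0, offset + 1);
    }

    public static void main(String[] args) {
        byte[] buf = "line one\nline two\npartial li"
                .getBytes(StandardCharsets.UTF_8);
        byte[] chunk = completeLines(buf, buf.length);
        // Only the two complete lines are kept; "partial li" stays behind
        // for the next read.
        System.out.print(new String(chunk, StandardCharsets.UTF_8));
    }
}
```

Scanning for the byte 0x0A is safe with UTF-8 input, because every byte of a
multi-byte UTF-8 sequence has its high bit set and so cannot be mistaken for
a newline. The scan runs once per buffer, and in the usual case it stops
after a few bytes because the last separator is close to the end.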

Could you please tell me what you think about it?
How do you currently handle cut lines with Chukwa?


