nifi-users mailing list archives

From Mark Payne <marka...@hotmail.com>
Subject Re: Use Case...Please help
Date Sun, 15 May 2016 19:55:22 GMT
Hi Deepak,

Certainly, this is something that you could use NiFi for. We often see people
using NiFi to sync data from a directory on local disk to a directory in HDFS.
This is typically accomplished by using a flow like:

ListFile -> FetchFile -> PutHDFS

To leave a 0-byte file with the same name in the source directory, you can then
use ReplaceText to set the content to nothing and PutFile to write out the
0-byte content. So the flow would look like:

ListFile -> FetchFile -> PutHDFS -> ReplaceText -> PutFile
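
As a rough illustration of the net effect of this flow (this is not NiFi code),
here is a minimal Python sketch. It assumes a plain local directory standing in
for HDFS, and the function name copy_and_leave_marker is hypothetical:

```python
import os
import shutil

def copy_and_leave_marker(source_root, dest_root, rel_file):
    """Copy one file to the destination tree, then truncate the source to 0 bytes."""
    src = os.path.join(source_root, rel_file)
    dst = os.path.join(dest_root, rel_file)

    # Recreate the source's sub-directory structure under the destination.
    os.makedirs(os.path.dirname(dst), exist_ok=True)

    # Copy the content (the FetchFile -> PutHDFS part of the flow).
    shutil.copyfile(src, dst)

    # Overwrite the source with empty content, keeping the same file name
    # (the ReplaceText -> PutFile part of the flow).
    with open(src, "wb"):
        pass
    return dst
```

For example, copy_and_leave_marker("/staging", "/backup", "Machine1log/a.log")
would write /backup/Machine1log/a.log and leave /staging/Machine1log/a.log as
an empty file.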

PutHDFS has a "Directory" property. If you set this value to "${path}" it will
use the same directory structure that ListFile found the file to be in when it
performed the listing. I.e., if you set ListFile to pull from /data/mydir and
"Recurse Subdirectories" to true, then any file found directly in /data/mydir
will have a 'path' of './' and anything found in /data/mydir/subdir1 will have
a 'path' of './subdir1'. If you would rather have the fully qualified path
(/data/mydir/subdir1), you would use "${absolute.path}" instead of "${path}".
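
To make the difference between the two attributes concrete, here is a small
Python sketch that reproduces the values described above for a given input
directory. It only mirrors the examples in this message; it is not how ListFile
is implemented, and both function names are hypothetical:

```python
import posixpath

def path_attribute(input_dir, file_path):
    """Directory of file_path relative to input_dir, in the './...' form above."""
    parent = posixpath.dirname(file_path)
    rel = posixpath.relpath(parent, input_dir)
    return "./" if rel == "." else "./" + rel

def absolute_path_attribute(file_path):
    """Fully qualified directory that contains file_path."""
    return posixpath.dirname(file_path)

print(path_attribute("/data/mydir", "/data/mydir/a.log"))           # ./
print(path_attribute("/data/mydir", "/data/mydir/subdir1/b.log"))   # ./subdir1
print(absolute_path_attribute("/data/mydir/subdir1/b.log"))         # /data/mydir/subdir1
```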

One thing that I find curious about your scenario, though, is the concept of a
'log copy script' and then putting back a 0-byte file so that the script does
not pick up the data again. Why not just use NiFi to pull directly from the
source and avoid using a script altogether? The ListFile processor will keep
track of what has been pulled in already, so it won't copy the data multiple
times. But I may not be clear on this point. Is the "Log repository" that you
mention just a directory that NiFi could pull from, or is it some other sort of
repository?
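
The general idea of a listing that remembers what it has already seen can be
sketched in Python as below. This is a simplified illustration that assumes
tracking by the newest modification time seen so far; it is not ListFile's
actual implementation, and the class name IncrementalLister is hypothetical:

```python
import os

class IncrementalLister:
    """Lists files under a root directory, skipping ones already listed."""

    def __init__(self, root):
        self.root = root
        # Newest modification time that has already been listed.
        self.latest_mtime = float("-inf")

    def list_new(self):
        found = []
        for dirpath, _dirs, files in os.walk(self.root):
            for name in files:
                full = os.path.join(dirpath, name)
                mtime = os.path.getmtime(full)
                # Only report files modified after the last listing's newest file.
                if mtime > self.latest_mtime:
                    found.append((mtime, full))
        if found:
            self.latest_mtime = max(m for m, _ in found)
        return sorted(path for _, path in found)
```

Calling list_new() twice in a row returns the files on the first call and an
empty list on the second, unless something newer has appeared in between.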

Thanks
-Mark



> On May 15, 2016, at 3:23 PM, Tripathi, Shiv Deepak <shiv.deepak.tripathi@philips.com> wrote:
> 
> Hi 
>  
> Currently I am using Flume for data ingestion, and my use case is as follows:
>  
> Log repository ----> log copy script ----> Staging directory for copied logs
>  
> Folder structure of the staging directory for copied logs:
>
>     Machine1log----a.log
>                ----b.log
>     Machine2log----a.log
>                ----b.log
>  
> Flume will copy these logs and replicate the same structure in the HDFS cluster, beginning with:
>
>     /user/hdfs/Machine1log----a.log
>                           ----b.log
>                Machine2log----a.log
>                           ----b.log
>  
>  
> It also creates a 0-byte dummy file with the same name, so that the script won't copy the same log again, as it finds a 0-byte file already existing in the source directory.
>  
>  
> Can we do the same thing with Apache NiFi?
>  
> Keeping in mind two goals: the same folder structure in HDFS, and after moving a file to HDFS it should create a 0-byte dummy file in the source directory.
>  
>  
> Please help
>  
> Thanks,
> Deepak
>  
>  
>  
>  
> With Best Regards,
> Deepak Tripathi
> Philips Innovation campus
> Bangalore-560045
>  
> 

