spark-user mailing list archives

From Akhil Das <>
Subject Re: spark streaming filestream API
Date Wed, 14 Oct 2015 12:01:28 GMT
The key and value classes are the ones your InputFormat produces. For example:

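// fileStream returns a pair stream; TextInputFormat is the new-API class
// org.apache.hadoop.mapreduce.lib.input.TextInputFormat
JavaPairInputDStream<LongWritable, Text> lines = jssc.fileStream("/sigmoid",
LongWritable.class, Text.class, TextInputFormat.class);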

TextInputFormat uses LongWritable as the key class and Text as the value
class. If your data is plain CSV or text, you can use
*jssc.textFileStream("/sigmoid")* instead, without worrying about the
InputFormat, key, and value classes.
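
To make the key/value pairing concrete, here is a minimal plain-Java sketch
of the records TextInputFormat hands out: the key is the byte offset at which
a line starts and the value is the line itself. It uses no Spark or Hadoop
classes. OffsetKeyedLines and toRecords are hypothetical names used only for
illustration, and the sketch assumes UTF-8 text with "\n" line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustration only: mimics the shape of TextInputFormat records in plain Java.
class OffsetKeyedLines {

    // Returns one (byte offset, line) pair per line of the given file contents.
    // Assumes UTF-8 encoding and "\n" as the line terminator.
    static List<Map.Entry<Long, String>> toRecords(String fileContents) {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        if (fileContents.isEmpty()) {
            return records;
        }
        long offset = 0;
        // split("\n") drops trailing empty strings, so a final newline adds no record.
        for (String line : fileContents.split("\n")) {
            records.add(new AbstractMap.SimpleImmutableEntry<>(Long.valueOf(offset), line));
            // Advance past the line's bytes plus the one-byte "\n" terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, String> record : toRecords("a,1\nb,22\n")) {
            System.out.println(record.getKey() + " -> " + record.getValue());
        }
    }
}
```

In the fileStream call above, the LongWritable key plays the role of the
offset and the Text value plays the role of the line. textFileStream keeps
only the value, converted to a String, which is why it needs no key or value
class.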

Best Regards

On Wed, Oct 14, 2015 at 5:12 PM, Chandra Mohan, Ananda Vel Murugan <> wrote:

> Hi All,
> I have a directory in HDFS that I want to monitor, and whenever a new file
> appears in it, I want to parse that file and load the contents into a Hive
> table. The file format is proprietary, and I have Java parsers for it. I
> am building a Spark Streaming application for this workflow. For this, I
> found the JavaStreamingContext.fileStream API. It takes four arguments:
> directory path, key class, value class, and input format. What should the
> key and value classes be? Please suggest. Thank you.
> Regards,
> Anand.C
