spark-user mailing list archives

From Deepak Sharma <deepakmc...@gmail.com>
Subject Re: Read hdfs files in spark streaming
Date Mon, 10 Jun 2019 10:44:18 GMT
This is a project requirement: the file paths are streamed in a Kafka
topic.
It seems this is not possible using Spark Structured Streaming.
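
One approach that may work here, assuming Spark 2.4 or later, is
foreachBatch: each micro-batch reaches the callback as an ordinary
(non-streaming) DataFrame, so the paths can be collected on the driver and
passed to spark.read.json. The sketch below is illustrative only. The names
extract_paths, start_query, and process_batch are hypothetical, as are the
Parquet sink and all arguments, and the Kafka source needs the
spark-sql-kafka-0-10 package on the classpath.

```python
from typing import Iterable, List


def extract_paths(values: Iterable[object]) -> List[str]:
    """Turn raw Kafka message values into a de-duplicated, ordered list of paths.

    Hypothetical helper: skips nulls and blank strings, decodes bytes as UTF-8.
    """
    seen = set()
    paths = []
    for value in values:
        if value is None:
            continue
        if isinstance(value, (bytes, bytearray)):
            value = value.decode("utf-8")
        path = str(value).strip()
        if path and path not in seen:
            seen.add(path)
            paths.append(path)
    return paths


def start_query(spark, bootstrap_servers, topic, output_path, checkpoint_path):
    """Start a streaming query that reads the JSON file named by each Kafka message.

    `spark` is an existing SparkSession; every other argument is an
    assumption made for this sketch.
    """

    def process_batch(batch_df, batch_id):
        # Inside foreachBatch, batch_df is a static DataFrame, so collect() is allowed.
        rows = batch_df.selectExpr("CAST(value AS STRING) AS path").collect()
        paths = extract_paths(row["path"] for row in rows)
        if not paths:
            return  # nothing to read in this micro-batch
        # spark.read.json accepts a list of paths; the schema is inferred per batch.
        json_df = spark.read.json(paths)
        json_df.write.mode("append").parquet(output_path)

    stream = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )
    return (
        stream.writeStream.foreachBatch(process_batch)
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

Collecting on the driver assumes each micro-batch carries a modest number of
paths. Because the schema is inferred separately for every batch, passing an
explicit schema to spark.read may be safer if the files can differ.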


On Mon, Jun 10, 2019 at 3:59 PM Shyam P <shyamabigdata@gmail.com> wrote:

> Hi Deepak,
> Why are you getting paths from a Kafka topic? Is there a specific reason to do so?
>
> Regards,
> Shyam
>
> On Mon, Jun 10, 2019 at 10:44 AM Deepak Sharma <deepakmca05@gmail.com>
> wrote:
>
>> The context is different here.
>> The file paths arrive as messages in a Kafka topic.
>> Spark Structured Streaming consumes from this topic.
>> It then has to get the value from each message, i.e. the path to the file,
>> and read the JSON stored at that location into another DataFrame.
>>
>> Thanks
>> Deepak
>>
>> On Sun, Jun 9, 2019 at 11:03 PM vaquar khan <vaquar.khan@gmail.com>
>> wrote:
>>
>>> Hi Deepak,
>>>
>>> You can use textFileStream.
>>>
>>> https://spark.apache.org/docs/2.2.0/streaming-programming-guide.html
>>>
>>> Please start using Stack Overflow to ask questions, so that other people
>>> can also benefit from the answers.
>>>
>>>
>>> Regards,
>>> Vaquar khan
>>>
>>> On Sun, Jun 9, 2019, 8:08 AM Deepak Sharma <deepakmca05@gmail.com>
>>> wrote:
>>>
>>>> I am using a Spark streaming application to read from Kafka.
>>>> The value in each Kafka message is the path to an HDFS file.
>>>> I am using Spark 2.x with spark.readStream.
>>>> What is the best way to read this path in Spark streaming and then read
>>>> the JSON stored at the HDFS path, maybe using spark.read.json, into a
>>>> DataFrame inside the Spark streaming app?
>>>> Thanks a lot in advance.
>>>>
>>>> --
>>>> Thanks
>>>> Deepak
>>>>
>>>
>>
>> --
>> Thanks
>> Deepak
>> www.bigdatabig.com
>> www.keosha.net
>>
>

-- 
Thanks
Deepak
www.bigdatabig.com
www.keosha.net
