spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: Historical Data as Stream
Date Sat, 17 May 2014 16:52:33 GMT
The real question is why you are looking to consume the file as a stream:
1. It is too big to load as an RDD.
2. You need to operate on it in a sequential manner.
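
If it is the second case, here is a rough sketch of what "making batches manually" could look like. It is plain Python, not the Spark API, and the names `read_column`, `micro_batches` and the file `values.txt` are made up for illustration. It assumes the file holds one value per line. Each batch could then be handed to whatever consumes the stream; in Spark Streaming, `StreamingContext.queueStream` is one possible way to feed in pre-built batches, but check that against your Spark version.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_column(path: str) -> Iterator[str]:
    """Lazily yield one stripped value per non-empty line of the file."""
    with open(path) as handle:
        for line in handle:
            value = line.strip()
            if value:
                yield value


def micro_batches(values: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size items, in input order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(values)
    while True:
        # islice pulls the next batch_size items without reading the whole input.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

Usage would be something like `for batch in micro_batches(read_column("values.txt"), 1000): ...`, which keeps only one batch in memory at a time and preserves the order of the values.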

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Sat, May 17, 2014 at 5:12 AM, Soumya Simanta <soumya.simanta@gmail.com> wrote:

> A file is just a stream with a fixed length. Usually streams don't end, but in
> this case it would.
>
> On the other hand, if you read your file as a stream, you may not be able to use
> the entire data in the file for your analysis. Spark (given enough memory)
> can process large amounts of data quickly.
>
> On May 15, 2014, at 9:52 AM, Laeeq Ahmed <laeeqspark@yahoo.com> wrote:
>
> Hi,
>
> I have data in a file. Can I read it as a stream in Spark? I know it seems
> odd to read a file as a stream, but it has practical applications in real life
> if I can read it as a stream. Are there any other tools which can give this
> file as a stream to Spark, or do I have to make batches manually, which is not
> what I want? It's a column of a million values.
>
> Regards,
> Laeeq
>
>
>
