spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: No auto decompress in Spark Java textFile function?
Date Wed, 09 Sep 2015 06:40:36 GMT
textFile used to work with .gz files; I haven't tested it on bz2 files. If
it isn't decompressing by default, you can use sc.wholeTextFiles, which
returns one record per file, and then decompress each record with the
corresponding codec.
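
A rough sketch of the "decompress each record" step, under a few
assumptions: it uses gzip because the JDK ships GZIPInputStream, whereas
bz2 (and tar) would need a third-party codec swapped in at the same spot.
It also assumes the file's raw bytes are already in hand as a byte[];
how you get them from Spark depends on the read call you use. The class
and method names (PerFileDecompress, gunzipLines, gzip) are made up for
illustration and are not part of Spark.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class PerFileDecompress {

    // Decompress one whole gzip file held in memory and split it into lines.
    // Assumption: the content is UTF-8 text.
    static List<String> gunzipLines(byte[] compressed) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Demo helper only: gzip a string in memory to stand in for a file.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] file = gzip("first line\nsecond line\n");
        // Prints: [first line, second line]
        System.out.println(gunzipLines(file));
    }
}
```

In a Spark job, a function like gunzipLines would be called from a map or
flatMap over the per-file records, so each file is decompressed as a whole
on one executor.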

Thanks
Best Regards

On Tue, Sep 8, 2015 at 6:49 PM, Chris Teoh <chris.teoh@gmail.com> wrote:

> Hi Folks,
>
> I tried using Spark v1.2 on bz2 files in Java but the behaviour is
> different to the same textFile API call in Python and Scala.
>
> That being said, how do I read .tar.bz2 files with Spark's Java API?
>
> Thanks in advance
> Chris
>
