hadoop-mapreduce-dev mailing list archives

From Amit Kabra <amitkabrai...@gmail.com>
Subject Re: Mapreduce to and from public clouds
Date Fri, 14 Jun 2019 15:29:48 GMT
Any help here?

On Thu, Jun 13, 2019 at 12:38 PM Amit Kabra <amitkabraiiit@gmail.com> wrote:

> Hello,
>
> I have a requirement to read and write data to a public cloud from a
> MapReduce job.
>
> Our systems currently read data from and write data to HDFS using
> MapReduce, and it is working well. We write data in SequenceFile format.
>
> We might have to move the data to a public cloud, i.e. S3 or GCP, where
> everything remains the same except that we read from and write to S3/GCP.
>
> I did a quick search for GCP and didn't find much information on running
> MapReduce directly against it. The GCS connector for Hadoop
> <https://cloudplatform.googleblog.com/2014/01/performance-advantages-of-the-new-google-cloud-storage-connector-for-hadoop.html>
> looks closest, but I didn't find any MapReduce sample for it.
>
> Any pointers on where to start? Or is it not even possible, e.g. because
> an S3/GCP OutputFormat
> <https://github.com/apache/hadoop/tree/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output>
> does not exist, so that we would need some hack?
>
> Thanks,
> Amit Kabra.
