spark-user mailing list archives

From Ian Wilkinson <ia...@me.com>
Subject Re: DynamoDB input source
Date Fri, 04 Jul 2014 15:51:21 GMT
Excellent. Let me get browsing on this.

Huge thanks,
ian


On 4 Jul 2014, at 16:47, Nick Pentreath <nick.pentreath@gmail.com> wrote:

> No boto support for that. 
> 
> In master there is Python support for loading Hadoop InputFormats. Not sure if it will be in 1.0.1 or 1.1.
> 
> In the master docs, under the programming guide, there are instructions, and under the examples project there are PySpark examples of using Cassandra and HBase. These should hopefully give you enough to get started. 
> 
> Depending on how easy it is to use the DynamoDB input format, you may have to write a custom converter (see the mentioned examples for more details).
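For what it's worth, a minimal sketch of such a converter, modelled on the pythonconverters in the Spark examples project. The org.apache.hadoop.dynamodb.DynamoDBItemWritable class name is an assumption taken from the AWS DynamoDB connector for Hadoop, and serialising the item via toString is a placeholder rather than tested code:

    import org.apache.hadoop.dynamodb.DynamoDBItemWritable  // assumed class from the AWS DynamoDB connector jar
    import org.apache.spark.api.python.Converter

    // Converts each DynamoDBItemWritable value into a plain String so PySpark
    // can deserialise it on the Python side. A real converter would unpack the
    // item's attribute map into something more structured.
    class DynamoDBItemToStringConverter extends Converter[Any, Any] {
      override def convert(obj: Any): Any = {
        val item = obj.asInstanceOf[DynamoDBItemWritable]
        item.toString
      }
    }

The converter's class name would then be passed as the valueConverter argument of PySpark's hadoopRDD/newAPIHadoopRDD call, with the jar containing it added to the driver and executor classpath (e.g. via --jars).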
> 
> Sent from my iPhone
> 
> On 4 Jul 2014, at 08:38, Ian Wilkinson <ianw1@me.com> wrote:
> 
>> Hi Nick,
>> 
>> I’m going to be working with python primarily. Are you aware of
>> comparable boto support?
>> 
>> ian
>> 
>> On 4 Jul 2014, at 16:32, Nick Pentreath <nick.pentreath@gmail.com> wrote:
>> 
>>> You should be able to use DynamoDBInputFormat (I think this should be part of the AWS libraries for Java) and create a HadoopRDD from that.
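As a rough illustration of that route, a sketch in Scala; the input format, writable, and configuration property names below are assumptions based on the AWS DynamoDB connector for Hadoop and are not verified against any particular release:

    import org.apache.hadoop.dynamodb.DynamoDBItemWritable        // assumed connector class
    import org.apache.hadoop.dynamodb.read.DynamoDBInputFormat    // assumed connector class
    import org.apache.hadoop.io.Text
    import org.apache.hadoop.mapred.JobConf
    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("DynamoDBRead"))

    // Hadoop job configuration; these property names are assumptions and may
    // differ between connector versions.
    val jobConf = new JobConf(sc.hadoopConfiguration)
    jobConf.set("dynamodb.input.tableName", "my-table")               // hypothetical table
    jobConf.set("dynamodb.endpoint", "dynamodb.us-east-1.amazonaws.com")
    jobConf.set("dynamodb.regionid", "us-east-1")

    // DynamoDBInputFormat appears to use the old (mapred) Hadoop API, hence
    // hadoopRDD rather than newAPIHadoopRDD.
    val items = sc.hadoopRDD(
      jobConf,
      classOf[DynamoDBInputFormat],
      classOf[Text],
      classOf[DynamoDBItemWritable])

    println(items.count())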
>>> 
>>> 
>>> On Fri, Jul 4, 2014 at 8:28 AM, Ian Wilkinson <ianw1@me.com> wrote:
>>> Hi,
>>> 
>>> I noticed mention of DynamoDB as input source in
>>> http://ampcamp.berkeley.edu/wp-content/uploads/2012/06/matei-zaharia-amp-camp-2012-advanced-spark.pdf.
>>> 
>>> Unfortunately, Google is not coming to my rescue on finding
>>> further mention of this support.
>>> 
>>> Any pointers would be well received.
>>> 
>>> Big thanks,
>>> ian
>>> 
>> 

