spark-user mailing list archives

From Mike Trienis <mike.trie...@orcsol.com>
Subject Pushing data from AWS Kinesis -> Spark Streaming -> AWS Redshift
Date Sun, 01 Mar 2015 19:06:55 GMT
Hi All,

I am looking at integrating a data stream from AWS Kinesis with AWS Redshift.
Since I am already ingesting the data through Spark Streaming, it seems
convenient to also push that data to AWS Redshift at the same time.

I have taken a look at the AWS Kinesis connector, although I am not sure it
was designed to integrate with Apache Spark. It seems more like a
standalone approach:

   - https://github.com/awslabs/amazon-kinesis-connectors

There is also a Spark Redshift integration library; however, it looks like
it was intended for pulling data from AWS Redshift rather than pushing data
to it:

   - https://github.com/databricks/spark-redshift

Finally, I took a look at a general Scala JDBC library that can connect to
AWS Redshift:

   - http://scalikejdbc.org/

Does anyone have any experience pushing data from Spark Streaming to AWS
Redshift? Does it make sense conceptually, or does it make more sense to
push data from AWS Kinesis to AWS Redshift via another standalone approach
such as the AWS Kinesis connectors?

Thanks, Mike.
