spark-user mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: Is Spark in Java a bad idea?
Date Tue, 28 Oct 2014 19:31:04 GMT
I believe that you are overstating your case.

If you want to work with Spark, then the Java API is entirely adequate
with a very few exceptions -- unfortunately, though, one of those
exceptions is with something that you are interested in, JdbcRDD.

If you want to work on Spark -- customizing, extending, or contributing to
it, then working in Scala is pretty much unavoidable if your work is of any
significant depth.

That being said, I expect that there are very few Spark users who are
comfortable with the Scala API who would voluntarily choose to regularly
use the Java or Python APIs, so taking the opportunity to learn Scala isn't
a bad thing.

On Tue, Oct 28, 2014 at 12:15 PM, Ron Ayoub <ronaldayoub@live.com> wrote:

> I interpret this to mean you have to learn Scala in order to work with
> Spark in Scala (goes without saying) and also to work with Spark in Java
> (since you have to jump through some hoops for basic functionality).
>
> The best path here is to take this as a learning opportunity and sit down
> and learn Scala.
>
> Regarding RDD being an internal API, it has two methods that clearly allow
> you to override them which the JdbcRDD does and it looks close to trivial -
> if only I knew Scala. Once I learn Scala, I would say the first thing I plan
> on doing is writing my own OracleRDD with my own flavor of Jdbc code. Why
> would this not be advisable?
>
>
> > Subject: Re: Is Spark in Java a bad idea?
> > From: matei.zaharia@gmail.com
> > Date: Tue, 28 Oct 2014 11:56:39 -0700
> > CC: user@spark.incubator.apache.org
> > To: isasmani.git@gmail.com
>
> >
> > A pretty large fraction of users use Java, but a few features are still
> > not available in it. JdbcRDD is one of them -- this functionality will
> > likely be superseded by Spark SQL when we add JDBC as a data source. In the
> > meantime, to use it, I'd recommend writing a class in Scala that has
> > Java-friendly methods and getting an RDD to it from that. Basically the two
> > parameters that weren't friendly there were the ClassTag and the
> > getConnection and mapRow functions.
> >
> > Subclassing RDD in Java is also not really supported, because that's an
> > internal API. We don't expect users to be defining their own RDDs.
> >
> > Matei
> >
> > > On Oct 28, 2014, at 11:47 AM, critikaled <isasmani.git@gmail.com> wrote:
> > >
> > > Hi Ron,
> > > whatever API you have in Scala, you can possibly use it from Java.
> > > Scala is interoperable with Java and vice versa. Scala, being both
> > > object-oriented and functional, will make your job easier on the JVM,
> > > and it is more concise than Java. Take it as an opportunity and start
> > > learning Scala ;).
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > > http://apache-spark-user-list.1001560.n3.nabble.com/Is-Spark-in-Java-a-bad-idea-tp17534p17538.html
> > > Sent from the Apache Spark User List mailing list archive at Nabble.com.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> > > For additional commands, e-mail: user-help@spark.apache.org
> > >
> >
> >
> >
>
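
The workaround Matei suggests in the quoted message above (a small Scala
class with Java-friendly methods that hides the ClassTag and the
getConnection and mapRow function parameters) might look roughly like the
sketch below. It is only an illustration, not code from this thread: the
object name JdbcRddHelper, its create method, and the url parameter are
hypothetical, and it assumes a Spark 1.x build plus a JDBC driver on the
classpath.

```scala
import java.sql.{Connection, DriverManager, ResultSet}

import org.apache.spark.api.java.{JavaRDD, JavaSparkContext}
import org.apache.spark.rdd.JdbcRDD

// Hypothetical helper (name and signature are assumptions for illustration).
// It takes only types that are easy to pass from Java and builds the
// Scala-only arguments (ClassTag, function values) on the Scala side.
object JdbcRddHelper {

  // `sql` must contain two `?` placeholders, which JdbcRDD fills with the
  // lower and upper bound of each partition's key range.
  def create(
      jsc: JavaSparkContext,
      url: String,
      sql: String,
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int): JavaRDD[Array[Object]] = {

    // Each task opens its own connection, so this must be a function,
    // not an already-open Connection.
    val getConnection: () => Connection = () => DriverManager.getConnection(url)

    // Reuse the row mapper that ships with JdbcRDD: one Object[] per row.
    val mapRow: ResultSet => Array[Object] =
      rs => JdbcRDD.resultSetToObjectArray(rs)

    // The ClassTag for Array[Object] is supplied implicitly by the Scala
    // compiler here, so Java callers never have to construct one.
    val rdd = new JdbcRDD[Array[Object]](
      jsc.sc, getConnection, sql, lowerBound, upperBound, numPartitions, mapRow)

    JavaRDD.fromRDD(rdd)
  }
}
```

From Java, such a helper could then be called like an ordinary static
method, for example JdbcRddHelper.create(jsc, url, sql, 1L, 1000L, 4), and
the result used as a JavaRDD<Object[]>. Note that this wraps the existing
JdbcRDD rather than subclassing RDD, which the same message describes as an
internal API.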
