spark-dev mailing list archives

From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: Make off-heap store pluggable
Date Tue, 21 Jul 2015 17:56:47 GMT
2015-07-20 23:29 GMT-07:00 Matei Zaharia <matei.zaharia@gmail.com>:

> I agree with this -- basically, to build on Reynold's point, you should be
> able to get almost the same performance by implementing either the Hadoop
> FileSystem API or the Spark Data Source API over Ignite in the right way.
> This would let people save data persistently in Ignite in addition to using
> it for caching, and it would provide a global namespace, optionally a
> schema, etc. You can still provide data locality, short-circuit reads, etc
> with these APIs.
>

Absolutely agree.

In fact, Ignite already provides a shared RDD implementation, which is
essentially a live view of Ignite cache data shared across Spark jobs. This
implementation adheres to the Spark DataFrame API. More information can be
found here:
http://ignite.incubator.apache.org/features/igniterdd.html
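As a rough sketch, using the shared RDD from a Spark job looks something like
the following (this assumes the ignite-spark module is on the classpath and an
Ignite node is reachable; the cache name "sharedCache" is made up for
illustration, and the exact API may differ between Ignite releases):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.ignite.spark.IgniteContext
import org.apache.ignite.configuration.IgniteConfiguration

object IgniteRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ignite-rdd-sketch"))

    // IgniteContext wraps the SparkContext and starts/attaches Ignite
    // workers on the executors.
    val ic = new IgniteContext(sc, () => new IgniteConfiguration())

    // fromCache exposes an Ignite cache as a Spark pair RDD.
    // "sharedCache" is a hypothetical cache name, not defined by Ignite.
    val sharedRdd = ic.fromCache[Int, String]("sharedCache")

    // Writes go straight into the cache and are visible to other Spark jobs.
    sharedRdd.savePairs(sc.parallelize(1 to 5).map(i => (i, s"value-$i")))

    // Reads see the current cache contents, including data written elsewhere.
    sharedRdd.filter(_._2.endsWith("3")).collect().foreach(println)

    ic.close()
  }
}
```

The key point is that the RDD is a view over a cache that outlives any single
Spark job, which is what makes it "shared."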

Also, Ignite's in-memory filesystem (IGFS) is compliant with the Hadoop
FileSystem API and can transparently replace HDFS if needed. Plugging it into
Spark should be fairly easy. More information can be found here:
http://ignite.incubator.apache.org/features/igfs.html
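Since it is a Hadoop FileSystem implementation, plugging IGFS in happens at the
Hadoop configuration level. A sketch of the relevant core-site.xml entries
(class and property names should be checked against the Ignite release in use):

```xml
<!-- core-site.xml: register IGFS under the igfs:// scheme so that any    -->
<!-- Hadoop-API consumer, including Spark, can resolve igfs:// paths.     -->
<configuration>
  <property>
    <name>fs.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
  </property>
</configuration>
```

With that in place, Spark reads IGFS paths through its ordinary Hadoop input
machinery, e.g. sc.textFile("igfs://igfs@localhost/tmp/data.txt") (the URI
authority here is illustrative).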

--Alexey
