spark-user mailing list archives

From: Vibhor Banga
Subject: Serialization problem in Spark
Date: Thu, 05 Jun 2014 10:11:09 GMT

I am trying to do something like the following in Spark:

JavaPairRDD<byte[], MyObject> eventRDD =
PairFunction<Tuple2<ImmutableBytesWritable, Result>, byte[], MyObject>() {
            public Tuple2<byte[], MyObject> call(
                    Tuple2<ImmutableBytesWritable, Result> immutableBytesWritableResultTuple2)
                    throws Exception {
                return new Tuple2<byte[], MyObject>(immutableBytesWritableResultTuple2._1.get(),

        eventRDD.foreach(new VoidFunction<Tuple2<byte[], Event>>() {
            public void call(Tuple2<byte[], Event> eventTuple2) throws Exception {


The processForEvent() flow does some processing and ultimately writes to
an HBase table, but I am getting serialisation issues with Hadoop and
HBase built-in classes. How do I solve this? Does using Kryo
serialisation help in this case?
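
To illustrate the kind of failure in question, here is a minimal sketch that uses plain java.io serialization rather than Spark or HBase. FakeTable, EagerWriter, LazyWriter and canSerialize are hypothetical names made up for this example, not classes from the job above; FakeTable only stands in for a Hadoop/HBase class that does not implement Serializable. The sketch shows that an object holding such a field cannot be written with ObjectOutputStream, while one that marks the field transient and creates it lazily can.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ClosureSerializationSketch {

    // Hypothetical stand-in for a Hadoop/HBase class that is not Serializable.
    static class FakeTable {
        void put(byte[] key) { /* would write to the table here */ }
    }

    // Holds the non-serializable object as a regular field,
    // so Java serialization of this writer fails.
    static class EagerWriter implements Serializable {
        private final FakeTable table = new FakeTable();
        void write(byte[] key) { table.put(key); }
    }

    // Marks the field transient and creates it on first use,
    // so only the (empty) wrapper object is serialized.
    static class LazyWriter implements Serializable {
        private transient FakeTable table;
        void write(byte[] key) {
            if (table == null) {
                table = new FakeTable();
            }
            table.put(key);
        }
    }

    // Returns true if the object survives Java serialization,
    // false if a NotSerializableException is raised.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("eager: " + canSerialize(new EagerWriter()));
        System.out.println("lazy:  " + canSerialize(new LazyWriter()));
    }
}
```

Whether the same pattern applies to the anonymous PairFunction and VoidFunction above depends on what those functions and processForEvent() actually reference, which the snippet does not show.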

