spark-user mailing list archives

From "mohan.gadm" <mohan.g...@gmail.com>
Subject Re: Kryo fails with avro having Arrays and unions, but succeeds with simple avro.
Date Thu, 18 Sep 2014 15:20:11 GMT
Hi Frank, thanks for the info, that's great. But I'm not saying the Avro
serializer is failing; Kryo is failing.
I'm using the Kryo serializer and registering the Avro-generated classes with Kryo.
sparkConf.set("spark.serializer",
	"org.apache.spark.serializer.KryoSerializer");
sparkConf.set("spark.kryo.registrator",
	"com.globallogic.goliath.platform.PlatformKryoRegistrator");

But how was it able to perform the output operation when the message is simple,
but not when the message is complex? (Please note there are no Avro schema
changes; only the data has changed.)
More info is provided below.

avro schema:
=============
record KeyValueObject {
	union {boolean, int, long, float, double, bytes, string} name;
	union {boolean, int, long, float, double, bytes, string,
		array<union {boolean, int, long, float, double, bytes, string, KeyValueObject}>,
		KeyValueObject} value;
}
record Datum {
	union {boolean, int, long, float, double, bytes, string,
		array<union {boolean, int, long, float, double, bytes, string, KeyValueObject}>,
		KeyValueObject} value;
}
record ResourceMessage {
	string version;
	string sequence;
	string resourceGUID;
	string GWID;
	string GWTimestamp;
	union {Datum, array<Datum>} data;
}

simple message is as below:
===================
{"version": "01", "sequence": "00001", "resourceGUID": "001", "GWID": "002",
"GWTimestamp": "1409823150737", "data": {"value": "30"}}

complex message is as below:
===================
{"version": "01", "sequence": "00001", "resource": "sensor-001",
"controller": "002", "controllerTimestamp": "1411038710358", "data":
{"value": [{"name": "Temperature", "value": "30"}, {"name": "Speed",
"value": "60"}, {"name": "Location", "value": ["+401213.1", "-0750015.1"]},
{"name": "Timestamp", "value": "2014-09-09T08:15:25-05:00"}]}}


Both messages fit the schema.
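
To make the difference between the two payloads concrete, here is a minimal sketch in plain Java (no Avro, Kryo, or Spark dependency) that walks each `data.value` and lists which union branch of the schema every value would fall into. The class `UnionBranches` and its helper methods are hypothetical names used only for illustration; maps stand in for KeyValueObject records and lists stand in for Avro arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Avro, Kryo, or Spark.
public class UnionBranches {

    // Append "path: branch" for this value, then recurse into arrays and records.
    static void walk(Object value, String path, List<String> out) {
        if (value instanceof String) {
            out.add(path + ": string");
        } else if (value instanceof List) {
            out.add(path + ": array");
            List<?> items = (List<?>) value;
            for (int i = 0; i < items.size(); i++) {
                walk(items.get(i), path + "[" + i + "]", out);
            }
        } else if (value instanceof Map) {
            // A map models a KeyValueObject record with fields "name" and "value".
            out.add(path + ": KeyValueObject");
            Map<?, ?> rec = (Map<?, ?>) value;
            walk(rec.get("name"), path + ".name", out);
            walk(rec.get("value"), path + ".value", out);
        } else {
            out.add(path + ": other");
        }
    }

    // List the union branches reached from a Datum's value field.
    public static List<String> branches(Object datumValue) {
        List<String> out = new ArrayList<>();
        walk(datumValue, "data.value", out);
        return out;
    }

    // Build a stand-in for one KeyValueObject.
    static Map<String, Object> kv(String name, Object value) {
        Map<String, Object> rec = new LinkedHashMap<>();
        rec.put("name", name);
        rec.put("value", value);
        return rec;
    }

    // data.value of the simple message above.
    public static Object simpleValue() {
        return "30";
    }

    // data.value of the complex message above.
    public static Object complexValue() {
        return Arrays.asList(
            kv("Temperature", "30"),
            kv("Speed", "60"),
            kv("Location", Arrays.asList("+401213.1", "-0750015.1")),
            kv("Timestamp", "2014-09-09T08:15:25-05:00"));
    }

    public static void main(String[] args) {
        System.out.println("simple:");
        branches(simpleValue()).forEach(System.out::println);
        System.out.println("complex:");
        branches(complexValue()).forEach(System.out::println);
    }
}
```

The simple message reaches only the string branch, while the complex one goes through the array branch, KeyValueObject records, and a nested array (the Location value). Those extra branches may be where the two cases diverge during serialization.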

The message actually comes from Kafka as Avro binary.
In Spark I convert the message to Avro objects (ResourceMessage) using
decoders (this also works).
I am able to perform some mappings and to convert the stream<ResourceMessage>
to stream<flume Events>.

Now the events need to be pushed to a Flume source. For this I need to collect
the RDD and then send it to the Flume client.

End to end, this worked fine with the simple message. The problem is with the
complex message.




-----
Thanks & Regards,
Mohan
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kryo-fails-with-avro-having-Arrays-and-unions-but-succeeds-with-simple-avro-tp14549p14565.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
