We have encountered a strange problem in our Scrunch code when attempting
to serialize Java enum types (as generated by Avro).
Basically, if you create an Avro schema with an enum-typed field, a Java
enum class is generated for that field. When you then build a Scrunch
pipeline that uses the enum inside a compound type as an intermediate
value, the pipeline fails when spilling to disk because the
ReflectDatumWriter cannot instantiate the enum type.
Inspecting the implicit PTypeH parameter passed to the offending function
(flatMap to a 4-tuple in this case), we see that it resolves to
quads(records[MyEnumType], strings, strings, strings). The records call is
implemented by the PTypeFamily (AvroTypeFamily in this case), which
delegates to containers and then to reflects, which in turn delegates to
Avro's standard reflection support. I would expect this to have no problem
with an enum type, but for some reason it tries to instantiate the enum
instead of treating it as an enum.
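
To illustrate what I mean by "instantiate", here is a minimal sketch in
plain Java, with no Avro or Crunch involved. It assumes (I have not
confirmed this in the Avro source) that the failing code path tries to
construct the type through a no-argument constructor, the way it would for
an ordinary record class. The class and method names are made up for the
example, and java.time.DayOfWeek stands in for the Avro-generated enum.

```java
import java.time.DayOfWeek;

public class EnumReflectionSketch {

    // Hypothetical helper: attempts what a generic record instantiator
    // might do, i.e. call a no-argument constructor reflectively.
    static boolean canInstantiateViaNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor().newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // Enum classes have no no-argument constructor, so the
            // lookup fails with NoSuchMethodException.
            return false;
        }
    }

    // Hypothetical helper: the enum-aware alternative, which looks up an
    // existing constant by its symbol name instead of creating an object.
    static <E extends Enum<E>> E lookupBySymbol(Class<E> type, String symbol) {
        return Enum.valueOf(type, symbol);
    }

    public static void main(String[] args) {
        // An ordinary class with a public no-arg constructor works.
        System.out.println(canInstantiateViaNoArgConstructor(StringBuilder.class)); // true
        // An enum cannot be constructed this way.
        System.out.println(canInstantiateViaNoArgConstructor(DayOfWeek.class));     // false
        // Looking the constant up by name is what I would expect instead.
        System.out.println(lookupBySymbol(DayOfWeek.class, "MONDAY"));              // MONDAY
    }
}
```

If the reflection path treats MyEnumType as a record rather than as an
enum, a failure of the first kind would be consistent with what we see.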
Is there some special case for Java enums missing in PTypeH, or have I
maybe done something else wrong somewhere?