spark-dev mailing list archives

From Marcelo Vanzin <>
Subject Re: Deserializing JSON into Scala objects in Java code
Date Tue, 08 Sep 2015 20:43:10 GMT
Hi Kevin,

This code works fine for me (output is "List(1, 2)"):

import org.apache.spark.status.api.v1.RDDPartitionInfo;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.scala.DefaultScalaModule;

class JacksonTest {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    mapper.registerModule(new DefaultScalaModule());

    String json = "{ \"blockName\" : \"name\", \"executors\" : [ \"1\", \"2\" ] }";
    RDDPartitionInfo info = mapper.readValue(json, RDDPartitionInfo.class);
    System.out.println(info.executors());
  }
}
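(One thing worth checking: registering DefaultScalaModule only works if jackson-module-scala is actually on the classpath. Assuming a Maven build, the dependency would look roughly like the sketch below — the _2.10 suffix and the version are assumptions and must match the Scala binary version and Jackson version of your Spark build:)

```xml
<!-- jackson-module-scala; suffix and version are assumptions -->
<dependency>
  <groupId>com.fasterxml.jackson.module</groupId>
  <artifactId>jackson-module-scala_2.10</artifactId>
  <version>2.4.4</version>
</dependency>
```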

On Tue, Sep 8, 2015 at 1:27 PM, Kevin Chen <> wrote:
> Hi Marcelo,
>  Thanks for the quick response. I understand that I can just write my own
> Java classes (I will use that as a fallback option), but in order to avoid
> code duplication and further possible changes, I was hoping there would be
> a way to use the Spark API classes directly, since it seems there should
> be.
>  I registered the Scala module in the same way (except in Java instead of
> Scala),
> mapper.registerModule(new DefaultScalaModule());
> But the module does not seem to be registered or used properly. Do you
> happen to know whether the above line should work in Java?
> On 9/8/15, 12:55 PM, "Marcelo Vanzin" <> wrote:
>>Hi Kevin,
>>How did you try to use the Scala module? Spark has this code when
>>setting up the ObjectMapper used to generate the output:
>>
>>  mapper.registerModule(com.fasterxml.jackson.module.scala.DefaultScalaModule)
>>As for supporting direct serialization to Java objects, I don't think
>>that was the goal of the API. The Scala API classes are public mostly
>>so that API compatibility checks are performed against them. If you
>>don't mind the duplication, you could write your own Java POJOs that
>>mirror the Scala API, and use them to deserialize the JSON.
>>On Tue, Sep 8, 2015 at 12:46 PM, Kevin Chen <> wrote:
>>> Hello Spark Devs,
>>>  I am trying to use the new Spark API json endpoints at /api/v1/[path]
>>> (added in SPARK-3454).
>>>  In order to minimize maintenance on our end, I would like to use
>>> Retrofit/Jackson to parse the json directly into the Scala classes in
>>> org/apache/spark/status/api/v1/api.scala (ApplicationInfo,
>>> ApplicationAttemptInfo, etc…). However, Jackson does not seem to know
>>> how to handle Scala Seqs, and will throw an error when trying to parse
>>> the attempts: Seq[ApplicationAttemptInfo] field of ApplicationInfo. Our
>>> codebase is in Java.
>>>  My questions are:
>>> Do you have any recommendations on how to easily deserialize Scala
>>> classes from json? For example, do you have any current usage examples
>>> of deserialization with Java?
>>> Alternatively, are you committed to the json formats of /api/v1/[path]? I
>>> would guess so, because of the ‘v1’, but wanted to confirm. If so, I can
>>> deserialize the json into instances of my own Java classes instead, without
>>> worrying about changing the class structure later due to changes in the
>>> Spark API.
>>> Some further information:
>>> The error I am getting with Jackson when trying to deserialize the json
>>> into ApplicationInfo is Caused by:
>>> com.fasterxml.jackson.databind.JsonMappingException: Can not construct
>>> instance of scala.collection.Seq, problem: abstract types either need to
>>> be mapped to concrete types, have custom deserializer, or be instantiated
>>> with additional type information
>>> I tried using Jackson’s DefaultScalaModule, which seems to have support
>>> for Scala Seqs, but had no luck.
>>> Deserialization works if the Scala class does not have any Seq fields, and
>>> works if the fields are Java Lists instead of Seqs.
>>> Thanks very much for your help!
>>> Kevin Chen
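
(For the archive: Marcelo's fallback suggestion of mirroring the Scala API with plain Java POJOs might look like the sketch below. The class name and the subset of fields are illustrative assumptions, not part of the Spark API; the field names are assumed to match the v1 JSON keys.)

```java
import java.util.List;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical Java mirror of the Scala RDDPartitionInfo class; only two
// of its fields are shown here for illustration.
public class RDDPartitionInfoPojo {
  public String blockName;
  public List<String> executors;  // plain java.util.List, so no Scala module is needed

  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    // Tolerate fields added in later Spark versions instead of failing.
    mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

    String json = "{ \"blockName\" : \"name\", \"executors\" : [ \"1\", \"2\" ] }";
    RDDPartitionInfoPojo info = mapper.readValue(json, RDDPartitionInfoPojo.class);
    System.out.println(info.blockName + " " + info.executors);
  }
}
```

Disabling FAIL_ON_UNKNOWN_PROPERTIES is the detail that decouples the POJO from future additive changes to the JSON, at the cost of silently ignoring fields you have not mirrored.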

