spark-user mailing list archives

From "Wang, Daoyuan" <daoyuan.w...@intel.com>
Subject RE: MatchError in JsonRDD.toLong
Date Fri, 16 Jan 2015 08:55:20 GMT
Hi Tobias,

Can you show how you create the JsonRDD?

Thanks,
Daoyuan


From: Tobias Pfeiffer [mailto:tgp@preferred.jp]
Sent: Friday, January 16, 2015 4:01 PM
To: user
Subject: Re: MatchError in JsonRDD.toLong

Hi again,

On Fri, Jan 16, 2015 at 4:25 PM, Tobias Pfeiffer <tgp@preferred.jp> wrote:
> Now I'm wondering where this comes from (I haven't touched this component in a while, nor
> upgraded Spark etc.) [...]

So the reason that the error is showing up now is that suddenly data from a different dataset
is showing up in my test dataset... don't ask me... anyway, this different dataset contains
data like

  {"Click":"nonclicked", "Impression":1,
   "DisplayURL":4401798909506983219, "AdId":21215341, ...}
  {"Click":"nonclicked", "Impression":1,
   "DisplayURL":14452800566866169008, "AdId":10587781, ...}

and the second DisplayURL value (14452800566866169008) is too large for a Long (maximum
9223372036854775807), while the column is still inferred as a Long column.
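
To make the overflow concrete, here is a minimal Python sketch that checks the two sample DisplayURL values against the signed 64-bit range that a JVM Long can hold. The helper name `fits_in_long` is made up for illustration and is not part of Spark.

```python
# Bounds of a signed 64-bit integer, i.e. what a JVM Long can represent.
LONG_MIN = -2**63
LONG_MAX = 2**63 - 1  # 9223372036854775807


def fits_in_long(value: int) -> bool:
    """Return True if value is representable as a signed 64-bit integer."""
    return LONG_MIN <= value <= LONG_MAX


# The two DisplayURL values from the sample records above.
display_urls = [4401798909506983219, 14452800566866169008]

results = {value: fits_in_long(value) for value in display_urls}
for value, ok in results.items():
    print(value, "fits in Long" if ok else "overflows Long")
```

Only the second value falls outside the range; it would still fit in an unsigned 64-bit integer, which may be how the source system produced it.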

So, what to do about this? Is jsonRDD inherently incapable of handling those long numbers
or is it just an issue in the schema inference and I should file a JIRA issue?
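
Whatever the answer turns out to be, one possible workaround is to pre-process each JSON line before it reaches schema inference, turning any integer outside the Long range into a string. The following is only a sketch under assumptions: `parse_int_safely` and `sanitize_line` are hypothetical helper names, the sample record omits the fields elided above, and how Spark then infers a column that mixes numbers and strings is not verified here.

```python
import json

# Bounds of a signed 64-bit integer (JVM Long).
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def parse_int_safely(text):
    """Keep integers that fit in a Long; return oversized ones as strings."""
    number = int(text)
    return number if LONG_MIN <= number <= LONG_MAX else text


def sanitize_line(line):
    """Re-serialize one JSON record with oversized integers quoted."""
    return json.dumps(json.loads(line, parse_int=parse_int_safely))


# Sample record from above, without the elided trailing fields.
sample = ('{"Click":"nonclicked", "Impression":1, '
          '"DisplayURL":14452800566866169008, "AdId":10587781}')

cleaned = json.loads(sanitize_line(sample))
print(cleaned)
```

In a Spark job, a function like `sanitize_line` could be mapped over the RDD of raw lines before the result is passed to jsonRDD.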

Thanks
Tobias