spark-user mailing list archives

From "Wang, Daoyuan" <daoyuan.w...@intel.com>
Subject RE: MatchError in JsonRDD.toLong
Date Tue, 20 Jan 2015 01:33:19 GMT
Yes, that is exactly what I mean. In case you missed my last response: you can use
the API
jsonRDD(json:RDD[String], schema:StructType)
to specify your schema explicitly. For numbers bigger than Long, we can use DecimalType.
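
For illustration, a minimal sketch of what that could look like. It assumes a Spark 1.2-era setup, where the SQL types are exported from org.apache.spark.sql and DecimalType.Unlimited is available (later releases moved the types to org.apache.spark.sql.types). The field names "id" and "value", the sample records, and the object name are made up for the example; it needs Spark on the classpath and has not been run against your data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql._

object ExplicitJsonSchema {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("explicit-json-schema").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Hypothetical input: "value" usually fits in a Long, but not always.
    val json = sc.parallelize(Seq(
      """{"id": "a", "value": 42}""",
      """{"id": "b", "value": 92233720368547758070}"""
    ))

    // Declare the schema up front instead of inferring it from a sample,
    // so "value" is a decimal no matter which records a sample would contain.
    val schema = StructType(Seq(
      StructField("id", StringType, nullable = true),
      StructField("value", DecimalType.Unlimited, nullable = true)
    ))

    // The jsonRDD(json, schema) overload skips schema inference entirely.
    val table = sqlContext.jsonRDD(json, schema)
    table.printSchema()
    table.collect().foreach(println)

    sc.stop()
  }
}
```

Since no sampling happens with this overload, the result no longer depends on which records happen to be looked at first.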

Thanks,
Daoyuan


From: Tobias Pfeiffer [mailto:tgp@preferred.jp]
Sent: Tuesday, January 20, 2015 9:26 AM
To: Wang, Daoyuan
Cc: user
Subject: Re: MatchError in JsonRDD.toLong

Hi,

On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan <daoyuan.wang@intel.com> wrote:
The second parameter of jsonRDD is the sampling ratio when we infer schema.

OK, I was aware of this, but I guess I understand the problem now. My sampling ratio is so
low that the sample contains only values that fit in a Long, so the field is inferred as Long.
When I then meet a value that is actually too large for a Long, I get the error I posted;
basically it's the same situation as when specifying a wrong schema manually.
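
To make that concrete, here is a toy model of the situation in plain Scala. It is not Spark's actual inference code, only a simplified sketch: all names (SampledInference, infer, convert, AsLong, AsDecimal) are hypothetical, and "sampling" is reduced to looking at the first few values.

```scala
object SampledInference {
  sealed trait Inferred
  case object AsLong extends Inferred
  case object AsDecimal extends Inferred

  // Pick the narrowest type that fits every value in the sample.
  def infer(sample: Seq[BigInt]): Inferred =
    if (sample.forall(_.isValidLong)) AsLong else AsDecimal

  // Convert a value under the inferred type; Left if it does not fit.
  def convert(v: BigInt, t: Inferred): Either[String, Any] = t match {
    case AsLong if v.isValidLong => Right(v.toLong)
    case AsLong                  => Left(s"$v does not fit in a Long")
    case AsDecimal               => Right(BigDecimal(v))
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(BigInt(1), BigInt(2), BigInt("92233720368547758070"))

    // A small sample misses the large value, so the type is inferred as Long ...
    val fromSample = infer(data.take(2))
    // ... and converting the full data then fails on the last element.
    data.foreach(v => println(convert(v, fromSample)))

    // Looking at all the data yields the wider type, and every value converts.
    val fromAll = infer(data)
    data.foreach(v => println(convert(v, fromAll)))
  }
}
```

The failure depends only on which values the sample happened to contain, which is why it behaves like a manually specified schema that is too narrow.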

So is there any way around this other than increasing the sampling ratio so that the
BigDecimal-sized numbers are also discovered?

Thanks
Tobias
