spark-user mailing list archives

From: Wang, Daoyuan
Subject: RE: MatchError in JsonRDD.toLong
Date: Tue, 20 Jan 2015 01:33:19 GMT

Yes, that is exactly what I mean. In case you missed my last response: you can use the API
jsonRDD(json: RDD[String], schema: StructType)
to specify your schema explicitly instead of relying on inference. For numbers that do not
fit in a Long, use DecimalType.
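
To make the suggestion concrete, here is a minimal sketch of passing an explicit schema to jsonRDD. It assumes a Spark 1.2-era setup, where the SQL data types are reachable through `org.apache.spark.sql._` and an unbounded decimal is written `DecimalType.Unlimited`; later releases moved the types to `org.apache.spark.sql.types` and changed the decimal and JSON-reading APIs. The field names `name` and `value` and the sample records are hypothetical, chosen only for illustration.

```scala
// Sketch only: assumes Spark 1.2.x on the classpath.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql._

object ExplicitJsonSchema {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("explicit-json-schema").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Hypothetical input: the second record holds a number larger than Long.MaxValue.
    val json = sc.parallelize(Seq(
      """{"name": "a", "value": 42}""",
      """{"name": "b", "value": 92233720368547758070}"""
    ))

    // Declare the numeric field as a decimal up front, so no sampling is involved.
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = true),
      StructField("value", DecimalType.Unlimited, nullable = true)
    ))

    val rows = sqlContext.jsonRDD(json, schema)
    rows.printSchema()
    rows.collect().foreach(println)

    sc.stop()
  }
}
```

Because the schema is supplied rather than inferred, the result no longer depends on which records happen to fall into the sample.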


From: Tobias Pfeiffer
Sent: Tuesday, January 20, 2015 9:26 AM
To: Wang, Daoyuan
Cc: user
Subject: Re: MatchError in JsonRDD.toLong


On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan wrote:
The second parameter of jsonRDD is the sampling ratio when we infer schema.

OK, I was aware of this, but I guess I understand the problem now. My sampling ratio is so
low that the sample contains only values that fit in a Long, so the field is inferred as a
Long. When I later hit a value that is too large for a Long, I get the error I posted;
basically it's the same situation as when specifying a wrong schema manually.

So is there any way around this other than increasing the sampling ratio so that the sample
also includes the very large, BigDecimal-sized numbers?
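
The failure mode described above can be illustrated without Spark. The following is a simplified sketch, not Spark's actual inference code: the object `SampledInference`, its `infer` helper, and the sample values are all hypothetical. It shows how a type chosen from a partial sample can be too narrow for the full data.

```scala
object SampledInference {
  sealed trait NumType
  case object AsLong extends NumType
  case object AsDecimal extends NumType

  private val MinLong = BigInt(Long.MinValue)
  private val MaxLong = BigInt(Long.MaxValue)

  // Pick the narrowest type that can hold every integer literal in the sample.
  def infer(sample: Seq[String]): NumType = {
    val allFitInLong = sample.forall { s =>
      val n = BigInt(s)
      n >= MinLong && n <= MaxLong
    }
    if (allFitInLong) AsLong else AsDecimal
  }

  def main(args: Array[String]): Unit = {
    val values = Seq("1", "2", "3", "92233720368547758070")
    println(infer(values.take(3))) // the sample misses the large value: AsLong
    println(infer(values))         // the full data needs the wider type: AsDecimal
  }
}
```

Any sampling ratio below 1.0 leaves some chance of missing the rare oversized value, which is why declaring the type explicitly is the more robust route.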

