spark-user mailing list archives

From fuellee lee <lifuyu198...@gmail.com>
Subject how to ignore MatchError when processing a large json file in spark-sql
Date Sun, 02 Aug 2015 18:27:53 GMT
I'm trying to process a bunch of large JSON log files with Spark, but it
fails every time with `scala.MatchError`, whether I give it a schema or not.

I just want to skip lines that do not match the schema, but I can't find
out how to do that in the Spark docs.

I know that writing a JSON parser and mapping it over an RDD of the JSON
files would get things done, but I want to use
`sqlContext.read.schema(schema).json(fileNames).selectExpr(...)` because
it's much easier to maintain.
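
For reference, the parser-based workaround mentioned above would look roughly like the sketch below. It is plain Python with only the standard library, so it runs without a cluster. The field names in `EXPECTED_FIELDS` and the helper `line_matches_schema` are made up for illustration and are not my real schema. The type check is also much looser than what Spark SQL does internally.

```python
import json

# Hypothetical schema for illustration only: field name -> expected Python type.
EXPECTED_FIELDS = {
    "timestamp": str,
    "level": str,
    "code": int,
}


def line_matches_schema(line, expected=EXPECTED_FIELDS):
    """Return True if `line` is a JSON object whose expected fields,
    when present and non-null, have the expected types."""
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed JSON: skip the line instead of failing the whole job.
        return False
    if not isinstance(record, dict):
        return False
    for name, expected_type in expected.items():
        value = record.get(name)
        if value is None:
            # Missing or null fields are treated as nullable columns.
            continue
        # bool is a subclass of int in Python, so reject it explicitly
        # for fields that are not declared as bool.
        if isinstance(value, bool) and expected_type is not bool:
            return False
        if not isinstance(value, expected_type):
            return False
    return True


# Small in-memory stand-in for the lines of a log file.
lines = [
    '{"timestamp": "2015-08-02T18:27:53Z", "level": "INFO", "code": 200}',
    '{"timestamp": "2015-08-02T18:27:54Z", "level": "WARN", "code": "oops"}',
    'not json at all',
    '{"level": "ERROR"}',
]

good_lines = [line for line in lines if line_matches_schema(line)]

# With Spark, the same predicate would be used as a filter on the raw text,
# e.g. sc.textFile(fileNames).filter(line_matches_schema), and the surviving
# lines would then be handed to whichever JSON reader accepts an RDD of
# strings in the Spark version at hand (an assumption, not checked here).
```

This works, but it means keeping a second, hand-written description of the schema in sync with the one passed to `schema(...)`, which is exactly the maintenance cost I would like to avoid.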

thanks
