spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: how to ignore MatchError then processing a large json file in spark-sql
Date Mon, 03 Aug 2015 20:43:32 GMT
This sounds like a bug. Which version of Spark are you using, and can you
provide the stack trace?

On Sun, Aug 2, 2015 at 11:27 AM, fuellee lee <lifuyu198919@gmail.com> wrote:

> I'm trying to process a bunch of large JSON log files with Spark, but it
> fails every time with `scala.MatchError`, whether I give it a schema or not.
>
> I just want to skip lines that do not match the schema, but I can't find out
> how in the Spark docs.
>
> I know that writing a JSON parser and mapping it over an RDD of the JSON
> files would get things done, but I want to use
> `sqlContext.read.schema(schema).json(fileNames).selectExpr(...)` because
> it's much easier to maintain.
>
> thanks
>
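
The manual workaround mentioned in the quoted message (parse each line yourself and drop the ones that do not fit) might look roughly like the sketch below. It is plain Python using only the standard `json` module, not a Spark API. The field names in `EXPECTED_FIELDS` and the helper `matches_schema` are hypothetical, chosen only for illustration, and the rule that every listed field must be present is an assumption rather than anything the thread specifies.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_FIELDS = {"ts": int, "level": str, "msg": str}


def matches_schema(line, expected=EXPECTED_FIELDS):
    """Return True if `line` is a JSON object whose listed fields have the expected types."""
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed JSON (json.JSONDecodeError is a subclass of ValueError).
        return False
    if not isinstance(record, dict):
        # Valid JSON, but not an object (for example a bare array or number).
        return False
    for name, typ in expected.items():
        # Assumption: a missing field counts as a mismatch.
        if name not in record:
            return False
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly
        # unless the schema really asks for a bool.
        if isinstance(value, bool) and typ is not bool:
            return False
        if not isinstance(value, typ):
            return False
    return True


lines = [
    '{"ts": 1, "level": "INFO", "msg": "ok"}',
    '{"ts": "oops", "level": "INFO", "msg": "wrong type for ts"}',
    'not json at all',
    '[1, 2, 3]',
]

# Keep only the lines that parse and fit the hypothetical schema.
good = [line for line in lines if matches_schema(line)]
```

In a Spark job the same predicate could presumably be applied with `sc.textFile(path).filter(matches_schema)` before handing the surviving lines to the JSON reader. That costs an extra parse per line, and it does not address whatever is raising the `scala.MatchError` in the first place.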
