spark-user mailing list archives

From invkrh <>
Subject Re: SparkSQL LEFT JOIN problem
Date Mon, 13 Oct 2014 12:49:10 GMT

Thank you, Liquan. I left out some information in my previous post.

I just solved the problem.

Actually, I use the first line (the schema header) of the CSV file to generate
the StructType and StructFields. However, the input file is UTF-8 Unicode (*with*
BOM), so the first char of the file is #65279 (U+FEFF).

As a result, the first field name has a leading #65279 char. When querying, I
just used account_id, so SparkSQL could not find that field in the AST, where
it is actually named #65279account_id.
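To illustrate the mismatch outside of Spark, here is a minimal sketch in plain Python (not the code from my job). The CSV content is made up for the example; only the account_id column name comes from the real file.

```python
import codecs

# Hypothetical CSV bytes: a UTF-8 BOM followed by a header line and one row.
raw = codecs.BOM_UTF8 + b"account_id,name\n1,alice\n"

# Plain "utf-8" decoding keeps the BOM as a U+FEFF character in the text.
text = raw.decode("utf-8")
header_fields = text.splitlines()[0].split(",")

# The first field name now carries the invisible leading character,
# so it no longer equals the name used in the query.
first_field = header_fields[0]
matches_expected = first_field == "account_id"
```

Here first_field has 11 characters instead of 10, and the comparison with "account_id" is false even though both names look the same when printed.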

So the solution is to convert the input file to UTF-8 Unicode (*without* BOM),
which removes the leading #65279. Everything is fine now.
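If converting the file is not convenient, the BOM can also be dropped while reading the header. This is only a sketch in plain Python with made-up sample data, not what I ran in Spark; it relies on the standard "utf-8-sig" codec, which skips a leading BOM if one is present.

```python
import codecs

def read_header_fields(raw_bytes):
    """Decode CSV bytes and return the header field names without any BOM."""
    # "utf-8-sig" strips a leading BOM and behaves like "utf-8" otherwise.
    text = raw_bytes.decode("utf-8-sig")
    return text.splitlines()[0].split(",")

# Hypothetical inputs: the same header with and without a BOM.
with_bom = codecs.BOM_UTF8 + b"account_id,name\n1,alice\n"
without_bom = b"account_id,name\n1,alice\n"

fields_with_bom = read_header_fields(with_bom)
fields_without_bom = read_header_fields(without_bom)
```

Both calls give the same field names, so the schema built from the header matches the names used in queries either way.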

As #65279 is not printable, the bug was not easy to find, especially since the
error msg made me think it was SparkSQL's problem.
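One way to make such characters visible is to scan field names for Unicode control and format characters. The helper below is a hypothetical sketch in plain Python (invisible_chars is a name I made up for this example), using the standard unicodedata module.

```python
import unicodedata

def invisible_chars(name):
    """Return (index, code point) pairs for control/format chars in name."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(name)
        # "Cf" covers format characters such as U+FEFF, "Cc" control characters.
        if unicodedata.category(ch) in ("Cf", "Cc")
    ]

suspicious = invisible_chars("\ufeffaccount_id")
clean = invisible_chars("account_id")
```

Running a check like this over the header fields before building the schema would have pointed straight at the first column.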

I really hope the exception msg of SparkSQL could be a little more
explicit for developers.


