spark-user mailing list archives

From Ewan Leith <>
Subject Re: Spark 2.0.0 - Apply schema on few columns of dataset
Date Mon, 08 Aug 2016 05:56:50 GMT
Looking at the encoders api documentation at

== Java ==
Encoders are specified by calling static methods on Encoders.

List<String> data = Arrays.asList("abc", "abc", "xyz");
Dataset<String> ds = context.createDataset(data, Encoders.STRING());

I think you should be calling

.as(Encoders.tuple(Encoders.STRING(), Encoders.STRING()))

or similar.
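A minimal sketch of that approach, assuming a header-bearing CSV and two string columns of interest (the path and the column names "c1" and "c2" are hypothetical; running this requires Spark on the classpath):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

public class TupleEncoderExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("TupleEncoderExample")
                .master("local[*]") // local mode for illustration only
                .getOrCreate();

        // Read the whole CSV, since the reader has no column projection option
        Dataset<Row> df = spark.read()
                .option("header", "true")
                .csv("path/to/file.csv"); // hypothetical path

        // Select only the columns of interest, then apply a tuple encoder;
        // Encoders.tuple composes the per-column expression encoders
        Dataset<Tuple2<String, String>> ds = df
                .select("c1", "c2")
                .as(Encoders.tuple(Encoders.STRING(), Encoders.STRING()));

        ds.show();
        spark.stop();
    }
}
```

Encoders.tuple produces an expression encoder, which is why it avoids the "Only expression encoders are supported" error that a hand-written Encoder implementation triggers.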


On 8 Aug 2016 06:10, Aseem Bansal <> wrote:
Hi All

Has anyone done this with Java API?

On Fri, Aug 5, 2016 at 5:36 PM, Aseem Bansal <<>> wrote:
I need to use only a few columns out of a CSV. As there is no option to read just a few columns from a CSV:
 1. I am reading the whole CSV using SparkSession.csv()
 2. selecting a few of the columns using select()
 3. applying a schema using the .as() function of Dataset<Row>. I tried to extend org.apache.spark.sql.Encoder
to use as the input for the as() function.

But I am getting the following exception

Exception in thread "main" java.lang.RuntimeException: Only expression encoders are supported

So my questions are:
1. Is it possible to read only a few columns instead of the whole CSV? I cannot change the CSV as that
is upstream data.
2. How do I apply a schema to a few columns if I cannot write my own encoder?
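One common pattern for question 2 that avoids writing a custom encoder (an assumption on my part, not something stated in the thread) is to define a Java bean matching the selected columns and use Encoders.bean(), which yields an expression encoder. The Person class, column names, and path below are hypothetical:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BeanEncoderExample {
    // Hypothetical bean whose properties match the selected CSV columns
    public static class Person implements java.io.Serializable {
        private String name;
        private String city;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getCity() { return city; }
        public void setCity(String city) { this.city = city; }
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("BeanEncoderExample")
                .master("local[*]") // local mode for illustration only
                .getOrCreate();

        Dataset<Row> df = spark.read()
                .option("header", "true")
                .csv("path/to/file.csv"); // hypothetical path

        // Project down to the needed columns, then map rows onto the bean
        Dataset<Person> people = df
                .select("name", "city")
                .as(Encoders.bean(Person.class));

        people.show();
        spark.stop();
    }
}
```

Because the projection happens before .as(), the schema is applied only to the selected columns, not the whole CSV.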
