spark-dev mailing list archives

From Shagun Sodhani <sshagunsodh...@gmail.com>
Subject Lead operator not working as aggregation operator
Date Mon, 02 Nov 2015 10:33:56 GMT
Hi! I was trying out window functions in Spark SQL (using HiveContext) and
noticed that, although this
<https://issues.apache.org/jira/browse/TAJO-919?jql=text%20~%20%22lag%20window%22>
mentions that *lead* is implemented as an aggregate operator, that does not
seem to be the case.

I am using the following configuration:

Query : SELECT lead(max(`expenses`)) FROM `table` GROUP BY `customerId`
Spark Version: 10.4
SparkSql Version: 1.5.1

I am using the standard example with a (`customerId`, `expenses`) schema, where
each customer has multiple values for expenses (though I am setting age as
Double and not Int, as I am trying out math functions).
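
To make the intent of the query concrete, here is a minimal plain-Python sketch of what *lead* applied to the per-customer maximum would be expected to return. It assumes the grouped rows are ordered by ascending `customerId` (the query itself specifies no ordering) and uses made-up sample rows; the helper name `lead` and its `offset`/`default` parameters are only illustrative. It shows the expected semantics only, not how Spark or Hive evaluates the function.

```python
from collections import defaultdict


def lead(values, offset=1, default=None):
    """For each position i, return the value at i + offset, or default past the end."""
    return [
        values[i + offset] if i + offset < len(values) else default
        for i in range(len(values))
    ]


# Hypothetical (customerId, expenses) rows; not taken from the actual table.
rows = [(1, 10.0), (1, 25.5), (2, 7.0), (3, 40.0), (3, 12.5)]

# Equivalent of GROUP BY customerId with max(expenses).
max_by_customer = defaultdict(lambda: float("-inf"))
for customer_id, expenses in rows:
    max_by_customer[customer_id] = max(max_by_customer[customer_id], expenses)

# Assumed ordering: ascending customerId (the original query does not define one).
ordered_ids = sorted(max_by_customer)
maxima = [max_by_customer[c] for c in ordered_ids]

# Each customer is paired with the next customer's maximum; the last one gets None.
result = list(zip(ordered_ids, lead(maxima)))
print(result)
```

Without a defined ordering, "the next row" is ambiguous, which is why the sketch has to pick one explicitly.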


Running the query fails with:

*java.lang.NullPointerException at
org.apache.hadoop.hive.ql.udf.generic.GenericUDFLeadLag.evaluate(GenericUDFLeadLag.java:57)*

The entire error stack can be found here <http://pastebin.com/jTRR4Ubx>.

Can someone confirm if this is an actual issue or some oversight on my part?

Thanks!
