spark-user mailing list archives

From Jerry <>
Subject Is there any external dependencies for lag() and lead() when using data frames?
Date Mon, 10 Aug 2015 20:26:23 GMT

Using Apache Spark 1.4.1, I'm unable to use lag or lead when querying a
data frame, and I'm trying to figure out whether I just have a bad setup or
whether this is a bug. The exceptions I get: when using selectExpr() with a
string argument, I get "NoSuchElementException: key not found: lag", and
when using the select method with ...spark.sql.functions.lag, I get an
AnalysisException. If I replace lag with abs in the first case, Spark runs
without an exception, so the rest of the syntax appears to be correct.
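
For reference, lag and lead return the value a fixed number of rows before
or after the current row within an ordered set of rows, or null when no such
row exists. The plain-Java sketch below illustrates only those semantics. It
does not use the Spark API, and the class and method names (LagLeadSketch,
lag, lead, shift) are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain-Java model of lag/lead over an already ordered list.
// None of these names come from Spark.
public class LagLeadSketch {

    // For each row, the value `offset` rows earlier; null where no such row exists.
    public static <T> List<T> lag(List<T> rows, int offset) {
        return shift(rows, -offset);
    }

    // For each row, the value `offset` rows later; null where no such row exists.
    public static <T> List<T> lead(List<T> rows, int offset) {
        return shift(rows, offset);
    }

    // Reads each output position i from input position i + delta, if it is in range.
    private static <T> List<T> shift(List<T> rows, int delta) {
        List<T> out = new ArrayList<>(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            int j = i + delta;
            out.add(j >= 0 && j < rows.size() ? rows.get(j) : null);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(10, 20, 30, 40);
        System.out.println("lag:  " + lag(values, 1));  // lag:  [null, 10, 20, 30]
        System.out.println("lead: " + lead(values, 1)); // lead: [20, 30, 40, null]
    }
}
```

A query using lag or lead on a data frame would be expected to produce a
column shaped like these lists, given an ordering over the rows.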

As for how I'm running it: the code is written in Java, in a static method
that takes the SparkContext as an argument. The SparkContext is used to
create a JavaSparkContext, which in turn is used to create an SQLContext.
The SQLContext loads a JSON file from the local disk, and the queries are
run on the resulting data frame. FYI: the Java code is compiled, packaged
into a jar, and then put on the classpath with -cp when starting the Spark
shell, so all I do is "" in the shell.

Please let me know what to look for to debug this problem; I'm not sure
where to start.

