spark-user mailing list archives

From: Jerry <jerry.c...@gmail.com>
Subject: Re: Is there any external dependencies for lag() and lead() when using data frames?
Date: Mon, 10 Aug 2015 21:38:56 GMT
Thanks. It looks like I have now hit the HiveMetaStoreClient bug: I get
the message about being unable to instantiate it. On a side note, does
anyone know where hive-site.xml is typically located?

Thanks,
        Jerry

On Mon, Aug 10, 2015 at 2:03 PM, Michael Armbrust <michael@databricks.com>
wrote:

> You will need to use a HiveContext for window functions to work.
>
> On Mon, Aug 10, 2015 at 1:26 PM, Jerry <jerry.comp@gmail.com> wrote:
>
>> Hello,
>>
>> Using Apache Spark 1.4.1, I'm unable to use lag or lead when querying a
>> data frame, and I'm trying to figure out whether I just have a bad setup
>> or whether this is a bug. As for the exceptions: when using selectExpr()
>> with a string argument, I get "NoSuchElementException: key not found:
>> lag", and when using the select method with
>> ...spark.sql.functions.lag I get an AnalysisException. If I replace lag
>> with abs in the first case, Spark runs without exception, so the rest of
>> the syntax is correct.
>>
>> As for how I'm running it: the code is written in Java, with a static
>> method that takes the SparkContext as an argument. That context is used
>> to create a JavaSparkContext, which in turn is used to create an
>> SQLContext. The SQLContext loads a JSON file from the local disk, and the
>> queries run on the resulting data frame. FYI: the Java code is compiled,
>> packaged into a jar, and then pointed to with -cp when starting the Spark
>> shell, so all I do is "Test.run(sc)" in the shell.
>>
>> Let me know what to look for to debug this problem; I'm not sure where
>> to start.
>>
>> Thanks,
>>         Jerry
>>
>
>
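
For anyone checking results once the HiveContext setup works, here is a minimal plain-Java sketch of what lag and lead are expected to return over a single, already ordered partition. It does not use Spark at all; the class name LagLeadSketch and its helper methods are hypothetical names chosen for illustration, and rows with no neighbor at the requested offset are assumed to yield null, which is the default behavior of the SQL window functions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Spark or of the original code.
public class LagLeadSketch {

    // For each row, the value `offset` rows earlier; null where no such row exists.
    public static <T> List<T> lag(List<T> ordered, int offset) {
        return shift(ordered, -offset);
    }

    // For each row, the value `offset` rows later; null where no such row exists.
    public static <T> List<T> lead(List<T> ordered, int offset) {
        return shift(ordered, offset);
    }

    // Shared helper: reads the element at position i + delta for each index i.
    private static <T> List<T> shift(List<T> ordered, int delta) {
        List<T> out = new ArrayList<>(ordered.size());
        for (int i = 0; i < ordered.size(); i++) {
            int j = i + delta;
            out.add(j >= 0 && j < ordered.size() ? ordered.get(j) : null);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(10, 20, 30);
        System.out.println(lag(values, 1));   // prints [null, 10, 20]
        System.out.println(lead(values, 1));  // prints [20, 30, null]
    }
}
```

In Spark 1.4 the corresponding DataFrame expression would be built from functions.lag or functions.lead combined with a window specification from org.apache.spark.sql.expressions.Window, on a data frame created through a HiveContext rather than a plain SQLContext, as noted in the reply above.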
