spark-user mailing list archives

From Meeraj Kunnumpurath <mee...@servicesymphony.com>
Subject UDF for gradient ascent
Date Sat, 26 Nov 2016 15:31:50 GMT
Hello,

I have a dataset of features for which I want to compute the likelihood
value, as part of implementing gradient ascent to estimate coefficients. I
have written a UDF that computes the probability function for each feature,
as shown below.

def getLikelihood(cfs: List[(String, Double)], df: DataFrame) = {
  val pr = udf((r: Row) => {
    cfs.foldLeft(0.0)((x, y) => x * 1 / Math.pow(Math.E, r.getAs[Double](y._1) * y._2))
  })
  df.withColumn("probabibility", pr(struct(df.columns.map(df(_)): _*)))
    .agg(sum('probabibility)).first.get(0)
}
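
The per-row step inside the UDF can be reproduced in plain Scala, without Spark, to check the arithmetic in isolation. This is only a minimal sketch: `LikelihoodSketch`, `rowProduct`, the `Map` standing in for a `Row`, the sample feature names and values, and the `seed` parameter are all hypothetical and not part of the code above. It shows that a multiplicative fold started from 0.0 always returns 0.0, and it assumes (without confirmation) that a seed of 1.0 may have been intended.

```scala
object LikelihoodSketch {
  // Hypothetical stand-in for a Spark Row: feature name -> feature value.
  def rowProduct(cfs: List[(String, Double)],
                 row: Map[String, Double],
                 seed: Double): Double =
    cfs.foldLeft(seed) { case (acc, (name, coef)) =>
      // Same step as the UDF: multiply by 1 / e^(feature * coefficient).
      acc * 1 / math.pow(math.E, row(name) * coef)
    }

  // Made-up sample data, for illustration only.
  val cfs: List[(String, Double)] = List(("x1", 0.5), ("x2", -1.0))
  val row: Map[String, Double] = Map("x1" -> 2.0, "x2" -> 1.0)

  def main(args: Array[String]): Unit = {
    // Seed used in the UDF above: every product stays at zero.
    println(rowProduct(cfs, row, 0.0))
    // Assumed alternative seed: the product of the individual terms.
    println(rowProduct(cfs, row, 1.0))
  }
}
```

This does not address the `CompileException` itself, which is raised while compiling Spark's generated code; it only isolates the numeric part of the UDF.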

When I run it I get a long exception trace listing some generated code, as
shown below.

org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 2445, Column 34: Expression "scan_isNull1" is not an rvalue
at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:10174)
at org.codehaus.janino.UnitCompiler.toRvalueOrCompileException(UnitCompiler.java:6036)
at org.codehaus.janino.UnitCompiler.getConstantValue2(UnitCompiler.java:4440)
at org.codehaus.janino.UnitCompiler.access$9900(UnitCompiler.java:185)
at org.codehaus.janino.UnitCompiler$11.visitAmbiguousName(UnitCompiler.java:4417)

This is line 2445 in the generated code:

/* 2445 */     Object project_arg = scan_isNull1 ? null : project_converter2.apply(scan_value1);

Many thanks



-- 
*Meeraj Kunnumpurath*


*Director and Executive Principal*
*Service Symphony Ltd*
*00 44 7702 693597*
*00 971 50 409 0169*
*meeraj@servicesymphony.com*
