spark-user mailing list archives

From Praneeth Gayam <praneeth.ga...@gmail.com>
Subject Re: use WithColumn with external function in a java jar
Date Tue, 29 Aug 2017 02:22:54 GMT
You can create a UDF which will invoke your java lib

import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf

def calculateExpense: UserDefinedFunction = udf((pexpense: String, cexpense: String) =>
  new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))
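Putting it together, a minimal runnable sketch might look like the following. `MyJava` here is a stand-in for the external Java library from the question (its real calculation is not shown, so a placeholder sum is used), and the sample data is assumed:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ExpenseExample {
  // Stand-in for the external Java class from the question;
  // the actual calculation is unknown, so a sum is used as a placeholder.
  class MyJava {
    def calculateExpense(pexpense: Double, cexpense: Double): Double =
      pexpense + cexpense
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-example")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample rows matching the CSV columns from the question
    val df = Seq(("e1", "100.0", "50.5"), ("e2", "20.0", "30.0"))
      .toDF("employeeid", "pexpense", "cexpense")

    // Wrap the external call in a UDF so Spark can apply it per row
    val calculateExpense = udf((p: String, c: String) =>
      new MyJava().calculateExpense(p.toDouble, c.toDouble))

    val result = df.withColumn(
      "expense",
      calculateExpense(col("pexpense"), col("cexpense")))
    result.show()

    spark.stop()
  }
}
```

The new object constructed inside the UDF is created once per task on each executor when the closure is deserialized, so the Java class only needs to be serializable (or constructed inside the lambda, as here) to work in a cluster.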

On Tue, Aug 29, 2017 at 6:53 AM, purna pradeep <purna2pradeep@gmail.com>
wrote:

> I have data in a DataFrame with the below columns
>
> 1) The file format is CSV
> 2) All of the below columns are of type String
>
>     employeeid,pexpense,cexpense
>
> Now I need to create a new DataFrame which has a new column called
> `expense`, calculated from the columns `pexpense` and `cexpense`.
>
> The tricky part is that the calculation algorithm is not a **UDF** which I
> created, but an external function that needs to be imported from a Java
> library. It takes primitive types as arguments - in this case `pexpense`
> and `cexpense` - to calculate the value required for the new column.
>
> The external function signature
>
>     public class MyJava {
>
>         public Double calculateExpense(Double pexpense, Double cexpense) {
>             // calculation
>         }
>
>     }
>
> So how can I invoke that external function to create a new calculated
> column? Can I register that external function as a UDF in my Spark
> application?
>
> Stackoverflow reference
>
> https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function
>
