Couldn't you include all the needed columns in your input dataframe?
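
One way to read this suggestion, as a minimal sketch: select every existing column alongside the struct returned by the udf, then flatten the struct in a second select. This assumes Spark 2.x or later with a SparkSession. The names AddTupleColumns, addXY, x and y, and the udf body, are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object AddTupleColumns {
  // Hypothetical udf returning a tuple; Spark exposes it as a struct with fields _1 and _2.
  val func = udf((i: Int) => (i, i * 2))

  // Keeps all existing columns of df and appends x and y taken from the udf result.
  // Assumes df has an integer column "a" and no existing column named "r".
  def addXY(df: DataFrame): DataFrame =
    df.select(col("*"), func(col("a")).as("r"))
      .select(col("*"), col("r._1").as("x"), col("r._2").as("y"))
      .drop("r")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("tuple-udf").getOrCreate()
    import spark.implicits._
    val df = Seq((1, 0), (2, 5)).toDF("a", "b")
    addXY(df).show()
    spark.stop()
  }
}
```

This still uses an intermediate column, just inside a helper. It also does not by itself guarantee that the udf runs only once per row; that depends on how the optimizer collapses the two projections in your Spark version.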

// maropu

On Fri, May 27, 2016 at 1:46 AM, Koert Kuipers <koert@tresata.com> wrote:
that is nice and compact, but it does not add the columns to an existing dataframe

On Wed, May 25, 2016 at 11:39 PM, Takeshi Yamamuro <linguin.m.s@gmail.com> wrote:
Hi,

How about this?
--
val func = udf((i: Int) => Tuple2(i, i))
val df = Seq((1, 0), (2, 5)).toDF("a", "b")
df.select(func($"a").as("r")).select($"r._1", $"r._2")

// maropu


On Thu, May 26, 2016 at 5:11 AM, Koert Kuipers <koert@tresata.com> wrote:
hello all,

i have a single udf that creates 2 outputs (so a tuple 2). i would like to add these 2 columns to my dataframe.

my current solution is along these lines:
df
  .withColumn("_temp_", udf(inputColumns))
  .withColumn("x", col("_temp_")("_1"))
  .withColumn("y", col("_temp_")("_2"))
  .drop("_temp_")

this works, but it's not pretty with the temporary field stuff.

i also tried this:
val tmp = udf(inputColumns)
df
  .withColumn("x", tmp("_1"))
  .withColumn("y", tmp("_2"))

this also works, but unfortunately the udf is evaluated twice
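
one way to check this is to print the query plan and see how many times the udf shows up in the projection. a minimal sketch, assuming Spark 2.x or later; the udf body and the names InspectUdfPlan, pair and twoColumns are hypothetical, and the plan text differs across Spark versions:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object InspectUdfPlan {
  // Hypothetical stand-in for the two-output udf described above.
  val pair = udf((i: Int) => (i, i + 1))

  // Same shape as the second attempt: one udf column expression, two field extractions.
  // Assumes df has an integer column "a".
  def twoColumns(df: DataFrame): DataFrame = {
    val tmp = pair(col("a"))
    df.withColumn("x", tmp("_1")).withColumn("y", tmp("_2"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("inspect-udf").getOrCreate()
    import spark.implicits._
    val out = twoColumns(Seq((1, 0), (2, 5)).toDF("a", "b"))
    // Prints the parsed, analyzed, optimized and physical plans;
    // look for the udf call appearing once per extracted field.
    out.explain(true)
    spark.stop()
  }
}
```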

is there a better way to do this?

thanks! koert



--
---
Takeshi Yamamuro



