spark-issues mailing list archives

From "Felix Cheung (JIRA)" <>
Subject [jira] [Commented] (SPARK-18823) Assignation by column name variable not available or bug?
Date Tue, 13 Dec 2016 07:15:58 GMT


Felix Cheung commented on SPARK-18823:

How important is it to support
df[[myname]] <- c(1:nrow(df))
df[[2]] <- df$eruptions

I think we should support
df$waiting <- c(1:nrow(df))

which I plan to work on.

> Assignation by column name variable not available or bug?
> ---------------------------------------------------------
>                 Key: SPARK-18823
>                 URL:
>             Project: Spark
>          Issue Type: Question
>          Components: SparkR
>    Affects Versions: 2.0.2
>         Environment: RStudio Server on EC2 instances (AWS EMR service), EMR 4, or Databricks
>            Reporter: Vicente Masip
>             Fix For: 2.0.2
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> I really don't know if this is a bug or whether it can be done with some function.
> Sometimes it is very important to assign something to a column whose name has to be accessed through a variable. Outside of SparkR, I have always done this with double brackets, like this:
> # df could be faithful normal data frame or data table.
> # accessing by variable name:
> myname = "waiting"
> df[[myname]] <- c(1:nrow(df))
> # or even column number
> df[[2]] <- df$eruptions
> The error is not caused by the right-hand side of the "<-" assignment operator. The problem is that I can't assign to a column using a variable name or a column number, as I do in these examples outside of Spark. It doesn't matter whether I am modifying or creating a column; the problem is the same.
> I have also tried to use this with no results:
> val df2 = withColumn(df,"tmp", df$eruptions)
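
On the withColumn attempt quoted above: `val` is Scala syntax and is not valid R, which may be part of why that line gave no results. Below is a minimal sketch of a possible workaround, not a confirmed fix for this issue. It assumes a SparkR 2.x session with a local Spark installation, and that `withColumn` takes the column name as a character value and replaces a column that already has that name. The value being assigned has to be a Column expression (such as `df$eruptions`); a local R vector like `c(1:nrow(df))` is a separate case that this sketch does not cover.

```r
library(SparkR)
sparkR.session()  # assumption: a local Spark installation is available

# Build a SparkDataFrame from the built-in faithful data set
df <- createDataFrame(faithful)

# Column name held in a variable
myname <- "waiting"

# withColumn returns a new SparkDataFrame, so reassign the result to df
df <- withColumn(df, myname, df$eruptions)

# Addressing a column by position: look its name up first
df <- withColumn(df, columns(df)[2], df$eruptions)

head(df)
```

Because SparkDataFrames are immutable, each call builds a new SparkDataFrame rather than modifying `df` in place, which is why the result is assigned back to `df`.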
