spark-user mailing list archives

From: pietrop
Subject: Spark R guidelines for non-spark functions and coxph (Cox Regression for Time-Dependent Covariates)
Date: Tue, 15 Nov 2016 11:56:27 GMT
Hi all,
I'm writing here after some intensive use of PySpark and Spark SQL.
I would like to use a well known function in the R world: coxph() from the
survival package.
From what I understand, I can't parallelize a function like coxph() because
it isn't provided by the SparkR package. In other words, I would have to
implement a SparkR-compatible algorithm instead of using coxph().
So there is no way to make coxph() itself run in parallel, right?
More generally, I think this is true for any non-Spark function that only
accepts a data.frame as its data input.

Do you plan to implement a counterpart of coxph() in Spark? The most useful
version of this model is the Cox Regression Model for Time-Dependent
Covariates, which is missing from ANY ML framework as far as I know.
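
For reference, here is what I mean by the time-dependent version. This is
only a minimal single-machine sketch in plain Python, not the survival
package's implementation. It assumes the usual counting-process layout (one
row per (start, stop] interval over which a subject's covariates are
constant), uses the Breslow form for tied event times, and only evaluates the
partial log-likelihood for a given coefficient vector without fitting
anything. The function name cox_partial_loglik and the toy rows are made up
for illustration.

```python
import math


def cox_partial_loglik(rows, beta):
    """Breslow-style Cox partial log-likelihood for counting-process data.

    rows: list of (start, stop, event, covariates) tuples, one per interval
          over which a subject's covariates stay constant.
    beta: list of coefficients, same length as each covariate list.
    """
    def score(x):
        # Linear predictor beta . x for one interval.
        return sum(b * v for b, v in zip(beta, x))

    loglik = 0.0
    for start_i, stop_i, event_i, x_i in rows:
        if not event_i:
            continue  # censored intervals only appear in risk sets
        t = stop_i
        # Risk set at time t: every interval (start, stop] that covers t,
        # so each subject contributes its covariate values current at t.
        risk = [x for (s, e, _, x) in rows if s < t <= e]
        loglik += score(x_i) - math.log(sum(math.exp(score(x)) for x in risk))
    return loglik


# Hypothetical toy data: (start, stop, event, covariates)
rows = [
    (0, 5, 0, [0.0]),   # subject A, covariate is 0 until t = 5
    (5, 8, 1, [1.0]),   # subject A, covariate switches to 1, event at t = 8
    (0, 10, 0, [0.0]),  # subject B, censored at t = 10
    (0, 3, 1, [1.0]),   # subject C, event at t = 3
]

# With beta = 0 every interval at risk has weight 1, so the value is
# -log(3) - log(2): three intervals at risk at t = 3, two at t = 8.
print(cox_partial_loglik(rows, [0.0]))
```

As far as I can tell, the difficulty for a distributed version is that the
risk-set sum at each event time runs over all subjects, so it cannot be
obtained by simply calling coxph() independently on each partition.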

Thank you
