spark-user mailing list archives

From "itai.efrati" <>
Subject Mlib RandomForest (Spark 2.0) predict a single vector
Date Mon, 01 Aug 2016 19:32:45 GMT

After training a RandomForestRegressor inside a PipelineModel using the
DataFrame-based MLlib API (Spark 2.0), I loaded the saved model into my RT
environment in order to serve predictions with it. Each request is handled
by calling transform on the loaded PipelineModel, but to do that I have to
convert the single request vector into a one-row DataFrame using
spark.createDataFrame. All of this takes around 700 ms!

Compare that to 2.5 ms if I use the RDD-based mllib API.
Is there any way to use the new DataFrame-based MLlib API to predict a single
vector without converting it to a DataFrame, or something else I can do to
speed things up?
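
A minimal sketch of the per-request path described in this message, written in PySpark for illustration (the message does not say which language was used). The function names, the "features" and "prediction" column names, and the model path in the usage comment are assumptions, not taken from the original code. The timing helper only measures wall-clock time per call, which is one way to reproduce the kind of latency figure quoted here.

```python
import time


def mean_latency_ms(fn, repeats=10):
    """Call fn() `repeats` times and return the mean wall-clock time per call in ms."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000.0 / repeats


def predict_via_dataframe(spark, model, features,
                          features_col="features", prediction_col="prediction"):
    """Score one feature vector by wrapping it in a one-row DataFrame.

    `spark` is a SparkSession and `model` is a loaded pyspark.ml PipelineModel.
    The column names are assumptions; use whatever the pipeline was trained with.
    """
    # Imported lazily so the timing helper above works without Spark installed.
    from pyspark.ml.linalg import Vectors

    # One-row DataFrame holding the request vector, as described in the message.
    row_df = spark.createDataFrame([(Vectors.dense(features),)], [features_col])
    # Run the whole pipeline on that single row and pull out the prediction.
    return model.transform(row_df).select(prediction_col).first()[0]


# Hypothetical usage (path and feature values are placeholders):
#   from pyspark.ml import PipelineModel
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.master("local[1]").getOrCreate()
#   model = PipelineModel.load("/path/to/saved/model")
#   call = lambda: predict_via_dataframe(spark, model, [1.0, 2.0, 3.0])
#   print(mean_latency_ms(call))
```

Every call builds a new DataFrame and runs a Spark job for a single row, so the measured time would be expected to include job planning and scheduling as well as the model evaluation itself.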
