spark-user mailing list archives

From ayan guha <guha.a...@gmail.com>
Subject SPARK MLLib - How to tie back Model.predict output to original data?
Date Wed, 17 Aug 2016 00:48:37 GMT
Hi

I have a dataset as follows:

DF:
amount:float
date_read:date
meter_number:string

I am trying to predict the future amount based on the past 3 weeks of
consumption (plus a heap of weather data related to the date).

My LabeledPoint looks like:

label (populated from DF.amount)
features (populated from a bunch of other stuff)

Model.predict output:
label
prediction

Now, how do I tie this prediction value back to the meter_number and
date_read from the original DF?

One way is to assume that the order of records in DF and in the
Model.predict output will be exactly the same, and zip the two RDDs. Is
there any other (possibly better) solution?
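
To make the question concrete, here is a minimal sketch of the kind of
alternative I have in mind: carry the identifying key (meter_number,
date_read) alongside each record's label and features, so the prediction is
produced next to its key and nothing depends on record order. This is plain
Python, not Spark API. The names predict_with_keys and mean_model and the
sample values are hypothetical, and mean_model only stands in for a trained
model. Whether the same pattern can be applied directly to an RDD of
(key, LabeledPoint) pairs is exactly what I am unsure about.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Key = Tuple[str, str]        # (meter_number, date_read)
Features = Sequence[float]


def predict_with_keys(
    records: Iterable[Tuple[Key, float, Features]],
    predict: Callable[[Features], float],
) -> List[Tuple[Key, float, float]]:
    """Return (key, label, prediction) so each prediction keeps its identifiers."""
    return [(key, label, predict(features)) for key, label, features in records]


def mean_model(features: Features) -> float:
    """Hypothetical stand-in for a trained model: the mean of the features."""
    return sum(features) / len(features)


# Made-up sample rows: key, label (amount), features (past consumption).
records = [
    (("M001", "2016-08-01"), 12.0, [10.0, 12.0, 14.0]),
    (("M002", "2016-08-01"), 7.5, [6.0, 8.0, 10.0]),
]

for (meter_number, date_read), label, prediction in predict_with_keys(records, mean_model):
    print(meter_number, date_read, label, prediction)
```
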

-- 
Best Regards,
Ayan Guha
