spark-user mailing list archives

From Nicholas Hakobian <nicholas.hakob...@rallyhealth.com>
Subject Re: use CrossValidatorModel for prediction
Date Mon, 03 Oct 2016 16:27:57 GMT
It means that somewhere in your job you are creating a new
SQLContext/HiveContext a large number of times. A new tab is created for
each context. If you want to reuse a context without passing an explicit
reference around, you can use the getOrCreate() function.
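
A minimal sketch of that reuse, applied to the snippet quoted below. It
assumes Spark 1.5 or later, where SQLContext.getOrCreate(sparkContext) is
available. The Features case class and the predictAll helper are hypothetical
stand-ins introduced only so the example is self-contained.

```scala
import org.apache.spark.ml.tuning.CrossValidatorModel
// Assumption: Spark 1.x vector type; on Spark 2.x use org.apache.spark.ml.linalg.Vectors.
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical stand-in for the Features type in the original question.
case class Features(values: Array[Double]) {
  def getArray: Array[Double] = values
}

// Hypothetical helper wrapping the quoted snippet.
def predictAll(model: CrossValidatorModel, features: RDD[Features]): DataFrame = {
  // Returns the SQLContext already associated with this SparkContext, or
  // creates one on first use, instead of constructing a new context per call.
  val sqlContext = SQLContext.getOrCreate(features.context)

  val input: DataFrame = sqlContext
    .createDataFrame(features.map(x => (Vectors.dense(x.getArray), 1.0)))
    .toDF("features", "label")

  model.transform(input)
}
```

On Spark 2.x, SQLContext.getOrCreate is deprecated and
SparkSession.builder.getOrCreate() plays the same role.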

-Nick

Nicholas Szandor Hakobian, Ph.D.
Senior Data Scientist
Rally Health
nicholas.hakobian@rallyhealth.com


On Sun, Oct 2, 2016 at 7:12 PM, Pengcheng <pchluo@gmail.com> wrote:

> After I used DataFrames and Spark SQL, the dashboard showed a bunch of SQLs.
>
> Does that mean all the SQL jobs are still running? The list keeps growing
> and never stops.
>
>
> Thanks for any pointers!
>
>
> peter
>
>
>
> [image: Inline image 1]
>
> On Sun, Oct 2, 2016 at 11:04 AM, Pengcheng Luo <pchluo@gmail.com> wrote:
>
>>
>>
>> On Oct 2, 2016, at 1:04 AM, Pengcheng <pchluo@gmail.com> wrote:
>>
>> Dear Spark Users,
>>
>> I was wondering.
>>
>> I have a trained crossvalidator model
>> *model: CrossValidatorModel*
>>
>> I want to predict a score for *features: RDD[Features]*
>>
>> Right now I have to convert the features to a DataFrame and then perform
>> predictions as follows:
>>
>> """
>>     val sqlContext = new SQLContext(features.context)
>>     val input: DataFrame = sqlContext.createDataFrame(features.map(x =>
>> (Vectors.dense(x.getArray),1.0) )).toDF("features", "label")
>>     model.transform(input)
>> """
>>
>> *I wonder if there is any API I can use to perform prediction on each
>> individual Features*
>>
>> *For example, features.map( x => model.predict(x) ) *
>>
>>
>>
>> a big thank you!
>>
>> peter
>>
>>
>
