spark-user mailing list archives

From Debasish Das <debasish.da...@gmail.com>
Subject Re: Spark LIBLINEAR
Date Wed, 14 May 2014 14:32:07 GMT
Hi Professor Lin,

On our internal datasets, I am getting accuracy on par with glmnet in R for
sparse feature selection using liblinear. The default MLlib gradient
descent was way off: I did not tune the learning rate, but I ran it with
varying lambda, and the feature selection was weak.
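
To make "varying lambda" and "sparse feature selection" concrete, here is a
minimal sketch of L1-regularized logistic regression on synthetic data. It
uses plain proximal gradient descent (soft-thresholding), which is an
assumption chosen for brevity; it is not TRON, OWL-QN, the liblinear solver,
or the MLlib gradient descent. The function names, the step size, and the
data generator are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, iters=500):
    # Minimize mean log-loss + lam * ||w||_1 by proximal gradient descent.
    # Hypothetical helper for illustration only, not a library API.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        # Gradient step on the smooth part, then shrink for the L1 part.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w


def make_data(n=200, d=8, seed=0):
    # Synthetic data: only the first two features carry signal.
    rng = random.Random(seed)
    true_w = [2.0, -2.0] + [0.0] * (d - 2)
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0.0, 1.0) for _ in range(d)]
        p = sigmoid(sum(a * b for a, b in zip(true_w, xi)))
        X.append(xi)
        y.append(1.0 if rng.random() < p else 0.0)
    return X, y


def nonzeros(w):
    # Number of features the model kept.
    return sum(1 for wj in w if wj != 0.0)


if __name__ == "__main__":
    X, y = make_data()
    for lam in (0.0, 0.01, 0.05, 0.2, 10.0):
        w = fit_l1_logistic(X, y, lam)
        print("lambda =", lam, "selected features =", nonzeros(w))
```

Larger lambda shrinks more weights exactly to zero, so the count of selected
features is what one would compare across solvers for a given lambda.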

I used the liblinear code. Next I will explore the distributed liblinear.

Putting the code on GitHub will definitely help collaboration.

I am experimenting to see whether a BFGS / OWL-QN based sparse logistic
regression in Spark MLlib gives us accuracy on par with liblinear.

If the liblinear solver outperforms them (in either accuracy or performance),
we will have to bring TRON to MLlib and let other algorithms benefit from it
as well.

We are using the BFGS and OWL-QN solvers from Breeze's optimize package.

Thanks.
Deb
 On May 12, 2014 9:07 PM, "DB Tsai" <dbtsai@stanford.edu> wrote:

> It seems that the code isn't managed in github. Can be downloaded from
> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/spark/spark-liblinear-1.94.zip
>
> It will be easier to track the changes in github.
>
>
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Mon, May 12, 2014 at 7:53 AM, Xiangrui Meng <mengxr@gmail.com> wrote:
>
>> Hi Chieh-Yen,
>>
>> Great to see the Spark implementation of LIBLINEAR! We will definitely
>> consider adding a wrapper in MLlib to support it. Is the source code
>> on github?
>>
>> Deb, Spark LIBLINEAR uses BSD license, which is compatible with Apache.
>>
>> Best,
>> Xiangrui
>>
>> On Sun, May 11, 2014 at 10:29 AM, Debasish Das <debasish.das83@gmail.com>
>> wrote:
>> > Hello Prof. Lin,
>> >
>> > Awesome news ! I am curious if you have any benchmarks comparing C++ MPI
>> > with Scala Spark liblinear implementations...
>> >
>> > Is Spark Liblinear apache licensed or there are any specific
>> > restrictions on using it ?
>> >
>> > Except using native blas libraries (which each user has to manage by
>> > pulling in their best proprietary BLAS package), all Spark code is
>> > Apache licensed.
>> >
>> > Thanks.
>> > Deb
>> >
>> >
>> > On Sun, May 11, 2014 at 3:01 AM, DB Tsai <dbtsai@stanford.edu> wrote:
>> >>
>> >> Dear Prof. Lin,
>> >>
>> >> Interesting! We had an implementation of L-BFGS in Spark and already
>> >> merged in the upstream now.
>> >>
>> >> We read your paper comparing TRON and OWL-QN for logistic regression
>> >> with L1 (http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf), but it
>> >> seems that it's not in the distributed setup.
>> >>
>> >> Will be very interesting to know the L2 logistic regression benchmark
>> >> result in Spark with your TRON optimizer and the L-BFGS optimizer
>> >> against different datasets (sparse, dense, and wide, etc).
>> >>
>> >> I'll try your TRON out soon.
>> >>
>> >>
>> >> Sincerely,
>> >>
>> >> DB Tsai
>> >> -------------------------------------------------------
>> >> My Blog: https://www.dbtsai.com
>> >> LinkedIn: https://www.linkedin.com/in/dbtsai
>> >>
>> >>
>> >> On Sun, May 11, 2014 at 1:49 AM, Chieh-Yen <r01944006@csie.ntu.edu.tw>
>> >> wrote:
>> >>>
>> >>> Dear all,
>> >>>
>> >>> Recently we released a distributed extension of LIBLINEAR at
>> >>>
>> >>> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/
>> >>>
>> >>> Currently, TRON for logistic regression and L2-loss SVM is supported.
>> >>> We provided both MPI and Spark implementations.
>> >>> This is very preliminary so your comments are very welcome.
>> >>>
>> >>> Thanks,
>> >>> Chieh-Yen
>> >>
>> >>
>> >
>>
>
>
