spark-user mailing list archives

From DB Tsai <dbt...@stanford.edu>
Subject Re: Spark LIBLINEAR
Date Thu, 15 May 2014 18:43:26 GMT
Hi Deb,

My co-worker fixed an OWL-QN bug in breeze, and this fix is important for
converging to the correct result.

https://github.com/scalanlp/breeze/pull/247

You may want to use a snapshot build of breeze to pick up this fix.
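
For readers following the thread: OWL-QN targets L1-regularized objectives, where the penalty drives some weights to exactly zero, which is why a solver that stops at the wrong point also selects the wrong features. The sketch below is not OWL-QN or TRON and does not use breeze, MLlib, or LIBLINEAR. It swaps in plain proximal gradient descent (soft-thresholding) in pure Python on a made-up four-row dataset, only to illustrate how lambda controls sparsity in L1-regularized logistic regression. All names here (`soft_threshold`, `fit_l1_logistic`, `selected`, `step`, `iters`) are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(u, t):
    # Proximal operator of t * |u|: shrink toward zero,
    # and return exactly 0.0 when u lies inside [-t, t].
    if u > t:
        return u - t
    if u < -t:
        return u + t
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, iters=500):
    # Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent.
    # step and iters are illustrative defaults (assumptions), not tuned values.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(d)]
    return w


def selected(w):
    # Indices of features whose weight is nonzero.
    return [j for j, wj in enumerate(w) if wj != 0.0]


# Made-up data: feature 0 separates the classes, feature 1 is uninformative.
X = [[1.0, 0.1], [2.0, -0.1], [-1.0, 0.1], [-2.0, -0.1]]
y = [1, 1, 0, 0]

for lam in (0.1, 10.0):
    w = fit_l1_logistic(X, y, lam)
    print(lam, selected(w))
```

With a large lambda the penalty exceeds the gradient magnitude at w = 0, so every weight stays at zero. A smaller lambda lets the informative feature enter the model. A quasi-Newton method such as OWL-QN aims at the same minimizer with far fewer iterations on real data.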


Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Wed, May 14, 2014 at 7:32 AM, Debasish Das <debasish.das83@gmail.com> wrote:

> Hi Professor Lin,
>
> On our internal datasets, I am getting accuracy on par with glmnet in R for
> sparse feature selection from liblinear. The default MLlib gradient
> descent was way off. I did not tune the learning rate, but I ran with varying
> lambda. The feature selection was weak.
>
> I used liblinear code. Next I will explore the distributed liblinear.
>
> Adding the code on github will definitely help for collaboration.
>
> I am experimenting to see whether a BFGS / OWL-QN based sparse logistic
> regression in Spark MLlib gives us accuracy on par with liblinear.
>
> If the liblinear solver outperforms them (in either accuracy or performance),
> we have to bring TRON to MLlib and let other algorithms benefit from it as well.
>
> We are using Bfgs and Owlqn solvers from breeze opt.
>
> Thanks.
> Deb
>  On May 12, 2014 9:07 PM, "DB Tsai" <dbtsai@stanford.edu> wrote:
>
>> It seems that the code isn't managed on GitHub. It can be downloaded from
>> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/spark/spark-liblinear-1.94.zip
>>
>> It will be easier to track the changes in github.
>>
>>
>>
>> Sincerely,
>>
>> DB Tsai
>> -------------------------------------------------------
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
>>
>> On Mon, May 12, 2014 at 7:53 AM, Xiangrui Meng <mengxr@gmail.com> wrote:
>>
>>> Hi Chieh-Yen,
>>>
>>> Great to see the Spark implementation of LIBLINEAR! We will definitely
>>> consider adding a wrapper in MLlib to support it. Is the source code
>>> on github?
>>>
>>> Deb, Spark LIBLINEAR uses BSD license, which is compatible with Apache.
>>>
>>> Best,
>>> Xiangrui
>>>
>>> On Sun, May 11, 2014 at 10:29 AM, Debasish Das <debasish.das83@gmail.com>
>>> wrote:
>>> > Hello Prof. Lin,
>>> >
>>> > Awesome news! I am curious if you have any benchmarks comparing C++ MPI
>>> > with Scala Spark liblinear implementations...
>>> >
>>> > Is Spark LIBLINEAR Apache licensed, or are there any specific
>>> > restrictions on using it?
>>> >
>>> > Except for native BLAS libraries (which each user has to manage by
>>> > pulling in their best proprietary BLAS package), all Spark code is
>>> > Apache licensed.
>>> >
>>> > Thanks.
>>> > Deb
>>> >
>>> >
>>> > On Sun, May 11, 2014 at 3:01 AM, DB Tsai <dbtsai@stanford.edu> wrote:
>>> >>
>>> >> Dear Prof. Lin,
>>> >>
>>> >> Interesting! We have an implementation of L-BFGS in Spark, and it is
>>> >> already merged upstream.
>>> >>
>>> >> We read your paper comparing TRON and OWL-QN for logistic regression
>>> >> with L1 (http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf), but it seems
>>> >> that it's not in the distributed setup.
>>> >>
>>> >> It would be very interesting to see L2 logistic regression benchmark
>>> >> results in Spark for your TRON optimizer versus the L-BFGS optimizer
>>> >> on different datasets (sparse, dense, wide, etc.).
>>> >>
>>> >> I'll try your TRON out soon.
>>> >>
>>> >>
>>> >> Sincerely,
>>> >>
>>> >> DB Tsai
>>> >> -------------------------------------------------------
>>> >> My Blog: https://www.dbtsai.com
>>> >> LinkedIn: https://www.linkedin.com/in/dbtsai
>>> >>
>>> >>
>>> >> On Sun, May 11, 2014 at 1:49 AM, Chieh-Yen <r01944006@csie.ntu.edu.tw>
>>> >> wrote:
>>> >>>
>>> >>> Dear all,
>>> >>>
>>> >>> Recently we released a distributed extension of LIBLINEAR at
>>> >>>
>>> >>> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/
>>> >>>
>>> >>> Currently, TRON for logistic regression and L2-loss SVM is supported.
>>> >>> We provided both MPI and Spark implementations.
>>> >>> This is very preliminary so your comments are very welcome.
>>> >>>
>>> >>> Thanks,
>>> >>> Chieh-Yen
>>> >>
>>> >>
>>> >
>>>
>>
>>
