spark-dev mailing list archives

From Joseph Bradley <jos...@databricks.com>
Subject Re: LogisticGradient Design
Date Fri, 27 Mar 2015 19:49:51 GMT
Makes sense!

On Wed, Mar 25, 2015 at 2:46 PM, Debasish Das <debasish.das83@gmail.com>
wrote:

> Cool, thanks. It would be great if they moved into two code paths, just for
> the sake of code clean-up.
>
> On Wed, Mar 25, 2015 at 2:37 PM, DB Tsai <dbtsai@dbtsai.com> wrote:
>
>> I ran a benchmark when I introduced the if-else statement to switch between
>> the binary and multinomial logistic loss and gradient, and there was no
>> performance hit at all. However, I'm refactoring the LogisticGradient
>> code so that addBias and scaling can be done in LogisticGradient
>> instead of on the input dataset, which avoids the second cache. In that
>> case the code will be more complicated, so I will split it into two
>> paths. That will be done in another PR.
>>
>> Sincerely,
>>
>> DB Tsai
>> -------------------------------------------------------
>> Blog: https://www.dbtsai.com
>>
>>
>> On Wed, Mar 25, 2015 at 11:57 AM, Joseph Bradley <joseph@databricks.com>
>> wrote:
>> > It would be nice to see how big a performance hit we take from combining
>> > binary & multiclass logistic loss/gradient.  If it's not a big hit, then
>> > it might be simpler from an outside API perspective to keep them in 1
>> > class (even if it's more complicated within).
>> > Joseph
>> >
>> > On Wed, Mar 25, 2015 at 8:15 AM, Debasish Das <debasish.das83@gmail.com>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> Right now LogisticGradient implements both binary and multi-class in the
>> >> same class using an if-else statement, which is a bit convoluted.
>> >>
>> >> For generalized matrix factorization, if the data has distinct ratings, I
>> >> want to use LeastSquareGradient (regression has given the best results to
>> >> date), but if the data has binary 0/1 labels based on domain knowledge
>> >> (implicit, for example visits vs. no visits), I want to use a
>> >> LogisticGradient without any overhead from the multi-class if-else.
>> >>
>> >> I can compare the performance of LeastSquareGradient and multi-class
>> >> LogisticGradient on the recommendation metrics, but it would be great if
>> >> we could separate binary and multi-class into separate classes.
>> >> MultiClassLogistic can extend BinaryLogistic, but mixing them in the same
>> >> class is an overhead for users (like me) who want to use BinaryLogistic
>> >> for their application.
>> >>
>> >> Thanks.
>> >> Deb
>> >>
>>
>
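For readers following the thread, here is a rough sketch of the two designs being compared: one entry point that switches on the number of classes with an if-else, versus separate binary and multinomial code paths. It is plain Python, not the MLlib Scala code, and every name in it (binary_logistic, multinomial_logistic, logistic_gradient) is hypothetical. It also assumes class 0 is the pivot class with its weights fixed at zero, so K classes need K-1 weight vectors.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def sigmoid(z: float) -> float:
    # Branch on the sign so exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log1p_exp(z: float) -> float:
    # Numerically stable log(1 + exp(z)).
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))


def binary_logistic(x: Sequence[float], label: int,
                    w: Sequence[float]) -> Tuple[float, List[float]]:
    """Loss and gradient for label in {0, 1}, with P(y=1|x) = sigmoid(w.x)."""
    margin = dot(w, x)
    loss = log1p_exp(margin) - label * margin
    p = sigmoid(margin)
    return loss, [(p - label) * xi for xi in x]


def multinomial_logistic(x: Sequence[float], label: int,
                         ws: Sequence[Sequence[float]]
                         ) -> Tuple[float, List[List[float]]]:
    """Loss and gradients for label in {0, ..., K-1}; ws holds K-1 vectors."""
    margins = [dot(w, x) for w in ws]
    # Log-sum-exp over all classes; the pivot class contributes margin 0.
    max_m = max(0.0, max(margins))
    denom = math.exp(-max_m) + sum(math.exp(m - max_m) for m in margins)
    loss = max_m + math.log(denom) - (margins[label - 1] if label > 0 else 0.0)
    grads = []
    for k, m in enumerate(margins, start=1):
        p_k = math.exp(m - max_m) / denom
        indicator = 1.0 if label == k else 0.0
        grads.append([(p_k - indicator) * xi for xi in x])
    return loss, grads


def logistic_gradient(x: Sequence[float], label: int,
                      ws: Sequence[Sequence[float]],
                      num_classes: int = 2
                      ) -> Tuple[float, List[List[float]]]:
    """Single entry point that picks a path with an if-else."""
    if num_classes == 2:
        loss, grad = binary_logistic(x, label, ws[0])
        return loss, [grad]
    return multinomial_logistic(x, label, ws)
```

With two classes the multinomial path reduces to the binary one, so the branch is checked once per example and is cheap next to the dot products. That is consistent with the benchmark result reported above, though this sketch does not measure anything itself.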
>
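The other point in the thread, doing addBias and scaling inside the gradient instead of on the input dataset, can be illustrated the same way. This is only a sketch of the general idea and not the refactoring DB Tsai describes: the function names, the explicit intercept argument, and the precomputed per-feature inv_std factors are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    # Branch on the sign so exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log1p_exp(z: float) -> float:
    # Numerically stable log(1 + exp(z)).
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))


def materialize(x: Sequence[float], inv_std: Sequence[float]) -> List[float]:
    """Dataset-side approach: build a scaled copy with a trailing bias term.

    Doing this for every row yields a second dataset that has to be cached.
    """
    return [xj * sj for xj, sj in zip(x, inv_std)] + [1.0]


def binary_logistic_on_the_fly(x: Sequence[float], label: int,
                               w: Sequence[float], intercept: float,
                               inv_std: Sequence[float]
                               ) -> Tuple[float, List[float], float]:
    """Gradient-side approach: scale and add the bias while computing.

    x holds the raw features, and w and intercept live in the scaled space.
    Returns (loss, gradient w.r.t. w, gradient w.r.t. intercept).
    """
    margin = intercept + sum(wj * xj * sj for wj, xj, sj in zip(w, x, inv_std))
    loss = log1p_exp(margin) - label * margin
    p = sigmoid(margin)
    grad_w = [(p - label) * xj * sj for xj, sj in zip(x, inv_std)]
    grad_b = p - label
    return loss, grad_w, grad_b
```

Both approaches give the same loss and gradient. The difference is that the second one reads the raw rows directly, so no transformed copy of the data needs to be kept around.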
