Hey,
We're actually working on similar ideas in the AMPlab with spark  for example we've got some
image classification pipelines built on this idea  http://www.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf
Approximating kernel methods via random projections hit with nonlinearity.
Additionally, block coordinate descent seems to work pretty well for these scenarios where
you end up with lots of features. An advantage of this approach in spark is that you can avoid
materializing the whole data matrix if you're working on a subset of columns at a time.
We're hoping to have some code out for public consumption later in the summer.
 Evan
> On Jun 28, 2014, at 12:12 PM, Tom Vacek <minnesota.cs@gmail.com> wrote:
>
> What is your general solver? IPM or simplex or something else? I have
> seen a lot of attempts to apply iterative solvers for the subproblems on
> those without much luck because the conditioning of the linear systems gets
> worse and worse near the optimum. IPOPT (interior point method) has an
> LBFGS subproblem solver built in, so maybe it's worth a try to see if the
> approach would meet your needs. Methods more amenable to iterative
> subproblem solvers might be augmented Lagrangian, nonlinear rescaling, or
> Murty's (
> http://www.researchgate.net/publication/250180549_A_New_Practically_Efficient_Interior_Point_Method_for_Quadratic_Programming)
> methods.
>
> This blog post gives a decent introduction to the kernel approximation
> topic:
> http://peekaboovision.blogspot.com/2012/12/kernelapproximationsforefficient.html.
> Missing is mention of the research into how to choose the best set of
> prototype vectors. (I believe Kmeans on the data is practically best.) In
> the approximation domain, "Fastfood" by Smola, et al. is a neat idea.
> That's something I've thought about for MLLib, but I think the numeric
> support is lacking in Java land.
>
>
> On Sat, Jun 28, 2014 at 1:47 PM, Debasish Das <debasish.das83@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am coming up with an iterative solver for Equality and bound constrained
>> quadratic minimization...
>>
>> I have the cholesky versions running but cholesky does not scale for large
>> dimensions but works fine for matrix factorization usecases where ranks
>> are low..
>>
>> Minimize 0.5x'Px + q'x
>> s.t Aeq x = beq
>> lb <= x <= ub
>>
>> Based on your decomposition you will end up using linear CG in xupdate or
>> NLCG/BFGS with bounds...I am not sure which one is better unless I see both
>> of them running on datasets....
>>
>> I am hoping we can reuse the solver for SVM variants...
>>
>> Could you please point to some implementation references for Nystrom
>> approximations or kernel mappings ?
>>
>> Thanks.
>> Deb
>>
>>
>> On Sat, Jun 28, 2014 at 11:26 AM, Tom Vacek <minnesota.cs@gmail.com>
>> wrote:
>>
>>> What flavor of SVM are you trying to support? LSSVM doesn't need a bound
>>> constraint, but most other formulations do. There have been ideas for
>>> boundconstrained CG, though bounded LBFGS is more common. I think code
>>> for Nystrom approximations or kernel mappings would be more useful.
>>>
>>>
>>> On Fri, Jun 27, 2014 at 5:42 PM, Debasish Das <debasish.das83@gmail.com>
>>> wrote:
>>>
>>>> Thanks David...Let me try it...I am keen to see the results first and
>>> later
>>>> will look into runtime optimizations...
>>>>
>>>> Deb
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> On Fri, Jun 27, 2014 at 3:12 PM, David Hall <dlwh@cs.berkeley.edu>
>>>> wrote:
>>>>
>>>>> I have no ideas on benchmarks, but breeze has a CG solver:
>> https://github.com/scalanlp/breeze/tree/master/math/src/main/scala/breeze/optimize/linear/ConjugateGradient.scala
>> https://github.com/scalanlp/breeze/blob/e2adad3b885736baf890b306806a56abc77a3ed3/math/src/test/scala/breeze/optimize/linear/ConjugateGradientTest.scala
>>>>>
>>>>> It's based on the code from TRON, and so I think it's more targeted
>> for
>>>>> normconstrained solutions of the CG problem.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 27, 2014 at 5:54 PM, Debasish Das <
>>> debasish.das83@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am looking for an efficient linear CG to be put inside the
>>> Quadratic
>>>>>> Minimization algorithms we added for Spark mllib.
>>>>>>
>>>>>> With a good linear CG, we should be able to solve kernel SVMs with
>>> this
>>>>>> solver in mllib...
>>>>>>
>>>>>> I use direct solves right now using cholesky decomposition which
>> has
>>>>> higher
>>>>>> complexity as matrix sizes become large...
>>>>>>
>>>>>> I found out some jblas example code:
>> https://github.com/mikiobraun/jblasexamples/blob/master/src/CG.java
>>>>>>
>>>>>> I was wondering if mllib developers have any experience using this
>>>> solver
>>>>>> and if this is better than apache commons linear CG ?
>>>>>>
>>>>>> Thanks.
>>>>>> Deb
>>
