spark-user mailing list archives

From Hollin Wilkins <hol...@combust.ml>
Subject Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext
Date Fri, 03 Feb 2017 16:53:09 GMT
Hey Aseem,

We have built pipelines that execute several string indexers, one-hot
encoders, scaling, and a random forest or linear regression at the end.
Execution time for the linear regression was on the order of 11
microseconds, and a bit longer for the random forest. If your pipeline is
simple, this can be optimized further, to around 2-3 microseconds, by using
row-based transformations. The pipeline operated on roughly 12 input
features, and by the time all the processing was done, around 1000 features
were going into the linear regression after one-hot encoding and
everything else.
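
As a rough illustration of how a row-based pipeline of this shape expands a
few raw inputs into a wider feature vector before scoring, here is a minimal
sketch in plain Python. It is not MLeap's API: every function name, column
name, weight, and value below is hypothetical, and unlike Spark's default
one-hot encoder it keeps a slot for every category.

```python
from typing import Dict, List, Sequence, Tuple


def build_index(values: Sequence[str]) -> Dict[str, int]:
    """String indexer: most frequent label gets index 0 (ties broken by name)."""
    counts: Dict[str, int] = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    ordered = sorted(counts, key=lambda k: (-counts[k], k))
    return {label: i for i, label in enumerate(ordered)}


def one_hot(index: int, size: int) -> List[float]:
    """One-hot encoder: a vector of zeros with a single 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def transform_row(
    row: dict,
    indexers: Dict[str, Dict[str, int]],
    numeric_stats: Dict[str, Tuple[float, float]],
) -> List[float]:
    """Apply indexing, one-hot encoding, and standard scaling to one row."""
    features: List[float] = []
    for col, index in indexers.items():
        features.extend(one_hot(index[row[col]], len(index)))
    for col, (mean, std) in numeric_stats.items():
        features.append((row[col] - mean) / std)
    return features


def predict(features: List[float], weights: List[float], intercept: float) -> float:
    """Linear regression scoring: intercept plus dot product."""
    return intercept + sum(w * x for w, x in zip(weights, features))


# Hypothetical "training" data used only to fit the indexers.
indexers = {
    "color": build_index(["red", "blue", "red", "green"]),
    "city": build_index(["NYC", "SF", "NYC", "NYC"]),
}
numeric_stats = {"age": (40.0, 10.0)}  # hypothetical (mean, std) per column

# Score a single row: 3 raw inputs become 6 features.
row = {"color": "blue", "city": "SF", "age": 50.0}
features = transform_row(row, indexers, numeric_stats)
score = predict(features, [0.5, -1.0, 2.0, 0.25, 0.75, 3.0], intercept=1.0)
print(len(features), score)
```

With high-cardinality categorical columns, the same expansion is how a dozen
raw inputs can grow to around a thousand features.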

Hope this helps,
Hollin

On Fri, Feb 3, 2017 at 4:05 AM, Aseem Bansal <asmbansal2@gmail.com> wrote:

> Does this support Java 7?
>
> On Fri, Feb 3, 2017 at 5:30 PM, Aseem Bansal <asmbansal2@gmail.com> wrote:
>
>> Is the computation time for predictions on the order of a few
>> milliseconds (< 10 ms), like the old mllib library?
>>
>> On Thu, Feb 2, 2017 at 10:12 PM, Hollin Wilkins <hollin@combust.ml>
>> wrote:
>>
>>> Hey everyone,
>>>
>>>
>>> Some of you may have seen Mikhail and me talk at Spark/Hadoop Summits
>>> about MLeap and how you can use it to build production services from your
>>> Spark-trained ML pipelines. MLeap is an open-source technology that allows
>>> Data Scientists and Engineers to deploy Spark-trained ML Pipelines and
>>> Models to a scoring engine instantly. The MLeap execution engine has no
>>> dependencies on a Spark context and the serialization format is entirely
>>> based on Protobuf 3 and JSON.
>>>
>>>
>>> The recent 0.5.0 release provides serialization and inference support
>>> for close to 100% of Spark transformers (we don’t yet support ALS and LDA).
>>>
>>>
>>> MLeap is open source; take a look at our GitHub page:
>>>
>>> https://github.com/combust/mleap
>>>
>>>
>>> Or join the conversation on Gitter:
>>>
>>> https://gitter.im/combust/mleap
>>>
>>>
>>> We have a set of documentation to help get you started here:
>>>
>>> http://mleap-docs.combust.ml/
>>>
>>>
>>> We even have a set of demos, for training ML Pipelines and linear,
>>> logistic and random forest models:
>>>
>>> https://github.com/combust/mleap-demo
>>>
>>>
>>> Check out our latest MLeap-serving Docker image, which allows you to
>>> expose a REST interface to your Spark ML pipeline models:
>>>
>>> http://mleap-docs.combust.ml/mleap-serving/
>>>
>>>
>>> Several companies are using MLeap in production and even more are
>>> currently evaluating it. Take a look and tell us what you think! We hope to
>>> talk with you soon and welcome feedback/suggestions!
>>>
>>>
>>> Sincerely,
>>>
>>> Hollin and Mikhail
>>>
>>
>>
>
