flink-issues mailing list archives

From sachingoel0101 <...@git.apache.org>
Subject [GitHub] flink pull request: Decision tree [Flink-1727]
Date Thu, 21 May 2015 13:51:24 GMT
GitHub user sachingoel0101 opened a pull request:


    Decision tree [Flink-1727]

    This implements part of the decision tree algorithm. So far, only continuous-valued
    fields are supported, and splitting is based on the Gini index only. Entropy-based
    splitting is to be added later.
    Also adds an Online Histogram based on Ben-Haim and Yom-Tov's paper [http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf]
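
For readers unfamiliar with the cited paper, its streaming histogram can be sketched roughly as below. This is a minimal Python illustration of the update and merge procedures described by Ben-Haim and Yom-Tov, not the code in this pull request; the class name `OnlineHistogram` and its methods are hypothetical.

```python
import bisect


class OnlineHistogram:
    """Fixed-size streaming histogram: a sorted list of [centroid, count] bins.

    Hypothetical illustration; names are not taken from the pull request.
    """

    def __init__(self, max_bins):
        if max_bins < 1:
            raise ValueError("max_bins must be at least 1")
        self.max_bins = max_bins
        self.bins = []  # kept sorted by centroid

    def add(self, value, count=1):
        """Insert one observation, then shrink back to max_bins if needed."""
        centroids = [c for c, _ in self.bins]
        i = bisect.bisect_left(centroids, value)
        if i < len(self.bins) and self.bins[i][0] == value:
            self.bins[i][1] += count  # exact centroid match: bump the count
        else:
            self.bins.insert(i, [value, count])
        self._shrink()

    def merge(self, other):
        """Return a new histogram summarizing the bins of both inputs."""
        merged = OnlineHistogram(self.max_bins)
        merged.bins = sorted([c, n] for c, n in self.bins + other.bins)
        merged._shrink()
        return merged

    def _shrink(self):
        """Repeatedly merge the two closest adjacent bins until max_bins remain."""
        while len(self.bins) > self.max_bins:
            # index of the left bin in the closest adjacent pair
            i = min(range(len(self.bins) - 1),
                    key=lambda k: self.bins[k + 1][0] - self.bins[k][0])
            (c1, n1), (c2, n2) = self.bins[i], self.bins[i + 1]
            total = n1 + n2
            # the new centroid is the count-weighted mean of the two bins
            self.bins[i:i + 2] = [[(c1 * n1 + c2 * n2) / total, total]]
```

Each worker can build such a histogram over its own partition with `add`, and the partial results can then be combined with `merge`, which is what makes the structure attractive for distributed tree building.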
    Tested on the Iris data set. [https://archive.ics.uci.edu/ml/machine-learning-databases/iris/]
    It achieves an accuracy of 96.7% based on an 80:20 split of the training data. [Included
    in the testing suite as DecisionTreeSuite]
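
The Gini-based splitting on a continuous field mentioned in the description can be illustrated with a small sketch. It assumes an exhaustive scan over midpoints between observed values, which is not necessarily how the pull request selects candidate thresholds; the function names `gini` and `best_split` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, gini(labels)) when no split lowers the impurity.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_threshold, best_score = None, gini(labels)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal values cannot be separated by a threshold
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # impurity of the two children, weighted by their sizes
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = (lo + hi) / 2, score
    return best_threshold, best_score
```

A tree builder would call `best_split` once per field at each node, keep the field with the lowest weighted impurity, and recurse on the two resulting subsets.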

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachingoel0101/flink decision_tree

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #708

commit b855f177fe4a017d167c43de1b1606cd891a58ef
Author: Sachin Goel <sachingoel0101@gmail.com>
Date:   2015-05-16T13:41:42Z

    Histogram implementation done with tests

commit c64ea0452be867e5fcb0bcb3f8401dbc74ce8fa6
Author: Sachin Goel <sachingoel0101@gmail.com>
Date:   2015-05-21T13:45:24Z

    Decision tree implemented. For continuous data. Only Gini. Tested on Iris


