flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] azagrebin commented on a change in pull request #6425: [FLINK-9664][Doc] fixing documentation in ML quick start
Date Thu, 09 Aug 2018 07:58:38 GMT
azagrebin commented on a change in pull request #6425: [FLINK-9664][Doc] fixing documentation
in ML quick start
URL: https://github.com/apache/flink/pull/6425#discussion_r208837648

 File path: docs/dev/libs/ml/quickstart.md
 @@ -129,6 +129,10 @@ and the [test set here](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/b
 This is an astroparticle binary classification dataset, used by Hsu et al. [[3]](#hsu) in
 practical Support Vector Machine (SVM) guide. It contains 4 numerical features, and the class
+Before importing the training and test dataset, Flink SVM only supports threshold binary values
+`+1.0` and `-1.0`. Thus a conversion is needed upon downloading the svmguide1 dataset since it is
+labelled using `1`s and `0`s.
 Review comment:
   By sections I mean `LibSVM files` and `Classification` parts of `quickstart.md`.
   I think your explanation of why the conversion is needed was good and detailed enough. I only
suggested rephrasing its opening so that it can be moved to the beginning of the `Classification` section.
The conversion example can follow your explanation. The overall structure I suggest:
   *LibSVM files*
   ...Text as before...
   ((( leave only the LibSVM import specifics in this example: )))
   val astroTrainLibSVM: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "/path/to/svmguide1")
   val astroTestLibSVM: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "/path/to/svmguide1.t")
   ...Text as before..
   ...Explanation of conversion need before classification...:
   // conversion code example, e.g. the one I suggested
   ...section continues as it was with classification description and its example..
   The idea is that, in the end, the user can copy/paste the code snippets in order, starting from the
import code, then conversion/normalisation, then classification, etc., and they all work together.
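
   For illustration, the label conversion step in the structure above might look roughly like the
sketch below. It is not the code from the pull request: `LabelConversion` and `toSvmLabel` are
hypothetical names, and the rule "positive label becomes `+1.0`, anything else becomes `-1.0`" is an
assumption based on the quoted diff, which says the dataset is labelled with `1`s and `0`s.

```scala
// Hypothetical helper, not part of Flink ML: maps the svmguide1 class labels
// (1 and 0) to the +1.0 / -1.0 labels that, per the quoted diff, Flink SVM expects.
object LabelConversion {
  // Assumption: any strictly positive label becomes +1.0; everything else becomes -1.0.
  def toSvmLabel(label: Double): Double =
    if (label > 0.0) 1.0 else -1.0

  def main(args: Array[String]): Unit = {
    val rawLabels = Seq(1.0, 0.0, 0.0, 1.0)
    // Print the converted labels for a quick visual check.
    println(rawLabels.map(toSvmLabel))
  }
}
```

   With Flink ML on the classpath, the same mapping could presumably be applied to the imported data
with something like `astroTrainLibSVM.map(lv => LabeledVector(LabelConversion.toSvmLabel(lv.label), lv.vector))`,
assuming `LabeledVector` exposes `label` and `vector` fields. The converted data set would then be
passed to the classification example in place of the raw import.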

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
