flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From chiwanpark <...@git.apache.org>
Subject [GitHub] flink pull request: Flink 1745
Date Sun, 04 Oct 2015 14:16:29 GMT
Github user chiwanpark commented on the pull request:

    I just reviewed quickly. I have some comments for this pull request.
    1. This pull request cannot pass checking Scala code style. The Scala code should not
exceed 100 characters in each line. I also found some wrong indentations such as line 200-211
in `KNN.scala` file. Please reformat your code.
    2. The meaning of `BruteOrQuad` is ambiguous. From my understand, if `BruteOrQuad` is
true, we use quadtree implementation, right? So the other name such as `useQuadTree` would
be better.
    3. Please remove blank line after Scala doc comments.
    4. Your implementation assumes that the type of given `Vector` is `DenseVector`. But basically,
there is no guarantee of the type of `Vector`. We should generalize the implementation.
    5. We should be able to configure the use of quadtree implementation. Please add configure
parameter into `KNN` object.
    6. Currently, `predictDataSet` is too big and complex because of large code in `mapPartition`
operation. How about split the operation into `mapPartition` for raw and quadtree?
    7. Please avoid calculation of `BigDecimal`. Maybe we can check whether we use quadtree
or not without `BigDecimal` like following:
      * `(4 ^ dim) * Ntest * log(Ntrain) < Ntest * Ntrain` is same as `dim + log_4(log(Ntrain))
< log_4(Ntrain)`
    This is great start! If you have some question about my comments, please feel free to
    P.S.: How about create training quadtree before cross operation? Is there any intention
to create quadtree after cross operation?

If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.

View raw message