metron-dev mailing list archives

From cestella <...@git.apache.org>
Subject [GitHub] metron pull request #867: METRON-1350: Add reservoir sampling functions to S...
Date Wed, 13 Dec 2017 21:53:56 GMT
Github user cestella commented on a diff in the pull request:

    https://github.com/apache/metron/pull/867#discussion_r156796690
  
    --- Diff: metron-analytics/metron-statistics/README.md ---
    @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used.
       * bounds - A list of value bounds (excluding min and max) in sorted order.
     * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
     
    +### Sampling Functions
    +
    +#### `SAMPLE_ADD`
    +* Description: Add a value or collection of values to a sampler.
    +* Input:
    --- End diff --
    
    They're both needed.  Some use cases would be fine without bias and some would be better
    with bias.  As a follow-on, I was planning on adding a biased sampler, but this is a big enough
    PR without it.  It'd look something like:
    ```
    samples := SAMPLE_MERGE(PROFILE_GET('samples', ...))
    biased_sample := SAMPLE_GET_BIASED(samples, 0.015)
    ```
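
For readers unfamiliar with the unbiased case, here is a minimal sketch of uniform reservoir sampling (the classic "Algorithm R"), written in Python for illustration. It is not the code in this PR: the `UniformReservoir` class and its `add` method are hypothetical names, and the sketch does not attempt the biased variant proposed above, whose semantics are not specified here.

```python
import random


class UniformReservoir:
    """Hypothetical fixed-size uniform sampler (Algorithm R), for illustration only."""

    def __init__(self, size, seed=None):
        self.size = size          # maximum number of retained values
        self.seen = 0             # total number of values offered so far
        self.values = []          # the current sample
        self._rng = random.Random(seed)

    def add(self, value):
        """Offer one value; each value seen so far stays with probability size/seen."""
        self.seen += 1
        if len(self.values) < self.size:
            # The reservoir is not full yet, so keep everything.
            self.values.append(value)
            return
        # Pick a slot uniformly in [0, seen); replace only if it lands in the reservoir.
        j = self._rng.randrange(self.seen)
        if j < self.size:
            self.values[j] = value


sampler = UniformReservoir(size=10, seed=42)
for x in range(1000):
    sampler.add(x)
```

After the loop, `sampler.values` holds 10 of the 1000 inputs, each input having had the same chance of being retained. A biased sampler would change the replacement rule so that some values (for example, more recent ones) are favored.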

