mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Misfires in OnlineSummarizer
Date Sun, 17 Apr 2011 16:13:10 GMT
If I read Lance's code correctly, he does indeed add them in consecutive order.

On Sun, Apr 17, 2011 at 2:25 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> Yeah...
>
> What Sean says.  The inaccuracy surprises me a bit, but it is outside the
> intended usage.
>
> Did you give the values in random order or in consecutive order?  If they
> are consecutive, then I am not worried at all.  If you got this error from
> random ordering, I am a bit more unhappy.
>
> On Sun, Apr 17, 2011 at 2:21 AM, Sean Owen <srowen@gmail.com> wrote:
>
>> The implementation is intentionally an approximation which uses
>> constant memory, instead of tracking the entire data set, which is
>> necessary to get an exact answer. You should find it converges to the
>> expected values with more data.
>>
>> On Sun, Apr 17, 2011 at 7:53 AM, Lance Norskog <goksron@gmail.com> wrote:
>> > If you add the Java methods at the bottom to the
>> > org.apache.mahout.stats.OnlineSummarizer and run the main(), a funny
>> > thing prints out:
>> >
>> > [(count=200.0),(sd=28.8660),(mean=49.5000),(min=0.0),(25%=34.1312),(median=60.2104),(75%=83.8722),(max=99.0),]
>> >
>> > I added the numbers 0-99 twice to the summarizer. I would have
>> > expected 25%=25 +/- 1, median=50 +/- 1, and 75%=75 +/- 1.
>> > Note that the mean is correct.
>> >
>> > ---------------------------------------------------------------------------
>> >
>> >  @Override
>> >  public String toString() {
>> >    return "["
>> >        + pair("count", getCount()) + pair("sd", getSD())
>> >        + pair("mean", getMean()) + pair("min", getMin())
>> >        + pair("25%", getQuartile(1)) + pair("median", getMedian())
>> >        + pair("75%", getQuartile(3)) + pair("max", getMax()) + "]";
>> >  }
>> >
>> >  private String pair(String tag, double value) {
>> >    String s = Double.toString(value);
>> >    if (s.length() > 8)
>> >      s = s.substring(0, 7);
>> >    return "(" + tag + "=" + s + "),";
>> >  }
>> >
>> >  public static void main(String[] args) {
>> >    OnlineSummarizer osQ = new OnlineSummarizer();
>> >    for(int i = 0; i < 200; i++) {
>> >      osQ.add(i % 100);
>> >    }
>> >    System.out.println(osQ.toString());
>> >  }
>> >
>> > --
>> > Lance Norskog
>> > goksron@gmail.com
>> >
>>
>
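
For comparison with the approximate output quoted above, the exact quartiles of the same input can be checked by sorting the whole data set, which is the non-constant-memory approach Sean describes. This is only a rough sketch: the class ExactQuartiles and its methods are hypothetical and not part of Mahout, and linear interpolation between adjacent order statistics is an assumed convention (other quantile definitions give slightly different values).

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: exact quantiles by sorting all values.
final class ExactQuartiles {

  // Same input as the main() in the quoted message: the values 0..99, added twice.
  static double[] sample() {
    double[] data = new double[200];
    for (int i = 0; i < data.length; i++) {
      data[i] = i % 100;
    }
    return data;
  }

  // Quantile for q in [0, 1], interpolating linearly between adjacent order statistics.
  // The interpolation rule is an assumption; other conventions differ slightly.
  static double quantile(double[] data, double q) {
    double[] sorted = data.clone();
    Arrays.sort(sorted);
    double pos = q * (sorted.length - 1);
    int lo = (int) Math.floor(pos);
    int hi = Math.min(lo + 1, sorted.length - 1);
    return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
  }

  public static void main(String[] args) {
    double[] data = sample();
    System.out.println("25%=" + quantile(data, 0.25));
    System.out.println("median=" + quantile(data, 0.5));
    System.out.println("75%=" + quantile(data, 0.75));
  }
}
```

Under this convention the exact values are 24.75, 49.5, and 74.25, all inside the +/- 1 ranges Lance expected, so the gap in the quoted output comes from the streaming approximation rather than from the input data.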
