lucene-dev mailing list archives

From "Grant Ingersoll (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene
Date Thu, 04 Jan 2007 02:40:27 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462117 ]

Grant Ingersoll commented on LUCENE-675:
----------------------------------------

Doron,

When I apply your patch, I get strange errors.  It appears to apply cleanly, but
then each of the new files (for instance, byTask.stats.Report.java) contains its entire
contents twice, which causes duplicate class errors.  This happens for all the files in
the byTask package.  The changes to the other files apply cleanly.
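
The symptom described above (a file whose whole contents occur twice) can be checked mechanically. The following is only a minimal sketch in Python, not part of the patch or of Lucene; the names is_doubled and find_doubled_files are hypothetical, and it assumes the duplication is an exact back-to-back repeat of the file contents.

```python
from __future__ import annotations

from pathlib import Path


def is_doubled(text: str) -> bool:
    """Return True if text is non-empty and is one block of content repeated twice."""
    half, remainder = divmod(len(text), 2)
    # An exact repeat must have even length and identical halves.
    return bool(text) and remainder == 0 and text[:half] == text[half:]


def find_doubled_files(root: str, pattern: str = "*.java") -> list[Path]:
    """List files under root (hypothetical helper) whose contents are exactly doubled."""
    return sorted(
        path
        for path in Path(root).rglob(pattern)
        if path.is_file()
        and is_doubled(path.read_text(encoding="utf-8", errors="replace"))
    )
```

Running find_doubled_files on the patched source tree would list the affected files, which may help confirm whether only the newly added files are involved.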

I applied the patch with patch -p0 -i <patch file>, as I always do, to a clean version.

I suspect that your last comment may be at the root of the issue. Can you try applying this
again to a clean version and see if you still have issues or whether it is something I am
missing?  Can you regenerate this patch, perhaps using a command line tool?  Looking at the
patch file, I am not sure what the issue is.  

Otherwise, based on the documentation, this sounds really interesting and useful.  Based on
some of your other patches, I assume you are using this to do benchmarking, no?

Thanks,
Grant

> Lucene benchmark: objective performance test for Lucene
> -------------------------------------------------------
>
>                 Key: LUCENE-675
>                 URL: https://issues.apache.org/jira/browse/LUCENE-675
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Andrzej Bialecki 
>         Assigned To: Grant Ingersoll
>            Priority: Minor
>         Attachments: benchmark.byTask.patch, benchmark.patch, BenchmarkingIndexer.pm,
>                      extract_reuters.plx, LuceneBenchmark.java, LuceneIndexer.java,
>                      taskBenchmark.zip, timedata.zip, tiny.alg, tiny.properties
>
>
> We need an objective way to measure the performance of Lucene, both indexing and querying,
> on a known corpus. This issue is intended to collect comments and patches implementing a suite
> of such benchmarking tests.
> Regarding the corpus: one of the widely used and freely available corpora is the original
> Reuters collection, available from http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.tar.gz
> or http://people.csail.mit.edu/u/j/jrennie/public_html/20Newsgroups/20news-18828.tar.gz. I
> propose to use this corpus as a base for benchmarks. The benchmarking suite could automatically
> retrieve it from known locations, and cache it locally.
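
The retrieve-and-cache idea in the issue description could look roughly like the following. This is a hedged sketch in Python, not the benchmark suite's actual code; fetch_corpus, cache_dir, and the opener parameter are hypothetical names introduced for illustration. It downloads only when the archive is not already in the local cache, and writes to a temporary name first so an interrupted download is not mistaken for a complete file.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_corpus(url, cache_dir, opener=urllib.request.urlopen):
    """Return the local path of the corpus archive, downloading it only if not cached.

    The opener parameter is an assumption added so that the network call
    can be replaced (for example, in tests).
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    target = cache / url.rsplit("/", 1)[-1]
    if target.exists():
        return target  # cache hit: no network access
    partial = target.with_name(target.name + ".part")
    with opener(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)  # rename only after the download completed
    return target
```

A caller would pass one of the corpus URLs listed above together with a local directory; a second call with the same arguments returns the cached file without downloading again.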

