lucene-java-user mailing list archives

From "Zeynep P." <>
Subject Wikipedia revision history dump + lucene benchmark
Date Tue, 10 Apr 2012 17:33:51 GMT
wikipedia.alg in the benchmark module can only extract and index
current-pages dumps; it does not take revisions into account. Do you know of
any way to do this? Or should I change EnwikiContentSource to handle the
revisions?
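
To make the question concrete, here is a rough sketch of revision-aware parsing. It assumes the standard MediaWiki export layout, where each <page> holds one <title> and, in a history dump, several <revision> elements, each with its own <timestamp> and <text>. The class RevisionDumpSketch and its Revision holder are hypothetical names used only for illustration; they are not part of the Lucene benchmark module, and wiring this into EnwikiContentSource is not shown.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: emits one record per <revision>, not one per <page>. */
public class RevisionDumpSketch {

    /** One revision of one page (illustrative holder, not a Lucene class). */
    public static final class Revision {
        public final String title;
        public final String timestamp;
        public final String text;

        Revision(String title, String timestamp, String text) {
            this.title = title;
            this.timestamp = timestamp;
            this.text = text;
        }
    }

    /** Streams through the dump and collects every revision of every page. */
    public static List<Revision> parse(Reader in) {
        List<Revision> out = new ArrayList<>();
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
            String title = null;
            String timestamp = null;
            String text = null;
            boolean inRevision = false;
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if (name.equals("revision")) {
                        // A new revision starts: reset the per-revision fields.
                        inRevision = true;
                        timestamp = null;
                        text = null;
                    } else if (name.equals("title")) {
                        // The page title is shared by all revisions that follow.
                        title = xml.getElementText();
                    } else if (inRevision && name.equals("timestamp")) {
                        timestamp = xml.getElementText();
                    } else if (inRevision && name.equals("text")) {
                        text = xml.getElementText();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && xml.getLocalName().equals("revision")) {
                    // Each closed <revision> becomes its own indexable record.
                    out.add(new Revision(title, timestamp, text));
                    inRevision = false;
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed dump XML", e);
        }
        return out;
    }
}
```

In this sketch each revision would become its own document, with the timestamp available as a field for temporal queries. This version collects all revisions in memory for simplicity; a real content source would need to hand them out one at a time.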

Although Wikipedia dumps are widely used, especially for research purposes,
as far as I know there are no topics/qrels for them (except the one here for
the 2001 - 2005 revision history dump, which is annotated based on temporal
expressions). Do you know of any others?

By the way, I think in wikipedia.alg
should be replaced by *EnwikiQueryMaker*.

Thanks in advance,
Best regards
