lucene-solr-user mailing list archives

From "aris buinevicius" <>
Subject tagging application, best way to architect?
Date Thu, 10 Jul 2008 02:23:26 GMT
We're trying to implement a large-scale, domain-specific web email
application, and so far Solr's search performance is working really well
for us.

There are two limitations that I can't seem to get around, however, and I
was hoping for some advice.

1. We would like to do bulk tagging on large query result sets (i.e., if
you have 1M emails, do a search, and then wish to apply a tag to the result
set of, say, 250k results).   I've tried many approaches, but the closest
support I could find was the update-field functionality in SOLR-139.   Is
there any other way to keep the very dynamic metadata (tags and other
fields) separate from the static documents themselves?   I've researched
joining against a metadata database, but unfortunately the join logic for
large result sets is just too bulky to perform well at scale.   We have
even looked at Postgres tsearch2, but that also breaks down with a large
number of emails.

2. We're assuming we'll have thousands of users with independent data; is
there any good way to partition multiple indexes with Solr?   With Lucene we
could just save those in independent directories, and cache each index while
the user's session is active.   I saw some configurations on Tomcat that
would allow multiple instances, but that's probably not practical for lots
of concurrent users.
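
As a rough illustration of the separation asked about in point 1, here is a
minimal Python sketch of a tag store kept outside the search index, with one
store per user as in point 2.   Every name in it (TagStore, tag_all,
untag_all, filter, stores_by_user) is hypothetical; none of it is a Solr or
Lucene API, and the document ids are stand-ins for whatever a real search
would return.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical side store: tag -> set of document ids, kept outside the index."""

    def __init__(self):
        self._docs_by_tag = defaultdict(set)

    def tag_all(self, tag, doc_ids):
        # Bulk tagging is a single set union; no indexed document is rewritten.
        self._docs_by_tag[tag].update(doc_ids)

    def untag_all(self, tag, doc_ids):
        # Removing a tag from a result set is a single set difference.
        self._docs_by_tag[tag].difference_update(doc_ids)

    def filter(self, tag, doc_ids):
        # Keep only the search hits that carry the tag, preserving hit order.
        tagged = self._docs_by_tag.get(tag, set())
        return [d for d in doc_ids if d in tagged]


# One store per user keeps each user's tags independent (assumption for point 2).
stores_by_user = defaultdict(TagStore)

# Stand-in for a 250k-document result set; real ids would come from the search.
hits = range(250_000)
stores_by_user["alice"].tag_all("to-review", hits)

# Filter a later (made-up) list of hits down to the tagged ones.
print(stores_by_user["alice"].filter("to-review", [10, 249_999, 250_000]))
```

Applying the tag never touches the indexed documents, but filtering by tag
still has to combine the store with the search hits, which is the join step
described above as too bulky for large result sets; the sketch does not solve
that part.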

Thanks for any tips; we would love to use Solr (or Lucene), but haven't yet
been able to get around issue 1 with acceptable response times for large
numbers of emails.   We've really run the gamut here, including Solr, Lucene,
Postgres (tsearch2), Sphinx, Xapian, CouchDB(!), and more.

