lucene-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Moenieb Davids <>
Subject Sole for Content Management
Date Thu, 07 Jun 2018 18:06:55 GMT
Hi All,

I am currently testing a deployment of a content management framework where
I am trying to punt Solr as the tool of choice for ingestion and searching.

Current status:
I have deployed SolrCloud across multiple servers with multiple shards and
a replication factor of 2.
In terms of collections, I have a person collection that contains details
individuals including address and high level portfolio info. Structurally,
this collection contains great grandchildren.
Then I have a few collections that deals with content. For now, content is
just emails and document with a max size of 2MB, with certain user
exceptions that can go higher than 2MB.
Content is indexed twice in terms of the actual content, firstly as
binary/stream and then as readable text. Metadata is negligible

When performing full text searches without concurrently executing updates,
solr seems to be doing well. Running updates also does okish given the
nature of the transaction. However, when I run search and updates
simultaneously, performance drops quite significantly. I have played with
field properties, analyzers, tokenizers, shafting sizes etc.
Any advice?
Would like to know if anyone has done something similar. Please excuse the
long winded message

Sent from Gmail Mobile

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message