lucene-general mailing list archives

From Moenieb Davids <moenieb.dav...@gmail.com>
Subject Fwd: Solr for Content Management
Date Thu, 07 Jun 2018 18:09:25 GMT
---------- Forwarded message ----------
From: *Moenieb Davids* <moenieb.davids@gmail.com>
Date: Thursday, June 7, 2018
Subject: Solr for Content Management
To: "general@lucene.apache.org" <general@lucene.apache.org>, "user@lucene.apache.org" <user@lucene.apache.org>


Hi All,

Background:
I am currently testing a deployment of a content management framework in which
I am trying to promote Solr as the tool of choice for ingestion and searching.

Current status:
I have deployed SolrCloud across multiple servers with multiple shards and
a replication factor of 2.
In terms of collections, I have a person collection that contains details of
individuals, including address and high-level portfolio info. Structurally,
this collection contains nested documents down to the great-grandchild level.
Then I have a few collections that deal with content. For now, content is
just emails and documents with a max size of 2MB, with certain user
exceptions that can go higher than 2MB.
Content is indexed twice in terms of the actual content, firstly as
binary/stream and then as readable text. Metadata is negligible.
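
As a rough sketch, a collection with this kind of layout could be created
through the Solr Collections API along the lines below. The host, the
collection name "person" as used here, and the shard count of 4 are
placeholders for illustration; only the replication factor of 2 comes from
the setup described above.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor=2):
    """Build a Solr Collections API CREATE request URL.

    base_url, name and num_shards are illustrative placeholders; the
    default replication_factor of 2 mirrors the deployment described above.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and shard count, for illustration only.
    print(create_collection_url("http://localhost:8983/solr", "person", 4))
```

The function only builds the URL; sending it (for example with curl or any
HTTP client) is left out so the sketch does not depend on a running cluster.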


Challenges:
When performing full-text searches without concurrently executing updates,
Solr seems to be doing well. Running updates on their own also does okay, given
the nature of the transaction. However, when I run searches and updates
simultaneously, performance drops quite significantly. I have played with
field properties, analyzers, tokenizers, shard sizes, etc.
Any advice?
I would like to know if anyone has done something similar. Please excuse the
long-winded message.


-- 
Sent from Gmail Mobile


