phoenix-dev mailing list archives

From Ankit Singhal <ankitsingha...@gmail.com>
Subject Re: Excessive ExecService RPCs with multi-threaded ingest
Date Wed, 23 Nov 2016 12:53:17 GMT
How about not sending the IndexMaintainers from the client at all, and
instead preparing them on the server itself, caching/refreshing them per
table the way we currently do for PTable?
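
A rough sketch of the shape this could take on the region server (the
names here are hypothetical, not existing Phoenix classes): key the
derived maintainers by table and rebuild them whenever a newer PTable
timestamp shows up, so the client never ships them over the wire.

    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical server-side cache of serialized IndexMaintainers,
    // derived from table metadata and invalidated when a newer PTable
    // timestamp is observed.
    public class IndexMaintainerCache {

        private static final class Entry {
            final long tableTimestamp;  // metadata version built from
            final byte[] serializedMaintainers;
            Entry(long ts, byte[] bytes) {
                this.tableTimestamp = ts;
                this.serializedMaintainers = bytes;
            }
        }

        private final ConcurrentHashMap<String, Entry> cache =
                new ConcurrentHashMap<>();

        /** Return cached maintainers, rebuilding if metadata is newer. */
        public byte[] getOrBuild(String tableName, long tableTimestamp) {
            return cache.compute(tableName, (name, old) ->
                (old != null && old.tableTimestamp >= tableTimestamp)
                    ? old
                    : new Entry(tableTimestamp, buildFromMetadata(name))
            ).serializedMaintainers;
        }

        // Stand-in for deriving maintainers from the server-side PTable.
        private byte[] buildFromMetadata(String tableName) {
            return new byte[0];
        }
    }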

On Mon, Oct 24, 2016 at 9:32 AM, Josh Elser <josh.elser@gmail.com> wrote:

> If anyone is interested, I did hack on this some more over the weekend.
>
> https://github.com/joshelser/phoenix/tree/reduced-server-cache-rpc
>
> Very much in a state of "well, it compiles". Will try to find some more
> time to poke at it and measure whether or not it actually makes a
> positive impact (with serialized IndexMaintainers being only about 20
> bytes for one index table, the server-side memory impact certainly
> isn't that crazy, but the extra RPCs likely add up).
>
> Feedback welcome from the brave :)
>
>
> Josh Elser wrote:
>
>> Hi folks,
>>
>> I was doing some testing earlier this week and Enis's keen eye caught
>> something rather interesting.
>>
>> When using YCSB to ingest data into a table with a secondary index,
>> with 8 threads and a batch size of 1000 rows, the number of
>> ExecService coprocessor calls actually exceeded the number of Multi
>> calls writing the data (roughly 21k ExecService calls to 18k Multi
>> calls).
>>
>> I dug into this some more and noticed that it's because each thread
>> creates its own ServerCache to store the serialized IndexMetadata
>> before shipping the data table updates. So, when we have 8 threads
>> all writing mutations for the same data and index table, we create
>> ~8x as many ServerCache entries as we would with a single thread.
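>>
>> A tiny stand-alone illustration of the effect (the method below is a
>> stand-in, not the actual Phoenix code path): eight threads each
>> register their own copy of an identical payload, so eight ExecService
>> round trips happen where one would do.
>>
>>     import java.util.UUID;
>>     import java.util.concurrent.atomic.AtomicInteger;
>>
>>     public class PerThreadServerCacheDemo {
>>         static final AtomicInteger execServiceCalls =
>>                 new AtomicInteger();
>>
>>         // Stand-in for the addServerCache ExecService round trip.
>>         static String addServerCache(byte[] indexMetadata) {
>>             execServiceCalls.incrementAndGet();
>>             return UUID.randomUUID().toString(); // per-thread cache id
>>         }
>>
>>         public static void main(String[] args)
>>                 throws InterruptedException {
>>             byte[] metadata = {1, 2, 3}; // identical for all threads
>>             Thread[] writers = new Thread[8];
>>             for (int i = 0; i < writers.length; i++) {
>>                 writers[i] = new Thread(() -> addServerCache(metadata));
>>                 writers[i].start();
>>             }
>>             for (Thread t : writers) {
>>                 t.join();
>>             }
>>             System.out.println(execServiceCalls.get()); // prints 8
>>         }
>>     }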
>>
>> Looking at the code, I completely understand why they're local to the
>> thread and not shared on the Connection (very tricky), but I'm curious
>> if anyone had noticed this before or if there are reasons to not try to
>> share these ServerCache(s) across threads. Looking at the data being put
>> into the ServerCache, it appears to be exactly the same for each of the
>> threads sending mutations. I'm thinking that we could do this safely
>> by tracking when we are loading (or have loaded) the data into the
>> ServerCache and doing some reference counting to determine when it's
>> actually safe to delete the ServerCache.
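>>
>> As a rough sketch of the bookkeeping (hypothetical names, not an
>> existing Phoenix API): the first thread to acquire an entry pays the
>> RPC, later threads just bump the count, and the last release removes
>> the server-side cache.
>>
>>     import java.util.concurrent.ConcurrentHashMap;
>>     import java.util.concurrent.atomic.AtomicInteger;
>>
>>     public class RefCountedServerCache {
>>
>>         private static final class Entry {
>>             final String cacheId; // id returned by the add-cache RPC
>>             final AtomicInteger refCount = new AtomicInteger(1);
>>             Entry(String cacheId) {
>>                 this.cacheId = cacheId;
>>             }
>>         }
>>
>>         // Keyed by table, since the payload is identical per table.
>>         private final ConcurrentHashMap<String, Entry> entries =
>>                 new ConcurrentHashMap<>();
>>
>>         /** First caller sends the RPC; later callers share the entry. */
>>         public String acquire(String table, byte[] payload) {
>>             // compute() is atomic per key, so concurrent acquirers of
>>             // the same table serialize here instead of duplicating RPCs.
>>             return entries.compute(table, (name, old) -> {
>>                 if (old != null) {
>>                     old.refCount.incrementAndGet();
>>                     return old;
>>                 }
>>                 return new Entry(addServerCacheRpc(payload));
>>             }).cacheId;
>>         }
>>
>>         /** Last releaser deletes the server-side cache entry. */
>>         public void release(String table) {
>>             entries.computeIfPresent(table, (name, e) -> {
>>                 if (e.refCount.decrementAndGet() == 0) {
>>                     removeServerCacheRpc(e.cacheId);
>>                     return null; // drop the mapping
>>                 }
>>                 return e;
>>             });
>>         }
>>
>>         // Stand-ins for the actual ExecService round trips.
>>         private String addServerCacheRpc(byte[] payload) {
>>             return "cache-id";
>>         }
>>
>>         private void removeServerCacheRpc(String cacheId) {
>>         }
>>     }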
>>
>> I hope to find/make some time to get a patch up, but thought I'd take
>> a moment to write it up in case anyone has opinions/feedback.
>>
>> Thanks!
>>
>> - Josh
>>
>
