nifi-users mailing list archives

From Otto Fowler <ottobackwa...@gmail.com>
Subject RE: Performance of adding many keys to redis with PutDistributedMapCache
Date Thu, 02 Apr 2020 11:50:59 GMT
Maybe something that used records and a record query on top of MSET would
be the most efficient.
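
A rough sketch of what that could look like, assuming Jedis as the Redis
client and a flowfile with one JSON record per line keyed on an "id" field
(the field name, batch size, and class below are illustrative only, not an
existing NiFi processor):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import redis.clients.jedis.Jedis;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

public class RecordMsetSketch {

    private static final int BATCH_SIZE = 1000; // records per MSET call

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        try (Jedis jedis = new Jedis("localhost", 6379);
             BufferedReader reader = new BufferedReader(new FileReader(args[0]))) {

            // Flat list of alternating key, value arguments for MSET.
            List<String> keysAndValues = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                JsonNode record = mapper.readTree(line);
                // Hypothetical key field; a record-path such as /id would play this role.
                keysAndValues.add(record.get("id").asText());
                // Serialize the whole record back to JSON as the cache value.
                keysAndValues.add(mapper.writeValueAsString(record));

                if (keysAndValues.size() >= BATCH_SIZE * 2) {
                    jedis.mset(keysAndValues.toArray(new String[0]));
                    keysAndValues.clear();
                }
            }
            if (!keysAndValues.isEmpty()) {
                jedis.mset(keysAndValues.toArray(new String[0]));
            }
        }
    }
}

One MSET per batch replaces a thousand individual round trips, which is
where most of the per-key overhead goes.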

On April 2, 2020 at 06:27:53, Hesselmann, Brian (brian.hesselmann@cgi.com)
wrote:

Hi Bryan and Mike,

Thanks for the responses. For now we have introduced an ExecuteStreamCommand
processor to use the redis-cli and its commands directly. It seems to improve
performance, but we will have to look into introducing a new processor or a
different DB if necessary.

Thanks,
Brian
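
For what it's worth, the redis-cli approach can be pushed further with Redis
mass insertion, where the input piped to redis-cli --pipe is raw Redis
protocol instead of one command per invocation. A minimal sketch of
generating that input, with hypothetical key/value pairs standing in for the
real flowfile content:

import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class RespMassInsertSketch {

    // Encode one Redis command in RESP, the wire format accepted by redis-cli --pipe.
    static String encode(String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append('*').append(args.length).append("\r\n");
        for (String arg : args) {
            byte[] bytes = arg.getBytes(StandardCharsets.UTF_8);
            sb.append('$').append(bytes.length).append("\r\n").append(arg).append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical key/value pairs; in the flow these would come from the flowfile.
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("key1", "{\"id\":\"key1\"}");
        entries.put("key2", "{\"id\":\"key2\"}");

        // Emit one SET per entry on standard out.
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : entries.entrySet()) {
            out.append(encode("SET", e.getKey(), e.getValue()));
        }
        System.out.print(out);
    }
}

Piping the output of something like this into redis-cli --pipe loads all of
the keys without waiting for a reply per command.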

------------------------------
*From:* Mike Thomsen [mikerthomsen@gmail.com]
*Sent:* Wednesday, April 1, 2020 0:08
*To:* users@nifi.apache.org
*Subject:* Re: Performance of adding many keys to redis with
PutDistributedMapCache

Might be worth experimenting with KeyDB to see if that helps. It's a
multi-threaded fork of Redis that's supposedly about as fast on a single
node as a same-size Redis cluster, when you compare cluster nodes to KeyDB
thread pool size.

https://keydb.dev/

On Tue, Mar 31, 2020 at 4:49 PM Bryan Bende <bbende@gmail.com> wrote:

> Hi Brian,
>
> I'm not sure what can really be done with the existing processor besides
> what you have already done. Have you configured your overall Timer Driven
> thread pool appropriately?
>
> Most likely there would need to be a new PutRedis processor that didn't
> have to adhere to the DistributedMapCacheClient interface and could use
> MSET or whatever specific Redis functionality was needed.
>
> Another option might be a record-based variation of PutDistributedMapCache
> where you could keep thousands of records together and stream them to the
> cache. It would take a record-path to specify the key for each record and
> serialize the record as the value (assuming your data fits into one of the
> record formats like JSON, Avro, CSV).
>
> -Bryan
>
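If a PutRedis processor like that ever gets written, pipelining is the other
Redis feature (besides MSET) that avoids a network round trip per key, and it
leaves room for per-key options later. A minimal Jedis sketch of that write
path, purely illustrative since the processor is only being proposed here:

import java.util.Map;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelinedPutSketch {

    // Queue all SETs locally and flush them in one go,
    // instead of waiting for a reply per key.
    static void putAll(Jedis jedis, Map<String, String> batch) {
        Pipeline pipeline = jedis.pipelined();
        for (Map.Entry<String, String> entry : batch.entrySet()) {
            pipeline.set(entry.getKey(), entry.getValue());
        }
        // Flush the queued commands and collect all replies at once.
        pipeline.sync();
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            putAll(jedis, Map.of("key1", "value1", "key2", "value2"));
        }
    }
}
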
> On Tue, Mar 31, 2020 at 4:23 PM Hesselmann, Brian <
> brian.hesselmann@cgi.com> wrote:
>
>> Hi,
>>
>> We currently run a flow that puts about 700,000 entries/flowfiles into
>> Redis every 5 minutes. I'm looking for ways to improve performance.
>>
>> Currently we've been upping the number of concurrent tasks and run
>> duration of the PutDistributedMapCache processor to be able to process
>> everything. I know Redis supports setting multiple keys at once using MSET
>> (https://redis.io/commands/mset); however, this command is not available
>> through NiFi.
>>
>> Short of simply upgrading the system we run NiFi/Redis on, do you have
>> any suggestions for improving performance of PutDistributedMapCache?
>>
>> Best,
>> Brian
>>
>
