lucene-dev mailing list archives

From "Dawid Weiss (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-7792) Add optional concurrency to OfflineSorter
Date Thu, 20 Apr 2017 13:14:04 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976659#comment-15976659 ]

Dawid Weiss commented on LUCENE-7792:
-------------------------------------

Looks good, judging by the patch alone. It could probably be made more elegant if a null executor
were replaced with a "same thread" executor -- then no if/else checks would be necessary;
you'd simply always pass the callable to an executor (and it'd execute the job immediately
on submission).
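
To illustrate the "same thread" executor idea, here is a minimal sketch. The class name {{SameThreadExecutor}} is hypothetical and not part of the patch; it only assumes the standard {{java.util.concurrent.AbstractExecutorService}}, whose {{submit}} wraps the task in a future and hands it to {{execute}}.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not from the patch): an ExecutorService that runs
 * every task on the submitting thread, before submit() returns.
 */
final class SameThreadExecutor extends AbstractExecutorService {
  private volatile boolean shutdown;

  @Override
  public void execute(Runnable command) {
    if (shutdown) {
      throw new RejectedExecutionException("executor is shut down");
    }
    // No hand-off to another thread: the job runs right here, right now.
    command.run();
  }

  @Override
  public void shutdown() {
    shutdown = true;
  }

  @Override
  public List<Runnable> shutdownNow() {
    shutdown = true;
    // Tasks never queue up, so there is nothing pending to return.
    return Collections.emptyList();
  }

  @Override
  public boolean isShutdown() {
    return shutdown;
  }

  @Override
  public boolean isTerminated() {
    // Nothing can still be running once shutdown() has returned on this thread.
    return shutdown;
  }

  @Override
  public boolean awaitTermination(long timeout, TimeUnit unit) {
    return shutdown;
  }
}
```

With something like this, the sorter could default to {{new SameThreadExecutor()}} when the caller passes no executor, and the calling code would go through {{submit(callable)}} in both the concurrent and the single-threaded case. The returned future is already done by the time {{submit}} returns.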

Some day we could think of changing {{reThrow}} to return a dummy RuntimeException so that:
{code}
+          IOUtils.reThrow(ee.getCause());
+
+          // dead code but javac disagrees:
+          result = null;
{code}
could be changed to this:
{code}
+          throw IOUtils.reThrow(ee.getCause());
{code}

reThrow would never actually return a value anyway, but the declared return type would shut up
the compiler and make the code cleaner (I think).
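
A rough sketch of what I mean, under assumptions: {{RethrowSketch}}, {{reThrowAlways}} and {{unwrap}} are hypothetical names for illustration only, not the existing {{IOUtils}} API. Unlike the current {{reThrow}}, this variant would have to throw for every argument (including null), otherwise the {{throw}} at the call site would be misleading.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

final class RethrowSketch {
  /**
   * Hypothetical variant of reThrow: it always throws and never returns.
   * The declared RuntimeException return type exists only so that callers
   * can write "throw reThrowAlways(t);" and javac sees the path as terminated.
   */
  static RuntimeException reThrowAlways(Throwable t) throws IOException {
    if (t == null) {
      throw new IllegalArgumentException("nothing to rethrow");
    }
    if (t instanceof IOException) {
      throw (IOException) t;
    }
    if (t instanceof RuntimeException) {
      throw (RuntimeException) t;
    }
    if (t instanceof Error) {
      throw (Error) t;
    }
    // Any other checked exception gets wrapped.
    throw new RuntimeException(t);
  }

  /** Call-site example: no dummy "result = null;" is needed after the catch. */
  static int unwrap(Future<Integer> future) throws IOException, InterruptedException {
    try {
      return future.get();
    } catch (ExecutionException ee) {
      throw reThrowAlways(ee.getCause());
    }
  }
}
```

The compiler accepts {{unwrap}} without a trailing return because the catch block ends in a {{throw}} statement, which is the whole point of the dummy return type.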

> Add optional concurrency to OfflineSorter
> -----------------------------------------
>
>                 Key: LUCENE-7792
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7792
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: master (7.0), 6.6
>
>         Attachments: LUCENE-7792.patch
>
>
> OfflineSorter is a heavy operation and is really an embarrassingly concurrent problem
> at heart, and if you have enough hardware concurrency (e.g. fast SSDs, multiple CPU cores)
> it can be a big speedup.
> E.g., after reading a partition from the input, one thread can sort and write it, while
> another thread reads the next partition, etc.  Merging partitions can also be done in the
> background.  Some things still cannot be concurrent, e.g. the initial read from the input
> must be a single thread, as well as the final merge and writing to the final output.
> I think I found a fairly non-invasive way to add optional concurrency to this class,
> by adding an optional ExecutorService to OfflineSorter's ctor (similar to IndexSearcher) and
> using futures to represent each partition as we sort, and creating Callable classes for sorting
> and merging partitions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


