lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-9684) Add schedule Streaming Expression
Date Mon, 02 Jan 2017 02:46:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15792043#comment-15792043 ]

Joel Bernstein edited comment on SOLR-9684 at 1/2/17 2:46 AM:
--------------------------------------------------------------

Ok, then let's go with *priority* as the name for this function.

About the *merge* function. The merge function is shorthand for "mergeSort". It's designed
to merge two streams sorted on the same keys and maintain the sort order. Originally the idea
was that the /export handler was a giant sorting engine, and merge was a way to efficiently
merge the sorted streams.
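
To make the merge behaviour concrete, here is a minimal Python sketch of merging two
streams that are already sorted on the same key. It only illustrates the idea and is not
the Solr implementation; the name merge_sorted and the dict-shaped tuples are assumptions
made for this example.

```python
from typing import Any, Callable, Iterable, Iterator

_END = object()  # sentinel marking an exhausted stream


def merge_sorted(left: Iterable[Any], right: Iterable[Any],
                 key: Callable[[Any], Any]) -> Iterator[Any]:
    """Yield items from two inputs already sorted on `key`, keeping that order.

    Hypothetical helper for illustration only; not part of any Solr API.
    """
    left_it, right_it = iter(left), iter(right)
    a = next(left_it, _END)
    b = next(right_it, _END)
    # Always emit the smaller head, then advance only the stream it came from.
    while a is not _END and b is not _END:
        if key(a) <= key(b):
            yield a
            a = next(left_it, _END)
        else:
            yield b
            b = next(right_it, _END)
    # One side is exhausted; drain whatever remains on the other side.
    while a is not _END:
        yield a
        a = next(left_it, _END)
    while b is not _END:
        yield b
        b = next(right_it, _END)


left = [{"id": 1}, {"id": 4}]
right = [{"id": 2}, {"id": 3}, {"id": 5}]
merged_ids = [t["id"] for t in merge_sorted(left, right, key=lambda t: t["id"])]
print(merged_ids)  # [1, 2, 3, 4, 5]
```

Both inputs are consumed one tuple at a time, so the merged output stays sorted without
buffering either stream.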

The *priority* function behaves more like a SQL UNION ALL with priority. But it's different
in that *priority* picks only one stream to iterate on each open/close. This design allows
it to iterate the high priority topic until it's empty, and only then iterate through the
lower priority topic.
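
A rough Python sketch of that open/close behaviour, under stated assumptions: ListTopic is
a hypothetical stand-in for a topic that hands back whatever is pending on each read,
and PriorityStream and run_cycle are illustrative names, not the Solr classes.

```python
class ListTopic:
    """Hypothetical stand-in for a topic: each read drains the pending items."""

    def __init__(self, pending):
        self.pending = list(pending)

    def read_batch(self):
        batch, self.pending = self.pending, []
        return batch


class PriorityStream:
    """Illustrative only: serves exactly one underlying stream per open/close."""

    def __init__(self, high, low):
        self.high = high
        self.low = low
        self._batch = []

    def open(self):
        # Try the high priority topic first; fall back to the low priority
        # topic only when the high priority one has nothing pending.
        self._batch = self.high.read_batch()
        if not self._batch:
            self._batch = self.low.read_batch()

    def read(self):
        yield from self._batch

    def close(self):
        self._batch = []


def run_cycle(stream):
    """Run one open/read/close cycle and return the tuples it emitted."""
    stream.open()
    try:
        return list(stream.read())
    finally:
        stream.close()


high = ListTopic(["h1", "h2"])
low = ListTopic(["l1", "l2"])
stream = PriorityStream(high, low)

print(run_cycle(stream))   # ['h1', 'h2']
high.pending.append("h3")  # new high priority work arrives between cycles
print(run_cycle(stream))   # ['h3']
print(run_cycle(stream))   # ['l1', 'l2']
```

The low priority tasks are only seen on a cycle where the high priority topic is empty,
which is the difference from a plain UNION ALL over both streams.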

Also, I think the *merge* function fits into the relational algebra category. The *priority*
function is mainly going to be used for task prioritization and execution.

Eventually we'll need to implement both a UnionStream and UnionAllStream as well.







> Add schedule Streaming Expression
> ---------------------------------
>
>                 Key: SOLR-9684
>                 URL: https://issues.apache.org/jira/browse/SOLR-9684
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>             Fix For: master (7.0), 6.4
>
>         Attachments: SOLR-9684.patch, SOLR-9684.patch, SOLR-9684.patch
>
>
> SOLR-9559 adds a general purpose *parallel task executor* for streaming expressions.
> The executor() function executes a stream of tasks and doesn't have any concept of task priority.
> The scheduler() function wraps two streams, a high priority stream and a low priority
> stream. The scheduler function emits tuples from the high priority stream first, and then
> the low priority stream.
> The executor() function can then wrap the scheduler function to see tasks in priority
> order.
> Pseudo syntax:
> {code}
> daemon(executor(schedule(topic(tasks, q="priority:high"), topic(tasks, q="priority:low"))))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

