lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: Streaming timeseries() and buckets with no docs
Date Thu, 06 Sep 2018 13:11:09 GMT
Thanks!

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 6 Sep 2018, at 15:09, Joel Bernstein <joelsolr@gmail.com> wrote:
> 
> I found the ticket you created and commented on it. I'll work on this today.
> 
> 
> Joel Bernstein
> http://joelsolr.blogspot.com/
> 
> 
> On Thu, Sep 6, 2018 at 9:04 AM Joel Bernstein <joelsolr@gmail.com> wrote:
> 
>> Ok, I'll create a ticket for this, it's a very quick fix. I'll try to
>> commit today.
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
>> On Thu, Sep 6, 2018 at 6:52 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> 
>>> Created https://issues.apache.org/jira/browse/SOLR-12749
>>> 
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>> 
>>>> On 5 Sep 2018, at 23:48, Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>> 
>>>> Checked git history for TimeSeriesStream on master, and I cannot see any commits related to this?
>>>> 
>>>> SOLR-11914: Deprecated some SolrParams methods. * toSolrParams(nl) moved to a NamedList method, which is more natural. David Smiley 23.04.2018, 19:26
>>>> SOLR-11629: Add new CloudSolrClient.Builder ctors Jason Gerlowski 10.03.2018, 15:30
>>>> SOLR-11799: Fix NPE and class cast exceptions in the TimeSeriesStream Joel Bernstein 28.12.2017, 17:14
>>>> SOLR-11490: Add missing @since tags To all descendants of TupleStream Alexandre Rafalovitch 19.10.2017, 03:38
>>>> SOLR-10770: Fix precommit Joel Bernstein 30.05.2017, 20:51
>>>> SOLR-10770: Add date formatting to timeseries Streaming Expression Joel Bernstein 30.05.2017, 20:38
>>>> SOLR-10566: Fix error handling Joel Bernstein 01.05.2017, 18:06
>>>> SEARCH-313: Handled unescaped plus sign in gap Joel Bernstein 27.04.2017, 04:34
>>>> SOLR-10566: Fix precommit Joel Bernstein 26.04.2017, 17:17
>>>> SOLR-10566: Add timeseries Streaming Expression Joel Bernstein 26.04.2017, 16:57
>>>> 
>>>> --
>>>> Jan Høydahl, search solution architect
>>>> Cominvent AS - www.cominvent.com <http://www.cominvent.com/>
>>>> 
>>>>> On 5 Sep 2018, at 16:12, Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>> 
>>>>> I have tested this with the latest released version, 7.4.0.
>>>>> 
>>>>> --
>>>>> Jan Høydahl, search solution architect
>>>>> Cominvent AS - www.cominvent.com <http://www.cominvent.com/>
>>>>> 
>>>>>> On 4 Sep 2018, at 16:32, Joel Bernstein <joelsolr@gmail.com> wrote:
>>>>>> 
>>>>>> Which version are you using?
>>>>>> 
>>>>>> I remember addressing this issue, but it may have been in Alfresco's
>>>>>> version of Solr and never got ported back.
>>>>>> 
>>>>>> I do agree that in a time series a null value is not what people want. It
>>>>>> is a very small change to populate with zeros if it has not already been
>>>>>> done in the latest versions.
>>>>>> 
>>>>>> Joel Bernstein
>>>>>> http://joelsolr.blogspot.com/ <http://joelsolr.blogspot.com/>
>>>>>> 
>>>>>> 
>>>>>> On Mon, Sep 3, 2018 at 8:58 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>>> 
>>>>>>> Hi
>>>>>>> 
>>>>>>> We have a timeseries expression with gap="+1DAY" and a sum(imps_l) to
>>>>>>> aggregate sums of an integer for each bucket.
>>>>>>> Now, some day buckets do not contain any documents at all, and instead of
>>>>>>> returning a tuple with value 0, it returns
>>>>>>> a tuple with no entry at all for the sum, see the bucket for date_dt
>>>>>>> 2018-06-22 below:
>>>>>>> 
>>>>>>> {
>>>>>>> "result-set": {
>>>>>>>   "docs": [
>>>>>>>     {
>>>>>>>       "sum(imps_l)": 0,
>>>>>>>       "date_dt": "2018-06-21",
>>>>>>>       "count(*)": 5
>>>>>>>     },
>>>>>>>     {
>>>>>>>       "date_dt": "2018-06-22",
>>>>>>>       "count(*)": 0
>>>>>>>     },
>>>>>>>     {
>>>>>>>       "EOF": true,
>>>>>>>       "RESPONSE_TIME": 3
>>>>>>>     }
>>>>>>>   ]
>>>>>>> }
>>>>>>> }
>>>>>>> 
>>>>>>> 
>>>>>>> Now when we want to convert this into a column using col(a,'sum(imps_l)')
>>>>>>> then that array will get mostly numbers
>>>>>>> but also some string entries 'sum(imps_l)' which is the key name. I need
>>>>>>> purely integers in the column.
>>>>>>> 
>>>>>>> Should the timeseries() have output values for all functions even if there
>>>>>>> are no documents in the bucket?
>>>>>>> Or is there something similar to the select() expression that can take a
>>>>>>> stream of tuples not originating directly
>>>>>>> from search() and replace values? Or is there perhaps a function that can
>>>>>>> loop through the column produced by col()
>>>>>>> and replace non-numeric values with 0?
>>>>>>> 
>>>>>>> --
>>>>>>> Jan Høydahl, search solution architect
>>>>>>> Cominvent AS - www.cominvent.com <http://www.cominvent.com/>
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
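
Until timeseries() emits zeros for empty buckets (tracked in SOLR-12749), one possible workaround is to fill in the missing keys on the client after reading the response. The following is only a minimal Python sketch, assuming the JSON shape shown in the original message; the function name fill_missing_metrics and the choice of 0 as the default are illustrative assumptions and not part of Solr.

```python
import json


def fill_missing_metrics(response, metrics, default=0):
    """Return the data tuples with every requested metric key present.

    The trailing EOF tuple is skipped; tuples that lack a metric
    (buckets with no documents) get `default` for that metric.
    """
    filled = []
    for doc in response["result-set"]["docs"]:
        if doc.get("EOF"):
            continue  # not a data tuple
        row = dict(doc)  # copy so the original response is left untouched
        for metric in metrics:
            row.setdefault(metric, default)
        filled.append(row)
    return filled


# Response copied from the original message in this thread.
sample = json.loads("""
{
  "result-set": {
    "docs": [
      {"sum(imps_l)": 0, "date_dt": "2018-06-21", "count(*)": 5},
      {"date_dt": "2018-06-22", "count(*)": 0},
      {"EOF": true, "RESPONSE_TIME": 3}
    ]
  }
}
""")

rows = fill_missing_metrics(sample, ["sum(imps_l)"])
column = [row["sum(imps_l)"] for row in rows]
print(column)
```

With the keys filled in, the extracted column contains only numbers, which is what the col() step in the original question was expected to produce.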

