lucene-java-user mailing list archives

From Gergely Nagy <foge...@gmail.com>
Subject Re: Indexing and searching a DateTime range
Date Tue, 10 Feb 2015 01:20:05 GMT
OK. I found the Alfresco code on GitHub, so it seems it is open source after all.

And I found the DateTimeAnalyser, so I will just take that code as a
starting point:
https://github.com/lsbueno/alfresco/tree/master/root/projects/repository/source/java/org/alfresco/repo/search/impl/lucene/analysis

Thank you, everybody, for taking the time to respond.

2015-02-10 9:55 GMT+09:00 Gergely Nagy <fogetti@gmail.com>:

> Thank you Barry, I really appreciate your time to respond,
>
> Let me clarify this a little bit more. I think it was not clear.
>
> I know how to parse dates, this is not the question here. (See my previous
> email: "how can I pipe my converter logic into the indexing process?")
>
> All of your solutions would work fine if I wanted to index per document,
> which I do NOT want to do. What I would like to do is index per log line.
>
> I need to do a full text search, but with the additional requirement to
> filter those search hits by DateTime range.
>
> I hope this makes it clearer. So any suggestions how to do that?
>
> Sidenote: I saw that Alfresco implemented this analyzer, called
> DateTimeAnalyzer, but Alfresco is not open source. So I was wondering how
> to implement the same. Actually after wondering for 2 days, I became
> convinced that writing an Analyzer should be the way to go. I will post my
> solution later if I have working code.
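
The requirement described above (full-text hits per log line, filtered by a DateTime range) can be pictured without any Lucene code at all. The sketch below is a plain-Java illustration, not Lucene API: the class `LogLineSearch`, its `Entry` type and the `search` method are hypothetical names. It assumes each log line has already been reduced to a millisecond timestamp plus its text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; none of these names come from Lucene.
final class LogLineSearch {
    /** One record per log line: the parsed timestamp plus the line's text. */
    static final class Entry {
        final long millis;
        final String text;

        Entry(long millis, String text) {
            this.millis = millis;
            this.text = text;
        }
    }

    /** Returns lines containing the keyword whose timestamp lies in [fromMillis, toMillis]. */
    static List<String> search(List<Entry> entries, String keyword, long fromMillis, long toMillis) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Entry e : entries) {
            boolean inRange = e.millis >= fromMillis && e.millis <= toMillis;
            boolean matches = e.text.toLowerCase(Locale.ROOT).contains(needle);
            if (inRange && matches) {
                hits.add(e.text);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry(1000L, "INFO service started"),
                new Entry(2000L, "ERROR disk full"),
                new Entry(3000L, "INFO service stopped"));
        System.out.println(search(entries, "service", 500L, 2500L));
    }
}
```

In Lucene terms this would probably correspond to one Document per log line, with the text in an analyzed field and the timestamp in a numeric field such as the LongField mentioned elsewhere in this thread, and a query that requires both a text match and a numeric range (for example NumericRangeQuery in Lucene 4.x). That mapping is an assumption, not something confirmed in the thread.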
>
> 2015-02-10 8:50 GMT+09:00 Barry Coughlan <b.coughlan2@gmail.com>:
>
>> Hi Gergely,
>>
>> Writing an analyzer would work but it is unnecessarily complicated. You
>> could just parse the date from the string in your input code and index it
>> in the LongField like this:
>>
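>> import java.text.SimpleDateFormat;
>> import java.util.TimeZone;
>>
>> SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S'Z'");
>> format.setTimeZone(TimeZone.getTimeZone("UTC"));
>> // parse(String) reads only the leading timestamp; the rest of the line is ignored.
>> long t = format.parse("2015-02-08 00:02:06.123Z INFO...").getTime();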
>>
>> Barry
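
Barry's suggestion can be wrapped in a small helper that is called once per log line. This is a minimal sketch under stated assumptions: the class `LogTimestamps` and the method `parseLeadingMillis` are hypothetical names, the pattern is inferred from the sample lines in this thread, and returning -1 for lines without a timestamp is an arbitrary choice made for the example.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Lucene: extracts the leading timestamp of a log line.
final class LogTimestamps {
    // Pattern assumed from the sample lines, e.g. "2015-02-08 00:02:06.852Z INFO...".
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss.SSS'Z'";

    /** Returns epoch milliseconds (UTC) for the timestamp prefix, or -1 if the line has none. */
    static long parseLeadingMillis(String line) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        // Parsing starts at index 0 and stops after the timestamp; null means no match.
        Date parsed = format.parse(line, new ParsePosition(0));
        return parsed == null ? -1L : parsed.getTime();
    }

    public static void main(String[] args) {
        System.out.println(parseLeadingMillis("2015-02-08 00:02:06.123Z INFO..."));
    }
}
```

The returned value is what would go into a numeric field for that line. Lines without a leading timestamp (stack traces, continuation lines) need their own policy, such as inheriting the timestamp of the previous line; the thread does not say which one applies here.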
>>
>> On Tue, Feb 10, 2015 at 12:21 AM, Gergely Nagy <fogetti@gmail.com> wrote:
>>
>> > Thank you for taking your time to respond Karthik,
>> >
>> > Can you show me an example of how to convert DateTime to milliseconds? I mean
>> > how can I pipe my converter logic into the indexing process?
>> >
>> > I suspect I need to write my own Analyzer/Tokenizer to achieve this. Is
>> > this correct?
>> >
>> > 2015-02-09 22:58 GMT+09:00 KARTHIK SHIVAKUMAR <nskarthik.k@gmail.com>:
>> >
>> > > Hi
>> > >
>> > > A long time ago, I used to store datetimes in milliseconds.
>> > >
>> > > TermRangeQuery used to work perfectly.
>> > >
>> > > Convert all datetimes to milliseconds and index them.
>> > >
>> > > At search time, convert the datetime to milliseconds again and use
>> > > TermRangeQuery.
>> > >
>> > > With regards
>> > > Karthik
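
One caveat with storing milliseconds as terms: TermRangeQuery compares terms as strings, so the numbers only sort correctly if every term has the same width. The sketch below shows that effect with plain Java strings. `SortableMillis`, `encode` and `inRange` are hypothetical names, and the 19-digit width is an assumption chosen because it fits any non-negative long.

```java
import java.util.Locale;

// Hypothetical helper: makes millisecond values sort correctly as strings.
final class SortableMillis {
    /** Zero-pads to 19 digits so string order equals numeric order for non-negative values. */
    static String encode(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("negative timestamps are out of scope for this sketch");
        }
        return String.format(Locale.ROOT, "%019d", millis);
    }

    /** Inclusive string range check; for ASCII digits this matches byte-wise term order. */
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Unpadded, "999" sorts after "1000" because '9' > '1'.
        System.out.println("999".compareTo("1000") > 0);
        // Padded, 999 sorts before 1000 as expected.
        System.out.println(encode(999L).compareTo(encode(1000L)) < 0);
    }
}
```

A numeric field type avoids this padding step altogether, which is presumably why the LongField examples mentioned in the original question are the more common recommendation.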
>> > > On Feb 9, 2015 1:24 PM, "Gergely Nagy" <fogetti@gmail.com> wrote:
>> > >
>> > > > Hi Lucene users,
>> > > >
>> > > > I am in the beginning of implementing a Lucene application which would
>> > > > supposedly search through some log files.
>> > > >
>> > > > One of the requirements is to return results between a time range. Let's
>> > > > say these are two lines in a series of log files:
>> > > > 2015-02-08 00:02:06.852Z INFO...
>> > > > ...
>> > > > 2015-02-08 18:02:04.012Z INFO...
>> > > >
>> > > > Now I need to search for these lines and return all the text in-between.
>> > > > I was using this demo application to build an index:
>> > > >
>> > > > http://lucene.apache.org/core/4_10_3/demo/src-html/org/apache/lucene/demo/IndexFiles.html
>> > > >
>> > > > After that my first thought was using a term range query like this:
>> > > >         TermRangeQuery query = TermRangeQuery.newStringRange("contents",
>> > > >             "2015-02-08 00:02:06.852Z", "2015-02-08 18:02:04.012Z", true, true);
>> > > >
>> > > > But for some reason this didn't return any results.
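
A likely reason for the empty result, offered as an assumption rather than a diagnosis: the "contents" field is analyzed, so the index holds individual tokens, not whole lines. Both range bounds begin with "2015-02-08 " (including the space), so any term between them would have to begin with that prefix too, and a tokenized term never contains a space. The sketch below imitates this with a whitespace split standing in for the real analyzer; `TermRangeDemo` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of a string range query over tokenized text; not Lucene code.
final class TermRangeDemo {
    /** Crude stand-in for an analyzer: splits on whitespace only. */
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        for (String t : line.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Returns the tokens of the line that fall in the inclusive string range [lower, upper]. */
    static List<String> termsInRange(String line, String lower, String upper) {
        List<String> hits = new ArrayList<>();
        for (String term : tokens(line)) {
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String lower = "2015-02-08 00:02:06.852Z";
        String upper = "2015-02-08 18:02:04.012Z";
        // No single token can start with "2015-02-08 " (with the space), so nothing matches.
        System.out.println(termsInRange("2015-02-08 09:15:00.000Z INFO started", lower, upper));
    }
}
```

The real analyzer used by the demo may split the timestamp differently and lowercase the tokens, but the argument about the shared prefix holds for any tokenizer that breaks on whitespace.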
>> > > >
>> > > > Then I was Googling for a while how to solve this problem, but all the
>> > > > datetime examples I found are searching based on a much simpler field.
>> > > > Those examples usually use a field like this:
>> > > > doc.add(new LongField("modified", file.lastModified(), Field.Store.NO));
>> > > >
>> > > > So I was wondering, how can I index these log files to make a range query
>> > > > work on them? Any ideas? Maybe my approach is completely wrong. I am still
>> > > > new to Lucene so any help is appreciated.
>> > > >
>> > > > Thank you.
>> > > >
>> > > > Gergely Nagy
>> > > >
>> > >
>> >
>>
>
>
