lucene-solr-user mailing list archives

From Raymond Xie <xie3208...@gmail.com>
Subject Re: How do I create a schema file for FIX data in Solr
Date Mon, 02 Apr 2018 13:30:01 GMT
Thank you, Rick, for the pointer.

I will get the FIX message parsed first and come back here later.
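
For readers following the thread, here is a minimal sketch of that parsing step. It is not code from anyone on the list. It assumes fields are separated by the SOH character (\x01), as in raw FIX, or by a space as in the sample pasted in this thread. The TAG_NAMES table is a small hand-picked subset of the FIX dictionary, and parse_fix is a hypothetical helper name. Repeating groups are not handled: if a tag occurs twice, the last value wins.

```python
import json

# Hand-picked subset of the FIX dictionary (assumption: extend as needed).
TAG_NAMES = {
    "8": "BeginString",
    "9": "BodyLength",
    "35": "MsgType",
    "37": "OrderID",
    "75": "TradeDate",
    "122": "OrigSendingTime",
}


def parse_fix(message, delimiter="\x01"):
    """Split a FIX message into a flat dict of field name -> string value."""
    fields = {}
    for pair in message.split(delimiter):
        # Skip empty chunks (trailing delimiter) and chunks without "=".
        tag, sep, value = pair.partition("=")
        if not sep or not tag:
            continue
        # Unknown tags keep their number, prefixed so the key is still a name.
        name = TAG_NAMES.get(tag, "tag_" + tag)
        fields[name] = value
    return fields


# The sample in this thread is shown with spaces instead of SOH.
doc = parse_fix("8=FIX.4.4 9=653 35=RIO", delimiter=" ")
print(json.dumps(doc))
# {"BeginString": "FIX.4.4", "BodyLength": "653", "MsgType": "RIO"}
```

Every value stays a string here, which matches the plan to treat all tags as strings; conversion to numbers or dates could be added later once the schema is decided.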


*------------------------------------------------*
*Sincerely yours,*


*Raymond*

On Mon, Apr 2, 2018 at 9:15 AM, Rick Leir <rleir@leirtech.com> wrote:

> Google "fix to json"; there are a few interesting leads.
>
> On April 2, 2018 12:34:44 AM EDT, Raymond Xie <xie3208080@gmail.com>
> wrote:
> >Thank you, Shawn, Rick and other readers,
> >
> >To Shawn:
> >
> >For *8=FIX.4.4 9=653 35=RIO* as an example, in the FIX standard: 8
> >means BeginString, and in this example its value is FIX.4.4; 9 means
> >BodyLength, which is 653 for this message; 35 is MsgType, which is RIO
> >here; 122 stands for OrigSendingTime and has the format UTCTimestamp.
> >
> >You can refer to this page for details:
> >https://www.onixs.biz/fix-dictionary/4.2/fields_by_tag.html
> >
> >All the values are treated as string type.
> >
> >All the tag numbers come from the FIX standard, so they don't change
> >(in my case).
> >
> >I expect a Python program might be needed to parse the message and
> >extract each tag's value; the index is to be built on those extracted
> >values along with their field (tag) names.
> >
> >With the index in place, users would ideally be able to search for any
> >keyword. In this case, however, most queries would be based on tag 37
> >(Order ID) and tag 75 (Trade Date); there is also a customized tag (not
> >in the standard), Order Version, to be queried on.
> >
> >I understand that creating the parser would be a manual process. As
> >long as I know how, or have a small sample program, I will do it myself
> >and adjust it as needed.
> >
> >To Rick:
> >
> >You mentioned creating a JSON document. My understanding is that a
> >parser would be needed to generate that JSON document; do you have any
> >existing example code?
> >
> >
> >Thank you guys very much.
> >
> >*------------------------------------------------*
> >*Sincerely yours,*
> >
> >
> >*Raymond*
> >
> >On Sun, Apr 1, 2018 at 2:16 PM, Shawn Heisey <apache@elyograg.org>
> >wrote:
> >
> >> On 4/1/2018 10:12 AM, Raymond Xie wrote:
> >>
> >>> FIX is a format standard for financial data. It contains many numeric
> >>> tags, each with a value, like 8=asdf, where 8 is the tag and asdf is
> >>> the tag's value. Each tag has its own definition.
> >>>
> >>> The sample message in FIX format was in the original question.
> >>>
> >>> All I need is to know how to parse the message and get each tag's
> >>> value.
> >>>
> >>> So far I have found that a parser is what I need to start with. But I
> >>> am more concerned about how to create an index in Solr on the extracted
> >>> tag values. That is the first step; the next would be to customize the
> >>> dashboard so users can search with a value, find out which message
> >>> contains that value in which tag, and see the whole message as proof.
> >>>
> >>>
> >>
> >> Most of Solr's functionality is provided by Lucene.  Lucene is a Java
> >> API that implements search functionality.  Solr bolts on some
> >> functionality on top of Lucene, but doesn't really do anything to
> >> fundamentally change the fact that you're dealing with a Lucene index.
> >> So I'm going to mostly talk about Lucene below.
> >>
> >> Lucene organizes data in a unit that we call a "document." An easy
> >> analogy for this is that it is a lot like a row in a single database
> >> table.  It has fields, each field has a type. Unless custom software is
> >> used, there is really no support for data other than basic primitive
> >> types -- numbers and strings.  The only complex type that I can think of
> >> that Solr supports out of the box is geospatial coordinates, and it
> >> might even support multi-dimensional coordinates, but I'm not sure.
> >> It's not all that complex -- the field just stores and manipulates
> >> multiple numbers instead of one.  The Lucene API does support a FEW
> >> things that Solr doesn't implement.  I don't think those are applicable
> >> to what you're trying to do.
> >>
> >> Let's look at the first part of the data that you included in the
> >> first message:
> >>
> >> 8=FIX.4.4 9=653 35=RIO
> >>
> >> Is "8" always a mixture of letters and numbers and periods? Is "9"
> >> always a number, and is it always a WHOLE number?  Is "35" always
> >> letters?  Looking deeper to data that I didn't quote ... is "122" always
> >> a date/time value?  Are the tag numbers always picked from a
> >> well-defined set, or do they change?
> >>
> >> Assuming that the answers in the previous paragraph are found and a
> >> configuration is created to deal with all of it ... how are you
> >> planning to search it?  What kind of queries would you expect somebody
> >> to make?  That's going to have a huge influence on how you configure
> >> things.
> >>
> >> Writing the schema is usually where people spend the most time when
> >> they're setting up Solr.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
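
To make the schema question in the quoted thread concrete, here is a rough sketch (not from any poster) of what field definitions and a query for the two most-searched tags might look like. It only builds and prints JSON and a query string; nothing is sent to Solr. Assumptions: the field type "string" exists in the target configset, the payload would be POSTed to the collection's Schema API endpoint (/solr/<collection>/schema), and the names OrderVersion and raw_message are hypothetical, chosen here for the custom tag and for keeping the whole message as proof.

```python
import json

# Field names follow the FIX tag names used in the thread. "OrderVersion"
# and "raw_message" are hypothetical names picked for this sketch.
FIELDS = [
    {"name": "OrderID", "type": "string", "indexed": True, "stored": True},
    {"name": "TradeDate", "type": "string", "indexed": True, "stored": True},
    {"name": "OrderVersion", "type": "string", "indexed": True, "stored": True},
    # Stored only, so the full message can be shown with each hit.
    {"name": "raw_message", "type": "string", "indexed": False, "stored": True},
]


def build_schema_request(fields):
    """Wrap field definitions in a Schema API "add-field" command."""
    return {"add-field": list(fields)}


def build_query(order_id, trade_date):
    """Build a query on Order ID and Trade Date.

    Assumes the values contain no double quotes; no escaping is done.
    """
    return 'OrderID:"{}" AND TradeDate:"{}"'.format(order_id, trade_date)


print(json.dumps(build_schema_request(FIELDS), indent=2))
print(build_query("ABC123", "20180402"))
```

Using string fields gives exact-match lookups only, which fits searches by Order ID and Trade Date; free-text "any keyword" search would need an analyzed text field instead.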
