lucene-solr-user mailing list archives

From Sznajder ForMailingList <bs4mailingl...@gmail.com>
Subject Re: "Avoiding" a schema.xml
Date Sat, 02 May 2015 21:33:29 GMT
Thanks!

Indeed, one of my issues is that I cannot know which fields will be
indexed until I have seen the crawled documents and run some entity
extraction on them.
That is why I was hoping to avoid defining a schema ...

The schema API sounds interesting! Does it exist via SolrJ?

Many thanks!

Benjamin
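
For reference, a minimal sketch of the kind of Schema API call Erick
mentions, assuming a collection that uses the managed schema. It goes over
plain HTTP rather than SolrJ and only builds the request. The base URL, the
collection name "mycollection", the field name "person_name" and the
"text_general" field type are illustrative assumptions, not values from this
thread.

```python
import json
import urllib.request


def build_add_field_request(base_url, collection, name, field_type,
                            stored=True, indexed=True):
    """Build (but do not send) a Schema API request that adds one field."""
    # "add-field" is the Schema API command for defining a new field.
    payload = {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{collection}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: adjust to the actual server, collection and field.
req = build_add_field_request(
    "http://localhost:8983/solr", "mycollection",
    "person_name", "text_general",
)

# To apply it against a running Solr instance with a managed schema:
# urllib.request.urlopen(req)
```

A field discovered during entity extraction could be registered this way
before the first document that uses it is indexed.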

On Thu, Apr 30, 2015 at 6:27 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Could you explain a bit more _why_ you want to do this? As you're
> probably well aware, there
> are multiple ways to shoot yourself in the foot in lower-level Lucene.
>
> If you have some situation where you're creating indexes on the fly
> that may vary then
> you could consider the "managed schema" that lets you create a schema
> via API calls,
> then you wouldn't need to mess with editing the schema.xml file for
> instance.
>
> Best,
> Erick
>
> On Thu, Apr 30, 2015 at 8:12 AM, Shawn Heisey <apache@elyograg.org> wrote:
> > On 4/30/2015 8:43 AM, Sznajder ForMailingList wrote:
> >> I would like to index some documents in Solr, as I did in Lucene.
> >>
> >> That is, passing all the information about each field I add
> >> (tokenize, store, facet, etc.) through SolrJ.
> >>
> >> Can we do that? Or is it mandatory to define a schema on the collection?
> >
> > All that information is defined on the server.  You do not have direct
> > access to the Lucene index - Solr is intended as an abstraction, so the
> > admin and the users/applications that use Solr do not need to understand
> > all the low-level details that go into a Lucene application.  The admin
> > just has to deal with configuration files like schema.xml, and the users
> > just need to know what fields are in each document and how the query
> > syntax works.  Deeper Lucene knowledge is helpful, but not strictly
> > necessary.
> >
> > If you want Lucene-level control, you'll need to write the search server
> > yourself using Lucene.  If you have very specific needs that Solr's
> > approach can't satisfy, you always have this option.
> >
> > The newest Solr versions do have an example of what's known as a
> > "data-driven" schema, or schemaless mode.  In this mode, Solr builds up
> > the schema automatically, guessing the field type based on what kind of
> > data is the first to arrive for each field.  This is good for
> > prototyping, but for production use, I would want to be in full manual
> > control of the schema.
> >
> > Thanks,
> > Shawn
> >
>
