lucene-solr-user mailing list archives

From Gytis Mikuciunas <gyt...@gmail.com>
Subject Re: how to get modified field data if it doesn't exist in meta
Date Mon, 13 Feb 2017 14:05:07 GMT
Hi,

Could someone compile this into a jar file for me? I found something
similar to what I need via Google:
http://stackoverflow.com/questions/20745935/set-last-modified-field-when-not-defined-in-document-in-solr

package modifiedG4;

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class LastModifiedMergeProcessorFactory
   extends UpdateRequestProcessorFactory {

  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
       SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new LastModifiedMergeProcessor(next);
  }
}

class LastModifiedMergeProcessor extends UpdateRequestProcessor {

  public LastModifiedMergeProcessor(UpdateRequestProcessor next) {
    super(next);
  }

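  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();

    // last_modified comes from the document's own metadata (if any);
    // file_date is expected to have been supplied at indexing time.
    Object metaDate = doc.getFieldValue("last_modified");
    Object fileDate = doc.getFieldValue("file_date");

    // Fall back to the file date only when the metadata has no date.
    if (metaDate == null && fileDate != null) {
      doc.addField("last_modified", fileDate);
    }

    // pass the document on to the next processor in the chain
    super.processAdd(cmd);
  }
}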
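
The processor above only copies file_date into last_modified; it assumes
something has already put file_date on the document. A minimal sketch of
the file-system lookup suggested in the quoted reply below, using only the
standard Java API. The class name FileDateLookup and the method
lastModifiedIso are made up for illustration, and how the file name reaches
the processor (which field or request parameter carries it) is an
assumption that would need checking against the actual indexing setup. It
only works if the indexed files are readable from the machine Solr runs on.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Solr: looks up a file's modification time.
final class FileDateLookup {

  private FileDateLookup() {}

  /**
   * Returns the last-modified time of the named file as an ISO-8601 UTC
   * string truncated to whole seconds, or null if the name is null,
   * is not a valid path, or does not point to a readable regular file.
   */
  static String lastModifiedIso(String fileName) {
    if (fileName == null) {
      return null;
    }
    try {
      Path path = Paths.get(fileName);
      if (!Files.isRegularFile(path)) {
        return null;
      }
      Instant modified = Files.getLastModifiedTime(path).toInstant();
      return modified.truncatedTo(ChronoUnit.SECONDS).toString();
    } catch (IOException | InvalidPathException e) {
      // Treat unreadable or malformed paths as "no date available".
      return null;
    }
  }

  public static void main(String[] args) {
    // Print the date for each path given on the command line.
    for (String name : args) {
      System.out.println(name + " -> " + lastModifiedIso(name));
    }
  }
}
```

Inside processAdd, the returned string could be added as file_date before
the null check, provided the document carries the file path in some field.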

On Sun, Feb 12, 2017 at 8:45 PM, Alexandre Rafalovitch <arafalov@gmail.com>
wrote:

> It would have to be a custom one. One you write. But I believe Tika
> would pass a file name as one of the parameters, so you just need to
> use standard Java API to look up the system date. That - of course -
> assumes that the files you index are on the same filesystem as Solr
> itself, so it could look it up.
>
> You can find more about the URPs at:
> https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors
> You can find the full list of the URPs at:
> http://www.solr-start.com/info/update-request-processors/
> If you are on the latest Solr 6.4, you would probably want to subclass
> SimpleUpdateProcessorFactory and follow the implementation example of
> TemplateUpdateProcessorFactory
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/6.4.0/solr/core/src/java/org/apache/solr/update/processor/TemplateUpdateProcessorFactory.java
>
> Alternatively, you could implement your URP in Javascript, but I am
> not sure that has an API to check file dates.
>
> Regards,
>    Alex.
> ----
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 12 February 2017 at 13:28, Gytis Mikuciunas <gytmkc@gmail.com> wrote:
> > Alexandre, could you provide a link or give more info about this
> > processor?
> > I'm a novice in the Solr world ;)
> >
> >
> > Regards,
> > Gytis
> >
> > On Feb 10, 2017 14:59, "Alexandre Rafalovitch" <arafalov@gmail.com> wrote:
> >
> > Custom update request processor that looks up a file from the name and
> > gets the date should work.
> >
> > Regards,
> >     Alex
> >
> > On 10 Feb 2017 2:39 AM, "Gytis Mikuciunas" <gytmkc@gmail.com> wrote:
> >
> > Hi,
> >
> > We have started to use Solr to index our documents (vsd, vsdx,
> > xls, xlsx, doc, docx, pdf, txt).
> >
> > A modified-date value is needed for each file. MS Office files and PDFs
> > have this value.
> > The problem is with txt files, as they don't have this value in their
> > metadata.
> >
> > Is there any way to get it from the OS level and force it to be added
> > to Solr when we index?
> >
> > p.s.
> >
> > Windows 2012 server, single instance
> >
> > typical command we use: java -Dauto -Dc=index_sandbox -Dport=80
> > -Dfiletypes=vsd,vsdx,xls,xlsx,doc,docx,pdf,txt -Dbasicauth=admin:xxxx
> > -jar example/exampledocs/post.jar "M:\DNS_dump"
> >
> >
> > Regards,
> >
> > Gytis
>
