lucene-solr-user mailing list archives

From "Koji Miyamoto" <moto.k...@gmail.com>
Subject Re: extending SolrIndexSearcher
Date Wed, 10 May 2006 01:08:18 GMT
I tried it with just Lucene + RMI, and that works just fine.  It's actually
based on the Lucene In Action e-book topic on how to use
ParallelMultiSearcher (chap.5).  The relevant code snippet follows:

/*
 * search server:
 * This is the code frag for the search server, which enters
 * a wait-loop to accept requests on port 1099.
 * This server implementation is run on 2+ separate boxes: one
 * is the "master" and the rest are "slaves". The master is
 * the main entry point, which searches its own local indexes
 * and sends requests to each slave; each slave searches only its
 * own local indexes and reports results back to the master.
 */

  //private Vector<Searchable> _searchables;
  //private Vector<String> _localDirs;
  // ...

  // add local dirs as searchables..
  for (int i=0; i<_localDirs.size(); i++) {
     System.out.println("local searchable: " + _localDirs.get(i) + " ..");
     _searchables.add(new IndexSearcher(_localDirs.get(i)));
  }

  // add remote nodes (slaves) as searchables..
  // note: only the master does this; the slaves look only at their
  // local indexes..
  if (_remoteNodes != null) {
     Collection nodes = _remoteNodes.values();
     Iterator it = nodes.iterator();
     String node = "";
     while (it.hasNext()) {
        node = (String) it.next();
        try {
           // remote nodes (slaves) also reachable via port 1099
           _searchables.add((Searchable) Naming.lookup(
                 "//" + node + ":1099/" + _DEFAULT_SVC_NAME_));
           System.out.println("remote searchable: " + node + " ..");
        } catch (java.rmi.ConnectException e) {
           System.err.println("ERROR: unable to connect to node=" + node + " ...");
        }
     }
  }

  // just some glue to prepare the list of searchables for the
  // ParallelMultiSearcher constructor..
  Searchable[] sch = new Searchable[_searchables.size()];
  for (int i=0; i<_searchables.size(); i++) {
     sch[i] = _searchables.get(i);
  }

  // start up server..
  System.setSecurityManager(new RMISecurityManager());
  LocateRegistry.createRegistry(_port);
  Searcher parallelSearcher = new ParallelMultiSearcher(sch);
  RemoteSearchable parallelImpl = new RemoteSearchable(parallelSearcher);
  Naming.rebind("//" + _nodeID + ":" + _port + "/" + _DEFAULT_SVC_NAME_,
        parallelImpl);
  System.out.println("SearchServer started " +
        "(nodeID=" + _nodeID +
        ", port=" + _port +
        ", role=" + ((_remoteNodes!=null)?"master":"slave") +
        ", # searchables=" + _searchables.size() + ")...");

  // enters wait state, ready to accept requests on port 1099...

========================

/*
 * search client
 * This basically does an RMI naming lookup to get a reference to
 * the master node on port 1099, then sends a search query..
 */

TermQuery query = new TermQuery(new Term("body", word));
MultiSearcher searcher = new MultiSearcher(new
             Searchable[]{_lookupRemote(_DEFAULT_SVC_NAME_)});

Hits hits = searcher.search(query);

Document doc = null;
for (int i=0; i<hits.length(); i++) {
  doc = hits.doc(i);
  // able to get hit info here...
}

// .....

private Searchable _lookupRemote(String svcName) throws Exception {
  return (Searchable) Naming.lookup(
        "//" + _host + ":" + _port + "/" + svcName);
}

========================

With both of the above pieces of code, I am able to start a server on box1
(master) and another server on box2 (slave), then invoke a client that
queries box1 and gets results from the indexes on both box1 and box2.  With
this working, I tried to incorporate ParallelMultiSearcher into Solr's
SolrIndexSearcher, since that is where Solr uses Lucene's IndexSearcher.  I
replaced the IndexSearcher with a ParallelMultiSearcher, initialized much
like the client code above.

From that, it seems that Solr itself needs to marshal and unmarshal the
searcher instance that SolrIndexSearcher holds, and because the
ParallelMultiSearcher is initialized with RMI stubs, those internal
marshal/unmarshal steps fail.  As mentioned in my first email, if I use
ParallelMultiSearcher to look only at local indexes (no RMI stubs), Solr
works just fine.  So I'm wondering whether there is a way to use
SolrIndexSearcher to search both local and remote indexes, even if not
through the RMI approach that the Lucene in Action e-book suggests with
the ParallelMultiSearcher class.
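
One detail in the trace quoted below may help narrow this down: the "$1"
suffix in org.apache.lucene.search.MultiSearcher$1 is how the Java compiler
names an anonymous inner class, and the message says the failure happened
while marshalling arguments.  The sketch below uses only the JDK (no Lucene
or Solr) to reproduce that kind of failure.  The names MarshalDemo,
Collector, and Results are hypothetical stand-ins and are not Lucene types;
ObjectOutputStream is used here because RMI relies on Java serialization to
marshal arguments.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class MarshalDemo {
    // Hypothetical callback type; it does NOT extend Serializable.
    interface Collector {
        void collect(int doc, float score);
    }

    // Hypothetical value type; it DOES implement Serializable.
    static class Results implements Serializable {
        private static final long serialVersionUID = 1L;
        final int totalHits;
        Results(int totalHits) { this.totalHits = totalHits; }
    }

    // Tries to serialize arg. Returns null on success, or the message of the
    // NotSerializableException (the offending class name) on failure.
    static String trySerialize(Object arg) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
        try {
            out.writeObject(arg);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    // Builds an anonymous inner class; the compiler names it with a
    // numeric suffix such as MarshalDemo$1.
    static Collector anonymousCollector() {
        return new Collector() {
            @Override
            public void collect(int doc, float score) {
                // no-op: only the type of this object matters for the demo
            }
        };
    }

    public static void main(String[] args) throws IOException {
        System.out.println("value object failed on: "
                + trySerialize(new Results(3)));
        System.out.println("anonymous callback failed on: "
                + trySerialize(anonymousCollector()));
    }
}
```

The Serializable value object goes through, while the anonymous callback is
rejected with a class name ending in "$1".  That would be consistent with
the stand-alone client working while the Solr path fails, if the two paths
pass different kinds of arguments over RMI.  This reading is an inference
from the trace, not something verified against the Lucene source.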

tia,
Koji



On 5/9/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
>
> I don't really know a lot about RMI, but as I understand it, Serialization
> is a core necessity -- if the arguments you want to pass to your Remote
> Method aren't serializable, then RMI can't pass those arguments across the
> wire.
>
> That said: it's not clear to me from the pseudocode/stacktrace you
> included *what* isn't serializable ... is it a Solr class or a core Lucene
> class?
>
> If it's a Lucene class, you may want to start by making a small proof
> of concept RMI app that just uses the Lucene core classes, once that
> works then try your changes in Solr.
>
>
> : Date: Tue, 9 May 2006 02:32:45 -0700
> : From: Koji Miyamoto <moto.koji@gmail.com>
> : Reply-To: solr-user@lucene.apache.org
> : To: solr-user@lucene.apache.org
> : Subject: extending SolrIndexSearcher
> :
> : Hi,
> :
> : I am looking at extending the source code for SolrIndexSearcher for my
> : own purposes.  Basically, I am trying to replace the use of Lucene's
> : IndexSearcher with a ParallelMultiSearcher version so that I can have a
> : query search both locally available indexes as well as remote indexes
> : available only via RMI.  This ParallelMultiSearcher is instantiated to
> : consist of both local and remote Searchable references.  The local
> : Searchables are simply IndexSearcher instances tied to local disk
> : (separate indexes), while the remote Searchables are made reachable via
> : RMI.
> :
> : In essence, where it used to be:
> :
> :   IndexSearcher searcher = new IndexSearcher(reader);
> :
> : it is now: (not the actual code but similar)
> :
> :   Searchable[] searchables = new Searchable[3];
> :   for (int i=0; i<2; i++) {
> :     // Local searchable:
> :     searchables[i] = new IndexSearcher("/disk" + i + "/index");
> :   }
> :
> :   // RMI searchable:  throws exception during search..
> :   searchables[2] = (Searchable) Naming.lookup(
> :       "//remote_host:1099/remote_svc");
> :
> :   ParallelMultiSearcher searcher = new ParallelMultiSearcher(searchables);
> :
> : When I build the source and use it (the short story, by replacing the
> : relevant class file(s) within solr.war used by the example jetty
> : implementation), it starts up just fine.  If I comment out the RMI
> : searchable line, submission of a search query to Jetty/Solr works just
> : fine, and it is able to search any number of indexes.  However, with the
> : RMI searchable uncommented, I get an exception thrown (here's the ending
> : of it):
> :
> : May 9, 2006 1:38:07 AM org.apache.solr.core.SolrException log
> : SEVERE: java.rmi.MarshalException: error marshalling arguments; nested exception is:
> :         java.io.NotSerializableException: org.apache.lucene.search.MultiSearcher$1
> :         at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
> :         at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
> :         at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:248)
> :         at org.apache.lucene.search.Searcher.search(Searcher.java:116)
> :         at org.apache.lucene.search.Searcher.search(Searcher.java:95)
> :         at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:794)
> :         at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:712)
> :         at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:605)
> :         at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:106)
> :
> : So it looks like it requires Serialization somehow to get it to work.
> : Wondering if anyone has any ideas to get around this problem.
> :
> : tia,
> : Koji
> :
>
>
>
> -Hoss
>
>
