lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: SOLR + Nutch set up (UNCLASSIFIED)
Date Wed, 03 Aug 2016 17:45:41 GMT
I’m pretty sure Nutch uses a batch crawler rather than an adaptive crawler like the one in Ultraseek.

I think we were the only people who built an adaptive crawler for enterprise use. I tried
to get Ultraseek open-sourced. I made the argument to Mike Lynch. He looked at me like I had
three heads and didn’t even answer me.

Ultraseek also has great support for sites that need login. If you rely on that, you’ll need
to find a way to handle logins with another crawler.

wunder
Walter Underwood
Former Ultraseek Principal Engineer
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Aug 3, 2016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US) <kris.t.musshorn.ctr@mail.mil> wrote:
> 
> CLASSIFICATION: UNCLASSIFIED
> 
> We are currently using Ultraseek and looking to deprecate it in favor of Solr/Nutch.
> Ultraseek runs all the time, auto-detects when pages have changed, and automatically
> reindexes them.
> Is this possible with Solr/Nutch?
> 
> Thanks,
> Kris
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~
> Kris T. Musshorn
> FileMaker Developer - Contractor - Catapult Technology Inc.      
> US Army Research Lab 
> Aberdeen Proving Ground 
> Application Management & Development Branch 
> 410-278-7251
> kris.t.musshorn.ctr@mail.mil
> ~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> 
> 
> CLASSIFICATION: UNCLASSIFIED

