nutch-dev mailing list archives

From Doğacan Güney (JIRA) <j...@apache.org>
Subject [jira] Issue Comment Edited: (NUTCH-442) Integrate Solr/Nutch
Date Mon, 19 Nov 2007 21:27:43 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543695 ]

dogacan edited comment on NUTCH-442 at 11/19/07 1:27 PM:
---------------------------------------------------------------

Here is the latest patch. I can tell you that it compiles, but I don't know whether it runs :)

I don't have time to describe all the new things in the patch, but here is what my git-log
command tells me. (Edit: I will describe the changes properly later; I just don't
have the time right now :) )

- Ported to latest trunk
- Revert unnecessary changes to languageidentifier.
- Java 5 compatibility fixes.
    
    * Don't use @Override annotations on methods that implement an interface
      method (Java 5 rejects them).
    
    * Work around ExecutorService.invokeAll bug.
      http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6267833
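
One way such a workaround can look, as a minimal sketch rather than the code in the patch. It assumes the linked bug is the Java 5 signature problem, where invokeAll accepts only Collection<Callable<T>> and so rejects a collection of a Callable subtype. The class InvokeAllCompat and the method runAll are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical helper, not taken from the NUTCH-442 patch.
class InvokeAllCompat {

    // Accepts a collection of any Callable<T> subtype and submits the tasks
    // one by one instead of calling ExecutorService.invokeAll.
    static <T> List<T> runAll(ExecutorService pool,
                              Collection<? extends Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        List<Future<T>> futures = new ArrayList<Future<T>>(tasks.size());
        for (Callable<T> task : tasks) {
            futures.add(pool.submit(task));
        }
        // Like invokeAll, wait for every task before returning.
        List<T> results = new ArrayList<T>(futures.size());
        for (Future<T> future : futures) {
            results.add(future.get()); // blocks until this task has finished
        }
        return results;
    }
}
```

Unlike invokeAll, this sketch lets the first failing task's ExecutionException propagate and does not cancel the remaining tasks.
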
- Ugly hack for stringifying boolean queries.
    
    Adds a static stringify(BooleanQuery) method to SolrSearchBean. We can't just call
    BooleanQuery.toString() because it can produce strings like url:http://www.google.com,
    which is _not_ a valid Solr query string. stringify quotes the term text instead:
    url:"http://www.google.com".
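
The quoting idea can be sketched without any Lucene types. This is an illustration only, not the stringify method from the patch: the class SolrTermQuoting and the method quote are hypothetical, and the sketch handles a single field/text pair rather than a whole BooleanQuery.

```java
// Hypothetical illustration of the quoting step; not the patch's stringify().
class SolrTermQuoting {

    // Renders field:"text", escaping the two characters that are special
    // inside a quoted phrase: the double quote and the backslash.
    static String quote(String field, String text) {
        StringBuilder sb = new StringBuilder(field.length() + text.length() + 3);
        sb.append(field).append(':').append('"');
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

With the quotes in place, the colon and slashes in the URL are treated as term text rather than as query syntax.
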

- Update segment names periodically in FetchedSegments.
    
    Added a new monitoring thread to FetchedSegments that watches
    for changes in the given segments directory and updates the map
    'segments' accordingly. The map 'segments' is now a ConcurrentMap
    for thread safety.
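
A rough sketch of that pattern follows, under loud assumptions: it polls a local directory through java.io.File, whereas the real FetchedSegments presumably goes through Hadoop's FileSystem, and the class SegmentDirWatcher, its methods, and the polling interval are all hypothetical names chosen for this example.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the monitoring logic; not the patch's code.
class SegmentDirWatcher {
    private final File segmentsDir;

    // Segment name -> segment directory. Searching threads read this map
    // while the monitor thread updates it, hence the ConcurrentMap.
    final ConcurrentMap<String, File> segments =
        new ConcurrentHashMap<String, File>();

    SegmentDirWatcher(File segmentsDir) {
        this.segmentsDir = segmentsDir;
    }

    // One polling pass: register new segment directories and forget
    // the ones that have disappeared.
    void refresh() {
        File[] children = segmentsDir.listFiles();
        if (children == null) {
            return; // directory unreadable or missing: keep the current view
        }
        Set<String> seen = new HashSet<String>();
        for (File child : children) {
            if (child.isDirectory()) {
                seen.add(child.getName());
                segments.putIfAbsent(child.getName(), child);
            }
        }
        segments.keySet().retainAll(seen);
    }

    // Starts a daemon thread that calls refresh() until it is interrupted.
    Thread start(final long intervalMillis) {
        Thread monitor = new Thread(new Runnable() {
            public void run() {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        refresh();
                        Thread.sleep(intervalMillis);
                    }
                } catch (InterruptedException e) {
                    // Interrupted while sleeping: stop monitoring.
                }
            }
        }, "segment-monitor");
        monitor.setDaemon(true);
        monitor.start();
        return monitor;
    }
}
```

Keeping the polling pass in its own refresh() method makes it callable on demand, apart from the background thread.
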

- SolrResponseHandler updates.
    
    * Don't buffer characters in SolrResponseHandler unless necessary.
    
    * Use StringBuilder.setLength(0) instead of creating a new StringBuilder.
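
Both points can be shown in a small SAX handler. This is a sketch under assumptions, not SolrResponseHandler itself: the class StrCollector is hypothetical, and it assumes only the text of <str> elements is of interest.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler illustrating the two changes; not the patch's class.
class StrCollector extends DefaultHandler {
    final List<String> values = new ArrayList<String>();
    private final StringBuilder buf = new StringBuilder();
    private boolean buffering = false;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if ("str".equals(qName)) {
            buf.setLength(0); // reuse the builder instead of allocating a new one
            buffering = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (buffering) { // skip text outside the elements we care about
            buf.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("str".equals(qName)) {
            values.add(buf.toString());
            buffering = false;
        }
    }

    // Convenience entry point: parse an XML string and return the <str> texts.
    static List<String> parse(String xml) throws Exception {
        StrCollector handler = new StrCollector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

setLength(0) keeps the builder's already-allocated capacity, so repeated elements do not trigger fresh allocations.
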




  
> Integrate Solr/Nutch
> --------------------
>
>                 Key: NUTCH-442
>                 URL: https://issues.apache.org/jira/browse/NUTCH-442
>             Project: Nutch
>          Issue Type: New Feature
>         Environment: Ubuntu linux
>            Reporter: rubdabadub
>         Attachments: NUTCH-442_v4.patch, NUTCH_442_v3.patch, RFC_multiple_search_backends.patch, schema.xml
>
>
> Hi:
> After trying out Sami's patch for Solr/Nutch integration (http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html),
> I can confirm it worked :-) That led me to request the following:
> I would be very grateful if this could be included in Nutch 0.9, as I am trying
> to eliminate my Python-based crawler, which posts documents to Solr. As I am in a corporate
> environment, I can't install the trunk version in production, so I am asking for this
> to be included in the 0.9 release. I hope my wish will be granted.
> I look forward to getting some feedback.
> Thank you.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

