incubator-droids-dev mailing list archives

From Florent André <florent.andre-...@4sengines.com>
Subject Need 1 :
Date Mon, 13 Jul 2009 14:49:12 GMT
Hi Droids list !

After a talk with Senior Thorsten (Olé! :) ) during the Lenya meeting, I
would like to have more information about Droids.

I know that Droids is not only a web crawler (and I would like to use it
for other things), but my immediate need is about crawling...

So let's go : 

I would like to pass Droids an XML document like this (just an example):
<article>
  <droids:url>http://example.com/test.html</droids:url>
  <title>
    <droids:xpath>html/body/div[@id='content']/div[@id='title']/h1</droids:xpath>
  </title>
  <firstparagraph>
    <droids:xpath>html/body/div[@id='content']/div[@id='article']/p[position()=1]</droids:xpath>
  </firstparagraph>
  <othertext>
    <droids:xpath>html/body/div[@id='content']/div[@id='article']/p[position()>1]</droids:xpath>
  </othertext>
</article>

and have Droids give me something like:
<article>
  <title> this is the article title </title>
  <firstparagraph> This article is about the....</firstparagraph>
  <othertext>bla bla bla bla bla...</othertext>
</article>
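
To make the requested mapping concrete, here is a minimal sketch of it using only the XML classes that ship with the JDK. It is not Droids code and says nothing about how Droids would do this. The class name TemplateFiller and the method fill are hypothetical, the page is assumed to be already fetched and converted to well-formed XML (so the droids:url element is simply dropped), and joining several matches with a space is an arbitrary choice.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Droids.
final class TemplateFiller {

    // Parse without namespace awareness so the undeclared "droids:" prefix is accepted.
    private static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Replace each <droids:xpath> placeholder with the text it selects in the page.
    static String fill(String templateXml, String pageXml) throws Exception {
        Document template = parse(templateXml);
        Document page = parse(pageXml);
        XPath xpath = XPathFactory.newInstance().newXPath();

        // getElementsByTagName returns a live list, so copy it before modifying the tree.
        NodeList live = template.getElementsByTagName("droids:xpath");
        List<Element> placeholders = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            placeholders.add((Element) live.item(i));
        }

        for (Element placeholder : placeholders) {
            String expression = placeholder.getTextContent().trim();
            NodeList hits = (NodeList) xpath.evaluate(expression, page, XPathConstants.NODESET);
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < hits.getLength(); i++) {
                if (i > 0) {
                    text.append(' '); // assumption: join multiple matches with a space
                }
                text.append(hits.item(i).getTextContent().trim());
            }
            // Overwrite the parent's content (placeholder and whitespace) with the text.
            placeholder.getParentNode().setTextContent(text.toString());
        }

        // The URL is only an instruction for the fetch step, so remove it from the result.
        NodeList urls = template.getElementsByTagName("droids:url");
        while (urls.getLength() > 0) {
            Node url = urls.item(0);
            url.getParentNode().removeChild(url);
        }

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(template), new StreamResult(out));
        return out.toString();
    }
}
```

Calling TemplateFiller.fill(template, page) with the template above and a well-formed copy of the page returns the filled-in article as a string. Real-world HTML is rarely well-formed XML, so a cleanup step would be needed before the XPath expressions can be evaluated this way.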

So my questions are : 

1) Is it possible?

2) If yes, will I have to (bear in mind that I'm not a Java superstar):
   a) install Droids, type 2 command lines, and let's go (1 hour of work)
   b) install Droids, really understand how Droids works, and code
some classes (3 weeks of work)
   c) install Droids, create a class from an existing one, with some trial
and error (4-5 days of work)
   d) ...

3) Is it difficult to plug Droids into a Lenya (Cocoon-based) app?

Thanks for your answers,

Regards
