incubator-droids-dev mailing list archives

From Chapuis Bertil <bchap...@agimem.com>
Subject Re: Question re: Ajax processing
Date Mon, 11 Apr 2011 21:12:46 GMT
You pointed out the thread-safety problem. A good starting point may be to
initialize a separate HtmlUnit WebClient for each Worker instance. However,
I'm not able to estimate how much work that would require.
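
As a rough illustration of the one-client-per-worker idea, here is a minimal
sketch using only the JDK. It does not use the Droids or HtmlUnit APIs:
FakeBrowserClient is a hypothetical stand-in for a client that is not
thread-safe (the role HtmlUnit's WebClient would play), and
PerWorkerClientDemo and crawl are made-up names. It assumes each worker runs
on its own pool thread, so a ThreadLocal is enough to give every worker a
private client.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerWorkerClientDemo {

    /** Hypothetical stand-in for a client that is NOT thread-safe. */
    static final class FakeBrowserClient {
        // Deliberately unsynchronized: safe only if a single thread uses it.
        private int pagesFetched = 0;

        String fetch(String url) {
            pagesFetched++;
            return "<html><!-- rendered " + url + " --></html>";
        }
    }

    /** Each worker thread lazily creates its own client and then reuses it. */
    private static final ThreadLocal<FakeBrowserClient> CLIENT =
            ThreadLocal.withInitial(FakeBrowserClient::new);

    /** Fetches the URLs on a fixed pool; returns how many distinct clients were used. */
    static int crawl(int workers, String... urls) {
        Set<FakeBrowserClient> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (String url : urls) {
            pool.submit(() -> {
                FakeBrowserClient client = CLIENT.get(); // private to this thread
                client.fetch(url);
                seen.add(client); // identity-based: no equals/hashCode override
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int clients = crawl(3, "http://example.org/a", "http://example.org/b",
                "http://example.org/c", "http://example.org/d");
        System.out.println("distinct clients used: " + clients);
    }
}
```

The number of clients never exceeds the number of worker threads, and no
client is shared between threads. With a real browser-like client, each
worker would also need to release its client when it finishes.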

On 11 April 2011 22:57, Tony Dietrich <tony@dietrich.org.uk> wrote:

> Thanks Chapuis, I know about HtmlUnit.
>
> See my reply to Fuad.
>
> However, I have no idea how to integrate this into Droids. Help?
>
> Tony
>
> -----Original Message-----
> From: Chapuis Bertil [mailto:bchapuis@agimem.com]
> Sent: 11 April 2011 21:52
> To: droids-dev@incubator.apache.org
> Subject: Re: Question re: Ajax processing
>
> Yes, you are right: the HttpClient does not interpret JavaScript, and
> Droids provides no support for such a use case. However, this can probably
> be achieved by using another client, such as the one provided by HtmlUnit,
> which can retrieve information from web sites and works with most
> JavaScript libraries.
>
> http://htmlunit.sourceforge.net/javascript-howto.html
>
> On 11 April 2011 22:15, Tony Dietrich <tony@dietrich.org.uk> wrote:
>
> > OK, I'm getting ready to make use of Droids in an application for my
> > company, BUT:
> >
> > As far as I can tell, the current Droids Http Client implementation does
> > not return a fully populated w3c document if the remote page uses
> > Ajax/JavaScript to synchronously populate the document.
> >
> > Correct/Not correct?
> >
> > (If I'm correct, this is a show-stopper for me.)
> >
> > If I'm wrong, can someone point me in the right way to ensure that a
> > remote crawl of a website will indeed return a fully populated document
> > whether or not the site uses Ajax/JavaScript to populate elements within
> > the page after load?
> >
> > Tony Dietrich
> >
>
>
> --
> Bertil Chapuis
> Agimem Sàrl
> http://www.agimem.com
>
>


-- 
Bertil Chapuis
Agimem Sàrl
http://www.agimem.com
