nutch-dev mailing list archives

From Doğacan Güney <doga...@gmail.com>
Subject Re: Why does TestNodeWalker keep failing?
Date Sat, 13 Jun 2009 08:26:41 GMT
On Fri, Jun 12, 2009 at 15:12, Andrzej Bialecki <ab@getopt.org> wrote:

> Doğacan Güney wrote:
>
>> Hi all,
>>
>> Does anyone know why TestNodeWalker keeps failing
>> for the last couple of days?
>>
>> I can reproduce the error on my computer; the test log looks like
>> this:
>>
>> Testsuite: org.apache.nutch.util.TestNodeWalker
>> Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 1.101 sec
>> ------------- Standard Error -----------------
>> java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
>>    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1241)
>>    at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
>>    at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
>>    at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
>>    at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
>>    at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
>>    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>>    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>>    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>>    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>>    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>>    at org.apache.nutch.util.TestNodeWalker.testSkipChildren(TestNodeWalker.java:63)
>>
>
> Hmm, error 503 is "Service Unavailable". Either this is a genuine problem
> at www.w3.org, or that site is not reachable from the machine that runs
> the tests. I believe we should do something similar to what we did for
> generating the web docs, i.e. use our own catalog or DTDs instead of
> downloading DTDs from the net.
>

The DTD is referenced like this (in TestNodeWalker.java):

private final static String WEBPAGE=
  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
// ... rest of the webpage

How can we make the test use a local copy of that DTD? Perhaps we should
just remove that line; I don't know if it does anything there.
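
One possible way to keep the DOCTYPE but stop the parser from going to
www.w3.org, as a minimal sketch and not a tested patch: install a SAX
EntityResolver that answers the DTD request locally. The class name
LocalDtdSketch, the method parseOffline and the page body below are made up
for illustration, and the sketch uses the JDK's DocumentBuilder rather than
the Xerces DOMParser that the test calls. It returns an empty stream in place
of the DTD, which only works as long as the page does not rely on entities
declared there (such as &nbsp;).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class LocalDtdSketch {
  // Same DOCTYPE as in TestNodeWalker.java; the page body is a made-up stand-in.
  static final String WEBPAGE =
      "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
      + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
      + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
      + "<head><title>test</title></head>"
      + "<body><p>hello</p></body></html>";

  // Parses the given XML without fetching any external entity over the network.
  static Document parseOffline(String xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // Answer every external entity request (including the DTD) with an
    // empty stream, so the parser never opens an HTTP connection.
    builder.setEntityResolver(
        (publicId, systemId) -> new InputSource(new StringReader("")));
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  public static void main(String[] args) throws Exception {
    Document doc = parseOffline(WEBPAGE);
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```

To use a real local copy instead of an empty stream, the resolver could
return an InputSource that reads a DTD file shipped with the test resources.
That file would have to be added to the source tree, along with the entity
files the XHTML DTD itself pulls in.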


>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>


-- 
Doğacan Güney
