tika-dev mailing list archives

From "Koutsoulis Philippe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1138) I got empty body and empty title with some documents
Date Mon, 24 Jun 2013 12:14:20 GMT
Koutsoulis Philippe created TIKA-1138:
-----------------------------------------

             Summary: I got empty body and empty title with some documents
                 Key: TIKA-1138
                 URL: https://issues.apache.org/jira/browse/TIKA-1138
             Project: Tika
          Issue Type: Bug
          Components: general
    Affects Versions: 1.3
         Environment: Windows 7 (my desktop)
            Reporter: Koutsoulis Philippe


*+Tested version:+* Apache Tika 1.3 (with the Apache Tika GUI)

Hi all,

I get an empty body and an empty title when parsing some documents.
Does anyone have an idea why?

*+Excerpt from the "Structured Text" output+*
{noformat}
<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
<head>
...
<title/>
</head>
<body/></html>
{noformat}
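
To check many files for this symptom without reading each "Structured Text" output by eye, the XHTML can be tested for a blank title and body. The following is only a minimal sketch using the JDK's built-in XML parser. The class and method names (EmptyExtractionCheck, hasEmptyTitleAndBody, isBlank) are hypothetical and not part of Tika, and the sample string is a reduced version of the excerpt above with the elided head content left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Tika: flags XHTML output whose
// <title> and <body> contain no text.
public class EmptyExtractionCheck {

    // True when the first element with this local name is missing
    // or contains only whitespace text.
    public static boolean isBlank(Document doc, String localName) {
        NodeList nodes = doc.getElementsByTagNameNS("*", localName);
        if (nodes.getLength() == 0) {
            return true;
        }
        return nodes.item(0).getTextContent().trim().isEmpty();
    }

    // Parses the XHTML string and reports whether both title and body are blank.
    public static boolean hasEmptyTitleAndBody(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // the output declares the XHTML namespace
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        return isBlank(doc, "title") && isBlank(doc, "body");
    }

    public static void main(String[] args) throws Exception {
        // Reduced sample (assumption): same shape as the reported output.
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title/></head><body/></html>";
        System.out.println(hasEmptyTitleAndBody(sample));
    }
}
```

A check like this only detects the symptom; it says nothing about why the parser produced no content for the files listed below.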

*+Files to reproduce+*
[http://www.justice.gouv.fr/art_pix/declaration_sexe_20091016.xls]
[http://ge.ch/ssco_gestats/excel/deinfo_par_ht2004.xls]
[http://homepage.swissonline.ch/ccvaf1/stock_divers/palmares_ccvaf.xls]
[http://top1000.anthologeek.net/participants.current.txt]
[http://ge.ch/ssco_gestats/excel/refona_par_ht2006.xls]
[http://www.rad.fr/solupro.xls]
[http://www.pfynschiessen.ch/TClassementgroupeinvite.xls]
[http://www.gregdonner.org/workbench/wb_31rev.txt]

(i) No errors in the logs :(

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
