tika-dev mailing list archives

From Nick Lothian <nloth...@educationau.edu.au>
Subject Reading metadata without downloading entire file
Date Wed, 18 Feb 2009 06:22:48 GMT
I'm trying to get MP3 Metadata without downloading an entire MP3.

I've set up a FilterInputStream which throws an InterruptedIOException after a given amount
of a file has been downloaded.

If I point this at an HTML page it works - I can get the title from the metadata.

If I point it at an MP3 file, it doesn't give me any metadata at all (except the Metadata.RESOURCE_NAME_KEY,
which I set), even if I set the download limit to just less than the length of the file.
If I download the whole file, it works.

(JPGs don't seem to work either.)

Why is this so? My understanding was that Tika would work with streams?


Code:

                // Wrap the HTTP response body; the listener aborts the read once
                // more than 20000 bytes have been transferred.
                CountingInputStream stream = new CountingInputStream(
                                method.getResponseBodyAsStream(),
                                new CountingListener() {
                        public void transferred(long amount, InputStream theStream)
                                        throws InterruptedIOException {
                                if (amount > 20000L) {
                                        throw new InterruptedIOException();
                                }
                        }
                });


                Metadata metadata = new Metadata();
                metadata.set(Metadata.RESOURCE_NAME_KEY, address);
                try {
                        parser.parse(stream, getXmlContentHandler(), metadata);
                } catch (Exception e) {
                        e.printStackTrace();
                } finally {
                        System.out.println("size = " + stream.getTransferred());
                        stream.close();
                }
                System.out.println(Arrays.toString(metadata.names()));
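
For anyone who wants to reproduce this without my CountingInputStream and CountingListener classes, a minimal self-contained sketch of the same idea might look like the following. The class name LimitedInputStream, its limit parameter, and getTransferred() are hypothetical stand-ins, not the classes used in the snippet above, and skip() is deliberately not counted here.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

/**
 * Hypothetical helper: passes bytes through from the wrapped stream and
 * throws InterruptedIOException once more than `limit` bytes have been read.
 */
class LimitedInputStream extends FilterInputStream {
    private final long limit;   // maximum number of bytes allowed through
    private long transferred;   // bytes read so far

    LimitedInputStream(InputStream in, long limit) {
        super(in);
        this.limit = limit;
    }

    long getTransferred() {
        return transferred;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count(1);
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates to this method,
    // so overriding it covers both bulk-read variants.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count(n);
        }
        return n;
    }

    // Update the running total and abort once the limit is exceeded.
    private void count(long n) throws InterruptedIOException {
        transferred += n;
        if (transferred > limit) {
            throw new InterruptedIOException(
                    "read limit of " + limit + " bytes exceeded");
        }
    }
}
```

As in the snippet above, the exception is raised only after the read that crosses the limit, so the reported transfer size can be slightly larger than the limit.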


Regards
  Nick Lothian
