nutch-dev mailing list archives

From Jack Tang <him...@gmail.com>
Subject Index more...
Date Mon, 06 Jun 2005 01:41:36 GMT
Hi Guys

I want more fields from the HTML header to be indexed or stored.
Take the example below: 'breadcrumb' should be stored without being indexed,
while 'keywords' should be indexed.
<html>
<head>
.....
<meta name="breadcrumb" content="home >> introduction">
<meta name="keywords" content="">
</head>
......

</html>

I read through the index-more plugin, but it seems I cannot get the
<meta ...> values at all. That information is only available in "content".
Is something missing from the 'metaData' in this plugin?
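
To make the goal concrete, here is a minimal sketch of the behaviour I am after, written in plain Python rather than against the Nutch plugin API. The names FIELD_POLICY, MetaTagCollector and extract_fields are made up for illustration, and the "stored" flag for 'keywords' is an assumption, since I only care that it gets indexed.

```python
from html.parser import HTMLParser

# Hypothetical per-field policy (not a Nutch or Lucene API).
# 'breadcrumb': stored but not indexed; 'keywords': indexed.
# Assumption: 'keywords' does not need to be stored.
FIELD_POLICY = {
    "breadcrumb": {"stored": True, "indexed": False},
    "keywords": {"stored": False, "indexed": True},
}


class MetaTagCollector(HTMLParser):
    """Collects <meta name=... content=...> pairs named in FIELD_POLICY."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        if name in FIELD_POLICY:
            # A missing content attribute is treated as an empty string.
            self.meta[name] = attr_map.get("content") or ""


def extract_fields(html):
    """Return one dict per wanted meta tag, with its stored/indexed flags."""
    collector = MetaTagCollector()
    collector.feed(html)
    return [
        {"name": name, "value": value, **FIELD_POLICY[name]}
        for name, value in collector.meta.items()
    ]
```

Each returned entry carries the meta value together with its flags, which is roughly the information an indexing step would need to decide how to add the field to a document.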

/Jack
