Storing the original would be an excellent idea and would be quite doable.
2009/12/14 Christian Kohlschütter <kohlschuetter@l3s.de>
> However, it would be great (in order to increase recall) to also store
> non-content and just add some kind of static boosting for content blocks
> over non-content blocks. I am not sure whether this will work right now
> using an Analyzer. What you could do, though, is to store the text in
> separate fields ("content"/"boilerplate") and add field-specific boosts at
> query time.
>
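For what it's worth, here is a rough sketch of the two-field idea in plain Python. It is not Lucene code; it only models the scoring behaviour. The boost values, the term-frequency scoring, and the names FIELD_BOOSTS, score and search are all assumptions made up for illustration. In Lucene's classic query syntax the equivalent query would presumably look something like content:foo^2.0 boilerplate:foo^0.5.

```python
from collections import Counter

# Hypothetical per-field boosts; the values are illustrative, not tuned.
FIELD_BOOSTS = {"content": 2.0, "boilerplate": 0.5}


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def score(doc, query, boosts=FIELD_BOOSTS):
    """Sum boosted term frequencies across the given fields.

    This is a toy replacement for real relevance ranking; it only shows
    how a query-time boost weights one field over another.
    """
    terms = tokenize(query)
    total = 0.0
    for field, boost in boosts.items():
        counts = Counter(tokenize(doc.get(field, "")))
        total += boost * sum(counts[t] for t in terms)
    return total


def search(docs, query, boosts=FIELD_BOOSTS):
    """Return names of matching documents, best score first."""
    scored = [(score(doc, query, boosts), name) for name, doc in docs.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for s, name in ranked if s > 0]


# Each document keeps extracted content and boilerplate in separate fields.
docs = {
    "a": {"content": "lucene analyzer tutorial", "boilerplate": "home contact"},
    "b": {"content": "weather report", "boilerplate": "lucene analyzer links"},
    "c": {"content": "weather report", "boilerplate": "home contact"},
}

if __name__ == "__main__":
    # Searching both fields finds "b" as well, but ranks it below "a".
    print(search(docs, "lucene analyzer"))
    # Searching content only misses "b" entirely (lower recall).
    print(search(docs, "lucene analyzer", {"content": 1.0}))
```

The point of the sketch is that a hit in the boilerplate field still counts, so recall goes up, while the lower boost keeps such hits ranked below hits in the content field.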
--
Ted Dunning, CTO
DeepDyve