lucy-user mailing list archives

From Marvin Humphrey
Subject Re: [lucy-user] Subclassing Lucy::Analysis::Analyzer
Date Mon, 02 Jul 2012 15:22:22 GMT
On Mon, Jul 2, 2012 at 6:31 AM, Martin Hebnes Pedersen wrote:
> So far I've overridden 'new' and 'transform' and indexing seems to work.
> The problem is that I need some parameters like minLength, but they are not
> in schema_x.json and My::Analysis::EdgeNGramTokenizer's constructor is never
> called when I load the IndexSearcher.
> I have tried to override dump and load, but that only results in segfaults.

I've investigated a bit, and it seems that while overriding dump() works,
overriding load() fails.

Something like this ought to work (but doesn't):

    use Carp qw( confess );

    # Inside-out member variable, keyed by the object's address.
    our %min_length;

    sub new {
        my ($class, %args) = @_;
        my $min_length = delete $args{min_length}
            or confess "Missing required arg 'min_length'";
        my $self = $class->SUPER::new(%args);
        $min_length{$$self} = $min_length;
        return $self;
    }

    # Include min_length in the dumped representation.
    sub dump {
        my $self = shift;
        my $dump = $self->SUPER::dump;
        $dump->{min_length} = $min_length{$$self};
        return $dump;
    }

    # Restore min_length on the object rebuilt from the dump.
    sub load {
        my ($self, $dump) = @_;
        $self = $self->SUPER::load($dump);
        $min_length{$$self} = $dump->{min_length};
        return $self;
    }

I'll look deeper and get back to you.

> Is there some other hook I should override to initialize the analyzer?
> So, we are looking for a solution that lets us implement the analyzers in the
> host language (Perl). Is this currently possible?

Subclassing Analyzer used to be supported, but the public subclassing API was
redacted in anticipation of a reworking of Analyzer's guts.  So, what you are
attempting is officially unsupported and rests on the current implementation
details.  That said, no one is actively working on the overhaul and things
have not changed much recently.

Marvin Humphrey
