lucene-solr-user mailing list archives

From Kehan Harman <kehan.har...@gaiaresources.com.au>
Subject Re: Suggester autocomplete for address information
Date Tue, 26 Feb 2019 14:12:34 GMT
I'd like to clarify that what I am looking for is the right field type for
the address field that will suggest values as follows for the input:
Input:
"123 SM"
Suggestions:

   - 123-127 SMITH STREET, KEMPSEY NSW 2440
   - 123 SMYTHE STREET, RANDOM PLACE 9999


In addition, I want the search to return results when I enter only the
postcode (four digits here in Australia), as follows:

Input:
"2440"

Suggestions:

   - 123-127 SMITH STREET, KEMPSEY NSW 2440
   - 120 SMITH STREET, KEMPSEY NSW 2440
   - 65 SMITH STREET, KEMPSEY NSW 2440
   - 2440 ANOTHER RANDOM ROAD, RANDOM PLACE 9999


In short I would like it to try to match the beginning part of the address
first and if that fails start using later parts of the string such as
suburb, state and postcode.
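
Both examples above are consistent with a simple rule: every term in the input must be a prefix of some token in the address, wherever that token occurs. A minimal Python sketch of that rule follows, purely to make the desired behaviour concrete. It is not Solr code, the function names are hypothetical, and it makes no attempt to rank matches at the start of the address ahead of later ones.

```python
import re


def tokens(text):
    """Split text into lowercase alphanumeric tokens ("123-127" -> "123", "127")."""
    return re.findall(r"[a-z0-9]+", text.lower())


def suggest(query, addresses, limit=10):
    """Return addresses in which every query term is a prefix of some token.

    Matches are returned in input order; no ranking is applied.
    """
    query_terms = tokens(query)
    if not query_terms:
        return []
    matches = [
        address
        for address in addresses
        if all(
            any(token.startswith(term) for token in tokens(address))
            for term in query_terms
        )
    ]
    return matches[:limit]
```

Under this rule "123 SM" matches both "123-127 SMITH STREET" and "123 SMYTHE STREET", and "2440" matches an address whether 2440 is its postcode or its street number.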

The field type that I'm currently using as the basis of these suggestions
is as follows:


<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="lang/stopwords_en.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

Thanks,
Kehan


On Tue, 26 Feb 2019 at 21:54, Kehan Harman <kehan.harman@gaiaresources.com.au> wrote:

> Hi All,
>
> I'm new to Solr & the community so feel free to ignore / remove if this is
> the incorrect mailing list for this query.
>
> I'm trying to build an autocomplete using a Solr index for addresses in a
> format similar to:
>
> 123 Smith Street, KEMPSEY, NSW 2440
>
> I'm looking to have these addresses suggest values to users based on their
> input with some spellchecking capability.
>
> My documents contain contents like:
> {
>   "id": "ANSW718363409",
>   "table": "ADDRESS_DEFAULT_GEOCODE",
>   "address": "123-127 SMITH STREET, KEMPSEY NSW 2440",
>   "address_location": "-31.07321967,152.84505473",
>   "address_latitude": -31.07322,
>   "address_longitude": 152.84506,
>   "locality_pid": "NSW2119",
>   "locality_latitude": -31.060476,
>   "locality_longitude": 152.84819,
>   "suburb_postcode": "KEMPSEY NSW 2440",
>   "number_first": 123,
>   "number_last": 127,
>   "street_number": "123-127",
>   "street_name": "SMITH",
>   "street_type_code": "STREET",
>   "locality_name": "KEMPSEY",
>   "state_name": "NEW SOUTH WALES",
>   "state_abbreviation": "NSW",
>   "postcode": "2440",
>   "_version_": 1626515771141128204
> }
>
> These are Australian addresses extracted from
> https://data.gov.au/dataset/ds-dga-19432f89-dc3a-4ef3-b943-5326ef1dbecc/details
>
> My managed schema has the following fields. I'm using the example managed
> schema from *sample_techproducts_configs*, with some additional fields
> added using the Schema API:
>
> <field name="address" type="text_en" multiValued="false" indexed="true" stored="true"/>
> <field name="address_latitude" type="float" multiValued="false" indexed="true" stored="true"/>
> <field name="address_location" type="location" multiValued="false" indexed="true" stored="true"/>
> <field name="address_longitude" type="float" multiValued="false" indexed="true" stored="true"/>
> <field name="building_name" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="filename" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="flat_number" type="int" multiValued="false" indexed="true" stored="true"/>
> <field name="flat_type_code" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="foo" type="string" indexed="true" stored="true"/>
> <field name="id" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
> <field name="index_id" type="strings"/>
> <field name="level_number" type="int" multiValued="false" indexed="true" stored="true"/>
> <field name="locality_latitude" type="float" multiValued="false" indexed="true" stored="true"/>
> <field name="locality_location" type="location" multiValued="false" indexed="true" stored="true"/>
> <field name="locality_longitude" type="float" multiValued="false" indexed="true" stored="true"/>
> <field name="locality_name" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="locality_pid" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="number_first" type="int" multiValued="false" indexed="true" stored="true"/>
> <field name="number_first_suffix" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="number_last" type="int" multiValued="false" indexed="true" stored="true"/>
> <field name="number_last_suffix" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="postcode" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="state_abbreviation" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="state_name" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="street_name" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="street_number" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="street_type_code" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="suburb_postcode" type="text_en" multiValued="false" indexed="true" stored="true"/>
> <field name="table" type="string" multiValued="false" indexed="true" stored="true"/>
> <field name="type" type="string" multiValued="false" indexed="true" stored="true"/>
>
> The search component / requestHandler are defined as follows.
>
> <searchComponent name="suggest" class="solr.SuggestComponent">
>   <lst name="suggester">
>     <str name="name">suburb</str>
>     <str name="lookupImpl">FuzzyLookupFactory</str>
>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>     <str name="field">suburb_postcode</str>
>     <str name="suggestAnalyzerFieldType">string</str>
>     <str name="buildOnStartup">true</str>
>   </lst>
>   <lst name="suggester">
>     <str name="name">address</str>
>     <str name="lookupImpl">FuzzyLookupFactory</str>
>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>     <str name="field">address</str>
>     <str name="suggestAnalyzerFieldType">string</str>
>     <str name="buildOnStartup">true</str>
>   </lst>
> </searchComponent>
>
> <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
>   <lst name="defaults">
>     <str name="suggest">true</str>
>     <str name="suggest.count">10</str>
>   </lst>
>   <arr name="components">
>     <str>suggest</str>
>   </arr>
> </requestHandler>
>
> Please let me know if you need any more information in order to answer
> this.
> Thanks,
> Kehan
>
>
>

-- 
*------------------------------------*
Kehan Harman
Gaia Resources
p +61 8 92277309
m +61 406872510
w www.gaiaresources.com.au
e kehan.harman@gaiaresources.com.au
t @kehan <http://twitter.com/kehan>
g kehh <http://github.com/kehh>

I acknowledge the traditional custodians of the lands and waters where we
live and work, and pay my respects to elders past, present and future.
