tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-1106) CLAVIN Integration
Date Sun, 21 May 2017 15:40:13 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-1106:
------------------------------------
    Fix Version/s:     (was: 1.15)
                   1.16

> CLAVIN Integration
> ------------------
>
>                 Key: TIKA-1106
>                 URL: https://issues.apache.org/jira/browse/TIKA-1106
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.3
>         Environment: All
>            Reporter: Adam Estrada
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>              Labels: entity, geospatial, new-parser
>             Fix For: 1.16
>
>
> I've been evaluating CLAVIN as a way to extract location information from unstructured
> text. It seems like meshing it with Tika in some way would make a lot of sense. From the
> CLAVIN website:
> {quote}
> CLAVIN (*Cartographic Location And Vicinity INdexer*) is an open source software package
> for document geotagging and geoparsing that employs context-based geographic entity resolution.
> It combines a variety of open source tools with natural language processing techniques to
> extract location names from unstructured text documents and resolve them against gazetteer
> records. Importantly, CLAVIN does not simply "look up" location names; rather, it uses intelligent
> heuristics in an attempt to identify precisely which "Springfield" (for example) was intended
> by the author, based on the context of the document. CLAVIN also employs fuzzy search to handle
> incorrectly-spelled location names, and it recognizes alternative names (e.g., "Ivory Coast"
> and "Côte d'Ivoire") as referring to the same geographic entity. By enriching text documents
> with structured geo data, CLAVIN enables hierarchical geospatial search and advanced geospatial
> analytics on unstructured data.
> {quote}
> There was only one other mention of the word "clavin" on the ASF JIRA site,
> so I thought it was definitely worth posting here.
> https://github.com/Berico-Technologies/CLAVIN



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
