tika-dev mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: [DISCUSS] Centralizing JSON handling of Metadata
Date Thu, 29 May 2014 00:42:00 GMT
Thank you, Ray! 

In almost reverse order, I've been using Jackson for this already, but I used GSON in TIKA-1291
because that's what CLI was already using.  In GSON's favor, the jar is a bit smaller, but
I have no real preference or reason to pick one over the other.  I'm not a json-blackbelt
(or, I guess that would be blckbelt), so I'm happy to go with either.

A new compilation unit makes sense, though I'm wondering whether we want to be that specific.
Perhaps tika-serialization? Or maybe just tika-utils?

Package name looks good to me. 

Thanks, again!



-----Original Message-----
From: Ray Gauss II [mailto:ray.gauss@alfresco.com] 
Sent: Wednesday, May 28, 2014 3:07 PM
To: dev@tika.apache.org; Allison, Timothy B.
Subject: Re: [DISCUSS] Centralizing JSON handling of Metadata

Hi Tim,

1) Sounds good to me.

2) I do think we want core as lean as possible, so my vote would be for a separate project/module,
similar to what was done with tika-xmp.  Perhaps something like tika-serialization-json to
indicate that other formats may follow the same precedent?

3) Similar to above, perhaps org.apache.tika.metadata.serialization.json?

Just curious, any particular reason for GSON over Jackson?



On May 28, 2014 at 1:32:41 PM, Allison, Timothy B. (tallison@mitre.org) wrote:
> All,
> Nick recommended I put the question to the dev list for discussion. It might be useful
> to centralize our json handling of Metadata. We are currently using different libraries
> and doing different things in CLI and in tika-server.
> 1) Do we want to centralize json handling of Metadata?
> 2) If so, where? Core? I share Nick's hesitance to add a dependency to core. OTOH, GSON
> is only 186k, but this would add potential for jar conflicts with folks integrating Tika,
> and it doesn't feel like a core function to me...it is a handy decorator for applications.
> 3) Wherever it goes, what package do we want to put it in? I like Nick's recommendations,
> with a slight preference for the second (oat.utils.json).
> Thank you!
> Best,
> Tim
> -----Original Message-----
> From: Nick Burch (JIRA) [mailto:jira@apache.org]
> Sent: Wednesday, May 28, 2014 12:41 PM
> To: dev@tika.apache.org
> Subject: [jira] [Commented] (TIKA-1311) Centralize JSON handling of Metadata
> [ https://issues.apache.org/jira/browse/TIKA-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011287#comment-14011287
> ]
> Nick Burch commented on TIKA-1311:
> ----------------------------------
> If we put it into core, we'd need to add another dependency (on GSON), which isn't ideal,
> so we might want to run the plan past the dev list first to see what people think (core
> tries to have a very minimal set of deps, unlike the other modules).
> Package-wise, org.apache.tika.metadata.json is what I'd lean towards, otherwise
> utils.json.
> > Centralize JSON handling of Metadata
> > ------------------------------------
> >
> > Key: TIKA-1311
> > URL: https://issues.apache.org/jira/browse/TIKA-1311
> > Project: Tika
> > Issue Type: Task
> > Reporter: Tim Allison
> > Priority: Minor
> >
> > When json was initially added to TIKA CLI (TIKA-213), there was a recommendation to
> > centralize JSON handling of Metadata, potentially putting it in core. On a recent bug
> > fix (TIKA-1291), the same recommendation was repeated, especially noting that we now
> > handle JSON/Metadata differently in CLI and server.
> > Let's centralize JSON handling in core and use GSON. We should add a serializer and a
> > deserializer so that users don't have to reinvent that wheel.
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
