tinkerpop-dev mailing list archives

From "stephen mallette (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (TINKERPOP-2008) GraphML import / export causing duplicated edge IDs
Date Fri, 07 Sep 2018 18:33:00 GMT

     [ https://issues.apache.org/jira/browse/TINKERPOP-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stephen mallette closed TINKERPOP-2008.
---------------------------------------
    Resolution: Not A Bug

Sorry, I forgot to come back to this one. This isn't really a bug, just a configuration
problem with TinkerGraph. Because GraphML types can be lossy, you need to set the {{IdManager}}
for TinkerGraph so that it coerces identifiers to the expected type (especially when numbers
are involved), thus:

{code}
gremlin> conf = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@81ff872
gremlin> conf.setProperty("gremlin.tinkergraph.edgeIdManager", "LONG")
gremlin> graph = TinkerGraph.open(conf)
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(graphml()).readGraph("input.xml")
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:184 edges:183], standard]
gremlin> g.V().has('label', "EPSILON").as('epsilonV').inE().as('e1').outV().
......1>   where(__.outE().count().is(eq(2))).as('choiceV').
......2>   outE().where(neq('e1')).as('e2').inV().as('optionalV').
......3>   select('choiceV').inE().as('srcE').outV().as('srcV').
......4>   addE('optional').to('optionalV').as('newE').
......5>   sideEffect(select('srcE').properties().unfold().as('p').
......6>   select('newE').property(select('p').key(), select('p').value())).
......7>   union(select('srcE').drop(),
......8>   select('e1').drop(),
......9>   select('e2').drop(),
.....10>   select('epsilonV').drop(),
.....11>   select('choiceV').drop())
gremlin> graph
==>tinkergraph[vertices:116 edges:115]
gremlin> graph.io(IoCore.graphml()).writeGraph("output.xml")
gremlin> xxx = TinkerGraph.open(conf)
==>tinkergraph[vertices:0 edges:0]
gremlin> xxx.io(graphml()).readGraph("output.xml")
gremlin> xxx
==>tinkergraph[vertices:116 edges:115]
{code}

I think that Gryo worked because its types are non-lossy by nature.
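
As an illustration of what likely goes wrong without the coercion, here is a minimal plain-Java sketch. It assumes that GraphML hands identifiers to the graph as text and that newly added edges receive numeric identifiers. No TinkerPop code is involved: the names {{LossyIdSketch}}, {{loadFromGraphml}}, {{nextFreeId}} and {{duplicateIdsOnExport}} are hypothetical, and a map keyed by {{Object}} merely stands in for the edge store.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class LossyIdSketch {

    // Hypothetical stand-in for an edge store that accepts ids of any type.
    static Map<Object, String> loadFromGraphml(boolean coerceToLong) {
        Map<Object, String> edges = new LinkedHashMap<>();
        // Assumption: the file carries ids as text, so the reader sees "370", not 370L.
        for (String rawId : new String[] {"369", "370"}) {
            Object id = coerceToLong ? (Object) Long.valueOf(rawId) : rawId;
            edges.put(id, "loaded");
        }
        return edges;
    }

    // Hypothetical id generator: count up from 'start' until a Long key is unused.
    static Long nextFreeId(Map<Object, String> edges, long start) {
        long candidate = start;
        while (edges.containsKey(candidate)) { // the String "370" never equals the Long 370
            candidate++;
        }
        return candidate;
    }

    // Count ids that collide once every id is written back out as text.
    static int duplicateIdsOnExport(Map<Object, String> edges) {
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (Object id : edges.keySet()) {
            if (!seen.add(String.valueOf(id))) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        for (boolean coerce : new boolean[] {false, true}) {
            Map<Object, String> edges = loadFromGraphml(coerce);
            Long newId = nextFreeId(edges, 370L);
            edges.put(newId, "added by traversal");
            System.out.println("coerceToLong=" + coerce
                    + " newId=" + newId
                    + " duplicatesOnExport=" + duplicateIdsOnExport(edges));
        }
    }
}
```

Without coercion the text id "370" and the numeric id 370 coexist in memory and only collide when both are written out as text; with coercion the generator sees that 370 is taken and moves on to 371. That would also explain why the duplicates only showed up after a query that added edges.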

> GraphML import / export causing duplicated edge IDs
> ---------------------------------------------------
>
>                 Key: TINKERPOP-2008
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2008
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 3.3.3
>            Reporter: Svante Schubert
>            Priority: Major
>         Attachments: input_table_table.graphml, input_table_table.kryo, output__in_table-table_graphml.graphml, output__in_table-table_kyro.graphml
>
>
> I could only reproduce the problem of duplicated IDs in the graph output after a query
> on the Gremlin console if it was a larger/complex input file and only if both GraphML import
> and export were being used.
>  If I loaded the file as identical Gryo file the export was without duplication.
> I assume there are some thresholds/memory dumps where you might compare the RTM.
> There are 7 edges with duplicate IDs in the following lines of the attached file
>  'output__in_table-table_graphml.graphml':
>  Line 736: Duplicate unique value [370] declared for identity constraint "edge_id_unique" of element "graph".
>  Line 758: Duplicate unique value [374] declared for identity constraint "edge_id_unique" of element "graph".
>  Line 783: Duplicate unique value [378] declared for identity constraint "edge_id_unique" of element "graph".
>  Line 797: Duplicate unique value [382] declared for identity constraint "edge_id_unique" of element "graph".
>  Line 818: Duplicate unique value [386] declared for identity constraint "edge_id_unique" of element "graph".
>  Line 854: Duplicate unique value [390] declared for identity constraint "edge_id_unique" of element "graph".
>  Line 887: Duplicate unique value [400] declared for identity constraint "edge_id_unique" of element "graph".
> In addition, the problem is related to the query. If I load and save the Graph immediately
> no edge problem occurs.
> See mail thread: https://groups.google.com/forum/#!searchin/gremlin-users/svante%7Csort:date/gremlin-users/P8MdzzlFtng/vYqYlukJAgAJ
> {quote}
>          \,,,/
>          (o o)
> -----oOOo-(3)-oOOo-----
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = TinkerGraph.open()
> ==>tinkergraph[vertices:0 edges:0]
> gremlin> graph.io(IoCore.graphml()).readGraph("f:" + File.separator + "tmp" + File.separator + "table_table.graphml");
> ==>null
> gremlin> g = graph.traversal()
> ==>graphtraversalsource[tinkergraph[vertices:184 edges:183], standard]
> gremlin> g.V().has('label', "EPSILON").as('epsilonV').inE().as('e1').outV().
> ......1> where(__.outE().count().is(eq(2))).as('choiceV').
> ......2> outE().where(neq('e1')).as('e2').inV().as('optionalV').
> ......3> select('choiceV').inE().as('srcE').outV().as('srcV').
> ......4> addE('optional').to('optionalV').as('newE').
> ......5> sideEffect(select('srcE').properties().unfold().as('p').
> ......6> select('newE').property(select('p').key(), select('p').value())).
> ......7> union(select('srcE').drop(),
> ......8> select('e1').drop(),
> ......9> select('e2').drop(),
> .....10> select('epsilonV').drop(),
> .....11> select('choiceV').drop())
> gremlin> graph.io(IoCore.graphml()).writeGraph("f:" + File.separator + "tmp" + File.separator + "output__in_table-table_graphml.graphml");
> ==>null
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
