atlas-dev mailing list archives

From Ashutosh Mestry via Review Board <>
Subject Review Request 73010: Re-Indexing Implemented as JAVA_PATCH
Date Mon, 09 Nov 2020 21:45:45 GMT

This is an automatically generated e-mail. To reply, visit:

Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.

Bugs: ATLAS-4015

Repository: atlas


Please see JIRA.
Until now, re-indexing within Atlas was implemented as an external tool. Using this tool posed a number
of challenges, the biggest being its throughput: for a medium-sized Atlas repository,
the tool could take days to finish.

The new implementation addresses these problems (see the results below).

Re-indexing is now implemented as a JAVA_PATCH that is applied only when the property _atlas.patch.reindex.enabled_
is set to true.

*Modified* AtlasJanusGraphManagement: New method _reindex_ implements the re-indexing logic.
*New* _ReIndexPatch_ is a JAVA_PATCH that implements the re-indexing logic. This uses the PC
framework to enumerate vertices and edges. The patch application displays useful log messages
indicating progress.
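The enumerate-and-report pattern described above can be sketched roughly as follows. This is not the Atlas code: it assumes "PC" stands for a producer-consumer arrangement, and the names `reindex_all`, `reindex_batch`, `batch_size` and `log_every` are hypothetical, chosen only for illustration. One producer batches element ids onto a bounded queue, worker threads consume the batches, and a progress line is printed at intervals.

```python
import queue
import threading


def reindex_all(element_ids, reindex_batch, num_workers=4,
                batch_size=1000, log_every=10_000):
    """Hypothetical sketch: fan element ids out to worker threads in batches.

    reindex_batch is a caller-supplied function that re-indexes one batch.
    Returns (number of elements processed, list of (batch, exception)).
    """
    work = queue.Queue(maxsize=num_workers * 2)  # bounded, so the producer cannot run far ahead
    lock = threading.Lock()
    state = {"done": 0}
    failures = []

    def consumer():
        while True:
            batch = work.get()
            if batch is None:          # sentinel: no more work
                return
            try:
                reindex_batch(batch)
            except Exception as exc:   # record the failure and keep draining the queue
                with lock:
                    failures.append((batch, exc))
                continue
            with lock:
                before = state["done"]
                state["done"] = before + len(batch)
                # Log each time a multiple of log_every is crossed.
                if state["done"] // log_every > before // log_every:
                    print(f"re-indexed {state['done']} elements")

    workers = [threading.Thread(target=consumer) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Producer: enumerate ids and hand them over in fixed-size batches.
    batch = []
    for element_id in element_ids:
        batch.append(element_id)
        if len(batch) == batch_size:
            work.put(batch)
            batch = []
    if batch:
        work.put(batch)

    for _ in workers:                  # one sentinel per worker
        work.put(None)
    for w in workers:
        w.join()
    return state["done"], failures
```

In a sketch like this the same routine would be called once for vertices and once for edges, with `reindex_batch` standing in for whatever the graph management layer actually does per batch.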



  intg/src/main/java/org/apache/atlas/ 1c7915859 
  repository/src/main/java/org/apache/atlas/repository/patches/ b142a2a4a

  repository/src/main/java/org/apache/atlas/repository/patches/ PRE-CREATION



**Test Setup**
Start with a known Atlas setup with known data. Ascertain that basic search yields results.

Use these curl commands to delete the Solr indexes:

curl "http://<host>:8983/solr/vertex_index/update?commit=true" -H "Content-Type: text/xml" \
  --data-binary '<delete><query>b2d_t:*</query></delete>'

curl "http://<host>:8983/solr/edge_index/update?commit=true" -H "Content-Type: text/xml" \
  --data-binary '<delete><query>1151_t:*</query></delete>'

curl  -H "Content-Type: text/xml" --data-binary '<delete><query>14at_t:*</query></delete>'

This will delete the Solr indexes. A basic search performed from the web UI will then
show no results.
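For anyone scripting this test setup, the delete-by-query calls can be built in Python as well. This is a minimal sketch that only covers the two commands whose URLs are given above. The helper name `solr_delete_request` and the host `localhost` are assumptions for illustration; the collection names, port and field names are taken from the curl commands. The requests are only constructed and printed here, not sent, since sending them is destructive.

```python
from urllib.parse import quote
from urllib.request import Request
from xml.sax.saxutils import escape


def solr_delete_request(host, collection, query, port=8983):
    """Build (but do not send) a Solr delete-by-query POST request."""
    url = f"http://{host}:{port}/solr/{quote(collection)}/update?commit=true"
    # Escape the query so characters such as & or < cannot break the XML body.
    body = f"<delete><query>{escape(query)}</query></delete>".encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "text/xml"}, method="POST")


# "localhost" is a placeholder for <host>.
delete_requests = [
    solr_delete_request("localhost", "vertex_index", "b2d_t:*"),
    solr_delete_request("localhost", "edge_index", "1151_t:*"),
]

for req in delete_requests:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually send (this wipes the index contents):
    #   urllib.request.urlopen(req)
```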

Now set the configuration parameter _atlas.patch.reindex.enabled_ to true and restart Atlas.

Server-side logs will indicate that the patch has run.

**Volume Testing**
Vertices: ~16M, Duration: ~5 hrs.
Edges: ~122M, Duration: ~6 hrs.
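
To put these figures in perspective, the implied average throughput can be worked out as below. This is a rough calculation only: the inputs are the approximate numbers reported above, and it assumes each duration applies to its own element type.

```python
def per_second(count, hours):
    """Average elements processed per second over the given duration."""
    return count / (hours * 3600)


vertex_rate = per_second(16_000_000, 5)    # ~16M vertices in ~5 hrs
edge_rate = per_second(122_000_000, 6)     # ~122M edges in ~6 hrs

print(f"vertices: ~{round(vertex_rate)} per second")  # ~889
print(f"edges:    ~{round(edge_rate)} per second")    # ~5648
```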

**PC Build**


Ashutosh Mestry
