-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73010/
-----------------------------------------------------------
Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Bugs: ATLAS-4015
https://issues.apache.org/jira/browse/ATLAS-4015
Repository: atlas
Description
-------
**Background**
Please see JIRA.
Until now, re-indexing in Atlas was implemented as an external tool. Using this tool posed a number
of challenges, the biggest being its throughput: for a medium-sized Atlas repository,
the tool could take days to finish.
This implementation addresses these problems. (See the volume testing results below.)
**Approach**
Re-indexing is now implemented as a JAVA_PATCH that is applied only when the property _atlas.patch.reindex.enabled_
is set to true.
*Modified* AtlasJanusGraphManagement: New method _reindex_ implements the re-indexing logic.
*New* _ReIndexPatch_ is a JAVA_PATCH that implements the re-indexing logic. It uses the PC
framework to enumerate vertices and edges. While the patch is being applied, it logs messages
indicating progress.
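The approach above can be illustrated with a minimal, self-contained sketch. It is not the code in this review: it assumes that "PC framework" means a producer-consumer style worker pool, and the names _ReIndexSketch_, _isEnabled_, and _reindexAll_ are hypothetical. Only the property name _atlas.patch.reindex.enabled_ comes from the description. The sketch shows the property gate, batched work handed to workers, and progress logging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical illustration only; not the ReIndexPatch class from this review.
public class ReIndexSketch {
    // Property name taken from the review description.
    static final String REINDEX_ENABLED = "atlas.patch.reindex.enabled";

    // The patch is applied only when the property is explicitly set to true.
    static boolean isEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(REINDEX_ENABLED, "false"));
    }

    // Producer side: split the element ids into batches and submit them.
    // Consumer side: each worker re-indexes one batch and logs progress.
    static long reindexAll(List<Long> ids, int workers, int batchSize,
                           Consumer<List<Long>> reindexBatch) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicLong done = new AtomicLong();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            List<Long> batch = new ArrayList<>(ids.subList(start, end));
            pool.submit(() -> {
                reindexBatch.accept(batch); // assumption: stands in for the real index update
                long total = done.addAndGet(batch.size());
                System.out.println("re-indexed " + total + " of " + ids.size());
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            throw new IllegalStateException("re-index did not finish in time");
        }
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Properties conf = new Properties();
        conf.setProperty(REINDEX_ENABLED, "true");
        if (!isEnabled(conf)) {
            return; // patch is skipped unless explicitly enabled
        }
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 1000; i++) {
            ids.add(i);
        }
        long count = reindexAll(ids, 4, 100, batch -> { /* no-op placeholder */ });
        System.out.println("done: " + count);
    }
}
```

Batching keeps the unit of work coarse enough that worker threads are not dominated by scheduling overhead, and the shared counter gives a simple running total for the progress log.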
**Configuration**
_atlas.patch.reindex.enabled=true_
Diffs
-----
graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraphManagement.java
f7d2e273c
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
2a2ef92a7
intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 1c7915859
repository/src/main/java/org/apache/atlas/repository/patches/AtlasPatchManager.java b142a2a4a
repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
c6f0e6438
repository/src/main/java/org/apache/atlas/repository/patches/ReIndexPatch.java PRE-CREATION
Diff: https://reviews.apache.org/r/73010/diff/1/
Testing
-------
**Test Setup**
Start with an Atlas setup that contains known data. Ascertain that basic search yields results.
Use these curl commands to delete the Solr indexes:
curl "http://<host>:8983/solr/vertex_index/update?commit=true" -H "Content-Type: text/xml" \
  --data-binary '<delete><query>b2d_t:*</query></delete>'
curl "http://<host>:8983/solr/edge_index/update?commit=true" -H "Content-Type: text/xml" \
  --data-binary '<delete><query>1151_t:*</query></delete>'
curl "http://ve0128.halxg.cloudera.com:8983/solr/fulltext_index/update?commit=true" -H "Content-Type: text/xml" \
  --data-binary '<delete><query>14at_t:*</query></delete>'
Once the Solr indexes are deleted, a basic search from the web UI will
not show any results.
Now set the configuration parameter _atlas.patch.reindex.enabled=true_ and restart Atlas.
Server-side logs will indicate that the patch has run.
**Volume Testing**
Vertices: ~16M, Duration: ~5 hrs.
Edges: ~122M, Duration: ~6 hrs.
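For comparison with other setups, the approximate figures above can be turned into rough per-second rates. This is plain arithmetic on the rounded numbers reported here, not an additional measurement; the class name _ReIndexThroughput_ is hypothetical.

```java
// Rough throughput derived from the approximate volume-testing figures.
public class ReIndexThroughput {
    // Elements processed per second, rounded to the nearest whole number.
    static long perSecond(long elements, double hours) {
        return Math.round(elements / (hours * 3600.0));
    }

    public static void main(String[] args) {
        // ~16M vertices in ~5 hrs
        System.out.println("vertices/sec ~ " + perSecond(16_000_000L, 5)); // prints 889
        // ~122M edges in ~6 hrs
        System.out.println("edges/sec ~ " + perSecond(122_000_000L, 6));   // prints 5648
    }
}
```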
**PC Build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/177/
Thanks,
Ashutosh Mestry