-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73012/
-----------------------------------------------------------
Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath Subramanian, and Sidharth
Mishra.
Bugs: Atlas-4019
https://issues.apache.org/jira/browse/Atlas-4019
Repository: atlas
Description
-------
For any entity operation after the entity is created (e.g. adding a column or updating an
attribute), the entire updated entity is saved in HBase audits for each operation. Only the
delta between the two entity states needs to be stored.
The update routine can be called for an individual entity or for a list of entities. An update
request may contain a full entity object or a partial object (with only the updated attributes
or relations). Updates can affect the entity's core attributes, relationship attributes, custom
attributes, or associated classifications. In all of these cases we currently store the entire
updated entity in HBase audits. An improvement is required so that only the updated information
is stored in HBase audits instead of the entire entity object.
Solution explanation
All update requests go through the createOrUpdate routine of AtlasEntityStoreV2.
In this routine we compare every entity submitted for update with its stored copy to make
sure the submitted entity contains valid changes. If no valid changes are found, the entity
is discarded from the update list.
The solution is to modify this routine so that, while comparing the objects, we also copy each
difference into a dummy differential entity, one per submitted entity. The differential entity
holds only the information that has changed. Changed information can be regular attributes,
essential attributes, relationship attributes, custom attributes, or classifications.
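The compare-and-copy idea can be sketched on plain attribute maps. This is only an illustration
under stated assumptions: EntityDiffSketch and its diff method are hypothetical names, not Atlas
classes, and a real entity also carries relationships and classifications that this sketch ignores.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Atlas: illustrates the "differential entity" idea.
public class EntityDiffSketch {

    /**
     * Returns only the attributes of 'submitted' whose values differ from 'stored'.
     * An empty result means the update carries no valid change and can be discarded.
     */
    public static Map<String, Object> diff(Map<String, Object> stored,
                                           Map<String, Object> submitted) {
        Map<String, Object> delta = new LinkedHashMap<>();
        // A partial update lists only some attributes, so iterate over the submitted side.
        for (Map.Entry<String, Object> e : submitted.entrySet()) {
            String key = e.getKey();
            if (!stored.containsKey(key) || !Objects.equals(stored.get(key), e.getValue())) {
                delta.put(key, e.getValue());
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "orders");
        stored.put("comment", "old comment");
        stored.put("owner", "etl");

        Map<String, Object> submitted = new LinkedHashMap<>();
        submitted.put("name", "orders");
        submitted.put("comment", "new comment");

        // Only "comment" differs, so only "comment" ends up in the delta.
        System.out.println(diff(stored, submitted));
    }
}
```

Attributes absent from a partial update are treated as untouched, not as removed, which matches the
partial-object case described above.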
Today, audit information is created from the updated entity, which carries all the entity
information, both new and old. With this change, audits are generated from the differential
entities and hence contain only the information that was updated.
The functionality is behind an application flag. To enable it, set atlas.update.audits.regular=false
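A minimal sketch of how such a flag could gate the audit source is shown below. It uses
java.util.Properties for self-containment, whereas Atlas reads settings through its own
ApplicationProperties class. AuditFlagSketch and useDifferentialAudits are hypothetical names, and
the default of "true" for an unset flag is an assumption, not something this patch states.

```java
import java.util.Properties;

// Hypothetical sketch; only the property name comes from this review request.
public class AuditFlagSketch {
    static final String REGULAR_AUDITS_KEY = "atlas.update.audits.regular";

    /** True when only the delta should be written to the audit store. */
    static boolean useDifferentialAudits(Properties conf) {
        // Assumption: an unset flag behaves as "true" (regular, full-entity audits).
        boolean regular = Boolean.parseBoolean(conf.getProperty(REGULAR_AUDITS_KEY, "true"));
        return !regular;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(useDifferentialAudits(conf)); // flag unset: regular audits

        conf.setProperty(REGULAR_AUDITS_KEY, "false");
        System.out.println(useDifferentialAudits(conf)); // flag set to false: differential audits
    }
}
```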
Diffs
-----
intg/src/main/java/org/apache/atlas/ApplicationProperties.java e662c8fae
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java
32ad65e7a
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
7cf77ea04
repository/src/test/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2Test.java
c9f491296
server-api/src/main/java/org/apache/atlas/RequestContext.java befd726ae
Diff: https://reviews.apache.org/r/73012/diff/1/
Testing
-------
Performance impact summary
Each number in the following tables is the average time, in milliseconds, over 50 events of the
same type. For example, the first row of the first table shows that the average event
processing time of 50 create events was 1862 milliseconds.
Baseline (no changes) ->
eventType      eventTime  preCreateOrUpdate  checkForUnchanged  createOrUpdate  graphCommit
---------      ---------  -----------------  -----------------  --------------  -----------
create         1862       333                0                  905             947
create         1978       338                0                  984             986
create         1930       316                0                  960             964
create         1938       350                0                  977             955
create         1932       319                0                  944             980
---------      ---------  -----------------  -----------------  --------------  -----------
UpdateComment  295        85                 51                 157             132
UpdateComment  338        128                54                 196             136
UpdateComment  245        75                 52                 140             100
UpdateComment  245        90                 49                 152             88
UpdateComment  253        90                 50                 153             94
---------      ---------  -----------------  -----------------  --------------  -----------
UpdateColumns  762        132                49                 374             381
UpdateColumns  818        146                63                 425             388
UpdateColumns  768        133                52                 366             395
UpdateColumns  772        149                50                 392             374
UpdateColumns  764        136                49                 374             384
---------      ---------  -----------------  -----------------  --------------  -----------
With changes, without flag -> No performance impact compared to the numbers above.
eventType      eventTime  preCreateOrUpdate  checkForUnchanged  createOrUpdate  graphCommit
---------      ---------  -----------------  -----------------  --------------  -----------
create         1866       349                0                  911             944
create         1906       346                0                  955             942
create         1896       340                0                  943             944
create         1915       321                0                  931             976
create         1920       314                0                  927             985
---------      ---------  -----------------  -----------------  --------------  -----------
UpdateComment  265        90                 63                 169             89
UpdateComment  250        91                 53                 158             85
UpdateComment  229        83                 51                 148             76
UpdateComment  241        87                 48                 149             86
UpdateComment  239        78                 48                 139             95
---------      ---------  -----------------  -----------------  --------------  -----------
UpdateColumns  767        133                50                 384             376
UpdateColumns  728        145                48                 376             345
UpdateColumns  745        143                56                 384             354
UpdateColumns  740        134                56                 384             350
UpdateColumns  771        135                47                 365             401
---------      ---------  -----------------  -----------------  --------------  -----------
With changes, with flag (atlas.update.audits.regular=false) ->
eventType      eventTime  preCreateOrUpdate  checkForUnchanged  createOrUpdate  graphCommit
---------      ---------  -----------------  -----------------  --------------  -----------
create         1856       330                0                  884             960
create         1901       327                0                  935             958
create         1877       325                0                  937             933
create         1913       309                0                  931             975
create         1915       316                0                  932             976
---------      ---------  -----------------  -----------------  --------------  -----------
UpdateComment  277        81                 65                 156             115
UpdateComment  250        82                 63                 154             90
UpdateComment  281        89                 54                 152             124
UpdateComment  241        74                 59                 143             93
UpdateComment  255        77                 62                 148             102
---------      ---------  -----------------  -----------------  --------------  -----------
UpdateColumns  778        147                64                 393             378
UpdateColumns  734        128                62                 363             364
UpdateColumns  741        139                63                 383             352
UpdateColumns  711        130                59                 364             342
UpdateColumns  753        122                58                 355             392
---------      ---------  -----------------  -----------------  --------------  -----------
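One way to compare the tables is to average a column over its five rows. The sketch below does
this for the checkForUnchanged column of the UpdateComment rows, with the baseline and with-flag
values copied from the tables above. TimingSummarySketch and mean are hypothetical names used
only for this illustration.

```java
import java.util.Arrays;

// Hypothetical helper for summarising the timing tables; not part of the patch.
public class TimingSummarySketch {

    /** Arithmetic mean of the given millisecond values (NaN for an empty input). */
    static double mean(int... ms) {
        return Arrays.stream(ms).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // checkForUnchanged, UpdateComment rows, copied from the tables.
        int[] baseline = {51, 54, 52, 49, 50};
        int[] withFlag = {65, 63, 54, 59, 62};

        System.out.printf("baseline=%.1f ms, withFlag=%.1f ms%n", mean(baseline), mean(withFlag));
    }
}
```

For these rows the means are 256 / 5 = 51.2 ms for the baseline and 303 / 5 = 60.6 ms with the
flag, a difference of under 10 ms in this step.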
Thanks,
Deep Singh