-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73012/
-----------------------------------------------------------

Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath Subramanian, and Sidharth Mishra.


Bugs: Atlas-4019
    https://issues.apache.org/jira/browse/Atlas-4019


Repository: atlas


Description
-------

For any entity operation after the entity is created (e.g. adding a column or updating an attribute), the entire updated entity is saved in the HBase audits for each operation. Only the delta between the two entity states needs to be stored.

The update routine can be called for an individual entity or for a list of entities. An update request may contain a full entity object or a partial object (with only the updated attributes or relations). Updates may touch entity core attributes, relationship attributes, custom attributes, or associated classifications. In all of these cases, the entire updated entity is stored in the HBase audits today.

An improvement is required so that only the updated information is stored in the HBase audits instead of the entire entity object.

Solution explanation

All update requests go through the createOrUpdate routine of AtlasEntityStoreV2. In this routine, every entity submitted for update is compared with its stored copy to make sure the submitted entity contains valid changes. Entities without valid changes are discarded from the update list.

The solution is to modify this routine so that, while comparing the objects, it also copies each difference into a dummy differential entity, one per submitted entity. The differential entity holds only the information that has changed. Changed information can be regular attributes, essential attributes, relationship attributes, custom attributes, or classifications.
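
The comparison step described above can be illustrated with a minimal sketch. It assumes entity attributes are plain maps; the class name DifferentialEntitySketch and the method diffAttributes are hypothetical and are not taken from the Atlas code base. The actual patch works inside AtlasEntityStoreV2 and also covers relationship attributes, custom attributes, and classifications, which this sketch leaves out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; not the code in the patch.
public class DifferentialEntitySketch {

    /**
     * Returns only those attributes of 'updated' whose values differ from 'stored'.
     * Attributes missing from 'updated' are treated as untouched, which models a
     * partial update request. An empty result means there are no valid changes.
     */
    public static Map<String, Object> diffAttributes(Map<String, Object> stored,
                                                     Map<String, Object> updated) {
        Map<String, Object> diff = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : updated.entrySet()) {
            Object storedValue = stored.get(entry.getKey());
            if (!Objects.equals(storedValue, entry.getValue())) {
                diff.put(entry.getKey(), entry.getValue()); // keep the new value only
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "orders");
        stored.put("comment", "old comment");
        stored.put("owner", "etl");

        // Partial update: 'owner' is not sent, 'name' is sent unchanged.
        Map<String, Object> updated = new LinkedHashMap<>();
        updated.put("name", "orders");
        updated.put("comment", "new comment");

        // Only the changed attribute ('comment') ends up in the differential map.
        System.out.println(diffAttributes(stored, updated));
    }
}
```

An empty map returned by such a routine corresponds to the case where the submitted entity has no valid changes and is dropped from the update list.
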
Today, audit information is created from the updated entity, which carries all entity information, new and old. With this change, audits are generated from the differential entities and therefore contain only the information that was updated.

This functionality is behind an application flag. To enable it, set atlas.update.audits.regular=false


Diffs
-----

  intg/src/main/java/org/apache/atlas/ApplicationProperties.java e662c8fae
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java 32ad65e7a
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 7cf77ea04
  repository/src/test/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2Test.java c9f491296
  server-api/src/main/java/org/apache/atlas/RequestContext.java befd726ae


Diff: https://reviews.apache.org/r/73012/diff/1/


Testing
-------

Performance impact

Summary

Each number in the following tables is the average time, in milliseconds, over 50 events of the same type. For example, the first row of the first table shows that the average event processing time for 50 create events was 1862 milliseconds.

Baseline (no changes) ->

eventType      eventTime  preCreateOrUpdate  checkForUnchanged  createOrUpdate  graphCommit
-------------  ---------  -----------------  -----------------  --------------  -----------
create              1862                333                  0             905          947
create              1978                338                  0             984          986
create              1930                316                  0             960          964
create              1938                350                  0             977          955
create              1932                319                  0             944          980
-------------  ---------  -----------------  -----------------  --------------  -----------
UpdateComment        295                 85                 51             157          132
UpdateComment        338                128                 54             196          136
UpdateComment        245                 75                 52             140          100
UpdateComment        245                 90                 49             152           88
UpdateComment        253                 90                 50             153           94
-------------  ---------  -----------------  -----------------  --------------  -----------
UpdateColumns        762                132                 49             374          381
UpdateColumns        818                146                 63             425          388
UpdateColumns        768                133                 52             366          395
UpdateColumns        772                149                 50             392          374
UpdateColumns        764                136                 49             374          384
-------------  ---------  -----------------  -----------------  --------------  -----------

With changes, without flag -> no performance impact compared to the numbers above.

eventType      eventTime  preCreateOrUpdate  checkForUnchanged  createOrUpdate  graphCommit
-------------  ---------  -----------------  -----------------  --------------  -----------
create              1866                349                  0             911          944
create              1906                346                  0             955          942
create              1896                340                  0             943          944
create              1915                321                  0             931          976
create              1920                314                  0             927          985
-------------  ---------  -----------------  -----------------  --------------  -----------
UpdateComment        265                 90                 63             169           89
UpdateComment        250                 91                 53             158           85
UpdateComment        229                 83                 51             148           76
UpdateComment        241                 87                 48             149           86
UpdateComment        239                 78                 48             139           95
-------------  ---------  -----------------  -----------------  --------------  -----------
UpdateColumns        767                133                 50             384          376
UpdateColumns        728                145                 48             376          345
UpdateColumns        745                143                 56             384          354
UpdateColumns        740                134                 56             384          350
UpdateColumns        771                135                 47             365          401
-------------  ---------  -----------------  -----------------  --------------  -----------

With changes, with flag (atlas.update.audits.regular=false) ->

eventType      eventTime  preCreateOrUpdate  checkForUnchanged  createOrUpdate  graphCommit
-------------  ---------  -----------------  -----------------  --------------  -----------
create              1856                330                  0             884          960
create              1901                327                  0             935          958
create              1877                325                  0             937          933
create              1913                309                  0             931          975
create              1915                316                  0             932          976
-------------  ---------  -----------------  -----------------  --------------  -----------
UpdateComment        277                 81                 65             156          115
UpdateComment        250                 82                 63             154           90
UpdateComment        281                 89                 54             152          124
UpdateComment        241                 74                 59             143           93
UpdateComment        255                 77                 62             148          102
-------------  ---------  -----------------  -----------------  --------------  -----------
UpdateColumns        778                147                 64             393          378
UpdateColumns        734                128                 62             363          364
UpdateColumns        741                139                 63             383          352
UpdateColumns        711                130                 59             364          342
UpdateColumns        753                122                 58             355          392
-------------  ---------  -----------------  -----------------  --------------  -----------


Thanks,

Deep Singh