Subject: Re: Review Request 61667: ATLAS-2044: In-memory filtering after index query
From: Apoorv Naik
To: Ashutosh Mestry, Madhan Neethiraj, Sarath Subramanian
Cc: Apoorv Naik, atlas
Date: Thu, 17 Aug 2017 01:19:48 -0000

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61667/
-----------------------------------------------------------

(Updated Aug. 17, 2017, 1:19 a.m.)


Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, and Sarath Subramanian.


Changes
-------

Added predicates per type.

Bugs: ATLAS-2044
    https://issues.apache.org/jira/browse/ATLAS-2044


Repository: atlas


Description
-------

In-memory filtering is needed to weed out false positive results from the index query. False positive matches happen when the search string is present "as is" immediately after a special token that the indexer (Solr, Elasticsearch) uses to tokenize the string, both during indexing and during querying.

For example, with name = /a/b/c, each of the search strings "a", "b", and "c" will match this name every time under an equality, startsWith, or endsWith check, because each one is present immediately after the special token.

The patch adds an extra level of filtering on the result set obtained from the index query.


Diffs (updated)
-----

  repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java 74197ca8
  repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 9cd83fb4
  repository/src/main/java/org/apache/atlas/discovery/FullTextSearchProcessor.java d556bf1a
  repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java b209ecb4
  repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java PRE-CREATION


Diff: https://reviews.apache.org/r/61667/diff/5/

Changes: https://reviews.apache.org/r/61667/diff/4-5/


Testing
-------

REST and UI testing no longer show the false positive matches.

mvn clean package -Pdist (in progress)


Thanks,

Apoorv Naik
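
Illustration of the false positive case from the description. This is a minimal sketch and not the code in the patch: the class name InMemoryFilterSketch and the methods tokenize, indexQuery, filterInMemory, equalsTo, and startsWith are all hypothetical, and the tokenizer rule (split on every non-alphanumeric character) is an assumption standing in for whatever the real indexer does. It shows how a token-level match on "a" returns /a/b/c, and how a second pass over the stored values with an exact predicate drops it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names are taken from the Atlas patch.
class InMemoryFilterSketch {

    // Assumption: the indexer splits a value on every non-alphanumeric character.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String token : value.split("[^A-Za-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Simulated index query: a value is a hit if any of its tokens equals the term.
    static List<String> indexQuery(List<String> values, String term) {
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (tokenize(value).contains(term)) {
                hits.add(value);
            }
        }
        return hits;
    }

    // Second pass: keep only the hits whose full stored value satisfies the predicate.
    static List<String> filterInMemory(List<String> hits, Predicate<String> predicate) {
        List<String> kept = new ArrayList<>();
        for (String hit : hits) {
            if (predicate.test(hit)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    // Exact comparison against the whole value, not against its tokens.
    static Predicate<String> equalsTo(String expected) {
        return value -> value.equals(expected);
    }

    // Prefix comparison against the whole value.
    static Predicate<String> startsWith(String prefix) {
        return value -> value.startsWith(prefix);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("/a/b/c", "a", "abc");
        List<String> hits = indexQuery(names, "a");
        System.out.println("index hits: " + hits);
        System.out.println("after equality filter: " + filterInMemory(hits, equalsTo("a")));
    }
}
```

The predicates run against the full stored value, so the extra pass only removes results and never adds any. Its cost is proportional to the size of the index result set.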