spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-17647) SQL LIKE does not handle backslashes correctly
Date Mon, 26 Sep 2016 17:06:20 GMT

    [ https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523623#comment-15523623 ]

Xiangrui Meng edited comment on SPARK-17647 at 9/26/16 5:06 PM:
----------------------------------------------------------------

Thanks [~joshrosen]! I updated the JIRA description. The LIKE escaping behaviors in MySQL/PostgreSQL
are documented here:

* MySQL: http://dev.mysql.com/doc/refman/5.7/en/string-comparison-functions.html#operator_like
* PostgreSQL: https://www.postgresql.org/docs/8.3/static/functions-matching.html

In particular, the MySQL documentation states:

{noformat}
Exception: At the end of the pattern string, backslash can be specified as “\\”.
At the end of the string, backslash stands for itself because there is nothing following to
escape.
{noformat}

That explains why MySQL returns true for both `\\` like `\\\\` and `\\` like `\\`.
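
To make the quoted rule concrete, here is a minimal Python sketch of LIKE matching with backslash as the escape character. It is not Spark's or MySQL's implementation; the helper names `like_to_regex` and `like` are hypothetical, the inputs are assumed to be strings after SQL string-literal unescaping (so the SQL literal `'\\'` is a single backslash here), and collation and case-insensitivity are ignored.

```python
import re


def like_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a LIKE pattern (already unescaped as a string literal) to a regex.

    Hypothetical helper for illustration only.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            if i + 1 < len(pattern):
                # Escape character followed by something: match the next char literally.
                out.append(re.escape(pattern[i + 1]))
                i += 2
                continue
            # Trailing escape character: nothing follows to escape,
            # so it stands for itself (the exception quoted above).
            out.append(re.escape(ch))
        elif ch == "%":
            out.append(".*")  # any sequence of characters
        elif ch == "_":
            out.append(".")   # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def like(value: str, pattern: str) -> bool:
    """Return True if value matches the LIKE pattern in full."""
    return re.fullmatch(like_to_regex(pattern), value, flags=re.DOTALL) is not None


# SQL: '\\' like '\\\\'  -> value is one backslash, pattern is an escaped backslash.
print(like("\\", "\\\\"))
# SQL: '\\' like '\\'    -> pattern is a lone trailing backslash, which stands for itself.
print(like("\\", "\\"))
```

Under these assumptions both calls evaluate to true, for different reasons: the first pattern is an escaped backslash, while the second is a trailing backslash that has nothing left to escape.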


was (Author: mengxr):
Thanks [~joshrosen]! I updated the JIRA description. The LIKE escaping behaviors in MySQL/PostgreSQL
are documented here:

* MySQL: http://dev.mysql.com/doc/refman/5.7/en/string-comparison-functions.html#operator_like
* PostgreSQL: https://www.postgresql.org/docs/8.3/static/functions-matching.html

In particular, MySQL:

{noformat}
Exception: At the end of the pattern string, backslash can be specified as “\\”. At the
end of the string, backslash stands for itself because there is nothing following to escape.
Suppose that a table contains the following values:
{noformat}

That explains why MySQL returns true for both `\\` like `\\\\` and `\\` like `\\`.

> SQL LIKE does not handle backslashes correctly
> ----------------------------------------------
>
>                 Key: SPARK-17647
>                 URL: https://issues.apache.org/jira/browse/SPARK-17647
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Xiangrui Meng
>              Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '\\\\' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '\\\\' rlike '.*\\\\\\\\.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated as a Java
> string rather than a raw string.




