asterixdb-dev mailing list archives

From Dmitry Lychagin <dmitry.lycha...@couchbase.com>
Subject Counting character positions inside a string
Date Tue, 30 Jan 2018 02:54:39 GMT
All,

We would like to change how string functions count character positions inside a string.
Currently, the string functions position(), substring(), and some others assume that the first
character is at position 1.
The proposal is to change the first position to 0, to align with array element positions
(which also start at 0) and with other languages (JavaScript, etc.).
This change will also apply to binary functions (see below) and will be effective in both
SQLPP and AQL.

The following functions will be affected:
position(),
regexp_position(),
substring()/substr(),
sub_binary(),
find_binary()

This might be a disruptive change for some users, so we will also introduce a cluster-wide
configuration parameter ("compiler.stringoffset") for backwards compatibility:
compiler.stringoffset = 0   // first character position is assumed to be 0 (new default)
compiler.stringoffset = 1   // first character position is assumed to be 1 (backwards-compatible setting)
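
To illustrate what the parameter is meant to control, here is a minimal Python sketch. It is not AsterixDB code: the position() function and its string_offset argument below are hypothetical stand-ins for the built-in function and the compiler.stringoffset setting.

```python
def position(text: str, pattern: str, string_offset: int = 0) -> int:
    """Hypothetical model of position(): where the first match starts, or -1.

    string_offset stands in for compiler.stringoffset (0 = new default,
    1 = backwards-compatible setting).
    """
    if string_offset not in (0, 1):
        raise ValueError("string_offset must be 0 or 1")
    idx = text.find(pattern)  # Python's str.find is 0-based and returns -1 if absent
    # The "not found" value is not shifted by the offset.
    return -1 if idx == -1 else idx + string_offset


print(position("abcdef", "c", string_offset=0))  # 2
print(position("abcdef", "c", string_offset=1))  # 3
print(position("abcdef", "z", string_offset=1))  # -1
```

The only thing the setting changes in this model is the number assigned to the first character; the match itself is the same.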

The query migration path is straightforward, for example:
substring("abcdef", 1) will need to be changed to substring("abcdef", 0), and so on; the same
applies to sub_binary().
position(), regexp_position(), and find_binary() will return one less than they used to, but
will still return -1 if the value is not found.
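
The migration rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: substring(), migrate_start(), and migrate_result() are hypothetical helpers, not part of AsterixDB, and the sketch rejects start positions below the configured offset rather than guessing how the real functions treat them.

```python
def substring(text: str, start: int, string_offset: int = 0) -> str:
    """Hypothetical model of substring(): the suffix beginning at `start`."""
    if start < string_offset:
        # Assumption of this sketch only: out-of-range starts are rejected.
        raise ValueError("start is before the first character position")
    return text[start - string_offset:]


def migrate_start(old_start: int) -> int:
    """Convert a 1-based start argument to the equivalent 0-based one."""
    return old_start - 1


def migrate_result(old_result: int) -> int:
    """Convert a 1-based position result to 0-based; -1 (not found) is unchanged."""
    return old_result if old_result == -1 else old_result - 1


# Old query under offset 1 and migrated query under offset 0 agree.
print(substring("abcdef", 1, string_offset=1))                 # abcdef
print(substring("abcdef", migrate_start(1), string_offset=0))  # abcdef
print(migrate_result(3), migrate_result(-1))                   # 2 -1
```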

Please share your comments and concerns.
Thanks,
-- Dmitry
