[Coco] INSTR question

Mathieu Bouchard matju at artengine.ca
Sun Feb 19 09:36:24 EST 2017


I actually went and checked, and it isn't quite as universal as I said. Let's
start with some languages that are consistent :

Ruby (both plain search & pattern matching) :
"ABC".index"A"
0
"ABC".index""
0
/A/ =~ "ABC"
0
// =~ "ABC"
0

Perl :
print index("ABC","A")."\n"
0
print index("ABC","")."\n"
0
"ABC" =~ /A/; print "@-\n"
0
"ABC" =~ //; print "@-\n"
0

Python :
"ABC".index("A")
0
"ABC".index("")
0
re.search("A","ABC").start()
0
re.search("","ABC").start()
0

Java :
System.out.println("ABC".indexOf("A"));
0
System.out.println("ABC".indexOf(""));
0

C (where this behaviour probably originated) :
const char *s="abc"; printf("%td %td\n",strstr(s,"a")-s,strstr(s,"")-s);
0 0

C++ STL :
string s="abc"; printf("%zu %zu\n",s.find("a"),s.find(""));
0 0

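Unix shell pattern matching (grep -b prints the byte offset of the matching
line rather than of the match itself, so this mainly shows that the empty
pattern does match) :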
echo ABC | grep -b A
0:ABC
echo ABC | grep -b ""
0:ABC

(the list could go on)
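
All of the results above are 0 only because each search starts at offset 0. As a quick check that the empty string really matches at every offset and not just the first one, here is a small Python sketch using only the built-in str.find and re.finditer (the variable names are mine, purely for illustration):

```python
import re

s = "ABC"

# Plain search: restart the search at each offset 0..len(s) in turn.
plain = [s.find("", start) for start in range(len(s) + 1)]

# Pattern matching: collect the start of every match of the empty pattern.
pattern = [m.start() for m in re.finditer("", s)]

print(plain)    # [0, 1, 2, 3]
print(pattern)  # [0, 1, 2, 3]
```

Note that offset 3, just past the last character, counts as a match position too.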

However, Tcl is not consistent (it doesn't find the empty string) :
string first A ABC
0
string first "" ABC
-1

PHP is not consistent either, and it even issues a warning (wow !) :
var_export(strpos("abc","a"));
0
var_export(strpos("abc",""));
PHP Warning:  strpos(): Empty needle in php shell code on line 1
false

But there's an alternate consistent way in Tcl, using pattern matching :
regexp -indices a abc x; lindex $x 0
0
regexp -indices "" abc x; lindex $x 0
0

And in PHP too :
preg_match("/a/","abc",$m,PREG_OFFSET_CAPTURE); var_export($m[0][1]);
0
preg_match("//","abc",$m,PREG_OFFSET_CAPTURE); var_export($m[0][1]);
0
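
The reasoning in the message quoted below is that an empty needle is just the N=0 case of an ordinary search. A minimal Python sketch of that idea follows. It is not how any of the languages above actually implement their search; naive_find and instr are hypothetical names, and instr only mimics the 1-based convention (1 for the first position, 0 for no match) seen in the original INSTR question.

```python
def naive_find(haystack, needle, start=0):
    """Return the first 0-based index >= start where needle occurs, or -1."""
    n = len(needle)
    # Try every candidate offset, including the one just past the last
    # character. There is no branch for n == 0: an empty slice always
    # equals the empty needle, so the first offset tried is a match.
    for i in range(start, len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1


def instr(haystack, needle):
    """Hypothetical 1-based wrapper: first position is 1, 0 means no match."""
    return naive_find(haystack, needle) + 1


print(naive_find("ABC", "A"))  # 0
print(naive_find("ABC", ""))   # 0
print(naive_find("ABC", "D"))  # -1
print(instr("ABCDE", ""))      # 1
print(instr("ABCDE", "Z"))     # 0
```

Getting the Tcl or PHP result instead would need an extra "if the needle is empty" branch before the loop, which is exactly the kind of special case the general definition avoids.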


On 2017-02-10 at 15:05:00, Paulo Garcia wrote :

> Interesting discussion. Indeed the same behaviour is found in Python and
> Javascript:
>
> NodeJS:
>
> > a='ABC'
> 'ABC'
> > a.indexOf('A')
> 0
> > a.indexOf('B')
> 1
> > a.indexOf('C')
> 2
> > a.indexOf('')
> 0
> >
>
> Python:
>
> >>> a='ABC'
> >>> a.index('B')
> 1
> >>> a.index('A')
> 0
> >>> a.index('')
> 0
> >>>
>
>
> Paulo
>
> On Fri, Feb 10, 2017 at 2:29 PM, Mathieu Bouchard <matju at artengine.ca>
> wrote:
>
>>
>> Nope, it's like that in probably every language that has such a search
>> function : an empty string is found at EVERY position in the string, so
>> the first match is wherever the search begins. That's the normal way of
>> doing it, because it is simply the N=0 case of searching for N characters
>> in a string. The behaviour you want would mean adding a special case for
>> N=0, whereas programmers prefer to define functions with as few special
>> cases as possible.
>>
>> (However, in other languages, 0 is the first position in the string,
>> whereas "no match" is represented by another value (such as -1 or nil or
>> error))
>>
>>
>> On 2017-02-09 at 15:12:00, Allen Huffman wrote :
>>
>>> ...but I noticed today it finds the empty string: ""
>>>
>>> PRINT INSTR("ABCDE", "")
>>> 1
>>>
>>> That seems like a bug.
>>> A$=""
>>> PRINT INSTR("ABCD", A$)
>>> 1
>>>
>>
>>  ______________________________________________________________________
>> | Mathieu BOUCHARD --- tél: 514.623.3801, 514.383.3801 --- Montréal, QC
>>
>>
>> --
>> Coco mailing list
>> Coco at maltedmedia.com
>> https://pairlist5.pair.net/mailman/listinfo/coco
>>
>
>
>
> -- 
> --------------------------------------------
> Paulo

  ______________________________________________________________________
| Mathieu BOUCHARD --- tél: 514.623.3801, 514.383.3801 --- Montréal, QC

