jakarta-oro-user mailing list archives

From Chandramouli Kharidehal <Ckharide...@sapient.com>
Subject RE: Doubt about ORO
Date Thu, 10 Jan 2002 16:33:19 GMT
Sorry to trouble you.
But if I try a \w match on the string AO?nC, which is a Unicode string, I get only 4 matches.
I am using the applet that is part of the Jakarta site.
Moreover, is there any way I can enter Unicode characters in the browser?
Is there an editor where I can enter UTF-8 characters in the browser?


-----Original Message-----
From: Daniel F. Savarese [mailto:dfs@savarese.org]
Sent: Thursday, January 10, 2002 8:44 PM
To: ORO Users List
Subject: Re: Doubt about ORO 



In message <295A9D64E5DC2D469405DE8037DDAB694FC14D@delmmsx01.sapient.com>,
Chandramouli Kharidehal writes:
>How do I detect Unicode characters using the ORO package?
>For example, using \w I can detect all the ASCII characters.
>How about the Unicode characters beyond ASCII?

As I said before:

>\d matches based on Character.isDigit() and \w matches based on
>Character.isLetterOrDigit() or '_'.  So, you see, it's all Unicode based
>on Java's interpretation of how to classify characters with some attempt
>to remain true to Perl (e.g., the inclusion of '_' in \w).  Even though
>the source is somewhat inscrutable, these types of questions can be
>answered by looking at the source.

If you skimmed it the first time, please reread the first sentence of
my original reply which explicitly answers your question.  Since
Character.isLetterOrDigit(), or any other Character.isFoo()
method, is not restricted to ASCII, neither is \w, \d, or any of the
other character set specifiers.  Also pay special attention to the last
sentence of my original reply :)

daniel
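
The classification rule quoted in the reply above can be checked directly against the JDK, without ORO on the classpath. What follows is only a minimal sketch, assuming that \w and \d behave exactly as described (Character.isLetterOrDigit() or '_', and Character.isDigit()). The class WordCharSketch, its helper methods, and the sample string are hypothetical names made up for illustration; they are not part of ORO, and the sample is not the string from the message above.

```java
public class WordCharSketch {
    // Hypothetical helper mirroring the rule described for \w:
    // Character.isLetterOrDigit(), plus '_' for Perl compatibility.
    static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Hypothetical helper mirroring the rule described for \d.
    static boolean isDigitChar(char c) {
        return Character.isDigit(c);
    }

    // Counts the chars in s that the \w rule above would accept.
    static int countWordChars(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isWordChar(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Illustrative sample: ASCII letters, e with acute accent (U+00E9),
        // an underscore, ARABIC-INDIC DIGIT ONE (U+0661), a space and '!'.
        String sample = "caf\u00E9_\u0661 !";
        System.out.println(countWordChars(sample));
    }
}
```

The sample is written with \u escapes so that the result does not depend on the source file's encoding or on how the text was typed into a browser or applet. If a character is lost or replaced on its way into the program, the classifier never sees it, which is one possible reason for a lower match count than expected.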



