| java.lang.Object | ||
| ↳ | java.text.Collator | |
| ↳ | java.text.RuleBasedCollator | |
A concrete implementation of Collator, driven by a table of collation rules.
 
 RuleBasedCollator has the following restrictions for efficiency
 (other subclasses may be used for more complex languages):
 
 If a character is not located in the RuleBasedCollator, the
 default Unicode Collation Algorithm (UCA) rule-based table is automatically
 searched as a backup.

The collation table is composed of a list of collation rules, where each rule takes one of three forms:
 <modifier>
 <relation> <text-argument>
 <reset> <text-argument>
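The rule elements are defined as follows:
 Text-argument: unquoted whitespace characters are ignored; for example, b c is
 treated as bc.
 Reset ('&'): the next rule follows the position where the reset text-argument would be sorted.

This sounds more complicated than it is in practice. For example, the following are equivalent ways of expressing the same thing:
 a < b < c
 a < b & b < c
 a < c & a < b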
Notice that the order is important, as the subsequent item goes immediately after the text-argument. The following are not equivalent:
 a < b & a < c
 a < c & a < b
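The effect of a reset on ordering can be checked directly. The sketch below is illustrative only: the class and method names are hypothetical, each rule string is given a leading "<" relation, and it assumes an implementation that accepts these small stand-alone rule sets (as the JDK's java.text implementation does). The first three rule strings should sort the letters as a, b, c, while the last one places c immediately after a.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RuleOrderDemo {
    // Sorts the letters c, b, a with a collator built from the given rules.
    static List<String> sortWith(String rules) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            List<String> letters = new ArrayList<>(Arrays.asList("c", "b", "a"));
            letters.sort(collator);
            return letters;
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        // Three equivalent ways of expressing a < b < c
        System.out.println(sortWith("< a < b < c"));
        System.out.println(sortWith("< a < b & b < c"));
        System.out.println(sortWith("< a < c & a < b"));
        // Not equivalent: c is inserted immediately after a, ahead of b
        System.out.println(sortWith("< a < b & a < c"));
    }
}
```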
 Either the text-argument must already be present in the sequence, or some
 initial substring of the text-argument must be present. For example
 "a < b & ae < e" is valid since "a" is present in the sequence before
 "ae" is reset. In this latter case, "ae" is not entered and treated as a
 single character; instead, "e" is sorted as if it were expanded to two
 characters: "a" followed by an "e". This difference appears in natural
 languages: in traditional Spanish "ch" is treated as if it contracts to a
 single character (expressed as "c < ch < d"), while in traditional
 German a-umlaut is treated as if it expands to two characters (expressed as
 "a,A < b,B  ... & ae;\u00E4 & AE;\u00C4", where \u00E4 and \u00C4
 are the escape sequences for a-umlaut, ä and Ä).
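A contraction such as the traditional Spanish "ch" can be observed by comparing two small rule sets. This is a minimal sketch with hypothetical class, constant, and method names; the rule strings are shortened alphabets invented for illustration, not real locale data. With the contraction in place, "ch" is treated as a single element that sorts after every string starting with a plain "c".

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class ContractionDemo {
    // Illustrative alphabets only: one without and one with the "ch" contraction.
    static final String PLAIN = "< a < b < c < d < e < h < i";
    static final String WITH_CH = "< a < b < c < ch < d < e < h < i";

    // Compares two strings using a collator built from the given rules.
    static int compare(String rules, String left, String right) {
        try {
            return new RuleBasedCollator(rules).compare(left, right);
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        // Without the contraction, "ch" is c followed by h, so it sorts before "ci".
        System.out.println(compare(PLAIN, "ch", "ci") < 0);
        // With the contraction, "ch" is one element between c and d.
        System.out.println(compare(WITH_CH, "ci", "ch") < 0);
        System.out.println(compare(WITH_CH, "ch", "d") < 0);
    }
}
```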
 
 For ignorable characters, the first rule must start with a relation (the
 examples we have used above are really fragments; "a < b" really
 should be "< a < b"). If, however, the first relation is not
 "<", then all text-arguments up to the first "<" are
 ignorable. For example, ", - < a < b" makes "-" an ignorable
 character.
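The following sketch illustrates an ignorable character. The names are hypothetical, and two choices here are assumptions rather than part of the description above: the hyphen is written in single quotes because it is a rule syntax character, and the collator is set to PRIMARY strength so that only primary differences count. Under those assumptions, a hyphen inside a word should not affect the comparison.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class IgnorableDemo {
    // '-' appears before the first "<", which makes it ignorable.
    // It is quoted (an assumption) because it is a rule syntax character.
    static final String RULES = ", '-' < a < b";

    // Compares two strings at PRIMARY strength using the rules above.
    static int comparePrimary(String left, String right) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            collator.setStrength(Collator.PRIMARY);
            return collator.compare(left, right);
        } catch (ParseException e) {
            throw new IllegalStateException("unexpected rule syntax error", e);
        }
    }

    public static void main(String[] args) {
        // The hyphen carries no primary weight, so these compare as equal.
        System.out.println(comparePrimary("a-b", "ab") == 0);
        // Ordinary primary differences still apply.
        System.out.println(comparePrimary("a-b", "b") < 0);
    }
}
```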
 
 RuleBasedCollator automatically processes its rule table to include
 both pre-composed and combining-character versions of accented characters.
 Even if the provided rule string contains only base characters and separate
 combining accent characters, the pre-composed accented characters matching
 all canonical combinations of characters from the rule string will be entered
 in the table.
 
This allows you to use a RuleBasedCollator to compare accented strings even when the collator is set to NO_DECOMPOSITION. However, if the strings to be collated contain combining sequences that may not be in canonical order, you should set the collator to CANONICAL_DECOMPOSITION to enable sorting of combining sequences. For more information, see The Unicode Standard, Version 3.0.
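As a rough illustration of the decomposition setting, the sketch below compares a pre-composed "é" (U+00E9) with the combining sequence "e" plus U+0301 under CANONICAL_DECOMPOSITION. The class and method names are hypothetical, and the use of the Locale.US collator is just a convenient assumption; any collator that knows these characters should behave similarly.

```java
import java.text.Collator;
import java.util.Locale;

class DecompositionDemo {
    // Compares two strings with canonical decomposition enabled.
    static int compareDecomposed(String left, String right) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(left, right);
    }

    public static void main(String[] args) {
        String precomposed = "\u00e9";   // é as a single code point
        String combining = "e\u0301";    // e followed by a combining acute accent
        // Both forms decompose to the same sequence, so they compare as equal.
        System.out.println(compareDecomposed(precomposed, combining) == 0);
    }
}
```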
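The following rules are not valid:
 A text-argument that contains unquoted punctuation symbols, e.g. "a < b-c < d".
 A relation or reset character that is not followed by a text-argument, e.g. "a < , b".
 A reset whose text-argument (or an initial substring of it) is not already in the sequence, e.g. "a < b & e < f".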
 If you produce one of these errors, RuleBasedCollator throws a
 ParseException.
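One way to guard against these errors is to attempt construction and treat a ParseException as a failed validation. The helper below is a hypothetical sketch, not part of the API; the invalid rule strings are the examples above with a leading "<" relation added.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class RuleValidationDemo {
    // Returns true if the rules can be parsed into a RuleBasedCollator.
    static boolean isValid(String rules) {
        try {
            new RuleBasedCollator(rules);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("< a < b < c"));      // well-formed
        System.out.println(isValid("< a < b-c < d"));    // unquoted punctuation
        System.out.println(isValid("< a < , b"));        // relation without a text-argument
        System.out.println(isValid("< a < b & e < f"));  // reset to an unknown text-argument
    }
}
```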
 
 Normally, to create a rule-based collator object, you will use
 Collator's factory method getInstance. However, to create a
 rule-based collator object with specialized rules tailored to your needs, you
 construct the RuleBasedCollator with the rules contained in a
 String object. For example:
 
 String Simple = "< a < b < c < d";
 RuleBasedCollator mySimple = new RuleBasedCollator(Simple);
Or:
 
 String Norwegian = "< a,A< b,B< c,C< d,D< e,E< f,F< g,G< h,H< i,I"
         + "< j,J< k,K< l,L< m,M< n,N< o,O< p,P< q,Q< r,R"
         + "< s,S< t,T< u,U< v,V< w,W< x,X< y,Y< z,Z"
         + "< å=å,Å=Å"
         + ";aa,AA< æ,Æ< ø,Ø";
 RuleBasedCollator myNorwegian = new RuleBasedCollator(Norwegian);
 
 
 
 Combining Collators is as simple as concatenating strings. Here is
 an example that combines two Collators from two different locales:
 
 
 // Create an en_US Collator object
 RuleBasedCollator en_USCollator = (RuleBasedCollator)Collator
         .getInstance(new Locale("en", "US", ""));
 // Create a da_DK Collator object
 RuleBasedCollator da_DKCollator = (RuleBasedCollator)Collator
         .getInstance(new Locale("da", "DK", ""));
 // Combine the two collators
 // First, get the collation rules from en_USCollator
 String en_USRules = en_USCollator.getRules();
 // Second, get the collation rules from da_DKCollator
 String da_DKRules = da_DKCollator.getRules();
 RuleBasedCollator newCollator = new RuleBasedCollator(en_USRules + da_DKRules);
 // newCollator has the combined rules
 
 
 
 The next example shows how to make changes to an existing table to create a new
 Collator object. For example, add "& C < ch, cH, Ch, CH" to
 the rules of the en_USCollator object to create your own:
 
 // Create a new Collator object with additional rules
 String addRules = "& C < ch, cH, Ch, CH";
 // Use getRules(): concatenating the collator object itself would append its toString() value
 RuleBasedCollator myCollator = new RuleBasedCollator(en_USCollator.getRules() + addRules);
 // myCollator contains the new rules
The following example demonstrates how to change the order of non-spacing accents:
 
 // old rule
 String oldRules = "= ¨ ; ¯ ; ¿" + "< a , A ; ae, AE ; æ , Æ"
         + "< b , B < c, C < e, E & C < d, D";
 // change the order of accent characters
 // no trailing ';' here: a relation must be followed by a text-argument
 String addOn = "& ¿ ; ¯ ; ¨";
 RuleBasedCollator myCollator = new RuleBasedCollator(oldRules + addOn);
 
 
 
 The last example shows how to put a new primary ordering in before the default
 setting. For example, in the Japanese Collator, you can sort
 English characters either before or after Japanese characters:
 
 
 // get en_US Collator rules
 RuleBasedCollator en_USCollator = (RuleBasedCollator)
     Collator.getInstance(Locale.US);
 // add a few Japanese characters to sort before English characters
 // suppose the last character before the first base letter 'a' in
 // the English collation rule is ア
 String jaString = "& ア , ー < ト";
 RuleBasedCollator myJapaneseCollator =
     new RuleBasedCollator(en_USCollator.getRules() + jaString);
 
 
| Inherited Constants |
|---|
| From class java.text.Collator |

| Public Constructors | |
|---|---|
| RuleBasedCollator(String rules) | Constructs a new instance of RuleBasedCollator using the specified rules. |

| Public Methods | | |
|---|---|---|
| Object | clone() | Returns a new collator with the same collation rules, decomposition mode and strength value as this collator. |
| int | compare(String source, String target) | Compares the source text to the target text according to the collation rules, strength and decomposition mode for this RuleBasedCollator. |
| boolean | equals(Object obj) | Compares the specified object with this RuleBasedCollator and indicates if they are equal. |
| CollationElementIterator | getCollationElementIterator(String source) | Obtains a CollationElementIterator for the given string. |
| CollationElementIterator | getCollationElementIterator(CharacterIterator source) | Obtains a CollationElementIterator for the given CharacterIterator. |
| CollationKey | getCollationKey(String source) | Returns the CollationKey for the given source text. |
| String | getRules() | Returns the collation rules of this collator. |
| int | hashCode() | Returns an integer hash code for this object. |

| Inherited Methods |
|---|
| From class java.text.Collator |
| From class java.lang.Object |
| From interface java.util.Comparator |

public RuleBasedCollator(String rules)

Constructs a new instance of RuleBasedCollator using the
 specified rules. The rules are usually either
 hand-written based on the class description or
 the result of a former getRules() call.
 
 Note that the rules are actually interpreted as a delta to the
 standard Unicode Collation Algorithm (UCA). This differs
 slightly from other implementations which work with full rules
 specifications and may result in different behavior.

| Parameters | |
|---|---|
| rules | the collation rules. |

| Throws | |
|---|---|
| NullPointerException | if rules == null. |
| ParseException | if rules contains rules with invalid collation rule syntax. |

public Object clone()

Returns a new collator with the same collation rules, decomposition mode and strength value as this collator.
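Cloning is useful when a variant of an existing collator is needed without disturbing the original. The sketch below uses hypothetical names and an invented two-letter rule set; it copies a collator, lowers the copy's strength to PRIMARY so that case differences are ignored, and leaves the original untouched.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class CloneDemo {
    // Builds a small collator in which case is a tertiary difference.
    static RuleBasedCollator base() {
        try {
            return new RuleBasedCollator("< a, A < b, B");
        } catch (ParseException e) {
            throw new IllegalStateException("unexpected rule syntax error", e);
        }
    }

    // Returns a copy that only considers primary differences.
    static RuleBasedCollator primaryOnlyCopy(RuleBasedCollator original) {
        RuleBasedCollator copy = (RuleBasedCollator) original.clone();
        copy.setStrength(Collator.PRIMARY);
        return copy;
    }

    public static void main(String[] args) {
        RuleBasedCollator original = base();
        RuleBasedCollator relaxed = primaryOnlyCopy(original);
        System.out.println(original.compare("a", "A") != 0); // case still matters here
        System.out.println(relaxed.compare("a", "A") == 0);  // case ignored in the copy
    }
}
```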
public int compare(String source, String target)

Compares the source text to the target text according to
 the collation rules, strength and decomposition mode for this
 RuleBasedCollator. See the Collator class description
 for an example of use.
 
 General recommendation: If comparisons are to be done with the same strings
 multiple times, it is more efficient to generate CollationKey
 objects for the strings and use
 CollationKey.compareTo(CollationKey) for the comparisons. If each
 string is compared only once, using
 RuleBasedCollator.compare(String, String) has better performance.

| Parameters | |
|---|---|
| source | the source text. |
| target | the target text. |

Returns
 a negative value, zero, or a positive value depending on whether source is less than,
 equivalent to, or greater than target.

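The general recommendation about CollationKey can be sketched as follows. The class and method names are hypothetical; the idea is to compute one key per string up front and then sort the keys, so the collation rules are applied once per string instead of once per comparison.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class CollationKeySortDemo {
    // Sorts the words by precomputed collation keys and returns them in order.
    static List<String> sortWithKeys(Collator collator, List<String> words) {
        List<CollationKey> keys = new ArrayList<>();
        for (String word : words) {
            keys.add(collator.getCollationKey(word));
        }
        // CollationKey implements Comparable<CollationKey>
        Collections.sort(keys);
        List<String> sorted = new ArrayList<>();
        for (CollationKey key : keys) {
            sorted.add(key.getSourceString());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US);
        System.out.println(sortWithKeys(collator, Arrays.asList("pear", "Apple", "banana")));
    }
}
```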
public boolean equals(Object obj)

Compares the specified object with this RuleBasedCollator and
 indicates if they are equal. In order to be equal, object must be
 an instance of Collator with the same collation rules and the
 same attributes.
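
| Parameters | |
|---|---|
| obj | the object to compare with this object. |

Returns
 true if the specified object is equal to this RuleBasedCollator; false otherwise.

public CollationElementIterator getCollationElementIterator(String source)

Obtains a CollationElementIterator for the given string.

| Parameters | |
|---|---|
| source | the source string. |

Returns
 a CollationElementIterator for source.
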
public CollationElementIterator getCollationElementIterator(CharacterIterator source)

Obtains a CollationElementIterator for the given
 CharacterIterator. The source iterator's integrity will be
 preserved since a new copy will be created for use.

| Parameters | |
|---|---|
| source | the source character iterator. |

Returns
 a CollationElementIterator for source.

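A CollationElementIterator walks a string one collation element at a time. The sketch below, with hypothetical names and an invented three-letter rule set, collects the primary order of each element; for letters listed in ascending order in the rules, the primary orders should increase.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

class ElementIteratorDemo {
    // Builds a tiny collator used only for this illustration.
    static RuleBasedCollator simpleCollator() {
        try {
            return new RuleBasedCollator("< a < b < c");
        } catch (ParseException e) {
            throw new IllegalStateException("unexpected rule syntax error", e);
        }
    }

    // Returns the primary order of every collation element in the text.
    static List<Integer> primaryOrders(RuleBasedCollator collator, String text) {
        CollationElementIterator it = collator.getCollationElementIterator(text);
        List<Integer> orders = new ArrayList<>();
        for (int e = it.next(); e != CollationElementIterator.NULLORDER; e = it.next()) {
            orders.add(CollationElementIterator.primaryOrder(e));
        }
        return orders;
    }

    public static void main(String[] args) {
        System.out.println(primaryOrders(simpleCollator(), "abc"));
    }
}
```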
public CollationKey getCollationKey(String source)

Returns the CollationKey for the given source text.

| Parameters | |
|---|---|
| source | the specified source text. |

Returns
 the CollationKey for the given source text.

public String getRules()

Returns the collation rules of this collator. These rules can be
 fed into the RuleBasedCollator(String) constructor.
 
 Note that the rules are actually interpreted as a delta to the
 standard Unicode Collation Algorithm (UCA). Hence, an empty rules
 string results in the default UCA rules being applied. This differs
 slightly from other implementations which work with full rules
 specifications and may result in different behavior.
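The round trip through getRules() and the constructor can be sketched like this. The names are hypothetical and the rule string is invented for illustration; the rebuilt collator is expected to be equal to the original and to order strings the same way.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class RulesRoundTripDemo {
    // Builds a collator from a rule string, wrapping the checked exception.
    static RuleBasedCollator fromRules(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid rules: " + rules, e);
        }
    }

    // Creates a second collator from the rules reported by the first.
    static RuleBasedCollator rebuild(RuleBasedCollator original) {
        return fromRules(original.getRules());
    }

    public static void main(String[] args) {
        RuleBasedCollator original = fromRules("< a < b < c");
        RuleBasedCollator copy = rebuild(original);
        System.out.println(copy.equals(original));
        System.out.println(copy.hashCode() == original.hashCode());
    }
}
```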
public int hashCode()

Returns an integer hash code for this object. By contract, any two
 objects for which equals(Object) returns true must return
 the same hash code value. This means that subclasses of Object
 usually override both methods or neither method.
 
Note that hash values must not change over time unless information used in equals comparisons also changes.
See Writing a correct hashCode method
 if you intend to implement your own hashCode method.