public final class Pattern
extends Object
implements Serializable

java.lang.Object
   ↳ java.util.regex.Pattern

Class Overview

Represents a pattern used for matching, searching, or replacing strings. Patterns are specified in terms of regular expressions and compiled using an instance of this class. They are then used in conjunction with a Matcher to perform the actual search.

A typical use case looks like this:

 Pattern p = Pattern.compile("Hello, A[a-z]*!");
  
 Matcher m = p.matcher("Hello, Android!");
 boolean b1 = m.matches(); // true
  
 m.reset("Hello, Robot!"); // reuse the same Matcher with new input
 boolean b2 = m.matches(); // false
 

The above code could also be written in a more compact fashion, though this variant is less efficient, since Pattern and Matcher objects are created on the fly instead of being reused:

     boolean b1 = Pattern.matches("Hello, A[a-z]*!", "Hello, Android!"); // true
     boolean b2 = Pattern.matches("Hello, A[a-z]*!", "Hello, Robot!");   // false
 

Please consult the package documentation for an overview of the regular expression syntax used in this class as well as Android-specific implementation details.

Summary

Constants
int CANON_EQ This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent.
int CASE_INSENSITIVE This constant specifies that a Pattern is matched case-insensitively.
int COMMENTS This constant specifies that a Pattern may contain whitespace or comments.
int DOTALL This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case.
int LITERAL This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings.
int MULTILINE This constant specifies that the meta characters '^' and '$' match the beginning and end of each input line, respectively.
int UNICODE_CASE This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters.
int UNIX_LINES This constant specifies that only the Unix line ending ('\n') is recognized as a line terminator by the '.', '^', and '$' meta characters.
Public Methods
static Pattern compile(String pattern, int flags)
Compiles a regular expression, creating a new Pattern instance in the process.
static Pattern compile(String pattern)
Compiles a regular expression, creating a new Pattern instance in the process.
int flags()
Returns the flags that have been set for this Pattern.
Matcher matcher(CharSequence input)
Returns a Matcher for the Pattern and a given input.
static boolean matches(String regex, CharSequence input)
Tries to match a given regular expression against a given input.
String pattern()
Returns the regular expression that was compiled into this Pattern.
static String quote(String s)
Quotes a given string using "\Q" and "\E", so that all other meta-characters lose their special meaning.
String[] split(CharSequence inputSeq, int limit)
Splits the given input sequence at occurrences of this Pattern.
String[] split(CharSequence input)
Splits a given input around occurrences of a regular expression.
String toString()
Returns a string containing a concise, human-readable description of this object.
Protected Methods
void finalize()
Is called before the object's memory is being reclaimed by the VM.
Inherited Methods
From class java.lang.Object

Constants

public static final int CANON_EQ

Since: API Level 1

This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent. It is (currently) not supported in Android.

Constant Value: 128 (0x00000080)

public static final int CASE_INSENSITIVE

Since: API Level 1

This constant specifies that a Pattern is matched case-insensitively. That is, the patterns "a+" and "A+" would both match the string "aAaAaA".

Note: For Android, the CASE_INSENSITIVE constant (currently) always includes the meaning of the UNICODE_CASE constant. So if case insensitivity is enabled, this automatically extends to all Unicode characters. The UNICODE_CASE constant itself has no special consequences.

Constant Value: 2 (0x00000002)
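
As a minimal sketch of the behavior described above, the following compares a pattern compiled with CASE_INSENSITIVE against the default, case-sensitive match. The class and method names (CaseInsensitiveDemo, matchesIgnoringCase) are illustrative assumptions and not part of the API.

```java
import java.util.regex.Pattern;

final class CaseInsensitiveDemo {
    // Hypothetical helper: matches the whole input while ignoring case.
    static boolean matchesIgnoringCase(String regex, CharSequence input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                .matcher(input)
                .matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("a+", "aAaAaA")); // true
        System.out.println(matchesIgnoringCase("A+", "aAaAaA")); // true
        // Without the flag, the upper-case letters do not match "a+".
        System.out.println(Pattern.matches("a+", "aAaAaA"));     // false
    }
}
```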

public static final int COMMENTS

Since: API Level 1

This constant specifies that a Pattern may contain whitespace or comments. Otherwise comments and whitespace are taken as literal characters.

Constant Value: 4 (0x00000004)
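
The following sketch shows one way the COMMENTS flag might be used to annotate a longer expression. The regular expression, the class name CommentsDemo, and the helper yearOf are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommentsDemo {
    // The whitespace and '#' comments are ignored only when COMMENTS is set.
    static final String YEAR_MONTH =
            "(\\d{4})  # year\n"
          + "-         # separator\n"
          + "(\\d{2})  # month";

    // Hypothetical helper: returns the year, or null if the input does not match.
    static String yearOf(CharSequence input) {
        Matcher m = Pattern.compile(YEAR_MONTH, Pattern.COMMENTS).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(yearOf("2024-05")); // 2024
        // Without the flag, spaces and comments are literal, so nothing matches.
        System.out.println(Pattern.matches(YEAR_MONTH, "2024-05")); // false
    }
}
```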

public static final int DOTALL

Since: API Level 1

This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case.

Constant Value: 32 (0x00000020)
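
A short sketch of the difference, assuming a three-character pattern "a.b" chosen purely for illustration (DotAllDemo and dotMatches are hypothetical names):

```java
import java.util.regex.Pattern;

final class DotAllDemo {
    // Hypothetical helper: matches "a.b" against the whole input with the given flags.
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches("axb", 0));               // true
        System.out.println(dotMatches("a\nb", 0));              // false: '.' skips line endings
        System.out.println(dotMatches("a\nb", Pattern.DOTALL)); // true
    }
}
```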

public static final int LITERAL

Since: API Level 1

This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings.

Constant Value: 16 (0x00000010)
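
To illustrate, the sketch below searches for the text "1+1" literally. The names LiteralDemo and containsLiterally are assumptions introduced for this example.

```java
import java.util.regex.Pattern;

final class LiteralDemo {
    // Hypothetical helper: searches for needle as plain text, not as a regex.
    static boolean containsLiterally(CharSequence haystack, String needle) {
        return Pattern.compile(needle, Pattern.LITERAL).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLiterally("2*(1+1)", "1+1")); // true
        System.out.println(containsLiterally("111", "1+1"));     // false
        // As a regular expression, "1+1" means "one or more 1s, then a 1".
        System.out.println(Pattern.compile("1+1").matcher("111").find()); // true
    }
}
```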

public static final int MULTILINE

Since: API Level 1

This constant specifies that the meta characters '^' and '$' match the beginning and end of each input line, respectively. Normally, they match only the beginning and the end of the complete input.

Constant Value: 8 (0x00000008)
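
The effect can be sketched by counting lines that consist of a single word. The pattern "^\\w+$" and the names MultilineDemo and countWholeLineWords are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultilineDemo {
    // Hypothetical helper: counts matches of "^\w+$" in the input with the given flags.
    static int countWholeLineWords(String input, int flags) {
        Matcher m = Pattern.compile("^\\w+$", flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "alpha\nbeta\ngamma";
        // Default: '^' and '$' anchor to the whole input, which contains newlines.
        System.out.println(countWholeLineWords(text, 0));                 // 0
        // MULTILINE: the anchors apply to every line.
        System.out.println(countWholeLineWords(text, Pattern.MULTILINE)); // 3
    }
}
```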

public static final int UNICODE_CASE

Since: API Level 1

This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters. It is used in conjunction with the CASE_INSENSITIVE constant to extend its meaning to all Unicode characters.

Note: For Android, the CASE_INSENSITIVE constant (currently) always includes the meaning of the UNICODE_CASE constant. So if case insensitivity is enabled, this automatically extends to all Unicode characters. The UNICODE_CASE constant then has no special consequences.

Constant Value: 64 (0x00000040)

public static final int UNIX_LINES

Since: API Level 1

This constant specifies that only the Unix line ending ('\n') is recognized as a line terminator by the '.', '^', and '$' meta characters.

Constant Value: 1 (0x00000001)
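
One visible consequence is that '.' starts matching a carriage return, because '\r' no longer counts as a line ending. The sketch below assumes the pattern "a.b" for illustration; UnixLinesDemo and dotMatches are hypothetical names.

```java
import java.util.regex.Pattern;

final class UnixLinesDemo {
    // Hypothetical helper: matches "a.b" against the whole input with the given flags.
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches("a\rb", 0));                  // false: '\r' is a line ending
        System.out.println(dotMatches("a\rb", Pattern.UNIX_LINES)); // true
        System.out.println(dotMatches("a\nb", Pattern.UNIX_LINES)); // false: '\n' still is one
    }
}
```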

Public Methods

public static Pattern compile (String pattern, int flags)

Since: API Level 1

Compiles a regular expression, creating a new Pattern instance in the process. The flags modify the behavior of the Pattern.

Parameters
pattern the regular expression.
flags the flags to set. Any combination of the constants defined in this class is valid.

Note: Currently, the CASE_INSENSITIVE and UNICODE_CASE constants have slightly special behavior in Android, and the CANON_EQ constant is not supported at all.

Returns
  • the new Pattern instance.
Throws
PatternSyntaxException if the regular expression is syntactically incorrect.
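
Flags are plain int bit masks, so they can be combined with the bitwise OR operator. The following sketch also shows one possible way to handle PatternSyntaxException; the wrapper compileOrNull and the class CompileDemo are assumptions made for this example, not part of the API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CompileDemo {
    // Hypothetical helper: returns null instead of throwing on invalid syntax.
    static Pattern compileOrNull(String regex, int flags) {
        try {
            return Pattern.compile(regex, flags);
        } catch (PatternSyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Combine two flags with bitwise OR.
        Pattern p = compileOrNull("^ok$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        System.out.println(p.matcher("no\nOK\nno").find()); // true

        // An unclosed group is a syntax error.
        System.out.println(compileOrNull("(unclosed", 0));  // null
    }
}
```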

public static Pattern compile (String pattern)

Since: API Level 1

Compiles a regular expression, creating a new Pattern instance in the process. This is actually a convenience method that calls compile(String, int) with a flags value of zero.

Parameters
pattern the regular expression.
Returns
  • the new Pattern instance.
Throws
PatternSyntaxException if the regular expression is syntactically incorrect.

public int flags ()

Since: API Level 1

Returns the flags that have been set for this Pattern.

Returns
  • the flags that have been set. A combination of the constants defined in this class.
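
Because the return value is a combination of bit masks, an individual flag can be tested with a bitwise AND. A minimal sketch, with FlagsDemo and hasFlag as hypothetical names:

```java
import java.util.regex.Pattern;

final class FlagsDemo {
    // Hypothetical helper: reports whether the given flag bit is set on the pattern.
    static boolean hasFlag(Pattern p, int flag) {
        return (p.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("x", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        System.out.println(hasFlag(p, Pattern.CASE_INSENSITIVE)); // true
        System.out.println(hasFlag(p, Pattern.DOTALL));           // true
        System.out.println(hasFlag(p, Pattern.MULTILINE));        // false
    }
}
```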

public Matcher matcher (CharSequence input)

Since: API Level 1

Returns a Matcher for the Pattern and a given input. The Matcher can be used to match the Pattern against the whole input, find occurrences of the Pattern in the input, or replace parts of the input.

Parameters
input the input to process.
Returns
  • the resulting Matcher.
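
The three uses mentioned above (whole-input match, search, and replace) can be sketched as follows. The pattern "\\d+" and the names MatcherDemo and findAll are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherDemo {
    // Hypothetical helper: collects every occurrence of the pattern in the input.
    static List<String> findAll(Pattern p, CharSequence input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");
        System.out.println(digits.matcher("12345").matches());          // true
        System.out.println(findAll(digits, "a1b22c333"));               // [1, 22, 333]
        System.out.println(digits.matcher("a1b22c333").replaceAll("#")); // a#b#c#
    }
}
```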

public static boolean matches (String regex, CharSequence input)

Since: API Level 1

Tries to match a given regular expression against a given input. This is a convenience method that compiles the regular expression into a Pattern, builds a Matcher for it, and then performs the match. If the same regular expression is used for multiple operations, it is recommended to compile it into a Pattern explicitly and request a reusable Matcher.

Parameters
regex the regular expression.
input the input to process.
Returns
  • true if and only if the Pattern matches the input.
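
The recommendation above might look like the following in practice: the Pattern is compiled once, and a single Matcher is reset for each new input. ReuseDemo, GREETING, and countMatches are hypothetical names used only for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReuseDemo {
    // Compiled once and shared by all calls.
    private static final Pattern GREETING = Pattern.compile("Hello, A[a-z]*!");

    // Hypothetical helper: counts the inputs that match the pattern completely.
    static int countMatches(List<String> inputs) {
        Matcher m = GREETING.matcher(""); // placeholder input, replaced by reset()
        int count = 0;
        for (String s : inputs) {
            if (m.reset(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("Hello, Android!", "Hello, Robot!", "Hello, Apple!");
        System.out.println(countMatches(inputs)); // 2
    }
}
```

A Matcher carries state between calls, so a shared instance like this is only appropriate when it is not used from several threads at once.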

public String pattern ()

Since: API Level 1

Returns the regular expression that was compiled into this Pattern.

Returns
  • the regular expression.

public static String quote (String s)

Since: API Level 1

Quotes a given string using "\Q" and "\E", so that all other meta-characters lose their special meaning. If the string is used for a Pattern afterwards, it can only be matched literally.

Parameters
s the string to quote.
Returns
  • the quoted string.
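
Quoting is useful when part of a pattern comes from text that should not be interpreted as a regular expression. A minimal sketch, with QuoteDemo and containsText as hypothetical names:

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    // Hypothetical helper: searches for text literally by quoting it first.
    static boolean containsText(CharSequence haystack, String text) {
        return Pattern.compile(Pattern.quote(text)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.quote("a.b"));               // \Qa.b\E
        System.out.println(containsText("see a.b here", "a.b")); // true
        // Unquoted, the '.' would match any character, including 'x'.
        System.out.println(containsText("see axb here", "a.b")); // false
    }
}
```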

public String[] split (CharSequence inputSeq, int limit)

Since: API Level 1

Splits the given input sequence at occurrences of this Pattern. If this Pattern does not occur in the input, the result is an array containing the input (converted from a CharSequence to a String). Otherwise, the limit parameter controls the contents of the returned array as described below.

Parameters
inputSeq the input sequence.
limit Determines the maximum number of entries in the resulting array, and the treatment of trailing empty strings.
  • For n > 0, the resulting array contains at most n entries. If this is fewer than the number of matches, the final entry will contain all remaining input.
  • For n < 0, the length of the resulting array is exactly the number of occurrences of the Pattern plus one for the text after the final separator. All entries are included.
  • For n == 0, the result is as for n < 0, except trailing empty strings will not be returned. (Note that the case where the input is itself an empty string is special, as described above, and the limit parameter does not apply there.)
Returns
  • the resulting array.
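
The three cases for the limit parameter can be compared on a single input. The input "a,b,c,," and the names SplitDemo and splitOnComma are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

final class SplitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    // Hypothetical helper: splits the input at commas using the given limit.
    static String[] splitOnComma(String input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";
        // limit > 0: at most 2 entries; the last holds the remaining input "b,c,,".
        System.out.println(splitOnComma(input, 2).length);  // 2
        // limit < 0: four separators plus one, trailing empty strings included.
        System.out.println(splitOnComma(input, -1).length); // 5
        // limit == 0: trailing empty strings are dropped.
        System.out.println(splitOnComma(input, 0).length);  // 3
    }
}
```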

public String[] split (CharSequence input)

Since: API Level 1

Splits a given input around occurrences of a regular expression. This is a convenience method that is equivalent to calling the method split(java.lang.CharSequence, int) with a limit of 0.

Parameters
input the input sequence.
Returns
  • the resulting array.

public String toString ()

Since: API Level 1

Returns a string containing a concise, human-readable description of this object. Subclasses are encouraged to override this method and provide an implementation that takes into account the object's type and data. The default implementation simply concatenates the class name, the '@' sign and a hexadecimal representation of the object's hashCode(), that is, it is equivalent to the following expression:

 getClass().getName() + '@' + Integer.toHexString(hashCode())
 

Returns
  • a printable representation of this object.

Protected Methods

protected void finalize ()

Since: API Level 1

Is called before the object's memory is being reclaimed by the VM. This can only happen once the VM has detected, during a run of the garbage collector, that the object is no longer reachable by any thread of the running application.

The method can be used to free system resources or perform other cleanup before the object is garbage collected. The default implementation of the method is empty, which is also expected by the VM, but subclasses can override finalize() as required. Uncaught exceptions which are thrown during the execution of this method cause it to terminate immediately but are otherwise ignored.

Note that the VM does guarantee that finalize() is called at most once for any object, but it doesn't guarantee when (if at all) finalize() will be called. For example, object B's finalize() can delay the execution of object A's finalize() method and therefore it can delay the reclamation of A's memory. To be safe, use a ReferenceQueue, because it provides more control over the way the VM deals with references during garbage collection.

Throws
Throwable