The Character class wraps a value of the primitive
type char in an object. An object of type
Character contains a single field whose type is
char.
In addition, this class provides several methods for determining a character's category (lowercase letter, digit, etc.) and for converting characters from uppercase to lowercase and vice versa.
Character information is based on the Unicode Standard, version 4.0.
The methods and data of class Character are defined by
the information in the UnicodeData file that is part of the
Unicode Character Database maintained by the Unicode
Consortium. This file specifies various properties including name
and general category for every defined Unicode code point or
character range.
The file and its description are available from the Unicode Consortium at:
The char data type (and therefore the value that a
Character object encapsulates) is based on the
original Unicode specification, which defined characters as
fixed-width 16-bit entities. The Unicode standard has since been
changed to allow for characters whose representation requires more
than 16 bits. The range of legal code points is now
U+0000 to U+10FFFF, known as Unicode scalar value.
(Refer to the
definition of the U+n notation in the Unicode
standard.)
The set of characters from U+0000 to U+FFFF is sometimes
referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater
than U+FFFF are called supplementary characters. The Java
2 platform uses the UTF-16 representation in char
arrays and in the String and StringBuffer
classes. In this representation, supplementary characters are
represented as a pair of char values, the first from
the high-surrogates range, (\uD800-\uDBFF), the
second from the low-surrogates range
(\uDC00-\uDFFF).
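The surrogate-pair arithmetic can be sketched as follows. This is a minimal illustration rather than the platform's implementation; the class name SurrogateSketch and the helper encode are hypothetical, and the result is compared against the standard Character.toChars method.

```java
import java.util.Arrays;

// Hypothetical sketch: split a supplementary code point into its UTF-16 surrogate pair.
class SurrogateSketch {
    static char[] encode(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                // at most 20 significant bits remain
        char high = (char) (0xD800 + (offset >>> 10));   // upper 10 bits select the high surrogate
        char low = (char) (0xDC00 + (offset & 0x3FF));   // lower 10 bits select the low surrogate
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = encode(0x2F81A);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]);       // D87E DC1A
        System.out.println(Arrays.equals(pair, Character.toChars(0x2F81A))); // true
    }
}
```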
A char value, therefore, represents Basic
Multilingual Plane (BMP) code points, including the surrogate
code points, or code units of the UTF-16 encoding. An
int value represents all Unicode code points,
including supplementary code points. The lower (least significant)
21 bits of int are used to represent Unicode code
points and the upper (most significant) 11 bits must be zero.
Unless otherwise specified, the behavior with respect to
supplementary characters and surrogate char values is
as follows:
The methods that only accept a char value cannot support
supplementary characters. They treat char values from the
surrogate ranges as undefined characters. For example,
Character.isLetter('\uD840') returns false, even though
this specific value if followed by any low-surrogate value in a string
would represent a letter.
The methods that accept an int value support all
Unicode characters, including supplementary characters. For
example, Character.isLetter(0x2F81A) returns
true because the code point value represents a letter
(a CJK ideograph).
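The difference between the two method families can be seen by counting letters in a string that contains a supplementary character. The sketch below is illustrative only; the class and helper names are hypothetical.

```java
// Sketch: char-based methods see one UTF-16 code unit at a time,
// int-based methods see a whole code point.
class CharVersusCodePoint {
    // Counts letters by inspecting each char; surrogate values are never letters.
    static int lettersByChar(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isLetter(s.charAt(i))) {
                n++;
            }
        }
        return n;
    }

    // Counts letters by inspecting each code point; surrogate pairs are combined first.
    static long lettersByCodePoint(String s) {
        return s.codePoints().filter(Character::isLetter).count();
    }

    public static void main(String[] args) {
        // 'a' followed by the supplementary CJK ideograph U+2F81A (two chars).
        String s = "a" + new String(Character.toChars(0x2F81A));
        System.out.println(lettersByChar(s));      // 1
        System.out.println(lettersByCodePoint(s)); // 2
    }
}
```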
In the Java SE API documentation, Unicode code point is
used for character values in the range between U+0000 and U+10FFFF,
and Unicode code unit is used for 16-bit
char values that are code units of the UTF-16
encoding. For more information on Unicode terminology, refer to the
Unicode Glossary.
Subset instance.
name The name of this subset
NullPointerException if name is null
Object.hashCode() method. This method
is final in order to ensure that the
equals and hashCode methods will
be consistent in all subclasses.
new UnicodeBlock("LATIN_1_SUPPLEMENT", new String[]{ "Latin-1 Supplement", "Latin-1Supplement"});
new UnicodeBlock("GENERAL_PUNCTUATION", new String[] {"General Punctuation", "GeneralPunctuation"});
new UnicodeBlock("COMBINING_MARKS_FOR_SYMBOLS", new String[] {"Combining Diacritical Marks for Symbols",
new UnicodeBlock("LETTERLIKE_SYMBOLS", new String[] { "Letterlike Symbols", "LetterlikeSymbols"});
new UnicodeBlock("OPTICAL_CHARACTER_RECOGNITION", new String[] {"Optical Character Recognition",
new UnicodeBlock("ENCLOSED_CJK_LETTERS_AND_MONTHS", new String[] {"Enclosed CJK Letters and Months",
new UnicodeBlock("ALPHABETIC_PRESENTATION_FORMS", new String[] {"Alphabetic Presentation Forms",
HIGH_SURROGATES,
HIGH_PRIVATE_USE_SURROGATES, and
LOW_SURROGATES. These new constants match
the block definitions of the Unicode Standard.
The of(char) and of(int) methods
return the new constants, not SURROGATES_AREA.
new UnicodeBlock("IDEOGRAPHIC_DESCRIPTION_CHARACTERS", new String[] {"Ideographic Description Characters",
new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A", new String[] {"CJK Unified Ideographs Extension A",
new UnicodeBlock("PHONETIC_EXTENSIONS", new String[] {"Phonetic Extensions", "PhoneticExtensions"});
new UnicodeBlock("MISCELLANEOUS_SYMBOLS_AND_ARROWS", new String[] {"Miscellaneous Symbols and Arrows",
new UnicodeBlock("VARIATION_SELECTORS", new String[] {"Variation Selectors", "VariationSelectors"});
new UnicodeBlock("LINEAR_B_SYLLABARY", new String[] {"Linear B Syllabary", "LinearBSyllabary"});
new UnicodeBlock("LINEAR_B_IDEOGRAMS", new String[] {"Linear B Ideograms", "LinearBIdeograms"});
new UnicodeBlock("VARIATION_SELECTORS_SUPPLEMENT", new String[] {"Variation Selectors Supplement",
null if the character is not a
member of a defined block.
Note: This method cannot handle supplementary
characters. To support all Unicode characters,
including supplementary characters, use the of(int) method.
c The character in question
UnicodeBlock instance representing the
Unicode block of which this character is a member, or
null if the character is not a member of any
Unicode block
null if the character is not a member of a
defined block.
codePoint the character (Unicode code point) in question.
UnicodeBlock instance representing the
Unicode block of which this character is a member, or
null if the character is not a member of any
Unicode block
IllegalArgumentException if the specified
codePoint is an invalid Unicode code point.
Character.isValidCodePoint(int)
Character class specifies
the version of the standard that it supports.
This method accepts block names in the following forms:
BASIC_LATIN block if
provided with the "BASIC_LATIN" name. This form replaces all spaces and
hyphens in the canonical name with underscores.
If the Unicode Standard changes block names, both the previous and current names will be accepted.
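A short sketch of looking up a block by each accepted name form follows. The class BlockLookup and its lookup helper are hypothetical; only UnicodeBlock.forName and UnicodeBlock.of are platform methods.

```java
import java.lang.Character.UnicodeBlock;

class BlockLookup {
    // Hypothetical helper: returns the block for a name, or null when the name is not recognized.
    static UnicodeBlock lookup(String name) {
        try {
            return UnicodeBlock.forName(name);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("Basic Latin") == UnicodeBlock.BASIC_LATIN); // true
        System.out.println(lookup("BasicLatin") == UnicodeBlock.BASIC_LATIN);  // true
        System.out.println(lookup("BASIC_LATIN") == UnicodeBlock.BASIC_LATIN); // true
        System.out.println(lookup("no such block"));                           // null
        System.out.println(UnicodeBlock.of('a') == UnicodeBlock.BASIC_LATIN);  // true
    }
}
```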
blockName A UnicodeBlock name.
UnicodeBlock instance identified
by blockName
IllegalArgumentException if blockName is an
invalid name
NullPointerException if blockName is null
Character(char), as this method is likely to yield
significantly better space and time performance by caching
frequently requested values.
c a char value.
0x0000 to
0x10FFFF inclusive. This method is equivalent to
the expression:
codePoint >= 0x0000 && codePoint <= 0x10FFFF
codePoint the Unicode code point to be tested
true if the specified code point value
is a valid code point value;
false otherwise.
codePoint >= 0x10000 && codePoint <= 0x10FFFF
codePoint the character (Unicode code point) to be tested
true if the specified character is in the Unicode
supplementary character range; false otherwise.
char value is a
high-surrogate code unit (also known as leading-surrogate
code unit). Such values do not represent characters by
themselves, but are used in the representation of supplementary characters in the
UTF-16 encoding.
This method returns true if and only if
ch >= '\uD800' && ch <= '\uDBFF'
is true.
ch the char value to be tested.
true if the char value
is between '\uD800' and '\uDBFF' inclusive;
false otherwise.
isLowSurrogate(char), Character.UnicodeBlock.of(int)
char value is a
low-surrogate code unit (also known as trailing-surrogate code
unit). Such values do not represent characters by themselves,
but are used in the representation of supplementary characters in the UTF-16 encoding.
This method returns true if and only if
ch >= '\uDC00' && ch <= '\uDFFF'
is true.
ch the char value to be tested.
true if the char value
is between '\uDC00' and '\uDFFF' inclusive;
false otherwise.
isHighSurrogate(char)
char
values is a valid surrogate pair. This method is equivalent to
the expression:
isHighSurrogate(high) && isLowSurrogate(low)
high the high-surrogate code value to be tested
low the low-surrogate code value to be tested
true if the specified high and
low-surrogate code values represent a valid surrogate pair;
false otherwise.
char values needed to
represent the specified character (Unicode code point). If the
specified character is equal to or greater than 0x10000, then
the method returns 2. Otherwise, the method returns 1.
This method doesn't validate the specified character to be a
valid Unicode code point. The caller must validate the
character value using isValidCodePoint
if necessary.
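Because charCount does not validate its argument, a caller that may receive arbitrary int values can combine it with isValidCodePoint. The helper checkedCharCount below is a hypothetical sketch of that pattern, including its choice of -1 as the "invalid" marker.

```java
class CodePointWidth {
    // Hypothetical helper: number of chars needed for codePoint,
    // or -1 if codePoint is not a valid Unicode code point.
    static int checkedCharCount(int codePoint) {
        return Character.isValidCodePoint(codePoint) ? Character.charCount(codePoint) : -1;
    }

    public static void main(String[] args) {
        System.out.println(Character.charCount(0x41));     // 1
        System.out.println(Character.charCount(0x2F81A));  // 2
        System.out.println(Character.charCount(0x110000)); // 2, although the value is out of range
        System.out.println(checkedCharCount(0x110000));    // -1
    }
}
```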
codePoint the character (Unicode code point) to be tested.
isSupplementaryCodePoint(int)
isSurrogatePair(char,char) if necessary.
high the high-surrogate code unit
low the low-surrogate code unit
CharSequence. If the char value at
the given index in the CharSequence is in the
high-surrogate range, the following index is less than the
length of the CharSequence, and the
char value at the following index is in the
low-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at the given index is returned.
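A common use of codePointAt is forward iteration over a sequence, advancing by charCount after each code point. The following is a minimal sketch; the class and the decode helper are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointAtDemo {
    // Hypothetical helper: collects the code points of seq from left to right.
    static List<Integer> decode(CharSequence seq) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < seq.length(); ) {
            int cp = Character.codePointAt(seq, i);
            out.add(cp);
            i += Character.charCount(cp); // skip both chars of a surrogate pair
        }
        return out;
    }

    public static void main(String[] args) {
        // chars: 'a', high surrogate D87E, low surrogate DC1A, 'b'
        String s = "a" + new String(Character.toChars(0x2F81A)) + "b";
        for (int cp : decode(s)) {
            System.out.println(Integer.toHexString(cp)); // 61, 2f81a, 62
        }
        // Pointing at the low surrogate alone returns that char value unchanged.
        System.out.println(Integer.toHexString(Character.codePointAt(s, 2))); // dc1a
    }
}
```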
seq a sequence of char values (Unicode code
units)
index the index to the char values (Unicode
code units) in seq to be converted
NullPointerException if seq is null.
IndexOutOfBoundsException if the value
index is negative or not less than
seq.length().
char array. If the char value at
the given index in the char array is in the
high-surrogate range, the following index is less than the
length of the char array, and the
char value at the following index is in the
low-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at the given index is returned.
a the char array
index the index to the char values (Unicode
code units) in the char array to be converted
NullPointerException if a is null.
IndexOutOfBoundsException if the value
index is negative or not less than
the length of the char array.
char array, where only array elements with
index less than limit can be used. If
the char value at the given index in the
char array is in the high-surrogate range, the
following index is less than the limit, and the
char value at the following index is in the
low-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at the given index is returned.
a the char array
index the index to the char values (Unicode
code units) in the char array to be converted
limit the index after the last array element that can be used in the
char array
NullPointerException if a is null.
IndexOutOfBoundsException if the index
argument is negative or not less than the limit
argument, or if the limit argument is negative or
greater than the length of the char array.
CharSequence. If the char value at
(index - 1) in the CharSequence is in
the low-surrogate range, (index - 2) is not
negative, and the char value at (index -
2) in the CharSequence is in the
high-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at (index - 1) is
returned.
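The mirror image of forward iteration is walking a sequence from the end with codePointBefore. The sketch below assumes nothing beyond the standard methods; the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointBeforeDemo {
    // Hypothetical helper: collects the code points of seq from right to left.
    static List<Integer> decodeBackward(CharSequence seq) {
        List<Integer> out = new ArrayList<>();
        for (int i = seq.length(); i > 0; ) {
            int cp = Character.codePointBefore(seq, i);
            out.add(cp);
            i -= Character.charCount(cp); // step back over one or two chars
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x2F81A)) + "b";
        for (int cp : decodeBackward(s)) {
            System.out.println(Integer.toHexString(cp)); // 62, 2f81a, 61
        }
        // Index 2 splits the pair, so only the high surrogate before it is returned.
        System.out.println(Integer.toHexString(Character.codePointBefore(s, 2))); // d87e
    }
}
```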
seq the CharSequence instance
index the index following the code point that should be returned
NullPointerException if seq is null.
IndexOutOfBoundsException if the index
argument is less than 1 or greater than CharSequence.length().
char array. If the char value at
(index - 1) in the char array is in
the low-surrogate range, (index - 2) is not
negative, and the char value at (index -
2) in the char array is in the
high-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at (index - 1) is
returned.
a the char array
index the index following the code point that should be returned
NullPointerException if a is null.
IndexOutOfBoundsException if the index
argument is less than 1 or greater than the length of the
char array
char array, where only array elements with
index greater than or equal to start
can be used. If the char value at (index -
1) in the char array is in the
low-surrogate range, (index - 2) is not less than
start, and the char value at
(index - 2) in the char array is in
the high-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at (index - 1) is
returned.
a the char array
index the index following the code point that should be returned
start the index of the first array element in the
char array
NullPointerException if a is null.
IndexOutOfBoundsException if the index
argument is not greater than the start argument or
is greater than the length of the char array, or
if the start argument is negative or not less than
the length of the char array.
dst[dstIndex], and 1 is returned. If the
specified code point is a supplementary character, its
surrogate values are stored in dst[dstIndex]
(high-surrogate) and dst[dstIndex+1]
(low-surrogate), and 2 is returned.
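Since the return value is the number of chars written, it can be used to advance a write position when filling a buffer. The following sketch uses a hypothetical append helper around the three-argument toChars method.

```java
class ToCharsDemo {
    // Hypothetical helper: writes codePoint at dstIndex and returns the next free index.
    static int append(int codePoint, char[] dst, int dstIndex) {
        return dstIndex + Character.toChars(codePoint, dst, dstIndex);
    }

    public static void main(String[] args) {
        char[] dst = new char[3];
        int next = append('A', dst, 0);     // BMP value: one char written
        next = append(0x2F81A, dst, next);  // supplementary value: two chars written
        System.out.println(next);           // 3

        // A supplementary code point does not fit into a single remaining element.
        char[] small = new char[1];
        try {
            Character.toChars(0x2F81A, small, 0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println((int) small[0]); // 0, nothing was stored
        }
    }
}
```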
codePoint the character (Unicode code point) to be converted.
dst an array of char in which the
codePoint's UTF-16 value is stored.
dstIndex the start index into the dst
array where the converted value is stored.
IllegalArgumentException if the specified
codePoint is not a valid Unicode code point.
NullPointerException if the specified dst is null.
IndexOutOfBoundsException if dstIndex
is negative or not less than dst.length, or if
dst at dstIndex doesn't have enough
array element(s) to store the resulting char
value(s). (If dstIndex is equal to
dst.length-1 and the specified
codePoint is a supplementary character, the
high-surrogate value is not stored in
dst[dstIndex].)
char array. If
the specified code point is a BMP (Basic Multilingual Plane or
Plane 0) value, the resulting char array has
the same value as codePoint. If the specified code
point is a supplementary code point, the resulting
char array has the corresponding surrogate pair.
codePoint a Unicode code point
char array having
codePoint's UTF-16 representation.
IllegalArgumentException if the specified
codePoint is not a valid Unicode code point.
beginIndex and extends to the
char at index endIndex - 1. Thus the
length (in chars) of the text range is
endIndex-beginIndex. Unpaired surrogates within
the text range count as one code point each.
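The difference between the char length and the code point count, and the treatment of an unpaired surrogate, can be illustrated as follows. The class and the count helper are hypothetical.

```java
class CodePointCountDemo {
    // Hypothetical helper: number of code points in the whole sequence.
    static int count(CharSequence seq) {
        return Character.codePointCount(seq, 0, seq.length());
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x2F81A)) + "b";
        System.out.println(s.length()); // 4
        System.out.println(count(s));   // 3

        // The range [0, 2) ends between the two surrogates, so the high
        // surrogate is unpaired within the range and counts as one code point.
        System.out.println(Character.codePointCount(s, 0, 2)); // 2
    }
}
```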
seq the char sequence
beginIndex the index to the first char of
the text range.
endIndex the index after the last char of
the text range.
NullPointerException if seq is null.
IndexOutOfBoundsException if the
beginIndex is negative, or endIndex
is larger than the length of the given sequence, or
beginIndex is larger than endIndex.
char array argument. The offset
argument is the index of the first char of the
subarray and the count argument specifies the
length of the subarray in chars. Unpaired
surrogates within the subarray count as one code point each.
a the char array
offset the index of the first char in the
given char array
count the length of the subarray in chars
NullPointerException if a is null.
IndexOutOfBoundsException if offset or
count is negative, or if offset +
count is larger than the length of the given array.
index by codePointOffset
code points. Unpaired surrogates within the text range given by
index and codePointOffset count as
one code point each.
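One use of offsetByCodePoints is truncating text to a number of code points without splitting a surrogate pair. The helper firstCodePoints below is a hypothetical sketch of that idea.

```java
class OffsetDemo {
    // Hypothetical helper: the prefix of s that holds its first n code points.
    static String firstCodePoints(String s, int n) {
        int end = Character.offsetByCodePoints(s, 0, n);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x2F81A)) + "b";
        System.out.println(Character.offsetByCodePoints(s, 0, 2));  // 3
        System.out.println(Character.offsetByCodePoints(s, 4, -2)); // 1
        System.out.println(firstCodePoints(s, 2).length());         // 3
    }
}
```

Asking for more code points than the sequence holds, for example firstCodePoints(s, 4) here, results in an IndexOutOfBoundsException.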
seq the char sequence
index the index to be offset
codePointOffset the offset in code points
NullPointerException if seq is null.
IndexOutOfBoundsException if index
is negative or larger than the length of the char sequence,
or if codePointOffset is positive and the
subsequence starting with index has fewer than
codePointOffset code points, or if
codePointOffset is negative and the subsequence
before index has fewer than the absolute value
of codePointOffset code points.
char subarray
that is offset from the given index by
codePointOffset code points. The
start and count arguments specify a
subarray of the char array. Unpaired surrogates
within the text range given by index and
codePointOffset count as one code point each.
a the char array
start the index of the first char of the
subarray
count the length of the subarray in chars
index the index to be offset
codePointOffset the offset in code points
NullPointerException if a is null.
IndexOutOfBoundsException
if start or count is negative,
or if start + count is larger than the length of
the given array,
or if index is less than start or
larger than start + count,
or if codePointOffset is positive and the text range
starting with index and ending with start
+ count - 1 has fewer than codePointOffset code
points,
or if codePointOffset is negative and the text range
starting with start and ending with index
- 1 has fewer than the absolute value of
codePointOffset code points.
A character is lowercase if its general category type, provided
by Character.getType(ch), is
LOWERCASE_LETTER.
The following are examples of lowercase characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z '\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6' '\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE' '\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6' '\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF'
Many other Unicode characters are lowercase too.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isLowerCase(int) method.
ch the character to be tested.
true if the character is lowercase;
false otherwise.
isLowerCase(char), isTitleCase(char), toLowerCase(char), getType(char)
A character is lowercase if its general category type, provided
by getType(codePoint), is
LOWERCASE_LETTER.
The following are examples of lowercase characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z '\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6' '\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE' '\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6' '\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF'
Many other Unicode characters are lowercase too.
codePoint the character (Unicode code point) to be tested.
true if the character is lowercase;
false otherwise.
isLowerCase(int), isTitleCase(int), toLowerCase(int), getType(int)
A character is uppercase if its general category type, provided by
Character.getType(ch), is UPPERCASE_LETTER.
The following are examples of uppercase characters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7' '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF' '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8' '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE'
Many other Unicode characters are uppercase too.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isUpperCase(int) method.
ch the character to be tested.
true if the character is uppercase;
false otherwise.
isLowerCase(char), isTitleCase(char), toUpperCase(char), getType(char)
A character is uppercase if its general category type, provided by
getType(codePoint), is UPPERCASE_LETTER.
The following are examples of uppercase characters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7' '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF' '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8' '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE'
Many other Unicode characters are uppercase too.
codePoint the character (Unicode code point) to be tested.
true if the character is uppercase;
false otherwise.
isLowerCase(int), isTitleCase(int), toUpperCase(int), getType(int)
A character is a titlecase character if its general
category type, provided by Character.getType(ch),
is TITLECASE_LETTER.
Some characters look like pairs of Latin letters. For example, there is an uppercase letter that looks like "LJ" and has a corresponding lowercase letter that looks like "lj". A third form, which looks like "Lj", is the appropriate form to use when rendering a word in lowercase with initial capitals, as for a book title.
These are some of the Unicode characters for which this method returns
true:
LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
LATIN CAPITAL LETTER L WITH SMALL LETTER J
LATIN CAPITAL LETTER N WITH SMALL LETTER J
LATIN CAPITAL LETTER D WITH SMALL LETTER Z
Many other Unicode characters are titlecase too.
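The three case forms of the "LJ" letter mentioned above are U+01C7 (uppercase), U+01C8 (titlecase) and U+01C9 (lowercase). The sketch below classifies them; the class TitleCaseDemo and the helper caseOf are hypothetical names.

```java
class TitleCaseDemo {
    // Hypothetical helper: names the case category of ch.
    static String caseOf(char ch) {
        if (Character.isUpperCase(ch)) return "upper";
        if (Character.isTitleCase(ch)) return "title";
        if (Character.isLowerCase(ch)) return "lower";
        return "uncased";
    }

    public static void main(String[] args) {
        char upper = (char) 0x01C7; // looks like "LJ"
        char title = (char) 0x01C8; // looks like "Lj"
        char lower = (char) 0x01C9; // looks like "lj"

        System.out.println(caseOf(upper)); // upper
        System.out.println(caseOf(title)); // title
        System.out.println(caseOf(lower)); // lower

        // Both the uppercase and the lowercase form map to the titlecase form.
        System.out.println(Character.toTitleCase(upper) == title); // true
        System.out.println(Character.toTitleCase(lower) == title); // true
    }
}
```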
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isTitleCase(int) method.
ch the character to be tested.
true if the character is titlecase;
false otherwise.
isLowerCase(char), isUpperCase(char), toTitleCase(char), getType(char)
A character is a titlecase character if its general
category type, provided by getType(codePoint),
is TITLECASE_LETTER.
Some characters look like pairs of Latin letters. For example, there is an uppercase letter that looks like "LJ" and has a corresponding lowercase letter that looks like "lj". A third form, which looks like "Lj", is the appropriate form to use when rendering a word in lowercase with initial capitals, as for a book title.
These are some of the Unicode characters for which this method returns
true:
LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
LATIN CAPITAL LETTER L WITH SMALL LETTER J
LATIN CAPITAL LETTER N WITH SMALL LETTER J
LATIN CAPITAL LETTER D WITH SMALL LETTER Z
Many other Unicode characters are titlecase too.
codePoint the character (Unicode code point) to be tested.
true if the character is titlecase;
false otherwise.
isLowerCase(int), isUpperCase(int), toTitleCase(int), getType(int)
A character is a digit if its general category type, provided
by Character.getType(ch), is
DECIMAL_DIGIT_NUMBER.
Some Unicode character ranges that contain digits:
'\u0030' through '\u0039',
ISO-LATIN-1 digits ('0' through '9')
'\u0660' through '\u0669',
Arabic-Indic digits
'\u06F0' through '\u06F9',
Extended Arabic-Indic digits
'\u0966' through '\u096F',
Devanagari digits
'\uFF10' through '\uFF19',
Fullwidth digits
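Because digits from any of these ranges satisfy isDigit, they can be converted with Character.digit regardless of script. The parser below is a hypothetical sketch and does not guard against int overflow.

```java
class DigitDemo {
    // Hypothetical helper: parses a run of Unicode decimal digits (any script) into an int.
    static int parseDigits(String s) {
        int value = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isDigit(cp)) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            value = value * 10 + Character.digit(cp, 10);
            i += Character.charCount(cp);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseDigits("42"));           // 42 (ISO-LATIN-1 digits)
        System.out.println(parseDigits("\u0664\u0662")); // 42 (Arabic-Indic digits)
        System.out.println(parseDigits("\uFF14\uFF12")); // 42 (Fullwidth digits)
    }
}
```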
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isDigit(int) method.
ch the character to be tested.
true if the character is a digit;
false otherwise.
digit(char,int), forDigit(int,int), getType(char)
A character is a digit if its general category type, provided
by getType(codePoint), is
DECIMAL_DIGIT_NUMBER.
Some Unicode character ranges that contain digits:
'\u0030' through '\u0039',
ISO-LATIN-1 digits ('0' through '9')
'\u0660' through '\u0669',
Arabic-Indic digits
'\u06F0' through '\u06F9',
Extended Arabic-Indic digits
'\u0966' through '\u096F',
Devanagari digits
'\uFF10' through '\uFF19',
Fullwidth digits
codePoint the character (Unicode code point) to be tested.
true if the character is a digit;
false otherwise.
forDigit(int,int), getType(int)
A character is defined if at least one of the following is true:
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isDefined(int) method.
ch the character to be tested
true if the character has a defined meaning
in Unicode; false otherwise.
isDigit(char), isLetter(char), isLetterOrDigit(char), isLowerCase(char), isTitleCase(char), isUpperCase(char)
A character is defined if at least one of the following is true:
codePoint the character (Unicode code point) to be tested.
true if the character has a defined meaning
in Unicode; false otherwise.
isDigit(int), isLetter(int), isLetterOrDigit(int), isLowerCase(int), isTitleCase(int), isUpperCase(int)
A character is considered to be a letter if its general
category type, provided by Character.getType(ch),
is any of the following:
UPPERCASE_LETTER
LOWERCASE_LETTER
TITLECASE_LETTER
MODIFIER_LETTER
OTHER_LETTER
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isLetter(int) method.
ch the character to be tested.
true if the character is a letter;
false otherwise.
isDigit(char), isJavaIdentifierStart(char), isJavaLetter(char), isJavaLetterOrDigit(char), isLetterOrDigit(char), isLowerCase(char), isTitleCase(char), isUnicodeIdentifierStart(char), isUpperCase(char)
A character is considered to be a letter if its general
category type, provided by getType(codePoint),
is any of the following:
UPPERCASE_LETTER
LOWERCASE_LETTER
TITLECASE_LETTER
MODIFIER_LETTER
OTHER_LETTER
codePoint the character (Unicode code point) to be tested.
true if the character is a letter;
false otherwise.
isDigit(int), isJavaIdentifierStart(int), isLetterOrDigit(int), isLowerCase(int), isTitleCase(int), isUnicodeIdentifierStart(int), isUpperCase(int)
A character is considered to be a letter or digit if either
Character.isLetter(char ch) or
Character.isDigit(char ch) returns
true for the character.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isLetterOrDigit(int) method.
ch the character to be tested.
true if the character is a letter or digit;
false otherwise.
isDigit(char), isJavaIdentifierPart(char), isJavaLetter(char), isJavaLetterOrDigit(char), isLetter(char), isUnicodeIdentifierPart(char)
A character is considered to be a letter or digit if either
isLetter(codePoint) or
isDigit(codePoint) returns
true for the character.
codePoint the character (Unicode code point) to be tested.
true if the character is a letter or digit;
false otherwise.
isDigit(int), isJavaIdentifierPart(int), isLetter(int), isUnicodeIdentifierPart(int)
A character may start a Java identifier if and only if one of the following is true:
isLetter(ch) returns true
getType(ch) returns LETTER_NUMBER
ch the character to be tested.
true if the character may start a Java
identifier; false otherwise.
isJavaLetterOrDigit(char), isJavaIdentifierStart(char), isJavaIdentifierPart(char), isLetter(char), isLetterOrDigit(char), isUnicodeIdentifierStart(char)
A character may be part of a Java identifier if and only if any of the following are true:
'$')
'_')
isIdentifierIgnorable returns
true for the character.
ch the character to be tested.
true if the character may be part of a
Java identifier; false otherwise.
isJavaLetter(char), isJavaIdentifierStart(char), isJavaIdentifierPart(char), isLetter(char), isLetterOrDigit(char), isUnicodeIdentifierPart(char), isIdentifierIgnorable(char)
A character may start a Java identifier if and only if one of the following conditions is true:
isLetter(ch) returns true
getType(ch) returns LETTER_NUMBER
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isJavaIdentifierStart(int) method.
ch the character to be tested.
true if the character may start a Java identifier;
false otherwise.
isJavaIdentifierPart(char), isLetter(char), isUnicodeIdentifierStart(char), javax.lang.model.SourceVersion.isIdentifier(java.lang.CharSequence)
A character may start a Java identifier if and only if one of the following conditions is true:
isLetter(codePoint)
returns true
getType(codePoint)
returns LETTER_NUMBER
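The start and part tests are typically combined to check a whole name. The following is a hypothetical sketch of such a check; it works on code points so that supplementary characters are handled, and it does not reject reserved words (SourceVersion.isIdentifier, listed among the related methods, is the platform method for validating complete identifiers).

```java
class IdentifierCheck {
    // Hypothetical helper: true if s is non-empty, begins with an identifier-start
    // character and continues with identifier-part characters. Keywords are not rejected.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("_count1")); // true
        System.out.println(looksLikeIdentifier("$x"));      // true
        System.out.println(looksLikeIdentifier("1abc"));    // false, a digit cannot start a name
        System.out.println(looksLikeIdentifier("a-b"));     // false, '-' is not an identifier part
    }
}
```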
codePoint the character (Unicode code point) to be tested.
true if the character may start a Java identifier;
false otherwise.
isJavaIdentifierPart(int), isLetter(int), isUnicodeIdentifierStart(int), javax.lang.model.SourceVersion.isIdentifier(java.lang.CharSequence)
A character may be part of a Java identifier if any of the following are true:
'$')
'_')
isIdentifierIgnorable returns
true for the character
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isJavaIdentifierPart(int) method.
ch the character to be tested.
true if the character may be part of a
Java identifier; false otherwise.
isIdentifierIgnorable(char), isJavaIdentifierStart(char), isLetterOrDigit(char), isUnicodeIdentifierPart(char), javax.lang.model.SourceVersion.isIdentifier(java.lang.CharSequence)
A character may be part of a Java identifier if any of the following are true:
'$')
'_')
isIdentifierIgnorable(codePoint) returns true for
the character
codePoint the character (Unicode code point) to be tested.
true if the character may be part of a
Java identifier; false otherwise.
isIdentifierIgnorable(int), isJavaIdentifierStart(int), isLetterOrDigit(int), isUnicodeIdentifierPart(int), javax.lang.model.SourceVersion.isIdentifier(java.lang.CharSequence)
A character may start a Unicode identifier if and only if one of the following conditions is true:
isLetter(ch) returns true
getType(ch) returns
LETTER_NUMBER.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isUnicodeIdentifierStart(int) method.
ch the character to be tested.
true if the character may start a Unicode
identifier; false otherwise.
isJavaIdentifierStart(char), isLetter(char), isUnicodeIdentifierPart(char)
A character may start a Unicode identifier if and only if one of the following conditions is true:
isLetter(codePoint)
returns true
getType(codePoint)
returns LETTER_NUMBER.
codePoint the character (Unicode code point) to be tested.
true if the character may start a Unicode
identifier; false otherwise.
isJavaIdentifierStart(int), isLetter(int), isUnicodeIdentifierPart(int)
A character may be part of a Unicode identifier if and only if one of the following statements is true:
'_')
isIdentifierIgnorable returns
true for this character.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isUnicodeIdentifierPart(int) method.
ch the character to be tested.
true if the character may be part of a
Unicode identifier; false otherwise.
isIdentifierIgnorable(char), isJavaIdentifierPart(char), isLetterOrDigit(char), isUnicodeIdentifierStart(char)
A character may be part of a Unicode identifier if and only if one of the following statements is true:
'_')
isIdentifierIgnorable returns
true for this character.
codePoint the character (Unicode code point) to be tested.
true if the character may be part of a
Unicode identifier; false otherwise.
isIdentifierIgnorable(int), isJavaIdentifierPart(int), isLetterOrDigit(int), isUnicodeIdentifierStart(int)
The following Unicode characters are ignorable in a Java identifier or a Unicode identifier:
'\u0000' through '\u0008'
'\u000E' through '\u001B'
'\u007F' through '\u009F'
FORMAT general
category value
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isIdentifierIgnorable(int) method.
ch the character to be tested.
true if the character is an ignorable control
character that may be part of a Java or Unicode identifier;
false otherwise.
isJavaIdentifierPart(char), isUnicodeIdentifierPart(char)
The following Unicode characters are ignorable in a Java identifier or a Unicode identifier:
'\u0000' through '\u0008'
'\u000E' through '\u001B'
'\u007F' through '\u009F'
FORMAT general
category value
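As an illustration, the sketch below removes ignorable characters from a string. The class and the stripIgnorable helper are hypothetical; U+200E (LEFT-TO-RIGHT MARK) is used as an example of a character with the FORMAT general category.

```java
class IgnorableDemo {
    // Hypothetical helper: returns s without its identifier-ignorable code points.
    static String stripIgnorable(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints()
         .filter(cp -> !Character.isIdentifierIgnorable(cp))
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Contains a NUL control character and a LEFT-TO-RIGHT MARK.
        String s = "ab" + (char) 0x0000 + "c" + (char) 0x200E + "d";
        System.out.println(stripIgnorable(s));                     // abcd
        System.out.println(Character.isIdentifierIgnorable(0x09)); // false, tab is outside the listed ranges
    }
}
```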
codePoint the character (Unicode code point) to be tested.
true if the character is an ignorable control
character that may be part of a Java or Unicode identifier;
false otherwise.
isJavaIdentifierPart(int), isUnicodeIdentifierPart(int)
Note that
Character.isLowerCase(Character.toLowerCase(ch))
does not always return true for some ranges of
characters, particularly those that are symbols or ideographs.
In general, String.toLowerCase() should be used to map
characters to lowercase. String case mapping methods
have several benefits over Character case mapping methods.
String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.
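A locale-sensitive mapping can be illustrated with the Turkish lowercase of 'I', which is the dotless i (U+0131). The sketch below is illustrative only; the class and the lower helper are hypothetical, and the locale is passed explicitly so the result does not depend on the default locale.

```java
import java.util.Locale;

class LowerCaseDemo {
    // Hypothetical helper: lowercases s under the locale named by languageTag.
    static String lower(String s, String languageTag) {
        return s.toLowerCase(Locale.forLanguageTag(languageTag));
    }

    public static void main(String[] args) {
        // Character mapping is locale-independent.
        System.out.println(Character.toLowerCase('I'));                   // i
        // String mapping can take the locale into account.
        System.out.println((int) lower("I", "tr").charAt(0) == 0x0131);   // true
        System.out.println("I".toLowerCase(Locale.ROOT));                 // i
        // A symbol has no lowercase form, so the round trip is not lowercase.
        System.out.println(Character.isLowerCase(Character.toLowerCase('$'))); // false
    }
}
```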
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the toLowerCase(int) method.
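The difference between the two kinds of case mapping can be seen with a locale-sensitive example. The following is a sketch under the assumption that a Turkish locale is available at run time; LowerCaseDemo, perChar and viaString are hypothetical names.

```java
// Hypothetical demo class; only Character, String and Locale calls are real API.
class LowerCaseDemo {
    static final java.util.Locale TURKISH = java.util.Locale.forLanguageTag("tr");

    // Lowercases one char at a time; no locale or context is consulted.
    static String perChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toLowerCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // Lowercases the whole string using the rules of the given locale.
    static String viaString(String s, java.util.Locale locale) {
        return s.toLowerCase(locale);
    }
}
```

With these helpers, perChar("TITLE") yields "title", while viaString("TITLE", LowerCaseDemo.TURKISH) maps 'I' to the dotless '\u0131'.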
Parameters: ch - the character to be converted.
See also: isLowerCase(char), String.toLowerCase()
Note that
Character.isLowerCase(Character.toLowerCase(codePoint))
does not always return true for some ranges of
characters, particularly those that are symbols or ideographs.
In general, String.toLowerCase() should be used to map
characters to lowercase. String case mapping methods
have several benefits over Character case mapping methods.
String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.
Parameters: codePoint - the character (Unicode code point) to be converted.
See also: isLowerCase(int), String.toLowerCase()
Note that
Character.isUpperCase(Character.toUpperCase(ch))
does not always return true for some ranges of
characters, particularly those that are symbols or ideographs.
In general, String.toUpperCase() should be used to map
characters to uppercase. String case mapping methods
have several benefits over Character case mapping methods.
String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the toUpperCase(int) method.
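A 1:M mapping shows why the String methods are preferred. In the sketch below, UpperCaseDemo and perCodePoint are hypothetical names; the sample word containing '\u00DF' (LATIN SMALL LETTER SHARP S) is only an illustration.

```java
// Hypothetical demo class; not part of java.lang.Character.
class UpperCaseDemo {
    // Uppercases code point by code point, so each input code point
    // produces exactly one output code point.
    static String perCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toUpperCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String word = "stra\u00DFe";
        // The sharp s is left unchanged by the Character mapping.
        System.out.println(perCodePoint(word).length());                        // 6
        // String.toUpperCase expands it to "SS", so the result is one char longer.
        System.out.println(word.toUpperCase(java.util.Locale.ROOT).length());   // 7
    }
}
```

The int overload does handle supplementary characters, for example mapping U+10428 to U+10400.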
Parameters: ch - the character to be converted.
See also: isUpperCase(char), String.toUpperCase()
Note that
Character.isUpperCase(Character.toUpperCase(codePoint))
does not always return true for some ranges of
characters, particularly those that are symbols or ideographs.
In general, String.toUpperCase() should be used to map
characters to uppercase. String case mapping methods
have several benefits over Character case mapping methods.
String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.
Parameters: codePoint - the character (Unicode code point) to be converted.
See also: isUpperCase(int), String.toUpperCase()
If the char argument is already a titlecase
char, the same char value will be
returned.
Note that
Character.isTitleCase(Character.toTitleCase(ch))
does not always return true for some ranges of
characters.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the toTitleCase(int) method.
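One way to see the titlecase mapping in practice is a small capitalization helper. This is a sketch, not a recommended text-processing routine; TitleCaseDemo and capitalize are hypothetical names.

```java
// Hypothetical demo class; not part of java.lang.Character.
class TitleCaseDemo {
    // Converts only the first code point of word to titlecase.
    static String capitalize(String word) {
        if (word.length() == 0) {
            return word;
        }
        int first = word.codePointAt(0);
        StringBuilder sb = new StringBuilder(word.length());
        sb.appendCodePoint(Character.toTitleCase(first));
        sb.append(word.substring(Character.charCount(first)));
        return sb.toString();
    }
}
```

For most letters the result is the uppercase form, so Character.isTitleCase(Character.toTitleCase('a')) is false. A digraph such as '\u01C6' has a distinct titlecase form, '\u01C5'.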
Parameters: ch - the character to be converted.
See also: isTitleCase(char), toLowerCase(char), toUpperCase(char)
Note that
Character.isTitleCase(Character.toTitleCase(codePoint))
does not always return true for some ranges of
characters.
Parameters: codePoint - the character (Unicode code point) to be converted.
See also: isTitleCase(int), toLowerCase(int), toUpperCase(int)
Returns the numeric value of the character ch in the
specified radix.
If the radix is not in the range MIN_RADIX <=
radix <= MAX_RADIX or if the
value of ch is not a valid digit in the specified
radix, -1 is returned. A character is a valid digit
if at least one of the following is true:
The method isDigit is true of the character
and the Unicode decimal digit value of the character (or its
single-character decomposition) is less than the specified radix.
In this case the decimal digit value is returned.
The character is one of the uppercase Latin letters
'A' through 'Z' and its code is less than
radix + 'A' - 10.
In this case, ch - 'A' + 10
is returned.
The character is one of the lowercase Latin letters
'a' through 'z' and its code is less than
radix + 'a' - 10.
In this case, ch - 'a' + 10
is returned.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the digit(int,int) method.
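The rules above can be exercised with a small parser built on digit(int,int). This is a minimal sketch that ignores overflow; DigitDemo and parse are hypothetical names, and returning -1 for bad input is a choice made for this example only.

```java
// Hypothetical demo class; not part of java.lang.Character.
class DigitDemo {
    // Parses a non-negative number in the given radix.
    // Returns -1 for an empty string, an invalid digit, or an invalid radix.
    static long parse(String s, int radix) {
        if (s.length() == 0) {
            return -1;
        }
        long value = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            int d = Character.digit(cp, radix); // -1 if cp is not a digit in this radix
            if (d < 0) {
                return -1;
            }
            value = value * radix + d;
            i += Character.charCount(cp);
        }
        return value;
    }
}
```

For instance, parse("ff", 16) gives 255, parse("19", 8) gives -1 because '9' is not an octal digit, and the Arabic-Indic digits "\u0663\u0662" parse as 32 in radix 10.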
Parameters: ch - the character to be converted; radix - the radix.
See also: forDigit(int,int), isDigit(char)
If the radix is not in the range MIN_RADIX <=
radix <= MAX_RADIX or if the
character is not a valid digit in the specified
radix, -1 is returned. A character is a valid digit
if at least one of the following is true:
The method isDigit(codePoint) is true of the character
and the Unicode decimal digit value of the character (or its
single-character decomposition) is less than the specified radix.
In this case the decimal digit value is returned.
The character is one of the uppercase Latin letters
'A' through 'Z' and its code is less than
radix + 'A' - 10.
In this case, codePoint - 'A' + 10
is returned.
The character is one of the lowercase Latin letters
'a' through 'z' and its code is less than
radix + 'a' - 10.
In this case, codePoint - 'a' + 10
is returned.
Parameters: codePoint - the character (Unicode code point) to be converted; radix - the radix.
See also: forDigit(int,int), isDigit(int)
Returns the int value that the specified Unicode
character represents. For example, the character
'\u216C' (the Roman numeral fifty) will return
an int with a value of 50.
The letters A-Z in their uppercase ('\u0041' through
'\u005A'), lowercase
('\u0061' through '\u007A'), and
full width variant ('\uFF21' through
'\uFF3A' and '\uFF41' through
'\uFF5A') forms have numeric values from 10
through 35. This is independent of the Unicode specification,
which does not assign numeric values to these char
values.
If the character does not have a numeric value, then -1 is returned. If the character has a numeric value that cannot be represented as a nonnegative integer (for example, a fractional value), then -2 is returned.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the getNumericValue(int) method.
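The three kinds of result (a value, -1, and -2) can be told apart as in the following sketch. NumericValueDemo and describe are hypothetical names used only for illustration.

```java
// Hypothetical demo class; not part of java.lang.Character.
class NumericValueDemo {
    // Turns the result of getNumericValue into a readable description.
    static String describe(int codePoint) {
        int v = Character.getNumericValue(codePoint);
        if (v == -1) {
            return "no numeric value";
        }
        if (v == -2) {
            return "not a nonnegative integer";
        }
        return Integer.toString(v);
    }

    public static void main(String[] args) {
        System.out.println(describe('\u216C')); // 50 (Roman numeral fifty)
        System.out.println(describe('a'));      // 10
        System.out.println(describe('\u00BD')); // not a nonnegative integer (vulgar fraction one half)
        System.out.println(describe('$'));      // no numeric value
    }
}
```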
Parameters: ch - the character to be converted.
Returns: the numeric value of the character, as a nonnegative int
value; -2 if the character has a numeric value that is not a
nonnegative integer; -1 if the character has no numeric value.
See also: forDigit(int,int), isDigit(char)
Returns the int value that the specified
character (Unicode code point) represents. For example, the character
'\u216C' (the Roman numeral fifty) will return
an int with a value of 50.
The letters A-Z in their uppercase ('\u0041' through
'\u005A'), lowercase
('\u0061' through '\u007A'), and
full width variant ('\uFF21' through
'\uFF3A' and '\uFF41' through
'\uFF5A') forms have numeric values from 10
through 35. This is independent of the Unicode specification,
which does not assign numeric values to these char
values.
If the character does not have a numeric value, then -1 is returned. If the character has a numeric value that cannot be represented as a nonnegative integer (for example, a fractional value), then -2 is returned.
Parameters: codePoint - the character (Unicode code point) to be converted.
Returns: the numeric value of the character, as a nonnegative int
value; -2 if the character has a numeric value that is not a
nonnegative integer; -1 if the character has no numeric value.
See also: forDigit(int,int), isDigit(int)
This method returns true for the following five
characters only:
'\t' | '\u0009' | HORIZONTAL TABULATION |
'\n' | '\u000A' | NEW LINE |
'\f' | '\u000C' | FORM FEED |
'\r' | '\u000D' | CARRIAGE RETURN |
' ' | '\u0020' | SPACE |
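Parameters: ch - the character to be tested.
Returns: true if the character is ISO-LATIN-1 white space; false otherwise.
See also: isSpaceChar(char), isWhitespace(char)
This method returns true if the character's general category type is any of the following:
SPACE_SEPARATOR
LINE_SEPARATOR
PARAGRAPH_SEPARATOR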
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isSpaceChar(int) method.
Parameters: ch - the character to be tested.
Returns: true if the character is a space character; false otherwise.
See also: isWhitespace(char)
Parameters: codePoint - the character (Unicode code point) to be tested.
Returns: true if the character is a space character; false otherwise.
See also: isWhitespace(int)
A character is a Java whitespace character if and only if it satisfies one of the following criteria:
It is a Unicode space character (SPACE_SEPARATOR,
LINE_SEPARATOR, or PARAGRAPH_SEPARATOR)
but is not also a non-breaking space ('\u00A0',
'\u2007', '\u202F').
It is '\u0009', HORIZONTAL TABULATION.
It is '\u000A', LINE FEED.
It is '\u000B', VERTICAL TABULATION.
It is '\u000C', FORM FEED.
It is '\u000D', CARRIAGE RETURN.
It is '\u001C', FILE SEPARATOR.
It is '\u001D', GROUP SEPARATOR.
It is '\u001E', RECORD SEPARATOR.
It is '\u001F', UNIT SEPARATOR.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isWhitespace(int) method.
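Because isWhitespace and isSpaceChar overlap only partly, a side-by-side check can help when choosing between them. The sketch below uses the hypothetical names WhitespaceDemo and classify.

```java
// Hypothetical demo class; not part of java.lang.Character.
class WhitespaceDemo {
    // Reports which of the two predicates accept the code point.
    static String classify(int codePoint) {
        boolean whitespace = Character.isWhitespace(codePoint);
        boolean spaceChar = Character.isSpaceChar(codePoint);
        if (whitespace && spaceChar) {
            return "both";
        }
        if (whitespace) {
            return "whitespace only";
        }
        if (spaceChar) {
            return "space char only";
        }
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(classify(' '));      // both
        System.out.println(classify('\t'));     // whitespace only (a control character, not a separator)
        System.out.println(classify('\u00A0')); // space char only (non-breaking space)
        System.out.println(classify('x'));      // neither
    }
}
```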
Parameters: ch - the character to be tested.
Returns: true if the character is a Java whitespace character; false otherwise.
See also: isSpaceChar(char)
A character is a Java whitespace character if and only if it satisfies one of the following criteria:
It is a Unicode space character (SPACE_SEPARATOR,
LINE_SEPARATOR, or PARAGRAPH_SEPARATOR)
but is not also a non-breaking space ('\u00A0',
'\u2007', '\u202F').
It is '\u0009', HORIZONTAL TABULATION.
It is '\u000A', LINE FEED.
It is '\u000B', VERTICAL TABULATION.
It is '\u000C', FORM FEED.
It is '\u000D', CARRIAGE RETURN.
It is '\u001C', FILE SEPARATOR.
It is '\u001D', GROUP SEPARATOR.
It is '\u001E', RECORD SEPARATOR.
It is '\u001F', UNIT SEPARATOR.
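Parameters: codePoint - the character (Unicode code point) to be tested.
Returns: true if the character is a Java whitespace character; false otherwise.
See also: isSpaceChar(int)
A character is considered to be an ISO control character if its code is in the range '\u0000'
through '\u001F' or in the range
'\u007F' through '\u009F'.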
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isISOControl(int) method.
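A typical use of this test is making control characters visible in log output. The following is a sketch under that assumption; IsoControlDemo, escapeControls and the "<U+XXXX>" notation are hypothetical choices for this example.

```java
// Hypothetical demo class; not part of java.lang.Character.
class IsoControlDemo {
    // Replaces each ISO control character with a visible <U+XXXX> marker.
    // All ISO control characters lie in the BMP, so iterating by char is sufficient here.
    static String escapeControls(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                sb.append(String.format("<U+%04X>", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

For example, escapeControls("a\tb") returns "a<U+0009>b", and text without control characters is returned unchanged.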
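Parameters: ch - the character to be tested.
Returns: true if the character is an ISO control character; false otherwise.
See also: isSpaceChar(char), isWhitespace(char)
A character is considered to be an ISO control character if its code is in the range '\u0000'
through '\u001F' or in the range
'\u007F' through '\u009F'.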
Parameters: codePoint - the character (Unicode code point) to be tested.
Returns: true if the character is an ISO control character; false otherwise.
See also: isSpaceChar(int), isWhitespace(int)
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the getType(int) method.
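The value returned by getType is usually compared against the general category constants of this class. The sketch below names only a handful of categories; TypeDemo and categoryName are hypothetical, and the "other" fallback is a simplification for this example.

```java
// Hypothetical demo class; not part of java.lang.Character.
class TypeDemo {
    // Maps a few general category values to their constant names.
    static String categoryName(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:
                return "UPPERCASE_LETTER";
            case Character.LOWERCASE_LETTER:
                return "LOWERCASE_LETTER";
            case Character.DECIMAL_DIGIT_NUMBER:
                return "DECIMAL_DIGIT_NUMBER";
            case Character.SPACE_SEPARATOR:
                return "SPACE_SEPARATOR";
            case Character.CURRENCY_SYMBOL:
                return "CURRENCY_SYMBOL";
            case Character.CONTROL:
                return "CONTROL";
            default:
                return "other"; // every category not listed above
        }
    }
}
```

Passing an int code point such as 0x10400 reports UPPERCASE_LETTER, which the char overload cannot do for supplementary characters.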
Parameters: ch - the character to be tested.
Returns: a value of type int representing the character's general category.
See also: COMBINING_SPACING_MARK, CONNECTOR_PUNCTUATION, CONTROL, CURRENCY_SYMBOL, DASH_PUNCTUATION, DECIMAL_DIGIT_NUMBER, ENCLOSING_MARK, END_PUNCTUATION, FINAL_QUOTE_PUNCTUATION, FORMAT, INITIAL_QUOTE_PUNCTUATION, LETTER_NUMBER, LINE_SEPARATOR, LOWERCASE_LETTER, MATH_SYMBOL, MODIFIER_LETTER, MODIFIER_SYMBOL, NON_SPACING_MARK, OTHER_LETTER, OTHER_NUMBER, OTHER_PUNCTUATION, OTHER_SYMBOL, PARAGRAPH_SEPARATOR, PRIVATE_USE, SPACE_SEPARATOR, START_PUNCTUATION, SURROGATE, TITLECASE_LETTER, UNASSIGNED, UPPERCASE_LETTER
Parameters: codePoint - the character (Unicode code point) to be tested.
Returns: a value of type int representing the character's general category.
See also: COMBINING_SPACING_MARK, CONNECTOR_PUNCTUATION, CONTROL, CURRENCY_SYMBOL, DASH_PUNCTUATION, DECIMAL_DIGIT_NUMBER, ENCLOSING_MARK, END_PUNCTUATION, FINAL_QUOTE_PUNCTUATION, FORMAT, INITIAL_QUOTE_PUNCTUATION, LETTER_NUMBER, LINE_SEPARATOR, LOWERCASE_LETTER, MATH_SYMBOL, MODIFIER_LETTER, MODIFIER_SYMBOL, NON_SPACING_MARK, OTHER_LETTER, OTHER_NUMBER, OTHER_PUNCTUATION, OTHER_SYMBOL, PARAGRAPH_SEPARATOR, PRIVATE_USE, SPACE_SEPARATOR, START_PUNCTUATION, SURROGATE, TITLECASE_LETTER, UNASSIGNED, UPPERCASE_LETTER
If the value of radix is not a
valid radix, or the value of digit is not a valid
digit in the specified radix, the null character
('\u0000') is returned.
The radix argument is valid if it is greater than or
equal to MIN_RADIX and less than or equal to
MAX_RADIX. The digit argument is valid if
0 <= digit < radix.
If the digit is less than 10, then
'0' + digit is returned. Otherwise, the value
'a' + digit - 10 is returned.
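As an illustration, forDigit can serve as the building block of a radix formatter. This is a sketch only (Integer.toString(int,int) already provides the same service); ForDigitDemo and toRadixString are hypothetical names.

```java
// Hypothetical demo class; not part of java.lang.Character.
class ForDigitDemo {
    // Formats a non-negative value in the given radix using forDigit.
    static String toRadixString(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(Character.forDigit(value % radix, radix)); // least significant digit first
            value /= radix;
        }
        return sb.reverse().toString();
    }
}
```

Digits of 10 and above come out as lowercase letters, so toRadixString(255, 16) is "ff". An out-of-range digit such as forDigit(16, 16) yields the null character.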
Parameters: digit - the number to convert to a character; radix - the radix.
Returns: the char representation of the specified digit in the specified radix.
See also: MIN_RADIX, MAX_RADIX, digit(char,int)
The directionality value of undefined char values is DIRECTIONALITY_UNDEFINED.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the getDirectionality(int) method.
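A simple application of the directionality property is detecting whether a string contains any right-to-left characters. The sketch below is not a bidirectional algorithm implementation; DirectionalityDemo and containsRightToLeft are hypothetical names.

```java
// Hypothetical demo class; not part of java.lang.Character.
class DirectionalityDemo {
    // True if any code point in s has a strong right-to-left directionality.
    static boolean containsRightToLeft(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            byte d = Character.getDirectionality(cp);
            if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }
}
```

Latin letters report DIRECTIONALITY_LEFT_TO_RIGHT and ASCII digits report DIRECTIONALITY_EUROPEAN_NUMBER, so "hello 123" yields false, while a string containing the Hebrew letter '\u05D0' yields true.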
Parameters: ch - char for which the directionality property is requested.
Returns: the directionality property of the char value.
See also: DIRECTIONALITY_UNDEFINED, DIRECTIONALITY_LEFT_TO_RIGHT, DIRECTIONALITY_RIGHT_TO_LEFT, DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC, DIRECTIONALITY_EUROPEAN_NUMBER, DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR, DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR, DIRECTIONALITY_ARABIC_NUMBER, DIRECTIONALITY_COMMON_NUMBER_SEPARATOR, DIRECTIONALITY_NONSPACING_MARK, DIRECTIONALITY_BOUNDARY_NEUTRAL, DIRECTIONALITY_PARAGRAPH_SEPARATOR, DIRECTIONALITY_SEGMENT_SEPARATOR, DIRECTIONALITY_WHITESPACE, DIRECTIONALITY_OTHER_NEUTRALS, DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING, DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE, DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING, DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE, DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
The directionality value of an undefined character is DIRECTIONALITY_UNDEFINED.
Parameters: codePoint - the character (Unicode code point) for which the directionality property is requested.
See also: DIRECTIONALITY_UNDEFINED, DIRECTIONALITY_LEFT_TO_RIGHT, DIRECTIONALITY_RIGHT_TO_LEFT, DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC, DIRECTIONALITY_EUROPEAN_NUMBER, DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR, DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR, DIRECTIONALITY_ARABIC_NUMBER, DIRECTIONALITY_COMMON_NUMBER_SEPARATOR, DIRECTIONALITY_NONSPACING_MARK, DIRECTIONALITY_BOUNDARY_NEUTRAL, DIRECTIONALITY_PARAGRAPH_SEPARATOR, DIRECTIONALITY_SEGMENT_SEPARATOR, DIRECTIONALITY_WHITESPACE, DIRECTIONALITY_OTHER_NEUTRALS, DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING, DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE, DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING, DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE, DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
For example, '\u0028' LEFT
PARENTHESIS is semantically defined to be an opening
parenthesis. This will appear as a "(" in text that is
left-to-right but as a ")" in text that is right-to-left.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isMirrored(int) method.
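To make the mirrored property concrete, the following sketch counts the mirrored characters in a string. MirroredDemo and countMirrored are hypothetical names used only for this example.

```java
// Hypothetical demo class; not part of java.lang.Character.
class MirroredDemo {
    // Counts the code points whose glyphs should be mirrored in right-to-left text.
    static int countMirrored(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isMirrored(cp)) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }
}
```

Brackets, parentheses and comparison signs are mirrored, whereas letters and digits are not.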
Parameters: ch - char for which the mirrored property is requested.
Returns: true if the char is mirrored, false if the char is not mirrored or is not defined.
For example, '\u0028' LEFT PARENTHESIS is semantically
defined to be an opening parenthesis. This will appear
as a "(" in text that is left-to-right but as a ")" in text
that is right-to-left.
Parameters: codePoint - the character (Unicode code point) to be tested.
Returns: true if the character is mirrored, false if the character is not mirrored or is not defined.
Compares two Character objects numerically.
Parameters: anotherCharacter - the Character to be compared.
Returns: the value 0 if the argument Character
is equal to this Character; a value less than
0 if this Character is numerically less
than the Character argument; and a value greater than
0 if this Character is numerically greater
than the Character argument (unsigned comparison).
Note that this is strictly a numerical comparison; it is not
locale-dependent.
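The numerical nature of the comparison can be checked with a short sketch. CompareDemo and sign are hypothetical names; the helper reduces the result of compareTo to -1, 0 or 1, since only the sign is specified.

```java
// Hypothetical demo class; not part of java.lang.Character.
class CompareDemo {
    // Normalizes the result of Character.compareTo to -1, 0 or 1.
    static int sign(Character a, Character b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        System.out.println(sign('a', 'b')); // -1
        System.out.println(sign('Z', 'a')); // -1: 'Z' (90) precedes 'a' (97) numerically
        System.out.println(sign('a', 'a')); // 0
    }
}
```

Since the ordering is by char value, all uppercase ASCII letters sort before all lowercase ones. Locale-aware ordering requires java.text.Collator instead.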
Parameters: codePoint - the character (Unicode code point) to be converted.
Character.ERROR)
that indicates that a 1:M char mapping exists.
See also: isLowerCase(char), isUpperCase(char), toLowerCase(char), toTitleCase(char)
char itself is returned in the
char[].
Parameters: codePoint - the character (Unicode code point) to be converted.
Returns: a char[] with the uppercased character.