Re: \u014F \301
Posted: Jan 1, 2004 8:13 PM
> Hello, I want to know how these two methods gives the
> character...(\u014F and \301)
> thankyou
The first form, \u014F, treats 014F as a hexadecimal value, which is decimal 335. The second form, \301, treats 301 as an octal number, which is decimal 193. Therefore, the first one gives you the character with Unicode value 335 and the second one gives you the character with Unicode value 193.
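As a quick check of that arithmetic, here is a minimal sketch that prints the numeric code behind each literal. The class name EscapeValues and the helper codeOf are illustrative names only, not part of any library.

```java
class EscapeValues {
    // Returns the numeric Unicode code of a char.
    // Widening from char to int needs no cast.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char fromUnicodeEscape = '\u014F'; // four hexadecimal digits: 014F
        char fromOctalEscape = '\301';     // three octal digits: 301

        System.out.println(codeOf(fromUnicodeEscape)); // 335
        System.out.println(codeOf(fromOctalEscape));   // 193
    }
}
```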
For more information, read below.
char

char is a 16-bit unsigned Java primitive data type. It is used to represent a Unicode character in a Java program. Because char is an unsigned data type, a char variable cannot have a negative value. The range of the char data type is 0 to 65,535, which is the range of 16-bit Unicode codes. char literals can be expressed in four different ways, as follows.
· A character literal can be expressed as a character enclosed in single quotes, e.g. 'A', 'f', and '9'. If you enclose the character in double quotes instead, e.g. "B", it becomes a String literal, not a character literal. There is another important difference between the two: a character literal of this form must consist of exactly one character, whereas a String literal may consist of one character, several characters, or no character at all (the empty string ""). For example, "A", "ABC", and "" are three String literals. A String literal can never be assigned to a char variable, even if it consists of exactly one character, because a value of a reference data type can never be assigned to a variable of a primitive data type. All String literals represent objects of the String class and are therefore of a reference data type, whereas char is a primitive data type. The use of char literals of this form is illustrated below.
    char c1 = 'A' ;
    char c2 = 'L' ;
    char c3 = '5' ;
    char c4 = '/' ;
· A character literal can also be expressed as a character escape sequence. A character escape sequence consists of a backslash (\) immediately followed by a character, and is enclosed in single quotes. There are eight predefined character escape sequences in Java that can be used in a Java program. They are as follows.
    Character Escape Sequence    Meaning
    '\n'                         Linefeed
    '\r'                         Carriage return
    '\f'                         Form feed
    '\b'                         Backspace
    '\t'                         Tab
    '\\'                         Backslash
    '\"'                         Double quote
    '\''                         Single quote
According to the above list, a single quote character in Java is represented by '\'' and not by '''. Note also that a backslash character is represented as '\\' and not as '\', whereas a forward slash is represented as '/' and not as '\/'. A character literal expressed as a character escape sequence is written with two characters, a backslash and the character following it, but it represents only one character. These eight character escape sequences are the only ones you can use in your program. You cannot define your own. Examples of character escape sequences are:
    char c1 = '\n' ; // Assigns linefeed to c1
    char c2 = '\"' ; // Assigns double quote to c2
    char c3 = '\a' ; // Compiler error. Invalid character escape sequence
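To see that each escape sequence stands for exactly one character, the following sketch prints the numeric code behind each of the eight sequences. The class name EscapeSequenceCodes and the method codes are made up for this illustration.

```java
class EscapeSequenceCodes {
    // Returns the numeric codes of the eight character escape sequences,
    // in the same order as the list above.
    static int[] codes() {
        return new int[] { '\n', '\r', '\f', '\b', '\t', '\\', '\"', '\'' };
    }

    public static void main(String[] args) {
        // Printable spellings of the escapes; each backslash is doubled
        // so that it survives inside a String literal.
        String[] names = { "\\n", "\\r", "\\f", "\\b", "\\t", "\\\\", "\\\"", "\\'" };
        int[] codes = codes();
        for (int i = 0; i < names.length; i++) {
            System.out.println("'" + names[i] + "' = " + codes[i]);
        }
    }
}
```

Every entry is a single char, which is why it can be stored in an int array element without a cast.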
· A character literal can also be expressed as a Unicode escape sequence. A Unicode escape sequence has the form '\uxxxx', where \u (a backslash immediately followed by a lowercase u) denotes the start of the Unicode escape sequence and xxxx represents exactly four hexadecimal digits. These four hexadecimal digits are the Unicode integer code of the character you want to express. The Unicode character A has the code value 65 (in decimal), which is 41 in hexadecimal. So, the character A can be expressed as the Unicode escape sequence '\u0041'. Note that exactly four hexadecimal digits must follow \u, which is why we added 00 before 41 to make it 0041. The following code assigns the same character A to the char variables c1 and c2.
    char c1 = 'A' ;
    char c2 = '\u0041' ; // Same as c2 = 'A'
All Unicode characters in the range '\u0000' to '\uFFFF' can be expressed as Unicode escape sequences. A Unicode escape sequence may use uppercase or lowercase letters A-F to represent a hexadecimal number. Therefore, '\u002a' and '\u002A' represent the same Unicode character, which is an asterisk (*). However, \u, the start of the Unicode escape sequence, cannot be replaced by \U.
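The points about Unicode escape sequences can be checked with a short sketch. The helper toUnicodeEscape is a hypothetical name introduced here. It goes the other way and builds the escape spelling for a given char with String.format.

```java
class UnicodeEscapeDemo {
    // Formats a char as a backslash, a lowercase u, and four uppercase
    // hexadecimal digits. The backslash is doubled in the format string
    // so that the compiler does not read it as the start of an escape.
    static String toUnicodeEscape(char c) {
        return String.format("\\u%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println('\u0041' == 'A');      // true
        System.out.println('\u002a' == '\u002A'); // true: hex digits are case-insensitive
        System.out.println(toUnicodeEscape('A'));
        System.out.println(toUnicodeEscape('*'));
    }
}
```

The last two lines print the escape spellings of A and * with the hexadecimal digits zero-padded to four places.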
· A character literal can also be expressed as an octal escape sequence. An octal escape sequence has the form '\nnn', where n is an octal digit (0-7). You might guess that the range of characters represented in this form would be '\000' to '\777'. If so, there is disappointing news for you: the maximum octal value you can use in an octal escape sequence in Java is '\377', not '\777'. Octal 377 is the same as decimal 255, so an octal escape sequence can represent only characters whose Unicode codes range from 0 to 255. We have already seen that any Unicode character (code range 0 to 65535) can be represented as a Unicode escape sequence ('\uxxxx') in Java. Why, then, does Java also have the octal escape sequence, which covers only a subset of that range? It exists for compatibility with other languages that use 8-bit unsigned chars to represent a character. Unlike a Unicode escape sequence, which always requires four hexadecimal digits, an octal escape sequence may use one, two, or three octal digits. So, it may take any of the forms '\n', '\nn', or '\nnn', where n is one of the octal digits 0, 1, 2, 3, 4, 5, 6, and 7. Examples of octal escape sequences are:
    char c1 = '\52' ;
    char c2 = '\141' ;
    char c3 = '\400' ; // Compiler error. Octal 400 is out of range
    char c4 = '\42' ;
    char c5 = '\10' ;  // Same as '\b' (octal 10 is decimal 8)
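The octal examples can be verified with a small sketch. The method fromOctal is a hypothetical helper written for this illustration. It parses octal digits at run time and rejects values above octal 377, which imitates the limit the compiler enforces on octal escapes in char literals.

```java
class OctalEscapeDemo {
    // Converts a string of octal digits to the corresponding char.
    // Throws IllegalArgumentException for values outside 0 to octal 377
    // (decimal 255), imitating the compile-time limit on octal escapes.
    static char fromOctal(String octalDigits) {
        int code = Integer.parseInt(octalDigits, 8);
        if (code < 0 || code > 0377) {
            throw new IllegalArgumentException("Octal " + octalDigits + " is out of range");
        }
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println('\52' == '*');   // true: octal 52 is decimal 42
        System.out.println('\141' == 'a');  // true: octal 141 is decimal 97
        System.out.println('\42' == '"');   // true: octal 42 is decimal 34
        System.out.println('\10' == '\b');  // true: octal 10 is decimal 8
        System.out.println((int) fromOctal("377")); // 255
    }
}
```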
You can also assign an int literal to a char variable if the int literal falls in the range 0-65535. When you assign an int literal to a char variable, the char variable represents the character whose Unicode code is equal to the value of that int literal. The Unicode code for the character a (lowercase A) is 97 (in decimal). The decimal value 97 is represented as 141 in octal and 61 in hexadecimal. You can represent the Unicode character a in three different forms in Java: 'a', '\141', and '\u0061'. You can also use the int literal 97 to represent the Unicode character a:
char c1 = 97; // Same as c1 = 'a', or c1 = '\141', or c1 = '\u0061'
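The equivalence of these forms can be confirmed with a brief sketch. The class name CharFromIntDemo and the helper describe are illustrative. The helper uses the standard Integer.toOctalString and Integer.toHexString methods to show the octal and hexadecimal spellings of a char's code.

```java
class CharFromIntDemo {
    // Returns the octal and hexadecimal spellings of a char's numeric code,
    // separated by a space.
    static String describe(char c) {
        return Integer.toOctalString(c) + " " + Integer.toHexString(c);
    }

    public static void main(String[] args) {
        char c1 = 97; // int literal in the range 0-65535, so no cast is needed
        System.out.println(c1 == 'a');      // true
        System.out.println(c1 == '\141');   // true
        System.out.println(c1 == '\u0061'); // true
        System.out.println(describe(c1));   // 141 61
    }
}
```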
It is true that all Unicode characters can be expressed in the form of Unicode escape sequence in Java. However, sometimes, use of Unicode escape sequence to represent a Unicode character may result in errors in your Java code. See Behind the Scene later in this chapter for detailed explanation of such cases.
Although a byte variable in Java takes 8 bits and a char variable takes 16 bits, you cannot assign a value stored in a byte variable to a char variable without a cast. The reason is that byte is a signed data type whereas char is an unsigned data type. If the byte variable has a negative value, say -15, it cannot be stored in a char variable without changing its value. In such a case you need to use an explicit cast. The following code illustrates the possible cases of assignment from char to the other integral data types and vice versa.
    byte b1 = 10 ;
    short s1 = 15 ;
    int num1 = 150 ;
    long num2 = 20L ;
    char c1 = 'A' ;

    // byte and char
    b1 = c1 ;          // Error
    b1 = (byte) c1;    // Ok
    c1 = b1 ;          // Error
    c1 = (char)b1 ;    // Ok

    // short and char
    s1 = c1 ;          // Error
    s1 = (short) c1;   // Ok
    c1 = s1 ;          // Error
    c1 = (char)s1 ;    // Ok

    // int and char
    num1 = c1 ;        // Ok
    num1 = (int) c1;   // Ok. But, cast is not required. Use num1 = c1
    c1 = num1 ;        // Error
    c1 = (char)num1 ;  // Ok
    c1 = 255 ;         // Ok. 255 is in the range of 0-65535
    c1 = 70000 ;       // Error. 70000 is out of range 0-65535
    c1 = (char)70000 ; // Ok. But, will lose the original value

    // long and char
    num2 = c1 ;        // Ok
    num2 = (long) c1;  // Ok. But, cast is not required. Use num2 = c1
    c1 = num2 ;        // Error
    c1 = (char)num2 ;  // Ok
    c1 = 255L ;        // Error. 255L is long literal
    c1 = (char)255L ;  // Ok. But use c1 = 255 instead
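To show what "losing the original value" means for the explicit casts, here is a minimal sketch. The class name CharCastDemo and its two helper methods are hypothetical names used only for this example.

```java
class CharCastDemo {
    // Casts a byte to char and returns the resulting code as an int.
    // The byte is first sign-extended, then the low 16 bits are kept.
    static int byteToCharCode(byte b) {
        return (char) b;
    }

    // Casts an int to char and returns the resulting code as an int.
    // Only the low 16 bits of the int survive the cast.
    static int intToCharCode(int i) {
        return (char) i;
    }

    public static void main(String[] args) {
        System.out.println(byteToCharCode((byte) -15)); // 65521, not -15
        System.out.println(byteToCharCode((byte) 10));  // 10
        System.out.println(intToCharCode(70000));       // 4464 (70000 - 65536)
        System.out.println(intToCharCode(255));         // 255
    }
}
```

Values that already fit in the range 0-65535 come through unchanged. Negative or too-large values wrap around, which is why the compiler insists on the cast.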