Like any other programming language, the Java programming language is defined by grammar rules that specify how syntactically legal constructs can be formed using the language elements, and by a semantic definition that specifies the meaning of syntactically legal constructs.
The low-level language elements are called lexical tokens (or just tokens) and are the building blocks for more complex constructs. Identifiers, numbers, operators, and special characters are all examples of tokens that can be used to build high-level constructs like expressions, statements, methods, and classes.
A name in a program is called an identifier. Identifiers can be used to denote classes, methods, variables, and labels.
In Java, an identifier is composed of a sequence of characters, where each character can be either a letter or a digit. However, the first character in an identifier must be a letter. Since Java programs are written in the Unicode character set (see p. 23), the definitions of letter and digit are interpreted according to this character set. Note that connecting punctuation (such as the underscore _) and any currency symbol (such as $, ¢, ¥, or £) are allowed as letters, but should be avoided in identifier names.
Identifiers in Java are case sensitive; for example, price and Price are two different identifiers.
Examples of legal identifiers: number, Number, sum_$, bingo, $$_100, mål, grüß
Examples of illegal identifiers: 48chevy, all@hands, grand-sum
The name 48chevy is not a legal identifier as it starts with a digit. The character @ is not a legal character in an identifier. It is also not a legal operator, so that all@hands cannot be interpreted as a legal expression with two operands. The character - is also not a legal character in an identifier. However, it is a legal operator so grand-sum could be interpreted as a legal expression with two operands.
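The character rules above can be checked programmatically. The following is a minimal sketch, assuming the standard methods Character.isJavaIdentifierStart and Character.isJavaIdentifierPart; the class name IdentifierCheck and the method isLegalIdentifier are hypothetical names chosen for illustration. The sketch only looks at characters, so it does not reject keywords or reserved literals.

```java
public class IdentifierCheck {
    // Hypothetical helper: true if every character is legal at its position.
    // It does not reject keywords or reserved literals, and it inspects
    // char values only (no supplementary code points).
    static boolean isLegalIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "m\u00e5l" is the identifier mål written with a Unicode escape.
        String[] names = {"number", "sum_$", "$$_100", "m\u00e5l",
                          "48chevy", "all@hands", "grand-sum"};
        for (String name : names) {
            System.out.println(name + " -> " + isLegalIdentifier(name));
        }
    }
}
```

The first four names are reported as legal and the last three as illegal, which matches the discussion of 48chevy, all@hands, and grand-sum.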
Keywords are reserved words that are predefined in the language and cannot be used to denote other entities. All the keywords are in lowercase, and incorrect usage results in compilation errors.
Keywords currently defined in the language are listed in Table 2.1. In addition, three identifiers are reserved as predefined literals in the language: the null reference, and the boolean literals true and false (see Table 2.2). Keywords currently reserved, but not in use, are listed in Table 2.3. A reserved word cannot be used as an identifier. The index contains references to relevant sections where currently used keywords are explained.
Table 2.1 Keywords in Java
Table 2.2 Reserved Literals in Java
Table 2.3 Reserved Keywords not Currently in Use
A literal denotes a constant value, i.e., the value that a literal represents remains unchanged in the program. Literals represent numerical (integer or floating-point), character, boolean or string values. In addition, there is the literal null that represents the null reference.
Table 2.4 Examples of Literals
Integer data types comprise the following primitive data types: int, long, byte, and short (see Section 2.2, p. 28).
The default data type of an integer literal is always int, but it can be specified as long by appending the suffix L (or l) to the integer value. Without the suffix, the long literals 2000L and 0l will be interpreted as int literals. There is no direct way to specify a short or a byte literal.
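The effect of the L suffix shows up in arithmetic that crosses the int range. The sketch below is illustrative only; the class name LongSuffixDemo and its methods are hypothetical. It also shows the usual workaround for the missing byte and short literals: assigning an int constant that fits, or using a cast.

```java
public class LongSuffixDemo {
    // Both operands are int, so the addition wraps around on overflow.
    static int intSum() {
        return Integer.MAX_VALUE + 1;
    }

    // The literal 1L is a long, so the addition is carried out in 64 bits.
    static long longSum() {
        return Integer.MAX_VALUE + 1L;
    }

    public static void main(String[] args) {
        System.out.println(intSum());   // -2147483648
        System.out.println(longSum());  // 2147483648

        byte small = 10;               // int constant that fits in a byte
        short medium = (short) 2000;   // an explicit cast also works
        System.out.println(small + " " + medium);
    }
}
```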
In addition to the decimal number system, integer literals can also be specified in octal (base 8) and hexadecimal (base 16) number systems. Octal and hexadecimal numbers are specified with a 0 and 0x (or 0X) prefix, respectively. Examples of decimal, octal and hexadecimal literals are shown in Table 2.5. Note that the leading 0 (zero) digit is not the uppercase letter O. The hexadecimal digits from a to f can also be specified with the corresponding uppercase forms (A to F). Negative integers (e.g., -90) can be specified by prefixing the minus sign (-) to the magnitude of the integer regardless of number system (e.g., -0132 or -0X5A). Number systems and number representation are discussed in Appendix G. Java does not support literals in binary notation.
Table 2.5 Examples of Decimal, Octal, and Hexadecimal Literals
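As a small illustration of the three notations, the following sketch writes the value 90 in decimal, octal, and hexadecimal and confirms that all three literals denote the same int. The class name RadixDemo is hypothetical; Integer.toOctalString, Integer.toHexString, and Integer.parseInt are standard library methods.

```java
public class RadixDemo {
    static int decimal() { return 90; }
    static int octal()   { return 0132; }  // 1*64 + 3*8 + 2
    static int hex()     { return 0x5A; }  // 5*16 + 10

    public static void main(String[] args) {
        System.out.println(decimal() == octal() && octal() == hex()); // true

        // Converting back to text; the library methods add no prefix.
        System.out.println(Integer.toOctalString(90));    // 132
        System.out.println(Integer.toHexString(90));      // 5a
        System.out.println(Integer.parseInt("-5A", 16));  // -90
    }
}
```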
Floating-point data types come in two flavors: float or double.
The default data type of a floating-point literal is double, but it can be explicitly designated by appending the suffix D (or d) to the value. A floating-point literal can also be specified to be a float by appending the suffix F (or f).
Floating-point literals can also be specified in scientific notation, where E (or e) stands for Exponent. For example, the double literal 194.9E-2 in scientific notation is interpreted as 194.9 × 10^-2 (i.e., 1.949).
Examples of double literals:
0.0 0.0d 0D
0.49 .49 .49D
49.0 49. 49D
4.9E+1 4.9E+1D 4.9e1d 4900e-2 .49E2
Examples of float literals:
0.0F 0f
0.49F .49F
49.0F 49.F 49F
4.9E+1F 4900e-2f .49E2F
Note that the decimal point and the exponent are optional and that at least one digit must be specified.
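The following sketch, with the hypothetical class name FloatLiteralDemo, shows a few of these notations side by side. It is meant only to illustrate that the different spellings denote the same value and that the F suffix is needed when a literal is assigned to a float.

```java
public class FloatLiteralDemo {
    // Scientific notation: 194.9 times 10 to the power -2.
    static double scientific() {
        return 194.9E-2;
    }

    // Without the F suffix the literal would be a double, and returning it
    // as a float would not compile without a cast.
    static float asFloat() {
        return 49.0F;
    }

    public static void main(String[] args) {
        System.out.println(scientific());
        // Different spellings of the same double value.
        System.out.println(4.9E+1 == 49.0 && 4900e-2 == 49.0 && .49E2 == 49.0); // true
        System.out.println(asFloat() == 49F);  // true
    }
}
```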
The primitive data type boolean represents the truth-values true or false that are denoted by the reserved literals true or false, respectively.
A character literal is quoted in single-quotes ('). All character literals have the primitive data type char.
A character literal is represented according to the 16-bit Unicode character set, which subsumes the 8-bit ISO-Latin-1 and the 7-bit ASCII characters. In Table 2.6, note that digits (0 to 9), upper-case letters (A to Z), and lower-case letters (a to z) have contiguous Unicode values. A Unicode character can always be specified as a four-digit hexadecimal number (i.e., 16 bits) with the prefix \u.
Table 2.6 Examples of Character Literals
Certain escape sequences define special characters, as shown in Table 2.7. These escape sequences can be single-quoted to define character literals. For example, the character literals '\t' and '\u0009' are equivalent. However, the character literals '\u000a' and '\u000d' should not be used to represent newline and carriage return in the source code. These values are interpreted as line-terminator characters by the compiler, and will cause compile time errors. You should use the escape sequences '\n' and '\r', respectively, for correct interpretation of these characters in the source code.
Table 2.7 Escape Sequences
We can also use the escape sequence \ddd to specify a character literal as an octal value, where each digit d can be any octal digit (0–7), as shown in Table 2.8. The number of digits must be three or fewer, and the octal value cannot exceed \377, i.e., only the first 256 characters can be specified with this notation.
Table 2.8 Examples of Escape Sequence \ddd
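The different ways of writing a character literal can be compared directly, since they all denote char values. The sketch below uses the hypothetical class name CharLiteralDemo; the helper next relies on the contiguous code values of letters noted above.

```java
public class CharLiteralDemo {
    // Letters have contiguous code values, so adding 1 gives the next letter.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        System.out.println('a' == '\u0061');   // true: Unicode escape for a
        System.out.println('\t' == '\u0009');  // true: both denote the tab
        System.out.println('\101' == 'A');     // true: octal 101 is decimal 65
        System.out.println(next('a'));         // b
        System.out.println('9' - '0');         // 9: digits are contiguous too
    }
}
```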
A string literal is a sequence of characters which must be enclosed in double quotes and must occur on a single line. All string literals are objects of the class String (see Section 10.4, p. 439).
Escape sequences as well as Unicode values can appear in string literals:
"Here comes a tab.\t And here comes another one\u0009!" (1)
"What's on the menu?" (2)
"\"String literals are double-quoted.\"" (3)
"Left!\nRight!" (4)
"Don't split (5)
me up!"
In (1), the tab character is specified using the escape sequence \t and the Unicode value \u0009, respectively. In (2), the single apostrophe need not be escaped in strings, but it would be if specified as a character literal ('\''). In (3), the double quotes in the string must be escaped. In (4), we use the escape sequence \n to insert a newline. (5) generates a compile time error, as the string literal is split over several lines. Printing the strings from (1) to (4) will give the following result:
Here comes a tab. And here comes another one !
What's on the menu?
"String literals are double-quoted."
Left!
Right!
One should also use the escape sequences \n and \r, respectively, for correct interpretation of the characters \u000a (newline) and \u000d (carriage return) in string literals.
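A short program makes the escaping rules for string literals concrete. This is a sketch only, with the hypothetical class name StringLiteralDemo; it holds the four legal literals, written with the escape sequences \t, \", and \n and the Unicode value for tab, and prints them.

```java
public class StringLiteralDemo {
    // The four legal string literals, with their escape sequences intact.
    static String[] samples() {
        return new String[] {
            "Here comes a tab.\t And here comes another one\u0009!",  // (1)
            "What's on the menu?",                                     // (2)
            "\"String literals are double-quoted.\"",                  // (3)
            "Left!\nRight!"                                            // (4)
        };
    }

    public static void main(String[] args) {
        for (String s : samples()) {
            System.out.println(s);
        }
    }
}
```

Literal (4) is printed on two lines, because the escape sequence for newline becomes a real line break in the string value.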
A white space is a sequence of spaces, tabs, form feeds, and line terminator characters in a Java source file. Line terminators can be newline, carriage return, or a carriage return-newline sequence.
A Java program is a free-format sequence of characters that is tokenized by the compiler, i.e., broken into a stream of tokens for further analysis. Separators and operators help to distinguish tokens, but sometimes white space has to be inserted explicitly as a separator. For example, the identifier classRoom will be interpreted as a single token, unless white space is inserted to distinguish the keyword class from the identifier Room.
White space aids not only in separating tokens, but also in formatting the program so that it is easy to read. The compiler ignores the white spaces once the tokens are identified.
A program can be documented by inserting comments at relevant places in the source code. These comments are for documentation purposes only and are ignored by the compiler.
Java provides three types of comments to document a program:
• A single-line comment: // ... to the end of the line
• A multiple-line comment: /* ... */
• A documentation (Javadoc) comment: /** ... */
All characters after the comment-start sequence // through to the end of the line constitute a single-line comment.
// This comment ends at the end of this line.
int age; // From comment-start sequence to the end of the line is a comment.
A multiple-line comment, as the name suggests, can span several lines. Such a comment starts with the sequence /* and ends with the sequence */.
/* A comment
on several
lines.
*/
The comment-start sequences (//, /*, /**) are not treated differently from other characters when occurring within comments, and are thus ignored. This means that trying to nest multiple-line comments will result in a compile time error:
/* Formula for alchemy.
gold = wizard.makeGold(stone);
/* But it only works on Sundays. */
*/
The second occurrence of the comment-start sequence /* is ignored. The last occurrence of the sequence */ in the code is now unmatched, resulting in a syntax error.
A documentation comment is a special-purpose comment that is used by the javadoc tool to generate HTML documentation for the program. Documentation comments are usually placed in front of classes, interfaces, methods, and field definitions. Special tags can be used inside a documentation comment to provide more specific information. Such a comment starts with the sequence /** and ends with the sequence */:
/**
* This class implements a gizmo.
* @author K.A.M.
* @version 3.0
*/
For details on the javadoc tool, see the tools documentation provided by the JDK.
2.1 Which of the following is not a legal identifier?
Select the one correct answer.
(a) a2z
(b) ödipus
(c) 52pickup
(d) _class
(e) ca$h
2.2 Which statement is true?
Select the one correct answer.
(a) new and delete are keywords in the Java language.
(b) try, catch, and thrown are keywords in the Java language.
(c) static, unsigned, and long are keywords in the Java language.
(d) exit, class, and while are keywords in the Java language.
(e) return, goto, and default are keywords in the Java language.
(f) for, while, and next are keywords in the Java language.
2.3 Which statement about the following comment is true?
/* // */
Select the one correct answer.
(a) The comment is not valid. The multiple-line comment (/* ... */) does not end correctly, since the comment-end sequence */ is a part of the single-line comment (// ...).
(b) It is a completely valid comment. The // part is ignored by the compiler.
(c) This combination of comments is illegal, and will result in a compile time error.
Figure 2.1 gives an overview of the primitive data types in Java.
Primitive data types in Java can be divided into three main categories:
• integral types: represent signed integers (byte, short, int, long) and unsigned character values (char)
• floating-point types (float, double): represent fractional signed numbers
• boolean type (boolean): represents logical values
Figure 2.1 Primitive Data Types in Java
Primitive data values are not objects. Each primitive data type defines the range of values in the data type, and operations on these values are defined by special operators in the language (see Chapter 5).
Each primitive data type also has a corresponding wrapper class that can be used to represent a primitive value as an object. Wrapper classes are discussed in Section 10.3, p. 428.
Integer data types are byte, short, int, and long (see Table 2.9). Their values are signed integers represented by 2's complement (see Section G.4, p. 1010).
Table 2.9 Range of Integer Values
char Type
The data type char represents characters (see Table 2.10). Their values are unsigned integers that denote all the 65536 (2^16) characters in the 16-bit Unicode character set. This set includes letters, digits, and special characters.
Table 2.10 Range of Character Values
The first 128 characters of the Unicode set are the same as the 128 characters of the 7-bit ASCII character set, and the first 256 characters of the Unicode set correspond to the 256 characters of the 8-bit ISO Latin-1 character set.
The integer types and the char type are collectively called integral types.
Floating-point numbers are represented by the float and double data types.
Floating-point numbers conform to the IEEE 754-1985 binary floating-point standard. Table 2.11 shows the range of values for positive floating-point numbers, but these apply equally to negative floating-point numbers with the '-' sign as a prefix. Zero can be either 0.0 or -0.0.
Table 2.11 Range of Floating-Point Values
Since the size for representation is a finite number of bits, certain floating-point numbers can only be represented as approximations. For example, the value of the expression (1.0/3.0) is represented as an approximation due to the finite number of bits used.
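The approximation can be made visible with a few comparisons. The sketch below is illustrative; the class name ApproximationDemo and its methods are hypothetical, while java.math.BigDecimal is the standard class used here to print the exact value that a double holds.

```java
import java.math.BigDecimal;

public class ApproximationDemo {
    // 0.1, 0.2 and 0.3 are all approximations, so the sum does not match.
    static boolean sumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // A float keeps fewer bits of 1.0/3.0 than a double does.
    static boolean floatMatchesDouble() {
        return (float) (1.0 / 3.0) == 1.0 / 3.0;
    }

    public static void main(String[] args) {
        double third = 1.0 / 3.0;
        System.out.println(third);                  // rounded for display
        System.out.println(new BigDecimal(third));  // exact stored value
        System.out.println(sumIsExact());           // false
        System.out.println(floatMatchesDouble());   // false
    }
}
```

For this reason, floating-point values are usually compared with a small tolerance rather than with the == operator.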
boolean Type
The data type boolean represents the two logical values denoted by the literals true and false (see Table 2.12).
Table 2.12 Boolean Values
Boolean values are produced by all relational (see Section 5.10, p. 190), conditional (see Section 5.13, p. 196) and boolean logical operators (see Section 5.12, p. 194), and are primarily used to govern the flow of control during program execution.
Table 2.13 summarizes the pertinent facts about the primitive data types: their width or size, which indicates the number of the bits required to store a primitive value; their range of legal values, which is specified by the minimum and the maximum values permissible; and the name of the corresponding wrapper class (see Section 10.3, p. 428).
Table 2.13 Summary of Primitive Data Types
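The widths and ranges of the primitive types can also be read from the wrapper classes, which expose them as the constants SIZE, MIN_VALUE, and MAX_VALUE. The following sketch prints a few of them; the class name PrimitiveRanges and the helper describe are hypothetical.

```java
public class PrimitiveRanges {
    // Hypothetical helper that formats one row of the summary.
    static String describe(String type, int bits, Object min, Object max) {
        return type + ": " + bits + " bits, " + min + " to " + max;
    }

    public static void main(String[] args) {
        System.out.println(describe("byte", Byte.SIZE, Byte.MIN_VALUE, Byte.MAX_VALUE));
        System.out.println(describe("short", Short.SIZE, Short.MIN_VALUE, Short.MAX_VALUE));
        System.out.println(describe("int", Integer.SIZE, Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println(describe("long", Long.SIZE, Long.MIN_VALUE, Long.MAX_VALUE));
        // char values are printed as integers to show the unsigned range.
        System.out.println(describe("char", Character.SIZE,
                (int) Character.MIN_VALUE, (int) Character.MAX_VALUE));
        // For float and double, MIN_VALUE is the smallest positive value.
        System.out.println(describe("float", Float.SIZE, Float.MIN_VALUE, Float.MAX_VALUE));
        System.out.println(describe("double", Double.SIZE, Double.MIN_VALUE, Double.MAX_VALUE));
    }
}
```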
2.4 Which of the following do not denote a primitive data value in Java?
Select the two correct answers.
(a) "t"
(b) 'k'
(c) 50.5F
(d) "hello"
(e) false
2.5 Which of the following primitive data types are not integer types?
Select the three correct answers.
(a) boolean
(b) byte
(c) float
(d) short
(e) double
2.6 Which integral type in Java has the exact range from -2147483648 (-2^31) to 2147483647 (2^31 - 1), inclusive?
Select the one correct answer.
(a) byte
(b) short
(c) int
(d) long
(e) char
A variable stores a value of a particular type. A variable has a name, a type, and a value associated with it. In Java, variables can only store values of primitive data types and reference values of objects. Variables that store reference values of objects are called reference variables (or object references or simply references).
Variable declarations are used to specify the type and the name of variables. This implicitly determines their memory allocation and the values that can be stored in them. Examples of declaring variables that can store primitive values:
char a, b, c; // a, b and c are character variables.
double area; // area is a floating-point variable.
boolean flag; // flag is a boolean variable.
The first declaration above is equivalent to the following three declarations:
char a;
char b;
char c;
A declaration can also include an initialization expression to specify an appropriate initial value for the variable:
int i = 10, // i is an int variable with initial value 10.
j = 101; // j is an int variable with initial value 101.
long big = 2147483648L; // big is a long variable with specified initial value.
A reference variable can store the reference value of an object, and can be used to manipulate the object denoted by the reference value.
A variable declaration that specifies a reference type (i.e., a class, an array, or an interface name) declares a reference variable. Analogous to the declaration of variables of primitive data types, the simplest form of reference variable declaration only specifies the name and the reference type. The declaration determines what objects can be referenced by a reference variable. Before we can use a reference variable to manipulate an object, it must be declared and initialized with the reference value of the object.
Pizza yummyPizza; // Variable yummyPizza can reference objects of class Pizza.
Hamburger bigOne, // Variable bigOne can reference objects of class Hamburger,
smallOne; // and so can variable smallOne.
It is important to note that the declarations above do not create any objects of class Pizza or Hamburger. The above declarations only create variables that can store references of objects of the specified classes.
A declaration can also include an initializer expression to create an object whose reference value can be assigned to the reference variable:
Pizza yummyPizza = new Pizza("Hot&Spicy"); // Declaration with initializer.
The reference variable yummyPizza can reference objects of class Pizza. The keyword new, together with the constructor call Pizza("Hot&Spicy"), creates an object of the class Pizza. The reference value of this object is assigned to the variable yummyPizza. The newly created object of class Pizza can now be manipulated through the reference variable yummyPizza.
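That a reference variable holds a reference value, and not the object itself, can be seen by copying it. The sketch below is an assumption-laden illustration: the nested Pizza class with a single topping field is a hypothetical stand-in, since the text does not show the definition of Pizza.

```java
public class ReferenceDemo {
    // Hypothetical stand-in for the Pizza class mentioned in the text.
    static class Pizza {
        String topping;

        Pizza(String topping) {
            this.topping = topping;
        }
    }

    // Returns true if a change made through one reference is visible
    // through the other, i.e., both variables denote the same object.
    static boolean aliasSeesChange() {
        Pizza yummyPizza = new Pizza("Hot&Spicy");  // one object is created
        Pizza samePizza = yummyPizza;               // copies the reference value only
        samePizza.topping = "Veggie";
        return yummyPizza.topping.equals("Veggie");
    }

    public static void main(String[] args) {
        System.out.println(aliasSeesChange());  // true
    }
}
```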
Initializers for initializing fields in objects, and static variables in classes and interfaces are discussed in Section 9.7, p. 406.
Reference variables for arrays are discussed in Section 3.6, p. 69.
Default values for fields of primitive data types and reference types are listed in Table 2.14. The value assigned depends on the type of the field.
Table 2.14 Default Values
If no initialization is provided for a static variable either in the declaration or in a static initializer block (see Section 9.9, p. 410), it is initialized with the default value of its type when the class is loaded.
Similarly, if no initialization is provided for an instance variable either in the declaration or in an instance initializer block (see Section 9.10, p. 413), it is initialized with the default value of its type when the class is instantiated.
The fields of reference types are always initialized with the null reference value if no initialization is provided.
Example 2.1 illustrates default initialization of fields. Note that static variables are initialized when the class is loaded the first time, and instance variables are initialized accordingly in every object created from the class Light.
Example 2.1 Default Values for Fields
public class Light {
    // Static variable
    static int counter;      // Default value 0 when class is loaded.

    // Instance variables:
    int noOfWatts = 100;     // Explicitly set to 100.
    boolean indicator;       // Implicitly set to default value false.
    String location;         // Implicitly set to default value null.

    public static void main(String[] args) {
        Light bulb = new Light();
        System.out.println("Static variable counter: " + Light.counter);
        System.out.println("Instance variable noOfWatts: " + bulb.noOfWatts);
        System.out.println("Instance variable indicator: " + bulb.indicator);
        System.out.println("Instance variable location: " + bulb.location);
        return;
    }
}
Output from the program:
Static variable counter: 0
Instance variable noOfWatts: 100
Instance variable indicator: false
Instance variable location: null
Local variables are variables that are declared in methods, constructors, and blocks (see Chapter 3, p. 39). Local variables are not initialized when they are created at method invocation, that is, when the execution of a method is started. The same applies in constructors and blocks. Local variables must be explicitly initialized before being used. The compiler will report as errors any attempts to use uninitialized local variables.
Example 2.2 Flagging Uninitialized Local Variables of Primitive Data Types
public class TooSmartClass {
    public static void main(String[] args) {
        int weight = 10, thePrice;                         // (1) Local variables
        if (weight < 10) thePrice = 1000;
        if (weight > 50) thePrice = 5000;
        if (weight >= 10) thePrice = weight*10;            // (2) Always executed.
        System.out.println("The price is: " + thePrice);   // (3)
    }
}
In Example 2.2, the compiler complains that the local variable thePrice used in the println statement at (3) may not be initialized. However, it can be seen that at runtime, the local variable thePrice will get the value 100 in the last if-statement at (2), before it is used in the println statement. The compiler does not perform a rigorous analysis of the program in this regard. It only compiles the body of a conditional statement if it can deduce the condition to be true. The program will compile correctly if the variable is initialized in the declaration, or if an unconditional assignment is made to the variable.
Replacing the declaration of the local variables at (1) in Example 2.2 with the following declaration solves the problem:
int weight = 10, thePrice = 0; // (1') Both local variables initialized.
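Another way to satisfy the compiler, not shown in the example, is to restructure the conditionals so that every path assigns the variable. The following is a sketch under that assumption; the class name PriceCalculator and the method priceFor are hypothetical. The branch for weights over 50 is left out, because in Example 2.2 its assignment is always overwritten by the statement at (2).

```java
public class PriceCalculator {
    // Hypothetical restructuring of the price logic as an if-else.
    static int priceFor(int weight) {
        int thePrice;                 // Not initialized in the declaration.
        if (weight < 10) {
            thePrice = 1000;
        } else {
            thePrice = weight * 10;   // Covers every weight >= 10.
        }
        return thePrice;              // Compiles: assigned on every path.
    }

    public static void main(String[] args) {
        System.out.println("The price is: " + priceFor(10));  // The price is: 100
    }
}
```

With an if-else, the compiler can see that exactly one of the two assignments is executed, whatever the value of weight.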
Local reference variables are bound by the same initialization rules as local variables of primitive data types.
Example 2.3 Flagging Uninitialized Local Reference Variables
public class VerySmartClass {
    public static void main(String[] args) {
        String importantMessage;       // Local reference variable
        System.out.println("The message length is: " + importantMessage.length());
    }
}
In Example 2.3, the compiler complains that the local variable importantMessage used in the println statement may not be initialized. If the variable importantMessage is set to the value null, the program will compile. However, a runtime error (NullPointerException) will occur when the code is executed, since the variable importantMessage will not denote any object. The golden rule is to ensure that a reference variable, whether local or not, is assigned a reference to an object before it is used, that is, ensure that it does not have the value null.
The program compiles and runs if we replace the declaration with the following declaration of the local variable, which creates a string literal and assigns its reference value to the local reference variable importantMessage:
String importantMessage = "Initialize before use!";
Arrays and their default values are discussed in Section 3.6, p. 69.
The lifetime of a variable, that is, the time a variable is accessible during execution, is determined by the context in which it is declared. The lifetime of a variable is also called scope, and is discussed in more detail in Section 4.6, p. 129. We distinguish between lifetime of variables in three contexts:
• Instance variables: members of a class, and created for each object of the class. In other words, every object of the class will have its own copies of these variables, which are local to the object. The values of these variables at any given time constitute the state of the object. Instance variables exist as long as the object they belong to is in use at runtime.
• Static variables: also members of a class, but not created for any specific object of the class and, therefore, belong only to the class (see Section 4.6, p. 129). They are created when the class is loaded at runtime, and exist as long as the class is available at runtime.
• Local variables (also called method automatic variables): declared in methods, constructors, and blocks; and created for each execution of the method, constructor, or block. After the execution of the method, constructor, or block completes, local (non-final) variables are no longer accessible.
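The three contexts can be placed side by side in one small class. This is a sketch with hypothetical names (LifetimeDemo, created, uses, use, step), intended only to show which copy of a variable each piece of code works on.

```java
public class LifetimeDemo {
    static int created;   // Static variable: one copy, belongs to the class.
    int uses;             // Instance variable: one copy per object.

    LifetimeDemo() {
        created++;        // Every construction updates the shared counter.
    }

    int use() {
        int step = 1;     // Local variable: created anew for each call.
        uses += step;     // Updates the state of this object only.
        return uses;
    }

    public static void main(String[] args) {
        LifetimeDemo a = new LifetimeDemo();
        LifetimeDemo b = new LifetimeDemo();
        a.use();
        a.use();
        b.use();
        System.out.println("a.uses: " + a.uses);               // a.uses: 2
        System.out.println("b.uses: " + b.uses);               // b.uses: 1
        System.out.println("created: " + LifetimeDemo.created); // created: 2
    }
}
```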
2.7 Which declarations are valid?
Select the three correct answers.
(a) char a = '\u0061';
(b) char 'a' = 'a';
(c) char \u0061 = 'a';
(d) ch\u0061r a = 'a';
(e) ch'a'r a = 'a';
2.8 Given the following code within a method, which statement is true?
int a, b;
b = 5;
Select the one correct answer.
(a) Local variable a is not declared.
(b) Local variable b is not declared.
(c) Local variable a is declared but not initialized.
(d) Local variable b is declared but not initialized.
(e) Local variable b is initialized but not declared.
2.9 In which of these variable declarations will the variable remain uninitialized unless it is explicitly initialized?
Select the one correct answer.
(a) Declaration of an instance variable of type int.
(b) Declaration of a static variable of type float.
(c) Declaration of a local variable of type float.
(d) Declaration of a static variable of type Object.
(e) Declaration of an instance variable of type int[].
2.10 What will be the result of compiling and running the following program?
public class Init {
    String title;
    boolean published;
    static int total;
    static double maxPrice;

    public static void main(String[] args) {
        Init initMe = new Init();
        double price;
        if (true)
            price = 100.00;
        System.out.println("|" + initMe.title + "|" + initMe.published + "|" +
                           Init.total + "|" + Init.maxPrice + "|" + price + "|");
    }
}
Select the one correct answer.
(a) The program will fail to compile.
(b) The program will compile, and print |null|false|0|0.0|0.0|, when run.
(c) The program will compile, and print |null|true|0|0.0|100.0|, when run.
(d) The program will compile, and print | |false|0|0.0|0.0|, when run.
(e) The program will compile, and print |null|false|0|0.0|100.0|, when run.
The following information was included in this chapter:
• basic language elements: identifiers, keywords, literals, white space, and comments
• primitive data types: integral, floating-point, and boolean
• notational representation of numbers in decimal, octal, and hexadecimal systems
• declaration and initialization of variables, including reference variables
• usage of default values for instance variables and static variables
• lifetime of instance variables, static variables, and local variables
2.1 The following program has several errors. Modify the program so that it will compile and run without errors.
// Filename: Temperature.java
PUBLIC CLASS temperature {
PUBLIC void main(string args) {
double fahrenheit = 62.5;
*/ Convert /*
double celsius = f2c(fahrenheit);
System.out.println(fahrenheit + 'F' + " = " + Celsius + 'C'),
}
double f2c(float fahr) {
RETURN (fahr - 32) * 5 / 9;
}
}
2.1 The following program compiles and runs without errors:
//Filename: Temperature.java
/* Identifiers and keywords in Java are case-sensitive. Therefore, the
case of the file name must match the class name, the keywords must
all be written in lowercase. The name of the String class has a
capital S. The main method must be static and take an array of
String objects as an argument. */
public class Temperature {
    public static void main(String[] args) { // Correct method signature
        double fahrenheit = 62.5;
        // /* identifies the start of a "starred" comment.
        // */ identifies the end.
        /* Convert */
        double celsius = f2c(fahrenheit);
        // '' delimits character literals, "" delimits string literals.
        // Only first character literal is quoted as string to avoid addition.
        // The second char literal is implicitly converted to its string
        // representation, as string concatenation is performed by
        // the last + operator.
        // Java is case-sensitive. The name Celsius should be changed to
        // the variable name celsius.
        // The statement must end with a semicolon, not a comma.
        System.out.println(fahrenheit + "F" + " = " + celsius + 'C');
    }

    /* Method should be declared static. */
    static double f2c(double fahr) { // Note parameter type should be double.
        return (fahr - 32) * 5 / 9;
    }
}