Antoine de Saint Exupery.
In his science-fiction novel, The Rolling Stones, Robert A. Heinlein comments:
Every technology goes through three stages: first a crudely simple and quite unsatisfactory gadget; second, an enormously complicated group of gadgets designed to overcome the shortcomings of the original and achieving thereby somewhat satisfactory performance through extremely complex compromise; third, a final proper design therefrom.
Heinlein's comment could well describe the evolution of many programming languages. Java presents a new viewpoint in the evolution of programming languages--creation of a small and simple language that's still sufficiently comprehensive to address a wide variety of software application development. While Java is superficially like C and C++, it gained its simplicity from the systematic removal of features from its predecessors. This chapter discusses two of the primary design features of Java, namely, that it is simple (from removing features) and familiar (because it looks like C and C++). The next chapter discusses Java's object-oriented features in more detail. At the end of this chapter you'll find a discussion of the features eliminated from C and C++ in the evolution of Java.
Design Goals
Simplicity is one of Java's overriding design goals. Simplicity and removal of many "features" of dubious worth from its C and C++-based ancestors keep Java relatively small and reduce the programmer's burden in producing reliable applications. To this end, the Java design team examined many aspects of the "modern" C and C++ languages to determine features that could be eliminated in the context of modern object-oriented programming.
Another major design goal is that Java look familiar to a majority of programmers in the personal computer and workstation arenas, where a large fraction of system programmers and application programmers are familiar with C and C++. Thus, Java "looks like" C++. Programmers familiar with C, Objective C, C++, Eiffel, Ada, and related languages should find their Java language learning curve quite short--on the order of a couple of weeks.
To illustrate the simple and familiar aspects of Java, we follow the tradition of a long line of illustrious programming books by showing you the HelloWorld program. It's about the simplest program you can write that actually does something. Here's HelloWorld implemented in Java.
    class HelloWorld {
        static public void main(String args[]) {
            System.out.println("Hello world!");
        }
    }
This example declares a class named HelloWorld. Classes are discussed in the next chapter on object-oriented programming, but in general we assume the reader is familiar with object technology and understands the basics of classes, objects, instance variables, and methods.
Within the HelloWorld class, we declare a single method called main() which in turn contains a single method invocation to display the string "Hello world!" on the standard output. The statement that prints "Hello world!" does so by invoking the println method of the out object. The out object is a class variable in the System class that performs output operations on files. That's all there is to HelloWorld.
Numeric Data Types
Integer numeric types are 8-bit byte, 16-bit short, 32-bit int, and 64-bit long. The 8-bit byte data type in Java has replaced the old C and C++ char data type. Java places a different interpretation on the char data type, as discussed below.
There is no unsigned type specifier for integer data types in Java.
Real numeric types are 32-bit float and 64-bit double. Real numeric types and their arithmetic operations are as defined by the IEEE 754 specification. A floating point literal value, like 23.79, is considered double by default; you must explicitly cast it to float if you wish to assign it to a float variable.
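As a minimal sketch of the sizes and the literal rule described above, the following class reads the bit widths from the standard wrapper classes and narrows a double literal to float with an explicit cast. The class and method names (NumericTypesDemo, integerWidths, narrowed) are illustrative assumptions, not part of the original text.

```java
class NumericTypesDemo {
    // Bit widths of byte, short, int, and long, as reported by the wrapper classes.
    static int[] integerWidths() {
        return new int[] { Byte.SIZE, Short.SIZE, Integer.SIZE, Long.SIZE };
    }

    // 23.79 is a double literal, so assigning it to a float needs an explicit cast.
    static float narrowed() {
        float f = (float) 23.79; // "float f = 23.79;" would be rejected by the compiler
        return f;
    }

    public static void main(String[] args) {
        int[] w = integerWidths();
        System.out.println(w[0] + " " + w[1] + " " + w[2] + " " + w[3]); // prints 8 16 32 64
        System.out.println(narrowed());
    }
}
```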
Character Data Types
Java language character data is a departure from traditional C. Java's char data type defines a sixteen-bit Unicode character. Unicode characters are unsigned 16-bit values that define character codes in the range 0 through 65,535. If you write a declaration such as
    char myChar = 'Q';
you get a Unicode (16-bit unsigned value) type that's initialized to the Unicode value of the character Q. By adopting the Unicode character set standard for its character data type, Java language applications are amenable to internationalization and localization, greatly expanding the market for world-wide applications.
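The 16-bit unsigned nature of char can be checked with a short sketch like the one below. CharDemo and codeOf are hypothetical names introduced only for illustration.

```java
class CharDemo {
    // Widening a char to int yields its character code, which is never negative.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char myChar = 'Q';
        System.out.println(codeOf(myChar));              // prints 81
        System.out.println(Character.SIZE);              // prints 16
        System.out.println((int) Character.MAX_VALUE);   // prints 65535
    }
}
```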
Boolean Data Types
Java has added a boolean data type as a primitive type, tacitly ratifying existing C and C++ programming practice, where developers define keywords for TRUE and FALSE or YES and NO or similar constructs. A Java boolean variable assumes the value true or false. A Java boolean is a distinct data type; unlike common C practice, a Java boolean type can't be converted to any numeric type.
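Because a boolean cannot be cast to or from a number, any mapping between the two has to be written out. The sketch below shows one way to do that; BooleanDemo, toInt, and isNonZero are illustrative names, not part of any library.

```java
class BooleanDemo {
    // Maps a boolean to an int explicitly; a cast such as (int) flag would not compile.
    static int toInt(boolean flag) {
        return flag ? 1 : 0;
    }

    // A condition must be a boolean expression, so an int is compared explicitly.
    static boolean isNonZero(int value) {
        return value != 0;
    }

    public static void main(String[] args) {
        boolean done = false;
        System.out.println(toInt(done));   // prints 0
        System.out.println(isNonZero(7));  // prints true
    }
}
```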
Because Java has no unsigned data types, the >>> operator has been added to the language to indicate an unsigned (logical) right shift. Java also uses the + operator for string concatenation; concatenation is covered below in the discussion on strings.
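The difference between the two right shifts only shows up on negative values, as the following sketch illustrates. ShiftDemo and its method names are assumptions made for this example.

```java
class ShiftDemo {
    // Arithmetic shift: the sign bit is copied into the vacated high bits.
    static int arithmeticShift(int value, int bits) {
        return value >> bits;
    }

    // Logical (unsigned) shift: the vacated high bits are filled with zeros.
    static int logicalShift(int value, int bits) {
        return value >>> bits;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticShift(-8, 1)); // prints -4
        System.out.println(logicalShift(-8, 1));    // prints 2147483644
        System.out.println(logicalShift(-1, 28));   // prints 15
    }
}
```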
You declare an array of, say, Points (a class you've declared elsewhere) with a declaration like this:
    Point myPoints[];
This code states that myPoints is an uninitialized array of Points. At this time, the only storage allocated for myPoints is a reference handle. At some future time you must allocate the amount of storage you need, as in:
    myPoints = new Point[10];
to allocate an array of ten references to Points that are initialized to the null reference. Notice that this allocation of an array doesn't actually allocate any objects of the Point class for you; you will have to also allocate the Point objects, something like this:
    int i;
    for (i = 0; i < 10; i++) {
        myPoints[i] = new Point();
    }
Access to elements of myPoints can be performed via the normal C-style indexing, but all array accesses are checked to ensure that their indices are within the range of the array. An exception is generated if the index is outside the bounds of the array.
To get the length of an array, read the length field of the array object whose length you wish to know: myPoints.length gives the number of elements in myPoints. For instance, the code fragment:
    howMany = myPoints.length;
would assign the value 10 to the howMany variable.
The C notion of a pointer to an array of memory elements is gone, and with it, the arbitrary pointer arithmetic that leads to unreliable code in C. No longer can you walk off the end of an array, possibly trashing memory and leading to the famous "delayed-crash" syndrome, where a memory-access violation today manifests itself hours or days later. Programmers can be confident that array checking in Java will lead to more robust and reliable code.
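The run-time bounds check can be observed directly. The sketch below assumes current Java, where an array's size is read through its length field and an out-of-range index raises ArrayIndexOutOfBoundsException; ArrayDemo and isRejected are hypothetical names used only here.

```java
class ArrayDemo {
    // Reports whether reading a[index] is rejected by the run-time bounds check.
    static boolean isRejected(Object[] a, int index) {
        try {
            Object unused = a[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] slots = new Object[10];           // ten references, all null
        System.out.println(slots.length);          // prints 10
        System.out.println(slots[0] == null);      // prints true
        System.out.println(isRejected(slots, 10)); // prints true
        System.out.println(isRejected(slots, 9));  // prints false
    }
}
```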
The String class is for read-only (immutable) objects. The StringBuffer class is for string objects you wish to modify (mutable string objects).
Although strings are Java language objects, the Java compiler follows the C tradition of providing a syntactic convenience that C programmers have enjoyed with C-style strings, namely, the Java compiler understands that a string of characters enclosed in double quote signs is to be instantiated as a String object. Thus, the declaration:
    String hello = "Hello world!";
instantiates an object of the String class behind the scenes and initializes it with a character string containing the Unicode character representation of "Hello world!".
Java has extended the meaning of the + operator to indicate string concatenation. Thus you can write statements like:
    System.out.println("There are " + num + " characters in the file.");
This code fragment concatenates the string "There are " with the result of converting the numeric value num to a string, and concatenates that with the string " characters in the file.". Then it prints the result of those concatenations on the standard output.
String objects provide a length() accessor method to obtain the number of characters in the string.
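The split between immutable String objects and mutable StringBuffer objects, together with + concatenation, can be sketched as follows. StringDemo and its methods are illustrative assumptions; the String and StringBuffer calls are standard library methods.

```java
class StringDemo {
    // Concatenation with + converts the int operand to its string form.
    static String describe(int num) {
        return "There are " + num + " characters in the file.";
    }

    // A StringBuffer is modified in place and converted to a String at the end.
    static String buildGreeting() {
        StringBuffer buffer = new StringBuffer("Hello");
        buffer.append(" world!");
        return buffer.toString();
    }

    // String methods return new objects; the original string is left unchanged.
    static String unchangedAfterUpperCase(String s) {
        s.toUpperCase(); // result deliberately discarded
        return s;
    }

    public static void main(String[] args) {
        System.out.println(describe(42));
        System.out.println(buildGreeting().length());       // prints 12
        System.out.println(unchangedAfterUpperCase("abc")); // prints abc
    }
}
```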
Java has no goto statement. To break or continue multiple-nested loop or switch constructs, you can place labels on loop and switch constructs, and then break out of or continue to the block named by the label. Here's a small fragment of code from Java's built-in String class:
    test: for (int i = fromIndex; i + max1 <= max2; i++) {
        if (charAt(i) == c0) {
            for (int k = 1; k < max1; k++) {
                if (charAt(i + k) != str.charAt(k)) {
                    continue test;
                }
            } /* end of inner for loop */
        }
    } /* end of outer for loop */
The continue test statement is inside a for loop nested inside another for loop. By referencing the label test, the continue statement passes control to the outer for statement. In traditional C, continue statements can only continue the immediately enclosing loop; to continue or exit outer blocks, programmers have traditionally either used auxiliary Boolean variables whose only purpose is to determine if the outer block is to be continued or exited, or (mis)used the goto statement to exit out of nested blocks. Use of labelled blocks in Java leads to considerable simplification in programming effort and a major reduction in maintenance.
The notion of labelled blocks dates back to the mid-1970s, but it hasn't caught on to any large extent in modern programming languages. Perl is another modern programming language that implements the concept of labelled blocks. Perl's next label and last label are equivalent to continue label and break label statements in Java.
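A self-contained sketch of both labelled forms may help here. It searches a small grid with a labelled break and skips rows with a labelled continue; LabelDemo, the method names, and the labels search and rows are assumptions made for this illustration.

```java
class LabelDemo {
    // Returns the index of the first row containing target, or -1 if none does.
    static int firstRowContaining(int[][] grid, int target) {
        int found = -1;
        search:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    found = r;
                    break search; // leaves both loops at once
                }
            }
        }
        return found;
    }

    // Counts the rows that contain no negative value.
    static int rowsWithoutNegatives(int[][] grid) {
        int count = 0;
        rows:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] < 0) {
                    continue rows; // skips the rest of this row, including count++
                }
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, -4 }, { 5, 6 } };
        System.out.println(firstRowContaining(grid, 5)); // prints 2
        System.out.println(rowsWithoutNegatives(grid));  // prints 2
    }
}
```

No auxiliary flag variable is needed in either method, which is the simplification the text describes.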
Java completely removes the memory management load from the programmer. C-style pointers, pointer arithmetic, malloc, and free do not exist. Automatic garbage collection is an integral part of Java and its run-time system. While Java has a new operator to allocate memory for objects, there is no explicit free function. Once you have allocated an object, the run-time system keeps track of the object's status and automatically reclaims memory when objects are no longer in use, freeing memory for future use.
Java's memory management model is based on objects and references to objects. Because Java has no pointers, all references to allocated storage, which in practice means all references to an object, are through symbolic "handles". The Java memory manager keeps track of references to objects. When an object has no more references, the object is a candidate for garbage collection.
Java's memory allocation model and automatic garbage collection make your programming task easier, eliminate entire classes of bugs, and in general provide better performance than you'd obtain through explicit memory management. Here's a code fragment that illustrates when garbage collection happens. It's an example from the on-line Java language programmer's guide:
    class ReverseString {
        public static String reverseIt(String source) {
            int i, len = source.length();
            StringBuffer dest = new StringBuffer(len);
            for (i = (len - 1); i >= 0; i--) {
                dest.append(source.charAt(i));
            }
            return dest.toString();
        }
    }
The variable dest is used as a temporary object reference during the execution of the reverseIt method. When dest goes out of scope (the reverseIt method returns), the reference to that object has gone away and it's then a candidate for garbage collection.
This use of a thread to run the garbage collector is just one of many examples of the synergy one obtains from Java's integrated multithreading capabilities--an otherwise intractable problem is solved in a simple and elegant fashion.
The first step was to eliminate redundancy from C and C++. In many ways, the C language evolved into a collection of overlapping features, providing too many ways to say the same thing, while in many cases not providing needed features. C++, in an attempt to add "classes in C", merely added more redundancy while retaining many of the inherent problems of C.
Java has no #define and related capabilities, no typedef, and, absent those features, no longer any need for header files. Instead of header files, Java language source files provide the definitions of other classes and their methods.
A major problem with C and C++ is the amount of context you need to understand another programmer's code: you have to read all related header files, all related #defines, and all related typedefs before you can even begin to analyze a program. In essence, programming with #defines and typedefs results in every programmer inventing a new programming language that's incomprehensible to anybody other than its creator, thus defeating the goals of good programming practices.
In Java, you obtain the effects of #define by using constants. You obtain the effects of typedef by declaring classes--after all, a class effectively declares a new type. You don't need header files because the Java compiler compiles class definitions into a binary form that retains all the type information through to link time.
By removing all this baggage, Java becomes remarkably context-free. Programmers can read and understand code and, more importantly, modify and reuse code much faster and more easily.
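As a rough sketch of constants standing in for #define, the class below groups two named values that other classes read directly, with no header file involved. Limits, MAX_POINTS, SCALE, and ConstantsDemo are hypothetical names chosen for this example.

```java
class Limits {
    // Named constants play the role that #define plays in C.
    static final int MAX_POINTS = 10;
    static final double SCALE = 2.5;
}

class ConstantsDemo {
    // Other classes refer to the constants through the class name.
    static double scaledMax() {
        return Limits.MAX_POINTS * Limits.SCALE;
    }

    public static void main(String[] args) {
        System.out.println(scaledMax()); // prints 25.0
    }
}
```

Unlike a macro, each constant has a declared type that the compiler checks wherever it is used.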
The code fragment below declares a class called Point.
    class Point extends Object {
        double x;
        double y;
        // methods to access the instance variables
    }
The following code fragment declares a class called Rectangle that uses objects of the Point class as instance variables.
    class Rectangle extends Object {
        Point lowerLeft;
        Point upperRight;
        // methods to access the instance variables
    }
In C you'd define these classes as structures. In Java, you simply declare classes. You can make the instance variables as private or as public as you wish, depending on how much you wish to hide the details of the implementation from other objects.
Here's the Point class from above. We've added public methods to set and access the instance variables:
    class Point extends Object {
        double x;
        double y;
        public void setX(double x) {
            this.x = x;
        }
        public void setY(double y) {
            this.y = y;
        }
        public double x() {
            return x;
        }
        public double y() {
            return y;
        }
    }
If the x and y instance variables are private to this class, the only means to access them is via the public methods of the class. Here's how you'd use objects of the Point class from within, say, an object of the Rectangle class:
    class Rectangle extends Object {
        Point lowerLeft;
        Point upperRight;
        public void setEmptyRect() {
            lowerLeft.setX(0.0);
            lowerLeft.setY(0.0);
            upperRight.setX(0.0);
            upperRight.setY(0.0);
        }
    }
This is not to say that functions and procedures are inherently wrong. But given classes and methods, we're now down to only one way to express a given task. By eliminating functions, your job as a programmer is immensely simplified: you work only with classes and their methods.
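The Rectangle fragment assumes that lowerLeft and upperRight already refer to Point objects. A runnable variant, offered only as a sketch, allocates them when the rectangle is created and makes the instance variables private. The names PointSketch and RectangleSketch are assumptions introduced so as not to be confused with the classes in the text.

```java
class PointSketch {
    private double x;
    private double y;

    public void setX(double x) { this.x = x; }
    public void setY(double y) { this.y = y; }
    public double x() { return x; }
    public double y() { return y; }
}

class RectangleSketch {
    // Both corners are allocated up front, so setEmptyRect never sees a null reference.
    private final PointSketch lowerLeft = new PointSketch();
    private final PointSketch upperRight = new PointSketch();

    public PointSketch lowerLeft() { return lowerLeft; }
    public PointSketch upperRight() { return upperRight; }

    public void setEmptyRect() {
        lowerLeft.setX(0.0);
        lowerLeft.setY(0.0);
        upperRight.setX(0.0);
        upperRight.setY(0.0);
    }

    public static void main(String[] args) {
        RectangleSketch r = new RectangleSketch();
        r.upperRight().setX(4.0);
        r.upperRight().setY(3.0);
        r.setEmptyRect();
        System.out.println(r.upperRight().x() + " " + r.upperRight().y()); // prints 0.0 0.0
    }
}
```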
An interface is not a definition of an object. Rather, it's a definition of a set of methods that one or more objects will implement. An important point about interfaces is that they declare only methods and constants. No variables may be defined in interfaces.
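A minimal sketch of an interface that declares one constant and one method, with two unrelated classes implementing it, might look like this. Shape, UNIT, Square, UnitBox, and InterfaceDemo are all hypothetical names for illustration.

```java
interface Shape {
    double UNIT = 1.0; // fields in an interface are implicitly constants
    double area();     // declared here, implemented by each class
}

class Square implements Shape {
    private final double side;

    Square(double side) { this.side = side; }

    public double area() { return side * side; }
}

class UnitBox implements Shape {
    public double area() { return UNIT * UNIT; }
}

class InterfaceDemo {
    // Works with any object whose class implements Shape.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (int i = 0; i < shapes.length; i++) {
            total += shapes[i].area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2.0), new Square(3.0), new UnitBox() };
        System.out.println(totalArea(shapes)); // prints 14.0
    }
}
```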
Java has no goto statement. Studies have shown that goto is (mis)used more often than not simply "because it's there". Eliminating goto led to a simplification of the language--there are no rules about the effects of a goto into the middle of a for statement, for example. Studies on approximately 100,000 lines of C code determined that roughly 90 percent of the goto statements were used purely to obtain the effect of breaking out of nested loops. As mentioned above, multi-level break and continue remove most of the need for goto statements.
    int myInt;
    double myFloat = 3.14159;
    myInt = myFloat;
The assignment of myFloat to myInt would result in a compiler error indicating a possible loss of precision and that you must use an explicit cast. Thus, you should rewrite the code fragment as:
    int myInt;
    double myFloat = 3.14159;
    myInt = (int)myFloat;
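The effect of the explicit cast is worth seeing: converting a double to an int discards the fractional part, while the opposite direction needs no cast. CastDemo, truncate, and widen are illustrative names assumed for this sketch.

```java
class CastDemo {
    // Narrowing double -> int requires a cast and truncates toward zero.
    static int truncate(double value) {
        return (int) value;
    }

    // Widening int -> double loses nothing, so no cast is required.
    static double widen(int value) {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.14159)); // prints 3
        System.out.println(truncate(-3.99));   // prints -3
        System.out.println(widen(3));          // prints 3.0
    }
}
```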
You no longer have dangling pointers and trashing of memory because of incorrect pointers, because there are no pointers in Java.
The Java(tm) Language Environment: A White Paper