Appendix C

Moving C/C++ Legacy Code to Java


Java is a new and powerful language that provides many useful features to the software developer. However, if your software organization is typical, you will have to weigh a move to Java against the constraints imposed by dependencies on in-place legacy code. This appendix summarizes the pros and cons of moving existing legacy code to Java. It identifies a spectrum of approaches to software transition, discusses the issues involved with each approach, and covers approaches to translating C and C++ code to Java. It assumes that the transition of C/C++ code to Java is being performed by a moderately large software organization; some of the porting issues become insignificant if only a few small programs are translated.

Why Move to Java?

When deciding whether to move existing applications to Java, you must weigh the advantages of such a move against its disadvantages. This section identifies many of the advantages of Java programs over C- and C++-based applications. The following section considers some disadvantages of using Java and roadblocks to any software transition effort.

Platform Independence

One of the most compelling reasons to move to Java is its platform independence. Java runs on most major hardware and software platforms, including Windows 95 and NT, the Macintosh, and several varieties of UNIX. Java applets are supported by all Java-compatible browsers. By moving existing software to Java, you are able to make it instantly compatible with these software platforms. Your programs become more portable. Any hardware and operating system dependencies are removed.

Although C and C++ are supported on all platforms that support Java, these languages are not supported in a platform-independent manner. C and C++ applications that are implemented on one operating system platform are usually tightly intertwined with the native windowing system and OS-specific networking capabilities. Moving between OS platforms requires recompilation at a minimum and, in most cases, significant redesign.

Object Orientation

Java is a true object-oriented language. It does not merely provide the capability to implement object-oriented principles; it enforces these principles. You can develop object-oriented programs in C++, but you are not required to do so; you can use C++ to write C programs as well. Java does not allow you to slip outside the object-oriented framework. You either adhere to Java's object-oriented development approach or you do not program in Java.

Security

Java is one of the first programming languages to consider security as part of its design. The Java language, compiler, interpreter, and runtime environment were each developed with security in mind. The compiler, interpreter, and Java-compatible browsers all contain several levels of security measures that are designed to reduce the risk of security compromise, loss of data and program integrity, and damage to system users. Considering the enormous security problems associated with executing potentially untrusted code in a secure manner and across multiple execution environments, Java's security measures are far ahead of even those developed to secure military systems. C and C++ do not have any intrinsic security capabilities. Can you download an arbitrary untrusted C or C++ program and execute it in a secure manner?

Reliability

Security and reliability go hand in hand. Security measures cannot be implemented with any degree of assurance without a reliable framework for program execution. Java provides multiple levels of reliability measures, beginning with the Java language itself. Many of the features of C and C++ that are detrimental to program reliability, such as pointers and automatic type conversion, are avoided in Java. The Java compiler provides several levels of additional checks to identify type mismatches and other inconsistencies. The Java runtime system duplicates many of the checks performed by the compiler and performs additional checks to verify that the executable bytecodes form a valid Java program.
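A minimal sketch of two such checks follows. The class name ReliabilityChecks and its methods are hypothetical, chosen only for illustration. The sketch assumes nothing beyond the core language: an out-of-range array index raises ArrayIndexOutOfBoundsException at run time rather than reading adjacent memory, and narrowing a double to an int compiles only when an explicit cast is written.

```java
// Hypothetical illustration; the class and method names are not part of the Java API.
class ReliabilityChecks {

    // Returns values[index], or fallback when the runtime rejects the index.
    static int elementOrDefault(int[] values, int index, int fallback) {
        try {
            return values[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            // The runtime bounds check fired; nothing outside the array was read.
            return fallback;
        }
    }

    // Narrowing from double to int must be requested with an explicit cast.
    // Writing "int n = d;" here would be rejected by the compiler.
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        int[] samples = {10, 20, 30};
        System.out.println(elementOrDefault(samples, 1, -1)); // in range
        System.out.println(elementOrDefault(samples, 7, -1)); // out of range
        System.out.println(truncate(3.9));
    }
}
```

In equivalent C code, the out-of-range read would have undefined behavior and the narrowing assignment would compile silently on most compilers.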

Simplicity

The Java language was designed to be a simple language to learn, building on the syntax and many of the features of C++. However, in order to promote security, reliability, and simplicity, Java has left out those elements of C and C++ that contribute to errors and program complexity. In addition, Java provides automated garbage collection, freeing you from having to manage memory deallocation in your programs. As a result, those who have programmed in C or C++ can get up to speed writing Java programs quickly. Java programs are also less complex than C and C++ programs because many of the language elements that lead to program complexity have been removed.

Language Features

The Java language provides many language features that make it preferable to C or C++ for modern software development. At the top of this list is Java's intrinsic support for multithreading, which is lacking in both C and C++. Other features are its exception-handling capabilities, which were recently introduced into C++; its strict adherence to class and object-oriented software development; and its automated garbage-collection support. In addition to these features, Java enforces a common programming style by removing the capability to slip outside the class- and object-oriented programming paradigm to develop C-style function-oriented programs.
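The following sketch shows threads and exception handling working together. ThreadedSum and sumOnWorker are hypothetical names used only for this example; Thread, Runnable, and InterruptedException are standard classes in java.lang. The sketch assumes a single worker thread is enough to make the point.

```java
// Hypothetical illustration; ThreadedSum is not part of the Java API.
class ThreadedSum {

    // Sums the integers 1..n on a separate thread and returns the total.
    static long sumOnWorker(final int n) {
        final long[] total = new long[1];
        Thread worker = new Thread(new Runnable() {
            public void run() {
                long sum = 0;
                for (int i = 1; i <= n; i++) {
                    sum += i;
                }
                total[0] = sum;
            }
        });
        worker.start();
        try {
            // Wait for the worker; its writes are visible once join() returns.
            worker.join();
        } catch (InterruptedException e) {
            // Preserve the interrupt for callers, then report the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker");
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sumOnWorker(100));
    }
}
```

Both the thread and the exception handler are expressed with language keywords and core classes; no platform-specific threading library is involved.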

Standardization

Although C and C++ have been standardized by the American National Standards Institute (ANSI), many C and C++ compilers provide custom enhancements to the language, usually through additional preprocessor directives. Because these enhancements usually make their way into source code programs, a general lack of standardization results. Java does not yet suffer from any standardization problems because its syntax and semantics are controlled by a single organization.

The Java API

The predefined classes of the Java API provide a comprehensive platform-independent foundation for program development. These classes provide the capability to develop window and network programs that execute on a wide range of hosts. The Java I/O stream classes also provide a very useful set of filters for I/O processing. Although C and C++ may provide more extensive software libraries, none of these libraries provides as much platform-independent power as Java's API.
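As an illustration of the I/O filters, the sketch below stacks filter streams from java.io on top of an in-memory byte stream. The class StreamFilters and its roundTrip method are hypothetical; the stream classes themselves are standard. An in-memory buffer is assumed here only to keep the example self-contained; a file or socket stream could sit at the bottom of the same stack.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration; StreamFilters is not part of the Java API.
class StreamFilters {

    // Writes an int and a string through a stack of output filters,
    // then reads them back through the matching stack of input filters.
    static String roundTrip(int number, String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            DataOutputStream out =
                new DataOutputStream(new BufferedOutputStream(sink));
            out.writeInt(number);   // typed write supplied by the Data filter
            out.writeUTF(text);
            out.flush();            // push buffered bytes down to the sink

            DataInputStream in = new DataInputStream(
                new BufferedInputStream(
                    new ByteArrayInputStream(sink.toByteArray())));
            return in.readInt() + ":" + in.readUTF();
        } catch (IOException e) {
            throw new IllegalStateException("in-memory stream failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42, "legacy"));
    }
}
```

Each filter adds one capability (buffering, typed reads and writes) without knowing what lies beneath it, which is what makes the same code portable across hosts.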

Transition to Distributed Computing

Sun has taken important steps to support fully distributed computing with its support of Remote Objects for Java. This product provides the capability to develop remote interfaces between Java objects and objects developed in other languages. The Java Interface Definition Language (IDL) can be used to support Common Object Request Broker Architecture (CORBA) integration.

Rapid Code Generation

Because Java is an interpreted language, it can be used to rapidly prototype applications that would require considerably more base software support in languages such as C or C++. The Java API also contributes to the capability to support rapid code generation. The classes of the Java API provide an integrated, easy-to-use repository for the development of application-specific software. Because the Java API provides high-level windowing and networking support, custom application prototypes can be constructed more quickly using these classes as a foundation.

Ease of Documentation and Maintenance

Java software is essentially self-documenting when doc comments and the javadoc tool are used to generate software documentation. The excellent Java API documentation is an example of the superior documentation capabilities provided by Java. Because Java software is inherently better structured and documented than C or C++ software, it is generally easier to maintain. In addition, the package orientation of Java software affords considerable modularity in software design, development, documentation, and maintenance.
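A doc comment is an ordinary comment that begins with /** and may carry tags such as @param and @return. The sketch below uses a hypothetical class, TemperatureConverter, to show the form. The class is left package-private only to keep the snippet independent of its file name; javadoc documents public and protected members by default, so real code would either declare the class public or run javadoc with the -package option.

```java
/**
 * Converts temperatures between scales.
 * This class is a hypothetical example and is not part of the Java API.
 */
class TemperatureConverter {

    /**
     * Converts a temperature from degrees Celsius to degrees Fahrenheit.
     *
     * @param celsius the temperature in degrees Celsius
     * @return the equivalent temperature in degrees Fahrenheit
     */
    static double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    public static void main(String[] args) {
        System.out.println(toFahrenheit(100.0));
    }
}
```

Because the documentation lives beside the code it describes, regenerating it after a change is a matter of rerunning the tool.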

Reasons Against Moving to Java

Java provides many benefits that make it an attractive language both for developing new applications and for porting existing legacy code. The previous section discussed some of the advantages of porting existing code to Java. This section identifies some of the disadvantages of any C- or C++-to-Java migration effort.

Compatibility

Although Java is supported on many platforms, it is not supported on all of them. If your target hardware or software platform does not support Java, you are out of luck. Your alternatives are to switch to a different platform or wait for Java to be ported to your existing software platform.

Compatibility may also be a problem at a design level. Suppose that your target software platform does, in fact, support Java. If your legacy code is unstructured and incompatible with a class- and object-oriented model, the effort required to migrate the software may be prohibitive.

Performance

Java is interpreted, and although its execution is efficient, it might not meet the performance demands of those applications in which execution speed is of paramount importance. Examples of these types of applications include numerical "number crunching" programs, real-time control processes, language compilers, and modeling and simulation software. Just because your application fits into one of these categories does not necessarily rule out Java, however. For example, the Java compiler is written in Java and performs admirably for small programs. However, its performance is greatly enhanced when it is compiled into native machine code instructions. Java-to-C translators allow programs to be developed in Java and translated into C for native machine code compilation. The translation process generally improves the performance of Java programs.

Retraining

Although Java is simple, easy to learn, and based on C++, some training may be required to get programmers up and running writing Java code. This is especially true if the programmers have been using C++ in a nonstructured, non-object-oriented fashion. I never really appreciated the object-oriented programming features provided by C++ before I began programming in Java. Until I had adopted the Java program-development mindset, I was trying to apply my outdated and inefficient C++ programming techniques to Java software development. After I had made the mental transition to the Java object-oriented programming model, I became much more comfortable and efficient in writing Java programs.

Impact on Existing Operations

Moving legacy code to Java may have adverse effects on company operations that are supported by legacy software. This is especially true when the legacy code is implemented in a poorly structured, convoluted manner that typically evolves from extensive software patches and upgrades. If existing system software is tightly coupled and fragile, a transition to Java (or any other language) may break the software application to the point where a complete software redevelopment is required.

Cost, Schedule, and Level of Effort

Any software transition effort is subject to cost and schedule constraints. Moving current legacy software to Java might not be cost-effective given the current software investment and its expected operational life. The software transition may also have a significant impact on system availability and previously scheduled activities. Transition from C or C++ to Java might also require a significant level of effort that would exceed the expected budget for the maintenance of the legacy code.

Transition Approaches and Issues

There are many ways to integrate Java into existing software applications. This section identifies some of these approaches and explores the issues involved in transitioning to a Java-based software environment.

Interfacing with Existing Legacy Code

One of the easiest ways to introduce Java to an operational environment is to use it to add functionality to existing legacy code. Java programs do not replace existing legacy software; they merely enhance it to support new applications. This approach involves minimal impact to existing software, but does introduce a potentially thorny maintenance issue with Java being added to the current list of languages that must be used to maintain the system.

Incremental Reimplementation of Legacy Code

An incremental approach to reimplementing legacy code in Java can be used to cut over to a Java-based software-development approach while minimizing the impact on existing legacy software. This approach assumes that the legacy software is developed in a modular fashion and can be replaced in an incremental manner. If this is the case, legacy software can be migrated to Java on a module-by-module basis with the legacy code ultimately replaced by new Java software.
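One way to picture module-by-module replacement is to put each module behind a Java interface so that callers never learn which implementation is installed. The sketch below is an assumption about how this could be arranged, not a prescribed method. Every name in it (IncrementalMigration, ReportModule, LegacyReportModule, JavaReportModule) is hypothetical, and the legacy wrapper only simulates a call into existing code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Java API.
class IncrementalMigration {

    // The contract every implementation of the module must honor.
    interface ReportModule {
        String render(String customer);
    }

    // Stand-in for a wrapper around a legacy module. A real wrapper would
    // delegate to the existing C/C++ code; this one only simulates its output.
    static class LegacyReportModule implements ReportModule {
        public String render(String customer) {
            return "LEGACY:" + customer;
        }
    }

    // The new Java reimplementation of the same module.
    static class JavaReportModule implements ReportModule {
        public String render(String customer) {
            return "JAVA:" + customer;
        }
    }

    private final Map<String, ReportModule> modules =
        new HashMap<String, ReportModule>();

    // Installs or replaces the implementation registered under a module name.
    void install(String name, ReportModule module) {
        modules.put(name, module);
    }

    // Runs whichever implementation is currently installed.
    String run(String name, String customer) {
        ReportModule module = modules.get(name);
        if (module == null) {
            throw new IllegalArgumentException("no module named " + name);
        }
        return module.render(customer);
    }
}
```

A system would start with the legacy wrapper installed under a name such as "billing" and later call install again with the Java implementation; code that calls run does not change when the cutover happens.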

Off-Boarding Access to Legacy Objects

If in-place legacy code can be upgraded using Java software that is implemented on separate hardware platforms, Java can be used to "off board" many of the functions performed by the legacy code. The use of off-board server software allows the investment in legacy code to be preserved while expanding the services provided by the system as a whole.

Full-Scale Redevelopment

In some cases, it is more cost-effective to keep legacy code in place while completely redeveloping system software from scratch. This is typically the case when the system is subject to large-scale reengineering or when it is so fragile that it breaks as the result of the simplest upgrades. When full-scale redevelopment is necessary, Java software development actually benefits, because the new software is under no legacy compatibility constraints and can take full advantage of Java's capabilities.

Translation Approaches and Issues

Translation of existing C and C++ code into Java can be performed in several different ways, depending on the compatibility of the existing software with Java. This section describes some of the different approaches to software translation.

Automated Translation

Tools and utilities have been developed that allow Java source and bytecode to be translated into C to support native machine code compilation. Future Java-integrated software-development environments are planned where either Java or C++ code may be generated based on the configuration of the development software. These development tools will allow easy movement between C++ and Java. These tools require a common set of libraries that can be used by either Java or C++ programs. Automated translation between these two languages will be supported to some extent.

The degree to which C++ programs may be automatically translated into Java will depend on the planning and effort put into the code's design to develop it in a way that makes it more amenable to automated translation. Factors to be considered include the use of compatible libraries, the use of single inheritance, the use of object-oriented programming capabilities, and minimization of the use of incompatible language features.
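The single-inheritance factor can be made concrete. Java allows a class to extend only one superclass but to implement any number of interfaces, so a C++ class with two base classes has no direct Java equivalent. The sketch below shows one possible mapping; the C++ fragment in the comment and all Java names are hypothetical.

```java
// C++ original (hypothetical), using multiple inheritance:
//   class Printer { public: virtual std::string print() { return "print"; } };
//   class Scanner { public: virtual std::string scan()  { return "scan";  } };
//   class Copier : public Printer, public Scanner { };

// Hypothetical Java mapping; none of these names come from the Java API.
class InheritanceMapping {

    // Each C++ base class becomes an interface describing one role.
    interface Printer {
        String print();
    }

    interface Scanner {
        String scan();
    }

    // One role keeps a concrete implementation that can be inherited.
    static class BasicPrinter implements Printer {
        public String print() {
            return "print";
        }
    }

    // The derived class extends a single superclass and implements
    // the second role itself.
    static class Copier extends BasicPrinter implements Scanner {
        public String scan() {
            return "scan";
        }

        String copy() {
            return scan() + "+" + print();
        }
    }
}
```

The second base class's behavior has to be reimplemented or delegated by hand, which is why legacy code that already limits itself to single inheritance is easier to translate.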

Manual Translation

Manual translation of C and C++ to Java will probably be the most common approach to moving C and C++ legacy programs to Java. This approach requires you to use two editor windows: one for the legacy C++ code being translated and the other for the Java program being created. Some of the translation is accomplished by cutting and pasting C++ statements into the Java window, making the corrections necessary to adjust for language differences. Other parts of the translation require that new Java classes, interfaces, variables, and methods be developed to implement C++ functions and data structures that cannot be directly translated from C++ to Java. The effectiveness of the manual translation process will be determined by the degree to which the C++ legacy code meets the compatibility considerations identified at the end of the previous section.
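A small example shows what this looks like in practice. The C++ fragment in the comment is hypothetical legacy code, and the Java class beneath it is one plausible hand translation. The arithmetic statements carry over almost unchanged, while the struct and the free function that takes a pointer must be recast as a class with a method.

```java
// C++ original (hypothetical legacy code):
//   struct Account { char owner[32]; long cents; };
//   bool withdraw(Account* acct, long amount) {
//       if (acct == NULL || amount > acct->cents) return false;
//       acct->cents -= amount;
//       return true;
//   }

// Hypothetical Java translation of the fragment above.
class Account {
    private final String owner; // replaces the fixed-size char array
    private long cents;

    Account(String owner, long cents) {
        this.owner = owner;
        this.cents = cents;
    }

    // The free function becomes a method. The NULL test disappears because
    // the method is always invoked on an existing object.
    boolean withdraw(long amount) {
        if (amount > cents) {
            return false;
        }
        cents -= amount; // pasted from the C++ body with "acct->" removed
        return true;
    }

    long balance() {
        return cents;
    }

    String owner() {
        return owner;
    }
}
```

The body of withdraw is close to a cut-and-paste job; the surrounding class, constructor, and accessors are the new Java scaffolding that manual translation has to supply.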

Source-Level Redesign

In many cases, manual translation may be hampered by the fact that the C++ legacy code might be written in a style that renders it impossible to migrate using cut-and-paste-based translation methods. In these cases, a class- and object-oriented design of the legacy code needs to be extracted from the legacy code and used as the basis for the Java source code development. A two-level approach to software translation is followed. The legacy code is reverse-engineered to an object-oriented design, and the recovered design information is used to develop a Java software design which is, in turn, translated into Java source code. Code is not translated from one language to another. Instead, legacy code is translated into general design information that is used to drive the Java design and implementation.
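As a sketch of this two-level approach, consider legacy code that dispatches on a type tag. The recovered design is "every shape can report its area," and the Java code is written from that design rather than from the original statements. The C fragment in the comment and all Java names are hypothetical.

```java
// Legacy C-style code (hypothetical), dispatching on a type tag:
//   struct Shape { int kind; double a; double b; };
//   double area(struct Shape* s) {
//       switch (s->kind) {
//           case RECT:   return s->a * s->b;
//           case CIRCLE: return 3.14159265358979 * s->a * s->a;
//       }
//       return 0.0;
//   }

// Hypothetical Java design recovered from the code above.
class ShapeRedesign {

    // The recovered abstraction: each shape computes its own area.
    static abstract class Shape {
        abstract double area();
    }

    static class Rectangle extends Shape {
        private final double width;
        private final double height;

        Rectangle(double width, double height) {
            this.width = width;
            this.height = height;
        }

        double area() {
            return width * height;
        }
    }

    static class Circle extends Shape {
        private final double radius;

        Circle(double radius) {
            this.radius = radius;
        }

        double area() {
            return Math.PI * radius * radius;
        }
    }

    // Callers no longer inspect a tag; dynamic dispatch selects the formula.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (int i = 0; i < shapes.length; i++) {
            total += shapes[i].area();
        }
        return total;
    }
}
```

None of the switch statement survives in the Java version, which is the sense in which the design, not the code, is what gets translated.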