This chapter deals with the principles involved in programming — programs that are used to solve day-to-day problems and programs that make the computer system more usable from both an application programmer's and a user's point of view. An overview of programming will be given in view of the above class of problems. A brief history of C as an efficient programming language is provided at the end of the chapter.
A program is generally made up of a sequence of instructions that clearly specify what a computer is supposed to do. Software, in contrast, is a set of programs that are either supplied by the manufacturer of the hardware associated with it or written by an application developer, with purposeful intent of commercial or public use. Thus, for a program to qualify as software, it should have the properties of correctness, versatility, and maintainability in addition to well-documented code.
Software engineering is a methodical approach used in software development. It is defined in IEEE Software Engineering Standards 1987 as Software Engineering is the systematic approach to the development, operation, maintenance and retirement of software.
A program is rarely executed correctly on a computer system unless considerable amount of time has been spent in the problem-solving phase of its development. Coding can start only after the problem-solving phase is complete. The computer language (code) is then coded from the prepared plan. The plan for solving a problem involves development of a series of instructions, referred to as an algorithm. An algorithm enables us to solve the problem at hand in a systematic way. The development of an algorithm for a problem is justified by the fact that it is necessary to select the correct logical steps to be followed so that the problem can be solved in a quick and an efficient manner.
Let us understand the working of a program by drawing an analogy with a blender used to prepare our favourite recipe. To prepare food, we follow different types of processes - the first pertaining to the use of the blender and the second to the ingredients required for the recipe. Here the recipe is analogous to the program, the ingredients are the input, and the prepared food is the output of the program.
We require two different types of software to enable a computer to process data, namely, systems software and application software. Returning to the analogy, application software can be equated with the recipe, while systems software acts as the blender. Let us take a look into each type of software first.
Computer software consists of two major types of programs:
Systems software can be subdivided into system control programs, system support programs, and system development programs.
The most important system software for any system is its operating system. An operating system is an integrated system of programs that supervises the operations of the Central Processing Unit (CPU) - controls the input/output (I/O) and storage functions of the computer system, and provides various support services. The control programs of an operating system perform three important tasks like job, resource, and data management. Other programs that could be part of the operating system or that can be acquired as service programs may include language translators and other software.
Language translator programs convert programming language instructions into machine language instruction codes. Service programs are specialized programs that perform routine and repetitive functions, and are made available to all the users of a computer system. A common service program is the linkage editor, which updates a program by defining the specific storage locations it requires and links together the parts of the program, which require other routines. Major categories of service programs are utility programs, which perform various housekeeping and file conversion functions.
Application software refers to those programs that are written to solve day-to-day problems such as payroll maintenance, inventories, billings, etc. An application program is written by a programmer or a group of programmers (employed by a software development firm), and the latter charges a fee for rendering the service. Given a certain application requirement, package software provides an easy generic solution assuming it meets the needs of the client. The use of package software counterpoints with specialized software, which is developed for a particular client requirement.
Application software includes a variety of programs that can be segregated into general purpose, business, scientific, and other application program categories.
A program is a sequence of instructions used to perform a specific task. The program development process can be viewed in a number of steps as follows:
An algorithm is a well-defined computational procedure (sequence of computational steps). It takes a set of values as its input and produces a set of values as its output. It can be viewed as a utility for solving a computational problem.
There are a number of important features that must be satisfied by any algorithm:
Let us illustrate an example of algorithm for calculating the Greatest Common Divisor (GCD) of any two numbers. The problem must be stated explicitly before an algorithm can be devised. The GCD problem is stated as:
“Given two positive integers, p and q, find their GCD x, (i.e., the largest positive integer that divides both p and q exactly).”
We present Euclid's algorithm for the problem. (Here, Si's are the steps of the algorithm and ‘i’ refers to the step number.)
Algorithm Euclid GCD
Begin
S1: | [ Calculate remainder] |
Divide p by q and let r be the remainder. | |
S2: | [Terminate algorithm if remainder is zero] |
If r is zero, go to Step S4. | |
S3: | [Loop] |
Set p ← q
set q ← r Go to Step S1. |
|
S4: | [Set output GCD to q] |
Set x ← q
End. |
End algorithm Euclid GCD.
By examining the steps illustrated in Euclid GCD, we find that the procedure has inputs p and q, and produces x as output. It can be shown that the procedure terminates after a finite sequence of steps (ensured by S3). Each step is, at the same time, precisely defined and can easily be carried out by hand. Thus, we can state that the procedure given above is an algorithm.
As there are a number of methods that can be adapted for solving a particular problem, a number of algorithms corresponding to the various methods may exist. We often have to make a choice regarding the use of a particular algorithm as a suitable one. Such a selection necessitates analysis of algorithms. When we talk about analysing an algorithm for solving a particular problem, we try to predict the resources required by the algorithm. Although resources such as memory capacity and system bandwidth might be of concern, the efficiency of an algorithm is primarily measured by its running time.
Generally, it is found that the running time of an algorithm increases with the input size. Thus, the running time is described as a function of the input size. The number of items in the input or the number of bits required for input representation is the most general measure of input size.
We consider an algorithm to be more efficient than another if its worst-case running time has a lower order of growth.
The flowchart is an important tool aiding development. A flowchart is a graphic/pictorial representation of the steps necessary to solve a problem, accomplish a task, complete a process, or illustrate the components of a system. The basic types of flowcharts are system flowcharts and program flowcharts. A system flowchart is a representation of the components and flows in a system. A program flowchart represents the information processing steps to be performed within a computer program. In this section, we shall only illustrate program flowcharts. The various symbols used in drawing flowcharts are shown in Figure 1.1. Henceforth, we shall mean program flowcharts, when we refer to flowcharts unless stated otherwise.
The various flowchart symbols are explained as follows:
A program flowchart or flowchart illustrates the detailed sequence of steps undertaken by a program. It can be used to
A self-explanatory flowchart is illustrated in Figure 1.2(a) to read and add 10 integers and display the sum.
The flowchart for the algorithm Euclid GCD presented in Section 1.5 is shown in Figure 1.2(b).
A language is a medium for communication. The languages we speak are called natural languages. A programming language is a subset of the set of natural languages. It contains all the symbols, characters, and usage rules that permit a human being to communicate with computers. A variety of programming languages have been invented over the years of computer history. However, every programming language must accept certain types of written instructions that enable a computer system to perform a number of familiar operations. In other words, every programming language must have instructions that fall under the following categories:
Although all programming languages have an instruction set that permits these familiar operations to be performed, a marked difference is found between the symbols and syntax used in machine languages, assembly languages, and high-level languages.
Some applications are developed by coding in assembly language - a language closest to the machine language. This type of application software is most efficient for processing data. However, since a particular flavour of the assembly language is designed for a particular architecture, the type of assembly language understood by a computer depends upon the underlying architecture of the microprocessor used. For example, if a Zilog® microprocessor is used, then the machine language understood is Z-80. On the other hand, a flavour of the Intel® assembly language is used for a Pentium® microprocessor which differs from the Z-80. Thus, assembly languages are architecture-dependent.
Given the inability of assembly language to adapt across different architectural platforms and given the varied nature of real-world problems, a variety of computer languages, called high-level languages, were developed, each suited best to a model in a particular class of problems. High-level language programs are architecture-independent, easier to write, and provide easier maintenance and readability than their assembly-language counterparts. However, they require longer execution time compared to assembly-language programs.
Different high-level programming languages were introduced in the 1950s to reduce the problems that arose in writing code in assembly and machine language. When the first high-level languages were developed, the longer translation and execution time were considered as serious limitations of the technique. Nonetheless, the following factors contributed to the popularity of high-level languages for application development:
Some popular high-level languages include FORTRAN (FORmula TRANslation), COBOL (COmmon Business Oriented Language), Pascal (after Blaise Pascal), C (refer to Section 1.11), and Ada (after Ada Byron, Countess of Lovelace - the world's first computer programmer).
As stated earlier, the task of coding for a problem in a convenient programming language is performed only after extensive effort in the problem-solving stage. The following sections provide an insight into some commonly adopted programming techniques.
Early programming techniques developed in the 1950s centred on problem-solving by using bottom-up design of the solution in which the extreme details of the programming solution were investigated first, as opposed to beginning with a breakdown by broad objectives. Each program was written in isolation to solve a particular sub-problem. The difficulty arose when the various sub-programs had to work together to produce the desired programs. Program logic was guided by the limitations of primary memory, and programs were designed with the objective of executing them as fast as possible. However, as application programs grew in size, several programmers worked together to solve them. Project teams were set up, consisting of several programmers and a project leader. However, programmers often switch jobs and might leave a company before a project is fully developed, thus requiring another programmer to continue the unfinished work midstream. This required formulation of a definite summary of how a problem is to be solved. This was not provided by the bottom-up approach to programming. Another approach was required.
In recent years, computer memory ceased to be the limitation factor for most of the application programs. This, along with increasing software complexity and maintenance hitches, shifted the focus from the execution time to the programming techniques adopted in development of a program. It allowed programs to be written in a more organized manner, i.e., in a structured manner, producing code that is easier to read, analyze, and modify later - if the need arose. With increasing demands for software efficiency and programming standardization, a changed approach saw the programmers examining the problem as a whole and outlining the major steps to solve the problem. Then the process was repeated and the steps thus obtained were broken down in finer details. This is the top-down programming approach and is used in structured programming.
Structured programming is a methodology that is part of a renewed emphasis on software engineering, which involves the systematic design and development of software and the management of the software development process. Software engineering views the development of a program as a coordinated activity involving people, tools, and practices; use of modern design; and development and management methods in an integrated approach. Structured approach involves the use of methods such as top-down program design and a limited number of control structures in a program to create tightly structured modules of program code.
The effect of top-down design and structured programming has been used to lower the overall cost of programming. The structured approach promises to reduce the cost of developing and maintaining computer programs by standardizing program development and structures used. This increases the simplicity and accuracy, and at the same time minimizes programming and maintenance cost.
A ’traditional’ flexible and creative environment provided to the programmer often results in complex and difficult-to-read programs requiring much testing before they are error-free. These also become costly to develop and maintain. Structured programming, on the other hand, emphasizes group responsibility for program development. It also brings in a standardization of program-design concepts and methods, which significantly reduces the program complexity.
Organizations using structured programming have shown the following characteristics:
These results highlight the reasons for structured programming to continue to be a popular programming methodology.
Structured programming is used as a set of tools to improve program organization, facilitate problem solving, and make code easier to write and analyse in both individual and group projects. Using structured programming, the solution to the problem is divided into segments called modules.
The most common techniques used in structured programming to solve all problems are called constructs. Sequence, selection, and repetition constructs are shown in Figure 1.3. These constructs are also called control structures. Using these three basic control structures, it is possible to write standardized programs, which are easy to read and understand.
C has its origin in the early 1970s in Basic Combined Programming Language (BPCL), a project shared by AT&T Bell Labs, USA, and developed by Martin Richards. B was derived from BCPL, written in 1970 by Ken Thompson for the first UNIX system on the DEC PDP-7. Dennis Ritchie is credited with defining and creating C from the language B. Many others also influenced the development of the language. C is closely related to the UNIX operating system since it was developed along with it. Most of the UNIX commands are written in C. The basic goal was to create a language for writing operating systems. C was a mechanism for providing portable assembly language. Hence, C was originally intended for systems level programming. Today, C has been implemented on many different types of hardware and the language is used to tackle a large class of programming problems.
Most commercial implementations of C differ from Kernighan and Ritchie (K&R) original definition. This has created portability problems within different platforms. The American National Standards Institute (ANSI) established a committee in the beginning of the summer of 1983 to remove discrepancies from within different versions of C that were commercially available. C was finally standardized in 1988, and was named ANSI C. A standard ANSI-compatible C compiler includes some new features to aid in programming, in addition to all the features of a UNIX-based compiler.
C is a powerful and flexible general-purpose language that has gained the widest of acceptance over the years. Several features make C a very attractive programming language for coding solutions. These include:
Of course, the extent to which a particular program takes advantage of modularity and portability depends on how well the program is written (programmer's skill).
C is both a low- and a high-level language. It supports the low-level features (i.e., the bit-manipulation facilities of assembly) needed by the operating-system implementers and the compiler writers. It also supports the high-level features of control and data structures characteristic of a procedural language. C is as flexible as a low-level language. On the other hand, C provides the tools for a structured program, as a high-level language does. Thus, C is often called a middle-level programming language.
The hardware of a digital computer is operated on by software instructions. Programs differ from software in that the programmer for his own needs usually writes programs whereas software is a commercial product.
The development of software begins with a thorough analysis of the system requirements and the formulation of a sequence of steps (usually in the form of algorithms or flowcharts) to tackle the problem. Given a number of available algorithms, an appropriate one is usually chosen by worst-case analysis.
Systems may be designed either by a top-down understanding of the nature of the problem in its entirety or by a bottom-up approach of developing solutions to sub-problems and leaving the integrated solution to be developed at the last phase. Both approaches involve coding individual modules and use of certain structured programming constructs.
Coding is then performed in an appropriate assembly or high-level language and the code is documented properly. Testing and debugging is used to remove errors in the code. The final error-free software is then installed at the client. Maintenance may have to be performed in future and it is usually the responsibility of the developer.
C is often the preferred tool for writing programs. Developed in the 1970s, as an aid to system programming, C combines features from both high- and low-level programming languages and is thus a middle-level language. C is widely accepted for its performance, portability, versatility, simplicity, and modularity.