The bigger the program, the harder that program is to read, fix, and modify. Just as it's easier to spot a spelling mistake in a recipe printed on a single page compared to trying to find that same spelling mistake buried inside a 350-page cookbook, so is it easier to fix problems in a small program than a big one.
Because small programs can perform only simple tasks, the idea behind programming is to write a lot of little programs and paste them together, like building blocks, creating one massive program. Because each little program is part of a much bigger program, those little programs are subprograms, as shown in Figure 6-1.
The biggest challenge in dividing a large program into multiple subprograms is making each subprogram as independent, or loosely coupled, as possible. That way, if one subprogram fails, it doesn't wreck the entire program along with it, the way yanking a single playing card out of a house of cards brings the whole thing down.
One major advantage of subprograms is that you can isolate common program features in a subprogram that you can copy and reuse in another program. For example, suppose you wrote a word processor. Although you could write it as one massive, interconnected tangle of code, a better approach might be dividing the program functions into separate parts. By doing this, you could create a separate subprogram for
Displaying pull-down menus
Editing text
Spell checking
Printing a file
If you wanted to write a horse race prediction program, you wouldn't have to write the whole thing from scratch. You could copy the subprograms from another program and reuse them in your new project, as shown in Figure 6-2.
By reusing subprograms, you can create more complicated programs faster than before. After programmers create enough useful subprograms, they can store these subprograms in a "library" that they and other programmers can use in the future.
A subprogram essentially yanks out one or more commands from your main program and stores them in another part of your main program or in a separate file, as shown in Figure 6-3.
The reasons for isolating commands in a subprogram (and out of your main program) are to
Keep your main program smaller and thus easier to read and modify.
Isolate related commands in a subprogram that can be reused.
Make programming simpler and faster by just reusing subprograms from other projects.
Every subprogram consists of a unique name and one or more commands. You can name a subprogram anything, although it's usually best to give a subprogram a descriptive name. So if you create a subprogram to convert yards into meters, you might name your subprogram Yards2Meters or MetricConverter.
A descriptive name for a subprogram can help you identify the purpose of that subprogram.
After you define a name for your subprogram, you can fill it up with one or more commands that tell that subprogram what to do. So if you wanted to write a subprogram that prints your name on-screen, your subprogram might look like this:

    SUB PrintMyName
        FOR I = 1 TO 5
            PRINT "John Smith"
        NEXT I
    END SUB
The preceding BASIC language example defines the beginning of a subprogram by using the SUB keyword followed by the subprogram's name, PrintMyName. The end of the subprogram is defined by the END SUB keywords.
Not every language defines subprograms with keywords. In the curly bracket language family, the main program is called main and every subprogram is just given its own name, such as

    print_my_name ()
    {
        int i;
        for (i = 1; i <= 5; i++)
        {
            printf ("John Smith\n");
        }
    }
Instead of using the term subprogram, the curly bracket languages use the term function.
You can store subprograms in the same file as the main program or in a separate file. If you store a subprogram in the same file as the main program, you can place the subprogram at the beginning or end of the file, as shown in Figure 6-4.
BASIC and curly bracket languages, such as C, usually put subprograms at the end of the main program. Other languages, such as Pascal, put subprograms at the beginning of the main program.
After you isolate commands inside a subprogram, your program can't run those commands until it "calls" the subprogram. Calling a subprogram basically tells the computer, "See those commands stored in that subprogram over there? Run those commands now!"
To call a subprogram, you use the subprogram's name as a command. Suppose you had the following subprogram:

    SUB PrintMyName
        FOR I = 1 TO 5
            PRINT "John Smith"
        NEXT I
    END SUB

To run this subprogram, you use its name as a command in any part of your program like this:

    PRINT "The subprogram is going to run now."
    PrintMyName
    END

The preceding BASIC program would print the following:

    The subprogram is going to run now.
    John Smith
    John Smith
    John Smith
    John Smith
    John Smith
Every subprogram needs a unique name so when you call that subprogram to run, the computer knows exactly which subprogram to use. You can call a subprogram from any part of a program, even from within another subprogram.
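To make the idea of calling one subprogram from within another more concrete, here is a minimal Python sketch. The names print_my_name and print_banner are made up for this illustration; they mirror the BASIC example but aren't taken from any particular program.

```python
# Hypothetical names, chosen to mirror the BASIC example in the text.
def print_my_name():
    """Print a name five times."""
    for _ in range(5):
        print("John Smith")


def print_banner():
    """A second subprogram that calls the first one by name."""
    print("The subprogram is going to run now.")
    print_my_name()  # one subprogram calling another


# Main program: none of the commands above run until a subprogram is called.
print_banner()
```

Defining the two subprograms does nothing by itself. Only the final line, which uses a subprogram's name as a command, causes anything to print.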
In the curly bracket language family, calling a subprogram (function) works the same way: use the subprogram's name as a command. So if you had the following subprogram:

    print_my_name ()
    {
        int i;
        for (i = 1; i <= 5; i++)
        {
            printf ("John Smith\n");
        }
    }

You could call that subprogram (function), as follows:

    main ()
    {
        print_my_name ();
    }
If you've stored a subprogram in a separate file, you may need to go through two steps to call a subprogram. First, you may need to specify the filename that contains the subprogram you want to use. Second, you need to call the subprogram you want by name.
In the curly bracket languages, like C, you specify a filename where your subprogram is stored like this:
    #include <filename>
    main()
    {
        subprogram name ();
    }

In this C example, the #include <filename> command tells the computer that if it can't find a subprogram in the main program file, it should look in the file dubbed filename.
The #include command tells the computer to pretend that every subprogram stored in a separate file is actually included in the main program file.
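The same two steps (name the file, then call the subprogram) can be sketched in Python. This example assumes nothing beyond Python's standard math module, which is a separate file of ready-made subprograms that ships with the language. Note that Python's import doesn't literally paste the other file into yours the way #include does; it loads the module and lets you reach its subprograms through the module's name.

```python
# Step 1: name the separate module (file) that holds the subprogram.
import math

# Step 2: call the subprogram by name, prefixed with the module it lives in.
diagonal = math.hypot(3, 4)  # length of the diagonal of a 3-by-4 rectangle
print(diagonal)  # 5.0
```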
Each time you call a subprogram, that subprogram runs its commands. So if you had a subprogram like this:
    SUB PrintJohnSmith
        FOR I = 1 TO 5
            PRINT "John Smith"
        NEXT I
    END SUB
Calling that subprogram from another part of your program would always print the name John Smith exactly five times. If you wanted a subprogram that could print the name Mary Jones 16 times, you'd have to write another similar subprogram, such as
    SUB PrintMaryJones
        FOR I = 1 TO 16
            PRINT "Mary Jones"
        NEXT I
    END SUB

Obviously, writing similar subprograms that do nearly identical tasks is wasteful and time-consuming. So as a better alternative, you can write a generic subprogram that accepts additional data, called parameters.
These parameters let you change the way a subprogram works. So rather than write one subprogram to print the name John Smith 5 times and a second subprogram to print the name Mary Jones 16 times, you could write a single subprogram that accepts 2 parameters that define
The name to print
The number of times to print that name
    SUB PrintMyName (PrintTimes as Integer, Name as String)
        FOR I = 1 TO PrintTimes
            PRINT Name
        NEXT I
    END SUB
The list of parameters, enclosed in parentheses, is a parameter list.
This BASIC language example defines a subprogram named PrintMyName, which accepts two parameters. The first parameter is an integer variable, PrintTimes, which defines how many times to print a name.
The second parameter is a string variable, Name, which defines the name to print multiple times.
Every programming language offers slightly different ways of creating a subprogram. Here's what an equivalent Python subprogram might look like:
    def print_my_name(printtimes, name):
        for i in range(printtimes):
            print(name)
By writing a generic subprogram that accepts parameters, you can create a single subprogram that can behave differently, depending on the parameters it receives.
To give or pass parameters to a subprogram, you need to call the subprogram by name along with the parameters you want to give that subprogram. So if a subprogram accepted two parameters (an integer and a string), you could call that subprogram by doing the following:
    PrintMyName (5, "John Smith")
    PrintMyName (16, "Mary Jones")

The first command tells the PrintMyName subprogram to use the number 5 and the string John Smith as its parameters. Because the number defines how many times to print and the string defines what to print, this first command tells the PrintMyName subprogram to print John Smith five times.
The second command tells the PrintMyName subprogram to use the number 16 and the string Mary Jones as its parameters, which prints Mary Jones 16 times, as shown in Figure 6-5.
Calling a subprogram works the same way in other programming languages. So if you want to call the following Python subprogram:
    def print_my_name(printtimes, name):
        for i in range(printtimes):
            print(name)
You could print the name John Smith four times with this command:
print_my_name(4, "John Smith")
When you call a subprogram, you must give it the exact number and type of parameters it expects to receive. So the PrintMyName subprogram accepts two parameters, where the first parameter must be an integer and the second parameter must be a string, such as

    PrintMyName (4, "Hal Berton")
    PrintMyName (53, "Billie Buttons")
If you don't give a subprogram the right number of parameters, your program doesn't work. So if a subprogram is expecting two parameters, the following calls don't work because they don't give the subprogram exactly two parameters:

    PrintMyName (4)
    PrintMyName (4, 90, "Roberta Clarence")

The first command doesn't work because the PrintMyName subprogram expects two parameters, but this command passes only one parameter.
The second command doesn't work because this command passes three parameters, but the PrintMyName subprogram expects only two parameters.
Another problem occurs when you give the subprogram the right number of parameters, but not the right type of parameters. This subprogram expects to receive an integer and a string, so the following subprogram calls don't work because they give it the wrong data:

    PrintMyName (98, 23)
    PrintMyName ("Victor Harris", 7)

The first command doesn't work because the PrintMyName subprogram expects an integer and a string, but this command tries to give it two numbers.
The second command doesn't work because the PrintMyName subprogram expects an integer first and a string second, but this command gives it the data in the wrong order.
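The following Python sketch shows roughly how these mistakes surface in practice. The helper try_call is a made-up name used only to report what happens; it isn't part of the text's examples. Python reports a wrong number of parameters, or a value it can't use, as a TypeError.

```python
def print_my_name(print_times, name):
    """Print name on its own line, print_times times."""
    for _ in range(print_times):
        print(name)


def try_call(*args):
    """Hypothetical helper: call print_my_name and report whether it worked."""
    try:
        print_my_name(*args)
        return "ok"
    except TypeError as error:
        return "TypeError: " + str(error)


print(try_call(2, "Hal Berton"))      # right number, right types
print(try_call(4))                    # too few parameters
print(try_call(4, 90, "Roberta"))     # too many parameters
print(try_call("Victor Harris", 7))   # right number, wrong order
```

One difference from the BASIC example is worth knowing: Python checks a value's type only when the value is actually used. A call such as print_my_name(98, 23) therefore runs without complaint and prints the number 23 ninety-eight times, whereas a language that declares parameter types up front would reject it.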
If a subprogram doesn't need any parameters, you can just call that subprogram by using its name, such as
PrintMyName
If you aren't passing any parameters in some programming languages, you must leave the parameter list (the stuff between the parentheses) blank, such as
printMyName ();
When a program calls and passes parameters to a subprogram, the computer makes duplicate copies of those parameters. One copy of those parameters stays with the main program and the second copy gets passed to the subprogram.
Now if the subprogram changes those parameters, the values of those parameters stay trapped within the subprogram, as shown in Figure 6-6.
Figure II.6-6. Normally when you pass parameters to a subprogram, the computer makes a second copy for the subprogram to use.
When you pass parameters to a subprogram and make duplicate copies of those parameters, that's called passing by value.
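As a rough illustration, the Python sketch below shows a change to a parameter staying trapped inside the subprogram. The names reset_to_zero, count, and total are invented for this example. Strictly speaking, Python's parameter passing isn't identical to BASIC's pass by value, but for simple values such as numbers the visible effect is the same: assigning a new value to the parameter doesn't touch the caller's variable.

```python
def reset_to_zero(count):
    """Try to change the parameter from inside the subprogram."""
    count = 0  # changes only the subprogram's own copy
    print("Inside the subprogram, count =", count)


total = 5
reset_to_zero(total)
print("Back in the main program, total =", total)  # total is still 5
```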
Most of the time when you call a subprogram and pass it parameters, you don't want that subprogram to change the value of its parameters; you just want the subprogram to modify its behavior based on the parameters it receives, such as the subprogram that prints a name a fixed number of times.
Rather than give a subprogram parameters that modify its behavior, you can also give a subprogram parameters that the subprogram can modify and send back to the rest of the program.
To make a subprogram modify its parameters, you must use a technique called pass by reference. Essentially, instead of letting a subprogram use a copy of data, passing by reference gives the subprogram the actual data to manipulate, as shown in Figure 6-7.
Figure II.6-7. Passing by reference means the subprogram can manipulate data that another part of the program will use.
Suppose you have a subprogram that converts temperatures from Celsius to Fahrenheit with this formula:
Tf = ((9/5)*Tc)+32
Your subprogram could look like this:
    SUB ConvertC2F (ByRef Temperature as Single)
        Temperature = ((9/5) * Temperature) + 32
    END SUB

This is how the preceding subprogram works:
The first line defines the subprogram name (ConvertC2F) and its parameter list as accepting one Single variable called Temperature. To specify that this parameter will be passed by reference, this BASIC language example uses the ByRef keyword.
The second line plugs the value of the Temperature variable into the conversion equation and stores the result back in the Temperature variable, erasing the preceding value that was stored there.
The third line ends the subprogram. At this point, the modified value of the Temperature variable is sent back to the main program to use.
Every programming language uses different ways to identify when a parameter will be passed by reference. The BASIC language uses the ByRef keyword, whereas the C++ language uses the ampersand symbol (&) to identify parameters passed by reference. (Plain C has no reference parameters; there you pass a pointer, such as float *x, instead.) In the following C++ example, the a parameter is passed by value but the x parameter is passed by reference:

    subprogram_example (int a, float &x);
If you had the following BASIC subprogram:

    SUB ConvertC2F (ByRef Temperature as Single)
        Temperature = ((9/5) * Temperature) + 32
    END SUB

You could call that subprogram like this:

    DIM Temp AS SINGLE
    Temp = 12
    PRINT "This is the temperature in Celsius = "; Temp
    ConvertC2F (Temp)
    PRINT "This is the temperature in Fahrenheit = "; Temp
    END

Running this program would produce the following:

    This is the temperature in Celsius = 12
    This is the temperature in Fahrenheit = 53.6
Notice that right before calling the ConvertC2F subprogram, the value of the Temp variable is 12, but the ConvertC2F subprogram changes that value because Temp was passed by reference (inside the subprogram, it goes by the parameter name Temperature). What happens if you run the same program except change the subprogram to accept parameters passed by value instead, such as

    DIM Temp AS SINGLE
    Temp = 12
    PRINT "This is the temperature in Celsius = "; Temp
    ConvertC2F (Temp)
    PRINT "This is the temperature in Fahrenheit = "; Temp
    END

    SUB ConvertC2F (Temperature as Single)
        Temperature = ((9/5) * Temperature) + 32
    END SUB
This program would print the following:
    This is the temperature in Celsius = 12
    This is the temperature in Fahrenheit = 12

Although the ConvertC2F subprogram changed the value of its own copy, the Temperature parameter, it never passes the changed value back to the main program. So the main program blissfully uses the current value of the Temp variable, which is still 12.
Passing data by reference means that the subprogram can change any data used by another part of a program. This can increase the chance of problems because the more ways data can be changed, the harder it can be to track down errors.
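Python has no ByRef keyword, but a similar effect can be sketched by handing a subprogram a list, which the subprogram can then change in place. The names convert_all_c_to_f and readings are assumptions made for this illustration. It shows both the convenience and the risk: after the call, the main program's own data is different.

```python
def convert_all_c_to_f(temperatures):
    """Convert a list of Celsius readings to Fahrenheit in place."""
    for index, celsius in enumerate(temperatures):
        # Writing into the list changes the very data the caller handed over.
        temperatures[index] = ((9 / 5) * celsius) + 32


readings = [0, 100]           # Celsius values owned by the main program
convert_all_c_to_f(readings)
print(readings)               # the main program's list now holds Fahrenheit values
```

Nothing in the calling line warns you that readings will be altered, which is exactly the kind of hidden change that makes errors harder to track down.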
One problem with passing parameters by reference is that you may not always know when a subprogram will change its parameter values. To make it clear when a subprogram returns modified data, you can create a special type of subprogram called a function.
A function is nothing more than a subprogram with the subprogram name representing a value. So a typical function might look like this:
    FUNCTION Name (parameter list) AS DataType
        Commands
        RETURN value
    END FUNCTION

In BASIC, the FUNCTION keyword defines a subprogram as a function. After the parameter list, the first line also defines the data type that the function name can hold, such as an integer, a string, or a single (decimal) number.
Defining a function in the C language looks like this:
    datatype function_name (parameter list)
    {
        commands
        return value;
    }

Inside the function, one or more commands must calculate a new result. Then you use the RETURN keyword to define what value to store in the function name. Whatever value this is, it must be the same data type that you defined for the function name in the first line. So if you defined a function as a String data type, you can't return an integer value from that function.
A typical function in BASIC might look like this:
    FUNCTION ConvertC2F (Temperature AS SINGLE) AS SINGLE
        Temperature = ((9/5) * Temperature) + 32
        RETURN Temperature
    END FUNCTION

The function name ConvertC2F can hold a Single data type.
Unlike a subprogram that may or may not return a modified value, functions always return a value. To call a function, you must assign the function name to a variable or use the function name itself as a variable, such as
PRINT "Temperature in Fahrenheit = "; ConvertC2F (12)
Because functions always return a value, they (almost always) have a parameter list. So you can identify functions in a program by looking for the parameter list in parentheses.
    DIM Temp AS SINGLE
    Temp = 12
    PRINT "Temperature in Celsius = "; Temp
    PRINT "Temperature in Fahrenheit = "; ConvertC2F (Temp)
    END

    FUNCTION ConvertC2F (Temperature AS SINGLE) AS SINGLE
        Temperature = ((9/5) * Temperature) + 32
        RETURN Temperature
    END FUNCTION
Unlike a subprogram that you can call just by typing its name on a line by itself, you can call a function only by using that function name as if it's a variable.
This same function as seen in the Python language might look like this:
    def convertc2f(temperature):
        new = ((9.0/5.0) * temperature) + 32
        return new

To run this function, you could use the following program:

    temp = 12
    print("Temperature in Celsius = ", temp)
    print("Temperature in Fahrenheit = ", convertc2f(temp))
In Chapter 5 of this mini-book, you can read about loops that can repeat one or more commands multiple times. If you want to repeat the commands stored in a subprogram, you can just call that subprogram from within a loop, such as
    FOR I = 1 TO 4
        Subprogram name
    NEXT I

This example would run all the commands in a subprogram four times. However, here's another way to run a subprogram multiple times: recursion. The idea behind recursion is that instead of defining how many times to run a subprogram, you let the subprogram call itself multiple times, such as

    SUB MySubprogram
        MySubprogram
    END SUB

When this subprogram runs, it calls itself, essentially making a second copy of itself, which then makes a third copy of itself, and so on. A common problem used to demonstrate recursion is calculating a factorial (which multiplies a number by a gradually decreasing series of numbers).
Not every programming language supports recursion; some versions of BASIC, for example, don't.
A factorial is often written like this:
4!
To calculate a factorial, you multiply a number (4, in this case) by a number that's one less (3) and keep repeating this until you get the value of 1, such as
4! = 4 * 3 * 2 * 1 = 24
To calculate a factorial, you could use a BASIC program like this:
    FUNCTION Factorial (N as INTEGER) as REAL
        IF N > 1 THEN Factorial = N * Factorial (N - 1) ELSE Factorial = 1
    END FUNCTION
This function uses recursion to run another copy of itself, as shown in Figure 6-8.
Ultimately, every subprogram that calls itself needs to end. (Otherwise, it can get trapped in an endless loop, which hangs or freezes your computer.) When a subprogram finally ends, it returns a value to the preceding subprogram, which returns its value to the preceding subprogram, and so on until a value is finally calculated by the first copy of the subprogram that initially ran.
The advantage of recursion is that it can be much simpler to write. If you didn't use recursion, this is how you could calculate factorials using an ordinary FOR-NEXT loop:

    FUNCTION Factorial (N as INTEGER) as REAL
        DIM Total as REAL
        DIM M as INTEGER
        Total = 1
        FOR M = N TO 1 STEP -1
            Total = Total * M
        NEXT M
        Factorial = Total
    END FUNCTION
Compared to the much smaller and simpler recursive subprogram, this subprogram is harder to understand although it calculates the exact same results.
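For comparison, here is one way the two approaches might look side by side in Python. The function names factorial_recursive and factorial_loop are invented for this sketch; both follow the same logic as the BASIC versions above.

```python
def factorial_recursive(n):
    """Multiply n by the factorial of n - 1 until reaching 1."""
    if n > 1:
        return n * factorial_recursive(n - 1)
    return 1  # stopping condition: this is what ends the chain of calls


def factorial_loop(n):
    """Same result, counting down with an ordinary loop."""
    total = 1
    for m in range(n, 0, -1):  # n, n - 1, ..., 1
        total = total * m
    return total


print(factorial_recursive(4))  # 24
print(factorial_loop(4))       # 24
```

In the recursive version, the call for 4 waits on the call for 3, which waits on the call for 2, and so on; each call hands its answer back up the chain once the call for 1 returns.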
Naturally, recursion has its disadvantages:
Recursion can gobble up lots of memory. It runs the same subprogram multiple times, so it makes additional copies of itself.
Recursion can crash your computer if it doesn't end. Your subprogram can keep making endless copies of itself until it runs out of memory.
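What a runaway recursion actually does depends on the language. As one data point, the standard Python interpreter limits how deeply subprograms can nest and raises a RecursionError once that limit is passed, rather than consuming memory until the machine freezes. The sketch below, with the made-up names my_subprogram and run_safely, shows that safety net in action.

```python
def my_subprogram():
    """A subprogram with no stopping condition: it always calls itself."""
    my_subprogram()


def run_safely():
    """Run the runaway subprogram and report how Python stopped it."""
    try:
        my_subprogram()
    except RecursionError:
        return "stopped: too many nested calls"
    return "finished normally"


print(run_safely())  # stopped: too many nested calls
```

Other languages may offer no such guard, so the safest habit is to give every recursive subprogram a condition under which it stops calling itself.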
If you couldn't isolate commands in a subprogram, you could never have recursion.
The whole idea behind subprograms is to make programming easier by breaking a large problem into progressively smaller problems. As long as you understand that subprograms are one technique for helping you write larger programs, you can use subprograms as building blocks to create anything you want.