Chapter 7. Maintaining BIND

“Well, in our country,” said Alice, still panting a little, “you’d generally get to somewhere else—if you ran very fast for a long time as we’ve been doing.”

“A slow sort of country!” said the Queen. “Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”

This chapter discusses a number of related topics pertaining to nameserver maintenance. We’ll talk about controlling nameservers, modifying zone datafiles, and keeping the root hints file up to date. We’ll list common syslog error messages and explain the statistics BIND keeps.

This chapter doesn’t cover troubleshooting problems. Maintenance involves keeping your data current and watching over your nameservers as they operate. Troubleshooting involves putting out fires—those little DNS emergencies that flare up periodically. Firefighting is covered in Chapter 14.

Controlling the Nameserver

Traditionally, administrators have controlled the BIND nameserver, named, with Unix signals. The nameserver interprets the receipt of certain signals as instructions to take particular actions, such as reloading all the primary zones that have changed. However, there are a limited number of signals available, and signals offer no means of passing along additional information such as the domain name of a particular zone to reload.

In BIND 8.2, the ISC introduced a method of controlling the nameserver by sending messages to it on a special control channel. The control channel can be either a Unix domain socket or a TCP port that the nameserver listens on for messages. Because the control channel isn’t limited to a finite number of discrete signals, it’s more flexible and powerful. The ISC says that the control channel is the way of the future and that administrators should use it, rather than signals, for all nameserver management.

You send messages to a nameserver via the control channel using a program called ndc (in BIND 8) or rndc (in BIND 9). Prior to BIND 8.2, ndc was simply a shell script that allowed you to substitute convenient arguments (such as reload) for signals (such as HUP). We’ll talk about that version of ndc later in this chapter.

ndc and controls (BIND 8)

Executed without arguments, ndc will try to communicate with a nameserver running on the local host by sending messages through a Unix domain socket. The socket is usually called /var/run/ndc, though some operating systems use a different pathname. The socket is normally owned by root and is readable and writable only by the owner. BIND 8.2 and later nameservers create the Unix domain socket when they start up. You can specify an alternate pathname or permissions for the socket using the controls statement. For example, to change the socket’s path to /etc/ndc and group ownership to named, and to make the socket readable and writable by both owner and group, you can use:

controls {
    unix "/etc/ndc" perm 0660 owner 0 group 53;  // group 53 is "named"
};

The permission value must be specified as an octal quantity (with a leading zero to indicate its octalness). If you’re not familiar with this format, see the chmod(1) manual page. The owner and group values must also be numeric.
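If the octal notation is unfamiliar, a short Python sketch can decode it. Python is our choice for illustration only and has nothing to do with BIND; the helper name describe_mode is hypothetical. It turns the mode from the controls statement above into the familiar ls-style string:

```python
import stat

def describe_mode(mode: int) -> str:
    """Return an ls-style permission string for a Unix domain socket."""
    # S_IFSOCK marks the file type as a socket, so the string starts with 's'.
    return stat.filemode(stat.S_IFSOCK | mode)

# "perm 0660" in named.conf is the octal literal 0o660 in Python.
print(describe_mode(0o660))  # srw-rw----
```

The result shows read and write access for the owner and the group, and nothing for everyone else.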

The ISC recommends, and we agree, that you restrict access to the Unix domain socket to administrative personnel authorized to control the nameserver.

You can also use ndc to send messages across a TCP socket to a nameserver, possibly remote from the host that you’re running ndc on. To use this mode of operation, run ndc with the -c command-line option, specifying the name or address of the nameserver, a slash, and the port on which it’s listening for control messages. For example:

# ndc -c 127.0.0.1/953

To configure your nameserver to listen on a particular TCP port for control messages, use the controls statement:

controls {
    inet 127.0.0.1 port 953 allow { localhost; };
};

By default, BIND 8 nameservers don’t listen on any TCP ports. BIND 9 nameservers listen on port 953 by default, so we’re using that port here. We’re configuring the nameserver to listen only on the local loopback address for messages, and to allow only messages from the local host. Even this isn’t especially prudent because anyone with a login on the local host can control the nameserver. If we felt even more imprudent (and we don’t advise this), we could widen the allow-access list and let the nameserver listen on all local network interfaces by specifying:

controls {
    inet * port 953 allow { localnets; };
};

ndc supports two modes of operation, interactive and noninteractive. In noninteractive mode, you specify the command to the nameserver on the command line. For example:

# ndc reload

If you don’t specify a command on the command line, you enter interactive mode:

# ndc
Type   help  -or-   /h   if you need help.
ndc>

/h gives you a list of commands that ndc (not the nameserver) understands. These apply to ndc’s operation, not the nameserver’s:

ndc> /h
        /h(elp)                 this text
        /e(xit)                 leave this program
        /t(race)                toggle tracing (protocol and system events)
        /d(ebug)                toggle debugging (internal program events)
        /q(uiet)                toggle quietude (prompts and results)
        /s(ilent)               toggle silence (suppresses nonfatal errors)
ndc>

For example, the /d command induces ndc to produce debugging output (e.g., what it’s sending to the nameserver and what it’s getting in response). It has no effect on the nameserver’s debugging level. For that, see the debug command, described later.

Note that /e, not /x or /q, exits ndc. That’s a little counterintuitive.

help tells you the commands at your disposal. These control the nameserver:

ndc> help
getpid
status
stop
exec
reload [zone] ...
reconfig [-noexpired] (just sees new/gone zones)
dumpdb
stats
trace [level]
notrace
querylog
qrylog
help
quit
ndc>

There are two commands that aren’t listed here, though you can still use them: start and restart. They’re not listed because ndc is telling you what commands the nameserver—as opposed to ndc—understands. The nameserver can’t perform a start command because to do so it would need to be running (and if it’s running, it doesn’t need to be started). It can’t perform a restart command either, because if it exited, it would have no way to start a new instance of itself (it wouldn’t be around to do it). None of this prevents ndc from doing a start or restart, though.

Here’s what those commands do:

getpid

Prints the nameserver’s current process ID.

status

Prints lots of useful status information about the nameserver, including its version, its debug level, the number of zone transfers running, and whether query logging is on.

start

Starts the nameserver. If you need to start named with any command-line arguments, you can specify these after start. For example, start -c /usr/local/etc/named.conf.

stop

Causes the nameserver to exit, writing dynamic zones to their zone datafiles.

restart

Stops and then starts the nameserver. As with start, you can specify command-line arguments for named after the command.

exec

Stops and then starts the nameserver. Unlike restart, however, you can’t specify command-line options for named; the nameserver just starts a new copy of itself with the same command-line arguments.

reload

Reloads the nameserver. Send this command to a primary nameserver after modifying its configuration file or one or more of its zone datafiles. You can also specify one or more domain names of zones as arguments to reload; if you do, the nameserver will reload only these zones.

reconfig [-noexpired]

Tells the nameserver to check its configuration file for new or deleted zones. Send this command to a nameserver if you’ve added or deleted zones but haven’t changed any existing zones’ data. Specifying the -noexpired flag tells the nameserver not to bother you with error messages about zones that have expired. This can come in handy if your nameserver is authoritative for thousands of zones and you want to avoid seeing a flurry of expiration messages you already know about.

dumpdb

Dumps a copy of the nameserver’s internal database to named_dump.db in the nameserver’s current directory.

stats

Appends the nameserver’s statistics to named.stats in the nameserver’s current directory.

trace [level]

Appends debugging information to named.run in the nameserver’s current directory. Specifying higher debug levels increases the amount of detail in the debugging information. For information on what is logged at each level, see Chapter 13.

notrace

Turns off debugging.

querylog (or qrylog)

Toggles logging all queries with syslog. Logging takes place at priority LOG_INFO. named must be compiled with QRYLOG defined (it’s defined by default).

quit

Ends the control session.

rndc and controls (BIND 9)

BIND 9, like BIND 8, uses the controls statement to determine how the nameserver listens for control messages. The syntax is the same, except that only the inet substatement is allowed. (BIND 9.3.2 doesn’t support Unix domain sockets for the control channel yet, and the ISC suggests BIND 9 probably never will.)

With BIND 9, you can leave out the port specification, and the nameserver will default to listening on port 953. You must also add a keys specification:

controls {
       inet * allow { any; } keys { "rndc-key"; };
};

This determines which cryptographic key rndc users must authenticate themselves with to send control messages to the nameserver. If you leave the keys specification out, you’ll see this message after the nameserver starts:

Jan 13 18:22:03 terminator named[13964]: type 'inet' control channel
has no 'keys' clause; control channel will be disabled

The key or keys specified in the keys substatement must be defined in a key statement:

key "rndc-key" {
        algorithm hmac-md5;
        secret "Zm9vCg==";
};

The key statement can go directly in named.conf, but if your named.conf file is world-readable, it’s safer to put it in a different file that’s not world-readable and include that file in named.conf:

include "/etc/rndc.key";

The only algorithm currently supported is HMAC-MD5, a technique for using the fast MD5 secure hash algorithm to do authentication. The secret is simply the base-64 encoding of a password that named and authorized rndc users will share. You can generate the secret using a program such as mmencode, or with dnssec-keygen from the BIND distribution, as described in Chapter 11.

For example, you can use mmencode to generate the base-64 encoding of foobarbaz:

% printf foobarbaz | mmencode
Zm9vYmFyYmF6
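If mmencode isn’t installed, any base-64 encoder will do. Here is a minimal sketch using Python’s standard library. The function names are ours, and the choice of 16 random bytes is an assumption for illustration, not a BIND requirement:

```python
import base64
import secrets

def encode_secret(password: bytes) -> str:
    """Base-64 encode a shared password for use in a key statement."""
    return base64.b64encode(password).decode("ascii")

def random_secret(num_bytes: int = 16) -> str:
    """Generate a random secret instead of encoding a guessable word."""
    return encode_secret(secrets.token_bytes(num_bytes))

print(encode_secret(b"foobarbaz"))   # Zm9vYmFyYmF6
print(base64.b64decode("Zm9vCg=="))  # b'foo\n'
```

The second line of output decodes the secret used in the key statements in this section: it is just foo followed by a newline. A trailing newline is part of the password as far as the encoder is concerned, so be careful about how you feed the text in.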

To use rndc, you need to create an rndc.conf file to tell rndc which authentication keys to use and which nameservers to use them with. rndc.conf usually lives in /etc. Here’s a simple rndc.conf file:

options {
        default-server localhost;
        default-key "rndc-key";
};

key "rndc-key" {
        algorithm hmac-md5;
        secret "Zm9vCg==";
};

The syntax of the file is very similar to the syntax of named.conf. In the options statement, you define the default nameserver to send control messages to (which you can override on the command line) and the name of the default key to present to remote nameservers (which you can also override on the command line).

The syntax of the key statement is the same as that used in named.conf, described earlier. The name of the key in rndc.conf, as well as the secret, must match the key definition in named.conf.

Tip

Remember that since you’re storing keys (which are essentially passwords) in rndc.conf and named.conf, you should make sure that neither file is readable by users who aren’t authorized to control the nameserver.
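One way to keep an eye on this is a small check that you run by hand or from cron. The following Python sketch is only an illustration: the helper names are hypothetical, and the paths are the ones used in this section, which may differ on your system.

```python
import os
import stat

def world_readable(path: str) -> bool:
    """Return True if users other than the owner and group can read path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)

def check_key_files(paths):
    """Return the existing files in paths that anyone can read."""
    return [p for p in paths if os.path.exists(p) and world_readable(p)]

if __name__ == "__main__":
    # Files that hold key statements in this chapter's examples.
    for path in check_key_files(["/etc/rndc.conf", "/etc/rndc.key"]):
        print(f"warning: {path} is world-readable")
```

This looks only at the "other" read bit. Whether group access is acceptable depends on who belongs to the group.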

If your version of BIND comes with rndc-confgen, you can let the tool do most of the work for you. Simply run:

# rndc-confgen > /etc/rndc.conf

Here is what you’ll see in /etc/rndc.conf:

# Start of rndc.conf
key "rndc-key" {
    algorithm hmac-md5;
    secret "4XErjUEy/qgnDuBvHohPtQ==";
};

options {
    default-key "rndc-key";
    default-server 127.0.0.1;
    default-port 953;
};
# End of rndc.conf

# Use with the following in named.conf,
# adjusting the allow list as needed:
#
# key "rndc-key" {
#     algorithm hmac-md5;
#     secret "4XErjUEy/qgnDuBvHohPtQ==";
# };
#
# controls {
#     inet 127.0.0.1 port 953
#         allow { 127.0.0.1; } keys { "rndc-key"; };
# };
# End of named.conf

As indicated by the comment, the second half of this file belongs in /etc/named.conf. Move those lines to /etc/named.conf and remove the comment character at the beginning of each line. As mentioned earlier, you may want to keep the key in a file outside of /etc/named.conf for security reasons. Also, notice that the controls statement allows access only from 127.0.0.1. You may need to adjust this list.

Using rndc to control multiple servers

If you’re using rndc to control only a single nameserver, its configuration is straightforward. You define an authentication key using identical key statements in named.conf and rndc.conf. Then you define your nameserver as the default server to control with the default-server substatement in the rndc.conf options statement, and define the key as the default key using the default-key substatement. Then run rndc as:

# rndc reload

If you have multiple nameservers to control, you can associate each with a different key. Define the keys in separate key statements, and then associate each key with a different server in a server statement:

server localhost {
    key "rndc-key";
};

server wormhole.movie.edu {
    key "wormhole-key";
};

Then run rndc with the -s option to specify the server to control:

# rndc -s wormhole.movie.edu reload

If you haven’t associated a key with a particular nameserver, you can still specify which key to use on the command line with the -y option:

# rndc -s wormhole.movie.edu -y rndc-wormhole reload

Finally, if your nameserver is listening on a nonstandard port for control messages (i.e., a port other than 953), you must use the -p option to tell rndc which port to connect to:

# rndc -s toystory.movie.edu -p 54 reload
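If you drive several nameservers from scripts, it can help to build the rndc command line in one place. The Python sketch below uses only the -s, -y, and -p options described above; the function names rndc_command and run_rndc are hypothetical, and it assumes rndc is on your PATH.

```python
import subprocess

def rndc_command(command, server=None, key=None, port=None):
    """Build the argument list for an rndc invocation."""
    argv = ["rndc"]
    if server is not None:
        argv += ["-s", server]     # nameserver to control
    if key is not None:
        argv += ["-y", key]        # name of the key to authenticate with
    if port is not None:
        argv += ["-p", str(port)]  # control port, if not 953
    # The command may carry arguments, e.g. "reload movie.edu".
    return argv + command.split()

def run_rndc(command, **options):
    """Run rndc and raise CalledProcessError if it reports failure."""
    return subprocess.run(rndc_command(command, **options), check=True)

print(rndc_command("reload", server="toystory.movie.edu", port=54))
```

Passing a list of arguments to subprocess.run, rather than a single string, avoids handing the command to a shell.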

New rndc commands

In BIND 9.0.0, rndc supported only the reload command. BIND 9.3.2 supports most of the ndc commands, plus many new ones. Here’s a list and brief descriptions of each:

reload

Same as the ndc command.

refresh zone

Schedules an immediate refresh for the specified zone (i.e., an SOA query to the zone’s master).

retransfer zone

Immediately retransfers the specified zone without checking the serial number.

freeze zone

Suspends dynamic updates to the specified zone. Covered in Chapter 10.

thaw zone

Resumes dynamic updates to the specified zone. Covered in Chapter 10.

reconfig

Same as the ndc command.

stats

Same as the ndc command.

querylog

Same as the ndc command.

dumpdb

Same as the ndc command. Also allows you to specify whether to dump just cache with the -cache option, authoritative zones with the -zones option, or both with the -all option.

stop

Same as the ndc command.

halt

Same as stop, but doesn’t save pending dynamic updates.

trace

Same as the ndc command.

notrace

Same as the ndc command.

flush

Flushes (empties) the nameserver’s cache.

flushname name

Flushes all records attached to the specified domain name from the nameserver’s cache.

status

Same as the ndc command.

recursing

Dumps information about the recursive queries currently being processed to the file named.recursing in the current working directory.

Using Signals

Now, back in the old days, all we had to control the nameserver with were signals. If you’re stuck in the past (with a version of BIND older than 8.2), you need to use signals to manage your nameserver. The following table is a list of the signals you can send to a nameserver; it includes which ndc command each is equivalent to. If you have the shell script version of ndc (from BIND 4.9 to 8.1.2), you don’t have to pay attention to the signal names because ndc will translate the commands into the appropriate signals. With BIND 9, you must use rndc for all activities (except reloading and stopping the server) because the signal mechanism for other features is no longer supported.

Signal  BIND 8 action               ndc equivalent  BIND 9 action       rndc equivalent
HUP     Reloads the server          ndc reload      Reloads the server  rndc reload
INT     Dumps the database          ndc dumpdb      Stops the server    rndc dumpdb
ILL     Dumps the statistics        ndc stats       Not supported       rndc stats
USR1    Increments the trace level  ndc trace       Not supported       rndc trace
USR2    Turns off tracing           ndc notrace     Not supported       rndc notrace
WINCH   Toggles query logging       ndc querylog    Not supported       rndc querylog
TERM    Stops the server            ndc stop        Stops the server    rndc stop

So, to toggle query logging with an older version of ndc, you can use:

# ndc querylog

just as you would with the newer version of ndc. Under the hood, though, this ndc is tracking down named’s PID and sending it the WINCH signal.

If you don’t have ndc, you’ll have to do what ndc does by hand: find named’s process ID and send it the appropriate signal. The BIND nameserver leaves its process ID in a disk file called the PID file, making it easier to chase the critter down; you don’t have to use ps. The most common path for the PID file is /var/run/named.pid. On some systems, the PID file is /etc/named.pid. Check the named manual page to see which directory named.pid is in on your system. Since the nameserver’s process ID is the only thing in the PID file, sending a HUP signal can be as simple as:

# kill -HUP `cat /var/run/named.pid`
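The same thing can be done from a script. This Python sketch is an illustration only: read_pid and send_hup are hypothetical names, and the default path is the common PID file location mentioned above, which may not match your system.

```python
import os
import signal

def read_pid(pid_file="/var/run/named.pid"):
    """Return the process ID stored in named's PID file."""
    with open(pid_file) as f:
        return int(f.read().strip())

def send_hup(pid_file="/var/run/named.pid"):
    """Send HUP to named, asking it to reload (needs suitable privileges)."""
    os.kill(read_pid(pid_file), signal.SIGHUP)
```

To send a different signal from the table, substitute the matching constant, such as signal.SIGTERM.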

If you can’t find the PID file, you can always find the process ID with ps. On a BSD-based system, use:

% ps -ax | grep named

On a SYS V-based system, use:

% ps -ef | grep named

However, you may find more than one named process running if you use ps on some platforms. For example, multithreaded builds of named running on Linux show up as multiple processes. If the ps output shows multiple nameservers, you can use the pstree program to determine which is the parent. This may seem like stating the obvious, but you should send signals only to the parent nameserver process.

Updating Zone Datafiles

Something is always changing on your network—new workstations arrive, you finally retire or sell the relic, or you move a host to a different network. Each change means that zone datafiles must be modified. Should you make the changes manually? Or should you wimp out and use a tool to help you?

First, we’ll discuss how to make the changes manually. Then, we’ll talk about a tool to help out: h2n. Actually, we recommend that you use a tool to create the zone datafiles; we were kidding about that wimp stuff, okay? Or at least use a tool to increment the serial number for you. The syntax of zone datafiles lends itself to making mistakes. It doesn’t help that the address and pointer records are in different files, which must agree with each other. However, even when you use a tool, it is critical to know what goes on when the files are updated, so we’ll start with the manual method.

Adding and Deleting Hosts

After creating your zone datafiles initially, it should be fairly apparent what you need to change when you add a new host. We’ll go through the steps here in case you weren’t the one to set up those files or if you’d just like a checklist to follow. Make these changes to your primary nameserver’s zone datafiles. If you make the changes to your slave nameserver’s backup zone datafiles, the slave’s data will change, but the next zone transfer will overwrite it.

  • Update the serial number in db.DOMAIN. The serial number is likely to be at the top of the file, so it’s easy to do first and reduces the chance that you’ll forget.

  • Add any A (address), CNAME (alias), and MX (mail exchanger) records for the host to the db.DOMAIN file. We added the following resource records to the db.movie.edu file when a new host (cujo) was added to our network:

    cujo  IN  A  192.253.253.5  ; cujo's internet address
          IN MX  10 cujo        ; if possible, mail directly to cujo
          IN MX  20 toystory    ; otherwise, deliver to our mail hub

  • Update the serial number and add PTR records to each db.ADDR file for which the host has an address. cujo only has one address, on network 192.253.253/24; therefore, we added the following PTR record to the db.192.253.253 file:

    5  IN PTR cujo.movie.edu.

  • Reload the primary nameserver; this forces it to load the new information:

    # rndc reload

  • If you have a snazzy BIND 9.1 or newer nameserver, you can reload just the zone you changed:

    # rndc reload movie.edu

The primary nameserver will load the new zone data. Slave nameservers will load this new data sometime within the time interval defined in the SOA record for refreshing their data. With version 8 or 9 masters and slaves, the slaves pick up the new data quickly because the primary notifies the slaves of changes within 15 minutes of the change.

To delete a host, remove the resource records from db.DOMAIN and from each db.ADDR file pertaining to that host. Increment the serial number in each zone datafile you changed and reload your primary nameserver.
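Because the address record and the pointer record live in different files, it is easy to add one and forget the other. The following Python sketch produces both from a single name and address. It is an illustration, not part of any BIND tool: host_records is a hypothetical name, and it writes the PTR owner as a fully qualified name instead of the relative 5 used in db.192.253.253.

```python
import ipaddress

def host_records(name, address, zone):
    """Return the matching A and PTR records for a new host."""
    ip = ipaddress.IPv4Address(address)
    a_record = f"{name}  IN  A  {ip}"
    # reverse_pointer yields, e.g., "5.253.253.192.in-addr.arpa".
    ptr_record = f"{ip.reverse_pointer}.  IN  PTR  {name}.{zone}."
    return a_record, ptr_record

for record in host_records("cujo", "192.253.253.5", "movie.edu"):
    print(record)
```

Generating both records from one input keeps the forward-mapping and reverse-mapping data in agreement.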

SOA Serial Numbers

Each zone datafile has a serial number. Every time you change the data in a zone datafile, you must increment the serial number. If you don’t increment the serial number, slave nameservers for the zone won’t pick up the updated data.

Incrementing the serial number is simple. If the original zone datafile had this SOA record:

movie.edu. IN SOA toystory.movie.edu. al.movie.edu. (
                                100     ; Serial
                                3h      ; Refresh
                                1h      ; Retry
                                1w      ; Expire
                                1h )    ; Negative caching TTL

the updated zone datafile would have this SOA record:

movie.edu. IN SOA toystory.movie.edu. al.movie.edu. (
                                101     ; Serial
                                3h      ; Refresh
                                1h      ; Retry
                                1w      ; Expire
                                1h )    ; Negative caching TTL

This simple change is the key to distributing the zone data to all your slaves. Failing to increment the serial number is the most common mistake made when updating a zone. The first few times you make a change to a zone datafile, you’ll remember to update the serial number because the process is new, and you’re paying close attention. After modifying the zone datafile becomes second nature, you’ll make some “quickie” little change, forget to update the serial number . . . and none of the slaves will pick up the new zone data. That’s why you should use a tool that updates the serial number for you! It could be h2n or something you write yourself, but it’s a good idea to use a tool.

There are several good ways to manage serial numbers. The most obvious is just to use a counter: increment the serial number by one each time you modify the file. Another method is to derive the serial number from the date. For example, you can use the eight-digit number formed by YYYYMMDD. Suppose today is January 15, 2005. In this form, your serial number would be 20050115. This scheme allows only one update per day, though, and that may not be enough. Add another two digits to this number to indicate how many times the file has been updated that day. The first number for January 15, 2005 is then 2005011500. The next modification that day changes the serial number to 2005011501. This scheme allows 100 updates per day. It also lets you know when you last incremented the serial number in the zone datafile. h2n generates the serial number from the date if you use the -y option. Whatever scheme you choose, the serial number must fit in a 32-bit, unsigned integer.
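The date-based scheme is easy to automate. Here is a minimal Python sketch of the YYYYMMDDnn numbering just described. The function name next_serial is ours, and this is not how h2n implements its -y option.

```python
from datetime import date

def next_serial(current: int, today: date) -> int:
    """Return the next YYYYMMDDnn serial number after current."""
    first_today = int(today.strftime("%Y%m%d")) * 100
    if current < first_today:
        # First change today: start the two-digit counter at 00.
        return first_today
    if current >= first_today + 99:
        # All 100 serial numbers for today are used up.
        raise ValueError("no YYYYMMDDnn serial number left for today")
    return current + 1

print(next_serial(100, date(2005, 1, 15)))         # 2005011500
print(next_serial(2005011500, date(2005, 1, 15)))  # 2005011501
```

The function refuses to go past the hundredth update of the day instead of silently spilling into the next day’s range.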

Starting Over with a New Serial Number

What do you do if the serial number on one of your zones accidentally becomes very large and you want to change it back to a more reasonable value? There is a way that works with all versions of BIND, a way that works with version 4.8.1 and later, and another that works with 4.9 and later.

The way that works with all versions is to purge your slaves of any knowledge of the old serial number. Then you can start numbering from one (or any convenient point). Here’s how. First, change the serial number on your primary server and restart it; now the primary server has the new serial number. Log in to one of your slave nameserver hosts and stop the named process with the command rndc stop. Remove its backup zone datafiles (e.g., rm bak.movie.edu bak.192.249.249 bak.192.253.253) and start up your slave nameserver. Since the backup copies were removed, the slave must load a new version of the zone datafiles, picking up the new serial numbers. Repeat this process for each slave server. If any of your slave nameservers aren’t under your control, you’ll have to contact their administrators to get them to do the same.

If all your slaves run a version of BIND newer than 4.8.1 (and we pray you’re not using 4.8.1) but older than BIND 8.2, you can take advantage of the special serial number 0. If you set a zone’s serial number to 0, each slave will transfer the zone the next time it checks. In fact, the zone will be transferred every time the slave checks, so don’t forget to increment the serial number once all the slaves have synchronized on serial number 0. But there is a limit to how far you can increment the serial number. Read on.

The other method of fixing the serial number (with 4.9 and later slaves) is easier to understand if we first cover some background material. The DNS serial number is a 32-bit unsigned integer whose value ranges from 0 to 4,294,967,295. The serial number uses sequence space arithmetic, which means that for any serial number, half the numbers in the number space (2,147,483,647 numbers) are less than the serial number, and half the numbers are larger.

Let’s go over an example of sequence space numbers. Suppose the serial number is 5. Serial numbers 6 through (5 + 2,147,483,647) are larger than serial number 5, and serial numbers (5 + 2,147,483,649) through 4 are smaller. Notice that the serial number wrapped around to 4 after reaching 4,294,967,295. Also notice that we didn’t include the number (5 + 2,147,483,648), because this is exactly halfway around the number space and can be larger or smaller than 5, depending on the implementation. To be safe, don’t use it.

Now back to the original problem. If your zone serial number is 25,000, and you want to start numbering at 1 again, you can speed through the serial number space in two steps. First, add the largest increment possible to your serial number (25,000 + 2,147,483,647 = 2,147,508,647). If the number you come up with is larger than 4,294,967,295 (the largest 32-bit value), you’ll have to wrap around to the beginning of the number space by subtracting 4,294,967,296 from it. After changing the serial number, you must wait for all your slaves to pick up a new copy of the zone. Second, change the zone serial number to its target value (1), which is now larger than the current serial number (2,147,508,647). After the slaves pick up a new copy of the zone, you’re done!
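The arithmetic is easier to trust once you can check it. The Python sketch below implements the sequence space comparison and the two-step reset described above. The function names are ours; the comparison follows the usual definition of serial number arithmetic and leaves the exact halfway point undefined, as the text recommends.

```python
HALF = 2**31     # 2,147,483,648: half of the serial number space
MODULUS = 2**32  # serial numbers wrap around after 4,294,967,295

def serial_gt(s1: int, s2: int) -> bool:
    """Return True if s1 is larger than s2 in sequence space."""
    return (s1 < s2 and s2 - s1 > HALF) or (s1 > s2 and s1 - s2 < HALF)

def reset_steps(current: int, target: int):
    """Return the serial numbers to publish, in order, to reach target."""
    # Step 1: add the largest safe increment, wrapping if necessary.
    intermediate = (current + HALF - 1) % MODULUS
    # Step 2 works only if slaves will see target as larger still.
    if not serial_gt(target, intermediate):
        raise ValueError("target is not reachable in two steps")
    return [intermediate, target]

print(reset_steps(25000, 1))  # [2147508647, 1]
```

Remember that each step takes effect only after every slave has transferred the zone with that serial number.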

Additional Zone Datafile Entries

After you’ve been running a nameserver for a while, you may want to add data to your nameserver to help you manage your zone. Have you ever been stumped when someone asked you where one of your hosts is? Maybe you don’t even remember what kind of host it is. Administrators have to manage larger and larger populations of hosts these days, making it easy to lose track of this information. The nameserver can help you out. And if one of your hosts is acting up, and someone notices remotely, the nameserver can help that person get in touch with you.

So far in the book, we’ve covered SOA, NS, A, CNAME, PTR, and MX records. These records are critical to everyday operation: nameservers need them to operate, and applications look up data of these types. DNS defines many more record types, though. The next most useful resource record types are TXT and RP; these can tell you a host’s location and responsible person. For a list of common (and not-so-common) resource records, see Appendix A.

General text information

TXT stands for TeXT. These records are simply a list of strings, each less than 256 characters in length.

TXT records can be used for anything you want; one use lists a host’s location:

cujo  IN  TXT  "Location: machine room dog house"

BIND TXT records have a 2 KB limit. You can specify the TXT record as a single string or as multiple strings:

cujo  IN  TXT  "Location:" "machine room dog house"
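If you generate TXT records from longer text, you need to break it into strings that respect the per-string length limit. The following Python sketch shows one way to do that. The function name txt_rdata is hypothetical, and it assumes ASCII text and a limit of 255 characters per string.

```python
def txt_rdata(text: str, limit: int = 255) -> str:
    """Format text as one or more quoted strings for a TXT record."""
    data = text.encode("ascii")
    # Split first, then escape, so the limit applies to the real data.
    chunks = [data[i:i + limit] for i in range(0, len(data), limit)] or [b""]
    quoted = []
    for chunk in chunks:
        s = chunk.decode("ascii").replace("\\", "\\\\").replace('"', '\\"')
        quoted.append(f'"{s}"')
    return " ".join(quoted)

print("cujo  IN  TXT  " + txt_rdata("Location: machine room dog house"))
```

Short text comes back as a single quoted string, and anything longer is split into several strings on one record.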

Responsible Person

Domain administrators will undoubtedly develop a love/hate relationship with the Responsible Person, or RP, record. The RP record can be attached to any domain name, internal or leaf, and indicates who is responsible for that host or zone. This enables you to locate the miscreant responsible for the host peppering you with DNS queries, for example. But it also leads people to you when one of your hosts acts up.

The record takes two arguments as its record-specific data: an electronic mail address in domain name format and a domain name pointing to additional data about the contact. The electronic mail address is in the same format the SOA record uses: it substitutes a “.” for the “@”. The next argument is a domain name, which must have a TXT record associated with it. The TXT record then contains free-format information about the contact, such as full name and phone number. If you omit either field, you must specify the root domain (“.”) as a placeholder instead.

Here are some example RP (and associated) records:

shrek        IN  RP   root.movie.edu.  hotline.movie.edu.
             IN  RP   snewman.movie.edu.  sn.movie.edu.
hotline      IN  TXT  "Movie U. Network Hotline, (415) 555-4111"
sn           IN  TXT  "Sommer Newman, (415) 555-9612"

Note that TXT records for root.movie.edu and snewman.movie.edu aren’t necessary because they’re only the domain name encoding of electronic mail addresses, not real domain names.
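Converting a mail address into domain name format is mechanical, with one catch: a dot in the user part of the address must be escaped with a backslash so that it isn’t read as a label separator. This Python sketch illustrates the conversion and the root-domain placeholder. The function names are hypothetical.

```python
def mailbox_to_domain(address: str) -> str:
    """Convert user@domain into the domain name format used by RP and SOA."""
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not a mail address: {address!r}")
    # Escape backslashes and dots in the user part of the address.
    local = local.replace("\\", "\\\\").replace(".", "\\.")
    return f"{local}.{domain.rstrip('.')}."

def rp_record(owner: str, mailbox: str = "", txt_domain: str = "") -> str:
    """Build an RP record, using "." for any field that is omitted."""
    mbox = mailbox_to_domain(mailbox) if mailbox else "."
    return f"{owner}  IN  RP  {mbox}  {txt_domain or '.'}"

print(rp_record("shrek", "root@movie.edu", "hotline.movie.edu."))
```

An address such as sommer.newman@movie.edu would come out as sommer\.newman.movie.edu. in the record.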

Generating Zone Datafiles from the Host Table

In Chapter 4, we defined a process for converting host-table information into zone data. We’ve written a tool in Perl, called h2n, to automate this process. Using a tool to generate your data has one big advantage: there will be no syntax errors or inconsistencies in your zone datafiles (assuming h2n is written correctly!). One common inconsistency is to have an A (address) record for a host but no corresponding PTR (pointer) record, or the other way around. Because this data is in separate zone datafiles, it is easy to err.

What does h2n do? Given the /etc/hosts file and some command-line options, h2n creates the datafiles for your zones. As a system administrator, you keep the host table current. Each time you modify the host table, you run h2n again. h2n rebuilds each zone datafile from scratch, assigning each new file the next higher serial number. It can be run manually or from cron each night. If you use h2n, you’ll never again have to worry about forgetting to increment the serial number.

First, h2n needs to know the domain name of your forward-mapping zone and your network numbers. (h2n can figure out the names of your reverse-mapping zones from your network numbers.) These map conveniently into the zone datafile names: movie.edu zone data goes in db.movie, and network 192.249.249/24 data goes into db.192.249.249. The domain name of your forward-mapping zone and your network number are specified with the -d and -n options, as follows:

-d domain name

The domain name of your forward-mapping zone.

-n network number

The network number of your network. If you are generating files for several networks, use several -n options on the command line. Omit trailing zeros and netmask specifications from the network numbers.

The h2n command requires the -d flag and at least one -n option; they have no default values. For example, to create the datafile for the zone movie.edu, which consists of two networks, give the command:

% h2n -d movie.edu -n 192.249.249 -n 192.253.253
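The mapping from the -d and -n arguments to zone names and filenames is mechanical. Here is a rough Python sketch of that mapping as this chapter describes it; it is not h2n’s actual code, and the function names are our own.

```python
def forward_zone_file(domain):
    """db.<first label of the zone's domain name>, e.g. db.movie."""
    return "db." + domain.split(".")[0]


def reverse_zone(network):
    """Reverse-mapping zone for a network number such as 192.249.249."""
    labels = network.split(".")
    return ".".join(reversed(labels)) + ".in-addr.arpa"


def reverse_zone_file(network):
    """db.<network number>, e.g. db.192.249.249."""
    return "db." + network


if __name__ == "__main__":
    for net in ("192.249.249", "192.253.253"):
        print(reverse_zone(net), reverse_zone_file(net))
```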

For greater control over the data, you can use other options:

-s server

The nameservers for the NS records. As with -n, use several -s options if you have multiple primary or slave nameservers. A version 8 or 9 server will NOTIFY this list of servers when a zone changes. The default is the host that runs h2n.

-h host

The host for the MNAME field of the SOA record. host must be the primary nameserver to ensure proper operation of the NOTIFY feature. The default is the host that runs h2n.

-u user

The mail address of the person in charge of the zone data. This defaults to root on the host that runs h2n.

-o other

Other SOA values, not including the serial number, as a colon-separated list. These default to 10800:3600:604800:86400.

-f file

Reads the h2n options from the named file rather than from the command line. If you have lots of options, keep them in a file.

-v 4|8

Generates configuration files for BIND 4 or 8; version 8 is the default. Since BIND 9’s configuration file format is basically the same as BIND 8’s, you can use -v 8 for a BIND 9 nameserver.

-y

Generates the serial number from the date.
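A date-based serial number is usually written as YYYYMMDDNN, where NN counts the changes made on that day. We haven’t described the exact format h2n uses, so treat the following Python sketch as an illustration of the common convention rather than of h2n itself; next_serial is a hypothetical helper.

```python
import datetime


def next_serial(current, today):
    """Return a YYYYMMDDNN serial that is always larger than `current`."""
    base = int(today.strftime("%Y%m%d")) * 100
    # First change of the day starts at NN = 00; later changes, or a
    # serial that is already ahead of today's date, just count up by one.
    return base if current < base else current + 1
```

Whatever scheme you choose, the important property is the one the sketch preserves: the new serial number must be higher than the old one, or slaves won’t pick up the change.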

Here is an example that uses all the options mentioned so far:

% h2n -f opts

Here are the contents of file opts:

-d movie.edu
-n 192.249.249
-n 192.253.253
-s toystory.movie.edu
-s wormhole
-u al
-h toystory
-o 10800:3600:604800:86400
-v 8
-y

If an option requires a hostname, you can provide either a full domain name (e.g., toystory.movie.edu) or just the host’s name (e.g., toystory). If you give the hostname only, h2n forms a complete domain name by adding the domain name given with the -d option. (If a trailing dot is necessary, h2n adds it too.)
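The completion rule can be sketched in a few lines of Python. We’re assuming here that any name containing a dot counts as a full domain name; that heuristic, and the function name qualify, are ours and may not match what h2n does internally.

```python
def qualify(host, domain):
    """Return `host` as a fully qualified name with a trailing dot."""
    name = host.rstrip(".")
    if "." not in name:
        # Bare hostname: append the zone's domain name (the -d argument).
        name = f"{name}.{domain.rstrip('.')}"
    return name + "."


if __name__ == "__main__":
    for host in ("toystory.movie.edu", "wormhole"):
        print(qualify(host, "movie.edu"))
```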

There are more options to h2n than we’ve shown here. For the complete list of options, you’ll have to look at the manual page.

Of course, some kinds of resource records aren’t easy to generate from /etc/hosts; the necessary data simply isn’t there. You may need to add these records manually. But since h2n always rewrites zone datafiles, won’t your changes be overwritten?

Well, h2n provides a “back door” for inserting this kind of data. Put these special records in a file named spcl.DOMAIN, where DOMAIN is the first label of the domain name of your zone. When h2n finds this file, it “includes” it by adding the line:

$INCLUDE spcl.DOMAIN

to the end of the db.DOMAIN file. (The $INCLUDE control statement is described later in this chapter.) For example, the administrator of movie.edu may add extra MX records into the file spcl.movie so that users can mail to movie.edu directly instead of sending mail to hosts within movie.edu. Upon finding this file, h2n puts the line:

$INCLUDE spcl.movie

at the end of the zone datafile db.movie.

Keeping the Root Hints Current

As we explained in Chapter 4, the root hints file tells your nameserver where the servers for the root zone are. It must be updated periodically. The root nameservers don’t change very often, but they do change. A good practice is to check your root hints file every month or two. In Chapter 4, we told you to get the file by FTP’ing to ftp.rs.internic.net. And that’s probably the best way to keep current.

If you have a copy of dig, a query tool included in the BIND distribution (and covered in Chapter 12), you can retrieve the current list of root nameservers by running:

% dig @a.root-servers.net . ns > db.cache

Organizing Your Files

Back when you first set up your zones, organizing your files was simple: you put them all in a single directory. There was one configuration file and a handful of zone datafiles. Over time, though, your responsibilities grew. More networks were added and hence more in-addr.arpa zones. Maybe you delegated a few subdomains. You started backing up zones for other sites. After a while, an ls of your nameserver directory no longer fit on a single screen. It’s time to reorganize. BIND has a few features that will help with this reorganization.

BIND nameservers support a configuration file statement, called include, which allows you to insert the contents of a file into the current configuration file. This lets you take a very large configuration file and break it into smaller pieces.

Zone datafiles (for all BIND versions) support two[*] control statements: $ORIGIN and $INCLUDE. The $ORIGIN statement changes a zone datafile’s origin, and $INCLUDE inserts a new file into the current zone datafile. These control statements are not resource records; they facilitate the maintenance of DNS data. In particular, they make it easier for you to divide your zone into subdomains by allowing you to store the data for each subdomain in a separate file.

Using Several Directories

One way to organize your zone datafiles is to store them in separate directories. If your nameserver is a primary for several sites’ zones (both forward- and reverse-mapping), you can store each site’s zone datafiles in its own directory. Another arrangement might be to store all the primary zones’ datafiles in one directory and all the backup zone datafiles in another. Let’s look at what the configuration file might look like if you chose to split up your primary and slave zones:

options { directory "/var/named"; };
//
// These files are not specific to any zone
//
zone "." {
        type hint;
        file "db.cache";
};
zone "0.0.127.in-addr.arpa" {
        type master;
        file "db.127.0.0";
};
//
// These are our primary zone files
//
zone "movie.edu" {
        type master;
        file "primary/db.movie.edu";
};
zone "249.249.192.in-addr.arpa" {
        type master;
        file "primary/db.192.249.249";
};
zone "253.253.192.in-addr.arpa" {
        type master;
        file "primary/db.192.253.253";
};
//
// These are our slave zone files
//
zone "ora.com" {
        type slave;
        file "slave/bak.ora.com";
        masters { 198.112.208.25; };
};
zone "208.112.198.in-addr.arpa" {
        type slave;
        file "slave/bak.198.112.208";
        masters { 198.112.208.25; };
};

Another variation on this division is to break the configuration file into three files: the main file, a file that contains all the primary entries, and a file that contains all the secondary entries. Here’s what the main configuration file might look like:

options { directory "/var/named"; };
//
// These files are not specific to any zone
//
zone "." {
        type hint;
        file "db.cache";
};
zone "0.0.127.in-addr.arpa" {
        type master;
        file "db.127.0.0";
};

include "named.conf.primary";
include "named.conf.slave";

Here is named.conf.primary:

//
// These are our primary zone files
//
zone "movie.edu" {
        type master;
        file "primary/db.movie.edu";
};
zone "249.249.192.in-addr.arpa" {
        type master;
        file "primary/db.192.249.249";
};
zone "253.253.192.in-addr.arpa" {
        type master;
        file "primary/db.192.253.253";
};

Here is named.conf.slave:

//
// These are our slave zone files
//
zone "ora.com" {
        type slave;
        file "slave/bak.ora.com";
        masters { 198.112.208.25; };
};
zone "208.112.198.in-addr.arpa" {
        type slave;
        file "slave/bak.198.112.208";
        masters { 198.112.208.25; };
};

You might think the organization would be better if you moved the configuration file containing the primary zone statements into the primary subdirectory, added a new directory substatement to change to that directory, and removed the primary/ prefix from each filename, since the nameserver would then be running in that directory. You could then make comparable changes in the configuration file containing the slave zone statements. Unfortunately, that doesn’t work. BIND allows you to define only a single working directory. Things get rather confusing when the nameserver keeps switching around to different directories: backup zone datafiles end up in the last directory the nameserver changed to, for example.

Changing the Origin in a Zone Datafile

With BIND, the default origin for the zone datafiles is the second field of the zone statement in the named.conf file. The origin is a domain name that is automatically appended to all names in the file that don’t end in a dot. This origin can be changed in the zone datafile with the $ORIGIN control statement. In the zone datafile, $ORIGIN is followed by a domain name. (Don’t forget the trailing dot if you use the full domain name!) From this point on, all names that don’t end in a dot have the new origin appended. If your zone (e.g., movie.edu) has a number of subdomains, you can use the $ORIGIN statement to reset the origin and simplify the zone datafile. For example:

$ORIGIN classics.movie.edu.
maltese       IN  A  192.253.253.100
casablanca    IN  A  192.253.253.101

$ORIGIN comedy.movie.edu.
mash          IN  A  192.253.253.200
twins         IN  A  192.253.253.201

We cover creating subdomains in more depth in Chapter 9.

Including Other Zone Datafiles

Once you’ve subdivided your zone like this, you might find it more convenient to keep each subdomain’s records in separate files. The $INCLUDE control statement lets you do this:

$ORIGIN classics.movie.edu.
$INCLUDE db.classics.movie.edu

$ORIGIN comedy.movie.edu.
$INCLUDE db.comedy.movie.edu

To simplify the file even further, you can specify the included file and the new origin on a single line:

$INCLUDE db.classics.movie.edu classics.movie.edu.
$INCLUDE db.comedy.movie.edu   comedy.movie.edu.

When you specify the origin and the included file on a single line, the origin change applies only to the particular file that you’re including. For example, the comedy.movie.edu origin applies only to the names in db.comedy.movie.edu. After db.comedy.movie.edu has been included, the origin returns to what it was before $INCLUDE, even if there was an $ORIGIN control statement within db.comedy.movie.edu.
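The scoping rule is easier to see in code. This Python sketch expands owner names the way the text describes, using a dictionary in place of real files. It is a simplification under our own assumptions: it only looks at the first field of each line, skips lines that begin with whitespace, and the function name load is hypothetical.

```python
def load(filename, origin, files):
    """Return the fully qualified owner names found in `filename`.

    files maps a filename to its list of lines (a stand-in for disk).
    """
    owners = []
    for line in files[filename]:
        fields = line.split()
        if not fields or line[0].isspace():
            continue  # blank line, or a record that reuses the last owner
        if fields[0] == "$ORIGIN":
            origin = fields[1]
        elif fields[0] == "$INCLUDE":
            # An origin on the $INCLUDE line applies only to the included
            # file. `origin` here is a local variable, so an $ORIGIN inside
            # the included file can't leak back into this one.
            child_origin = fields[2] if len(fields) > 2 else origin
            owners += load(fields[1], child_origin, files)
        else:
            owner = fields[0]
            owners.append(owner if owner.endswith(".") else f"{owner}.{origin}")
    return owners
```

Because each call keeps its own copy of the origin, a name that follows the $INCLUDE line in the parent file is still completed with the parent’s origin.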

Changing System File Locations

BIND allows you to change the name and location of the following system files: named.pid, named-xfer, named_dump.db, and named.stats. Most of you will not need to use this feature; don’t feel obligated to change the names or locations of these files just because you can.

If you do change the location of the files written by the nameserver (named.pid, named_dump.db, or named.stats), for security reasons you should choose a directory that is not world-writable. While we don’t know of any break-ins caused by writing these files, you should follow this guideline just to be safe.

named.pid’s full path is usually /var/run/named.pid or /etc/named.pid. One reason you might change the default location of this file is if you find yourself running more than one nameserver on a single host. Yikes! Why would someone do that? Well, Chapter 10 gives an example of running two nameservers on one host (and explains the rationale behind it). You can specify a different named.pid file in the configuration file for each server:

options { pid-file "server1.pid"; };

named-xfer’s path is usually /usr/sbin/named-xfer or /etc/named-xfer. You’ll remember that named-xfer is used by a slave nameserver for inbound zone transfers. One reason you might change the default location is to build and test a new version of BIND in a local directory; your test version of named can be configured to use the local version of named-xfer:

options { named-xfer "/home/rudy/named/named-xfer"; };

Since BIND 9 doesn’t use named-xfer, of course, there’s not much call for this substatement with BIND 9.

The nameserver writes named_dump.db into its current directory when you tell it to dump its database. Here’s an example of how to change the location of the dump file:

options { dump-file "/home/rudy/named/named_dump.db"; };

The nameserver writes named.stats into its current directory when you tell it to dump statistics. Here’s an example of how to change its location:

options { statistics-file "/home/rudy/named/named.stats"; };

Logging

BIND supports extensive logging, which consists of writing information to a debug file and sending information to syslog. Extensive logging has its costs, though; there’s a lot to learn before you can effectively configure this subsystem. If you don’t have time to experiment with logging, use the defaults and come back to this topic later. Most of you won’t need to change the default logging behavior.

There are two main concepts in logging: channels and categories. A channel specifies where logged data goes: to syslog, to a file, to named’s standard error output, or to the bit bucket. A category specifies what data is logged. In the BIND source code, most messages the nameserver logs are categorized according to the function of the code they relate to. For example, a message produced by the part of BIND that handles dynamic updates is probably in the update category. We’ll give you a list of the categories shortly.

Each category of data can be sent to a single channel or to multiple channels. In Figure 7-1, queries are logged to a file, while zone transfer data is logged both to a file and to syslog.

Figure 7-1. Logging categories to channels

Channels allow you to filter by message severity. Here’s the list of severities, from most severe to least:

critical
error
warning
notice
info
debug [level]
dynamic

The top five severities (critical, error, warning, notice, and info) are the familiar severity levels used by syslog. The other two (debug and dynamic) are unique to BIND.

debug is nameserver debugging for which you can specify a debug level. If you omit the debug level, the level is assumed to be 1. If you specify a debug level, you will see messages of that level when nameserver debugging is turned on (e.g., if you specify “debug 3,” you will see level 3 debugging messages even when you send only one trace command to the nameserver). If you specify dynamic severity, the nameserver will log messages that match its debug level. (For example, if you send one trace command to the nameserver, it logs messages from level 1. If you send three trace commands to the nameserver, it logs messages from levels 1 through 3.) The default severity is info, which means that you won’t see debug messages unless you specify the severity.
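The interaction between a channel’s severity and the nameserver’s debug level can be modeled in a few lines. The Python sketch below encodes only the rules stated in the paragraph above; it is our own model, not BIND’s code, and the function name channel_accepts is hypothetical.

```python
SYSLOG_RANK = {"critical": 5, "error": 4, "warning": 3, "notice": 2, "info": 1}


def channel_accepts(channel, message, server_debug=0):
    """Would a channel with severity `channel` log `message`?

    channel: "info", "warning", "debug", "debug 3", "dynamic", ...
    message: a syslog severity such as "error", or "debug N"
    server_debug: current debug level (number of trace commands sent)
    """
    c, m = channel.split(), message.split()
    if m[0] != "debug":
        # Syslog-style message: debug and dynamic channels sit below
        # info, so they accept every syslog severity.
        return SYSLOG_RANK[m[0]] >= SYSLOG_RANK.get(c[0], 0)
    msg_level = int(m[1]) if len(m) > 1 else 1
    if server_debug < 1:
        return False  # debugging is off, so no debug messages exist
    if c[0] == "dynamic":
        return msg_level <= server_debug
    if c[0] == "debug":
        limit = int(c[1]) if len(c) > 1 else 1
        return msg_level <= limit
    return False  # info and above never pass debug messages
```

For instance, with one trace command sent, a “debug 3” channel accepts a level 3 message while a dynamic channel does not.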

Tip

You can configure a channel to log both debug messages and syslog messages to a file. However, the converse is not true: you cannot configure a channel to log both debug messages and syslog messages with syslog; debug messages can’t be sent to syslog.

Let’s configure a couple of channels to show you how this works. The first channel will go to syslog and log with facility daemon, sending those messages of severity info and above. The second channel will go to a file, logging debug messages at any level as well as syslog messages. Here is the logging statement:

logging {
  channel my_syslog {
     syslog daemon;
     // Debug messages will not be sent to syslog, so
     // there is no point to setting the severity to
     // debug or dynamic; use the lowest syslog level: info.
     severity info;
  };
  channel my_file {
     file "/tmp/log.msgs";
     // Set the severity to dynamic to see all the debug messages.
     severity dynamic;
  };
};

Now that we’ve configured a couple of channels, we have to tell the nameserver exactly what to send to those channels. Let’s implement what was pictured in Figure 7-1, with zone transfers going to syslog and to the file, and queries going to the file. The category specification is part of the logging statement, so we’ll build on the previous logging statement:

logging {
  channel my_syslog {
     syslog daemon;
     severity info;
  };
  channel my_file {
     file "/tmp/log.msgs";
     severity dynamic;
  };

  category xfer-out { my_syslog; my_file; };
  category queries { my_file; };
};

With this logging statement in your configuration file, start your nameserver and send it a few queries. If nothing is written to log.msgs, you may have to turn on nameserver debugging to get queries logged:

# rndc trace

Now if you send your nameserver some queries, they’re logged to log.msgs. But look around the nameserver’s working directory: there’s a new file called named.run. It has all the other debugging information written to it. You didn’t want all this other debugging, though; you just wanted the transfers and queries. How do you get rid of named.run?

There’s a special category we haven’t told you about: default. If you don’t specify any channels for a category, BIND sends those messages to whichever channel the default category is assigned to. Let’s change the default category to discard all logging messages (there’s a channel called null for this purpose):

logging {
  channel my_syslog {
     syslog daemon;
     severity info;
  };
  channel my_file {
     file "/tmp/log.msgs";
     severity dynamic;
  };

  category default { null; };
  category xfer-out { my_syslog; my_file; };
  category queries { my_file; };
};

Now start your server, turn on debugging to level 1 (if necessary), and send some queries. The queries end up in log.msgs, and named.run is created but stays empty. Great! We’re getting the hang of this after all.

A few days pass. One of your coworkers notices that the nameserver is sending far fewer messages to syslog than it used to. What happened?

Well, the default category is set up, by default, to send messages to both syslog and to the debug file (named.run). When you assigned the default category to the null channel, you turned off the other syslog messages, too. Here’s what we should have used:

category default { my_syslog; };

This sends the syslog messages to syslog but does not write debug or syslog messages to a file.
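The fallback behavior that tripped us up can be expressed as a simple lookup. This Python sketch models it under a simplifying assumption: it ignores the extra built-in assignments BIND 8 makes for a few special categories. The function name channels_for is ours.

```python
BUILTIN_DEFAULT = ["default_syslog", "default_debug"]


def channels_for(category, assignments):
    """Return the channels that messages in `category` are sent to.

    assignments maps category names to channel lists, mirroring the
    category lines in a logging statement.
    """
    if category in assignments:
        return assignments[category]
    # No explicit assignment: fall back to the default category, and if
    # that isn't configured either, to the built-in default channels.
    return assignments.get("default", BUILTIN_DEFAULT)


if __name__ == "__main__":
    broken = {
        "default": ["null"],
        "xfer-out": ["my_syslog", "my_file"],
        "queries": ["my_file"],
    }
    # Everything without its own category line is thrown away.
    print(channels_for("notify", broken))
```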

Remember, we said you’d have to experiment for a while with logging to get exactly what you want. We hope this example gives you a hint of what you might run into. Now, let’s go over the details of logging.

The logging Statement

Here’s the syntax of the logging statement. It’s rather intimidating. We’ll go over some more examples as we explain what each substatement means:

logging {
  [ channel channel_name {
    ( file path_name
       [ versions ( number | unlimited ) ]
       [ size size_spec ]
     | syslog ( kern | user | mail | daemon | auth | syslog | lpr |
                news | uucp | cron | authpriv | ftp |
                local0 | local1 | local2 | local3 |
                local4 | local5 | local6 | local7 )
     | stderr
     | null );

    [ severity ( critical | error | warning | notice |
                 info  | debug [ level ] | dynamic ); ]
    [ print-category yes_or_no; ]
    [ print-severity yes_or_no; ]
    [ print-time yes_or_no; ]
  }; ]

  [ category category_name {
    channel_name; [ channel_name; ... ]
  }; ]
  ...
};

Here are the default channels. The nameserver creates these channels even if you don’t want them. You can’t redefine these channels; you can only add more of them.

channel default_syslog {
    syslog daemon;        // send to syslog's daemon facility
    severity info;        // only send severity info and higher
};

channel default_debug {
    file "named.run";     // write to named.run in the
                          // working directory
    severity dynamic;     // log at the server's current debug level
};

channel default_stderr {  // writes to stderr
    stderr;               // only BIND 9 lets you define your own stderr
                          // channel, though BIND 8 has the built-in
                          // default_stderr channel.
    severity info;        // only send severity info and higher
};

channel null {
    null;                 // toss anything sent to this channel
};

If you don’t assign channels to the categories default, panic, packet, and eventlib, a BIND 8 nameserver assigns them these channels by default:

logging {
    category default { default_syslog; default_debug; };
    category panic { default_syslog; default_stderr; };
    category packet { default_debug; };
    category eventlib { default_debug; };
};

A BIND 9 nameserver uses this as the default logging statement:

logging {
    category default {
        default_syslog;
        default_debug;
    };
};

As we mentioned earlier, the default category logs to both syslog and to the debug file (which by default is named.run). This means that all syslog messages of severity info and above are sent to syslog, and when debugging is turned on, the syslog messages and debug messages are written to named.run.

Channel Details

A channel may be defined to go to a file, to syslog, to stderr (BIND 9 only), or to null.

File channels

If a channel goes to a file, you must specify the file’s pathname. Optionally, you can specify how many versions of the file can exist at one time and how big the file may grow.

If you specify that there can be three versions, BIND will retain file, file.0, file.1, and file.2. After the nameserver starts or after it is reloaded, it moves file.1 to file.2, file.0 to file.1, file to file.0, and starts writing to a new copy of file. If you specify unlimited versions, BIND will keep 99 versions.
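Here’s a small Python simulation of that rollover, using a dictionary of filename to contents instead of touching the disk. It follows the renaming order given above; the function name roll is hypothetical, and the sketch doesn’t attempt to model the unlimited case.

```python
def roll(files, base, versions):
    """Return the files that exist after one rollover of `base`."""
    rolled = dict(files)
    # Shift the oldest copies first so nothing is overwritten too early:
    # base.1 -> base.2, then base.0 -> base.1, and so on.
    for n in range(versions - 1, 0, -1):
        newer, older = f"{base}.{n - 1}", f"{base}.{n}"
        if newer in rolled:
            rolled[older] = rolled.pop(newer)
    if base in rolled:
        rolled[f"{base}.0"] = rolled.pop(base)
    rolled[base] = ""  # the nameserver starts writing to a fresh file
    return rolled
```

Calling roll twice in a row shows the oldest contents dropping off the end once the version limit is reached.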

If you specify a maximum file size, the nameserver stops writing to the file after it reaches the specified size. Unlike the versions substatement (mentioned in the last paragraph), the file is not rolled over and a new file opened when the specified size is reached. The nameserver just stops writing to the file. If you do not specify a file size, the file grows indefinitely.

Here is an example file channel using the versions and size substatements:

logging{
  channel my_file {
     file "log.msgs" versions 3 size 10k;
     severity dynamic;
  };
};

The size can include a scaling factor, as in the example. K or k is kilobytes; M or m is megabytes; G or g is gigabytes.
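As a quick illustration of the scaling factors, this Python sketch converts a size specification to bytes. We’re assuming the factors are powers of 1024, which is our reading rather than something stated above, and parse_size is a hypothetical helper.

```python
def parse_size(spec):
    """Convert a size such as "10k" or "2M" to a number of bytes."""
    # Assumption: k, m, and g scale by 1024, 1024**2, and 1024**3.
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = spec[-1].lower()
    if suffix in units:
        return int(spec[:-1]) * units[suffix]
    return int(spec)  # no suffix: a plain byte count
```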

It’s important to specify the severity as either debug or dynamic if you want to see debug messages. The default severity is info, which shows you only syslog messages.

syslog channels

If a channel goes to syslog, you can specify the facility to be any of the following: kern, user, mail, daemon, auth, syslog, lpr, news, uucp, cron, authpriv, ftp, local0, local1, local2, local3, local4, local5, local6, or local7. The default is daemon, and we recommend that you either use that or one of the local facilities.

Here’s an example syslog channel that uses the facility local0 instead of daemon:

logging {
    channel my_syslog {
        syslog local0;        // send to syslog's local0 facility
        severity info;        // only send severity info and higher
    };
};

stderr channel

There is a predefined channel called default_stderr for any messages you’d like written to the stderr file descriptor of the nameserver. With BIND 8, you cannot configure any other channels to use stderr. With BIND 9, you can.

null channel

There is a predefined channel called null for messages you want to throw away.

Data formatting for all channels

The BIND logging facility also allows you some control over the formatting of messages. You can add a timestamp, a category, and a severity level to the messages.

Here’s an example debug message that has all the extra goodies:

01-Feb-1998 13:19:18.889 config: debug 1: source = db.127.0.0

The category for this message is config, and the severity is debug level 1.

Here’s an example channel configuration that includes all three additions:

logging {
  channel my_file {
     file "log.msgs";
     severity debug;
     print-category yes;
     print-severity yes;
     print-time yes;
  };
};

There isn’t much point in adding a timestamp for messages to a syslog channel because syslog adds the time and date itself.

Category Details

Both BIND 8 and BIND 9 have lots of categories (lots!). Unfortunately, they’re different categories. We’ll list them here so you can see them all. Rather than trying to figure out which you want to see, we recommend that you configure your nameserver to print out all its log messages with their category and severity, and then pick out the ones you want to see. We’ll show you how to do this after describing the categories.

BIND 8 categories

default

If you don’t specify any channels for a category, the default category is used. In that sense, default is synonymous with all categories. However, there are some messages that didn’t end up in a category. So, even if you specify channels for each category individually, you’ll still want to specify a channel for the default category for all the uncategorized messages.

If you do not specify a channel for the default category, one will be specified for you:

category default { default_syslog; default_debug; };

cname

CNAME errors (e.g., “... has CNAME and other data”).

config

High-level configuration file processing.

db

Database operations.

eventlib

System events; must point to a single file channel. The default is:

category eventlib { default_debug; };

insist

Internal consistency check failures.

lame-servers

Detection of bad delegation.

load

Zone loading messages.

maintenance

Periodic maintenance events (e.g., system queries).

ncache

Negative caching events.

notify

Asynchronous zone change notifications.

os

Problems with the operating system.

packet

Decodes of packets received and sent; must point to a single file channel. The default is:

category packet { default_debug; };

panic

Problems that cause the shutdown of the server. These problems are logged both in the panic category and in their native category. The default is:

category panic { default_syslog; default_stderr; };

parser

Low-level configuration file processing.

queries

Query logging.

response-checks

Malformed responses, unrelated additional information, etc.

security

Approved/unapproved requests.

statistics

Periodic reports of activities.

update

Dynamic update events.

update-security

Unapproved dynamic updates. (In 8.4.0, these were moved into their own category so that administrators could more easily filter them out.)

xfer-in

Zone transfers from remote nameservers to the local nameserver.

xfer-out

Zone transfers from the local nameserver to remote nameservers.

BIND 9 categories

default

As with BIND 8, BIND 9’s default category matches all categories not specifically assigned to channels. However, BIND 9’s default category, unlike BIND 8’s, doesn’t match BIND’s messages that aren’t categorized. Those are part of the category listed next.

general

The general category contains all BIND messages that aren’t explicitly classified.

client

Processing client requests.

config

Configuration file parsing and processing.

database

Messages relating to BIND’s internal database, which is used to store zone data and cache records.

dnssec

Processing DNSSEC-signed responses.

lame-servers

Detection of bad delegation (re-added in BIND 9.1.0; before that, lame server messages were logged to resolver).

network

Network operations.

notify

Asynchronous zone change notifications.

queries

Query logging (added in BIND 9.1.0).

resolver

Name resolution, including the processing of recursive queries from resolvers.

security

Approved/unapproved requests.

update

Dynamic update events.

update-security

Unapproved dynamic updates. See note under the like-named BIND 8 category (added in 9.3.0).

xfer-in

Zone transfers from remote nameservers to the local nameserver.

xfer-out

Zone transfers from the local nameserver to remote nameservers.

Viewing all category messages

A good way to start your foray into logging is to configure your nameserver to log all its messages to a file, including the category and severity, and then pick out which messages you are interested in.

Earlier, we listed the categories that are configured by default. Here they are for BIND 8:

logging {
    category default { default_syslog; default_debug; };
    category panic { default_syslog; default_stderr; };
    category packet { default_debug; };
    category eventlib { default_debug; };
};

And here’s the category for BIND 9:

logging {
    category default { default_syslog; default_debug; };
};

By default, the category and severity are not included with messages written to the default_debug channel. In order to see all the log messages, with their category and severity, you’ll have to configure each category yourself.

Here’s a BIND 8 logging statement that does just that:

logging {
  channel my_file {
     file "log.msgs";
     severity dynamic;
     print-category yes;
     print-severity yes;
  };

  category default  { default_syslog; my_file; };
  category panic    { default_syslog; my_file; };
  category packet   { my_file; };
  category eventlib { my_file; };
  category queries  { my_file; };
};

(A BIND 9 logging statement wouldn’t have panic, packet, or eventlib categories.)

Notice that we’ve defined each category to include the channel my_file. We also added one category that wasn’t in the previous default logging statement: queries. Queries aren’t printed unless you configure the queries category.

Start your server, and turn on debugging to level 1. You’ll then see messages in log.msgs that look like the following. (BIND 9 shows only the query message because it doesn’t generate these debug messages anymore.)

queries: info: XX /192.253.253.4/foo.movie.edu/A
default: debug 1: req: nlookup(foo.movie.edu) id 4 type=1 class=1
default: debug 1: req: found 'foo.movie.edu' as 'foo.movie.edu' (cname=0)
default: debug 1: ns_req: answer -> [192.253.253.4].2338 fd=20 id=4 size=87

Once you’ve determined the messages that interest you, configure your server to log only those messages.
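Picking out the interesting messages is easier with a tally of what the file contains. The Python sketch below counts messages by category and severity; it assumes lines in the format shown above (print-category and print-severity on, print-time off), and the function name tally is ours.

```python
from collections import Counter


def tally(lines):
    """Count log lines by (category, severity)."""
    counts = Counter()
    for line in lines:
        # Lines look like "category: severity: message text".
        parts = line.split(": ", 2)
        if len(parts) == 3:
            counts[(parts[0], parts[1])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "queries: info: XX /192.253.253.4/foo.movie.edu/A",
        "default: debug 1: req: nlookup(foo.movie.edu) id 4 type=1 class=1",
    ]
    for key, count in tally(sample).most_common():
        print(count, *key)
```

Categories that dominate the tally but don’t interest you are good candidates for the null channel.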

Keeping Everything Running Smoothly

A significant part of maintenance is being aware that something is wrong before it becomes a real problem. If you catch a problem early, chances are it’ll be that much easier to fix. As the old adage says, an ounce of prevention is worth a pound of cure.

This isn’t quite troubleshooting—we’ll devote an entire chapter to troubleshooting later—think of it more as “pre-troubleshooting.” Troubleshooting (the pound of cure) is what you have to do after your problem has developed complications and you need to identify the problem by its symptoms.

The next two sections deal with preventative maintenance: looking periodically at the syslog file and at the BIND nameserver statistics to see whether any problems are developing. Consider this a nameserver’s medical checkup.

Common Syslog Messages

There are a large number of syslog messages that named can emit. In practice, you’ll see only a few of them. We’ll cover the most common syslog messages here, excluding reports of syntax errors in zone datafiles.

Every time you start named, it sends out a message at priority LOG_NOTICE. For a BIND 8 nameserver, it looks like this:

Jan 10 20:48:32 toystory named[3221]: starting.  named 8.2.3 Tue May 16 09:39:40
                MDT 2000 [email protected]:/usr/local/src/bind-8.2.3/src/bin/named

For BIND 9, it’s significantly abridged:

Jul 27 16:18:41 toystory named[7045]: starting BIND 9.3.2

This message logs the fact that named started at this time and tells you the version of BIND you’re running as well as who built it and where (for BIND 8). Of course, this is nothing to be concerned about. It is a good place to look if you’re not sure what version of BIND your operating system supports.

Every time you send the nameserver a reload command, a BIND 8 nameserver sends out this message at priority LOG_NOTICE:

Jan 10 20:50:16 toystory named[3221]: reloading nameserver

Here’s what a BIND 9 nameserver logs:

Jul 27 16:27:45 toystory named[7047]: loading configuration from
                '/etc/named.conf'

These messages simply tell you that named reloaded its database (as a result of a reload command) at this time. Again, this is nothing to be concerned about. This message will most likely be of interest when you are tracking down how long a bad resource record has been in your zone data or how long a whole zone has been missing because of a mistake during an update.

Here’s another message you may see shortly after your nameserver starts:

Jan 10 20:50:20 toystory named[3221]: cannot set resource limits on
                this system

This means that your nameserver thinks your operating system does not support the getrlimit() and setrlimit() system calls, which are used when you try to define coresize, datasize, stacksize, or files. It doesn’t matter whether you’re actually using any of these substatements in your configuration file; BIND will print the message anyway. If you are not using these substatements, ignore the message. If you are, and you think your operating system actually does support getrlimit() and setrlimit(), you’ll have to recompile BIND with HAVE_GETRUSAGE defined. This message is logged at priority LOG_INFO.

If you run your nameserver on a host with many network interfaces (especially virtual network interfaces), you may see this message soon after startup or even after your nameserver has run well for a while:

Jan 10 20:50:31 toystory named[3221]: fcntl(dfd, F_DUPFD, 20): Too
                many open files
Jan 10 20:50:31 toystory named[3221]: fcntl(sfd, F_DUPFD, 20): Too
                many open files

This means that BIND has run out of file descriptors. BIND uses a fair number of file descriptors: two for each network interface it’s listening on (one for UDP and one for TCP), and one for opening zone datafiles. If that’s more than the limit your operating system places on processes, BIND won’t be able to get any more file descriptors, and you’ll see this message. The priority depends on which part of BIND fails to get the file descriptor: the more critical the subsystem, the higher the priority.

The next step is either to get BIND to use fewer file descriptors, or to raise the limit the operating system places on the number of file descriptors BIND can use:

  • If you don’t need BIND listening on all your network interfaces (particularly the virtual ones), use the listen-on substatement to configure BIND to listen only on those interfaces it needs to. See Chapter 10 for details on the syntax of listen-on.

  • If your operating system supports getrlimit() and setrlimit() (as just described), configure your nameserver to use a larger number of files with the files substatement. See Chapter 10 for details on using the files substatement.

  • If your operating system places too restrictive a limit on open files, raise that limit before you start named with the ulimit command.
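As a rough sanity check before choosing among these options, you can compare the descriptor count described above with the limit the operating system currently imposes. The sketch below is only an estimate under stated assumptions: the function names are ours, the formula (two descriptors per interface plus one for zone datafiles) is taken from the description above and ignores anything else named opens, and the limit reported is that of the Python process, so run it from the same environment that starts named.

```python
import resource


def estimated_descriptors(interface_count):
    """Rough lower bound: one UDP and one TCP socket per interface,
    plus one descriptor for opening zone datafiles."""
    return 2 * interface_count + 1


def check_limit(interface_count):
    """Compare the estimate with this process's soft open-file limit.

    Returns (needed, soft_limit, fits). The soft limit is inherited from
    the invoking shell, which is why this should run in the environment
    that launches named.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = estimated_descriptors(interface_count)
    unlimited = soft == resource.RLIM_INFINITY
    return needed, soft, bool(unlimited or needed <= soft)


if __name__ == "__main__":
    # 40 is a hypothetical count of physical plus virtual interfaces.
    needed, soft, fits = check_limit(40)
    print(f"need about {needed} descriptors, soft limit is {soft}, fits: {fits}")
```

If the estimate is close to the soft limit, restricting listen-on or raising the limit is probably worthwhile even before the error appears.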

Every time a nameserver loads a zone, it sends out a message at priority LOG_INFO:

Jan 10 21:49:50 toystory named[3221]: zone movie.edu/IN
                loaded serial 2005011000

This tells you when the nameserver loaded the zone, the class of the zone (in this case, IN), and the serial number in the zone’s SOA record.

About every hour, a BIND 8 nameserver sends a snapshot of the current statistics at priority LOG_INFO:

Feb 18 14:09:02 toystory named[3565]: USAGE 824681342 824600158
                CPU=13.01u/3.26s CHILDCPU=9.99u/12.71s
Feb 18 14:09:02 toystory named[3565]: NSTATS 824681342 824600158
                A=4 PTR=2
Feb 18 14:09:02 toystory named[3565]: XSTATS 824681342 824600158
                RQ=6 RR=2 RIQ=0 RNXD=0 RFwdQ=0 RFwdR=0 RDupQ=0 RDupR=0
                RFail=0 RFErr=0 RErr=0 RTCP=0 RAXFR=0 RLame=0 Ropts=0
                SSysQ=2 SAns=6 SFwdQ=0 SFwdR=0 SDupQ=5 SFail=0 SFErr=0
                SErr=0 RNotNsQ=6 SNaAns=2 SNXD=1

(BIND 9 doesn’t send out the statistics as a log message.) The first two numbers for each message are times. If you subtract the second number from the first number, you’ll find out how many seconds your server has been running. (You’d think the nameserver could do that for you.) The CPU entry tells you how much time your server has spent in user mode (13.01 seconds) and system mode (3.26 seconds). Then it tells you the same statistic for child processes. The NSTATS message lists the types of queries your server has received and the counts for each. The XSTATS message lists additional statistics. The statistics under NSTATS and XSTATS are explained in more detail later in this chapter.
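Since the nameserver leaves the subtraction to you, a few lines of script can do it. This is a minimal sketch, assuming the message payload has already been joined onto one line and stripped of the syslog prefix; the function name parse_stats_message is ours, not part of BIND.

```python
def parse_stats_message(payload):
    """Split the payload of a USAGE, NSTATS, or XSTATS message.

    Returns (kind, uptime_seconds, fields), where fields maps each
    NAME=value token to its raw string value.
    """
    tokens = payload.split()
    kind = tokens[0]
    now, started = int(tokens[1]), int(tokens[2])
    fields = dict(token.split("=", 1) for token in tokens[3:])
    return kind, now - started, fields


# Payload of the USAGE message shown above, joined onto one line.
usage = "USAGE 824681342 824600158 CPU=13.01u/3.26s CHILDCPU=9.99u/12.71s"
kind, uptime, fields = parse_stats_message(usage)

if __name__ == "__main__":
    print(f"{kind}: up {uptime} seconds ({uptime / 3600:.1f} hours)")
    print("CPU:", fields["CPU"])
```

For the sample message, the difference is 81,184 seconds, or a little over 22 hours.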

If BIND finds a name that doesn’t conform to RFC 952, it logs a syslog error:

Jul 24 20:56:26 toystory named[1496]: ID_4.movie.edu IN:
                bad owner name (check-names)

This message is logged at level LOG_ERR. See Chapter 4 for the host-naming rules.

Another syslog message, sent at priority LOG_ERR, is a message about the zone data:

Jan 10 20:48:38 toystory2 named[3221]: ts2 has CNAME
                and other data (invalid)

This message means that there’s a problem with your zone data. For example, you may have entries like these:

ts2                 IN  CNAME toystory2
ts2                 IN  MX    10 toystory2
toystory2           IN  A     192.249.249.10
toystory2           IN  MX    10 toystory2

The MX record for ts2 is incorrect and triggers the message just listed. ts2 is an alias for toystory2, which is the canonical name. As described earlier, when a nameserver looks up a name and finds a CNAME, it replaces the original name with the canonical name and then tries looking up the canonical name. Thus, when the server looks up the MX data for ts2, it finds a CNAME record and then looks up the MX record for toystory2. Since the server follows the CNAME record for ts2, it never uses the MX record for ts2; in fact, this record is illegal. In other words, all resource records for a host have to use the canonical name; it’s an error to use an alias in place of the canonical name.
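The rule is easy to check mechanically before you reload. The following is a simplified sketch, not what named does internally: it assumes the zone data has already been reduced to (owner, type) pairs, and the function name cname_conflicts is ours. It also ignores the DNSSEC-related record types that are allowed to coexist with a CNAME.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return owner names that have a CNAME plus any other record type."""
    types_by_owner = defaultdict(set)
    for owner, rtype in records:
        # Domain names and type mnemonics are case-insensitive.
        types_by_owner[owner.lower()].add(rtype.upper())
    return sorted(
        owner
        for owner, types in types_by_owner.items()
        if "CNAME" in types and len(types) > 1
    )


# The four records from the example above, as (owner, type) pairs.
records = [
    ("ts2", "CNAME"),
    ("ts2", "MX"),
    ("toystory2", "A"),
    ("toystory2", "MX"),
]
conflicts = cname_conflicts(records)

if __name__ == "__main__":
    for owner in conflicts:
        print(f"{owner} has CNAME and other data")
```

Only ts2 is reported; toystory2 has two record types, but neither is a CNAME.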

The following message indicates that a BIND 8 slave was unable to reach any master server when it tried to do a zone transfer:

Jan 10 20:52:42 wormhole named[2813]: zoneref: Masters for
                secondary zone "movie.edu" unreachable

BIND 9 slaves say:

Jul 27 16:50:55 toystory named[7174]: transfer of 'movie.edu/IN'
                from 192.249.249.3#53: failed to connect: timed out

This message is sent at priority LOG_NOTICE on BIND 8 and LOG_ERR on BIND 9, and only the first time the zone transfer fails. When the zone transfer finally succeeds, BIND tells you so by issuing another syslog message. When this message first appears, you don’t need to take any immediate action. The nameserver will continue attempting to transfer the zone according to the retry period in the SOA record. After a few days (or half the expire time), you might check that the server was able to transfer the zone.

You can also verify that the zone transferred by checking the timestamp on the backup zone datafile. When a zone transfer succeeds, a new backup file is created. When a nameserver finds a zone is up to date, it “touches” the backup file (à la the Unix touch command). In both cases, the timestamp on the backup file is updated, so go to the slave and give the command ls -l /usr/local/named/db*. This tells you when the slave last synchronized each zone with the master server. We’ll cover how to troubleshoot slaves failing to transfer zones in Chapter 14.
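If you would rather not eyeball the ls output, the same timestamp check can be scripted. This is a small sketch under our own assumptions: the helper names are ours, the db* pattern follows the directory used in this chapter, and the age threshold is whatever you consider too long for your zones (a multiple of the refresh interval is one reasonable choice).

```python
import glob
import os
import time


def backup_file_times(pattern):
    """Map each file matching pattern to its modification time (epoch seconds)."""
    return {path: os.path.getmtime(path) for path in sorted(glob.glob(pattern))}


def stale_files(pattern, max_age_seconds, now=None):
    """List backup files not rewritten or touched within max_age_seconds."""
    now = time.time() if now is None else now
    return [
        path
        for path, mtime in backup_file_times(pattern).items()
        if now - mtime > max_age_seconds
    ]


if __name__ == "__main__":
    # Hypothetical threshold: flag zones not synchronized in the last day.
    for path in stale_files("/usr/local/named/db*", 24 * 60 * 60):
        print("not synchronized recently:", path)
```

A file that shows up here is only a hint; the zone may simply have a long refresh interval.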

If you are watching the syslog messages, you’ll see a LOG_INFO syslog message when the slave picks up the new zone data or when a tool such as nslookup transfers a zone:

Mar  7 07:30:04 toystory named[3977]: client 192.249.249.1#1076:
                transfer of 'movie.edu/IN':AXFR started

If you’re using the allow-transfer substatement (explained in Chapter 11) to limit which servers can load zones, you may see this message saying denied instead of started:

Jul 27 16:59:26 toystory named[7174]: client 192.249.249.1#1386:
                zone transfer 'movie.edu/AXFR/IN' denied

You’d see this syslog message only if you capture LOG_INFO syslog messages:

Jan 10 20:52:42 wormhole named[2813]: Malformed response
                from 192.1.1.1

Most often, this message means that some bug in a nameserver caused it to send an erroneous response packet. The error probably occurred on the remote nameserver (192.1.1.1) rather than the local server (wormhole). Diagnosing this kind of error involves capturing the response packet in a network trace and decoding it. Decoding DNS packets manually is beyond the scope of this book, so we won’t go into much detail. You’d see this type of error if the response packet said it contained several answers in the answer section (such as four address resource records), yet the answer section contained only a single answer. The only course of action is to notify the administrator of the offending host via email (assuming you can get the name of the host by looking up the address). You would also see this message if the underlying network altered (damaged) the UDP response packets in some way. Checksumming UDP packets is optional, so this error might not be caught at a lower level.

A BIND 8 named logs this message when you try to sneak records into your zone datafile that belong in another zone:

Jun 13 08:02:03 toystory named[2657]: db.movie.edu:28: data "foo.bar.edu"
                           outside zone "movie.edu" (ignored)

A BIND 9 named logs:

Jul 27 17:07:01 toystory named[7174]: dns_master_load:
                db.movie.edu:28: ignoring out-of-zone data

For instance, if we tried to use this zone data:

shrek       IN A  192.249.249.2
toystory    IN A  192.249.249.3

; Add this entry to the nameserver's cache
foo.bar.edu.  IN A  10.0.7.13

we’d be adding data for the bar.edu zone into our movie.edu zone datafile. This syslog message is logged at priority LOG_WARNING.
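The test the nameserver applies amounts to asking whether a record’s owner name ends in the zone’s origin, compared label by label. Here is a minimal sketch of that comparison; the function name is_in_zone is ours, it expects fully qualified names, and it makes no attempt to account for subzones delegated away from the zone.

```python
def is_in_zone(name, origin):
    """True if the fully qualified name is at or below the zone origin."""
    origin = origin.rstrip(".").lower()
    if not origin:
        # Every name is within the root zone.
        return True
    name_labels = name.rstrip(".").lower().split(".")
    origin_labels = origin.split(".")
    # Compare whole labels so that notmovie.edu is not mistaken for movie.edu.
    return name_labels[-len(origin_labels):] == origin_labels


if __name__ == "__main__":
    for owner in ("shrek.movie.edu.", "toystory.movie.edu.", "foo.bar.edu."):
        verdict = "ok" if is_in_zone(owner, "movie.edu") else "out-of-zone data"
        print(f"{owner:22} {verdict}")
```

Comparing labels rather than raw strings matters: a plain suffix test would wrongly accept a name such as notmovie.edu.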

Earlier in the book, we said that you couldn’t use a CNAME in the data portion of a resource record. BIND 8 will catch this misuse:

Jun 13 08:21:04 toystory named[2699]: "movie.edu IN NS" points to a
                                        CNAME (mi.movie.edu)

BIND 9 doesn’t catch it as of 9.3.0.

Here is an example of the offending resource records:

@                        IN  NS       toystory.movie.edu.
                         IN  NS       mi.movie.edu.
toystory.movie.edu.      IN  A     192.249.249.3
monsters-inc.movie.edu.  IN  A     192.249.249.4
mi.movie.edu.            IN  CNAME monsters-inc.movie.edu.

The second NS record should have listed monsters-inc.movie.edu instead of mi.movie.edu. This syslog message won’t show up immediately when you start your nameserver.

Tip

You’ll only see the syslog message when the offending data is looked up. This syslog message is logged by a BIND 8 server at priority LOG_INFO.

The following message indicates that your nameserver may be guarding itself against one type of network attack:

Jun 11 11:40:54 toystory named[131]: Response from unexpected source
                ([204.138.114.3].53)

Your nameserver sent a query to a remote nameserver, but the response that came wasn’t returned from any of the addresses your nameserver had listed for the remote nameserver. The potential security breach is this: an intruder causes your nameserver to query a remote nameserver, and at the same time the intruder sends responses (pretending the responses are from the remote nameserver) that the intruder hopes your nameserver will add to its cache. Perhaps he sends along a false PTR record, pointing the IP address of one of his hosts to the domain name of a host you trust. Once the false PTR record is in your cache, the intruder uses one of the BSD “r” commands (e.g., rlogin) to gain access to your system.

Less paranoid admins will realize that this situation can also happen if a parent zone’s nameserver knows about only one of the IP addresses of a multihomed nameserver for a child zone. The parent tells your nameserver the one IP address it knows, and when your server queries the remote nameserver, the remote nameserver responds from the other IP address. This shouldn’t happen if BIND is running on the remote nameserver host, because BIND makes every effort to use the same IP address in the response as the query was sent to. This syslog message is logged at priority LOG_INFO.

Here’s an interesting syslog message:

Jun 10 07:57:28 toystory named[131]: No root name servers for
                class 226

The only classes defined to date are: class 1, Internet (IN); class 3, Chaos (CH); and class 4, Hesiod (HS). What’s class 226? That’s exactly what your nameserver is saying with this syslog message: something is wrong because there’s no class 226. What can you do about it? Nothing, really. This message doesn’t give you enough information; you don’t know who the query is from or what the query was for. Then again, if the class field is corrupted, the domain name in the query may be garbage too. The actual cause of the problem could be a broken remote nameserver or resolver, or a corrupted UDP datagram. This syslog message is logged at priority LOG_INFO.

This message might appear if you are backing up some other zone:

Jun 7 20:14:26 wormhole named[29618]: Zone "253.253.192.in-addr.arpa"
                (class 1) SOA serial# (3345) rcvd from [192.249.249.10]
                is < ours (563319491)

Ah, the pesky admin for 253.253.192.in-addr.arpa changed the serial number format and neglected to tell you about it. Some thanks you get for running a slave for this zone, huh? Drop the admin a note to see if this change was intentional or just a typo. If the change was intentional, or if you don’t want to contact the admin, then you have to deal with it locally—kill your slave, remove the backup copy of this zone, and restart your server. This procedure removes all knowledge your slave had of the old serial number, at which point it’s quite happy with the new serial number. This syslog message is logged at priority LOG_NOTICE.

By the way, if that pesky admin was running a BIND 8 or 9 nameserver, then he must have missed (or ignored) a message his server logged, telling him that he’d rolled the zone’s serial number back. On a BIND 8 nameserver, the message looks like:

Jun 7 19:35:14 toystory named[3221]: WARNING: new serial number < old
              (zp->z_serial < serial)

On a BIND 9 nameserver, it looks like:

Jun 7 19:36:41 toystory named[9832]: dns_zone_load: zone movie.edu/IN:
                zone serial has gone backwards

This message is logged at LOG_NOTICE.

You might want to remind him of the wisdom of checking syslog after making any changes to the nameserver.

This BIND 8 message will undoubtedly become familiar to you:

Aug 21 00:59:06 toystory named[12620]: Lame server on 'foo.movie.edu'
         (in 'MOVIE.EDU'?): [10.0.7.125].53 'NS.HOLLYWOOD.LA.CA.US':
         learnt (A=10.47.3.62,NS=10.47.3.62)

Under BIND 9, it looks like this:

Jan 15 10:20:16 toystory named[14205]: lame server on 'foo.movie.edu'
         (in 'movie.EDU'?): 10.0.7.125#53

“Aye, Captain, she’s sucking mud!” There’s some mud out there in the Internet waters in the form of bad delegations. A parent nameserver is delegating a subdomain to a child nameserver, and the child nameserver is not authoritative for the subdomain. In this case, the edu nameserver is delegating movie.edu to 10.0.7.125, and the nameserver on this host is not authoritative for movie.edu. Unless you know the admin for movie.edu, there’s probably nothing you can do about this. The syslog message is logged at LOG_INFO.

If your configuration file has:

logging { category queries { default_syslog; }; };

you will get a LOG_INFO syslog message for every query your nameserver receives:

Feb 20 21:43:25 toystory named[3830]:
            XX /192.253.253.2/carrie.movie.edu/A
Feb 20 21:43:32 toystory named[3830]:
            XX /192.253.253.2/4.253.253.192.in-addr.arpa/PTR

The format has changed slightly in BIND 9, though:

Jan 13 18:32:25 toystory named[13976]: client 192.253.253.2#1702:
           query: carrie.movie.edu IN A +
Jan 13 18:32:42 toystory named[13976]: client 192.253.253.2#1702:
           query: 4.253.253.192.in-addr.arpa IN PTR +

These messages include the IP address of the host that made the query as well as the query itself. On a BIND 8.2.1 or later nameserver, recursive queries are marked with XX+ instead of XX. A BIND 9 nameserver marks recursive queries with a + and non-recursive queries with a - character. BIND 8.4.3 and later and 9.3.0 and later even mark EDNS0 queries and TSIG-signed queries with E and S, respectively. (We’ll talk about EDNS0 in Chapter 10 and TSIG in Chapter 11.)
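If you do log queries, you will probably want to summarize them rather than read them. The sketch below pulls the fields out of a BIND 9 query message. It assumes the format of the sample messages above, with each message joined onto a single line; other BIND releases may format the message differently, and the names QUERY_RE and parse_query are ours.

```python
import re

# Matches the BIND 9 sample format shown above: client ADDR#PORT: query: NAME CLASS TYPE FLAGS
QUERY_RE = re.compile(
    r"client (?P<address>[^#\s]+)#(?P<port>\d+): "
    r"query: (?P<name>\S+) (?P<qclass>\S+) (?P<qtype>\S+) (?P<flags>\S+)"
)


def parse_query(line):
    """Return a dict of query fields, or None if the line is not a query message."""
    match = QUERY_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["port"] = int(fields["port"])
    # A leading + marks a recursive query, a leading - a non-recursive one.
    fields["recursive"] = fields["flags"].startswith("+")
    return fields


sample = (
    "Jan 13 18:32:25 toystory named[13976]: client 192.253.253.2#1702: "
    "query: carrie.movie.edu IN A +"
)
parsed = parse_query(sample)

if __name__ == "__main__":
    print(parsed)
```

Feeding each parsed record into a counter keyed on address or name is one way to see who is asking for what.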

Make sure you have lots of disk space if you log all the queries to a busy nameserver. (On a running server, you can toggle query logging on and off with the querylog command.)

Starting with BIND 8.1.2, you might see this set of syslog messages:

May 19 11:06:08 named[21160]: bind(dfd=20, [10.0.0.1].53):
                Address already in use
May 19 11:06:08 named[21160]: deleting interface [10.0.0.1].53
May 19 11:06:08 named[21160]: bind(dfd=20, [127.0.0.1].53):
                Address already in use
May 19 11:06:08 named[21160]: deleting interface [127.0.0.1].53
May 19 11:06:08 named[21160]: not listening on any interfaces
May 19 11:06:08 named[21160]: Forwarding source address
                is [0.0.0.0].1835
May 19 11:06:08 named[21161]: Ready to answer queries.

On BIND 9 nameservers, that looks like:

Jul 27 17:15:58 toystory named[7357]: listening on IPv4 interface lo, 127.0.0.1#53
Jul 27 17:15:58 toystory named[7357]: binding TCP socket: address in use
Jul 27 17:15:58 toystory named[7357]: listening on IPv4 interface eth0,
         206.168.194.122#53
Jul 27 17:15:58 toystory named[7357]: binding TCP socket: address in use
Jul 27 17:15:58 toystory named[7357]: listening on IPv4 interface eth1,
         206.168.194.123#53
Jul 27 17:15:58 toystory named[7357]: binding TCP socket: address in use
Jul 27 17:15:58 toystory named[7357]: couldn't add command channel
         0.0.0.0#953: address in use

What has happened is that you had a nameserver running, and you started up a second nameserver without killing the first one. Contrary to what you might expect, the second nameserver continues to run; it just isn’t listening on any interfaces.

Understanding the BIND Statistics

Periodically, you should look over the statistics on some of your nameservers, if only to see how busy they are. We’ll now show you an example of the nameserver statistics and discuss what each line means. Nameservers handle many queries and responses during normal operation, so first we need to show you what a typical exchange might look like.

Reading the explanations for the statistics is hard without a mental picture of what goes on during a lookup. To help you understand the nameserver’s statistics, Figure 7-2 shows what might happen when an application tries to look up a domain name. The application, FTP, queries a local nameserver. The local nameserver had previously looked up data in this zone and knows where the remote nameservers are. It queries each of the remote nameservers—one of them twice—trying to find the answer. In the meantime, the application times out and sends yet another query, asking for the same information.

Figure 7-2. Example query/response exchange

Keep in mind that even though a nameserver has sent a query to a remote nameserver, the remote nameserver may not receive the query right away. The query might be delayed or lost by the underlying network, or perhaps the remote nameserver host might be busy with another application.

Notice that a BIND nameserver is able to detect duplicate queries only while it is still trying to answer the original query. The local nameserver detects the duplicate query from the application because the local nameserver is still working on it. But remote nameserver 1 does not detect the duplicate query from the local nameserver because it answered the previous query. After the local nameserver receives the first response from remote nameserver 1, all other responses are discarded as duplicates. This dialog required the following exchanges:

Exchange                                    Number
Application to local nameserver             2 queries
Local nameserver to application             1 response
Local nameserver to remote nameserver 1     2 queries
Remote nameserver 1 to local nameserver     2 responses
Local nameserver to remote nameserver 2     1 query
Remote nameserver 2 to local nameserver     1 response
Local nameserver to remote nameserver 3     1 query
Remote nameserver 3 to local nameserver     0 responses

These exchanges would make the following contributions to the local nameserver’s statistics:

Statistic                 Cause
2 queries received        From the application on the local host
1 duplicate query         From the application on the local host
1 answer sent             To the application on the local host
3 responses received      From remote nameservers
2 duplicate responses     From remote nameservers
2 A queries               Queries for address information

In our example, the local nameserver received queries only from an application, yet it sent queries to remote nameservers. Normally, the local nameserver would also receive queries from remote nameservers (that is, in addition to asking remote servers for information it needs to know, the local server would also be asked by remote servers for information they need to know), but we didn’t show any remote queries for the sake of simplicity.
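To see how the counts above fall out of the exchange, it can help to replay the events the local nameserver sees. The following is a toy model, not BIND’s actual bookkeeping: it tracks only what arrives at the local nameserver, treats a query as a duplicate if the same question is still pending, and treats a response as a duplicate if the question it answers is no longer pending. The domain name is a hypothetical stand-in for whatever the application looked up.

```python
from collections import Counter


def tally(events):
    """Replay inbound events and count them the way the exchange is described."""
    stats = Counter()
    pending = set()  # questions the local nameserver is still working on
    for kind, question in events:
        if kind == "query":
            stats["queries received"] += 1
            stats[question[1] + " queries"] += 1
            if question in pending:
                stats["duplicate queries"] += 1
            pending.add(question)
        elif kind == "response":
            stats["responses received"] += 1
            if question in pending:
                # First usable response: answer the application.
                pending.discard(question)
                stats["answers sent"] += 1
            else:
                stats["duplicate responses"] += 1
    return stats


question = ("www.example.com", "A")  # hypothetical name and type
events = [
    ("query", question),     # application asks
    ("query", question),     # application times out and asks again
    ("response", question),  # remote nameserver 1 answers
    ("response", question),  # remote nameserver 1 answers the second query
    ("response", question),  # remote nameserver 2 answers
]
stats = tally(events)

if __name__ == "__main__":
    for label, count in stats.items():
        print(f"{count} {label}")
```

Remote nameserver 3 never responds, so it contributes nothing to the inbound counts.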

BIND 8 statistics

Now that you’ve seen a typical exchange between applications and nameservers, as well as the statistics it generated, let’s go over a more extensive example of the statistics. To get the statistics from your BIND 8 nameserver, use ndc:

# ndc stats

Wait a few seconds, then look at the file named.stats in the nameserver’s working directory. If the statistics are not dumped to this file, your server may not have been compiled with STATS defined and, thus, may not be collecting statistics. Following are the statistics from one of Paul Vixie’s BIND 4.9.3 nameservers. BIND 8 nameservers have all of the same items listed here except for RNotNsQ, and the items are arranged in a different order. BIND 9 nameservers, as of 9.1.0, keep an entirely different set of statistics, which we’ll show you in the next section.

+++ Statistics Dump +++ (800708260) Wed May 17 03:57:40 1995
746683    time since boot (secs)
392768    time since reset (secs)
14        Unknown query types
268459    A queries
3044      NS queries
5680      CNAME queries
11364     SOA queries
1008934   PTR queries
44        HINFO queries
680367    MX queries
2369      TXT queries
40        NSAP queries
27        AXFR queries
8336      ANY queries
++ Name Server Statistics ++
(Legend)
      RQ    RR    RIQ   RNXD    RFwdQ
      RFwdR RDupQ RDupR RFail   RFErr
      RErr  RTCP  RAXFR RLame   ROpts
      SSysQ SAns  SFwdQ SFwdR   SDupQ
      SFail SFErr SErr  RNotNsQ SNaAns
      SNXD
(Global)
   1992938 112600 0 19144 63462 60527 194 347 3420 0  5 2235 27 35289 0
   14886 1927930 63462 60527 107169  10025 119 0 1785426 805592  35863
[15.255.72.20]
   485 0 0 0 0  0 0 0 0 0  0 0 0 0 0  0 485 0 0 0  0 0 0 0 485  0
[15.255.152.2]
   441 137 0 1 2 108 0 0 0 0  0 0 0 0 0  13 439 85 7 84  0 0 0 0 431  0
[15.255.152.4]
   770 89 0 1 4  69 0 0 0 0  0 0 0 0 0  14 766 68 5 7  0 0 0 0 755  0
...  <lots of entries deleted>

If your BIND 8 nameserver doesn’t include any per-IP address sections after “(Global),” you need to set host-statistics to yes in your options statement if you want to track per-host statistics:

options {
    host-statistics yes;
};

However, keeping host statistics requires a fair amount of memory, so you may not want to do it routinely unless you’re trying to build a profile of your nameserver’s activity.

Let’s look at these statistics one line at a time.

+++ Statistics Dump +++ (800708260) Wed May 17 03:57:40 1995

This is when this section of the statistics was dumped. The number in parentheses (800708260) is the number of seconds since the Unix epoch, which was January 1, 1970. Mercifully, BIND converts that into a real date and time for you: May 17, 1995, 3:57:40 a.m.
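If you ever need to do the conversion yourself, for example when comparing dumps from servers in different time zones, the standard library handles it. In this sketch the seven-hour offset is our inference from the dump itself (the epoch value works out to 10:57:40 UTC, while the dump prints 03:57:40); it is not something the statistics file states.

```python
from datetime import datetime, timedelta, timezone

dump_epoch = 800708260  # the number in parentheses on the dump line

# Interpret the epoch value in UTC first, which is unambiguous.
dump_utc = datetime.fromtimestamp(dump_epoch, tz=timezone.utc)

# Assumed local offset of the server that wrote the dump (UTC-7).
server_zone = timezone(timedelta(hours=-7))
dump_local = dump_utc.astimezone(server_zone)

if __name__ == "__main__":
    print("UTC:  ", dump_utc.isoformat())
    print("local:", dump_local.isoformat())
```

Working in UTC and converting only for display avoids surprises when the script runs on a host in a different time zone from the nameserver.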

746683    time since boot (secs)

This is how long the local nameserver has been running. To convert to days, divide by 86,400 (60 × 60 × 24, the number of seconds in a day). This server has been running for about 8.6 days.

392768    time since reset (secs)

This is how long the local nameserver has run since the last reload. You’ll probably see this number differ from the time since boot only if the server is a primary nameserver for one or more zones. Nameservers that are slaves for a zone automatically pick up new data with zone transfers and are not usually reloaded. Since this server has been reset, it is probably the primary nameserver for some zone.
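Both counters are measured back from the moment of the dump, so together with the dump’s epoch timestamp they tell you when the server started and when it was last reloaded. A short sketch of that arithmetic, using the three numbers from the dump above (the variable names are ours):

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86,400


def days(seconds):
    """Convert a count of seconds to days."""
    return seconds / SECONDS_PER_DAY


dump_epoch = 800708260  # from the "+++ Statistics Dump +++" line
since_boot = 746683     # "time since boot (secs)"
since_reset = 392768    # "time since reset (secs)"

# Subtracting each counter from the dump time gives the event's epoch time.
boot_epoch = dump_epoch - since_boot
reset_epoch = dump_epoch - since_reset

if __name__ == "__main__":
    print(f"running for {days(since_boot):.1f} days, started at epoch {boot_epoch}")
    print(f"last reload {days(since_reset):.1f} days ago, at epoch {reset_epoch}")
```

For this dump, the server had been up between eight and nine days and had last been reloaded about four and a half days before the dump.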

14        Unknown query types

This nameserver received 14 queries for data of a type it didn’t recognize. Either someone is experimenting with new types, there is a defective implementation somewhere, or Paul needs to upgrade his nameserver.

268459    A queries

There have been 268,459 address lookups. Address queries are normally the most common type of query.

3044      NS queries

There have been 3,044 nameserver queries. Internally, nameservers generate NS queries when they are trying to look up servers for the root zone. Externally, applications such as dig and nslookup can also be used to look up NS records.

5680      CNAME queries

Some versions of sendmail make CNAME queries in order to canonicalize a mail address (replace an alias with the canonical name). Other versions of sendmail use ANY queries instead (we’ll get to those shortly). Otherwise, the CNAME lookups are most likely from dig or nslookup.

11364     SOA queries

SOA queries are made by slave nameservers to check if their zone data is current. If the data is not current, an AXFR query follows to cause the zone transfer. Since this set of statistics does show AXFR queries, we can conclude that slave nameservers load zone data from this server.

1008934   PTR queries

The pointer queries map addresses to names. Many kinds of software look up IP addresses: inetd, rlogind, rshd, network management software, and network-tracing software.

44        HINFO queries

The host-information queries are most likely from someone interactively looking up HINFO records.

680367    MX queries

Mailers such as sendmail make mail exchanger queries as part of the normal electronic mail delivery process.

2369      TXT queries

Some application must be making text queries for this number to be this large. It might be a tool like Harvest, which is an information search-and-retrieval technology developed at the University of Colorado.

40        NSAP queries

This is a relatively new record type used to map domain names to OSI Network Service Access Point addresses.

27        AXFR queries

Slave nameservers make AXFR queries to initiate zone transfers.

8336      ANY queries

ANY queries request records of any type for a name. sendmail is the most common program to use this query type. Since sendmail looks up CNAME, MX, and address records for a mail destination, it will make a query for ANY record type so that all the resource records are cached right away at the local nameserver.
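Raw counts are easier to interpret as shares of the total. The sketch below parses the query-type lines from the dump above and reports each type’s percentage. It assumes the “count, whitespace, label” layout shown in this dump, and the function name parse_query_counts is ours.

```python
DUMP = """\
14        Unknown query types
268459    A queries
3044      NS queries
5680      CNAME queries
11364     SOA queries
1008934   PTR queries
44        HINFO queries
680367    MX queries
2369      TXT queries
40        NSAP queries
27        AXFR queries
8336      ANY queries
"""


def parse_query_counts(dump_text):
    """Parse 'COUNT  LABEL' lines, keeping only the per-type query counters."""
    counts = {}
    for line in dump_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        label = parts[1].strip()
        if label.endswith("queries") or label.endswith("query types"):
            counts[label] = int(parts[0])
    return counts


counts = parse_query_counts(DUMP)
total = sum(counts.values())
busiest = max(counts, key=counts.get)

if __name__ == "__main__":
    for label, count in sorted(counts.items(), key=lambda item: -item[1]):
        print(f"{100 * count / total:5.1f}%  {label}")
```

On this server, PTR queries make up about half of the total, with A and MX queries accounting for most of the rest.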

The rest of the statistics are kept on a per-host basis. If you look over the list of hosts your nameserver has exchanged packets with, you’ll find out just how garrulous your nameserver is: you’ll see hundreds or even thousands of hosts in the list. While the size of the list is impressive, the statistics themselves are only somewhat interesting. We’ll explain all the statistics, even the ones with zero counts, although you’ll probably find only a handful of the statistics useful. To make the statistics easier to read, you’ll need a tool to expand the statistics because the output format is rather compact. We wrote a tool called bstat to do just this. Here’s what its output looks like:

hpcvsop.cv.hp.com
        485 queries received
        485 responses sent to this name server
        485 queries answered from our cache
relay.hp.com
        441 queries received
        137 responses received
          1 negative response received
          2 queries for data not in our cache or authoritative data
        108 responses from this name server passed to the querier
         13 system queries sent to this name server
        439 responses sent to this name server
         85 queries sent to this name server
          7 responses from other name servers sent to this name server
         84 duplicate queries sent to this name server
        431 queries answered from our cache
hp.com
        770 queries received
         89 responses received
          1 negative response received
          4 queries for data not in our cache or authoritative data
         69 responses from this name server passed to the querier
         14 system queries sent to this name server
        766 responses sent to this name server
         68 queries sent to this name server
          5 responses from other name servers sent to this name server
          7 duplicate queries sent to this name server
        755 queries answered from our cache
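The heart of such a tool is simply pairing each column name in the legend with the corresponding count on a host’s line. The sketch below shows that pairing for the 15.255.152.2 entry from the dump above. It is our own illustration, not bstat’s implementation, and the function names are ours.

```python
# Column names in the order given by the (Legend) section of the dump.
LEGEND = """
    RQ    RR    RIQ   RNXD    RFwdQ
    RFwdR RDupQ RDupR RFail   RFErr
    RErr  RTCP  RAXFR RLame   ROpts
    SSysQ SAns  SFwdQ SFwdR   SDupQ
    SFail SFErr SErr  RNotNsQ SNaAns
    SNXD
""".split()


def expand_host_line(counts_line):
    """Pair each legend column with the matching count from a host's line."""
    counts = [int(token) for token in counts_line.split()]
    if len(counts) != len(LEGEND):
        raise ValueError(f"expected {len(LEGEND)} counts, got {len(counts)}")
    return dict(zip(LEGEND, counts))


def nonzero(stats):
    """Keep only the columns with a nonzero count, preserving legend order."""
    return {name: count for name, count in stats.items() if count}


relay = expand_host_line(
    "441 137 0 1 2 108 0 0 0 0  0 0 0 0 0  13 439 85 7 84  0 0 0 0 431  0"
)

if __name__ == "__main__":
    for name, count in nonzero(relay).items():
        print(f"{count:8d} {name}")
```

Dropping the zero counts leaves eleven columns for this host, the same number of lines shown for relay.hp.com in the expanded output.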

In the raw statistics (not the bstat output), each host’s IP address is followed by a table of counts. The column heading for this table is the cryptic legend at the beginning. The legend is broken into several lines, but the host statistics are all on a single line. In the following section, we’ll explain briefly what each column means as we look at the statistics for one of the hosts this nameserver conversed with—15.255.152.2 (relay.hp.com). For the sake of our explanation, we’ll first show you the column heading from the legend (e.g., RQ) followed by the count for this column for relay.

RQ 441

RQ is the count of queries received from relay. These queries were made because relay needed information about a zone served by this nameserver.

RR 137

RR is the count of responses received from relay. These are responses to queries made from this nameserver. Don’t try to correlate this number to RQ, because they are not related. RQ counts questions asked by relay; RR counts answers that relay gave to this nameserver (because this nameserver asked relay for information).

RIQ 0

RIQ is the count of inverse queries received from relay. Inverse queries were originally intended to map addresses to names, but that function is now handled by PTR records. Older versions of nslookup use an inverse query on startup, so you may see a nonzero RIQ count.

RNXD 1

RNXD is the count of “no such domain” answers received from relay.

RFwdQ 2

RFwdQ is the count of queries received (RQ) from relay that need further processing before they can be answered. This count is much higher for hosts that configure their resolver (with resolv.conf) to send all queries to your nameserver.

RFwdR 108

RFwdR is the count of responses received (RR) from relay that answer the original query and are passed back to the application that made the query.

RDupQ 0

RDupQ is the count of duplicate queries from relay. You’ll see duplicates only when the resolver is configured (with resolv.conf) to query this nameserver.

RDupR 0

RDupR is the count of duplicate responses from relay. A response is a duplicate when the nameserver can no longer find the original query in its list of pending queries that caused the response.

RFail 0

RFail is the count of SERVFAIL responses from relay. A SERVFAIL response indicates some sort of server failure. Server failure responses often occur because the remote server reads a zone datafile and finds a syntax error. Any query for data in the zone with the erroneous zone datafile results in a server failure answer from the remote nameserver. This is probably the most common cause of SERVFAIL responses. Server failure responses also occur when the remote nameserver tries to allocate more memory and can’t, or when the remote slave nameserver’s zone data expires.

RFErr 0

RFErr is the count of FORMERR responses from relay. FORMERR means that the remote nameserver said the local nameserver’s query had a format error.

RErr 0

RErr is the count of error responses from relay that are neither SERVFAIL nor FORMERR.

RTCP 0

RTCP is the count of queries received on TCP connections from relay. (Most queries use UDP.)

RAXFR 0

RAXFR is the count of zone transfers initiated by relay. A count of 0 indicates that relay is not a slave for any zones served by this nameserver.

RLame 0

RLame is the count of lame delegations received. If this count is not 0, it means that some zone is delegated to the nameserver at this IP address, and the nameserver is not authoritative for the zone.

ROpts 0

ROpts is the count of packets received with IP options set.

SSysQ 13

SSysQ is the count of system queries sent to relay. System queries are queries that are initiated by the local nameserver. Most system queries will go to root nameservers because system queries are used to keep the list of root nameservers up to date. But system queries are also used to find the address of a nameserver if the address record timed out before the nameserver record did. Since relay is not a root nameserver, these queries must have been sent for the latter reason.

SAns 439

SAns is the count of answers sent to relay. This nameserver answered 439 out of the 441 (RQ) queries relay sent to it. I wonder what happened to the two queries it didn’t answer . . .

SFwdQ 85

SFwdQ is the count of queries that are sent (forwarded) to relay when the answer is not in this nameserver’s zone data or cache.

SFwdR 7

SFwdR is the count of responses from a nameserver that are sent (forwarded) to relay.

SDupQ 84

SDupQ is the count of duplicate queries sent to relay. It’s not as bad as it looks, though. The duplicate count is incremented if the query is sent to any other nameserver first. So relay might have answered all the queries it received the first time it received them, and the query still counted as a duplicate because it was sent to some other nameserver before relay.

SFail 0

SFail is the count of SERVFAIL responses sent to relay.

SFErr 0

SFErr is the count of FORMERR responses sent to relay.

SErr 0

SErr is the count of sendto() system calls that failed when the destination was relay.

RNotNsQ 0

RNotNsQ is the count of queries received that are not from port 53, the nameserver port. Prior to BIND 8, all nameserver queries came from port 53. Any queries from ports other than 53 came from a resolver. BIND 8 nameservers query from ports other than 53, however, which makes this statistic useless since you can no longer distinguish resolver queries from nameserver queries. Hence, BIND 8 dropped RNotNsQ from its statistics.

SNaAns 431

SNaAns is the count of nonauthoritative answers sent to relay. Out of the 439 answers (SAns) sent to relay, 431 are from cached data.

SNXD 0

SNXD is the count of “no such domain” answers sent to relay.
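Some of these per-host counters are easier to interpret as differences and ratios. The following Python sketch works through the figures quoted above for relay (the RQ value of 441 comes from the SAns discussion). The summarize function and the dictionary layout are our own illustration, not anything BIND provides, and the "sent to relay first" figure rests on the assumption that SDupQ counts a subset of SFwdQ.

```python
# Per-host counters for relay, copied from the walk-through above.
relay = {"RQ": 441, "SAns": 439, "SNaAns": 431, "SFwdQ": 85, "SDupQ": 84}


def summarize(counters):
    """Derive a few figures from one host's BIND 8 counters (illustrative helper)."""
    answered = counters["SAns"]
    return {
        # Queries received from the host that were never answered.
        "unanswered": counters["RQ"] - answered,
        # Share of answers that were nonauthoritative (served from cache).
        "nonauth_share": counters["SNaAns"] / answered if answered else 0.0,
        # Forwarded queries that went to this host before any other server
        # (assumes every SDupQ query is also counted in SFwdQ).
        "sent_here_first": counters["SFwdQ"] - counters["SDupQ"],
    }


summary = summarize(relay)
print(summary["unanswered"])               # 2
print(round(summary["nonauth_share"], 3))  # 0.982
print(summary["sent_here_first"])          # 1
```

In other words, about 98 percent of the answers sent to relay came from the cache, and only one of the 85 forwarded queries was sent to relay before being tried elsewhere.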

BIND 9 statistics

BIND 9.1.0 is the first version of BIND 9 to keep statistics. You use rndc to induce BIND 9 to dump its statistics:

% rndc stats

The nameserver dumps statistics, as a BIND 8 nameserver would, to a file called named.stats in its working directory. However, those statistics look completely different from BIND 8’s. Here are the contents of the stats file from one of our BIND 9 nameservers:

+++ Statistics Dump +++ (979436130)
success 9
referral 0
nxrrset 0
nxdomain 1
recursion 1
failure 1
--- Statistics Dump --- (979436130)
+++ Statistics Dump +++ (979584113)
success 651
referral 10
nxrrset 11
nxdomain 17
recursion 296
failure 217
--- Statistics Dump --- (979584113)

The nameserver appends a new statistics block (the section between “+++ Statistics Dump +++” and “--- Statistics Dump ---”) each time it receives a stats command. The number in parentheses (979436130) is, as in earlier stats files, the number of seconds since the Unix epoch. Unfortunately, BIND doesn’t convert the value for you, but you can use the date command to convert it to something more readable. For example, to convert 979584113 seconds since the Unix epoch (January 1, 1970), you can use:

% date -d '1970-01-01 UTC 979584113 sec'
Mon Jan 15 11:41:53 MST 2001
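If your version of date doesn’t support the -d option, the same conversion can be done in a few lines of Python. This is only a sketch: the epoch_to_utc function name is ours, and it reports the time in UTC rather than in the local time zone.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format seconds since the Unix epoch as a UTC timestamp string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%a %b %d %H:%M:%S UTC %Y")


print(epoch_to_utc(979584113))  # Mon Jan 15 18:41:53 UTC 2001
```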

Let’s now go through these statistics one line at a time:

success 651

This is the number of successful queries the nameserver handled. Successful queries are those that didn’t result in referrals or errors.

referral 10

This is the number of queries the nameserver handled that resulted in referrals.

nxrrset 11

This is the number of queries the nameserver handled that resulted in responses saying that the type of record the querier requested didn’t exist for the domain name it specified.

nxdomain 17

This is the number of queries the nameserver handled that resulted in responses saying that the domain name the querier specified didn’t exist.

recursion 296

This is the number of queries the nameserver received that required recursive processing to answer.

failure 217

This is the number of queries the nameserver received that resulted in errors other than those covered by nxrrset and nxdomain.

These are obviously not nearly as many statistics as a BIND 8 nameserver keeps, but future versions of BIND 9 will probably record more.
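Because the nameserver keeps appending blocks to the same file, it can be handy to read the blocks programmatically and compare them. Here is a minimal Python sketch that parses the format shown above and reports the change between the first and last dumps. The parse_stats function is our own, it handles only the simple "name value" lines in this sample, and the subtraction assumes the counters are cumulative between the two dumps (a restart in between would invalidate it).

```python
import re

# Matches the opening (+++) and closing (---) marker lines of one block.
MARKER = re.compile(r"^(\+\+\+|---) Statistics Dump \1 \((\d+)\)$")


def parse_stats(text):
    """Return a list of (timestamp, counters) tuples, one per dump block."""
    dumps = []
    counters = None
    for line in text.splitlines():
        line = line.strip()
        match = MARKER.match(line)
        if match and match.group(1) == "+++":
            counters = {}                      # a new block starts
        elif match and counters is not None:
            dumps.append((int(match.group(2)), counters))
            counters = None                    # the block is closed
        elif counters is not None and line:
            name, value = line.split()         # e.g. "success 651"
            counters[name] = int(value)
    return dumps


SAMPLE = """\
+++ Statistics Dump +++ (979436130)
success 9
referral 0
nxrrset 0
nxdomain 1
recursion 1
failure 1
--- Statistics Dump --- (979436130)
+++ Statistics Dump +++ (979584113)
success 651
referral 10
nxrrset 11
nxdomain 17
recursion 296
failure 217
--- Statistics Dump --- (979584113)
"""

dumps = parse_stats(SAMPLE)
(first_time, first), (last_time, last) = dumps[0], dumps[-1]
elapsed = last_time - first_time
delta = {name: last[name] - first[name] for name in last}
print(elapsed)           # 147983
print(delta["success"])  # 642
print(delta["failure"])  # 216
```

To run this against a real file, you would pass the contents of named.stats to parse_stats instead of SAMPLE.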

Using the BIND statistics

Is your nameserver “healthy”? How do you know what a “healthy” operation looks like? From a single snapshot, you can’t really say whether a nameserver is healthy. You have to watch the statistics generated by your server over a period of time to get a feel for what sorts of numbers are normal for your configuration. These numbers will vary markedly among nameservers depending on the mix of applications generating lookups, the type of server (primary, slave, caching-only), and the level of the zones in the namespace it is serving.

One thing to watch for in the statistics is how many queries per second your nameserver receives. Take the number of queries received and divide by the number of seconds the nameserver has been running. Paul’s BIND 4.9.3 nameserver received 1,992,938 queries in 746,683 seconds, or approximately 2.7 queries per second—not a very busy server.[*] If the number you come up with for your nameserver seems out of line, look at which hosts are making all the queries and decide if it makes sense for them to be making all those queries. At some point, you may decide that you need more nameservers to handle the load. We’ll cover that situation in the next chapter.
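The arithmetic is simple enough to do by hand, but here it is as a short Python sketch using the figures from Paul’s nameserver. The queries_per_second function name is our own.

```python
def queries_per_second(queries, uptime_seconds):
    """Average query rate over the nameserver's uptime."""
    if uptime_seconds <= 0:
        raise ValueError("uptime must be positive")
    return queries / uptime_seconds


rate = queries_per_second(1_992_938, 746_683)
print(f"{rate:.2f}")  # 2.67
```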



[*] See RFCs 2085 and 2104 for more information on HMAC-MD5.

[*] In case you’ve forgotten how to get h2n, see the Preface.

[*] Three if you count $TTL, which BIND 8.2 and later nameservers support.

[*] Recall that the root nameservers, which run plain vanilla BIND, can handle thousands of queries per second.
