The net_device structure is at the very core of the
network driver layer and deserves a complete description. At a first
reading, however, you can skip this section, because you don’t need a
thorough understanding of the structure to get started. This list
describes all the fields, but more to provide a reference than to be
memorized. The rest of this chapter briefly describes each field as
soon as it is used in the sample code, so you don’t need to keep
referring back to this section.
struct net_device can be conceptually divided into
two parts: visible and invisible. The visible part of the structure
is made up of the fields that can be explicitly assigned in static
net_device structures. All structures in drivers/net/Space.c
are initialized in this way,
without using the tagged syntax for structure initialization. The
remaining fields are used internally by the network code and usually
are not initialized at compilation time, not even by tagged
initialization. Some of the fields are accessed by drivers (for
example, the ones that are assigned at initialization time), while
some shouldn’t be touched.
The first part of struct net_device is composed of
the following fields, in this order:
char name[IFNAMSIZ];
The name of the device. If the name contains a %d
format string, the first available device name with the given base is
used; assigned numbers start at zero.
unsigned long rmem_end;
unsigned long rmem_start;
unsigned long mem_end;
unsigned long mem_start;
Device memory information. These fields hold the beginning and ending
addresses of the shared memory used by the device. If the device has
different receive and transmit memories, the mem fields are used for
transmit memory and the rmem fields for receive memory. mem_start and
mem_end can be specified on the kernel command line
at system boot, and their values are retrieved by
ifconfig. The rmem fields are never referenced outside of the driver
itself. By convention, the end fields are set so that end - start
is the amount of available on-board memory.
unsigned long base_addr;
The I/O base address of the network interface. This field, like the
previous ones, is assigned during device probe. The
ifconfig command can be used to display or
modify the current value. The base_addr can be
explicitly assigned on the kernel command line at system boot or at
load time. Like the memory fields shown previously, this field is not
used by the kernel.
unsigned char irq;
The assigned interrupt number. The value of dev->irq is printed by
ifconfig when interfaces are listed. This
value can usually be set at boot or load time and modified later using
ifconfig.
unsigned char if_port;
Which port is in use on multiport devices. This field is used, for
example, with devices that support both coaxial (IF_PORT_10BASE2) and
twisted-pair (IF_PORT_10BASET) Ethernet connections. The full
set of known port types is defined in <linux/netdevice.h>.
unsigned char dma;
The DMA channel allocated by the device. The field makes sense only with some peripheral buses, like ISA. It is not used outside of the device driver itself, except for informational purposes (in ifconfig).
unsigned long state;
Device state. The field includes several flags. Drivers do not normally manipulate these flags directly; instead, a set of utility functions has been provided. These functions will be discussed shortly when we get into driver operations.
struct net_device *next;
Pointer to the next device in the global linked list. This field shouldn’t be touched by the driver.
int (*init)(struct net_device *dev);
The initialization function, described earlier.
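Two points from the list above lend themselves to a small user-space sketch: positional (untagged) initialization, where values must follow the field order, and the expansion of a %d in the device name to the first free number. Everything here is a hypothetical stand-in, not kernel code: struct mock_net_device copies only a few of the visible fields, MOCK_IFNAMSIZ and the values 0x300 and 5 are illustrative, and mock_alloc_name is an invented helper that only handles a %d at the end of the name. The kernel's own allocation routine differs in detail.

```c
#include <stdio.h>
#include <string.h>

#define MOCK_IFNAMSIZ 16  /* illustrative stand-in for IFNAMSIZ */

/* Hypothetical cut-down copy of the visible head, in the order listed above. */
struct mock_net_device {
    char name[MOCK_IFNAMSIZ];
    unsigned long rmem_end, rmem_start, mem_end, mem_start;
    unsigned long base_addr;
    unsigned char irq, if_port, dma;
};

/* Positional (untagged) initialization: values must follow the field order. */
struct mock_net_device mock_dev = {
    "eth%d",     /* name: base name plus a %d placeholder */
    0, 0, 0, 0,  /* rmem_end, rmem_start, mem_end, mem_start */
    0x300,       /* base_addr (illustrative value) */
    5,           /* irq (illustrative value) */
    0, 0         /* if_port, dma */
};

/*
 * Expand the %d in dev->name to the lowest unit number, starting at zero,
 * whose resulting name is not already in taken[]. Returns the unit chosen,
 * or -1 if the name has no %d or no free name exists below max_units.
 * Simplification: any text following the %d is ignored.
 */
int mock_alloc_name(struct mock_net_device *dev,
                    const char *const taken[], int ntaken, int max_units)
{
    char base[MOCK_IFNAMSIZ];
    char candidate[MOCK_IFNAMSIZ + 16];
    const char *marker = strstr(dev->name, "%d");
    size_t base_len;
    int unit, i;

    if (marker == NULL)
        return -1;
    base_len = (size_t)(marker - dev->name);
    memcpy(base, dev->name, base_len);
    base[base_len] = '\0';

    for (unit = 0; unit < max_units; unit++) {
        int in_use = 0;
        int n = snprintf(candidate, sizeof(candidate), "%s%d", base, unit);

        if (n < 0 || n >= MOCK_IFNAMSIZ)
            return -1;              /* expanded name would not fit */
        for (i = 0; i < ntaken; i++)
            if (strcmp(candidate, taken[i]) == 0)
                in_use = 1;
        if (!in_use) {
            memcpy(dev->name, candidate, (size_t)n + 1);
            return unit;
        }
    }
    return -1;
}
```

With eth0 and eth1 already taken, calling mock_alloc_name on mock_dev picks unit 2 and rewrites the name accordingly. The positional initializer is fragile by design: reordering the fields silently reassigns the values, which is why the tagged syntax is generally preferred where it is available.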
The net_device structure includes many additional
fields, which are usually assigned at device initialization. Some of
these fields convey information about the interface, while some exist
only for the benefit of the driver (i.e., they are not used by the
kernel); other fields, most notably the device methods, are part
of the kernel-driver interface.
We will list the three groups separately, independent of the actual order of the fields, which is not significant.
Most of the information about the interface is correctly set up by the
function ether_setup. Ethernet cards can rely on
this general-purpose function for most of these fields, but the
flags and dev_addr fields are
device specific and must be explicitly assigned at initialization
time.
Some non-Ethernet interfaces can use helper functions similar to
ether_setup. drivers/net/net_init.c
exports a number of such functions, including the following:
void ltalk_setup(struct net_device *dev);
Sets up the fields for a LocalTalk device.
void fc_setup(struct net_device *dev);
Initializes the fields for Fibre Channel devices.
void fddi_setup(struct net_device *dev);
Configures an interface for a Fiber Distributed Data Interface (FDDI) network.
void hippi_setup(struct net_device *dev);
Prepares fields for a High-Performance Parallel Interface (HIPPI) high-speed interconnect driver.
void tr_configure(struct net_device *dev);
Handles setup for token ring network interfaces. Note that the 2.4 kernel also exports a function tr_setup, which, interestingly, does nothing at all.
Most devices will be covered by one of these classes. If yours is something radically new and different, however, you will need to assign the following fields by hand.
unsigned short hard_header_len;
The hardware header length, that is, the number of octets that lead
the transmitted packet before the IP header, or other protocol
information. The value of hard_header_len is 14 (ETH_HLEN) for
Ethernet interfaces.
unsigned mtu;
The maximum transfer unit (MTU). This field is used by the network
layer to drive packet transmission. Ethernet has an MTU of 1500
octets (ETH_DATA_LEN).
unsigned long tx_queue_len;
The maximum number of frames that can be queued on the device’s transmission queue. This value is set to 100 by ether_setup, but you can change it. For example, plip uses 10 to avoid wasting system memory (plip has a lower throughput than a real Ethernet interface).
unsigned short type;
The hardware type of the interface. The type field
is used by ARP to determine what kind of hardware address the
interface supports. The proper value for Ethernet interfaces is
ARPHRD_ETHER, and that is the value set by
ether_setup. The recognized types are defined in <linux/if_arp.h>.
unsigned char addr_len;
unsigned char broadcast[MAX_ADDR_LEN];
unsigned char dev_addr[MAX_ADDR_LEN];
Hardware (MAC) address length and device hardware addresses. The
Ethernet address length is six octets (we are referring to the
hardware ID of the interface board), and the broadcast address is made
up of six 0xff octets;
ether_setup arranges for these values to be
correct. The device address, on the other hand, must be read from the
interface board in a device-specific way, and the driver should copy
it to dev_addr. The hardware address is used to
generate correct Ethernet headers before the packet is handed over to
the driver for transmission. The snull
device doesn’t use a physical interface, and it invents its own
hardware address.
unsigned short flags;
Interface flags, detailed next.
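The defaults described in this list can be summarized in a user-space sketch. This is not the kernel's ether_setup: struct mock_net_device, mock_ether_setup, and mock_driver_init are hypothetical names, MOCK_MAX_ADDR_LEN is an illustrative size, and the numeric values given for ARPHRD_ETHER and the two flags are assumptions to be checked against your kernel headers. The sketch only shows the division of labor: generic Ethernet values first, device-specific ones afterward.

```c
#include <string.h>

#define MOCK_MAX_ADDR_LEN  8       /* illustrative; not the kernel's value */
#define MOCK_ETH_ALEN      6       /* Ethernet address length in octets */
#define MOCK_ETH_HLEN      14      /* Ethernet header length in octets */
#define MOCK_ETH_DATA_LEN  1500    /* Ethernet MTU in octets */
#define MOCK_ARPHRD_ETHER  1       /* assumed numeric value of ARPHRD_ETHER */
#define MOCK_IFF_BROADCAST 0x2     /* assumed numeric value */
#define MOCK_IFF_MULTICAST 0x1000  /* assumed numeric value */

/* Hypothetical subset of the interface-information fields. */
struct mock_net_device {
    unsigned short hard_header_len;
    unsigned mtu;
    unsigned long tx_queue_len;
    unsigned short type;
    unsigned char addr_len;
    unsigned char broadcast[MOCK_MAX_ADDR_LEN];
    unsigned char dev_addr[MOCK_MAX_ADDR_LEN];
    unsigned short flags;
};

/* Fill in the values the text attributes to ether_setup. */
void mock_ether_setup(struct mock_net_device *dev)
{
    dev->hard_header_len = MOCK_ETH_HLEN;
    dev->mtu = MOCK_ETH_DATA_LEN;
    dev->tx_queue_len = 100;
    dev->type = MOCK_ARPHRD_ETHER;
    dev->addr_len = MOCK_ETH_ALEN;
    memset(dev->broadcast, 0xff, MOCK_ETH_ALEN);
    dev->flags = MOCK_IFF_BROADCAST | MOCK_IFF_MULTICAST;
}

/* Driver-side initialization: generic defaults, then device specifics. */
void mock_driver_init(struct mock_net_device *dev,
                      const unsigned char hw_addr[MOCK_ETH_ALEN])
{
    memset(dev, 0, sizeof(*dev));
    mock_ether_setup(dev);
    /* The hardware address is device specific and copied by the driver. */
    memcpy(dev->dev_addr, hw_addr, MOCK_ETH_ALEN);
    /* A slow link may shorten the queue, as plip does. */
    dev->tx_queue_len = 10;
}
```

A driver for a radically different medium would skip the setup helper and assign each of these fields itself.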
The flags field is a bit mask including the
following bit values. The IFF_ prefix stands for
“interface flags.” Some flags are managed by the kernel, and some
are set by the interface at initialization time to assert various
capabilities and other features of the interface. The valid flags,
which are defined in <linux/if.h>, are as follows:
IFF_UP
This flag is read-only for the driver. The kernel turns it on when the interface is active and ready to transfer packets.
IFF_BROADCAST
This flag states that the interface allows broadcasting. Ethernet boards do.
IFF_DEBUG
This marks debug mode. The flag can be used to control the verbosity
of your printk calls or for other debugging
purposes. Although no official driver currently uses this flag, it
can be set and reset by user programs via ioctl,
and your driver can use it. The
misc-progs/netifdebug
program can be used to turn
the flag on and off.
IFF_LOOPBACK
This flag should be set only in the loopback interface. The kernel
checks for IFF_LOOPBACK
instead of hardwiring the
lo
name as a special interface.
IFF_POINTOPOINT
This flag signals that the interface is connected to a point-to-point link. It is set by ifconfig. For example, plip and the PPP driver have it set.
IFF_NOARP
This means that the interface can’t perform ARP. For example, point-to-point interfaces don’t need to run ARP, which would only impose additional traffic without retrieving useful information. snull runs without ARP capabilities, so it sets the flag.
IFF_PROMISC
This flag is set to activate promiscuous operation. By default, Ethernet interfaces use a hardware filter to ensure that they receive broadcast packets and packets directed to that interface’s hardware address only. Packet sniffers such as tcpdump set promiscuous mode on the interface in order to retrieve all packets that travel on the interface’s transmission medium.
IFF_MULTICAST
This flag is set by interfaces that are capable of multicast
transmission. ether_setup sets
IFF_MULTICAST
by default, so if your driver does
not support multicast, it must clear the flag at initialization time.
IFF_ALLMULTI
This flag tells the interface to receive all multicast packets. The
kernel sets it when the host performs multicast routing, only if
IFF_MULTICAST is set. IFF_ALLMULTI is read-only for the interface.
We’ll see the multicast flags used in Section 14.13 later
in this chapter.
IFF_MASTER
IFF_SLAVE
These flags are used by the load equalization code. The interface driver doesn’t need to know about them.
IFF_PORTSEL
IFF_AUTOMEDIA
These flags signal that the device is capable of switching between
multiple media types, for example, unshielded twisted pair (UTP) versus
coaxial Ethernet cables. If IFF_AUTOMEDIA
is set,
the device selects the proper medium automatically.
IFF_DYNAMIC
This flag indicates that the address of this interface can change; it is used with dialup devices.
IFF_RUNNING
This flag indicates that the interface is up and running. It is
mostly present for BSD compatibility; the kernel makes little use of
it. Most network drivers need not worry about IFF_RUNNING.
IFF_NOTRAILERS
This flag is unused in Linux, but it exists for BSD compatibility.
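Because flags is a plain bit mask, a driver adjusts it at initialization time with ordinary bit operations. The following user-space sketch illustrates the cases mentioned above: clearing IFF_MULTICAST when the hardware cannot do multicast, setting IFF_NOARP the way snull does, and testing IFF_DEBUG to control verbosity. The MOCK_ constants and both helper functions are hypothetical; the numeric values are assumed to match <linux/if.h> and should be verified against your headers.

```c
/* Numeric values assumed to match <linux/if.h>; verify before relying on them. */
enum {
    MOCK_IFF_UP        = 0x1,
    MOCK_IFF_BROADCAST = 0x2,
    MOCK_IFF_DEBUG     = 0x4,
    MOCK_IFF_NOARP     = 0x80,
    MOCK_IFF_MULTICAST = 0x1000
};

/*
 * Start from Ethernet-style defaults (broadcast and multicast), then
 * adjust for what this particular interface can and cannot do.
 */
unsigned short mock_init_flags(int supports_multicast, int needs_arp)
{
    unsigned short flags = MOCK_IFF_BROADCAST | MOCK_IFF_MULTICAST;

    if (!supports_multicast)
        flags &= (unsigned short)~MOCK_IFF_MULTICAST;  /* driver must clear it */
    if (!needs_arp)
        flags |= MOCK_IFF_NOARP;                       /* as snull does */
    return flags;
}

/* A driver may consult IFF_DEBUG to decide how chatty to be. */
int mock_debug_enabled(unsigned short flags)
{
    return (flags & MOCK_IFF_DEBUG) != 0;
}
```

Note that the sketch never touches IFF_UP: that bit belongs to the kernel and is read-only for the driver.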
When a program changes IFF_UP, the open or stop device method
is called. When IFF_UP or any other flag is
modified, the set_multicast_list method is
invoked. If the driver needs to perform some action because of a
modification in the flags, it must take that action in
set_multicast_list. For example, when IFF_PROMISC is set or reset,
set_multicast_list must notify the onboard
hardware filter. The responsibilities of this device method are
outlined in Section 14.13.
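The relationship between a flag change and set_multicast_list can be sketched in user space as follows. The structure, the filter modes, and mock_change_flags (which plays the role of the kernel side) are all hypothetical, and the two flag values are assumed to match <linux/if.h>. A real driver would program its receive filter registers where the sketch merely records a mode.

```c
#define MOCK_IFF_PROMISC  0x100  /* assumed numeric value */
#define MOCK_IFF_ALLMULTI 0x200  /* assumed numeric value */

/* Invented receive-filter modes standing in for hardware register settings. */
enum mock_rx_filter {
    MOCK_RX_UNICAST_BROADCAST,  /* default: own address plus broadcast */
    MOCK_RX_ALL_MULTICAST,
    MOCK_RX_PROMISCUOUS
};

struct mock_net_device {
    unsigned short flags;
    enum mock_rx_filter hw_filter;
    void (*set_multicast_list)(struct mock_net_device *dev);
};

/* Driver method: bring the (mock) hardware filter in line with dev->flags. */
void mock_set_multicast_list(struct mock_net_device *dev)
{
    if (dev->flags & MOCK_IFF_PROMISC)
        dev->hw_filter = MOCK_RX_PROMISCUOUS;
    else if (dev->flags & MOCK_IFF_ALLMULTI)
        dev->hw_filter = MOCK_RX_ALL_MULTICAST;
    else
        dev->hw_filter = MOCK_RX_UNICAST_BROADCAST;
}

/* Stand-in for the kernel side: store the new flags, then notify the driver. */
void mock_change_flags(struct mock_net_device *dev, unsigned short new_flags)
{
    dev->flags = new_flags;
    if (dev->set_multicast_list)
        dev->set_multicast_list(dev);
}
```

The method receives no description of what changed; it inspects the current flags and recomputes the filter from scratch, which keeps it correct regardless of which bit was modified.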
As happens with the char and block drivers, each network device
declares the functions that act on it. Operations that can be
performed on network interfaces are listed in this section. Some of
the operations can be left NULL, and some are
usually untouched because ether_setup assigns
suitable methods to them.
Device methods for a network interface can be divided into two groups: fundamental and optional. Fundamental methods are those needed to be able to use the interface; optional methods implement more advanced functionality that is not strictly required. The following are the fundamental methods:
int (*open)(struct net_device *dev);
Opens the interface. The interface is opened whenever ifconfig activates it. The open method should register any system resource it needs (I/O ports, IRQ, DMA, etc.), turn on the hardware, and increment the module usage count.
int (*stop)(struct net_device *dev);
Stops the interface. The interface is stopped when it is brought down; operations performed at open time should be reversed.
int (*hard_start_xmit) (struct sk_buff *skb, struct net_device *dev);
This method initiates the transmission of a packet. The full packet
(protocol headers and all) is contained in a socket buffer
(sk_buff) structure. Socket buffers are introduced
later in this chapter.
int (*hard_header) (struct sk_buff *skb, struct net_device *dev, unsigned short type, void *daddr, void *saddr, unsigned len);
This function builds the hardware header from the source and destination hardware addresses that were previously retrieved; its job is to organize the information passed to it as arguments into an appropriate, device-specific hardware header. eth_header is the default function for Ethernet-like interfaces, and ether_setup assigns this field accordingly.
int (*rebuild_header)(struct sk_buff *skb);
This function is used to rebuild the hardware header before a packet is transmitted. The default function used by Ethernet devices uses ARP to fill the packet with missing information. The rebuild_header method is used rarely in the 2.4 kernel; hard_header is used instead.
void (*tx_timeout)(struct net_device *dev);
This method is called when a packet transmission fails to complete within a reasonable period, on the assumption that an interrupt has been missed or the interface has locked up. It should handle the problem and resume packet transmission.
struct net_device_stats *(*get_stats)(struct net_device *dev);
Whenever an application needs to get statistics for the interface, this method is called. This happens, for example, when ifconfig or netstat -i is run. A sample implementation for snull is introduced in Section 14.12 later in this chapter.
int (*set_config)(struct net_device *dev, struct ifmap *map);
Changes the interface configuration. This method is the entry point for configuring the driver. The I/O address for the device and its interrupt number can be changed at runtime using set_config. This capability can be used by the system administrator if the interface cannot be probed for. Drivers for modern hardware normally do not need to implement this method.
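The way a driver publishes its fundamental methods can be sketched in user space with a structure of function pointers. All names here are hypothetical stand-ins: struct mock_sk_buff holds only a data pointer and a length, resources_held represents whatever the real open method would register (I/O ports, IRQ, DMA), and the statistics structure carries just two counters. The sketch shows the shape of the interface rather than any real driver.

```c
/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct mock_sk_buff {
    const unsigned char *data;
    unsigned int len;
};

struct mock_net_device_stats {
    unsigned long tx_packets;
    unsigned long tx_bytes;
};

struct mock_net_device {
    int (*open)(struct mock_net_device *dev);
    int (*stop)(struct mock_net_device *dev);
    int (*hard_start_xmit)(struct mock_sk_buff *skb,
                           struct mock_net_device *dev);
    struct mock_net_device_stats *(*get_stats)(struct mock_net_device *dev);
    int resources_held;                  /* set by open, cleared by stop */
    struct mock_net_device_stats stats;
};

/* open: acquire what the interface needs and turn the hardware on. */
int mock_open(struct mock_net_device *dev)
{
    dev->resources_held = 1;
    return 0;
}

/* stop: reverse whatever open did. */
int mock_stop(struct mock_net_device *dev)
{
    dev->resources_held = 0;
    return 0;
}

/* hard_start_xmit: "send" the packet and account for it. */
int mock_xmit(struct mock_sk_buff *skb, struct mock_net_device *dev)
{
    dev->stats.tx_packets++;
    dev->stats.tx_bytes += skb->len;
    return 0;
}

/* get_stats: hand back the counters the driver maintains. */
struct mock_net_device_stats *mock_get_stats(struct mock_net_device *dev)
{
    return &dev->stats;
}

/* Initialization-time assignment of the methods. */
void mock_setup_methods(struct mock_net_device *dev)
{
    dev->open = mock_open;
    dev->stop = mock_stop;
    dev->hard_start_xmit = mock_xmit;
    dev->get_stats = mock_get_stats;
}
```

Callers reach the driver only through the pointers, for example dev->open(dev), so the same calling code works with any driver that fills in the table.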
The remaining device operations may be considered optional.
int (*do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd);
Performs interface-specific ioctl commands.
Implementation of those commands is described later in Section 14.11. The corresponding field in struct net_device
can be left as NULL if the
interface doesn’t need any interface-specific commands.
void (*set_multicast_list)(struct net_device *dev);
This method is called when the multicast list for the device changes and when the flags change. See Section 14.13 for further details and a sample implementation.
int (*set_mac_address)(struct net_device *dev, void *addr);
This function can be implemented if the interface supports the ability
to change its hardware address. Many interfaces don’t support this
ability at all. Others use the default
eth_mac_addr implementation (from drivers/net/net_init.c).
eth_mac_addr only copies the new address into
dev->dev_addr, and it will only do so if the
interface is not running. Drivers that use
eth_mac_addr should set the hardware MAC address
from dev->dev_addr when they are configured.
int (*change_mtu)(struct net_device *dev, int new_mtu);
This function is in charge of taking action if there is a change in the MTU (maximum transfer unit) for the interface. If the driver needs to do anything particular when the MTU is changed, it should declare its own function; otherwise, the default will do the right thing. snull has a template for the function if you are interested.
int (*header_cache) (struct neighbour *neigh, struct hh_cache *hh);
header_cache is called to fill in the
hh_cache
structure with the results of an ARP
query. Almost all drivers can use the default
eth_header_cache implementation.
int (*header_cache_update) (struct hh_cache *hh, struct net_device *dev, unsigned char *haddr);
This method updates the destination address in the
hh_cache
structure in response to a change.
Ethernet devices use eth_header_cache_update.
int (*hard_header_parse) (struct sk_buff *skb, unsigned char *haddr);
The hard_header_parse method extracts the source
address from the packet contained in skb, copying
it into the buffer at haddr. The return value from
the function is the length of that address. Ethernet devices normally
use eth_header_parse.
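Three of the optional methods above are simple enough to sketch in user space. None of this is the kernel's code: the structure and all function names are hypothetical, the address is passed as a bare byte array rather than in the form the real method receives, the -EBUSY return value is an assumption, and the MTU bounds of 68 and 1500 are illustrative limits for an Ethernet-like device. The header layout used by mock_header_parse (six destination octets, six source octets, two type octets) is the standard Ethernet one.

```c
#include <errno.h>
#include <string.h>

#define MOCK_ETH_ALEN 6  /* Ethernet address length in octets */

struct mock_net_device {
    unsigned char dev_addr[MOCK_ETH_ALEN];
    unsigned mtu;
    int running;  /* nonzero while the interface is up */
};

/* Mimics the described eth_mac_addr behavior: copy only if not running. */
int mock_set_mac_address(struct mock_net_device *dev,
                         const unsigned char addr[MOCK_ETH_ALEN])
{
    if (dev->running)
        return -EBUSY;          /* assumed error code */
    memcpy(dev->dev_addr, addr, MOCK_ETH_ALEN);
    return 0;
}

/* Reject sizes outside the illustrative range; otherwise record the MTU. */
int mock_change_mtu(struct mock_net_device *dev, int new_mtu)
{
    if (new_mtu < 68 || new_mtu > 1500)
        return -EINVAL;
    dev->mtu = (unsigned)new_mtu;
    return 0;
}

/*
 * Copy the source address out of an Ethernet frame into haddr and
 * return its length. The source address follows the destination address.
 */
int mock_header_parse(const unsigned char *frame, unsigned char *haddr)
{
    memcpy(haddr, frame + MOCK_ETH_ALEN, MOCK_ETH_ALEN);
    return MOCK_ETH_ALEN;
}
```

Because mock_set_mac_address only updates the in-memory copy, a driver built along these lines still has to load dev_addr into the hardware when the interface is configured.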
The remaining struct net_device
data fields are
used by the interface to hold useful status information. Some of the
fields are used by ifconfig and
netstat to provide the user with
information about the current configuration. An interface should thus
assign values to these fields.
unsigned long trans_start;
unsigned long last_rx;
Both of these fields are meant to hold a jiffies value. The driver is responsible for updating these values when
transmission begins and when a packet is received, respectively. The
trans_start
value is used by the networking
subsystem to detect transmitter lockups. last_rx
is currently unused, but the driver should maintain this field anyway
to be prepared for future use.
int watchdog_timeo;
The minimum time (in jiffies) that should pass before the networking layer decides that a transmission timeout has occurred and calls the driver’s tx_timeout function.
void *priv;
The equivalent of filp->private_data. The driver
owns this pointer and can use it at will. Usually the private data
structure includes a struct net_device_stats item.
The field is used in Section 14.2.2, later in this chapter.
struct dev_mc_list *mc_list;
int mc_count;
These two fields are used in handling multicast transmission.
mc_count is the count of items in mc_list. See Section 14.13 for
further details.
spinlock_t xmit_lock;
int xmit_lock_owner;
The xmit_lock is used to avoid multiple
simultaneous calls to the driver’s
hard_start_xmit function.
xmit_lock_owner is the number of the CPU that has obtained
xmit_lock. The driver should make no
changes to these fields.
struct module *owner;
The module that “owns” this device structure; it is used to maintain the use count for the module.
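The interplay of trans_start, watchdog_timeo, and tx_timeout can be sketched in user space. This is a simplified model, not the kernel's watchdog: mock_jiffies stands in for the jiffies counter, the structure and every function are hypothetical, and the real networking layer applies further conditions before declaring a timeout. The elapsed time is computed with unsigned subtraction so that the comparison stays correct when the counter wraps around.

```c
unsigned long mock_jiffies;  /* stand-in for the kernel's jiffies counter */

struct mock_net_device {
    unsigned long trans_start;  /* time the last transmission began */
    unsigned long last_rx;      /* time the last packet was received */
    int watchdog_timeo;         /* timeout threshold, in ticks */
    void (*tx_timeout)(struct mock_net_device *dev);
    int timeouts_handled;       /* invented counter, for illustration */
};

/* Driver side: stamp the time whenever a transmission starts. */
void mock_start_xmit(struct mock_net_device *dev)
{
    dev->trans_start = mock_jiffies;
}

/* Driver side: stamp the time whenever a packet arrives. */
void mock_rx(struct mock_net_device *dev)
{
    dev->last_rx = mock_jiffies;
}

/* Driver side: recover from the stall and restart the clock. */
void mock_tx_timeout(struct mock_net_device *dev)
{
    dev->timeouts_handled++;
    dev->trans_start = mock_jiffies;
}

/*
 * Stand-in for the networking layer's check. Returns 1 if the timeout
 * fired, 0 otherwise. Unsigned subtraction handles counter wraparound.
 */
int mock_watchdog(struct mock_net_device *dev)
{
    unsigned long elapsed = mock_jiffies - dev->trans_start;

    if (elapsed <= (unsigned long)dev->watchdog_timeo)
        return 0;
    if (dev->tx_timeout)
        dev->tx_timeout(dev);
    return 1;
}
```

If the driver forgot to update trans_start in its transmit path, the check would fire spuriously on a healthy interface, which is why maintaining the field is the driver's responsibility.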
There are other fields in struct net_device, but
they are not used by network drivers.