Network device driver
• Architecture
sk_buff structure
• The struct sk_buff is the structure
representing a network packet
• Designed to easily support
encapsulation/decapsulation of data through the protocol layers
• In addition to the data itself, an
sk_buff maintains
– head, the start of the allocated buffer
– data, the start of the packet data at the current protocol layer
– tail, the end of the packet data
– end, the end of the allocated buffer
– len, the amount of data in the packet
• These fields are updated when the
packet goes through the protocol layers
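To make the pointer fields concrete, here is a small userspace model, not kernel code. The model_skb type, its fixed 128-byte buffer and the model_* helpers are hypothetical names that only mimic what skb_put(), skb_push() and skb_pull() do to the pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of the sk_buff pointer fields; NOT the kernel structure. */
struct model_skb {
    unsigned char buf[128];  /* backing storage (hypothetical fixed size) */
    unsigned char *head;     /* start of the buffer */
    unsigned char *data;     /* start of the valid data */
    unsigned char *tail;     /* end of the valid data */
    unsigned char *end;      /* end of the buffer */
    size_t len;              /* always tail - data */
};

/* Start with an empty data area that has `headroom` free bytes in front. */
static void model_init(struct model_skb *skb, size_t headroom)
{
    assert(headroom <= sizeof(skb->buf));
    skb->head = skb->buf;
    skb->data = skb->tail = skb->buf + headroom;
    skb->end = skb->buf + sizeof(skb->buf);
    skb->len = 0;
}

/* Like skb_put(): grow the data area at the tail, return the old tail. */
static unsigned char *model_put(struct model_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    assert((size_t)(skb->end - skb->tail) >= n);
    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Like skb_push(): prepend a header in the headroom (encapsulation). */
static unsigned char *model_push(struct model_skb *skb, size_t n)
{
    assert((size_t)(skb->data - skb->head) >= n);
    skb->data -= n;
    skb->len += n;
    return skb->data;
}

/* Like skb_pull(): strip a header from the front (decapsulation). */
static unsigned char *model_pull(struct model_skb *skb, size_t n)
{
    assert(skb->len >= n);
    skb->data += n;
    skb->len -= n;
    return skb->data;
}
```

Pushing a 14-byte header onto a 20-byte payload moves data back by 14 and raises len to 34. Pulling it again restores the original view without copying any bytes.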
Allocating an skb
• The dev_alloc_skb() function
allocates an SKB
• It can be called from an interrupt
handler, which is usually the case on reception.
• On Ethernet, the size allocated is
usually the length of the packet + 2, so that the IP header is word-aligned (the
Ethernet header is 14 bytes)
• skb = dev_alloc_skb(length + NET_IP_ALIGN);
Reserving space in an skb
• Need to skip NET_IP_ALIGN bytes at
the beginning of the SKB
• Done with skb_reserve()
• skb_reserve(skb, NET_IP_ALIGN);
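The arithmetic behind the 2-byte reservation can be checked with a few lines of plain C. This is only a sketch: align_pad() and ip_header_offset() are hypothetical helpers, and the value 2 is the one used in the text (the kernel lets an architecture define NET_IP_ALIGN differently).

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_HLEN 14  /* Ethernet header length, as stated in the text */

/* Pad needed in front of a link-layer header of `hlen` bytes so that the
 * header following it starts on a 4-byte boundary. */
static size_t align_pad(size_t hlen)
{
    size_t pad = (4 - hlen % 4) % 4;

    assert((pad + hlen) % 4 == 0);
    return pad;
}

/* Offset of the IP header from the start of the buffer, given the number
 * of bytes reserved in front of the Ethernet header. */
static size_t ip_header_offset(size_t reserved)
{
    return reserved + MODEL_ETH_HLEN;
}
```

With nothing reserved, the IP header would sit at offset 14, which is not a multiple of 4. Reserving 2 bytes moves it to offset 16.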
Copy the received data
• The packet payload must be copied
from the DMA buffer to the SKB, using one of
• static inline void skb_copy_to_linear_data(struct sk_buff *skb, const void *from, const unsigned int len);
• static inline void skb_copy_to_linear_data_offset(struct sk_buff *skb, const int offset, const void *from, const unsigned int len);
• For example: skb_copy_to_linear_data(skb, dmabuffer, length);
Update pointers in skb
• skb_put() is used to update the SKB
pointers after copying the payload
• skb_put(skb, length);
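The allocate, reserve, copy and put steps can be strung together in a userspace model. This is a sketch, not driver code: model_skb, model_alloc(), model_receive() and model_free() are hypothetical stand-ins, malloc() replaces the kernel allocator, and memcpy() stands in for skb_copy_to_linear_data().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_NET_IP_ALIGN 2  /* value assumed from the text */

/* Userspace stand-in for struct sk_buff (pointer fields only). */
struct model_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

/* Like dev_alloc_skb(): returns NULL on allocation failure. */
static struct model_skb *model_alloc(size_t size)
{
    struct model_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

static void model_free(struct model_skb *skb)
{
    if (skb) {
        free(skb->head);
        free(skb);
    }
}

/* The four steps from the text, in order. */
static struct model_skb *model_receive(const void *dmabuffer, size_t length)
{
    struct model_skb *skb;

    assert(dmabuffer != NULL && length > 0);
    skb = model_alloc(length + MODEL_NET_IP_ALIGN);  /* allocate */
    if (!skb)
        return NULL;
    skb->data += MODEL_NET_IP_ALIGN;                 /* reserve */
    skb->tail += MODEL_NET_IP_ALIGN;
    memcpy(skb->data, dmabuffer, length);            /* copy */
    skb->tail += length;                             /* put */
    skb->len += length;
    return skb;
}
```

After model_receive() returns, data points 2 bytes past head, len equals the frame length, and tail coincides with end because the buffer was sized exactly.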
Struct net_device
• This structure represents a single
network interface
• Allocation takes place with
alloc_etherdev()
– The size of the private data must be
passed as argument. The pointer to this private data is obtained with
netdev_priv(dev) (older kernels exposed it as net_device->priv)
– alloc_etherdev() is a specialization
of alloc_netdev() for Ethernet interfaces
• Registration with register_netdev()
• Unregistration with
unregister_netdev()
• Freeing with free_netdev()
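Why the private data size is passed at allocation time becomes clearer with a model of the memory layout: the device structure and the driver's private area live in one allocation, and the private pointer is computed from the device pointer (which is what netdev_priv() does in current kernels). Everything below is a hypothetical userspace sketch; model_net_device, MODEL_PRIV_ALIGN and foo_priv are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PRIV_ALIGN 32  /* hypothetical alignment of the private area */

/* Tiny stand-in for struct net_device. */
struct model_net_device {
    char name[16];
    int registered;
};

/* Example driver-private structure. */
struct foo_priv {
    int irq;
    unsigned long rx_count;
};

/* Round n up to the next multiple of MODEL_PRIV_ALIGN. */
static size_t model_align_up(size_t n)
{
    return (n + MODEL_PRIV_ALIGN - 1) & ~(size_t)(MODEL_PRIV_ALIGN - 1);
}

/* In the spirit of alloc_etherdev(sizeof_priv): one zeroed allocation
 * holding the device structure followed by the private area. */
static struct model_net_device *model_alloc_dev(size_t sizeof_priv)
{
    assert(sizeof_priv > 0);
    return calloc(1, model_align_up(sizeof(struct model_net_device)) + sizeof_priv);
}

/* In the spirit of netdev_priv(): the private area starts right after
 * the aligned device structure. */
static void *model_priv(struct model_net_device *dev)
{
    return (unsigned char *)dev + model_align_up(sizeof(*dev));
}
```

A single free() releases both parts in this model, which mirrors why one free_netdev() call is enough for the device and its private data.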
Struct net_device_ops
• The methods of a network interface.
The most important ones:
– ndo_open(), called when the network
interface is brought up
– ndo_stop(), called when the network
interface is brought down
– ndo_start_xmit(), to start the
transmission of a packet
• And others:
– ndo_get_stats(), to get statistics
– ndo_do_ioctl(), to implement device
specific operations
– ndo_set_rx_mode(), to select
promiscuous, multicast, etc.
– ndo_set_mac_address(), to set the
MAC address
– ndo_set_multicast_list(), to set
multicast filters (older kernels only; later kernels handle this in ndo_set_rx_mode())
• Set the netdev_ops field in the
struct net_device structure to point to the struct net_device_ops structure.
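The pattern is a table of function pointers that the core calls through. A minimal userspace sketch, with hypothetical model_dev, model_dev_ops and foo_* names standing in for the open, stop and transmit callbacks:

```c
#include <assert.h>
#include <stddef.h>

struct model_dev;

/* Stand-in for struct net_device_ops: three of the callbacks. */
struct model_dev_ops {
    int (*open)(struct model_dev *dev);
    int (*stop)(struct model_dev *dev);
    int (*start_xmit)(struct model_dev *dev, const void *pkt, size_t len);
};

/* Stand-in for struct net_device. */
struct model_dev {
    const struct model_dev_ops *ops;  /* plays the role of netdev_ops */
    int up;
    unsigned long tx_packets;
};

static int foo_open(struct model_dev *dev)
{
    dev->up = 1;
    return 0;
}

static int foo_stop(struct model_dev *dev)
{
    dev->up = 0;
    return 0;
}

static int foo_start_xmit(struct model_dev *dev, const void *pkt, size_t len)
{
    (void)pkt;
    (void)len;
    if (!dev->up)
        return -1;  /* refuse to transmit while the interface is down */
    dev->tx_packets++;
    return 0;
}

/* The driver fills in one constant table... */
static const struct model_dev_ops foo_ops = {
    .open = foo_open,
    .stop = foo_stop,
    .start_xmit = foo_start_xmit,
};

/* ...and the core dispatches through it, e.g. when bringing the
 * interface up. */
static int model_bring_up(struct model_dev *dev)
{
    assert(dev->ops != NULL && dev->ops->open != NULL);
    return dev->ops->open(dev);
}
```

The table is declared const because it describes the driver rather than a particular device, so several devices can share it.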
Utility functions
• netif_start_queue()
– Tells the kernel that the driver is
ready to send packets
• netif_stop_queue()
– Tells the kernel to stop sending
packets. Useful at driver cleanup of course, but also when all transmission
buffers are full.
• netif_queue_stopped()
– Tells whether the queue is currently
stopped or not
• netif_wake_queue()
– Wakes up a queue after a
netif_stop_queue(). The kernel will resume sending packets
Transmission
• The driver implements the ndo_start_xmit()
operation
• The kernel calls this operation with
an SKB as argument
• The driver sets up DMA buffers and
other hardware-dependent mechanisms and starts the transmission
– Depending on the number of free DMA
buffers available, the driver can also stop the queue with netif_stop_queue()
• When the packet has been sent, an
interrupt is raised. The driver is responsible for
– Acknowledging the interrupt
– Freeing the used DMA buffers
– Freeing the SKB with
dev_kfree_skb_irq()
– If the queue was stopped, waking it up
again with netif_wake_queue()
• ndo_start_xmit() returns NETDEV_TX_OK or NETDEV_TX_BUSY
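The interplay between the transmit operation, the free DMA buffers and the queue state can be modelled in userspace. This is only a sketch under assumptions: the ring size of 4, the model_tx structure and the MODEL_TX_* constants are hypothetical, and their values are not those of NETDEV_TX_OK / NETDEV_TX_BUSY.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_TX_RING_SIZE 4  /* hypothetical number of DMA buffers */

/* Stand-ins for NETDEV_TX_OK / NETDEV_TX_BUSY (values are arbitrary). */
enum { MODEL_TX_OK = 0, MODEL_TX_BUSY = 1 };

struct model_tx {
    unsigned int in_flight;    /* DMA buffers currently owned by hardware */
    int queue_stopped;         /* what netif_queue_stopped() would report */
    unsigned long tx_packets;  /* completed transmissions */
};

/* Model of the transmit operation. */
static int model_start_xmit(struct model_tx *tx)
{
    if (tx->in_flight == MODEL_TX_RING_SIZE) {
        /* No free buffer: should be rare if the queue is stopped in time. */
        tx->queue_stopped = 1;
        return MODEL_TX_BUSY;
    }
    tx->in_flight++;              /* set up DMA and start the transmission */
    if (tx->in_flight == MODEL_TX_RING_SIZE)
        tx->queue_stopped = 1;    /* netif_stop_queue(): ring is now full */
    return MODEL_TX_OK;
}

/* Model of the "packet sent" interrupt. */
static void model_tx_complete(struct model_tx *tx)
{
    assert(tx->in_flight > 0);
    tx->in_flight--;              /* free the DMA buffer and the SKB */
    tx->tx_packets++;
    if (tx->queue_stopped)
        tx->queue_stopped = 0;    /* netif_wake_queue() */
}
```

Stopping the queue as soon as the last buffer is taken means the busy return path is a safety net rather than the normal flow-control mechanism.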
Reception: original mode
• Reception is notified by an
interrupt. The interrupt handler should
– Allocate an SKB with dev_alloc_skb()
– Reserve the 2-byte offset with
skb_reserve()
– Copy the packet data from the DMA
buffers to the SKB
• skb_copy_to_linear_data() or
• skb_copy_to_linear_data_offset()
– Update the SKB pointers with
skb_put()
– Update the skb->protocol field
with eth_type_trans(skb,netdevice)
– Give the SKB to the kernel network
stack with netif_rx(skb)
Reception: NAPI mode
• The original mode is nice and
simple, but when the network traffic is high, the interrupt rate is high. The
NAPI mode lets the driver switch to polled mode when the interrupt rate is too high.
• In the network interface private
structure, add a struct napi_struct
• At driver initialization, register
the NAPI poll operation:
– netif_napi_add(dev, &bp->napi, macb_poll, 64);
– dev is the network interface
– &bp->napi is the struct napi_struct
– macb_poll is the NAPI poll operation
– 64 is the «weight» that represents
the importance of the network interface. It limits the number of packets each
interface can feed to the networking core in each polling cycle. If this quota
is not met, the driver will return to interrupt mode. Don't set this
quota to a value greater than the number of packets the interface can store.
• In the interrupt handler, when a
packet has been received:
if (napi_schedule_prep(&bp->napi)) {
        /* Disable reception interrupts */
        __napi_schedule(&bp->napi);
}
– The kernel will call our poll()
operation regularly
• The poll() operation has the
following prototype
– static int macb_poll(struct
napi_struct *napi, int budget)
• It must receive at most budget
packets and push them to the network stack using netif_receive_skb().
• If fewer than budget packets have
been received, switch back to interrupt mode using napi_complete(&bp->napi)
and re-enable interrupts
• Must return the number of packets
received
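The budget rule is easier to follow in a userspace model. The sketch below is not driver code: model_napi and its fields are hypothetical, a counter stands in for the hardware receive ring, and the comments name the kernel calls each step corresponds to.

```c
#include <assert.h>

struct model_napi {
    int pending;              /* packets waiting in the hardware RX ring */
    int irq_enabled;          /* 1 = interrupt mode, 0 = polled mode */
    int scheduled;            /* 1 while the poll operation is scheduled */
    unsigned long delivered;  /* packets pushed to the network stack */
};

/* Model of the receive interrupt handler. */
static void model_rx_interrupt(struct model_napi *n)
{
    if (!n->scheduled) {      /* napi_schedule_prep() */
        n->irq_enabled = 0;   /* disable reception interrupts */
        n->scheduled = 1;     /* __napi_schedule() */
    }
}

/* Model of the poll operation: handle at most `budget` packets. */
static int model_poll(struct model_napi *n, int budget)
{
    int work_done = 0;

    assert(budget > 0);
    while (work_done < budget && n->pending > 0) {
        n->pending--;         /* netif_receive_skb() */
        n->delivered++;
        work_done++;
    }
    if (work_done < budget) {
        n->scheduled = 0;     /* napi_complete() */
        n->irq_enabled = 1;   /* re-enable reception interrupts */
    }
    return work_done;         /* number of packets received */
}
```

With 100 packets pending and a budget of 64, the first call returns 64 and stays in polled mode. The second returns 36 and switches back to interrupt mode.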
ethtool
• ethtool is a userspace tool that
queries low-level information from an Ethernet interface and modifies
its configuration
• On the kernel side, at the driver
level, a struct ethtool_ops structure can be declared and connected to the
struct net_device using the ethtool_ops field.
• List of operations: get_settings(),
set_settings(), get_drvinfo(), get_wol(), set_wol(), get_link(), get_eeprom(),
set_eeprom(), get_tso(), set_tso(), get_flags(), set_flags(), etc.
Statistics
• The network driver is also
responsible for keeping statistics up to date about the number of packets/bytes
received/transmitted, the number of errors, of collisions, etc.
– Collecting this information is left
to the driver
• To expose this information, the
driver must implement a get_stats() operation, with the
following prototype
– struct net_device_stats *foo_get_stats(struct net_device *dev);
• The net_device_stats structure must
be filled in by the driver. It contains fields such as rx_packets, tx_packets,
rx_bytes, tx_bytes, rx_errors, tx_errors, rx_dropped, tx_dropped, multicast,
collisions, etc.
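A userspace sketch of the bookkeeping follows. The model_stats structure holds only a subset of the counters named above, and model_dev and the model_account_* helpers are hypothetical. In a real driver the updates would sit in the receive and transmit paths.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the counters named in the text; a model, not the kernel's
 * struct net_device_stats. */
struct model_stats {
    unsigned long rx_packets, tx_packets;
    unsigned long rx_bytes, tx_bytes;
    unsigned long rx_errors, rx_dropped;
};

struct model_dev {
    struct model_stats stats;
};

/* Account for one frame reported by the hardware on reception. */
static void model_account_rx(struct model_dev *dev, size_t len,
                             int hw_error, int skb_allocated)
{
    assert(dev != NULL);
    if (hw_error) {
        dev->stats.rx_errors++;   /* bad frame */
        return;
    }
    if (!skb_allocated) {
        dev->stats.rx_dropped++;  /* good frame, but no buffer for it */
        return;
    }
    dev->stats.rx_packets++;
    dev->stats.rx_bytes += len;
}

/* Account for one completed transmission. */
static void model_account_tx(struct model_dev *dev, size_t len)
{
    assert(dev != NULL);
    dev->stats.tx_packets++;
    dev->stats.tx_bytes += len;
}

/* Same shape as the get_stats() operation: hand back the counters. */
static struct model_stats *model_get_stats(struct model_dev *dev)
{
    return &dev->stats;
}
```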
Quick references
• Linux Device Drivers, chapter 17 (a
little bit old)
• Documentation/networking/netdevices.txt
• Documentation/networking/phy.txt
• include/linux/netdevice.h
• include/linux/ethtool.h
• include/linux/phy.h
• include/linux/skbuff.h
• And of course, drivers/net/ for
several examples of drivers
• Driver code templates in the kernel
sources:
• drivers/usb/usb-skeleton.c
• drivers/net/isa-skeleton.c
• drivers/net/pci-skeleton.c
• drivers/pci/hotplug/pcihp_skeleton.c