Saturday, 3 December 2016

Network device driver

       Architecture



sk_buff structure
       The struct sk_buff is the structure representing a network packet
       Designed to easily support encapsulation/decapsulation of data through the protocol layers
       In addition to the data itself, an sk_buff maintains
      head, the start of the allocated buffer
      data, the start of the packet data at the current protocol layer
      tail, the end of the packet data
      end, the end of the allocated buffer
      len, the amount of data in the packet
       These fields are updated when the packet goes through the protocol layers
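To make the pointer movements concrete, here is a minimal userspace sketch of the idea. It is not kernel code: struct toy_skb and the toy_skb_*() helpers are hypothetical stand-ins that only mimic how the kernel's skb_reserve(), skb_put(), skb_push() and skb_pull() move data and tail between head and end.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-in for struct sk_buff (hypothetical, not the kernel type). */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the valid data */
    unsigned char *tail;  /* end of the valid data */
    unsigned char *end;   /* end of the allocated buffer */
    unsigned int len;     /* number of bytes between data and tail */
};

static void toy_skb_init(struct toy_skb *skb, unsigned char *buf, size_t size)
{
    skb->head = skb->data = skb->tail = buf;
    skb->end = buf + size;
    skb->len = 0;
}

/* Leave headroom in an empty buffer: data and tail both move forward. */
static void toy_skb_reserve(struct toy_skb *skb, unsigned int n)
{
    assert(skb->len == 0 && (size_t)(skb->end - skb->tail) >= n);
    skb->data += n;
    skb->tail += n;
}

/* Append n bytes: tail moves forward. Returns where the new bytes start. */
static unsigned char *toy_skb_put(struct toy_skb *skb, unsigned int n)
{
    unsigned char *old_tail = skb->tail;

    assert((size_t)(skb->end - skb->tail) >= n);
    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Encapsulation: data moves backward into the headroom to make room for a header. */
static unsigned char *toy_skb_push(struct toy_skb *skb, unsigned int n)
{
    assert((size_t)(skb->data - skb->head) >= n);
    skb->data -= n;
    skb->len += n;
    return skb->data;
}

/* Decapsulation: data moves forward past a header that has been processed. */
static unsigned char *toy_skb_pull(struct toy_skb *skb, unsigned int n)
{
    assert(n <= skb->len);
    skb->data += n;
    skb->len -= n;
    return skb->data;
}
```

For example, reserving 16 bytes, putting a 20-byte payload and then pushing a 14-byte header leaves 2 bytes of headroom and len equal to 34. No data is copied when a header is added or removed, only the pointers move.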

Allocating a skb
       The dev_alloc_skb() function allocates an SKB
       Can be called from an interrupt handler. Usually the case on reception.
       On Ethernet, the size allocated is usually the length of the packet + 2, so that the IP header is word-aligned (the Ethernet header is 14 bytes)
       skb = dev_alloc_skb(length + NET_IP_ALIGN);



Reserving space in a skb
       Need to skip NET_IP_ALIGN bytes at the beginning of the SKB
       Done with skb_reserve()
       skb_reserve(skb, NET_IP_ALIGN);
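The arithmetic behind the 2-byte offset can be checked with a small userspace sketch. The TOY_* constants and toy_*() helpers are hypothetical names; the sketch assumes NET_IP_ALIGN is 2 (its usual default, some architectures define it differently) and that the buffer itself starts on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HLEN     14  /* Ethernet header length in bytes */
#define TOY_NET_IP_ALIGN 2   /* assumed value of NET_IP_ALIGN */

/* Size to allocate for a received frame of the given length. */
static size_t toy_rx_alloc_size(size_t length)
{
    assert(length <= SIZE_MAX - TOY_NET_IP_ALIGN);  /* guard against overflow */
    return length + TOY_NET_IP_ALIGN;
}

/* Offset of the IP header from the start of the buffer. */
static size_t toy_ip_header_offset(size_t reserved)
{
    return reserved + TOY_ETH_HLEN;
}

/* Non-zero when the IP header lands on a 4-byte boundary. */
static int toy_ip_header_aligned(size_t reserved)
{
    return toy_ip_header_offset(reserved) % 4 == 0;
}
```

Without the reserve, the IP header starts at offset 14, which is not a multiple of 4. With 2 bytes reserved it starts at offset 16.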



Copy the received data
       The packet payload must be copied from the DMA buffer to the SKB, using
       static inline void skb_copy_to_linear_data(struct sk_buff *skb, const void *from, const unsigned int len);
       static inline void skb_copy_to_linear_data_offset(struct sk_buff *skb, const int offset, const void *from, const unsigned int len);
       skb_copy_to_linear_data(skb, dmabuffer, length);



Update pointers in skb
       skb_put() is used to update the SKB pointers after copying the payload
       skb_put(skb, length);



Struct net_device
       This structure represents a single network interface
       Allocation takes place with alloc_etherdev()
      The size of the private data must be passed as argument. The pointer to this private data is obtained with
                netdev_priv(dev) (older kernels exposed it as net_device->priv)
      alloc_etherdev() is a specialization of alloc_netdev() for Ethernet interfaces
       Registration with register_netdev()
       Unregistration with unregister_netdev()
       Freeing with free_netdev()
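One way to picture the private data is as a driver-specific area allocated in the same block as the interface structure. The userspace sketch below models that layout. Everything prefixed toy_ or TOY_ is hypothetical, and the 32-byte alignment is an assumption about how kernels of this era place the private area, not something to rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NETDEV_ALIGN 32  /* assumed alignment of the private area */
#define TOY_ALIGN(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Toy stand-in for struct net_device (hypothetical, heavily reduced). */
struct toy_net_device {
    char name[16];
    unsigned int mtu;
};

/* Example of driver-private data (hypothetical fields). */
struct toy_priv {
    int irq;
    unsigned char mac[6];
};

/* Allocate the interface structure and its private area in one block. */
static struct toy_net_device *toy_alloc_etherdev(size_t sizeof_priv)
{
    size_t total = TOY_ALIGN(sizeof(struct toy_net_device), TOY_NETDEV_ALIGN)
                   + sizeof_priv;
    struct toy_net_device *dev = calloc(1, total);

    if (dev)
        dev->mtu = 1500;  /* Ethernet default */
    return dev;
}

/* The private area starts right after the (aligned) interface structure. */
static void *toy_netdev_priv(struct toy_net_device *dev)
{
    assert(dev != NULL);
    return (char *)dev + TOY_ALIGN(sizeof(struct toy_net_device), TOY_NETDEV_ALIGN);
}

static void toy_free_netdev(struct toy_net_device *dev)
{
    free(dev);  /* one free releases both the structure and the private area */
}
```

A driver would call the allocator with sizeof(struct toy_priv) and then keep its per-interface state (IRQ number, register base, DMA rings) in the returned private area.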

Struct net_device_ops
       The methods of a network interface. The most important ones:
      ndo_open(), called when the network interface is brought up
      ndo_stop(), called when the network interface is brought down
      ndo_start_xmit(), to start the transmission of a packet
       And others:
      ndo_get_stats(), to get statistics
      ndo_do_ioctl(), to implement device specific operations
      ndo_set_rx_mode(), to select promiscuous, multicast, etc.
      ndo_set_mac_address(), to set the MAC address
      ndo_set_multicast_list(), to set multicast filters (removed in Linux 3.2, where ndo_set_rx_mode() takes over this role)
       Set the netdev_ops field in the struct net_device structure to point to the struct net_device_ops structure.
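The mechanism is an ordinary table of function pointers. The userspace sketch below shows the pattern with two operations only. The toy_ types and functions are hypothetical; the member names mirror the kernel's ndo_open and ndo_stop, but the signatures are simplified.

```c
#include <assert.h>
#include <stddef.h>

struct toy_netdev;

/* Toy stand-in for struct net_device_ops (hypothetical, two methods only). */
struct toy_netdev_ops {
    int (*ndo_open)(struct toy_netdev *dev);
    int (*ndo_stop)(struct toy_netdev *dev);
};

/* Toy stand-in for struct net_device. */
struct toy_netdev {
    const struct toy_netdev_ops *netdev_ops;
    int running;
};

/* Driver side: the methods and the table that groups them. */
static int toy_open(struct toy_netdev *dev)
{
    dev->running = 1;  /* a real driver would enable hardware and start the queue */
    return 0;
}

static int toy_stop(struct toy_netdev *dev)
{
    dev->running = 0;  /* a real driver would stop the queue and disable hardware */
    return 0;
}

static const struct toy_netdev_ops toy_ops = {
    .ndo_open = toy_open,
    .ndo_stop = toy_stop,
};

/* Core side: dispatch through the table when the interface goes up or down. */
static int toy_dev_open(struct toy_netdev *dev)
{
    assert(dev->netdev_ops != NULL);
    return dev->netdev_ops->ndo_open ? dev->netdev_ops->ndo_open(dev) : 0;
}

static int toy_dev_close(struct toy_netdev *dev)
{
    assert(dev->netdev_ops != NULL);
    return dev->netdev_ops->ndo_stop ? dev->netdev_ops->ndo_stop(dev) : 0;
}
```

The table is declared const and shared by all interfaces handled by the driver; each interface only stores a pointer to it.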

Utility functions
       netif_start_queue()
      Tells the kernel that the driver is ready to send packets
       netif_stop_queue()
      Tells the kernel to stop sending packets. Useful at driver cleanup of course, but also when all transmission buffers are full.
       netif_queue_stopped()
      Tells whether the queue is currently stopped or not
       netif_wake_queue()
      Wakes up a queue after a netif_stop_queue(). The kernel will resume sending packets

Transmission
       The driver implements the ndo_start_xmit() operation
       The kernel calls this operation with a SKB as argument
       The driver sets up DMA buffers and other hardware-dependent mechanisms and starts the transmission
      Depending on the number of free DMA buffers available, the driver can also stop the queue with netif_stop_queue()
       When the packet has been sent, an interrupt is raised. The driver is responsible for
      Acknowledging the interrupt
      Freeing the used DMA buffers
      Freeing the SKB with dev_kfree_skb_irq()
      If the queue was stopped, waking it again with netif_wake_queue()
       ndo_start_xmit() returns NETDEV_TX_OK, or NETDEV_TX_BUSY when the packet could not be queued
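The interplay between the transmit ring and the queue state can be sketched in userspace as follows. This is a model, not driver code: the ring size, struct toy_tx_dev and the toy_ functions are all hypothetical, and a boolean stands in for the real queue state that netif_stop_queue() and netif_wake_queue() manipulate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TX_RING_SIZE 4  /* hypothetical number of DMA descriptors */

enum toy_tx_status { TOY_TX_OK, TOY_TX_BUSY };

struct toy_tx_dev {
    const void *ring[TOY_TX_RING_SIZE];  /* packets owned by the "hardware" */
    unsigned int head;       /* next slot to fill */
    unsigned int tail;       /* oldest slot not yet completed */
    unsigned int in_flight;  /* number of slots in use */
    bool queue_stopped;
};

/* Models ndo_start_xmit(): queue one packet, stop the queue when the ring fills. */
static enum toy_tx_status toy_start_xmit(struct toy_tx_dev *dev, const void *pkt)
{
    if (dev->in_flight == TOY_TX_RING_SIZE) {
        dev->queue_stopped = true;
        return TOY_TX_BUSY;
    }
    dev->ring[dev->head] = pkt;
    dev->head = (dev->head + 1) % TOY_TX_RING_SIZE;
    dev->in_flight++;
    if (dev->in_flight == TOY_TX_RING_SIZE)
        dev->queue_stopped = true;  /* netif_stop_queue() in a real driver */
    return TOY_TX_OK;
}

/* Models the transmit-complete interrupt: release the oldest slot, wake the queue.
 * Returns the completed packet so the caller can free it. */
static const void *toy_tx_complete(struct toy_tx_dev *dev)
{
    const void *pkt;

    assert(dev->in_flight > 0);
    pkt = dev->ring[dev->tail];
    dev->ring[dev->tail] = NULL;
    dev->tail = (dev->tail + 1) % TOY_TX_RING_SIZE;
    dev->in_flight--;
    if (dev->queue_stopped)
        dev->queue_stopped = false;  /* netif_wake_queue() in a real driver */
    return pkt;
}
```

Stopping the queue as soon as the last slot is taken means the busy return path should rarely be hit: the kernel stops handing packets to the driver before the ring can overflow.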

Reception: original mode
       Reception is notified by an interrupt. The interrupt handler should
      Allocate an SKB with dev_alloc_skb()
      Reserve the 2 bytes offset with skb_reserve()
      Copy the packet data from the DMA buffers to the SKB
       skb_copy_to_linear_data() or
       skb_copy_to_linear_data_offset()
      Update the SKB pointers with skb_put()
      Update the skb->protocol field with eth_type_trans(skb,netdevice)
      Give the SKB to the kernel network stack with netif_rx(skb)
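The sequence above can be walked through in a userspace sketch. struct toy_rx_skb and the toy_ functions are hypothetical; malloc() and memcpy() stand in for the SKB allocation and copy helpers, and the last step only approximates what eth_type_trans() does (it reads the EtherType and steps past the 14-byte Ethernet header, here keeping the value in host byte order for simplicity).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ETH_HLEN     14  /* Ethernet header length in bytes */
#define TOY_NET_IP_ALIGN 2   /* assumed value of NET_IP_ALIGN */

/* Toy stand-in for a received SKB (hypothetical, heavily reduced). */
struct toy_rx_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the valid data */
    unsigned int len;     /* number of valid bytes starting at data */
    uint16_t protocol;    /* EtherType, host byte order in this toy */
};

/* Build a buffer from a received frame. Returns NULL on allocation failure
 * (a real driver would count the packet as dropped). */
static struct toy_rx_skb *toy_rx_frame(const unsigned char *dmabuffer,
                                       unsigned int length)
{
    struct toy_rx_skb *skb;

    assert(length >= TOY_ETH_HLEN);
    skb = malloc(sizeof(*skb));
    if (!skb)
        return NULL;
    skb->head = malloc(length + TOY_NET_IP_ALIGN);  /* allocate */
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->head + TOY_NET_IP_ALIGN;       /* reserve the 2-byte offset */
    memcpy(skb->data, dmabuffer, length);           /* copy from the DMA buffer */
    skb->len = length;                              /* account for the copied data */

    /* Read the EtherType (bytes 12-13, big-endian), then skip the Ethernet header. */
    skb->protocol = (uint16_t)((skb->data[12] << 8) | skb->data[13]);
    skb->data += TOY_ETH_HLEN;
    skb->len -= TOY_ETH_HLEN;
    return skb;
}

static void toy_rx_free(struct toy_rx_skb *skb)
{
    free(skb->head);
    free(skb);
}
```

After these steps data points at the IP header, 16 bytes into the buffer, which is why the 2-byte reserve matters.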

Reception: NAPI mode
       The original mode is nice and simple, but when the network traffic is high, the interrupt rate is high too. NAPI mode lets the driver switch to polled mode when the interrupt rate becomes too high.
       In the network interface private structure, add a struct napi_struct
       At driver initialization, register the NAPI poll operation:
      netif_napi_add(dev, &bp->napi, macb_poll, 64);
      dev is the network interface
      &bp->napi is the struct napi_struct
      macb_poll is the NAPI poll operation     
      64 is the «weight» that represents the importance of the network interface. It limits the number of packets each interface can feed to the networking core in each polling cycle. If this quota is not met, the driver returns to interrupt mode. Don't set this quota to a value greater than the number of packets the interface can store.
       In the interrupt handler, when a packet has been received:
              if (napi_schedule_prep(&bp->napi)) {
                      /* Disable reception interrupts */
                      __napi_schedule(&bp->napi);
              }
      The kernel will call our poll() operation regularly
       The poll() operation has the following prototype
      static int macb_poll(struct napi_struct *napi, int budget)
       It must receive at most budget packets and push them to the network stack using netif_receive_skb().
       If fewer than budget packets have been received, switch back to interrupt mode using napi_complete(&bp->napi) and re-enable interrupts
       Must return the number of packets received
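The budget logic can be modelled in userspace as below. This is a sketch of the control flow only: struct toy_napi_dev and the toy_ functions are hypothetical, counters replace real packets, and booleans stand in for the state that napi_schedule_prep(), __napi_schedule() and napi_complete() manage.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an interface using NAPI (hypothetical). */
struct toy_napi_dev {
    unsigned int rx_pending;  /* packets waiting in the "hardware" */
    unsigned int delivered;   /* packets handed to the network stack */
    bool rx_irq_enabled;
    bool poll_scheduled;
};

/* Interrupt handler model: mask reception interrupts and schedule polling. */
static void toy_rx_interrupt(struct toy_napi_dev *dev)
{
    if (!dev->poll_scheduled) {       /* role of napi_schedule_prep() */
        dev->rx_irq_enabled = false;  /* disable reception interrupts */
        dev->poll_scheduled = true;   /* role of __napi_schedule() */
    }
}

/* Poll model: receive at most budget packets and return how many were handled. */
static int toy_poll(struct toy_napi_dev *dev, int budget)
{
    int work_done = 0;

    assert(dev->poll_scheduled && budget > 0);
    while (work_done < budget && dev->rx_pending > 0) {
        dev->rx_pending--;
        dev->delivered++;             /* role of netif_receive_skb() */
        work_done++;
    }
    if (work_done < budget) {
        dev->poll_scheduled = false;  /* role of napi_complete() */
        dev->rx_irq_enabled = true;   /* back to interrupt mode */
    }
    return work_done;
}
```

With 100 packets pending and a budget of 64, the first poll handles 64 packets and stays in polled mode, while the second handles the remaining 36 and re-enables interrupts.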

ethtool
       ethtool is a userspace tool that queries low-level information from an Ethernet interface and modifies its configuration
       On the kernel side, at the driver level, a struct ethtool_ops structure can be declared and connected to the struct net_device using the ethtool_ops field.
       List of operations: get_settings(), set_settings(), get_drvinfo(), get_wol(), set_wol(), get_link(), get_eeprom(), set_eeprom(), get_tso(), set_tso(), get_flags(), set_flags(), etc.

statistics
       The network driver is also responsible for keeping statistics up to date about the number of packets/bytes received/transmitted, the number of errors, of collisions, etc.
      Collecting this information is left to the driver
       To expose this information, the driver must implement the
      ndo_get_stats() operation, with the following prototype
      struct net_device_stats *foo_get_stats(struct net_device *dev);
       The net_device_stats structure must be filled in by the driver. It contains fields such as rx_packets, tx_packets, rx_bytes, tx_bytes, rx_errors, tx_errors, rx_dropped, tx_dropped, multicast, collisions, etc.
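A minimal userspace sketch of the bookkeeping follows. The structure is a reduced, hypothetical copy that keeps only a few of the field names listed above, and the toy_account_*() helpers are illustrative names for the updates a driver would make in its receive and transmit paths.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced toy copy of the statistics structure (hypothetical subset of fields). */
struct toy_net_device_stats {
    unsigned long rx_packets;
    unsigned long tx_packets;
    unsigned long rx_bytes;
    unsigned long tx_bytes;
    unsigned long rx_dropped;
};

/* Toy interface that embeds its counters. */
struct toy_stats_dev {
    struct toy_net_device_stats stats;
};

/* Called for each packet handed to the network stack. */
static void toy_account_rx(struct toy_stats_dev *dev, unsigned int len)
{
    dev->stats.rx_packets++;
    dev->stats.rx_bytes += len;
}

/* Called when a received packet cannot be delivered (e.g. allocation failure). */
static void toy_account_rx_drop(struct toy_stats_dev *dev)
{
    dev->stats.rx_dropped++;
}

/* Called when the hardware reports a completed transmission. */
static void toy_account_tx(struct toy_stats_dev *dev, unsigned int len)
{
    dev->stats.tx_packets++;
    dev->stats.tx_bytes += len;
}

/* Models the statistics operation: return a pointer to the counters. */
static struct toy_net_device_stats *toy_get_stats(struct toy_stats_dev *dev)
{
    assert(dev != NULL);
    return &dev->stats;
}
```

The statistics operation itself does almost nothing: the counters are kept up to date as packets flow, and the operation only returns a pointer to them.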

Quick references
       Linux Device Drivers, chapter 17 (a little bit old)
       Documentation/networking/netdevices.txt
       Documentation/networking/phy.txt
       include/linux/netdevice.h,
       include/linux/ethtool.h, include/linux/phy.h,
       include/linux/skbuff.h
       And of course, drivers/net/ for several examples of drivers
       Driver code templates in the kernel sources:
       drivers/usb/usb-skeleton.c
       drivers/net/isa-skeleton.c
       drivers/net/pci-skeleton.c
       drivers/pci/hotplug/pcihp_skeleton.c

