Resolving DPDK Port Detection Failures and Missing Statistics
Published on 2025/09/12
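Author | 吴亦航
1 Problem Scenario
A customer purchased Kunpeng servers. While porting their software, they reported two problems. On Kunpeng with DPDK 19.11.14 the software could not detect any ports: rte_eth_dev_count_avail() returned 0. On DPDK 20.11.10 it could not obtain the ibytes and imissed statistics through rte_eth_stats_get(). They asked for help resolving both.
Hardware environment: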
Kunpeng hardware | Configuration
Server model | TaiShan 2280
CPU model | Kunpeng 920 5250 processor
Memory | 12 × 32 GB, 2933 MHz
NIC | 4 × 25GE SP580
NIC | 4 × 25GE TM280
Operating system and software:
Name | Version
Kylin | V10, kernel 4.19.90-17.ky10.aarch64
DPDK | 20.11.10
DPDK | 19.11.14
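2 Issue 1: Ports Cannot Be Detected
Symptom
On a Kunpeng server with the SP580 NIC and DPDK 19.11.14, the application cannot detect any ports: rte_eth_dev_count_avail() returns 0.
Conclusion, Solution, and Result
The Demo program is as follows: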
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int ret;
    int port_count;

    /* Initialize the EAL before using any other DPDK API. */
    ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");

    /* Number of Ethernet ports available to the application. */
    port_count = rte_eth_dev_count_avail();
    printf("port count: %d\n", port_count);

    return 0;
}
The program was compiled and statically linked as follows:
gcc -o main main.c -I /home/dpdk-stable-19.11.14/arm64-armv8a-linuxapp-gcc/include/ -L /home/dpdk-stable-19.11.14/arm64-armv8a-linuxapp-gcc/lib/ -ldpdk -lrte_eal -lrte_vhost -lrte_pmd_ark -lpthread -lnuma -ldl
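After binding the NIC ports to DPDK and running the program, no ports were detected (port count: 0).
The file can also be built the same way as the official helloworld example: place the Demo program in the "dpdk-stable-19.11.14/examples/helloworld" directory and modify the Makefile to build it.
Built and run this way, the program detects the ports correctly (port count: 2).
Using the log output printed when the ports are detected correctly, we located the function in the source code that prints those logs: hinic_func_init.
That function is called by hinic_dev_init.
As its definition shows, hinic_dev_init is responsible for initializing the device. Reading further, we found that this function is registered with the framework through the RTE_PMD_REGISTER_PCI macro.
The code shows that the macro actually defines a constructor function named pciinitfn_net_hinic: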
static void __attribute__((constructor(...), used)) pciinitfn_net_hinic(void)
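Under GCC's constructor semantics, a constructor function is called before main runs. In other words, the NIC driver is registered before the DPDK application enters main.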
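The registration pattern described above can be sketched without DPDK. The following is a minimal model, not the DPDK implementation: `demo_driver`, `demo_register`, `demo_initfn_net_hinic`, and `demo_driver_count` are hypothetical names made up for illustration, and only the `constructor` attribute itself is the GCC feature the article refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver descriptor (not a DPDK type). */
struct demo_driver {
    const char *name;
    struct demo_driver *next;
};

/* Registry head; constructors fill it in before main() starts. */
static struct demo_driver *demo_driver_list;

/* Push a driver onto the registry list. */
static void demo_register(struct demo_driver *drv)
{
    assert(drv != NULL);
    drv->next = demo_driver_list;
    demo_driver_list = drv;
}

static struct demo_driver demo_hinic = { .name = "net_hinic_demo" };

/*
 * Runs before main(), in the same spirit as the function that
 * RTE_PMD_REGISTER_PCI generates. If this object file is not linked
 * into the executable, this function never runs and the registry
 * stays empty.
 */
static void __attribute__((constructor, used)) demo_initfn_net_hinic(void)
{
    demo_register(&demo_hinic);
}

/* Count the drivers registered so far. */
static size_t demo_driver_count(void)
{
    size_t n = 0;

    for (const struct demo_driver *d = demo_driver_list; d != NULL; d = d->next)
        n++;
    return n;
}
```

Calling `demo_driver_count()` as the first statement of `main` already sees the driver, because the constructor has run by then. Nothing in `main` has to reference the driver, which is also why the linker has no reason to keep it when it lives in a separate archive member.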
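According to the build scripts, the pciinitfn_net_hinic constructor is compiled into the object file hinic_pmd_ethdev.o. During static linking, the linker pulls an object file out of an archive only when it resolves a symbol that is still undefined. If the program does not explicitly reference any symbol in an object file, that object file is left out of the executable, even when its library is named with -lxxx. Because the Demo was statically linked, we suspected this was why pciinitfn_net_hinic was missing from the program and the NIC driver was never registered.
Running nm on the program that fails to detect ports shows that the symbol is absent, whereas the program that detects ports correctly does contain it.
We therefore concluded that the original build command did not keep pciinitfn_net_hinic, so no driver for the device was registered at startup and the port count came back as 0.
Running make -n shows that the official helloworld Makefile is equivalent to building with the following options: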
gcc -o main main.c -pthread -I /home/dpdk-stable-19.11.14/arm64-armv8a-linuxapp-gcc/include/ -L /home/dpdk-stable-19.11.14/arm64-armv8a-linuxapp-gcc/lib/ -Wl,-lrte_flow_classify -Wl,--whole-archive -Wl,-lrte_pipeline -Wl,--no-whole-archive -Wl,--whole-archive -Wl,-lrte_table -Wl,--no-whole-archive -Wl,--whole-archive -Wl,-lrte_port -Wl,--no-whole-archive -Wl,-lrte_pdump -Wl,-lrte_distributor -Wl,-lrte_ip_frag -Wl,-lrte_meter -Wl,-lrte_fib -Wl,-lrte_rib -Wl,-lrte_lpm -Wl,-lrte_acl -Wl,-lrte_jobstats -Wl,-lrte_metrics -Wl,-lrte_bitratestats -Wl,-lrte_latencystats -Wl,-lrte_power -Wl,-lrte_efd -Wl,-lrte_bpf -Wl,-lrte_ipsec -Wl,--whole-archive -Wl,-lrte_cfgfile -Wl,-lrte_gro -Wl,-lrte_gso -Wl,-lrte_hash -Wl,-lrte_member -Wl,-lrte_vhost -Wl,-lrte_kvargs -Wl,-lrte_mbuf -Wl,-lrte_net -Wl,-lrte_ethdev -Wl,-lrte_bbdev -Wl,-lrte_cryptodev -Wl,-lrte_security -Wl,-lrte_compressdev -Wl,-lrte_eventdev -Wl,-lrte_rawdev -Wl,-lrte_timer -Wl,-lrte_mempool -Wl,-lrte_stack -Wl,-lrte_mempool_ring -Wl,-lrte_mempool_octeontx2 -Wl,-lrte_ring -Wl,-lrte_pci -Wl,-lrte_eal -Wl,-lrte_cmdline -Wl,-lrte_reorder -Wl,-lrte_sched -Wl,-lrte_rcu -Wl,-lrte_kni -Wl,-lrte_common_cpt -Wl,-lrte_common_octeontx -Wl,-lrte_common_octeontx2 -Wl,-lrte_common_dpaax -Wl,-lrte_bus_pci -Wl,-lrte_bus_vdev -Wl,-lrte_bus_dpaa -Wl,-lrte_bus_fslmc -Wl,-lrte_mempool_bucket -Wl,-lrte_mempool_stack -Wl,-lrte_mempool_dpaa -Wl,-lrte_mempool_dpaa2 -Wl,-lrte_pmd_af_packet -Wl,-lrte_pmd_ark -Wl,-lrte_pmd_atlantic -Wl,-lrte_pmd_axgbe -Wl,-lrte_pmd_bnxt -Wl,-lrte_pmd_bond -Wl,-lrte_pmd_cxgbe -Wl,-lrte_pmd_dpaa -Wl,-lrte_pmd_dpaa2 -Wl,-lrte_pmd_e1000 -Wl,-lrte_pmd_ena -Wl,-lrte_pmd_enetc -Wl,-lrte_pmd_enic -Wl,-lrte_pmd_failsafe -Wl,-lrte_pmd_hinic -Wl,-lrte_pmd_hns3 -Wl,-lrte_pmd_i40e -Wl,-lrte_pmd_iavf -Wl,-lrte_pmd_ice -Wl,-lrte_pmd_ixgbe -Wl,-lrte_pmd_kni -Wl,-lrte_pmd_lio -Wl,-lrte_pmd_memif -Wl,-lrte_pmd_nfp -Wl,-lrte_pmd_null -Wl,-lrte_pmd_octeontx2 -Wl,-lrte_pmd_pfe -Wl,-lrte_pmd_qede -Wl,-lrte_pmd_ring 
-Wl,-lrte_pmd_softnic -Wl,-lrte_pmd_tap -Wl,-lrte_pmd_thunderx_nicvf -Wl,-lrte_pmd_vdev_netvsc -Wl,-lrte_pmd_virtio -Wl,-lrte_pmd_vhost -Wl,-lrte_pmd_ifc -Wl,-lrte_pmd_vmxnet3_uio -Wl,-lrte_bus_vmbus -Wl,-lrte_pmd_netvsc -Wl,-lrte_pmd_bbdev_null -Wl,-lrte_pmd_bbdev_fpga_lte_fec -Wl,-lrte_pmd_bbdev_turbo_sw -Wl,-lrte_pmd_null_crypto -Wl,-lrte_pmd_nitrox -Wl,-lrte_pmd_octeontx_crypto -Wl,-lrte_pmd_octeontx2_crypto -Wl,-lrte_pmd_crypto_scheduler -Wl,-lrte_pmd_dpaa2_sec -Wl,-lrte_pmd_dpaa_sec -Wl,-lrte_pmd_caam_jr -Wl,-lrte_pmd_virtio_crypto -Wl,-lrte_pmd_octeontx_zip -Wl,-lrte_pmd_qat -Wl,-lrte_pmd_skeleton_event -Wl,-lrte_pmd_sw_event -Wl,-lrte_pmd_dsw_event -Wl,-lrte_pmd_octeontx_ssovf -Wl,-lrte_pmd_dpaa_event -Wl,-lrte_pmd_dpaa2_event -Wl,-lrte_mempool_octeontx -Wl,-lrte_pmd_octeontx -Wl,-lrte_pmd_octeontx2_event -Wl,-lrte_pmd_opdl_event -Wl,-lrte_rawdev_skeleton -Wl,-lrte_rawdev_dpaa2_cmdif -Wl,-lrte_rawdev_dpaa2_qdma -Wl,-lrte_bus_ifpga -Wl,-lrte_rawdev_ntb -Wl,-lrte_rawdev_octeontx2_dma -Wl,--no-whole-archive -Wl,-lrt -Wl,-lm -Wl,-lnuma -Wl,-ldl -Wl,-export-dynamic -Wl,-export-dynamic
After trimming the option list by experience and bisection, a shorter command that still builds the Demo and detects the ports is:
gcc -o main main.c -pthread -I /home/dpdk-stable-19.11.14/arm64-armv8a-linuxapp-gcc/include/ -L /home/dpdk-stable-19.11.14/arm64-armv8a-linuxapp-gcc/lib/ -Wl,--whole-archive -Wl,-lrte_cfgfile -Wl,-lrte_hash -Wl,-lrte_kvargs -Wl,-lrte_mbuf -Wl,-lrte_net -Wl,-lrte_ethdev -Wl,-lrte_mempool -Wl,-lrte_mempool_ring -Wl,-lrte_ring -Wl,-lrte_pci -Wl,-lrte_bus_pci -Wl,-lrte_eal -Wl,--whole-archive -Wl,-lrte_cmdline -Wl,--whole-archive -Wl,-lrte_pmd_hinic -Wl,--no-whole-archive -Wl,-lrt -Wl,-lm -Wl,-lnuma -Wl,-ldl -Wl,-export-dynamic -Wl,-export-dynamic
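The key part is "-Wl,--whole-archive -Wl,-lrte_pmd_hinic -Wl,--no-whole-archive". It makes the linker keep every symbol from the rte_pmd_hinic static library (librte_pmd_hinic.a, which contains hinic_pmd_ethdev.o), so pciinitfn_net_hinic is called before the program enters main and registers the hinic driver.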
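The effect of --whole-archive can be illustrated with a deliberately simplified model of how a linker decides whether to pull a member out of a static archive. This is a sketch, not how ld is implemented: `model_object` and `model_member_is_linked` are hypothetical, and the symbol lists are illustrative, using names from this article (in the real object file some of these symbols are file-local).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one object file inside a static archive. */
struct model_object {
    const char *name;
    const char *const *defined; /* symbols the object defines */
    size_t n_defined;
};

/*
 * Simplified rule: a member is linked if --whole-archive is active,
 * or if it defines at least one symbol that is still undefined.
 */
static bool model_member_is_linked(const struct model_object *obj,
                                   const char *const *undefined,
                                   size_t n_undefined,
                                   bool whole_archive)
{
    assert(obj != NULL);
    if (whole_archive)
        return true;
    for (size_t i = 0; i < obj->n_defined; i++)
        for (size_t j = 0; j < n_undefined; j++)
            if (strcmp(obj->defined[i], undefined[j]) == 0)
                return true;
    return false;
}

/* Symbols that the Demo's main.c references directly. */
static const char *const model_app_undefined[] = {
    "rte_eal_init", "rte_exit", "rte_eth_dev_count_avail",
};

/* Illustrative symbols of the driver object; the Demo references none. */
static const char *const model_hinic_defined[] = {
    "pciinitfn_net_hinic", "hinic_dev_init", "hinic_func_init",
};

static const struct model_object model_hinic_obj = {
    .name = "hinic_pmd_ethdev.o",
    .defined = model_hinic_defined,
    .n_defined = 3,
};
```

With `whole_archive` set to false the model reports that hinic_pmd_ethdev.o is skipped, matching the nm result for the failing binary. With it set to true the member is kept regardless of references, which is what the trimmed command relies on.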
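We therefore recommend building applications in a way similar to the official Makefile, so that the device driver symbols are kept.
3 Issue 2: ibytes and imissed Statistics Cannot Be Obtained
Symptom
On a Kunpeng server with the TM280 NIC and DPDK 20.11.10, the application code cannot obtain the ibytes and imissed statistics through the rte_eth_stats_get interface. The statistics are read as follows: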
rte_eth_stats_get(pt_id, &stats);
The structure behind stats is defined as follows:
struct rte_eth_stats {
    uint64_t ipackets;  /**< Total number of successfully received packets. */
    uint64_t opackets;  /**< Total number of successfully transmitted packets.*/
    uint64_t ibytes;    /**< Total number of successfully received bytes. */
    uint64_t obytes;    /**< Total number of successfully transmitted bytes. */
    uint64_t imissed;
    /**< Total of RX packets dropped by the HW,
     * because there are no available buffer (i.e. RX queues are full).
     */
    uint64_t ierrors;   /**< Total number of erroneous received packets. */
    uint64_t oerrors;   /**< Total number of failed transmitted packets. */
    uint64_t rx_nombuf; /**< Total number of RX mbuf allocation failures. */
    uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
    /**< Total number of queue RX packets. */
    uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
    /**< Total number of queue TX packets. */
    uint64_t q_ibytes[RTE_ETHDEV_QUEUE_STAT_CNTRS];
    /**< Total number of successfully received queue bytes. */
    uint64_t q_obytes[RTE_ETHDEV_QUEUE_STAT_CNTRS];
    /**< Total number of successfully transmitted queue bytes. */
    uint64_t q_errors[RTE_ETHDEV_QUEUE_STAT_CNTRS];
    /**< Total number of queue packets received that are dropped. */
};
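After adding a printout of stats to the application code, we found that ipackets was counted correctly but ibytes and imissed stayed at 0. To rule out the partner's own code, we asked them to verify with testpmd, the test application shipped with DPDK. When show port stats all is executed, testpmd calls rte_eth_stats_get to fetch the statistics. The two counters were still unavailable, which suggested a general problem rather than one specific to the application.
Key Process and Root Cause Analysis
Analysis:
Statistics such as ipackets, ibytes, and imissed are reported by the hardware as raw blocks of bytes; the driver parses them and computes the final values. We therefore looked at the driver-side parsing and computation first. TM280 uses the hns3 driver. Reading the statistics code of the 20.11 release in drivers/net/hns3/hns3_stats.c shows that it only fills in ipackets and the error counters; there is no code for ibytes or imissed. Printing rte_stats just before the function returns also shows ibytes and imissed as 0.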
int hns3_stats_get(struct rte_eth_dev *eth_dev, struct rte_eth_stats *rte_stats)
{
    struct hns3_adapter *hns = eth_dev->data->dev_private;
    struct hns3_hw *hw = &hns->hw;
    struct hns3_tqp_stats *stats = &hw->tqp_stats;
    struct hns3_rx_queue *rxq;
    struct hns3_tx_queue *txq;
    uint64_t cnt;
    uint64_t num;
    uint16_t i;
    int ret;

    /* Update tqp stats by read register */
    ret = hns3_update_tqp_stats(hw);
    if (ret) {
        hns3_err(hw, "Update tqp stats fail : %d", ret);
        return ret;
    }

    /* Get the error stats of received packets */
    num = RTE_MIN(RTE_ETHDEV_QUEUE_STAT_CNTRS, eth_dev->data->nb_rx_queues);
    for (i = 0; i != num; ++i) {
        rxq = eth_dev->data->rx_queues[i];
        if (rxq) {
            cnt = rxq->l2_errors + rxq->pkt_len_errors;
            rte_stats->q_errors[i] = cnt;
            rte_stats->q_ipackets[i] =
                stats->rcb_rx_ring_pktnum[i] - cnt;
            rte_stats->ierrors += cnt;
        }
    }

    /* Get the error stats of transmitted packets */
    num = RTE_MIN(RTE_ETHDEV_QUEUE_STAT_CNTRS, eth_dev->data->nb_tx_queues);
    for (i = 0; i < num; i++) {
        txq = eth_dev->data->tx_queues[i];
        if (txq)
            rte_stats->q_opackets[i] = stats->rcb_tx_ring_pktnum[i];
    }

    rte_stats->oerrors = 0;
    rte_stats->ipackets = stats->rcb_rx_ring_pktnum_rcd -
        rte_stats->ierrors;
    rte_stats->opackets = stats->rcb_tx_ring_pktnum_rcd -
        rte_stats->oerrors;
    rte_stats->rx_nombuf = eth_dev->data->rx_mbuf_alloc_failed;

    return 0;
}
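Why the two counters read exactly 0, rather than garbage, can be shown with a small model. It assumes that the generic ethdev layer clears the statistics structure before calling the driver callback; that is an assumption of this sketch and should be checked against rte_eth_stats_get in the DPDK version in use. All `model_*` names and the counter values are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct rte_eth_stats. */
struct model_stats {
    uint64_t ipackets;
    uint64_t ibytes;
    uint64_t imissed;
    uint64_t ierrors;
};

typedef int (*model_stats_get_t)(struct model_stats *out);

/*
 * Driver callback that, like the 20.11 function above, writes
 * ipackets and ierrors but never touches ibytes or imissed.
 */
static int model_old_driver_stats_get(struct model_stats *out)
{
    const uint64_t hw_rx_pkts = 1000; /* made-up hardware counter */
    const uint64_t rx_errors = 4;     /* made-up software counter */

    out->ierrors = rx_errors;
    out->ipackets = hw_rx_pkts - out->ierrors;
    return 0;
}

/*
 * Generic wrapper (assumed behaviour): zero the structure first,
 * then let the driver fill in whatever it supports.
 */
static int model_stats_get(model_stats_get_t driver_cb, struct model_stats *out)
{
    assert(driver_cb != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    return driver_cb(out);
}
```

Under this model any field the driver does not implement stays at the zero written by the wrapper, which matches the observation that ipackets looks correct while ibytes and imissed are always 0.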
The same function in the 21.11 source does contain code that fills in imissed and ibytes.
int hns3_stats_get(struct rte_eth_dev *eth_dev, struct rte_eth_stats *rte_stats)
{
    ...
    /* Update imissed stats */
    ret = hns3_update_imissed_stats(hw, false);
    if (ret) {
        hns3_err(hw, "update imissed stats failed, ret = %d",
                 ret);
        return ret;
    }
    rte_stats->imissed = imissed_stats->rpu_rx_drop_cnt +
        imissed_stats->ssu_rx_drop_cnt;

    /* Get the error stats and bytes of received packets */
    for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
        rxq = eth_dev->data->rx_queues[i];
        if (rxq == NULL)
            continue;

        cnt = hns3_read_dev(rxq, HNS3_RING_RX_PKTNUM_RECORD_REG);
        /*
         * Read hardware and software in adjacent positions to minimize
         * the timing variance.
         */
        rte_stats->ierrors += rxq->err_stats.l2_errors +
            rxq->err_stats.pkt_len_errors;
        stats->rcb_rx_ring_pktnum_rcd += cnt;
        stats->rcb_rx_ring_pktnum[i] += cnt;
        rte_stats->ibytes += rxq->basic_stats.bytes;
    }

    /* Reads all the stats of a txq in a loop to keep them synchronized */
    for (i = 0; i < eth_dev->data->nb_tx_queues; i++) {
        txq = eth_dev->data->tx_queues[i];
        if (txq == NULL)
            continue;

        cnt = hns3_read_dev(txq, HNS3_RING_TX_PKTNUM_RECORD_REG);
        stats->rcb_tx_ring_pktnum_rcd += cnt;
        stats->rcb_tx_ring_pktnum[i] += cnt;
        rte_stats->obytes += txq->basic_stats.bytes;
    }
    ...
}
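The receive-side aggregation in the 21.11 excerpt above can be reduced to a self-contained sketch. The field names mirror the excerpt, but the `model_*` types and function are hypothetical, and the register reads are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue software counters, named after the excerpt. */
struct model_rxq {
    uint64_t bytes;
    uint64_t l2_errors;
    uint64_t pkt_len_errors;
};

/* Hypothetical port-level drop counters, named after the excerpt. */
struct model_imissed {
    uint64_t rpu_rx_drop_cnt;
    uint64_t ssu_rx_drop_cnt;
};

struct model_port_stats {
    uint64_t ibytes;
    uint64_t imissed;
    uint64_t ierrors;
};

/*
 * Port-level imissed is the sum of the two drop counters; ibytes and
 * ierrors are summed over every configured RX queue, skipping NULL
 * slots as the driver loop does.
 */
static void model_aggregate_rx(const struct model_rxq *const *rxqs,
                               size_t nb_rxq,
                               const struct model_imissed *missed,
                               struct model_port_stats *out)
{
    assert(missed != NULL && out != NULL);
    out->ibytes = 0;
    out->ierrors = 0;
    out->imissed = missed->rpu_rx_drop_cnt + missed->ssu_rx_drop_cnt;

    for (size_t i = 0; i < nb_rxq; i++) {
        const struct model_rxq *q = rxqs[i];

        if (q == NULL)
            continue;
        out->ierrors += q->l2_errors + q->pkt_len_errors;
        out->ibytes += q->bytes;
    }
}
```

For example, two queues holding 1500 and 500 bytes with drop counters 3 and 4 give ibytes = 2000 and imissed = 7. In the 20.11 function shown earlier nothing writes either field at all.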
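Root cause:
The Git history shows that the commits that added the ibytes and imissed statistics are 3e9f30 and fdcd6a.
Using git rev-list --count $object, we inferred that these commits were merged between v20.11 and v21.11.
We tracked down the authors of this code and confirmed that the hns3 driver features were still under iterative development at the time, so some statistics were not implemented in the older release and landed only in the newer one.
Conclusion, Solution, and Result
After building DPDK 21.11 in this environment, running show port stats all in testpmd reports the imissed and ibytes statistics correctly.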