On Wed, Nov 23, 2022 at 05:45:53PM +0800, Guoqing Jiang wrote:
Yes, I misanalyzed earlier.
But it is the caller's responsibility to destroy it since commit
dd37d2f59eb8.
The causes are as follows:
rdma_listen()
  rdma_bind_addr()
    cma_acquire_dev_by_src_ip()
      cma_attach_to_dev()
        _cma_attach_to_dev()
          cma_dev_get()
Thanks for the analysis.
And the two callers of cma_listen_on_dev look like they handle
failure differently.
Yes, the CM is not the problem, and that print from it is unrelated.
Rxe does not have this issue, maybe because it does not support vlan dev.
I patched in netdevice_tracker and got this:
[ 237.475070][ T7541] unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
[ 237.477311][ T7541] leaked reference.
[ 237.478378][ T7541] ib_device_set_netdev+0x266/0x730
[ 237.479848][ T7541] siw_newlink+0x4e0/0xfd0
[ 237.481100][ T7541] nldev_newlink+0x35c/0x5c0
[ 237.482121][ T7541] rdma_nl_rcv_msg+0x36d/0x690
[ 237.483312][ T7541] rdma_nl_rcv+0x2ee/0x430
[ 237.484483][ T7541] netlink_unicast+0x543/0x7f0
[ 237.485746][ T7541] netlink_sendmsg+0x918/0xe20
[ 237.486866][ T7541] sock_sendmsg+0xcf/0x120
[ 237.488006][ T7541] ____sys_sendmsg+0x70d/0x8b0
[ 237.489294][ T7541] ___sys_sendmsg+0x11d/0x1b0
[ 237.490404][ T7541] __sys_sendmsg+0xfa/0x1d0
[ 237.491451][ T7541] do_syscall_64+0x35/0xb0
[ 237.492566][ T7541] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Which seems to confirm my original prediction, except this is siw, not
rxe.
Maybe rxe was the wrong guess, or maybe it is troubled too in other
reports?
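To illustrate the kind of bookkeeping that produces a "leaked reference"
report like the one above, here is a minimal userspace sketch. It is a toy
model and not kernel code: toy_dev, toy_hold, toy_put and toy_report_leaks
are hypothetical names made up for this example, and the fixed-size holder
table is an assumption chosen for brevity. The idea is only that every hold
records who took it, so anything not dropped before unregister can be named.

```c
/* Userspace toy model of tracked references; NOT kernel code.
 * Every name here is hypothetical and exists only for illustration. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_TRACKERS 8

struct toy_dev {
	const char *name;
	int usage;                            /* current reference count */
	const char *holder[TOY_MAX_TRACKERS]; /* tag for each live tracked hold */
};

/* Start with one base reference owned by the registration itself. */
static void toy_dev_init(struct toy_dev *d, const char *name)
{
	memset(d, 0, sizeof(*d));
	d->name = name;
	d->usage = 1;
}

/* Take a reference and remember who took it.
 * Returns the tracker slot, or -1 if the table is full. */
static int toy_hold(struct toy_dev *d, const char *who)
{
	for (int i = 0; i < TOY_MAX_TRACKERS; i++) {
		if (!d->holder[i]) {
			d->holder[i] = who;
			d->usage++;
			return i;
		}
	}
	return -1;
}

/* Drop the reference recorded in the given tracker slot. */
static void toy_put(struct toy_dev *d, int slot)
{
	assert(slot >= 0 && slot < TOY_MAX_TRACKERS && d->holder[slot]);
	d->holder[slot] = NULL;
	d->usage--;
}

/* Print every hold still outstanding and return how many there are. */
static int toy_report_leaks(const struct toy_dev *d)
{
	int leaks = 0;

	for (int i = 0; i < TOY_MAX_TRACKERS; i++) {
		if (d->holder[i]) {
			printf("%s: leaked reference taken by %s, usage count = %d\n",
			       d->name, d->holder[i], d->usage);
			leaks++;
		}
	}
	return leaks;
}
```

In this model, a hold that is never matched by toy_put() keeps the usage
count above the base value of 1, and toy_report_leaks() names the tag that
was passed to toy_hold(). The real tracker records stack traces rather than
string tags.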
Jason