Knot with XDP causes new SSH connections to the server to fail
OS: Rocky Linux release 9.4 (Blue Onyx)
knot version: knot-3.3.8
NIC: eno1, Intel Corporation 82599ES 10-Gigabit SFI/SFP+ (XDP enabled)
xdp:
    listen: eno1@53
After starting knotd, new SSH sessions cannot be established unless I lower the MTU of the NIC:
ifconfig eno1 mtu 1010
Obviously this is not a good solution. Do you know the reason?
- Maintainer
What was the MTU beforehand, please?
- Author
The default value is 1500
- Owner
Is all incoming traffic affected (ping)?
Do you see any error statistic counters increasing (ethtool -S eno1)? Any errors in sudo dmesg?
Edited by Daniel Salzman
- Author
No, ping is OK and dig is OK too; only new SSH sessions fail.
The error statistic counters did not increase:
ethtool -S eno1|grep err
rx_errors: 0
tx_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_fifo_errors: 0
rx_missed_errors: 2221399
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
rx_length_errors: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_csum_offload_errors: 0
The dmesg output doesn't have any useful information either:
[107580.593103] ixgbe 0000:01:00.0 eno1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[107580.593275] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[107834.848927] ixgbe 0000:01:00.0 eno1: detected SFP+: 6
[107835.103830] ixgbe 0000:01:00.0 eno1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[125795.500271] ixgbe 0000:01:00.0: removed PHC on eno1
[125795.569944] ixgbe 0000:01:00.0: Multiqueue Enabled: Rx Queue count = 40, Tx Queue count = 40 XDP Queue count = 40
[125795.814978] ixgbe 0000:01:00.0: registered PHC device on eno1
[125795.979346] ixgbe 0000:01:00.0 eno1: detected SFP+: 6
[125796.542122] ixgbe 0000:01:00.0 eno1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[168038.273797] ixgbe 0000:01:00.0 eno1: detected SFP+: 6
[168038.524700] ixgbe 0000:01:00.0 eno1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[168325.668767] device eno1 entered promiscuous mode
[168327.002373] device eno1 left promiscuous mode
[168344.306066] device eno1 entered promiscuous mode
[168348.905656] device eno1 left promiscuous mode
[168356.016447] device eno1 entered promiscuous mode
[168359.087512] device eno1 left promiscuous mode
[168396.049894] device eno1 entered promiscuous mode
[168396.065757] device eno1 left promiscuous mode
[168402.729719] device eno1 entered promiscuous mode
- Owner
Using tcpdump, don't you see the incoming TCP SSH packets?
You can use sudo xdpdump -i eno1 from the xdp-tools package for diagnostics.
- Author
No, I can see them:
tcpdump -nnn -i any host xxx.xxx.xxx.xxx and port 22
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
15:07:44.765721 eno1 In IP xxx.xxx.xxx.xxx.32984 > yyy.yyy.yyy.yyy.22: Flags [S], seq 2853627326, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:07:44.765820 eno1 Out IP yyy.yyy.yyy.yyy.22 > xxx.xxx.xxx.xxx.32984: Flags [S.], seq 3691694133, ack 2853627327, win 21900, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:07:44.983948 eno1 In IP xxx.xxx.xxx.xxx.32984 > yyy.yyy.yyy.yyy.22: Flags [.], ack 1, win 58, length 0
15:07:44.985044 eno1 In IP xxx.xxx.xxx.xxx.32984 > yyy.yyy.yyy.yyy.22: Flags [P.], seq 1:22, ack 1, win 58, length 21: SSH: SSH-2.0-OpenSSH_8.0
15:07:44.985079 eno1 Out IP yyy.yyy.yyy.yyy.22 > xxx.xxx.xxx.xxx.32984: Flags [.], ack 22, win 43, length 0
15:07:45.009086 eno1 Out IP yyy.yyy.yyy.yyy.22 > xxx.xxx.xxx.xxx.32984: Flags [P.], seq 1:22, ack 22, win 43, length 21: SSH: SSH-2.0-OpenSSH_8.7
15:07:45.227268 eno1 In IP xxx.xxx.xxx.xxx.32984 > yyy.yyy.yyy.yyy.22: Flags [.], ack 22, win 58, length 0
15:07:45.227315 eno1 Out IP yyy.yyy.yyy.yyy.22 > xxx.xxx.xxx.xxx.32984: Flags [P.], seq 22:990, ack 22, win 43, length 968
15:07:45.486478 eno1 In IP xxx.xxx.xxx.xxx.32984 > yyy.yyy.yyy.yyy.22: Flags [.], ack 990, win 61, length 0
15:07:45.884562 eno1 In IP xxx.xxx.xxx.xxx.32984 > yyy.yyy.yyy.yyy.22: Flags [P.], seq 1278:1326, ack 990, win 61, length 48
15:07:45.884609 eno1 Out IP yyy.yyy.yyy.yyy.22 > xxx.xxx.xxx.xxx.32984: Flags [.], ack 22, win 43, options [nop,nop,sack 1 {1278:1326}], length 0
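One possible reading of this capture: the last outgoing ACK still acknowledges byte 22 while SACKing 1278:1326, which suggests the client data for 22:1278 never reached the server's TCP stack. The sketch below only does that bookkeeping. The helper name missing_ranges is hypothetical, the ranges are copied by hand from the capture, and the 40-byte header figure assumes plain IPv4 and TCP headers without options and a single lost segment.

```python
def missing_ranges(acked_up_to, received):
    """Return [start, end) byte ranges not covered by the received segments,
    starting from the receiver's cumulative ACK point."""
    gaps = []
    cursor = acked_up_to
    for start, end in sorted(received):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    return gaps


# Relative sequence ranges of client -> server payload seen in the capture.
inbound_segments = [(1, 22), (1278, 1326)]

gaps = missing_ranges(1, inbound_segments)
for start, end in gaps:
    payload = end - start
    # Assumption: 20-byte IPv4 header + 20-byte TCP header, no options.
    print(f"missing {start}:{end}, {payload} payload bytes, "
          f"about {payload + 40} bytes as one IP packet")
```

If that reading is right, the packets that go missing are the large ones, while the short ones (SYN, ACKs, the 21-byte banner, the 48-byte segment) get through. That would fit the observation that ping and dig still work.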
- Maintainer
You can use ssh -vvv to see in what phase of the connection process it fails and why. Or you can temporarily raise the LogLevel in sshd_config to DEBUG, or up to DEBUG3, to see what exactly is going on.
- Author
OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
debug1: Reading configuration data /root/.ssh/config
debug1: /root/.ssh/config line 2: Applying options for *
debug3: kex names ok: [diffie-hellman-group1-sha1]
debug1: Reading configuration data /etc/ssh/ssh_config
debug3: /etc/ssh/ssh_config line 52: Including file /etc/ssh/ssh_config.d/05-redhat.conf depth 0
debug1: Reading configuration data /etc/ssh/ssh_config.d/05-redhat.conf
debug2: checking match for 'final all' host yyy.yyy.yyy.yyy originally yyy.yyy.yyy.yyy
debug3: /etc/ssh/ssh_config.d/05-redhat.conf line 3: not matched 'final'
debug2: match not found
debug3: /etc/ssh/ssh_config.d/05-redhat.conf line 5: Including file /etc/crypto-policies/back-ends/openssh.config depth 1 (parse only)
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug3: gss kex names ok: [gss-curve25519-sha256-,gss-nistp256-sha256-,gss-group14-sha256-,gss-group16-sha512-,gss-gex-sha1-,gss-group14-sha1-]
debug3: kex names ok: [curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1]
debug1: configuration requests final Match pass
debug2: resolve_canonicalize: hostname yyy.yyy.yyy.yyy is address
debug1: re-parsing configuration
debug1: Reading configuration data /root/.ssh/config
debug1: /root/.ssh/config line 2: Applying options for *
debug3: kex names ok: [diffie-hellman-group1-sha1]
debug1: Reading configuration data /etc/ssh/ssh_config
debug3: /etc/ssh/ssh_config line 52: Including file /etc/ssh/ssh_config.d/05-redhat.conf depth 0
debug1: Reading configuration data /etc/ssh/ssh_config.d/05-redhat.conf
debug2: checking match for 'final all' host yyy.yyy.yyy.yyy originally yyy.yyy.yyy.yyy
debug3: /etc/ssh/ssh_config.d/05-redhat.conf line 3: matched 'final'
debug2: match found
debug3: /etc/ssh/ssh_config.d/05-redhat.conf line 5: Including file /etc/crypto-policies/back-ends/openssh.config depth 1
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug3: gss kex names ok: [gss-curve25519-sha256-,gss-nistp256-sha256-,gss-group14-sha256-,gss-group16-sha512-,gss-gex-sha1-,gss-group14-sha1-]
debug3: kex names ok: [curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1]
debug2: ssh_connect_direct
debug1: Connecting to yyy.yyy.yyy.yyy [yyy.yyy.yyy.yyy] port 22.
debug1: Connection established.
debug1: identity file /root/.ssh/id_rsa type 0
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: identity file /root/.ssh/id_xmss type -1
debug1: identity file /root/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.0
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.7
debug1: match: OpenSSH_8.7 pat OpenSSH* compat 0x04000000
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to yyy.yyy.yyy.yyy:22 as 'root'
debug3: hostkeys_foreach: reading file "/dev/null"
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,ext-info-c,kex-strict-c-v00@openssh.com
debug2: host key algorithms: ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,ssh-ed25519-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-rsa-cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc
debug2: ciphers stoc: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc
debug2: MACs ctos: hmac-sha2-256-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha2-256,hmac-sha1,umac-128@openssh.com,hmac-sha2-512
debug2: MACs stoc: hmac-sha2-256-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha2-256,hmac-sha1,umac-128@openssh.com,hmac-sha2-512
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,kex-strict-s-v00@openssh.com
debug2: host key algorithms: rsa-sha2-512,rsa-sha2-256,ecdsa-sha2-nistp256,ssh-ed25519
debug2: ciphers ctos: aes256-gcm@openssh.com,chacha20-poly1305@openssh.com,aes256-ctr,aes128-gcm@openssh.com,aes128-ctr
debug2: ciphers stoc: aes256-gcm@openssh.com,chacha20-poly1305@openssh.com,aes256-ctr,aes128-gcm@openssh.com,aes128-ctr
debug2: MACs ctos: hmac-sha2-256-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha2-256,hmac-sha1,umac-128@openssh.com,hmac-sha2-512
debug2: MACs stoc: hmac-sha2-256-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha2-256,hmac-sha1,umac-128@openssh.com,hmac-sha2-512
debug2: compression ctos: none,zlib@openssh.com
debug2: compression stoc: none,zlib@openssh.com
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug3: will use strict KEX ordering
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: aes128-ctr MAC: hmac-sha2-256-etm@openssh.com compression: none
debug1: kex: client->server cipher: aes128-ctr MAC: hmac-sha2-256-etm@openssh.com compression: none
debug1: kex: curve25519-sha256 need=32 dh_need=32
debug1: kex: curve25519-sha256 need=32 dh_need=32
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
- Author
IPv6 doesn't support an MTU smaller than 1280, so I can't lower the MTU to 1010. That means I have to give up using XDP, which is bad.
- Maintainer
Could you please try the xdpdump command, as @dsalzman suggested? It would help us a lot; apparently some incoming packets are dropped before they reach sshd.
"You can use sudo xdpdump -i eno1 from package xdp-tools for diagnostics."
- David Vasek closed
- David Vasek reopened
- Maintainer
It would be best if you could get output from tcpdump, from xdpdump, and debug output from ssh, all from the same session. Thank you.
- Author
The output of xdpdump is very large and hard to read. Can you test it?
- Owner
The aim is to find where the lost TCP packets are. xdpdump shows all packets that are redirected through an XDP socket to Knot. However, I don't know how these small MTU limits can affect the traffic. The XDP buffers are of fixed size (2048 - 256 octets).
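As a rough sanity check of the buffer-size remark, the arithmetic can be written out. This is only a sketch: it takes the 2048 and 256 figures from the comment above and assumes an untagged 14-byte Ethernet header. The constant names are made up for illustration and are not taken from Knot's source.

```python
# Figures quoted in the comment above; names are illustrative only.
BUFFER_TOTAL = 2048
BUFFER_RESERVED = 256
# Assumption: plain Ethernet II header, no VLAN tag.
ETHERNET_HEADER = 14

usable = BUFFER_TOTAL - BUFFER_RESERVED

for mtu in (1010, 1280, 1500):
    frame = mtu + ETHERNET_HEADER
    verdict = "fits" if frame <= usable else "does not fit"
    print(f"MTU {mtu}: frame of {frame} octets {verdict} in {usable} octets")
```

Under these assumptions, a full frame at the default MTU of 1500 still fits in one buffer, so the buffer size alone would not explain why lowering the MTU helps.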
- Owner
Have you already been using Knot with XDP? What has changed since then? Also keep in mind that your NIC/driver (ixgbe?) isn't well supported.
- Author
Yes. The other servers use Intel X710 NICs, and they work properly with XDP enabled in Knot. The four faulty hosts all have Intel 82599ES NICs. I am in a rush to provide services right now, so I have disabled XDP.
- Author
I tested it with xdpdump.
I found another problem. I connected to the server over SSH and started Knot with XDP. At that point my existing SSH session worked normally, but I could not create a new one. However, when I killed knotd, my existing SSH session was also interrupted unexpectedly, and I had to connect to the server through iDRAC to execute
ip link set dev eno1 xdp off
After I ran this command, the server rebooted on its own, which is very strange.
- Owner
Could you check whether the IP packets that get lost in XDP mode are fragmented? I mean, check it in normal mode (without XDP).
Edited by Daniel Salzman
- Author
I don't understand what you mean. What does normal mode mean? How do I check for fragmented packets?
- Owner
Use tcpdump/Wireshark on the SSH traffic when XDP isn't active and inspect the IP headers. However, I wouldn't use XDP with this card, as it brings nothing but problems...
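For reference, "inspecting the IP headers" comes down to two fields in bytes 6 and 7 of the IPv4 header: the More Fragments flag and the fragment offset. The sketch below decodes them from raw header bytes. The function names and the sample headers are made up for illustration and are not taken from this capture. In practice, tcpdump -v prints the same fields as "offset" and "flags" for each packet.

```python
import struct


def fragment_info(ipv4_header: bytes) -> dict:
    """Decode the fragmentation fields of a raw IPv4 header."""
    if len(ipv4_header) < 20 or ipv4_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Bytes 6-7: 3 flag bits followed by a 13-bit offset in 8-byte units.
    (flags_frag,) = struct.unpack("!H", ipv4_header[6:8])
    more_fragments = bool(flags_frag & 0x2000)
    offset_bytes = (flags_frag & 0x1FFF) * 8
    return {
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": more_fragments,
        "offset_bytes": offset_bytes,
        # A packet is a fragment if more follow or it is not the first piece.
        "is_fragment": more_fragments or offset_bytes > 0,
    }


def sample_header(flags_frag: int) -> bytes:
    """Build a placeholder 20-byte IPv4/TCP header (zero checksum and addresses)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, flags_frag,
                       64, 6, 0, bytes(4), bytes(4))


for label, value in (("DF set", 0x4000), ("first fragment", 0x2000),
                     ("later fragment", 0x00B9)):
    print(label, fragment_info(sample_header(value)))
```

An unfragmented packet shows offset 0 with More Fragments clear. Anything else means the packet is one piece of a fragmented datagram.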
- Author
- Owner
It's quite old (Q2'09), but the bigger problem is that Knot in XDP mode is unstable with it (https://knot.pages.nic.cz/knot-dns/master/html/operation.html#pre-requisites). We have neither the time nor the motivation to investigate what is wrong.
- Author
Well, thank you for your reply. I'll try replacing the network adapter. Closing the issue.
- yuchunyun closed