Ondřej Caletka activity

Ondřej Caletka commented on issue #392 at Turris / Turris OS / Turris Build

2024-03-11T22:26:01+01:00

Hey, I can confirm that the issue is still present in Turris OS 7.0 on Turris Mox running kernel 5.15.151.

Ondřej Caletka commented on issue #355 at Turris / Turris OS / Turris Build

2024-03-11T22:25:48+01:00

Hey, I can confirm that the issue is still present in Turris OS 7.0 on Turris Mox running kernel 5.15.151.

Ondřej Caletka opened issue #905: Referral is sometimes sent in place of answer with DNS64 enabled at Knot projects / Knot Resolver

2024-02-29T00:00:00+01:00

In my setup, Knot Resolver occasionally returns a wrong answer to a DoH client querying the A record of an IPv4-only name while the DNS64 module is active. It happens only when all of these conditions are met:

queried name is an apex name with A but no AAAA record
dns64 module is loaded
neither the queried rrset nor the nsset of the zone is in cache
client is using doh2 and asking concurrently for A and AAAA records (the queries can come via completely independent HTTP/2 sessions though)

If all these conditions are fulfilled, Knot Resolver sometimes answers the A query with the referral received from the parent zone of the queried name. I was able to reproduce the issue on these names:

github.com
duckduckgo.com
liberec.cz
ipv4only.arpa

Steps to reproduce

I reproduce the issue on Knot Resolver 5.7.1 installed from the EPEL repository on Fedora 39 with this configuration:

(cache size is set to the lowest possible value to increase the probability of hitting the issue)

modules = {'dns64'}
net.listen('::1', 443, { kind = 'doh2' })
cache.size = 32768
user('knot-resolver','knot-resolver')

I use this script to keep repeating queries with the doh utility until A records are missing from the response. That happens within about 15 minutes at most:

#!/bin/bash

domain=${1-github.com}

# Enable debugging
socat - unix-connect:/run/knot-resolver/control/1 <<EOF
policy.add(policy.suffix(policy.DEBUG_ALWAYS, policy.todnames({'$domain'})))
EOF

while true;
do
        date
        out="$(doh -k $domain https://[::1]/dns-query)";
        echo "$out";
        grep -q "^A:" <<<"$out" || break;
        sleep 1;
done
date

I was not able to reproduce the issue using the kdig tool, possibly because it sends queries sequentially and my shell was not fast enough to spawn a second instance of kdig before the first one finished.
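
Since the trigger seems to be A and AAAA queries in flight at the same time, a reproducer has to issue both concurrently. Below is a minimal Python sketch of that idea, assuming RFC 8484 GET requests. The names build_query, doh_url, query_both and missing_answer are illustrative only, and send is a placeholder callable that the caller must supply (the Python standard library has no HTTP/2 client, which a doh2 endpoint is expected to require):

```python
import base64
import struct
from concurrent.futures import ThreadPoolExecutor

QTYPES = {"A": 1, "AAAA": 28}


def build_query(name: str, qtype: str) -> bytes:
    """Build a minimal DNS query: ID 0, RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPES[qtype], 1)


def doh_url(base_url: str, name: str, qtype: str) -> str:
    """Encode the query as an RFC 8484 GET URL (base64url, no padding)."""
    wire = build_query(name, qtype)
    param = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{base_url}?dns={param}"


def query_both(send, base_url: str, name: str) -> dict:
    """Submit the A and AAAA queries to two worker threads at once.

    `send` is a caller-supplied function taking a URL and returning the raw
    DNS response body (hypothetical; it needs an HTTP/2 capable client).
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            qtype: pool.submit(send, doh_url(base_url, name, qtype))
            for qtype in ("A", "AAAA")
        }
        return {qtype: future.result() for qtype, future in futures.items()}


def missing_answer(response: bytes) -> bool:
    """True if the response is NOERROR but carries no answer records."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return (flags & 0x000F) == 0 and ancount == 0
```

For a name known to have A records, missing_answer() on the A response corresponds to the grep check in the script above.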

Packet capture of the issue

I am attaching a packet capture together with the TLS key log, as well as kresd syslogs of the issue demonstrated when querying ipv4only.arpa. The issue is clearly visible with the Wireshark filter set to: lower(dns.qry.name) == "ipv4only.arpa". Packets 31 - 188 show correct behavior, packets 256 - 422 show the issue, particularly packet 359, which contains the referral from packet 354 instead of the answer from packet 417:

No.     Protocol Info
     31 DoH      Standard query 0x0000 A ipv4only.arpa
     36 DoH      Standard query 0x0000 AAAA ipv4only.arpa
     65 DNS      Standard query 0x53fb AAAA ipV4oNlY.arpa OPT
     66 DNS      Standard query 0x9e3d A iPv4onLY.ARPA OPT
     67 DNS      Standard query response 0x53fb AAAA ipV4oNlY.arpa NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org NSEC iris.arpa RRSIG OPT
     69 DNS      Standard query response 0x9e3d A iPv4onLY.ARPA NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org NSEC iris.arpa RRSIG OPT
    108 DNS      Standard query 0xb804 AAAA iPV4oNLY.aRpa OPT
    124 DNS      Standard query response 0xb804 AAAA iPV4oNLY.aRpa SOA sns.dns.icann.org OPT
    142 DNS      Standard query 0x4de9 A Ipv4onlY.aRPa OPT
    144 DNS      Standard query response 0x4de9 A Ipv4onlY.aRPa NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org NSEC iris.arpa RRSIG OPT
    174 DNS      Standard query 0xc998 A IpV4oNly.ARPa OPT
    179 DNS      Standard query response 0xc998 A IpV4oNly.ARPa A 192.0.0.170 A 192.0.0.171 NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org OPT
    184 DoH      Standard query response 0x0000 AAAA ipv4only.arpa AAAA 64:ff9b::c000:aa AAAA 64:ff9b::c000:ab SOA sns.dns.icann.org
    188 DoH      Standard query response 0x0000 A ipv4only.arpa A 192.0.0.170 A 192.0.0.171
    256 DoH      Standard query 0x0000 A ipv4only.arpa
    261 DoH      Standard query 0x0000 AAAA ipv4only.arpa
    287 DNS      Standard query 0x23b6 AAAA ipV4oNlY.arPa OPT
    288 DNS      Standard query 0x8503 A IpV4ONLy.ARpA OPT
    292 DNS      Standard query response 0x23b6 AAAA ipV4oNlY.arPa NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org NSEC iris.arpa RRSIG OPT
    293 DNS      Standard query response 0x8503 A IpV4ONLy.ARpA NS b.iana-servers.net NS ns.icann.org NS a.iana-servers.net NS c.iana-servers.net NSEC iris.arpa RRSIG OPT
    328 DNS      Standard query 0x4ab4 AAAA iPV4ONLy.arpa OPT
    330 DNS      Standard query response 0x4ab4 AAAA iPV4ONLy.arpa SOA sns.dns.icann.org OPT
    350 DNS      Standard query 0x17fa A ipv4ONLY.ARpa OPT
    354 DNS      Standard query response 0x17fa A ipv4ONLY.ARpa NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org NSEC iris.arpa RRSIG OPT
    359 DoH      Standard query response 0x0000 A ipv4only.arpa NS ns.icann.org NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net
    407 DNS      Standard query 0x0f40 A IPv4oNly.arpA OPT
    417 DNS      Standard query response 0x0f40 A IPv4oNly.arpA A 192.0.0.170 A 192.0.0.171 NS a.iana-servers.net NS b.iana-servers.net NS c.iana-servers.net NS ns.icann.org OPT
    422 DoH      Standard query response 0x0000 AAAA ipv4only.arpa AAAA 64:ff9b::c000:aa AAAA 64:ff9b::c000:ab SOA sns.dns.icann.org
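
The synthesized AAAA records in packets 184 and 422 can be checked by hand: DNS64 with the well-known prefix 64:ff9b::/96 places the 32-bit IPv4 address into the low 32 bits of the IPv6 address. A short Python sketch of that arithmetic, assuming the well-known prefix is the one in use (synthesize_aaaa is an illustrative name, not part of Knot Resolver):

```python
import ipaddress

# Well-known NAT64/DNS64 prefix; assumed here because the capture shows it.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(ipv4: str, prefix=WELL_KNOWN_PREFIX) -> str:
    """Embed an IPv4 address into the low 32 bits of a /96 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return str(ipaddress.IPv6Address(int(prefix.network_address) | int(v4)))


print(synthesize_aaaa("192.0.0.170"))  # 64:ff9b::c000:aa
print(synthesize_aaaa("192.0.0.171"))  # 64:ff9b::c000:ab
```

This is why the AAAA response needs the A records to be resolved first, and why the A and AAAA resolutions of the same name interact inside the resolver.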

Ondřej Caletka left project labs / sitevalidator

2024-02-28T21:50:51+01:00

Ondřej Caletka left project labs / gitlab

2024-02-28T21:49:30+01:00

Ondřej Caletka left project Martin Straka / dnssec-libs

2024-02-28T21:49:12+01:00

Ondřej Caletka opened issue #900: Manager breaks if network interface name contains a hyphen at Knot projects / Knot Resolver

2024-02-18T15:22:15+01:00

One of my network interfaces is named mtg-dns. If I put it into the declarative config like this:

network:              
  listen:           
    - interface: mtg-dns
    - interface: mtg-dns
      kind: dot      
    - interface: mtg-dns                                                                                    
      kind: doh2

kresd fails to start, logging this error:

kresd0[7036]: [system] error while loading config: kresd0.conf:137: attempt to perform arithmetic on field 'mtg' (a nil value) (workdir '/run/knot-resolver')

I am running kresd 6.0.4 from Fedora COPR on Oracle Linux 9.
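
The error message is consistent with the generated Lua config referring to the interface as net.mtg-dns, which Lua parses as the subtraction net.mtg - dns. That is only my assumption, since I have not shown line 137 of kresd0.conf here. Under that assumption, a generator could fall back to bracket indexing whenever the name is not a valid Lua identifier. A minimal Python sketch (lua_index is a hypothetical helper, not the manager's actual code):

```python
import re

# Reserved words that cannot be used with dotted field access in Lua.
LUA_KEYWORDS = frozenset({
    "and", "break", "do", "else", "elseif", "end", "false", "for",
    "function", "goto", "if", "in", "local", "nil", "not", "or",
    "repeat", "return", "then", "true", "until", "while",
})
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def lua_index(table: str, key: str) -> str:
    """Return a Lua expression that indexes `table` by the string `key`."""
    if _IDENTIFIER.fullmatch(key) and key not in LUA_KEYWORDS:
        return f"{table}.{key}"
    # Not a plain identifier: use a quoted string key instead.
    escaped = key.replace("\\", "\\\\").replace('"', '\\"')
    return f'{table}["{escaped}"]'


print(lua_index("net", "eth0"))     # net.eth0
print(lua_index("net", "mtg-dns"))  # net["mtg-dns"]
```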

Ondřej Caletka opened issue #24: Turris 1.x: brightness is decreased on every restart of rainbow at Turris / rainbow-ng

2023-09-02T10:26:13+02:00

On Turris 1.X running 6.4.2 with rainbow version 0.1.4-1, when brightness is set to less than or equal to 223 (on the precise scale), weird things start to happen:

# rainbow brightness -p 223
# uci show rainbow
rainbow.all=led
rainbow.all.brightness='064'
# rainbow reset -n
# uci show rainbow
rainbow.all=led
rainbow.all.brightness='004'
# rainbow reset -n
# uci show rainbow
rainbow.all=led
rainbow.all.brightness='000'

After the second restart, the LEDs are completely dark. When brightness is set to more than 223, it is stored as 255, and this value survives any number of restarts. But that is just too bright.

Ondřej Caletka opened issue #797: DNS64 synthesis fails for tudelft.account.worldcat.org at Knot projects / Knot Resolver

2023-06-17T11:22:24+02:00

In kresd version 5.6.0 with DNS64 module enabled, when resolving tudelft.account.worldcat.org, DNS64 does not kick in:

$ dig tudelft.account.worldcat.org a   

; <<>> DiG 9.16.37 <<>> tudelft.account.worldcat.org a
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52064
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;tudelft.account.worldcat.org.  IN      A

;; ANSWER SECTION:
tudelft.account.worldcat.org. 2459 IN   CNAME   emea.account.worldcat.org.
emea.account.worldcat.org. 28   IN      A       193.240.184.98

$ dig tudelft.account.worldcat.org aaaa

; <<>> DiG 9.16.37 <<>> tudelft.account.worldcat.org aaaa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63626
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; EDE: 4 (Forged Answer): (BHD4: DNS64 synthesis)
;; QUESTION SECTION:
;tudelft.account.worldcat.org.  IN      AAAA

;; AUTHORITY SECTION:
worldcat.org.           653     IN      SOA     michelle.ns.cloudflare.com. dns.cloudflare.com. 2312413286 10000 2400 604800 1800

The zone in question is hosted by Cloudflare and has DNSSEC enabled, so my wild guess is that it has something to do with the way Cloudflare signs negative answers.

Ondřej Caletka commented on issue #392 at Turris / Turris OS / Turris Build

2023-06-13T15:30:40+02:00

Great find, although this is more related to #355. The broadcast leak between VLANs (this issue) seems to be fixed in kernel version 6.1, but it is still present in the current version 5.15.

Ondřej Caletka opened issue #392: VLAN leak with bridge VLAN filtering on MOX running TOS 6.0.4 at Turris / Turris OS / Turris Build

2022-12-16T17:15:04+01:00

Possibly related to #355.

In my setup, Turris Mox has all interfaces in a bridge, and VLAN filtering is used to set up different roles for different ports. In this output of the bridge vlan command, port lan1 has only one allowed VLAN, number 60, which is also the PVID. This port is used as an access port for computers.

# bridge vlan
port              vlan-id  
eth0              20 PVID Egress Untagged
                  21
                  60
lan1              60 PVID Egress Untagged
lan2              62 PVID Egress Untagged
lan3              20 PVID Egress Untagged
                  21
                  60
lan4              22 PVID Egress Untagged
br-guest_turris   1 PVID Egress Untagged
br-lan            20
                  21
                  22
                  60
                  62
wlan1             22 PVID Egress Untagged
wlan0             22 PVID Egress Untagged
wlan0-1           62 PVID Egress Untagged

Despite this setup, I can see some tagged frames with VLAN tag 20 or 22 leaking into the lan1 port. Only multicast traffic leaks like this. This is especially harmful for Windows, since that OS mostly ignores the 802.1Q header and receives data from all VLANs, breaking IPv6 configuration every time an RA is sent into one of the other VLANs.

Ondřej Caletka commented on issue #372 at Turris / Turris OS / Turris Build

2022-10-18T16:54:00+02:00

Hey! I am suffering from this issue as well. It happens only on the 1 GB version of MOX A; the 512 MB versions seem to be rock stable. Only TOS 6.0 is affected. The kernel panics happen every few hours, regardless of which modules are attached. Here is a list of what I have already tried:

replace MOX A with 512MB version (helped)
replace MOX A with another 1GB version (didn't help)
replace SD card with some other (didn't help)
disconnect all Moxtet modules and SDIO Wi-Fi module (didn't help)
do a fresh installation of HBK branch (didn't help)
downgrade to HBS (helped)

The last crash today is this:

[22242.897755] SError Interrupt on CPU0, code 0xbf000001 -- SError
[22242.897780] CPU: 0 PID: 4528 Comm: foris-controlle Not tainted 5.15.74 #0
[22242.897791] Hardware name: CZ.NIC Turris Mox Board (DT)
[22242.897795] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[22242.897804] pc : el0_da+0x18/0x50
[22242.897822] lr : el0t_64_sync_handler+0x60/0xb0
[22242.897831] sp : ffffffc00b833e80
[22242.897834] x29: ffffffc00b833e80 x28: ffffff8000240000 x27: 0000000000000000
[22242.897850] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[22242.897860] x23: 0000000080000000 x22: 0000007fa9cad740 x21: 00000000ffffffff
[22242.897872] x20: ffffffc03714d000 x19: ffffffc00b833eb0 x18: 0000000000000000
[22242.897883] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[22242.897894] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[22242.897904] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
[22242.897914] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[22242.897924] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffffc00896f3ac
[22242.897934] x2 : ffffffc00896f3c4 x1 : 0000000092000018 x0 : 0000007fa9f3b510
[22242.897949] Kernel panic - not syncing: Asynchronous SError Interrupt
[22242.897953] SMP: stopping secondary CPUs
[22242.897963] Kernel Offset: disabled
[22242.897965] CPU features: 0x00000000,20000802
[22242.897971] Memory Limit: none

Ondřej Caletka commented on issue #355 at Turris / Turris OS / Turris Build

2022-08-12T10:15:52+02:00

Hey! I discovered that the same problem appears on Omnia running HBL too. In this setup:

root@omnia:~# bridge vlan show
port              vlan-id  
lan0              21 PVID Egress Untagged
lan1              60 PVID Egress Untagged
lan2              60 PVID Egress Untagged
lan3              60
lan4              20 PVID Egress Untagged
                  21
                  60
br-guest-turris   1 PVID Egress Untagged
br-lan            20
                  21
                  60
wlan1             20 PVID Egress Untagged
wlan0             20 PVID Egress Untagged
wlan0-1           60 PVID Egress Untagged
wlan0-2           21 PVID Egress Untagged

Packets received on wlan0-1 and forwarded to lan2 or lan1 get tagged with VLAN ID 60, while traffic between the router itself (br-lan.60 interface) and ports lan2 or lan1 goes untagged. Also, ingress traffic from lan1 or lan2 to wlan0-1 flows without any tags.

Ondřej Caletka opened issue #359: MOX: no RTC on kernel 5.15 at Turris / Turris OS / Turris Build

2022-08-04T19:42:12+02:00

With the current HBL kernel 5.15.59, there is no access to the Real Time Clock:

# ls /dev/rtc*
ls: /dev/rtc*: No such file or directory
# ls /sys/bus/i2c/devices/
#

Possibly related: the time jump during boot probably triggers some bug in lighttpd that breaks its TLS capability. Restarting lighttpd works around this issue.

# cat /var/log/lighttpd/error.log 
2022-08-04 18:27:45: (../src/server.c.1588) server started (lighttpd/1.4.65)
2022-08-04 18:27:54: (../src/server.c.267) warning: clock jumped 3570 secs
2022-08-04 18:27:54: (../src/server.c.275) attempting graceful restart in < ~5 seconds, else hard restart
2022-08-04 19:27:24: (../src/server.c.1019) [note] graceful shutdown started
2022-08-04 19:27:24: (../src/server.c.2097) server stopped by UID = 0 PID = 4432
2022-08-04 19:27:24: (../src/server.c.1588) server started (lighttpd/1.4.65)
2022-08-04 19:27:28: (../src/connections.c.716) unexpected TLS ClientHello on clear port (2001:db8::fe5)

From computer:

# curl https://turris.example/ -v              
*   Trying 2001:db8::1:443...
* Connected to … port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* error:1408F10B:SSL routines:ssl3_get_record:wrong version number
* Closing connection 0
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number

Ondřej Caletka commented on issue #354 at Turris / Turris OS / Turris Build

2022-07-27T19:00:05+02:00

Hey, thanks! I can confirm that the LEDs now work. I worked around the missing rainbow with a small shell script like this:

#!/bin/sh
cd /sys/class/leds/
for n in rgb\:*/multi_intensity;
do
        # default color
        echo 0 255 187 > "$n"
done

echo 255 255 255 > rgb\:wan/multi_intensity
echo 255 255 255 > rgb\:indicator-1/multi_intensity
echo 255 17 0 > rgb\:power/multi_intensity
echo 255 100 0 > rgb\:wlan-1/multi_intensity
echo 100 255 0 > rgb\:wlan-2/multi_intensity
echo 255 0 34 > rgb\:wlan-3/multi_intensity

But it seems that in order to actually change the color, one has to overwrite the trigger. So I set up software-based triggers for all LEDs in /etc/config/system like this:

config led 'led_lan4'
        option name 'lan4'
        option sysfs 'rgb:lan-4'
        option trigger 'netdev'
        option mode 'link tx rx'
        option dev 'lan4'

Ondřej Caletka opened issue #355: VLAN filtering broken in kernel 5.15.x (hbl) at Turris / Turris OS / Turris Build

2022-07-22T16:16:05+02:00

In my setup, Mox acts both as a router for some VLANs and as a switch for others. I put all ports, including eth0, into one bridge and use VLAN filtering.

config device
        option name 'br-lan'
        option type 'bridge'
        option bridge_empty '1'
        option force_link '1'
        list ports 'lan1'
        list ports 'lan2'
        list ports 'lan3'
        list ports 'lan4'
        list ports 'eth0'

config bridge-vlan
        option device 'br-lan'
        option vlan '22'
        list ports 'lan3'
        list ports 'lan4'

config bridge-vlan
        option device 'br-lan'
        option vlan '60'
        list ports 'eth0:t'
        list ports 'lan1'

config bridge-vlan
        option device 'br-lan'
        option vlan '62'
        list ports 'lan2'

config bridge-vlan
        option device 'br-lan'
        option vlan '20'
        list ports 'eth0'

config bridge-vlan
        option device 'br-lan'
        option vlan '21'
        list ports 'eth0:t'

config interface 'wan'
        option device 'br-lan.20'
        …

config interface 'lan'
        option device 'br-lan.22'
        …

# bridge vlan show
port              vlan-id  
eth0              20 PVID Egress Untagged
                  21
                  60
lan1              60 PVID Egress Untagged
lan2              62 PVID Egress Untagged
lan3              22 PVID Egress Untagged
lan4              22 PVID Egress Untagged
br-lan            20
                  21
                  22
                  60
                  62
wlan0             22 PVID Egress Untagged
wlan0-1           62 PVID Egress Untagged

# uname -r
5.4.203

After the upgrade to kernel 5.15.50 in HBL, I am having trouble with VLAN 60, which simply traverses the bridge, tagged on eth0 and untagged on the lan1 port. The ingress traffic on lan1 gets tagged and delivered to eth0, but in the egress direction the tag is not stripped when leaving the lan1 interface, despite bridge vlan show reporting Egress Untagged. There are no problems with other VLANs, which don't traverse between Ethernet ports. Also, if I use a different LAN port in place of eth0 for the uplink, the problem is gone.
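
For reference, the flags that bridge vlan show is expected to print follow from the port specifiers in the bridge-vlan sections. Here is a small Python sketch of that mapping, assuming the OpenWrt convention that a :t suffix means tagged, :u means untagged, * marks the PVID, and a bare port name defaults to untagged with PVID (the default matches the output above). The name expected_flags is illustrative:

```python
def expected_flags(spec: str) -> tuple:
    """Map a UCI bridge-vlan port specifier to (port, expected flag string)."""
    port, _, suffix = spec.partition(":")
    if not suffix:
        # Assumed default for a bare port name: untagged and PVID.
        suffix = "u*"
    flags = []
    if "*" in suffix:
        flags.append("PVID")
    if "t" not in suffix:
        flags.append("Egress Untagged")
    return port, " ".join(flags)


# VLAN 60 from the configuration above: tagged on eth0, untagged on lan1.
for spec in ("eth0:t", "lan1"):
    print(expected_flags(spec))
```

For VLAN 60 this gives no flags on eth0 and PVID Egress Untagged on lan1, so a frame leaving lan1 with the tag still present contradicts the configured state.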

Ondřej Caletka opened issue #354: HBL: kernel 5.15: LEDs on Turris Omnia not supported at Turris / Turris OS / Turris Build

2022-07-21T21:19:51+02:00

After upgrading the kernel from version 5.4.203-1-da0ddeb89bb0e25d2a575f62263f6300 to version 5.15.50-1-bbde583666f0d21706f8da40fd4a8532, the LEDs stopped working on Omnia. It seems that the driver is missing:

# ls /sys/class/leds/
ath10k-phy0  ath9k-phy1   mmc0::
# rainbow all enable 
Failed to open file: No such file or directory

Reverting to previous kernel fixes the issue.

Ondřej Caletka opened issue #727: DNS64: PTR synthesis yields SERVFAIL for some cache contents at Knot projects / Knot Resolver

2022-03-04T19:49:12+01:00

Summary

When the cache is cold, PTR synthesis in the DNS64 module works well. When the cache gets populated by querying with DNS64 synthesis disabled, PTR synthesis stops working and SERVFAIL is returned instead.

Steps to reproduce

# cat /etc/knot-resolver/kresd.conf 
-- SPDX-License-Identifier: CC0-1.0
-- vim:syntax=lua:set ts=4 sw=4:
-- Refer to manual: https://knot-resolver.readthedocs.org/en/stable/

-- Network interface configuration
net.listen('127.0.0.1', 53, { kind = 'dns' })
net.listen('127.0.0.1', 853, { kind = 'tls' })
--net.listen('127.0.0.1', 443, { kind = 'doh2' })
net.listen('::1', 53, { kind = 'dns', freebind = true })
net.listen('::1', 853, { kind = 'tls', freebind = true })
--net.listen('::1', 443, { kind = 'doh2' })

-- Load useful modules
modules = {
        'hints > iterate',  -- Allow loading /etc/hosts or custom root hints
        'stats',            -- Track internal statistics
        'predict',          -- Prefetch expiring/frequent records
        'dns64',
        'view',
}

-- Disable DNS64 for IPv4
view:addr('0.0.0.0/0', policy.all(policy.FLAGS('DNS64_DISABLE')))

-- Cache size
cache.size = 100 * MB

First query over IPv6 works as expected:

# kdig @::1 -x 64:ff9b::101:101 +noall +answer 
1.0.1.0.1.0.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.b.9.f.f.4.6.0.0.ip6.arpa. 60    IN      CNAME 1.1.1.1.in-addr.arpa.
1.1.1.1.in-addr.arpa.   1265    IN      PTR     one.one.one.one.

Query over IPv4, where DNS64 is disabled, also works properly with NXDOMAIN:

# kdig @127.0.0.1 -x 64:ff9b::101:101  
;; ->>HEADER<<- opcode: QUERY; status: NXDOMAIN; id: 41713
;; Flags: qr rd ra; QUERY: 1; ANSWER: 0; AUTHORITY: 1; ADDITIONAL: 0

;; QUESTION SECTION:
;; 1.0.1.0.1.0.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.b.9.f.f.4.6.0.0.ip6.arpa.    IN      PTR

;; AUTHORITY SECTION:
ip6.arpa.               3600    IN      SOA     b.ip6-servers.arpa. nstld.iana.org. 2021111921 1800 900 604800 3600

After this query, PTR synthesis does not work anymore and yields SERVFAIL:

# kdig @::1 -x 64:ff9b::101:101  
;; ->>HEADER<<- opcode: QUERY; status: SERVFAIL; id: 25807
;; Flags: qr rd ra; QUERY: 1; ANSWER: 0; AUTHORITY: 0; ADDITIONAL: 0

;; QUESTION SECTION:
;; 1.0.1.0.1.0.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.b.9.f.f.4.6.0.0.ip6.arpa.    IN      PTR

Clearing the cache restores correct behavior for a while.
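
For clarity, the expected synthesized answer can be derived from the queried address alone: the owner name is the ip6.arpa reverse name of the IPv6 address, and the CNAME target is the in-addr.arpa name of the IPv4 address embedded in its low 32 bits. A minimal Python sketch, assuming the well-known prefix 64:ff9b::/96 (ptr_cname is an illustrative name):

```python
import ipaddress

# Well-known NAT64/DNS64 prefix; assumed because the queries above use it.
PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def ptr_cname(ipv6: str) -> tuple:
    """Return (owner, target) of the CNAME expected from PTR synthesis."""
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in PREFIX:
        raise ValueError("address is outside the DNS64 prefix")
    # The embedded IPv4 address is the low 32 bits of the IPv6 address.
    v4 = ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
    return addr.reverse_pointer + ".", v4.reverse_pointer + "."


owner, target = ptr_cname("64:ff9b::101:101")
print(owner)
print(target)  # 1.1.1.1.in-addr.arpa.
```

The SERVFAIL case above is a failure to produce this CNAME once the NXDOMAIN for the ip6.arpa name is already cached.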