Common Problems

Security Lameness

Similar to lame delegation in traditional DNS, security lameness occurs when the parent zone holds a set of DS records that point to a key that does not exist in the child zone. As a result, the entire child zone may "disappear", because validating resolvers mark it as bogus.

Below is an example of attempting to resolve the A record for a test domain name, www.example.com. From the user's perspective, as described in the section called “How Do I know I Have a Validation Problem?”, only a SERVFAIL message is returned. On the validating resolver, we can see the following messages in syslog:

named[6703]: error (no valid RRSIG) resolving 'example.com/DNSKEY/IN': 149.20.61.151#53
named[6703]: error (broken trust chain) resolving 'www.example.com/DS/IN': 149.20.61.151#53
named[6703]: error (broken trust chain) resolving 'www.example.com/A/IN': 149.20.61.151#53

This hints at a broken trust chain. Let's look at the published DS records by querying one of the public DNS resolvers that support DNSSEC. The key tag is the first field of each DS record's data (53476 below), and the long digests are shortened for display:

$ dig @8.8.8.8 example.com. DS

; <<>> DiG 9.10.1 <<>> @8.8.8.8 example.com. DS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9640
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;example.com.		IN	DS

;; ANSWER SECTION:
example.com.	21599	IN	DS	53476 8 2 1544D......7DDA7
example.com.	21599	IN	DS	53476 8 1 CD2AF...0B47B

;; Query time: 212 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Nov 27 17:23:42 CST 2014
;; MSG SIZE  rcvd: 133

Next, we query for the DNSKEY and RRSIG records of example.com to see if anything is wrong. Since validation is failing, we add the +cd option to disable checking, so that the results are returned even though they do not pass validation. The +multiline option tells dig to print the key type, algorithm, and key ID for each DNSKEY record. The key tags appear in the "key id" comments and in the RRSIG records, and some long strings are shortened for display:

$ dig @8.8.8.8 example.com. DNSKEY +dnssec +cd +multiline

; <<>> DiG 9.10.1 <<>> @8.8.8.8 example.com. DNSKEY +dnssec +cd +multiline
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20329
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512
;; QUESTION SECTION:
;example.com.	IN DNSKEY

;; ANSWER SECTION:
example.com.	299 IN DNSKEY 257 3 8 (
				AwEAAePggU...0VPPEX+DE=
				) ; KSK; alg = RSASHA256; key id = 48580
example.com.	299 IN DNSKEY 256 3 8 (
				AwEAAbMZp6...NRJnwyC/uX
				) ; ZSK; alg = RSASHA256; key id = 60426
example.com.	299 IN RRSIG DNSKEY 8 2 300 (
				20141227074820 20141127064820 48580 example.com.
				ph3eBXsBQy...fQTRTlpg== )
example.com.	299 IN RRSIG DNSKEY 8 2 300 (
				20141227074820 20141127064820 60426 example.com.
				VaQ0INIa3a...nj3YTPv5A= )

;; Query time: 368 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Nov 28 11:33:00 CST 2014
;; MSG SIZE  rcvd: 961

Here is our problem: the parent zone is telling the world that example.com is using key 53476, but the authoritative servers are saying: no no no, I am using keys 48580 and 60426. There are several possible causes for this mismatch. One is that a malicious attacker has compromised one side and changed the data. The more likely scenario is that the DNS administrator for the child zone did not upload the correct key information to the parent zone.
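
The comparison above can be sketched in a few lines of Python. This is only an illustration: the helper names (ds_key_tags, dnskey_key_tags, unmatched_ds_tags) are hypothetical, and the input is the answer text copied from the two dig outputs above rather than a live query. A matching key tag is necessary but not sufficient, since the DS digest must also match the key, so treat an empty result as "no obvious mismatch" rather than proof of a healthy chain.

```python
import re

# Answer sections copied from the dig outputs above (long values are shortened there).
DS_ANSWER = """\
example.com.	21599	IN	DS	53476 8 2 1544D......7DDA7
example.com.	21599	IN	DS	53476 8 1 CD2AF...0B47B
"""

DNSKEY_ANSWER = """\
example.com.	299 IN DNSKEY 257 3 8 (
				AwEAAePggU...0VPPEX+DE=
				) ; KSK; alg = RSASHA256; key id = 48580
example.com.	299 IN DNSKEY 256 3 8 (
				AwEAAbMZp6...NRJnwyC/uX
				) ; ZSK; alg = RSASHA256; key id = 60426
"""


def ds_key_tags(text):
    """Collect the key tag (first RDATA field) of every DS line."""
    tags = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[2] == "IN" and fields[3] == "DS":
            tags.add(int(fields[4]))
    return tags


def dnskey_key_tags(text):
    """Collect key tags from the 'key id = N' comments printed by +multiline."""
    return {int(tag) for tag in re.findall(r"key id = (\d+)", text)}


def unmatched_ds_tags(ds_text, dnskey_text):
    """Return DS key tags that no published DNSKEY carries."""
    return ds_key_tags(ds_text) - dnskey_key_tags(dnskey_text)


print(unmatched_ds_tags(DS_ANSWER, DNSKEY_ANSWER))
```

For the data in this example, the only DS key tag (53476) has no counterpart among the child zone's DNSKEY records, which is the security lameness described in this section.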

Incorrect Time

In DNSSEC, every record set in a signed zone comes with at least one RRSIG, and each RRSIG contains two timestamps: one indicating when it becomes valid (inception) and one indicating when it expires. If the validating resolver's current system time does not fall between these two timestamps, error messages like the ones below appear in the BIND debug log.

First, the example below shows the log message when the RRSIG has expired. This could mean the validating resolver's system time is incorrectly set too far in the future, or the zone administrator has not kept up with RRSIG maintenance.

validating @0xb7b839b0: . DNSKEY: verify failed due to bad signature (keyid=19036): RRSIG has expired

The log below shows that the RRSIG validity period has not begun. This could mean the validating resolver's system time is incorrectly set too far in the past, or the zone administrator has incorrectly generated signatures for this domain name.

validating @0xb7c1bd88: www.isc.org A: verify failed due to bad signature (keyid=4521): RRSIG validity period has not begun
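
The time check behind both messages can be sketched as follows. This is a simplified illustration, not BIND's implementation: the function names are hypothetical, the timestamps are the YYYYMMDDHHMMSS values (in UTC) that dig prints in an RRSIG record, and the sample values are taken from the example.com RRSIG shown earlier in this chapter, where the expiration field comes before the inception field.

```python
from datetime import datetime, timezone


def parse_rrsig_time(stamp):
    """Parse an RRSIG timestamp in YYYYMMDDHHMMSS form as a UTC datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def rrsig_status(inception, expiration, now):
    """Classify 'now' against the signature validity window."""
    if now < parse_rrsig_time(inception):
        return "validity period has not begun"
    if now > parse_rrsig_time(expiration):
        return "expired"
    return "valid"


# Values from the example.com DNSKEY RRSIG shown earlier.
INCEPTION = "20141127064820"
EXPIRATION = "20141227074820"

# Three hypothetical resolver clocks: correct, too far ahead, too far behind.
for clock in (
    datetime(2014, 11, 28, tzinfo=timezone.utc),
    datetime(2015, 1, 1, tzinfo=timezone.utc),
    datetime(2014, 11, 1, tzinfo=timezone.utc),
):
    print(clock.isoformat(), rrsig_status(INCEPTION, EXPIRATION, clock))
```

Passing datetime.now(timezone.utc) as the clock compares the signature against the local system time, which helps tell a wrong clock on the resolver apart from stale signatures in the zone.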

Invalid Trust Anchors

As we have seen in the section called “Trust Anchors”, whenever a DNSKEY is received by the validating resolver, it is compared to the list of keys the resolver has explicitly trusted to see if further action is needed. If the two keys match, the validating resolver stops performing further verification and returns the answer(s) as validated.

But what if the key file on the validating resolver is misconfigured or missing? Below we show some examples of log messages when things are not working properly.

First of all, if the key you copied is malformed, BIND will not even start up and you will likely find this error message in syslog:

named[18235]: /etc/bind/named.conf.options:29: bad base64 encoding
named[18235]: loading configuration: failure
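
A malformed key of this kind can be caught before named is restarted. BIND's own named-checkconf utility is the usual way to check a configuration file. The Python sketch below only illustrates the specific failure, with a hypothetical helper name and made-up sample strings that are not real keys.

```python
import base64


def is_valid_base64(key_text):
    """Return True if key_text decodes as strict base64.

    Whitespace is removed first, because key material in a configuration
    file is often wrapped across several lines.
    """
    compact = "".join(key_text.split())
    if not compact:
        return False
    try:
        base64.b64decode(compact, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return True


# Made-up sample strings, not real keys.
print(is_valid_base64("AwEA\n    Aag="))  # wrapped across lines, still well formed
print(is_valid_base64("AwEA*Aag="))       # stray character from a bad copy and paste
print(is_valid_base64("AwEAA"))           # truncated, padding is wrong
```

A check like this only confirms that the text is well-formed base64. It says nothing about whether the key is the right one, which is the case described next.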

If the key is a valid base64 string, but the key algorithm is incorrect, or if the wrong key is installed, the first thing you will notice is that nearly all of your DNS lookups result in SERVFAIL, even when you are looking up domain names that have not been DNSSEC-enabled. Below is an example of querying a recursive server at 192.168.1.7:

$ dig @192.168.1.7 www.example.com. A

; <<>> DiG 9.10.1 <<>> @192.168.1.7 www.example.com. A
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8093
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.example.com.			IN	A

delv shows a similar result:

$ delv @192.168.1.7 www.example.com. +rtrace
;; fetch: www.example.com/A
;; resolution failed: failure

The next symptom you will see is in the DNSSEC log messages:

validating @0xb8b18a38: . DNSKEY: starting
validating @0xb8b18a38: . DNSKEY: attempting positive response validation
validating @0xb8b18a38: . DNSKEY: unable to find a DNSKEY which verifies the DNSKEY RRset and also matches a trusted key for '.'
validating @0xb8b18a38: . DNSKEY: please check the 'trusted-keys' for '.' in named.conf.

Unable to Load Keys

This is a simple yet common issue. If the key files are present but not readable by named, the syslog messages are clear, as shown below:

named[32447]: zone example.com/IN (signed): reconfiguring zone keys
named[32447]: dns_dnssec_findmatchingkeys: error reading key file Kexample.com.+008+06817.private: permission denied
named[32447]: dns_dnssec_findmatchingkeys: error reading key file Kexample.com.+008+17694.private: permission denied
named[32447]: zone example.com/IN (signed): next key event: 27-Nov-2014 20:04:36.521
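
The same condition can be checked by hand before reloading named. The sketch below is a minimal illustration with a hypothetical function name. It assumes the key files follow the K<zone>.+<algorithm>+<keytag>.private naming seen in the log above, and that the script runs as the same user as named, since readability is tested for the current user.

```python
import os
from pathlib import Path


def check_private_keys(key_dir, zone):
    """Map each private key file name for 'zone' to whether it is readable.

    An empty result means that no key files were found at all.
    """
    pattern = f"K{zone.rstrip('.')}.+*.private"
    return {
        path.name: os.access(path, os.R_OK)
        for path in sorted(Path(key_dir).glob(pattern))
    }
```

For example, check_private_keys("/etc/bind/keys", "example.com") returns one entry per key file, where /etc/bind/keys stands in for whatever key directory is configured. A False value corresponds to the "permission denied" messages above, and an empty dictionary means there are no key files for the zone in that directory.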

However, if no keys are found, the error is not as obvious. Below are the syslog messages after executing rndc reload with the key files missing from the key directory:

named[32516]: received control channel command 'reload'
named[32516]: loading configuration from '/etc/bind/named.conf'
named[32516]: reading built-in trusted keys from file '/etc/bind/bind.keys'
named[32516]: using default UDP/IPv4 port range: [1024, 65535]
named[32516]: using default UDP/IPv6 port range: [1024, 65535]
named[32516]: sizing zone task pool based on 6 zones
named[32516]: the working directory is not writable
named[32516]: reloading configuration succeeded
named[32516]: reloading zones succeeded
named[32516]: all zones loaded
named[32516]: running
named[32516]: zone example.com/IN (signed): reconfiguring zone keys
named[32516]: zone example.com/IN (signed): next key event: 27-Nov-2014 20:07:09.292

This looks exactly the same as if the keys were present and readable, and named had loaded the keys and signed the zone. named even generates the internal (raw) files:

# cd /etc/bind/db
# ls
example.com.db	example.com.db.jbk  example.com.db.signed

If named really loaded the keys and signed the zone, you should see the following files:

# cd /etc/bind/db
# ls
example.com.db	example.com.db.jbk  example.com.db.signed  example.com.db.signed.jnl

So, unless you see the *.signed.jnl file, your zone has not been signed.
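
This rule of thumb is easy to script. The sketch below, with a hypothetical function name, simply tests for the journal file next to the zone file, following the naming shown in the listings above. It relies on the heuristic described in this section and does not inspect the zone contents.

```python
from pathlib import Path


def signed_journal_exists(db_dir, zone_file):
    """Return True if <zone_file>.signed.jnl exists in db_dir."""
    return (Path(db_dir) / f"{zone_file}.signed.jnl").is_file()
```

With the listings above, signed_journal_exists("/etc/bind/db", "example.com.db") would be False for the first directory listing and True for the second.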