Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error connecting to the DNS server #230

Closed
chkaufmann opened this issue Dec 9, 2020 · 20 comments
Closed

Error connecting to the DNS server #230

chkaufmann opened this issue Dec 9, 2020 · 20 comments
Labels

Comments

@chkaufmann
Copy link

Well it worked great, I do not have any idea what has changed!

DKIM_Verifier.Verifier DEBUG 1 DKIM-Signatures found. verifier.mjs.js:1147:8
DKIM_Verifier.Verifier DEBUG Verifying DKIM-Signature 1 ... verifier.mjs.js:1162:9
DKIM_Verifier.Verifier DEBUG Parsed DKIM-Signature 1:
{…}

a_hash: "sha256"

a_sig: "rsa"

b: "X1QLH1ybi3WwLDCfFf1BNsiPpZZ2/YJ7AQ/4YBAeTNKGV4DWH8ZSl5jLUxamWChdTKRj+cFTgU6vGOKmbCUOwV70iF/H6ms/V6ydsbw6bHQRcjgmobeFy0J+E1joFvxBwtR7UJQdByYqDViVb2OaAaXcXVNfXUKOVgZPolrDhpw="

b_folded: "X1QLH1ybi3WwLDCfFf1BNsiPpZZ2/YJ7AQ/4YBAeTNKGV4DWH8ZSl5jLUxamWChdT\r\n\t KRj+cFTgU6vGOKmbCUOwV70iF/H6ms/V6ydsbw6bHQRcjgmobeFy0J+E1joFvxBwtR\r\n\t 7UJQdByYqDViVb2OaAaXcXVNfXUKOVgZPolrDhpw="

bh: "40Jc0tX5bKRiSOgX7fHRUnNwQ14Lg6cISIpgF5Oj/2M="

c_body: "relaxed"

c_header: "relaxed"

d: "github.com"

h_array: Array(5) [ "date", "from", "to", … ]

i: "@github.com"

i_domain: "github.com"

keySecure: undefined

l: null

original_header: "DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=github.com;\r\n\ts=pf2014; t=1607509055;\r\n\tbh=40Jc0tX5bKRiSOgX7fHRUnNwQ14Lg6cISIpgF5Oj/2M=;\r\n\th=Date:From:To:Subject:From;\r\n\tb=X1QLH1ybi3WwLDCfFf1BNsiPpZZ2/YJ7AQ/4YBAeTNKGV4DWH8ZSl5jLUxamWChdT\r\n\t KRj+cFTgU6vGOKmbCUOwV70iF/H6ms/V6ydsbw6bHQRcjgmobeFy0J+E1joFvxBwtR\r\n\t 7UJQdByYqDViVb2OaAaXcXVNfXUKOVgZPolrDhpw=\r\n"

q: "dns/txt"

s: "pf2014"

t: 1607509055

v: "1"

warnings: Array []

x: null

z: null

: Object { … }
verifier.mjs.js:1164:9
DKIM_Verifier.Verifier DEBUG computed body hash: 40Jc0tX5bKRiSOgX7fHRUnNwQ14Lg6cISIpgF5Oj/2M= verifier.mjs.js:1025:7
DKIM_Verifier.JSDNS: Resolving pf2014._domainkey.github.com TXT by querying 8.8.8.8 JSDNS.jsm.js:537:6
DKIM_Verifier.JSDNS: pf2014._domainkey.github.com/TXT: Lookup failed with rcode 5 JSDNS.jsm.js:806:7
DKIM_Verifier.background DEBUG dns result
{…}

bogus: false

data: null

rcode: 5

secure: false

: Object { … }
background.mjs.js:42:6
DKIM_Verifier.background INFO DNS query failed with result:
{…}

bogus: false

data: null

rcode: 5

secure: false

: Object { … }
background.mjs.js:48:7
DKIM_Verifier.Verifier ERROR DKIM_InternalError: DKIM_DNSERROR_SERVER_ERROR
getKey moz-extension://4bae297f-09f6-41fb-b9e8-e6079512da68/content/background.mjs.js:49
verifier.mjs.js:984:8
handleException moz-extension://4bae297f-09f6-41fb-b9e8-e6079512da68/modules/dkim/verifier.mjs.js:984
processSignatures moz-extension://4bae297f-09f6-41fb-b9e8-e6079512da68/modules/dkim/verifier.mjs.js:1168
DKIM_Verifier.Verifier DEBUG Exception on DKIM-Signature 1 verifier.mjs.js:1169:9
DKIM_Verifier.SignRules DEBUG shouldBeSigned:
Object { shouldBeSigned: false, sdid: (2) […], foundRule: true, hideFail: false }
signRules.mjs.js:374:7
DKIM_Verifier.AuthVerifier DEBUG authResult:
Object { version: "2.1", dkim: (1) […] }
AuthVerifier.mjs.js:173:7
Capture

@lieser
Copy link
Owner

lieser commented Dec 18, 2020

sorry for the late response.

Do you still have the problem? The logs show Lookup failed with rcode 5, which means the DNS server (Google's 8.8.8.8 in your case) actively refused the connection.
Must admit I have no idea why this is/was happening. If you still have the problem, I would sugest to try a different DNS server (e,g, CloudFlare's 1.1.1.1).

@silvershadowcc
Copy link

I have the same problem. After changing one of the settings (and changing it back) it works for a while. Then it starts displaying the error for every email. Tried both Google's 8.8.8.8 and CloudFlare's 1.1.1.1
So I don't think that has anything to do with it.

@lieser
Copy link
Owner

lieser commented Feb 6, 2021

Are you also seeing Lookup failed with rcode 5, or something else? I still thing that this is not an error in the add-on, and instead some strange networking problem.

But the add-ons behavior in regards to retrying is definitely not helping in this case, and needs improvements.
Currently, if there is an error with a server, the server is marked as "dead", and not retied until the DNS settings change, or Thunderbird is restarted (disabling and enabling the add-on should work too).

For now I suggest the following workarounds:

To enable dns.jsdns.autoResetServerAlive, do the following:

  1. Open the add-ons manager
  2. Click the gear in the top right and select "Debug Add-ons"
  3. Click the "Inspect" button of the DKIM Verifier (a tab should open with something similar as described at https://extensionworkshop.com/documentation/develop/debugging/)
  4. In the top, click on Console
  5. In the console, enter and execute await (await import("../modules/preferences.mjs.js")).default.setValue("dns.jsdns.autoResetServerAlive", true)

@lieser
Copy link
Owner

lieser commented Feb 7, 2021

I attached a test version to #199 (comment) there the key storing feature should work also in TB 78.

@tadpole2077
Copy link
Contributor

Thanks lieser for the 4.0.0pre4.1 drop.
I'm also experiencing the same fault as above, fault behavior invokes after a set amount of hours typically >3 hours on my sys. Restart of thunderbird resolves it, as you mention above DNS server marked as 'dead' is likely a root cause key factor, not sure if this is caused by a temp network outage that then triggers the dead DNS marker.

I've installed the new test 4.1 package now, and will report back on finding over the next 24 / 48 hours.

@tadpole2077
Copy link
Contributor

was working fine since yesterday with latest 4.1 codeset until about an hour ago, now all mail Dkim checks fail.. below is your (lieser) github mail response to me.
Its an rcode 2 fault from the _getKeyFromDNS() call. Likely on process restart the issue with no longer occur - until next x triggers it.

image

@lieser
Copy link
Owner

lieser commented Feb 12, 2021

Note that I did not change anything in 4.0.0pre4.1 to improve the underlying issue. It only includes the Caching of the DKIM keys feature, that should greatly reduce the impact of this issue.

Regarding the log you showed: The interesting pars are happening in the experiments part of the add-on. Unfortunately the logs for that (or code) do not appear in the add-ons debugger.

You will have to open the normal Error Console from Thunderbird to see them (or the debugger for Thunderbird itself, but this is not needed).

@tadpole2077
Copy link
Contributor

Thunderbird (browser) Error Console isnt showing any additional info that i can tell(snap below). Any other options? I'll try and walk though the code.. My thunderbird crashed and restarted with same error still occurring on all mail DKIM checks, so still able to extract some debug data.

image

@lieser
Copy link
Owner

lieser commented Feb 12, 2021

The log you posted shows JSDNS: no DNS Server alive. Meaning at that point the add-on was already in the state there all further DNS queries will fail.
All this messages from JSDNS is something you did not see before.

What error caused your DNS server to be marked as disabled should be shown further up the log. See e.g. the image in the first post, there it shows Lookup failed with rcode 5.

@tadpole2077
Copy link
Contributor

I see that jsdns.js:86 log entry now. I've rebooted my desktop box this morning, DKIM plugin is working optimally at the moment, i'll keep the console log open today in the hope of capturing the additional jsdns fault log details at precisely the moment the DNS server is tagged as dead (likely assigning the rcode = 5)

Per your prior Reponses on this thread, once the DNSServer is marked as dead, no retry occurs - and either DNS setting charge or TB restart is required to reactive it again.
As a test scenario, is there a way for me to set the DNSServer as alive via console cmd line, to force a further check - purpose of this is to confirm that the problem is due to a prior tagging of the DNSServer as death, and not some prevailing issue within the plugin instance data store that would mark the DNSServer as death again.

@tadpole2077
Copy link
Contributor

No replication of the above error reported today, so i did a bit of experimenting.
Tried simulating a ISP disconnect, then opened a mail to triggered a DKIM process, resulting in "error connecting to DNS server".

Then reconnected ISP, got some more mail down, and triggered a DKIM process - same error shown "error connecting to DNS server". Likely this is a known limitation, no DNS server connection Retry, but still I would expect the DKIM process to attempt to reconnect to DNS or at a minimum to reattempt a DNS server connection after a set interval of time since last fail, maybe a timestamp stored with prior fail, and retry after x seconds, if attempting to connect to the DNS server after prior failure is a concern.

image

@tadpole2077
Copy link
Contributor

Today i captured the full sequence of the likely fault, or at least one scenario that causes it..
Internet connect was working fine - no issues with other sites or download of mail before opening the first one from reputable source reebok.com that then throws the rcode 2 fault,

DNS server reporting timeouts, resulting in dead server.

image

Expanding the Win Reg log entry and DNS servers entry...
image

Not sure if this initial error on maxLogLevelPerf is related in any way.
Console.maxLogLevelPref used with a non-existing pref: extensions.dkim_verifier.experiments.logging

@tadpole2077
Copy link
Contributor

restart and retest TB, after checking connection health is good, i can ping 8.8.8.8, 1.1.1.1, 1.0.0.1
Same issue highlighted the rcode:5 entries in log below, timed out error reported but i question if that maybe a false positive for timeout, instead some other earlier issue and then throwing the wrong error description.

image

@tadpole2077
Copy link
Contributor

if you can advise on how to debug the experiments js files, such as jsdns.js I believe i can track down the fault, and walk though code such as browser.jsdns.txt().
My hunch at this time is a timing issue on a async call, which may be using the browser.jsdns() before it is fully init() and configured, and possible causing a block on configureJsdns() - due to the flag jsdnsIsConfigured = true. Until i can walk though the experiments code its purely a guess.

@lieser
Copy link
Owner

lieser commented Feb 15, 2021

Thanks for the offer of debugging, but currently I personally don't see a need for it.

The Console.maxLogLevelPref used with a non-existing pref: extensions.dkim_verifier.experiments.logging can be ignored.

Regarding rcode 5:

Are you sure that 192.168.42.129 is a DNS Server? Have yout tried that nslookup google.com 192.168.42.129 works?

Regarding the timeout:

The timeout is generated by Thunderbird's underlying network code, not the add-on. Is also comes as expected after 5 seconds, the timeout the add-on uses by default.
I expected it is a short network issue, that you just did not notice.

If you didn't know, it is possible to make a continuous ping with ping -t 1.1.1.1, which helps detecting if there are indeed some short temporary network problems.

If you still want to debug:

For debugging experiments Thun Tools > Developer Tools > Developer Toolbox has to be used (Note that a windows in the background will open there you need to accept the connection request).
grafik

The files will only show up after they there used once.

@gxtu
Copy link

gxtu commented Feb 20, 2021

I'm getting the same error (Error connecting to the DNS server) in v4.0.0pre4. This is an example of what I see in the developer toolbox:
dkim

Manually running nslookup -type=TXT mail20150812._domainkey.mailbox.org 10.0.0.1 does seem to work though and results in:

Server:		10.0.0.1
Address:	10.0.0.1#53

Non-authoritative answer:
mail20150812._domainkey.mailbox.org	text = "v=DKIM1; p=MIIBIjANBgkqhkiG9w" "0BAQEFAAOCAQ8AMIIBCgKCAQEAytTqbbblelvR6JIoHG" "1GHd0C4VIx4j8Br7EAZXijWoG34Lpa1vJCKYjjg0LsBsXPy5BHrqqyLYvm9GJ7dL3" "ZS+ec0o4co1F6j6DwTUs13aYd0Nni8qKWoY9uf" "qbn2dqdZJu+mg3bwkS5iqzG+t0CqHQNI/aIyXTkfyTsDM+aNl3r2p0bJXw4G5obNn2hneqfqD9x3z+0EwTwsnN04vlw68tBTpd8" "UcZJJXVGSN+Beq1DyElFYczrzkln8k5I9A3mEeqHAYlbF5hCRCkVw2Pp7SmyTQeV+JHcnHOSE4zpV4SUsECKSz1En6ix0+kI23DUpBF84ofMiGMe9P0YMupu0QIDAQAB"

Authoritative answers can be found from:

Edit:
This appears to be a problem with JSDNS:
jsdns

When I switch the resolver to libunbound instead of JSDNS in the addon's preferences, everything works fine.

@tadpole2077
Copy link
Contributor

Great feedback, tks!
Sorry about the delay in response, got pulled into some firefighting(not my code).
I didnt find the JSDNS.jsm.js as its not loaded/displayed in the TB debugger until After the first DKim pass though unless breakpoints have already added . O the joys of debugging JS with many async call and callback fn's firing off in the background, can be difficult to trace and catch..

Regarding 192.168.42.129 :

  • Settings found under >> WIN 10/Network connections/TCP Adaptor Properties/IPv4/
    Obtain IP address automatically (enabled)
    Preferred DNS Server: 1.1.1.1
    Alt DNS Server: 1.0.0.1

  • DKIM Options > General > DNS > Resolver:JavaScript DNS library > (Ticked) Getr DNS serverse from OS config > DNS Server: 8.8.8.8

  • My network setup was configured over the last couple of months, still need to lay out some CAT6/7 cable to my router, in the meantime i'm using my mobile phone via USB hotspot and tethering(Android) to connect to my local router.
    I believe that results in the inclusion of 192.168.42.129 in the DNS list used by DKim.
    My fiber router box, is a Vodafone gigabox - with DNS configuration set to Automatic

CMD: ipconfig/all

Ethernet adapter Ethernet 2:

Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : SAMSUNG Mobile USB Remote NDIS Network Device
Physical Address. . . . . . . . . : 02-2B-7B-7D-14-6B
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
IPv4 Address. . . . . . . . . . . : 192.168.42.149(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Lease Obtained. . . . . . . . . . : Friday 19 February 2021 17:51:01
Lease Expires . . . . . . . . . . : Friday 19 February 2021 20:42:23
Default Gateway . . . . . . . . . : 192.168.42.129
DHCP Server . . . . . . . . . . . : 192.168.42.129
DNS Servers . . . . . . . . . . . : 1.1.1.1
1.0.0.1
NetBIOS over Tcpip. . . . . . . . : Enabled

In the entries for the DNS servers stored on my registry at location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces

  • has 192.168.42.129 as a default gateway on one of the interfaces (matching the adapter details from ipconfig)

Conclusion:

  • Its a default preferred internal LAN IP assigned to my workstation when connecting to my router.
  • See [https://en.ipshu.com/ipv4/192.168.42.149] for details on usage of this IP.
    "The IP address 192.168.42.149 is the default gateway for most wireless routers or ADSL modems."
  • The local router gateway is 192.168.1.1

===========================
Walk though of Code, on load of TB, opening first email with DKIM sig.

  • JSDNS.jsm.js > configureDNS() (breakpoint on 239)
    Called by : configure() in getAPU() > jsdns.js
    Para2: nameServer = “8.8.8.8”

  • Calls: getOsDnsServers() {JSDNS.jsm.js Ln:298}
    Pulls Register data from Component.classes : @mozilla.org/windows-registry-key;1
    Reg: ROOT_KEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\

    // get interfaces in routing order
    Reg: Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Linkage
    Interfaces = Extract String Array of interfaces (4 GUID’s interface linkage keys found)

    // filter out deactivated interfaces
    None found

    // get NameServer and DhcpNameServer of all interfaces (Ln:374)
    REG >> ROOT_KEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces
    5 GUID Interface keys found
    (1) & (2) have no DHCPNameServer, or NameServer
    (3) 192.168.1.1 (DhcpNameServer)
    (4) 192.168.42.149 (DhcpNameServer), 1.1.1.1 , 1.0.0.1(NameServer)
    (5) No DhcpNameServer or NameServer

    Above GUID Keys (3) and (4) are used, they match with interface routing order GUID’s found earlier, and actually contain either (a)NameServer or (b)DhcpNameServer keys

Conclusion:
Cross referenced with the actual Register data and its all good, no missing keys and all the DHCPNameServer and NameServer IP's are extracted as expected.
So no issues on the pull of DNCP and NameServer IP's that May map to live DNS public server.
The 2xDHCPNameServer mappings will likely fail on DNS request calls - unless they pass-though those requests.

==============================
I've started DNS ping logging, and certainly i'm getting Ping timeout on 1.1.1.1 avg of around 50 per hour.
I need to do some more tests on this if its an issue with my local wireless connection to local router, or on the router to ISP via fiber, or on the DNS server itself.
I'll also continue code familiarisation, tracing calls on dns call timeouts to eval the existing process, fallback process and retry.

One question for you, on these calls to the DNS servers (4 registered IP's in my case), the expected action is to call each one in sequence, on fail(timeout etc) of one DNS server, trigger a query to the next DNS IP in the array?

@lieser
Copy link
Owner

lieser commented Feb 28, 2021

@gxtu

Can you please repeat your test with nslookup -type=TXT "-set vc" mail20150812._domainkey.mailbox.org 10.0.0.? The "-set vc" part should make nslookup use TCP (and not UDP), which is also what JSDNS is using.

If it still works with nslookup using TCP, do you by chance have access to the logs of the DNS server?

I also just now did a network capture with Wireshark, and then comparing it to nslookup noticed the following:
The JSDNS resolver is currently using a static (i.e. not changing) Transaction ID (TXID) (in RFC 1035 only named ID).
Maybe this is causing some DNS servers to actively refuse the query with RCODE 5.
If there is someone more experienced with DNS who could confirm this, that would be nice.

If the current behavior is something you can easily reproduce, I will also try to provide you with a testing version that is not using a static TXID, so you can just try out if this is causing problems.

@tadpole2077

Yes, that code is definitely not the most pleasant to debug. Especially since I my extensions to the original code didn't help readability (the code would definitely profit from a refactoring)

on these calls to the DNS servers (4 registered IP's in my case), the expected action is to call each one in sequence, on fail(timeout etc) of one DNS server, trigger a query to the next DNS IP in the array?

Your observations are correct. If there is an error, the DNS server is marked as dead, and the next DNS server is tried.
If all DNS servers are marked as dead, the current default behavior is that all following DNS queries fail, until the DNS servers are reconfigured.
As I described in #230 (comment), this can be changed via the dns.jsdns.autoResetServerAlive setting.

If dns.jsdns.autoResetServerAlive is set to true, the query that sets the last DNS server to dead still fails. But the next query should start to try out all DNS servers again.
With your unstable connection, I would recommend to change this value.

And again, with your temporary network issues, the add-ons behavior may not be the best user experience. But it is currently unfortunately totally expected.

@gxtu
Copy link

gxtu commented Feb 28, 2021

Can you please repeat your test with nslookup -type=TXT "-set vc" mail20150812._domainkey.mailbox.org 10.0.0.? The "-set vc" part should make nslookup use TCP (and not UDP), which is also what JSDNS is using.

The command I had to use is slightly different, but I think we've found the problem.

$ nslookup -type=TXT -vc mail20150812._domainkey.mailbox.org 10.0.0.1 
Server:		10.0.0.1
Address:	10.0.0.1#53

** server can't find mail20150812._domainkey.mailbox.org: REFUSED

Looks like this is a problem with just my local DNS server, since others appear to be working just fine:

Server:		1.1.1.1
Address:	1.1.1.1#53

Non-authoritative answer:
mail20150812._domainkey.mailbox.org	text = "v=DKIM1; p=MIIBIjANBgkqhkiG9w" "0BAQEFAAOCAQ8AMIIBCgKCAQEAytTqbbblelvR6JIoHG" "1GHd0C4VIx4j8Br7EAZXijWoG34Lpa1vJCKYjjg0LsBsXPy5BHrqqyLYvm9GJ7dL3" "ZS+ec0o4co1F6j6DwTUs13aYd0Nni8qKWoY9uf" "qbn2dqdZJu+mg3bwkS5iqzG+t0CqHQNI/aIyXTkfyTsDM+aNl3r2p0bJXw4G5obNn2hneqfqD9x3z+0EwTwsnN04vlw68tBTpd8" "UcZJJXVGSN+Beq1DyElFYczrzkln8k5I9A3mEeqHAYlbF5hCRCkVw2Pp7SmyTQeV+JHcnHOSE4zpV4SUsECKSz1En6ix0+kI23DUpBF84ofMiGMe9P0YMupu0QIDAQAB"

Authoritative answers can be found from:

My router is running OpenWrt with dnsmasq, but the log only shows:

query[TXT] mail20150812._domainkey.mailbox.org from 10.0.0.10
daemon.info dnsmasq[1454]: 532 10.0.0.10/51005 config error is REFUSED

Seems like this is a problem on my side then. Thanks for your help and sorry for bothering you with it!

@lieser
Copy link
Owner

lieser commented Jul 4, 2021

As there has been no response since a while I guess the issue as been resolved for you all (or at least a workaround found).
Note that I created a separate issue for exposing and/or changing the dns.jsdns.autoResetServerAlive value (#269).

If anyone still has problems, please create a new issue, as this one is getting hard to follow.

@lieser lieser closed this as completed Jul 4, 2021
@lieser lieser added the question label Jul 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants