DNS is the Problem

For our blog’s inaugural article, we chose one of the most common yet vexing problems that new administrators encounter: Domain Name System (DNS) misconfiguration. Many of us stumbled our way through these problems in the early days of TCP/IP’s emergence as the dominant network stack. We mainly knew of DNS from working with our Internet service providers at home. That knowledge was incomplete, so we could not fully understand the situations that greeted us in business environments. The struggle was so pervasive that I once saw someone say, “The answer is DNS; the question is irrelevant.” Over twenty years after I learned the rest of the story (the hard way, of course), new administrators still make the same mistakes.

Symptoms of Problematic DNS

For really broken DNS, you rarely need a lot of hints. Almost nothing network-related works, especially Internet access. If you know IP addresses, you can ping them, but nothing works by name. Web browsers will even tell you that they suspect DNS.

Mostly correct DNS gives you more insidious problems. Things work most of the time. Then, seemingly without warning, they don’t. By the time help arrives for the ailing user, the problem might appear to have magically fixed itself. A few common symptoms:

  • Active Directory login attempts sometimes take an unusually long time with sporadic failures
  • Intermittent inability to access functioning network shares
  • Applications have a long delay when initially connecting to internal resources, but work normally afterward
  • Problems don’t seem to attach to specific users, although some may encounter trouble more frequently depending on usage
  • External resources exhibit few or no problems
  • Logs yield no clues

Genuinely faulty DNS can cause these problems as well. However, DNS is a simple service with few ways to fail. If something breaks, you can expect the whole thing to stop working. General network problems can also result in these symptoms, although you typically find clues in error messages and logs. Before looking for difficult things, ensure that no one made a simple configuration error.

The One Big Rule for DNS

If your organization has one or more internal DNS servers responsible for providing resolution for one or more internal services, then your internal clients should not know about any other DNS servers. If those clients also need to know about external resources, then allow your internal DNS servers to handle that. Do not leave the decision of DNS server choice to the clients.
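One way to check a client against this rule is to list the DNS servers configured on each adapter and flag any that are not internal. The following is a minimal sketch, assuming a single internal DNS server at 192.168.0.5 (an illustrative address; substitute your own list):

```powershell
# Assumption: the complete list of internal DNS servers for this network.
$internalDns = @('192.168.0.5')

Get-DnsClientServerAddress -AddressFamily IPv4 |
    Where-Object { $_.ServerAddresses } |
    ForEach-Object {
        $adapter = $_
        # Any configured server that is not on the internal list breaks the rule.
        $outsiders = @($adapter.ServerAddresses | Where-Object { $_ -notin $internalDns })
        [pscustomobject]@{
            Interface  = $adapter.InterfaceAlias
            DnsServers = $adapter.ServerAddresses -join ', '
            Violations = $outsiders -join ', '
        }
    }
```

An empty Violations column for every adapter means the client only knows about internal DNS servers.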

When DNS Plays a Role

As you’ve seen, web browsers tell you when they detect DNS problems. Most network-dependent applications won’t do that. Instead, they spin or hang for long periods of time or just fail. Even when a program can detect and report DNS failure, it won’t always do so. To client systems and software, a working but misconfigured DNS tree can behave exactly like a fully functional one. They have no mechanisms to know the difference.

You have a very simple test to know if DNS might have anything to do with your application: if it accesses or depends on any network resource by name. Some examples that might not be obvious at first glance:

  • Software that connects to a SQL server
  • Programs that read or write files on an SMB share (format: \\servername\sharename)
  • Anything that authenticates a user or resource

For the first two items, you can look through application settings to determine network reliance. The third may also have a setting somewhere, but you may not find it so easily. Many corporate applications deploy from a central location and populate the registry or write configuration files with network configuration. Others, notably those that offer a “single sign on” (SSO) experience, depend on the client computer’s domain membership to authenticate users and have no user-accessible settings. Active Directory relies on DNS for a heavy portion of its operations, including authentication. You don’t need to dive into every corner of a program to find out if it uses network resources. A reasonable suspicion will suffice.

The Basic Process of DNS Resolution

At its core, DNS matches names to IP addresses. It has other functions, all stemming from that one source. It can match names to names, IP addresses to names, special resources to any of the above, and simple kinds of information that can locate or describe domain resources.

We need DNS because TCP/IP communicates strictly using IP addresses. Humans do not remember numbers as well as we remember names. A number by itself conveys no semantic or contextual information. On the other hand, names can give us all that. So, DNS bridges the strength of humans to the strength of computers while addressing both their weaknesses.

Let’s look at a simple DNS operation, matching the name of a website to its IP address:

  1. In a web browser, you try to access https://projectrunspace.org
  2. The web browser gives the name to the local system’s networking stack to find an IP
  3. The networking stack looks in the local configuration for DNS server addresses. An administrator can enter them statically or a DHCP server can deliver them. If none exist, then the process stops here and the browser reports the failure.
  4. If the system has DNS server addresses, then the stack picks one. It sends a query to that IP to resolve “projectrunspace.org” into an IP address. If that IP does not respond, then the stack moves through the others until it finds one that works or runs out of addresses to try. If none respond, the process stops.
  5. If a DNS server receives the query, then it first checks domains for which it serves as an authority. If it is authoritative for “projectrunspace.org”, then it will maintain a database of the records in that domain and will try to find a matching IP. In this case, you did not provide a host name (like “www”), so it will look for an unnamed record. It will report back what it finds (or doesn’t) to the client’s networking stack and the browser will work from there.
  6. If the DNS server received the query but is not authoritative for the domain, then it will start asking other servers that it knows about to narrow down the location of an authoritative server for “projectrunspace.org”. It will have to find one that knows about the “org” top-level domain, then find one there that knows about “projectrunspace.org”. It will continually work through the responses until it either finds an authoritative server or it runs out of time. We call this process “recursion”. Alternatively, the server may forward the request to another DNS server. That server will either perform the recursion or continue forwarding. Forwarding has the same end effect as recursion.
  7. If the browser gets an IP address, then it goes on to try to retrieve a web page from it. If it doesn’t get an IP address, then it deals with that according to its programming. Most will recognize the DNS failure and report it.

DNS resolution ends in one of four ways:

  • A usable record
  • A definite “cannot find a matching record” message
  • An error
  • A timeout

However resolution ends, it ends. The networking stack returns the end state to the browser (or whatever application made the request). Often, administrators either don’t know that or don’t understand the ramifications. We frequently find problems there.
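You can watch a resolution reach one of these end states from PowerShell. This is a minimal sketch that assumes a Windows client with the built-in Resolve-DnsName cmdlet; the DnsOnly switch keeps the test to DNS alone:

```powershell
$name = 'projectrunspace.org'

try {
    # -ErrorAction Stop turns "not found", errors, and timeouts into exceptions.
    $records = Resolve-DnsName -Name $name -Type A -DnsOnly -ErrorAction Stop
    # A usable record: show the name-to-address matches that came back.
    $records | Where-Object { $_.Type -eq 'A' } | Select-Object Name, IPAddress
}
catch {
    # Any other end state lands here; the message says which one occurred.
    "Resolution of $name ended without a usable record: $($_.Exception.Message)"
}
```

Either branch represents a completed resolution. The calling application receives that result and decides what to do with it.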

How the Misconfiguration Happens

Like most misconfigurations, the most common one for DNS comes from thinking through a problem rationally with partial knowledge. We need DNS. Very little in the modern computing world works without it. So, we do whatever we can to ensure our systems can access a DNS server in all conditions. Unfortunately, that desire leads to a situation where the solution causes more problems than it solves.

The thought pattern for DNS typically works like this: “I want my systems to use the internal DNS server, but if it crashes, then I want them to still have Internet access”. That results in configurations like this (sometimes via DHCP instead of static assignment):

IP Address Dialog with Improperly Configured DNS Servers

The first DNS address in the screenshot belongs to a server on the local network. The second belongs to one of Google’s publicly usable DNS resolvers. This seems to implement the thought process outlined above. In reality, it does not.

Where it Goes Wrong

Taking the dialog box at face value, we might assume that the system will always query 192.168.0.5 first, only falling back to 8.8.8.8 when that attempt fails. Mostly, that happens. But only mostly. The networking stack can pick the “alternate” DNS server if it wants. That might happen because some transient glitch makes it think that it can’t reach the “preferred”. Maybe your internal DNS server said it was too busy. Perhaps your internal DNS server was down for a bit, but the client continues to use the last DNS server that it knew worked. It might even happen that the client just picks the other server.

Let’s walk through such a scenario where a client queries an external DNS server for an internal resource:

  1. A client application tries to open a connection to “sql.myinternaldomain.com” (an internal location)
  2. For whatever reason, the client’s networking stack sends the query to the “alternate” DNS address of 8.8.8.8
  3. Google’s DNS server starts the process as outlined above
  4. During recursion or forwarding, servers all over the Internet look for the server authoritative for “myinternaldomain.com”. That server sits on your private LAN where those public servers could not reach it even if they did know about it.
  5. The process eventually ends with a “not found” state or a “timeout” state. Either way, DNS resolution has completed.

What happens next depends on the operating system. For Windows, DNS resolution represents only one stage of a larger process. Before it even checks DNS, it will look in its local cache and the HOSTS file. If DNS fails, then it may send a NetBIOS broadcast. If so, it could work, if the target resource is in the same layer 2 network and the name formatting works out to a match.

Perhaps more importantly, the process ends after exhausting these steps. The networking stack does not then try the next DNS server. If the first one responded at all, then the client does not see any value in asking a second. In a properly configured environment, doing so would not yield different results. In clearer terms, a client will only try alternative DNS servers in its list if the first one does not respond at all. Additionally, to restate another major point, a client will not necessarily start with the “preferred” or first-listed server.
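To see the difference for yourself, ask each server in the client's list the same question and compare the answers. The sketch below reuses the example name and addresses from the scenario above; they are illustrative, so substitute your own:

```powershell
# Names and addresses from the scenario above; substitute your own.
$name    = 'sql.myinternaldomain.com'
$servers = '192.168.0.5', '8.8.8.8'

foreach ($server in $servers) {
    $timer = [System.Diagnostics.Stopwatch]::StartNew()
    try {
        # -Server sends the query to one specific DNS server.
        $answer = Resolve-DnsName -Name $name -Server $server -Type A -DnsOnly -ErrorAction Stop
        $result = ($answer | Where-Object { $_.Type -eq 'A' }).IPAddress -join ', '
    }
    catch {
        $result = "FAILED: $($_.Exception.Message)"
    }
    $timer.Stop()
    # One line per server: address, elapsed time, and the outcome.
    '{0,-12} {1,6} ms  {2}' -f $server, $timer.ElapsedMilliseconds, $result
}
```

If the internal server returns an address and the external one fails, then any client that happens to pick the external server will experience that failure.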

Matching the Process to the Outcomes

Let’s take a moment to review how all this ends with the various results:

  • The client selects the appropriate DNS server and gets a positive result: the query works along with whatever operation depended on it, usually quickly
  • The client selects an external DNS server to resolve an internal address and NetBIOS finds the name: the query works along with whatever operation depended on it, but almost certainly after waiting out the entire DNS timeout period
  • The client selects an external DNS server to resolve an internal address and NetBIOS is disabled or can’t find a name: the query fails, but not until after the DNS timeout expires
  • The client selects a DNS server that, whether directly or from forwarding, finds an authoritative server that has no matching record: the query fails, usually quickly
  • The client receives no response at all from the first DNS server it tries: it moves on to the next. If a server in its list responds, then it goes through the entire process with it. If no server responds, the query fails. The amount of time this takes depends on how long the client needs to consider each DNS server in its list unresponsive.

The way that these outcomes manifest depends on the application that asks for name resolution. That was how we arrived at sayings such as the quote in the first paragraph. Administrators with sufficient knowledge knew that misconfigured DNS would cause problems, but no one person could know how all possible applications would show symptoms. So, experience taught us to look for the answers to unexpected, intermittent troubles by checking DNS configuration.

How to Properly Configure a DNS Client

All this means one thing: for perfectly predictable outcomes, clients must not know about any DNS servers other than internal ones. If they don’t have the option to query a DNS server that can’t give a valid answer, then they’ll never get an invalid response.

For the system in the above screenshot, we would correct it like this:

IP Address Dialog with a Properly Configured DNS Server

If you distribute DNS servers as part of DHCP, then make the same repairs there.
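The same correction can be made from PowerShell. This is a sketch only: the adapter name 'Ethernet' and the scope ID 192.168.0.0 are assumptions for illustration, and the second part applies only if a Windows DHCP server hands out your DNS settings:

```powershell
# Assumption: the adapter is named 'Ethernet'; list real names with Get-NetAdapter.
Set-DnsClientServerAddress -InterfaceAlias 'Ethernet' -ServerAddresses '192.168.0.5'

# Discard answers cached while the old server list was in effect, then verify.
Clear-DnsClientCache
Get-DnsClientServerAddress -InterfaceAlias 'Ethernet' -AddressFamily IPv4

# On a Windows DHCP server: hand out only the internal DNS server.
# Assumption: 192.168.0.0 is the scope ID; check yours with Get-DhcpServerv4Scope.
Set-DhcpServerv4OptionValue -ScopeId 192.168.0.0 -DnsServer 192.168.0.5
```

DHCP clients pick up the new server list when they renew their leases.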

Of course, that solution does not fully meet our desires. What happens when 192.168.0.5 goes offline, even just for routine patching? If you don’t have another internal DNS server, then resolutions would fail until it comes back online. However, they will fail quickly. If you know that the DNS server won’t come back online quickly but Internet access otherwise remains available, then you can reconfigure clients to use Internet DNS servers in the interim. If your organization simply cannot tolerate that, then you need to set up another internal DNS server.

Configuring a DNS Server for Forwarding

Full configuration of a DNS server goes far beyond the bounds of what we can tackle in this article. Fortunately, when you first install the DNS service on a Windows Server in a new Active Directory domain, it largely takes care of itself. Linux DNS servers need more work, but usually not an excessive amount.

By default, both Windows and Linux DNS servers have and use “root hints” repositories. These contain the IP addresses of the Internet’s root name servers, which refer queries to the servers that can authoritatively answer for top-level domains (“.com”, “.org”, “.net”, etc.). That allows these servers to perform recursive resolution for your internal clients without any further configuration (although the root hints repositories sometimes need updates, which your normal update process should automatically handle; check the documentation if you use a Linux DNS server).
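On a Windows DNS server, you can look at the root hints and confirm that recursion works for an external name. A brief sketch, assuming the internal DNS server answers at 192.168.0.5 (an illustrative address):

```powershell
# Run on the DNS server itself (requires the DnsServer module).
Get-DnsServerRootHint

# From any client: ask the internal server to resolve an external name.
# A returned address shows that the server can resolve names outside its own zones.
Resolve-DnsName -Name 'projectrunspace.org' -Server 192.168.0.5 -DnsOnly
```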

If you want more control over external DNS resolution, then you can configure forwarders. This gives you the opportunity to decide which servers your DNS servers turn to when they do not have authority for a query and offloads the recursion process. Remember that you only need forwarding in situations where you know that the standard recursion process will not produce a valid result but you know of a server that can succeed.

Check your documentation if you need to configure a Linux DNS server. The next section includes instructions for configuring forwarding on a Windows DNS server.

How to Configure Forwarding on Windows DNS Server

Before you start, determine the IP addresses for the servers that you want to take your forwarded DNS queries. You can use general DNS servers on the Internet, such as Google’s, OpenDNS’s, or your ISP’s. In addition to offloading the recursion work, some of these forwarders also provide additional features, such as URL filtering.

You might also need conditional forwarders. Use these in situations where you need another DNS server to handle requests for particular domains. As an example, you may have disjointed domains in your environment — not visible to the public Internet but reachable over LANs, VPNs, or other private connections.

Once you have your list of general and conditional DNS forwarders, you can configure them in PowerShell or the DNS console.

Configuring Generic DNS Forwarders in PowerShell

Use Add-DnsServerForwarder for each general DNS forwarder. The following example adds the OpenDNS addresses as forwarders:

Add-DnsServerForwarder -IPAddress 208.67.222.222, 208.67.220.220

This only adds the forwarders to the server where you run the command (or the server that you name with the ComputerName parameter). You must add them separately on each DNS server.
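If you have several DNS servers, a loop with the ComputerName parameter saves some repetition. In this sketch, 'dns01' and 'dns02' are hypothetical server names; replace them with your own:

```powershell
# Assumption: hypothetical DNS server names.
$dnsServers = 'dns01', 'dns02'
$forwarders = '208.67.222.222', '208.67.220.220'

foreach ($server in $dnsServers) {
    # -PassThru returns the resulting forwarder configuration for review.
    Add-DnsServerForwarder -IPAddress $forwarders -ComputerName $server -PassThru
}
```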

You can check your work with Get-DnsServerForwarder. If you made a mistake or just want to get rid of a forwarder, you can use Remove-DnsServerForwarder with the same format that you used to add it:

Remove-DnsServerForwarder -IPAddress 208.67.222.222

You can even remove all forwarders by piping the output of Get-DnsServerForwarder to Remove-DnsServerForwarder:

Get-DnsServerForwarder | Remove-DnsServerForwarder

There is also a Set-DnsServerForwarder cmdlet that allows you to modify a few options of a forwarder. That goes beyond the intent of this article, but explore it on your own.
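As a starting point for that exploration, the sketch below changes two server-wide forwarding options. The values are illustrative rather than recommendations. Check the cmdlet's help before passing IPAddress to it, because that changes the forwarder list itself:

```powershell
# Fall back to root hints when no forwarder answers, and give each
# forwarder 3 seconds to respond (illustrative values).
Set-DnsServerForwarder -UseRootHint $true -Timeout 3 -PassThru
```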

Configuring Conditional DNS Forwarders in PowerShell

Conditional forwarders rely on a different set of PowerShell cmdlets. Think of these from the perspective of the domains that the conditional forwarders will handle instead of the DNS servers themselves. To add conditional forwarders, you need to know the fully qualified names of those domains and the IP addresses of the servers that can handle requests for them.

If you intend to integrate conditional forwarders into your Active Directory DNS configuration, then you also need to know the scope. You have four options:

  • Forest: The forwarded domain name and DNS IP addresses will replicate to every DNS server in the forest.
  • Domain: Only DNS servers in the same domain as the DNS server where you run the cmdlet will receive information on the forwarded domain.
  • Custom: Stores the information in a directory partition. You will also need to know the name of that partition.
  • Legacy: Replicates the zone to all domain controllers in the same domain. This will bypass DNS servers not installed on domain controllers.

Use Add-DnsServerConditionalForwarderZone to add conditional forwarders.

To add a non-Active Directory conditional forwarder (stored in the DNS server’s registry):

Add-DnsServerConditionalForwarderZone -ZoneName 'alternativedomain.com' -MasterServers 172.16.20.5, 172.16.20.10

For an Active-Directory integrated conditional forwarder, just add the ReplicationScope parameter:

Add-DnsServerConditionalForwarderZone -ZoneName 'alternativedomain.com' -MasterServers 172.16.20.5, 172.16.20.10 -ReplicationScope Forest

If you chose the “Custom” scope, then you also need to provide the name of the partition to the DirectoryPartitionName parameter. This cmdlet has a few other parameters for uncommon options; see its documentation for details. There is also a Set-DnsServerConditionalForwarderZone cmdlet to change an existing conditional forwarder.
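For instance, if the servers that answer for the forwarded domain change, a call along these lines should update the existing zone. This is a sketch; 172.16.20.15 is a hypothetical replacement address:

```powershell
# Replace the server list for an existing conditional forwarder.
# Assumption: 172.16.20.15 is a hypothetical new DNS server for the domain.
Set-DnsServerConditionalForwarderZone -Name 'alternativedomain.com' -MasterServers 172.16.20.5, 172.16.20.15 -PassThru
```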

Because these exist as zones instead of as a simple list, use the DNS zone cmdlets to view or remove conditional forwarders.

To view all zones, use the base cmdlet Get-DnsServerZone. Since we’re interested in conditional forwarders, you can narrow the list down to only those like this:

Get-DnsServerZone | Where-Object -Property ZoneType -EQ -Value Forwarder

That’s the full syntax, with no aliases or positional parameters. You can accomplish the same thing with a much shorter line:

Get-DnsServerZone | ? ZoneType -eq Forwarder

You can also retrieve zones by their full name without the “Where”:

Get-DnsServerZone -Name 'alternativedomain.com'

You can use advanced techniques with “Where” to perform partial matches. That’s beyond the purpose of this article. Make sure that you understand exactly how to retrieve the domains that you want before you attempt to remove any.
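As a small taste of what a partial match looks like, this sketch lists only the conditional forwarders whose names end with a given suffix. The suffix is illustrative. It only reads data, so it is safe to experiment with:

```powershell
# Conditional forwarders whose zone name ends in 'alternativedomain.com'.
Get-DnsServerZone |
    Where-Object { $_.ZoneType -eq 'Forwarder' -and $_.ZoneName -like '*alternativedomain.com' } |
    Select-Object ZoneName, ZoneType, IsDsIntegrated
```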

To remove a conditional forwarder, use Remove-DnsServerZone. Make absolutely certain that you understand how to select exactly the zone(s) that you want to delete. If you make a mistake, you might delete a primary zone. You can remove a zone by exact name like this:

Remove-DnsServerZone -Name 'alternativedomain.com'

You can remove a filtered list by piping output from Get-DnsServerZone. Due to the risk of removing an incorrect zone, you should test the Get portion by itself first. You can also use WhatIf with Remove-DnsServerZone to see what it will do without doing it. Remove-DnsServerZone should prompt with the target name before deleting it, but do not rely on that. The following shows an example with WhatIf:

Get-DnsServerZone | ? ZoneType -eq Forwarder | Remove-DnsServerZone -WhatIf

Just remove the WhatIf to perform the deletion.

Configuring Generic Forwarders in the DNS Console

In the DNS console, right-click on a server and click Properties:

DNS Server Context Menu

Change to the Forwarders tab and click the Edit button:

DNS Forwarders Tab

In the Edit Forwarders dialog, you can add, delete, and reorder DNS forwarders.

Edit Forwarders Dialog

As you add items, it will attempt to resolve the IPs that you provide to a name. You can keep unresolvable items, but check that the server can reach them. When forwarding, your server will give the target 5 seconds to time out, which will cause a noticeable delay for users.

DNS does not replicate generic forwarders. Perform the same process on your other DNS servers.

Configuring Conditional Forwarders in the DNS Console

In the DNS console, click the Conditional Forwarders branch. In the right pane, right-click a blank spot and click New Conditional Forwarder.

Conditional Forwarder Branch in DNS Console

This brings up the New Conditional Forwarder dialog:

New Conditional Forwarder Dialog

First, you define the DNS domain that the conditional forwarder handles. Next, enter the IP address(es) for the forwarder(s). It will attempt to resolve them to names. You can keep unresolvable forwarders, but take the opportunity to check that you entered the IP correctly. Unreachable forwarders take 5 seconds to time out, which causes a noticeable delay for users. Finally, decide how the zone should replicate. If you do not check the box, the DNS server will store the zone in its local registry and not replicate it at all. Use the drop-down box to see your alternatives. If you like, you can adjust the time out delay.

You may need to refresh the console to see the new zone.
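You can also confirm the new conditional forwarder from PowerShell. This sketch assumes the example domain from earlier in the article and an internal DNS server at 192.168.0.5, both illustrative:

```powershell
# On the DNS server: confirm that the zone exists and is a forwarder.
Get-DnsServerZone -Name 'alternativedomain.com'

# From a client: ask the internal server for the forwarded domain's SOA record.
# An answer indicates that the internal server resolved the forwarded domain.
Resolve-DnsName -Name 'alternativedomain.com' -Type SOA -Server 192.168.0.5 -DnsOnly
```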

If you right-click on a conditional forwarder zone in the tree view, you can delete it or open its properties dialog:

Conditional Forward Zone Context Menu

In the properties dialog (not shown) you can see the configuration on the General tab and access the ACL for the zone on the Security tab. The General tab has an Edit button that will show the same dialog box that you saw when you added the forwarder, although you cannot change the domain name.

Keep DNS Happy

Once configured properly, DNS needs little attention. Improperly configured DNS can cause frustrating problems. When you experience network oddities, add DNS to your list of things to verify. Of course, this article only highlights one common mistake. Other things can interfere with DNS, especially when delivering resolution services to clients. Even if your configuration passes all checks, you still might have a DNS problem.