We had n datacenters, each named after its city: ldn.$company.com, ny.$company.com, and so on. Via DHCP we pushed out the DNS search order so that a host would try to resolve a name locally first and, if that failed, try one level up until something worked.
This meant that when you bound to "service", the resolver would first look up service.$location.$company.com and, if that wasn't there, fall back to service.$company.com.
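A rough sketch of that lookup order, in Python for illustration only. It mimics how a typical stub resolver expands a short name against its search list; `candidate_names` is a made-up helper, `example.com` stands in for $company.com, and the `ndots` handling is my assumption about the usual resolver default rather than anything specific to this setup.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified.
        return [name]
    searched = [f"{name}.{domain}." for domain in search_domains]
    absolute = name + "."
    # Names with at least `ndots` dots are tried as-is first, otherwise last.
    if name.count(".") >= ndots:
        return [absolute] + searched
    return searched + [absolute]


# Hypothetical search list pushed out by DHCP to hosts in the London datacenter.
search = ["ldn.example.com", "example.com"]
for fqdn in candidate_names("service", search):
    print(fqdn)
```

A host in London would try service.ldn.example.com first, then service.example.com, which is why moving a service only means changing which of those names exists.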
This cut down the need for nasty split-horizon DNS, and moving VMs/services/machines between datacenters was simple and zero-config.
If you were taking a service out of commission in one datacenter, you'd CNAME service.$location.$company.com to a different datacenter, do a staged kick of the machines, and BOOM: failed over with only one config change.
On a side note, you can use SSSD or (shudder) NSLCD to cache DNS.
We do, but in the specific case of Active Directory, we want to fail over and authenticate against another datacenter if the primary is offline. This means that for our domain, the domain controllers local to the /16 are returned first, and then the others. The problem is that the local BIND doesn't preserve this order, and applications are suddenly authenticating across the planet.
DNS devolution isn't a good idea here, since the external domain is a wildcard: a lookup that falls through to the parent domain would match the wildcard instead of failing. We'll be paying for that mistake from long ago until (if ever) we change the internal domain name.
This is a pretty recent problem that we're only now getting to, because DNS volume has been a back-burner issue. We'll look into permanent solutions for all Linux services after the CDN testing completes. Recommendations on Linux DNS caching are much appreciated, and we'll review each one. It just hasn't been an issue in the past, so we're not experts in that particular area. I am surprised caching hasn't landed natively in most of the major distros yet, though.
Aha, gotcha. I was under the impression that SSSD chose the fastest AD server it could find (either via the SRV records or via a predetermined list)? I've not had much trouble with it stubbornly binding to the furthest-away server. (That's with AD doing the DNS and delegating to BIND.)
> The problem is BIND locally doesn't preserve this order
Nor need any other DNS server software do so. The actual DNS protocol has no notion of an ordering within a resource record set in an answer.
I suspect, from your brief description here, that what you'll end up doing is using the "sortlist" option in /etc/resolv.conf, the configuration file of the BIND DNS client library. SRV RRSets will introduce some interesting complexities, though.
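For what it's worth, "sortlist" takes address/netmask pairs (something like `sortlist 10.10.0.0/255.255.0.0`) and reorders the addresses that come back so that ones in the listed networks sort first. The effect is roughly what this Python sketch does. The function name and the 10.x addresses are invented for illustration, and it only covers plain address lists, not SRV answers.

```python
import ipaddress


def sort_local_first(addresses, local_networks):
    """Stable sort: addresses inside an earlier-listed network come first."""
    networks = [ipaddress.ip_network(n) for n in local_networks]

    def rank(address):
        ip = ipaddress.ip_address(address)
        for position, network in enumerate(networks):
            if ip in network:
                return position
        return len(networks)  # not in any preferred network: goes to the end

    return sorted(addresses, key=rank)


# Hypothetical domain controller addresses, in whatever order the cache returned them.
answers = ["10.20.0.5", "10.10.0.5", "10.30.0.5", "10.10.0.6"]
print(sort_local_first(answers, ["10.10.0.0/16"]))
```

Because the sort is done on the client, it doesn't matter what order the caching server hands the records back in, which fits the point above that the protocol itself makes no ordering promise.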