{"id":103,"date":"2019-02-04T23:09:37","date_gmt":"2019-02-04T23:09:37","guid":{"rendered":"https:\/\/thelordofthepings.newedgenetworking.com\/wp\/?p=103"},"modified":"2023-08-22T01:15:32","modified_gmt":"2023-08-22T01:15:32","slug":"ntp-redesign","status":"publish","type":"post","link":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/2019\/02\/04\/ntp-redesign\/","title":{"rendered":"NTP redesign"},"content":{"rendered":"\n<p><span style=\"font-weight: 400;\">This post is about a bug that affected NTP (Network Time Protocol) and our redesign of the environment to bypass the issue.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">In this environment the core Cisco 7604 IOS routers were the NTP stratum 2 servers (x.x.x.123 because fun with port numbers). The IP was an HSRP standby IP. There were several downstream Linux NTP servers and Windows Domain Controllers serving NTP to Windows clients. As unsupported Linux servers died, their IPs were just added to servers that were still alive. Eventually this got messy.<\/span><\/p>\n\n\n\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/v8xrbYeUz6ZReJBdbmHPfHdVoOaTQ_Ni-rnntOaX1vF6EqiQkjhmkJJDE4IIDoYHHPCxJVSNDHyfhpj5RrFjZT4CWDEUQql1kcL_uq-LbF59HA7D15Mh1GaRu2uhdHwI69o5Ko4Tx7cDeneH0RYCcw\" width=\"624\" height=\"493\"><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">After the 7604 routers were replaced with a pair of ASR1006X routers, we ran into some interesting issues. Windows users were no longer able to log in. It turned out the Domain Controllers were falling out of sync. My Infoblox DDI servers also showed stale time. Users were eventually able to log into the Domain again around the time the Windows team changed their NTP config. The sys admins were now syncing with one of the Stratum 3 Linux servers. 
Knowing that the only thing that had changed in our environment was the ASR routers, I knew this wouldn\u2019t be the end of the issue.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">I opened a ticket with Cisco TAC to troubleshoot the ASRs. TAC thought the cause might be our use of a standby IP, but I couldn\u2019t get resources to help test, so we got nowhere. Eventually bug ID CSCsq31723 was created, which I think is related. Fast forward 6 months and Windows users couldn\u2019t log in again. This time we decided to go nuclear and redesign the whole NTP layout.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">The new design removed all servers running on unsupported hardware and OSes. It also made use of our Infoblox DDI grid, which is a purpose-built tool for DNS, DHCP, IPAM, NTP, and File Distribution.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">We decided on this:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400;\">3 internet Stratum 1 servers<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">3 Stratum 2 servers. 
Infoblox DDI Grid Master at HQ, the Grid Master Candidate at our DR location, and a Linux server to diversify technologies.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Place the Stratum 2 servers in a mesh.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">The Infoblox DDI Grid Master and Grid Master Candidate feed the 2 HA pairs of Grid Members (4 servers with 2 VIPs).<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Create Access Lists on the Stratum 2 servers so only the Stratum 3 servers can sync with them.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">All clients would then sync with the HA pairs of Infoblox DDI Grid Members.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Set up NTP Authentication (<\/span><a href=\"https:\/\/www.nist.gov\/pml\/time-and-frequency-division\/time-services\/nist-authenticated-ntp-service\"><span style=\"font-weight: 400;\">https:\/\/www.nist.gov\/pml\/time-and-frequency-division\/time-services\/nist-authenticated-ntp-service<\/span><\/a><span style=\"font-weight: 400;\">).<\/span><\/li>\n<\/ul>\n\n\n\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/ZAb-WvgmqlvNIMFk1lTpEVwTu2N1AKGt-3UOA52b1Ka1bkW6JDJSYDPbciMipJyawTmE8rsllFoPhnOXhElCJWEDxztFIR48VrMVPS9RAeHuv3onpsyYdgs6uo-QGWtPzLFyp0yeS5oP3SU-Rvmbyw\" width=\"624\" height=\"536\"><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">Both VIPs of the HA-paired Grid Members had a user-friendly DNS record, and some systems accept a hostname\/DNS name for the NTP config. That would be handy if we ever needed to change the IPs again. For extra flexibility we could use F5 BIG-IP GTM (DNS load balancing). But network devices like routers and switches don\u2019t support using DNS for NTP, meaning there would be two sets of configs: one with NTP configs using DNS and another hard-set to static IPs. 
We wanted a global config, so we went with static IPs everywhere.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">The IPs were given out and people were told to migrate. We set up a SPAN session and periodically checked to see who was still pointed at the old servers before finally retiring the old IPs and servers.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">It wasn\u2019t perfect, but it was a big improvement. Long term I\u2019d want to install our own Stratum 0 GPS antennas instead of using internet-hosted servers. For a home project I\u2019m thinking of using a Raspberry Pi to make one, using this link as a guide (<\/span><a href=\"https:\/\/www.satsignal.eu\/ntp\/Raspberry-Pi-NTP.html\"><span style=\"font-weight: 400;\">https:\/\/www.satsignal.eu\/ntp\/Raspberry-Pi-NTP.html<\/span><\/a><span style=\"font-weight: 400;\">).<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post is about a bug that affected NTP (Network Time Protocol) and our redesign of the environment to bypass the issue. In this environment the core Cisco 7604 IOS routers were the NTP stratum 2 servers (x.x.x.123 because fun with port numbers). The IP was an HSRP standby IP. 
There were several downstream Linux NTP [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[35,2,42],"tags":[12,9,13,19,22,26,44,11,25],"class_list":["post-103","post","type-post","status-publish","format-standard","hentry","category-cisco","category-infoblox-ddi","category-troubleshooting","tag-cisco","tag-ddi","tag-infoblox","tag-ios","tag-ios-xr","tag-linux","tag-microsoft","tag-ntp","tag-windows"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/posts\/103","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/comments?post=103"}],"version-history":[{"count":3,"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/posts\/103\/revisions"}],"predecessor-version":[{"id":336,"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/posts\/103\/revisions\/336"}],"wp:attachment":[{"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/media?parent=103"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/categories?post=103"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/andrewjwhittle.com\/wp-lotpnet\/wp-json\/wp\/v2\/tags?post=103"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}