New Organization Checklist

Whenever I join a new organization I run through this checklist to see what improvements there are to be made. Some are easier and quicker wins than others. I like to address the easy wins first to show value early on. I plan on making posts related to each item and linking them here.

AAA TACACS+ / RADIUS
BPDU Guard
Certificate strength
Certificates self-signed / CA-signed
Change Control
Config MGMT / backup / revision control
Correct Bandwidth statements on interfaces
DCIM
DHCP settings / Snooping
Diagrams
DNS / Hostnames
EOL HW / SW
Firmware standard
HA Hardware
HA WAN / Internet connection
IPAM
Jumpboxes
Knowledge Base
Line vty / con / aux
Logging local / remote Syslog
Loopback / MGMT IP / interface
MAC security
Monitoring and alerting
NAC / 802.1x / dot1x
NetFlow / Top Talkers
NTP
Out Of Band MGMT
PW policy
SDWAN
SNMPv3
SSHv2
Stolen BGP ASN
Stolen Public IP space
STP
UTC Time Zone
Vendor Contacts and recurring meetings
VLAN names
VLAN numbers

Network Device Labeling

It can be frustrating trying to locate a device in a DC. Even if you know the section, row, and rack number there may be multiple devices of the same make/model, which is where labels come in handy. Unfortunately, there isn’t a vendor-neutral, industry-standard place to add a label, and the glue may fail, causing the label to fall off. For now, I’m going to focus on the label location. I would like to submit an RFC to the IETF because I have not found one as of this writing. The RFC would be an industry standard for hardware vendors to follow when designing new equipment.

The specifications should be based on a 1 U device, which would cover non-rack mounted devices, but also be applied to larger devices. There are 2 key components to identifying a device: hostname (ideally one that resolves in DNS) and IP address. Many organizations also label devices with an “Asset Tag” which provides details on purchase and depreciation. The location and size of these should also be standardized.

There should also be a label location on both the front and back, because the device and the job determine which side you work on.

Additionally, room for a QR code that can provide more details could be handy.

Python Hangman game

I wrote this a few years ago while I was going through the projects on inventwithpython.com and this one interested me beyond just following along. You start in chapter 8 (http://inventwithpython.com/invent4thed/chapter8.html) making a hangman game with ASCII art and a list of words. Chapter 9 (http://inventwithpython.com/invent4thed/chapter9.html) expands it with multiple difficulty settings and word options.

The code for those 2 parts is linked but I wanted to take it further. I started thinking that there must be a bunch of text files out there with all English words. I figured there would be a dictionary or Webster’s.txt but I haven’t found one. I found many files with tens to hundreds of thousands of words, so I grabbed one and started playing.

https://github.com/dwyl/english-words
https://sites.google.com/a/vhhscougars.org/johnsearch/searchindex/oxford-english-dictionary
http://www.gwicks.net/dictionaries.htm

All that had to change was the words variable: instead of putting the words directly in the code as a list or dictionary, depending on which chapter you’re working on, it now reads them from the file with words = list(open('words_alpha.txt')). It’s a small change code-wise but it opens up the game to new possibilities.
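A minimal sketch of that change, assuming the word list is a plain-text file with one word per line. The helper names load_words and pick_secret_word and the minimum-length filter are my own additions for illustration, not part of the book’s code. Stripping each line matters because list(open(...)) leaves the trailing newline on every word.

```python
import random


def load_words(path, min_length=3):
    """Read one word per line, dropping blank lines and very short words.

    min_length is an illustrative assumption, not something from the book.
    """
    with open(path, encoding="utf-8") as word_file:
        words = [line.strip().lower() for line in word_file]
    return [word for word in words if len(word) >= min_length]


def pick_secret_word(words):
    """Choose the word the player has to guess."""
    return random.choice(words)
```

With that in place, words = load_words('words_alpha.txt') can stand in for the hard-coded list, and the rest of the game should not need to know where the words came from.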

I’m having problems finding lists of specific words but I’d like to build on chapter 9 and have topics, and maybe hints. The hints could be in the file or be scraped from search engine results.

If I continue to work on it here’s the Gitlab https://gitlab.com/andrewjwhittle/hangman

Bar guest wifi

Protip, try these PSK’s before asking at the bar or restaurant:

drinkmorebeer

coldbeer

icecoldbeer

Most businesses have guest wifi and it usually works smoothly. However, I’ve identified a design issue for a specific use case. Default DHCP settings and a standard /24 Subnet work well for many environments. But an area with hundreds or thousands of guests coming and going daily has unique needs. Some guests at my local sports bar had reported that they could not connect to the guest wifi but only on some of their devices. It’s usually a new device or someone who doesn’t visit often. I knew the cause was that we were running out of IP’s, as I had run into the same symptoms years prior. With a building max capacity of 500 guests, a subnet of 254 IP’s and a DHCP lease of 7 days don’t meet their needs.

Since most non-chain bars and restaurants will likely have limited IT resources I think a software solution from network vendors is needed. During the initial hardware setup, a series of questions should be asked to determine the network’s needs. For these high-traffic environments, I advise the following.

SSID “Company Name Guest”.

If you’re in a shared building that has overlapping wifi signals consider all companies using the same SSID for easy roaming.

Subnet size equal to the building max capacity multiplied by at least 2.

Lease timer 8 hours.

Name Servers 8.8.8.8 and 1.1.1.1.
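The sizing rule above can be turned into quick arithmetic. This is a rough sketch, assuming IPv4 and one address reserved for the gateway. The function names and the 10.20.0.0/22 range in the usage note are made up for illustration.

```python
import ipaddress


def guest_prefix_length(max_capacity, multiplier=2):
    """Return the longest IPv4 prefix that fits max_capacity * multiplier clients."""
    hosts_needed = max_capacity * multiplier
    # Assumption: reserve network, broadcast, and one gateway address.
    addresses_needed = hosts_needed + 3
    # Smallest number of host bits whose power of two covers the need.
    host_bits = (addresses_needed - 1).bit_length()
    return 32 - host_bits


def usable_hosts(network):
    """Addresses left for DHCP clients after network, broadcast, and gateway."""
    return ipaddress.ip_network(network).num_addresses - 3
```

For the 500-guest sports bar, guest_prefix_length(500) comes out to 22, and usable_hosts('10.20.0.0/22') gives 1021 client addresses compared with 253 for a /24. The shorter lease matters just as much: with a 7 day lease every device seen during the week holds an address, while an 8 hour lease lets the pool turn over daily.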

Side note: Recently I was wondering why we even put passwords on guest wifi. I’m disappointed that I only now learned that open wifi does not encrypt the traffic. But at the same time, I’m happy as I can freely admit I didn’t know something and I got to learn something. I’ve seen the warnings but thought they were general “be safe in an unknown neighborhood” and not “everything is unencrypted”. But that does point to an opportunity to write about why that is and how we can change it. Without investigating, I think the answer could be a certificate provided by the Access Point, similar to how HTTPS works.

New TLD and Restoration

I lost TheLordOfThePings.com due to the expiration of both the domain and the card set for autopayment. So now I’m using TheLordOfThePings.net, which is more fitting for the blog. In addition to the new TLD I had some issues with my main domain, so I had to restore the blog by moving Tables between DB’s.

NAT’s interaction with DNS answers

Recently I was troubleshooting some odd DNS results between 2 customers that have a B2B connection. The DNS record in question existed in the wild on the internet and resolved to 113.129.255.98 (all IP’s have been randomized using https://onlinerandomtools.com/generate-random-ip for anonymity). Customer A resolved to 192.168.20.5 on their end of the link and Customer B resolved to 172.16.20.57 on the other end, where the server lived, which was the correct one. DNS admins were brought in on both sides. Customer A confirmed that they had Conditional Forwarders configured to query Customer B’s Name Servers for this Zone.

 

To the Packets! Captures were taken nearest each Name Server and nearest each end of the B2B connection. The change was happening on Customer A’s side of the link, behind a router doing NAT. We had brought up NAT a couple of times but thought “nah, that’s not what NAT does”. By the title of this post you can probably guess what happened when we said “ok, let’s remove the NAT”: problem solved. We were flabbergasted.

 

I was familiar with DNS spoofing/poisoning/hijacking but had never thought that NAT could be leveraged in this way. Research brought me to DNS Doctoring, which is the non-malicious way to manipulate DNS, for good rather than for evil. But this was different. Eventually I stumbled across RFC 2694 “DNS extensions to Network Address Translators (DNS_ALG)” https://tools.ietf.org/html/rfc2694 and Application Level Gateways.

 

These are a couple of the pages that I found on the subject.

https://blog.webernetz.net/cisco-router-disable-dns-rewrite-alg-for-static-nats/

https://www.cisco.com/c/en/us/td/docs/security/asa/asa95/configuration/firewall/asa-95-firewall-config/nat-reference.pdf 

 

It took longer than it should have to find the answer because I was using search terms that weren’t used much. I’ve included my verbiage so the next person will find this faster.

 

On a Cisco IOS device, Application Level Gateway for NAT is enabled by default, which means that in a one-to-one NAT (not overloaded/PAT) the DNS answer within received packets is rewritten to the NAT IP.

 

So with this config:

ip nat outside source static <outside global> <outside local>

ip nat inside source static 172.16.20.57 192.168.20.5

 

When you do an nslookup from the receiving side of the NAT (in this case Customer A) you’ll resolve to the outside local IP instead of the outside global one.
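To make the rewrite concrete, here is a toy Python model of what the ALG does to a DNS answer as it crosses the NAT. It is only an illustration of the behaviour described above, not how IOS implements it. The hostnames are hypothetical and the two addresses are the randomized ones used in this post.

```python
import ipaddress

# Static one-to-one NAT from the post: real server address -> translated address.
STATIC_NAT = {
    ipaddress.ip_address("172.16.20.57"): ipaddress.ip_address("192.168.20.5"),
}


def rewrite_dns_answers(answers, nat_table=STATIC_NAT):
    """Mimic a DNS ALG: swap any A record that matches a static NAT entry.

    answers is a list of (name, address) tuples as they leave the name server.
    """
    rewritten = []
    for name, address in answers:
        address = ipaddress.ip_address(address)
        # Addresses without a static NAT entry pass through untouched.
        rewritten.append((name, str(nat_table.get(address, address))))
    return rewritten
```

Feeding it [('app.example.com', '172.16.20.57')] returns the same name with 192.168.20.5, which is the answer Customer A saw even though Customer B’s Name Servers sent the real address.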

 

You can disable ALG with:

no ip nat service alg udp dns

no ip nat service alg tcp dns

 

Or change the NAT so that it doesn’t alter the payload of the packets that contain DNS answers, or the payload of any packets, who knows what other “features” lie in wait:

ip nat inside source static <outside global> <outside local> no-payload

ip nat inside source static 172.16.20.57 192.168.20.5 no-payload

 

One more reason to hate NAT, which would also have been an acceptable post name.

Securing the wired network with 802.1X

This post covers an innovation project I did to secure the wired network at a shared conf center with 802.1X.

Every few months we had to disable the wired network in order to prevent non-employees from being able to get online. This was not scalable and was prone to human error and scheduling confusion. I planned to automate the process by enabling 802.1X, aka dot1x, on the switches using our Windows AD via Cisco ACS.

Any Domain joined devices that plugged in would get access to our corp VLAN, and unknown devices would go into a dead VLAN. Long term I planned to enable a wired guest VLAN. I had it labbed out for non local switched wifi where the guest VLAN exists on the switch you’re connected to, but didn’t get around to labbing local switching using CAPWAP tunnels.

Wired Guest Access using Cisco WLAN Controllers Configuration Example:

http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/99470-config-wiredguest-00.html

Phones would end up in the VoIP VLAN but they weren’t equipped for dot1x authentication. I had two options: manually add each MAC address to a list in the ACS, which is not scalable or supportable, or remove the user VLAN from the ports with phones connected. I went with the second. The voice VLAN itself was locked down with a strict ACL that only allowed communication with the VoIP server.

Client configs:

In this environment there were only Windows clients. Windows needs to be configured to enable their supplicant for dot1x. Anything with Windows can and should be controlled by Group Policy. I researched the needed settings and how to set them via GPO. Then I worked with the Windows team to roll out the GPO to a pilot group and finally deploy globally.

Configuring 802.1X Wired Authentication on a Windows 7 Client:

https://documentation.meraki.com/MS/Access_Control/Configuring_802.1X_Wired_Authentication_on_a_Windows_7_Client

You can do the same thing with other versions of Windows; this was just the one I worked with.

Windows AD GPO guide:

https://msdn.microsoft.com/en-us/library/dd759237.aspx

When these Win 7 machines were upgraded to Win 10 the GPO still worked.

 

Switch configs:

!Debug 802.1x all
!Debug radius all
conf t
aaa new-model
aaa authentication dot1x default group radius
aaa authorization network default group radius
aaa accounting network default start-stop group radius
dot1x system-auth-control
dot1x guest-vlan supplicant
radius server acs1.foo.com
 address ipv4 10.1.181.2 auth-port 1812 acct-port 1813
 key 0 This-IsTheSharedSecret123
exit
radius server acs2.foo.com
 address ipv4 10.2.181.2 auth-port 1812 acct-port 1813
 key 0 This-IsTheSharedSecret123
exit
!User ports
interface <interface>
 authentication port-control auto
 authentication host-mode multi-domain
 dot1x pae authenticator
 authentication event no-response action authorize vlan 15
 authentication event fail action authorize vlan 15
!Phone ports
interface <interface>
 no switchport access vlan 1010
end
copy run start
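Pasting the user port block onto every interface by hand gets old quickly. Here is a rough Python sketch that renders the same stanza for a list of ports. The interface names are hypothetical, the fallback_vlan parameter name is mine, and the command lines are copied from the user port config above.

```python
# Per-port commands taken from the "User ports" section of the switch config.
USER_PORT_TEMPLATE = """interface {name}
 authentication port-control auto
 authentication host-mode multi-domain
 dot1x pae authenticator
 authentication event no-response action authorize vlan {fallback_vlan}
 authentication event fail action authorize vlan {fallback_vlan}"""


def render_user_ports(interfaces, fallback_vlan=15):
    """Build the dot1x stanza for each interface, separated by '!' lines."""
    return "\n!\n".join(
        USER_PORT_TEMPLATE.format(name=name, fallback_vlan=fallback_vlan)
        for name in interfaces
    )
```

Something like print(render_user_ports(['GigabitEthernet1/0/1', 'GigabitEthernet1/0/2'])) gives output that can be reviewed and pasted in under conf t.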

ACS Configs:

I already had the ACS configured to do dot1x auth for wifi clients so it was simple to just add the new switches to the rule set.

 

It was a success and opened the door to securing all wired networks.

NTP redesign

This post is about a bug that affected NTP (Network Time Protocol) and our redesign of the environment to bypass the issue.

In this environment the core Cisco 7604 IOS routers were the NTP stratum 2 servers (x.x.x.123 because fun with port numbers). The IP was an HSRP standby IP. There were several downstream Linux NTP servers and Windows Domain Controllers serving NTP to Windows clients. As unsupported Linux servers died, their IP’s were just added to servers that were still alive. Eventually this got messy.

After the 7604 routers were replaced with a pair of ASR1006X we ran into some interesting issues. Windows users were no longer able to log in. Turns out the Domain Controllers were falling out of sync. My Infoblox DDI servers also showed stale time. Users were eventually able to log into the Domain either before or after the Windows team changed their NTP config. The sys admins were now syncing with one of the Stratum 3 Linux servers. Knowing that the only thing that had changed in our environment was the ASR routers I knew this wouldn’t be the end of the issue.

I opened a ticket with Cisco TAC to troubleshoot with the ASR’s. TAC thought maybe it was because we were using a standby IP. But I couldn’t get resources to help test so we got nowhere. Eventually a bug ID CSCsq31723 was made which I think is related. Fast forward 6 months and Windows users can’t log in again. This time we decided to go nuclear and redesign the whole NTP layout.

The new design removed all servers running on unsupported hardware and OS’s. It also made use of our Infoblox DDI grid which is a purpose made tool for DNS, DHCP, IPAM, NTP, and File Distribution.

We decided on this:

  • 3 internet Stratum 1 servers
  • 3 Stratum 2 servers. Infoblox DDI Grid Master at HQ, the Grid Master Candidate at our DR location, and a Linux server to diversify technologies.
  • Place the Stratum 2 servers in a mesh.
  • The Infoblox DDI Grid Master and Grid Master Candidate fed the 2 HA pair (4 servers with 2 VIP’s) Grid Members.
  • Create Access Lists on the Stratum 2 servers so only the Stratum 3 servers can sync with them.
  • All clients would then sync with the 2 HA pairs of Infoblox DDI Grid Members.
  • Set up NTP Authentication (https://www.nist.gov/pml/time-and-frequency-division/time-services/nist-authenticated-ntp-service).
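One way to sanity-check a layout like this is to ask each server which stratum it reports. Below is a minimal SNTP-style sketch using only the Python standard library. It sends a plain unauthenticated client request, so it assumes the server answers such queries, and the function names are mine.

```python
import socket
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request():
    """48-byte client request: LI=0, version=4, mode=3 (client)."""
    return bytes([0x23]) + bytes(47)


def parse_response(packet):
    """Pull mode, stratum, and transmit time (Unix seconds) out of a reply."""
    if len(packet) < 48:
        raise ValueError("NTP packets are at least 48 bytes")
    # Bytes 40-47 hold the transmit timestamp: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    unix_time = seconds - NTP_EPOCH_OFFSET + fraction / 2**32
    return {"mode": packet[0] & 0x07, "stratum": packet[1], "unix_time": unix_time}


def query(server, timeout=2.0):
    """Send one query over UDP port 123 and parse the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_response(packet)
```

If the design holds, query() against a Stratum 2 server should come back with stratum 2 and against a Grid Member VIP with stratum 3. A server that has lost sync typically reports stratum 0 or 16 instead.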

Both VIP’s of the HA paired Grid Members had a user friendly DNS record and some systems accept a hostname/DNS for the NTP config. That would be handy if we ever needed to change the IP’s again. For extra flexibility we could use F5 BIG-IP GTM (DNS load balancing). But network devices like routers and switches don’t support using DNS for NTP meaning there would be two sets of configs. One with NTP configs using DNS and another hard set to static IP’s. We wanted a global config so we went with static IP’s everywhere.

The IP’s were given out and people were told to migrate. We set up a SPAN and periodically checked to see who was still pointed at the old servers before finally retiring the old IP’s/servers.

It wasn’t perfect but it was a big improvement. Long term I’d want to install our own Stratum 0 GPS antennas instead of using internet hosted servers. For a home project I’m thinking of using a Raspberry Pi to make one using this link as a guide (https://www.satsignal.eu/ntp/Raspberry-Pi-NTP.html).

Blog List

Router Gods

Daniel Dib CCIE #37149 CCDE #20160011 – Lost In Transit

Dmitry Figol CCIE RS #53592 –  Blog

Dustin Beare – Network Introvert

Joel Sprague CCIE #52000 – Blog

Joshua Burget – Barefoot Labbing

Katherine McNamara CCIE #50931 – Network Node

Nick Russo CCIE #42518 CCDE #20160041 – njrusmc

Mary Fasang – Networking Green Girl

Russ White – Rule 11

Steinn Örvar CCIE #60715 – Network MEME shirts

Steve McNutt – Dense Mode

Tim McConnaughy CCIE R/S #58615 – Carpe DMVPN

Individual Blogs

Anthony Sequeira – AJ’s Networking

Kevin Wallace – KW Train

Group Blogs

Hack A Day

Packet6

Hak5

Life Hacker

Inactive (No posts in 6+ months)

Router Gods

Chris Pratt – CMP Networking

Kim Pedersen and Daniel Dib CCIE #37149 CCDE #20160011 – Network Career

Individual

Jeremy Stretch CCIE – Packet Life

Group

None

Guest wifi and branch backup VPN redo

This post is about a situation I ran into a while ago and records my configs and testing for converting from a PBR setup to VRF on a Cisco 881 router with a diagram at the end.

Through a combination of configs involving PBR (Policy Based Routing) AKA Source Routing (as opposed to standard Destination Routing), Proxy Server exceptions, and Default Route/missing Default Route it was impossible to get to internet facing apps/sites over guest wifi or branch backup VPN.

I knew I could use VRF’s (Virtual Routing and Forwarding) to separate the traffic and solve the issue, but had to prove it to my team as they weren’t familiar with VRF’s. A Cisco router without VRF’s configured has only the “global routing table”. VRF’s create separate instances of routing tables, one for each VRF, while leaving the global in place.

IOS-XE comes with a mgmt-intf VRF by default for a separate management network. Carriers use VRF’s, or contexts in some non-Cisco hardware, to keep customers’ traffic separate and allow for overlapping network schemes. If needed you can “leak routes” between VRF’s and/or the global routing table. This would be done if the carrier has something like a network monitoring server that needs to access customer devices.
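If VRF’s are new to you, a toy model may help. The sketch below keeps one routing table per VRF plus the global one and does a longest-prefix match inside a single table only. It is a teaching aid, not how IOS stores routes, and the class, VRF name, and next-hop labels are placeholders.

```python
import ipaddress


class Router:
    """Toy router with a global table and any number of VRF tables."""

    def __init__(self):
        # "global" stands in for the global routing table.
        self.tables = {"global": []}

    def add_route(self, prefix, next_hop, vrf="global"):
        """Install a route into one table, creating the VRF if needed."""
        network = ipaddress.ip_network(prefix)
        self.tables.setdefault(vrf, []).append((network, next_hop))

    def lookup(self, destination, vrf="global"):
        """Longest-prefix match, looking only at the chosen table."""
        destination = ipaddress.ip_address(destination)
        matches = [
            (network, hop)
            for network, hop in self.tables.get(vrf, [])
            if destination in network
        ]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0].prefixlen)[1]
```

Put a default route in a VRF and a 10.0.0.0/8 route in global, and a lookup for 8.8.8.8 only succeeds in the VRF. That isolation is the point: traffic arriving on an interface in one table never sees routes in the other unless you leak them.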

I used a post by Jeremy Stretch at packetlife.net as a guide and to show that someone smarter than me confirmed the design. http://packetlife.net/blog/2012/sep/4/simultaneous-tunneled-and-native-internet-access/

Jeremy’s post goes into the details of building everything from the base up.

 

Testing:

Test from a computer on guest wifi:

C:\Users\>tracert google.com

Tracing route to google.com [172.217.5.78]

over a maximum of 30 hops:

  1    81 ms   139 ms   251 ms  172.17.1.1
  2   152 ms    35 ms    29 ms  [10.1.180.1]
  3   520 ms   173 ms   652 ms  [10.254.254.161]
  4     4 ms     6 ms     7 ms  [****]
  5     8 ms     5 ms    15 ms  [****]
  6    12 ms    10 ms     6 ms  144.228.109.65
  7    34 ms   166 ms   219 ms  sl-mpe50-sea-.sprintlink.net [144.232.3.126]
  8   662 ms   638 ms   849 ms  72.14.242.31
  9   572 ms    45 ms     9 ms  108.170.245.115
 10   578 ms   995 ms   370 ms  66.249.94.201
 11   955 ms   618 ms   507 ms  209.85.240.228
 12   268 ms   457 ms   447 ms  216.239.54.158
 13   358 ms   342 ms   638 ms  216.239.51.124
 14    61 ms    71 ms    51 ms  108.170.247.193
 15    56 ms    88 ms    68 ms  108.170.237.113
 16    55 ms    32 ms    44 ms  172.217.5.78

Trace complete.

Test from a computer on the regular corp wifi or LAN:

C:\Users\>tracert google.com

Tracing route to google.com [172.217.5.78]

over a maximum of 30 hops:
  1    12 ms     5 ms    11 ms  192.168.2.1
  2     5 ms     5 ms     3 ms  10.3.254.1 <--- Headend Tunnel interface
  3    10 ms     4 ms     5 ms  [10.254.254.161]
  4    17 ms     2 ms     9 ms  [***]
  5     6 ms     4 ms     4 ms  [****]
  6    10 ms     3 ms     7 ms  144.228.109.65
  7     6 ms     2 ms     2 ms  sl-mpe50-sea-.sprintlink.net [144.232.3.126]
  8     5 ms     4 ms     5 ms  72.14.242.31
  9     7 ms     4 ms     3 ms  108.170.245.115
 10     9 ms    12 ms     9 ms  66.249.94.201
 11    10 ms    29 ms    26 ms  209.85.240.228
 12    34 ms    54 ms    59 ms  216.239.54.158
 13    32 ms    32 ms    32 ms  216.239.51.124
 14    30 ms    31 ms    30 ms  108.170.247.193
 15    33 ms    30 ms    29 ms  108.170.237.113
 16    38 ms    31 ms    40 ms  172.217.5.78

Trace complete.

 

Config differences:

ADD config:

First you have to create a named VRF. Some people use all caps for VRF names to make them stand out. I don’t because it’s a pain when you want to ping, trace, or use any VRF specific commands.

!The VRF name doesn’t matter, but fdoor relates to the Front Door VRF sometimes used with DMVPN. It separates the base routing and the “overlay routing” used by the VPN. You could also put all interfaces in VRF’s and not use the global routing table but I didn’t.

!Create the VRF.
!
ip vrf fdoor
!
!Required to support DHCP when using VRF’s.
no ip dhcp use vrf connected
!
!Place the internet interface in the VRF
interface <internet-interface>
ip vrf forwarding fdoor
!
!Place the guest interface in the VRF
interface vlan15
ip vrf forwarding fdoor
!
!Tell the tunnel to use the VRF for its source interface.
Interface <tunnel-number>
tunnel vrf fdoor

CHANGE config:

!The NAT needs to be told about the VRF.
!Change this:
ip nat inside source list 130 interface <internet-interface> overload
!
!To this:
ip nat inside source list 130 interface <internet-interface> vrf fdoor match-in-vrf fdoor overload
!The default route needs to be moved to the VRF.
!Change from this:
ip route 0.0.0.0 0.0.0.0 <internet-interface> <internet-gw>
!
!To this:
ip route vrf fdoor 0.0.0.0 0.0.0.0 <internet-interface> <internet-gw>

REMOVE config:

!
interface Vlan10
no ip policy route-map wifi-mgmt-route-map 
!
!
interface Vlan30
no ip policy route-map wifi-guest-route-map
!
!
interface Vlan36
no ip policy route-map LAN-route-map
!
!
no route-map wifi-mgmt-route-map permit 10
 !match ip address CAPWAP-traffic
 !set ip precedence flash-override
 !set ip next-hop <branch-LAN-gw>
!
no route-map LAN-route-map permit 10
 !match ip address all-except-localwifi
 !set interface Tunnel1999
!
no route-map wifi-guest-route-map permit 10
 !match ip address 130
!
!These routes are not needed as we have dynamic routes for the tunnel and a static default route for internet access in the fdoor VRF.
no ip route 0.0.0.0 0.0.0.0 vlan33 <branch-LAN-gw> 254
no ip route <DNS1> 255.255.255.255 <internet-interface> <internet-gw>
no ip route <DNS2> 255.255.255.255 <internet-interface> <internet-gw>

As you can see you end up with less config and better routing. I was able to convert a few hundred of these setups over a few weeks with old school copy pasta. But if I had to do this again I’d spend time on a Python script that would convert, test, and document the results.
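As a starting point for that script, here is a rough sketch that handles only the two CHANGE items: the NAT overload line and the default route out of the internet interface. It assumes the lines look exactly like the ones shown above and leaves everything else alone. The function names are mine, and a real version would also have to apply the ADD and REMOVE sections and verify the result.

```python
import re


def convert_line(line, internet_interface, vrf="fdoor"):
    """Rewrite one config line into its VRF-aware form, or return it unchanged."""
    stripped = line.strip()
    iface = re.escape(internet_interface)
    nat = re.fullmatch(
        rf"(ip nat inside source list \S+ interface {iface}) overload", stripped
    )
    if nat:
        # Same shape as the "To this" NAT line in the CHANGE config above.
        return f"{nat.group(1)} vrf {vrf} match-in-vrf {vrf} overload"
    # Only the default route via the internet interface moves into the VRF.
    route = re.fullmatch(rf"ip route (0\.0\.0\.0 0\.0\.0\.0 {iface} \S+)", stripped)
    if route:
        return f"ip route vrf {vrf} {route.group(1)}"
    return line


def convert_config(text, internet_interface, vrf="fdoor"):
    """Apply convert_line to every line of a saved running config."""
    return "\n".join(
        convert_line(line, internet_interface, vrf) for line in text.splitlines()
    )
```

Running convert_config() over a saved config and diffing the result against the original would also cover the “document the results” part for free.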