On 5/24/19 12:21 PM, Andrija Panic wrote:
> In other words - you are hitting an internal interface of a VR?

The VR has two NICs. I presume that the Guest NIC, as opposed to the
Control NIC, is the "internal" NIC?

Type            Shared
Traffic Type    Guest
Network Name    Shared
Netmask         255.255.0.0
IP Address      10.102.199.148
ID              7f59d904-cdc0-43eb-b679-0721077f5bb1
Network ID      924eda5f-9a1f-4a8e-9423-18000dc92093
Isolation URI   vlan://102
Broadcast URI   vlan://102

Type
Traffic Type    Control
Network Name
Netmask         255.255.0.0
IP Address      169.254.2.203
ID              9c3676bc-23e6-48e3-baca-b8cce6511092
Network ID      6eff5bd9-4f4d-48fe-b6ed-f50fc115947b
Isolation URI
Broadcast URI

> I would replace (for a test) bind9 with just the default setup of
> DNSmasq, while specifying its upper/ROOT DNS servers to be the VR IP -
> i.e. client --> DNSmasq (internal server) --> DNSmasq (VR).
> See if that works - so you can possibly draw some conclusions.

That gives me room for some more experiments. I am fairly sure that I am
running into recent changes to bind9 / dnsmasq intended to prevent DNS
amplification and spoofing attacks, but which one changed, and how to
work around it, is still a question I'm trying to answer.

> Andrija
>
> On Fri, 24 May 2019 at 21:12, Eric Lee Green wrote:
>
> > On 5/24/19 10:16 AM, Andrija Panic wrote:
> > > Eric,
> > >
> > > are your BIND9 servers on a "Public" network (trying to talk to the
> > > Public IP of the VR when forwarding DNS requests), or on a VM inside
> > > an Isolated network behind the VR?
> >
> > It's on *a* public network, but not *the* public network. I don't have
> > any Isolated networks, though I have them enabled from VLAN 1000-2000.
> > I am using "Advanced Networking" but for my own purposes -- I have one
> > "Shared" guest network at VLAN 102, and then several isolated
> > specialty physical "Shared" networks like "Security Cameras" (VLAN 103)
> > and "Storage Network" (VLAN 200) that are attached to virtual machines
> > that need access to those things.
> > The "Shared" guest network (VLAN 102) is routed by my layer 3 switch
> > with the rest of my network's public VLANs, so if I am on e.g.
> > 10.31.1.2 (VLAN 31), which is similarly a routed public VLAN (but not
> > one that Cloudstack is allowed to directly talk to or manage, it has
> > to go thru the layer 3 switch) or 10.120.0.5 (VLAN 120), I can talk
> > directly to 10.102.199.148 since all are routed into the common
> > fabric via the layer 3 switch. I only care about the VM's that are
> > VLAN 102, which are supposed to be publicly available to my users,
> > thus why my quicky script hack to generate a zone file out of the
> > database does
> >
> >     select v.name, n.ip4_address
> >     from vm_instance as v, nics as n
> >     where v.removed is null
> >       and n.instance_id = v.id
> >       and n.ip4_address like '10.102.%'
> >       and type = 'User'
> >     order by n.ip4_address;
> >
> > in order to select out the name and IP address of virtual machines
> > with NIC's on that VLAN. (Which, if it's a different list from the
> > last list that was queried, then gets massaged into a zone file for
> > name.cloud.mydomain.com by the script, which then scp's to my master
> > domain server and does a reload to reload the zone file from the new
> > version).
> >
> > Both of my BIND9 servers can talk directly to 10.102.199.148 (the IP
> > of the virtual router for the 10.102.xxx.xxx network, VLAN 102) if I
> > use 'host' to directly query 10.102.199.148 for an API address like,
> > say, 'api-default1.cloud.mydomain.com' but when I try to put a forward
> > domain there, nope. This was working, but now is not. I suspect it's
> > got to do with the recent changes in DNS software, both bind9 and
> > dnsmasq, to deal with multiple attacks on the domain name system, but
> > I'm having trouble figuring out why, or what my solution should be.
> >
> > Note that it's quite reasonable / feasible / viable to put a DNS
> > server actually inside the Cloudstack constellation if that's
> > necessary and then do a two-stage hop if necessary.
> > I'm just trying to figure out the "right" way to do this right now so
> > I can retire my hack script.
> >
> > > On Fri, 24 May 2019 at 02:15, Eric Lee Green wrote:
> > >
> > > > I had this working under 4.9. All I did was, on my main BIND9
> > > > servers, point a forward zone at 'cloud..com' to the virtual
> > > > router associated with all VM's that were publicly available. I
> > > > could then resolve all foo.cloud..com names on my global network.
> > > >
> > > > Somehow, though, this quit working after I updated to 4.11. I'm
> > > > not quite sure why.
> > > >
> > > > The 'Guest Network' is defined with domain 'cloud.mydomain.com'.
> > > >
> > > > Okay, so my router for the 'Guest Network' advanced networking is
> > > > located at 10.102.199.148. In my master BIND9 DNS server at
> > > > 10.31.1.2 I have this:
> > > >
> > > > zone "cloud.mydomain.com" IN {
> > > >      type forward;
> > > >      forward only;
> > > >      forwarders {
> > > >           10.102.199.148;
> > > >      };
> > > > };
> > > >
> > > > If I send a NAMED request directly to the virtual router while
> > > > logged into my main name server, it works:
> > > >
> > > > [root@ypbind ~]# host eric-gui.cloud.mydomain.com 10.102.199.148
> > > > Using domain server:
> > > > Name: 10.102.199.148
> > > > Address: 10.102.199.148#53
> > > > Aliases:
> > > >
> > > > eric-gui.cloud.mydomain.com has address 10.102.199.234
> > > >
> > > > If I try to use the name server however, it doesn't work:
> > > >
> > > > [root@ypbind logs]# host eric-gui.cloud.mydomain.com
> > > > Host eric-gui.cloud.viakoo.com not found: 3(NXDOMAIN)
> > > >
> > > > I'm baffled, because this *was* working.
> > > >
> > > > So I disabled any dnssec in the {options} on bind9 and gave all
> > > > permissions to see if that was the problem (note that this is
> > > > internal to my infrastructure, so DNS amplification isn't an
> > > > issue):
> > > >
> > > >           dnssec-enable no;
> > > >           dnssec-validation no;
> > > >           dnssec-lookaside auto;
> > > >           recursion yes;
> > > >           allow-recursion { any; };
> > > >           allow-query { any; };
> > > >           allow-query-cache { any; };
> > > >
> > > > Still nope. Still baffled.
> > > >
> > > > Anybody got any clues as to what I may be doing wrong? I'm
> > > > thinking it has to be on the BIND9 side, because I can resolve the
> > > > host name if I talk to the virtual router directly, but for some
> > > > reason it's not allowing me to get any records from the router.
> > > >
> > > > Right now I've temporarily worked around this with a script that
> > > > directly queries the MySQL database every few minutes and
> > > > generates a revised zone file on my master DNS server when the
> > > > list of virtual machines queried out of the database changes, but
> > > > that's clearly not the right way to do it. The question is, what
> > > > *is* the right way to do it?
> > > >
> > > > -Eric
> > >
> > > --
> > > Andrija Panić
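
For anyone following along, the stopgap script described in the thread (take the name/IP rows from the database, regenerate the zone file only when the list changes) might look roughly like the sketch below. This is not the actual script, which was not posted. The function names, the name server and hostmaster names, and the SOA timer values are all placeholders made up for illustration, and the database query, scp, and reload steps are left out.

```python
from typing import Iterable, List, Tuple

# (VM name, IPv4 address), i.e. the two columns the SQL query selects.
Row = Tuple[str, str]


def render_zone(rows: Iterable[Row], origin: str, serial: int,
                primary_ns: str = "ns1.example.com.",      # placeholder
                admin: str = "hostmaster.example.com.",    # placeholder
                ttl: int = 300) -> str:
    """Render a minimal zone file with one A record per VM."""
    lines: List[str] = [
        f"$ORIGIN {origin}.",
        f"$TTL {ttl}",
        # SOA timers (refresh, retry, expire) are arbitrary example values.
        f"@ IN SOA {primary_ns} {admin} ( {serial} 3600 600 86400 {ttl} )",
        f"@ IN NS {primary_ns}",
    ]
    # Sort by the address string, mirroring ORDER BY n.ip4_address.
    for name, ip in sorted(rows, key=lambda r: r[1]):
        lines.append(f"{name.lower()} IN A {ip}")
    return "\n".join(lines) + "\n"


def zone_changed(previous: Iterable[Row], current: Iterable[Row]) -> bool:
    """True when the VM list differs from the last run, ignoring row order."""
    return sorted(previous) != sorted(current)
```

The idea is to call zone_changed() with the rows saved from the previous run and only rewrite and reload the zone when it returns True, bumping the serial each time so that any secondaries pick up the new version.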