Persistent http/s connection issues

cafecubano
Investigator

Hello,

I have been having this strange issue for over a month now, and I have talked thoroughly with EE CS, as well as had EE and Openreach engineers out. Funnily enough, no one I have spoken to seems to know what http is and why it is different to icmp, bare tcp, etc.

I thought I would write out what is happening and the investigation steps here in case anyone can help or if anyone else is having the same issue.

The issue manifests itself as http/https requests timing out, meaning that normal use of the internet is impossible. Occasionally, some requests will go through normally, and the % degradation varies, but it is often 90% or more of requests failing over a given time period. This occurs on any device, connected via wifi or ethernet, and using any OS I've tried: Windows, MacOS, Linux, Android. These devices all have varying DNS setups, so I don't think that is the issue; DNS resolution also works fine.

The interesting thing is that other application-level protocols work, such as: ssh, dns, wireguard.
ICMP ping works fine, and I suspect this is why EE cannot see any issue from their side. 
This means I can tunnel all traffic from one of my devices to a wireguard exit node and then the connection works as normal.
I can ping devices over the wan (icmp), and I can ssh to them also. Seeing as these protocols use both UDP and TCP, it doesn't seem to be an issue with one of those.
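
To put a number on the failure rate described above, a minimal sketch along these lines could be run from any affected device. It assumes curl is installed; the function names `fail_pct` and `probe_http` are made up for illustration, and the 10 second cap per request is an arbitrary choice.

```sh
# Hypothetical helper: integer percentage of failed requests.
# $1 = failed requests, $2 = total requests
fail_pct() {
  if [ "$2" -eq 0 ]; then echo 0; return; fi
  echo $(( $1 * 100 / $2 ))
}

# Send COUNT HEAD requests to URL and print how many failed.
# $1 = URL, $2 = number of requests (default 20)
probe_http() (
  url=$1
  count=${2:-20}
  failed=0
  i=0
  while [ "$i" -lt "$count" ]; do
    # -s: silent, -o /dev/null: discard output, -I: HEAD request only,
    # --max-time: give up on the whole request after 10 seconds
    curl -s -o /dev/null -I --max-time 10 "$url" || failed=$((failed + 1))
    i=$((i + 1))
  done
  echo "$url: $failed/$count failed ($(fail_pct "$failed" "$count")%)"
)
```

Calling `probe_http https://google.com 20` directly and then again through the wireguard tunnel would give two figures to compare, which may be easier to show to support than "most requests time out".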

I have been through all of the basic troubleshooting steps such as restarting, factory reset, new router, etc.
Our hub is a Smart hub 6.
Interestingly, I think the start of this problem coincided with the firmware update to our router. (the update that forced https connection when using the hub's homepage)

The only non-standard thing on our network is that we have been using DMZ and now port forwarding, as we have a server we need to connect to over the WAN. Initially, we were using DMZ, but after the issue started, we tried port forwarding instead. This appeared to resolve the issue at the time, but then it came back a week later. It seems that any factory reset temporarily fixes the issue for a few days, then it comes back. A restart normally results in normal service for about 20 minutes, then relapses.

Things I'm going to try next:

  • use a different modem and router
  • inspect an http request from both ends (send and receive), perhaps with tcpdump
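
The tcpdump step could look roughly like the sketch below, assuming a machine you control at each end and root access on both. The function name `capture_web` is hypothetical, and the interface name, host and output file are whatever applies on your machines.

```sh
# Capture HTTP/HTTPS traffic to or from one host into a pcap file.
# $1 = interface (e.g. eth0), $2 = remote host, $3 = output file
capture_web() {
  # -n: skip DNS lookups for captured addresses
  # -w: write raw packets to a file for later inspection (e.g. in Wireshark)
  tcpdump -i "$1" -n -w "$3" "host $2 and (tcp port 80 or tcp port 443)"
}
```

Run it on both ends at the same time (for example `capture_web eth0 203.0.113.10 client.pcap` on the client, with the client's address as the host on the server), then make a failing request. If the client's SYN shows up in the client capture but never in the server capture, the packet is being lost somewhere in between.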

I will keep updating the post as I get more information.

16 REPLIES
Jacob_
Investigator

I did respond to this, but it either didn't post or it was deleted, so I'll try again. You're correct about the ChatGPT: I use it to fact-check, and I have a little dyslexia so my words sometimes get jumbled, so I'll run things through it before responding.

Anyway, UPnP will constantly open and close ports, and that can make the NAT/conntrack engine unstable, especially if a firmware issue is to blame. I assume the EE hub runs a Linux kernel, as most routers do. You said you had DMZ/port forwarding set up; UPnP can conflict with that if both try to forward or remap overlapping ports. I'm not saying this is the case, but I've seen stranger things happen, especially combined with hairpin NAT. Some routers tie UPnP mappings into the same state-tracking system as outbound connections, so if the NAT/conntrack table fills up, new connections are the first to fail. That is usually why a reboot will "fix" the issue, since it clears the table.

It could be down to buggy firmware. The hub you're given for VDSL handles everything: PPPoE sessions, NAT, firewall, QoS. If UPnP is enabled and you have a lot of computers/apps/devices trying to use it, they could be filling up the NAT/conntrack table and causing crashes. DoS protection could also kick in if it's flagging SYN flood false positives, which would explain why your new outbound sessions are being dropped. But again, I'm not sure, and I don't want to point fingers, as my post may have been deleted or I may have just pressed the wrong button.

Re IPv6: AFAIK VDSL connections can be dual-stack IPv4 and IPv6 depending on the ISP's configuration; I'm unsure how EE handles this. Most clients and browsers will try IPv6 and IPv4 in parallel, so if IPv6 is flaky (routes, MTU, etc.) you can get intermittent timeouts. You can always test by doing the following:

curl -vvv -4 -I https://google.com --connect-timeout 20

curl -vvv -6 -I https://google.com --connect-timeout 20

Like I said though, I very much doubt IPv6 is to blame, but weirder things have happened before.
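
On the NAT/conntrack theory: as far as I know the EE hub gives you no shell access, so this can't be checked on the hub itself. On a Linux-based router you control, a rough sketch like the one below shows how full the table is. The function names are made up for illustration, and the /proc paths are only present when the nf_conntrack module is loaded.

```sh
# Hypothetical helper: integer percentage of the conntrack table in use.
# $1 = current entries, $2 = table maximum
conntrack_pct() {
  if [ "$2" -eq 0 ]; then echo 0; return; fi
  echo $(( $1 * 100 / $2 ))
}

# Read the live counters from the kernel and print a one-line summary.
conntrack_now() (
  count=$(cat /proc/sys/net/netfilter/nf_conntrack_count)
  max=$(cat /proc/sys/net/netfilter/nf_conntrack_max)
  echo "conntrack: $count/$max ($(conntrack_pct "$count" "$max")%)"
)
```

If `conntrack_now` reports a figure creeping towards 100% in the minutes before requests start failing, that would support the table-exhaustion idea; a table that stays mostly empty would point somewhere else.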

Just another note regarding the DMZ: I would recommend against it for security purposes, as it exposes every port that is open on that host to the internet via your external IP address, so you have to rely on the firewall running on the host itself (iptables/Windows firewall). The requests that are getting dropped, are they to any http/https host, or only to ones inside your local network that you reach via the WAN IP? i.e. test.yourdomain.com > your.external.ip.addr > 192.168.10.23 etc.?

Loulou82
Visitor

Hi 

I'm having similar issues. Did you get this resolved?

Are you using the hardware supplied by EE? If so, can you tell me what hub/router you have? Are you also on VDSL (FTTC), or on FTTP? i.e. is it full fibre to an ONT, or does it connect from a master socket inside the house?

cafecubano
Investigator

I have been running a separate modem (HG612) and my own router/WAP for a few weeks now, and this seems to have resolved the problem. So it seems there may be a bug in the latest EE hub 6 firmware related to NAT tables, or something that interacts with DMZ/port forwarding. Thanks to Jacob for the detailed answers and insight. From what I could find there doesn't seem to be any way to report a bug to EE, so we will just have to wait and see if they fix it. If anyone else is having the same problem, I recommend dumping the EE hub and using something else.

Hi Jacob, I did see your previous answer, thanks for all of the detail. I reckon you are on the right track with some kind of NAT table problem. It seems to be some bug or regression in the EE hub firmware, as the behaviour is new. I have "solved" it by getting rid of the EE hub. We'll have to see if they take notice and investigate.

Hi Loulou, check the solution I marked in the thread. If you are having the same issue, the solution currently seems to be to stop using the EE hub and use some other modem/router.

I’m glad that’s solved your issue. When I was thinking about it, it could have been any number of things, and that seemed the most logical. However, we’ll never know until the manufacturer of the router or EE acknowledges the “bug”, if one does exist. I never use ISP equipment, I’ve always used Ubiquiti. I’m glad it’s been stable for you though!