
IPv6 Fragmentation "Black Hole" on EE/BT 900Mb FTTP

planetf1b
Established Contributor

I'd love to hear if anyone can validate this -- I've been fighting with an iOS power-drain issue, but in fact it's actually an IPv6 issue which occurs both with the EE hub AND with an OPNsense router. Basically it seems as if the EE network is breaking Path MTU Discovery, both by dropping fragments and by not returning the correct ICMP responses.

This is quite technical, but it's a fundamental issue (it feels to me as if it's a 'broken network'). Of course I may have made an error:

More complete AI-summarised version below.

Observation: The upstream network (EE/Openreach backhaul) appears to silently discard outbound IPv6 packets that require fragmentation. This breaks Path MTU Discovery (PMTUD), causing connectivity stalls and high battery drain on devices attempting to negotiate packet sizes (e.g., Apple devices using iCloud Private Relay).

Context: ISP/Plan: EE Full Fibre 900/110. Hardware: Tested on EE Smart Hub Plus and confirmed consistent on OPNsense (Intel N100). MTU: Link negotiates at 1492 bytes (PPPoE standard).

The Limitation: The network accepts unfragmented packets up to the link limit (1492 bytes) but drops any packet that is fragmented, regardless of total size. Crucially, it fails to send back ICMPv6 Type 2 (Packet Too Big) or Type 3 (Time Exceeded), creating a silent "black hole."
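For reference, the 'Packet Too Big' message the network should be returning has a fixed 8-byte header (type 2, code 0, checksum, then the link MTU as a 32-bit field) per RFC 4443. Here's a minimal sketch, in plain Python, of decoding one from raw bytes -- the sample message is hand-constructed, with the checksum zeroed and the invoking packet omitted:

```python
import struct

def parse_packet_too_big(icmp6):
    """Return the MTU field of an ICMPv6 'Packet Too Big' (RFC 4443
    Type 2) message, or None if the bytes aren't one."""
    if len(icmp6) < 8:
        return None
    msg_type, code, _checksum, mtu = struct.unpack("!BBHI", icmp6[:8])
    if msg_type != 2 or code != 0:
        return None
    return mtu

# What a compliant router on a 1492-byte PPPoE link should send back:
sample = struct.pack("!BBHI", 2, 0, 0, 1492)
print(parse_packet_too_big(sample))  # 1492
```

In the tests below, the whole point is that nothing matching this shape ever arrives.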


Test Methodology (Reproducible)

You can replicate this on any Linux/macOS device (Pi, Mac, etc.) connected to the router. You need two terminal windows open.

Step 1: The Monitor (Terminal A) Run this to listen for "Packet Too Big" or error messages from the ISP.

sudo tcpdump -n -i any "icmp6 && (icmp6[0] == 1 || icmp6[0] == 2 || icmp6[0] == 3)"

Step 2: The Trigger (Terminal B) Run these pings to test the boundary.

Test A (Standard Packet - 1492 bytes): Payload 1444 + 48 bytes headers = 1492 bytes (Fits in one frame)

ping6 -c 3 -s 1444 ipv6.google.com

Result: Success.

Test B (Fragmented Packet - 1493 bytes): Payload 1445 + 48 bytes headers = 1493 bytes (Forces fragmentation)

ping6 -c 3 -s 1445 ipv6.google.com

Result: 100% Packet Loss.

Step 3: Verification. Look at Terminal A.
Expected Behavior (RFC compliant): you should see an ICMPv6 error message.
Actual Behavior (black hole): the terminal remains completely blank. The packets are dropped silently.
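The header arithmetic behind Tests A and B, as a tiny self-checking sketch (plain Python, assuming nothing beyond the standard 40-byte IPv6 header and 8-byte ICMPv6 echo header):

```python
# For an ICMPv6 echo request, on-wire size = ping payload + 8 + 40.
IPV6_HEADER = 40
ICMPV6_HEADER = 8

def wire_size(payload):
    """Total IPv6 packet size for a given ping6 -s payload."""
    return payload + ICMPV6_HEADER + IPV6_HEADER

def max_ping_payload(link_mtu):
    """Largest ping6 -s value that still fits in a single frame."""
    return link_mtu - ICMPV6_HEADER - IPV6_HEADER

print(wire_size(1444))         # 1492 -> fits the 1492-byte PPPoE link
print(wire_size(1445))         # 1493 -> must be fragmented
print(max_ping_payload(1500))  # 1452 -> the boundary on a full 1500 MTU
```

The same arithmetic explains why the failure boundary moves to a payload of 1452 on a link with a full 1500-byte MTU.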

JimM11
Community Hero

@Ewan15 You will also find a topic discussing this with the OP on the ISPreview forum!

planetf1b
Established Contributor

@Ewan15 Thanks *so* much for trying that out!

Routers can't fragment IPv6 packets - however, clients can, so they send the base packet followed by packets carrying a Fragment extension header (next-header 44). These are discarded on some networks, but I don't think this is EE -- it varies depending on the routing. It's also, annoyingly, quite common, as there's a concern about DoS and other attacks using fragmented packets. So if, say, a tunneled route (VPN) uses an MTU such that its tunneled packets won't fit in the actual connection, then the fragments will be created there. But that's all local: a host cannot send anything larger than the MTU (hence fragments).

Anyway, lots of places do that, so my main focus was on the ICMPv6 responses.

Google is failing at 1452 because we have payload (1452) + ICMPv6 header (8) + IPv6 header (40) = 1500.

Whilst this is the physical interface MTU, the actual advertised MTU for the IPv6 route is 1492. You can see this in a Router Advertisement (RA) from the EE hub:

On *nix this can be seen with: sudo tcpdump -ni <interface> -vv ip6 and icmp6 and 'icmp6[0] == 134'

Within that there'll be a line
 mtu option (5), length 8 (1): 1492

So in this case it is your local device that knows the packet is too large, and therefore sending a *LOCAL* response. It's not validating the network.
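If it helps, the MTU option in that RA is a simple TLV (type 5, length 1 in 8-byte units, 2 reserved bytes, then the 32-bit MTU, per RFC 4861). A minimal decoding sketch in plain Python, using a hand-built sample rather than a live capture:

```python
import struct

def ra_mtu(options):
    """Walk the option TLVs of a Router Advertisement and return the
    MTU option (type 5) value, or None if absent. `options` is
    everything after the 16-byte fixed RA header."""
    i = 0
    while i + 2 <= len(options):
        opt_type, opt_len = options[i], options[i + 1]
        if opt_len == 0:            # malformed; avoid an infinite loop
            return None
        if opt_type == 5 and opt_len == 1:
            # layout: type(1) length(1) reserved(2) mtu(4)
            return struct.unpack("!I", options[i + 4:i + 8])[0]
        i += opt_len * 8            # length is in units of 8 bytes
    return None

# MTU option as the EE hub advertises it ("mtu option (5) ... 1492"):
sample = struct.pack("!BBHI", 5, 1, 0, 1492)
print(ra_mtu(sample))  # 1492
```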
 
Google will actually go up to 1452 if you use 'baby jumbo' frames (I've tested) -- and this is fine with EE.

As was pointed out earlier, one 'flaw' in my original post is that I picked specifically on Reuters as causing a problem, but that doesn't prove or disprove it's EE. I've also had issues with *some* Amazon endpoints (from both Mac and Linux), for example with a packet size of less than 1300.

What's odd is I've *never* seen an ICMPv6 Type 2 packet arrive.

A specific example. This is one resolved Amazon address (I've seen it with other providers too):
 
sudo ping6 -c 5 -s 1444 -D 2a00:23c7:60de:c700:8872:afc7:95e6:2265
->
ping statistics --- 5 packets transmitted, 0 received, 100% packet loss, time 4085ms

even whilst: 
sudo tcpdump -ni any -vv ip6 and icmp6 and 'icmp6[0] == 2' &

is running

So it's possible amazon (or any intermediate transit) is blocking them. Or reuters.. Or ....

I can believe there is some varied configuration in the links - which may be why they have varying MTUs - but I'd be surprised if they'd all also block the responses?

Really need to test on another ISP...



 

I have configured my Ethernet MTU to 1508 to allow an MTU of 1500 on my EE FTTP connection. When I run one of the examples above I do not experience packet loss:

pi@raspberrypi:~ $ ping6 -c 3 -s 1445 ipv6.google.com
PING ipv6.google.com(uv-in-f101.1e100.net (2a00:1450:4009:c15::65)) 1445 data bytes
1453 bytes from uv-in-f101.1e100.net (2a00:1450:4009:c15::65): icmp_seq=1 ttl=113 time=6.05 ms
1453 bytes from uv-in-f101.1e100.net (2a00:1450:4009:c15::65): icmp_seq=2 ttl=113 time=5.93 ms
1453 bytes from uv-in-f101.1e100.net (2a00:1450:4009:c15::65): icmp_seq=3 ttl=113 time=6.73 ms

--- ipv6.google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 5.931/6.237/6.734/0.354 ms

Does this help at all? With your connection are you experiencing 100% packet loss when running the above command?

 

planetf1b
Established Contributor

I've done a more in-depth analysis

The scripts I used, and results are at : planetf1/bt-ee-ipv6-blackhole · Discussions · GitHub

In summary
* Some sites work fine with a 1500 mtu including apple, cloudflare
* others fail - aws, gcp
* this in itself is normal, not an issue. However, the network not responding with 'Packet Too Big' is a serious issue and a protocol breakage

Mustrum
EE Community Star

@planetf1b   Could this be another limitation within the network related to the UDP issue mentioned https://community.ee.co.uk/t5/Broadband-Landline/UDP-Packet-drop/td-p/1571725  ?

Getting to the right level of technical support is going to be your biggest challenge, I suspect.

@Mustrum - I doubt there's any correlation; the two problems are very different in nature.

That said, I'm still at a bit of a loss to understand what @planetf1b's 'real world' problem actually is?

eg. there's mention of problems using Apple Private Relay and docker pull commands but I have zero issues with either of these 🤔

planetf1b
Established Contributor

I have taken some time to put together a more technically detailed *and reproducible* analysis -- the previous discussion was a little ad-hoc (my fault).

This new writeup is technical and precise, and should be reproducible (or at least rerunnable) by others. I hope it's also suitable for quick review by a network engineer. I've separately shared an earlier version in a community forum and at least one person has reproduced similar issues with the script.

https://github.com/planetf1/bt-ee-ipv6-blackhole

In summary, the current EE IPv6 support seems to breach standards, which causes application failures.

planetf1b
Established Contributor

@bobpullen you asked about real-world impact - and I agree it may not be obviously apparent.

Applications will use a mix of different network protocols to communicate. 

Firstly there's IPv4 and IPv6. Which is chosen is affected by the way names resolve to addresses (depending on the DNS server, for example) as well as the applications' own logic. Some will prioritize IPv6, others may not.

Then there are different protocols like TCP (connection-oriented) and UDP (connectionless). And within either there may be further layers like VPN tunnels.

Most of these apps will have some 'fall-back' to work around broken infrastructure. Sometimes the protocols themselves have tweaks built in to detect these 'black holes'.

Typically the impact can be one of several things
- No impact
- a delay on first connection (quite typical)
- higher power drain (this is where I started... on a mobile device the 'radio' is kept active whilst waiting for a reply to a request that never arrives. With many apps and overnight this adds up)
- connection stalls - starts working (when the packet size is smaller) then pauses (for a bigger packet when it gets no response)
- connection failures - Anything using a tunnel may fail to establish the connection in the first place.
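One of those built-in tweaks is packetization-layer PMTUD (RFC 4821 / RFC 8899), which probes the path with differently sized packets instead of trusting ICMPv6 errors that may never arrive. A toy sketch of the idea, with `delivers` standing in for a real probe-and-wait-for-ack, and a simulated 1492-byte path as an assumption for illustration:

```python
def probe_path_mtu(delivers, lo=1280, hi=1500):
    """Binary-search the largest deliverable packet size, the way
    PLPMTUD recovers from a PMTUD black hole without relying on
    ICMPv6. `delivers(size)` stands in for sending a probe and
    seeing whether it is acknowledged. 1280 is the IPv6 minimum
    MTU, so it is the floor of the search."""
    if not delivers(lo):
        return None        # even the minimum fails: not an MTU problem
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if delivers(mid):
            lo = mid       # probe got through; raise the floor
        else:
            hi = mid - 1   # silently dropped; lower the ceiling
    return lo

# Simulated path that silently drops anything over 1492 bytes:
print(probe_path_mtu(lambda size: size <= 1492))  # 1492
```

This convergence takes several round trips per destination, which is exactly where the first-connection delays and radio-on battery cost come from.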

So what we have is a whole bunch of 'odd' behaviour where many may struggle to specifically associate the issue above with the problem (usually delays or unreliability).

The above is an attempt (as an engineer) to tease out & discover the root cause (at least as far as the end user on a connection is concerned) to get answers/fixes from the network provider.

Hope that helps a little -- but I do empathize with the view 'it's working for me'. The truth is there are multiple layers trying to work around the problem, and causing side effects.

@planetf1b - thanks, I broadly understand the underlying network principles.

I'll not press the matter but have to admit, I still remain confused as to what noticeable application use-case is falling foul of this outside of your analysis/what AI thinks. From your listed impacts, it sounds to me that the 'only' thing you've personally encountered is smartphone battery drain? It would be interesting to see some evidence that correlates the two things e.g. a packet capture showing a native smartphone application sending large IPv6 payloads, failing to get a response, re-transmitting, stalling, falling back to IPv4 etc.


@bobpullen  wrote:
Have you repeated the tests on a non-BT connection with differing results?

I also think this ^ might help validate/further your case i.e. an example of your tests doing what you're expecting them to do whilst connected to a non BT/EE Internet connection. 

Another thought is that you could try removing your router/hub from the equation by hooking a PC up to the ONT directly and having it use a PPPoE dial-up connection; if only to remove a point of obfuscation.

What's piqued my curiosity is the fact that there are a few online tests out there that purport to check for this sort of thing; and when I try running them, they all pass 🤔

From the 'Tests Run' tab > Technical info here: -

Screenshot 2026-03-10 10.57.17.png

From here: -
Screenshot 2026-03-10 10.59.12.png

From here: -
Screenshot 2026-03-10 11.00.30.png

planetf1b
Established Contributor

Thanks for the reply
- the summary is AI-assembled (hopefully for clarity!) but I'm a software engineer and was adding IPv6 support to apps in the late 1990s... 😉
- some sites work perfectly at 'large' MTUs (right up to 1500 MTU / 1440 MSS if using RFC 4638 - though the EE router is only configured for 1492/1432 due to the 8-byte overhead for PPP). The issue is that some sites/routes cannot take the full size, which in itself is fine, except that the sender never gets notified (except through timeout etc.)
- I mostly use OPNsense (an open-source router based on FreeBSD), but have also used an EE hub for comparison
- I work in software development, so will look at whether apps are misbehaving (is it a broken app, the OS, etc.?) - this is clearly a network issue. The cumulative effect of timeouts on battery can be huge, and delays mean a bad user experience
- Agree re: other ISPs. I did actually run some tests on a hosted service... I should include those results as a comparison
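For anyone checking the numbers in the second point, those MTU/MSS pairs are just the IPv6 header (40 bytes) and the base TCP header (20 bytes) subtracted from the link MTU:

```python
# IPv6 TCP MSS = MTU - 40 (IPv6 header) - 20 (base TCP header)
def mss_for(mtu):
    return mtu - 40 - 20

print(mss_for(1500))  # 1440  (RFC 4638 'baby jumbo' PPPoE)
print(mss_for(1492))  # 1432  (standard PPPoE, 8-byte PPP overhead)
```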

I'm back in touch with the exec office. Not much more I can do beyond that except switch ISP.