Maybe someone can help?
Assume IP Transit A and IP Transit B. Ingesting routes from both.
Transit B has a path failure but still advertises routes as preferable. They should stop advertising those routes but don’t.
How does a person fix this cock up??? There should be an easier way than dropping Transit B until they fix their path.
On 4 Jun 2024, at 7:36, Ron B via zanog-discuss wrote:
Maybe someone can help?
Assume IP Transit A and IP Transit B. Ingesting routes from both.
Are you ingesting routes from both, or are both Transit providers accepting your advertised prefixes?
Transit B has a path failure but still advertises routes as preferable.
Can you be more specific, so that we avoid assumptions?
Are you saying Transit B is still advertising prefixes towards you and you are preferring them because of your routing policy, or that you are advertising prefixes outwards to Transit B, they are advertising them onwards, and the rest of the internet still sees this as the best path?
They should stop advertising those routes but don’t.
If a path breaks completely and routes are being advertised over that path that would be interesting to see. If the path is “broken” as in congested that is different.
How does a person fix this cock up??? There should be an easier way than dropping Transit B until they fix their path.
More detail would help but if you are saying prefixes are being advertised across a path that is truly broken that is a cock up.
To answer your first question, there are someones on this list who have enable on the large African transit providers and can probably help if the cock up is there, or those someones will know someones at the upstreams who can help.
Assume IP Transit A and IP Transit B. Ingesting routes from both.
Are you ingesting routes from both, or are both Transit providers accepting your advertised prefixes?
Ingesting routes from both Transit providers and both are accepting the advertised prefixes.
Transit B has a path failure but still advertises routes as preferable.
Can you be more specific, so that we avoid assumptions? Are you saying Transit B is still advertising prefixes towards you and you are preferring them because of your routing policy, or that you are advertising prefixes outwards to Transit B, they are advertising them onwards, and the rest of the internet still sees this as the best path?
The paths to Amazon and IBM Cloud failed within the Transit provider. Traces in both directions died within the Transit provider. The policy preferred the routes via Transit B for Amazon and IBM Cloud, not necessarily for the rest of the Internet.
They should stop advertising those routes but don’t.
If a path breaks completely and routes are being advertised over that path that would be interesting to see. If the path is “broken” as in congested that is different.
That is what I was noticing: the path was broken but the routes were still being advertised over that path.
How does a person fix this cock up??? There should be an easier way than dropping Transit B until they fix their path.
More detail would help but if you are saying prefixes are being advertised across a path that is truly broken that is a cock up.
We agree on that. My opinion is that it's someone playing silly buggers with peering. It's intermittent, but it happened for 20 minutes last night. I have seen it happen twice last year.
To answer your first question, there are someones on this list who have enable on the large African transit providers and can probably help if the cock up is there, or those someones will know someones at the upstreams who can help.
Most tickets are dealt with as an immediate and visual fault but asking about root causation meets with a brick wall or some BS template response that is irrational in the context of the fault.
Like the submarine cable owners of 5 cables haven't explained how there is a single point of failure for the whole continent of Africa 250km off the Cote d'Ivoire.
I assume that the Transit provider advertises prefixes to the cloud providers regardless of having an internal fault, as the route metric is the path between the two and not the full downstream path within the Transit provider?
On 4 Jun 2024, at 8:53, Ron B via zanog-discuss wrote:
The paths to Amazon and IBM Cloud failed within the Transit provider. Traces in both directions died within the Transit provider. The policy preferred the routes via Transit B for Amazon and IBM Cloud, not necessarily for the rest of the Internet.
If the path (e.g. the peering) between the transit provider and, for example, Amazon had failed, then Amazon would not receive your prefixes and would not send traffic via that path.
That is what I was noticing: the path was broken but the routes were still being advertised over that path.
I’m still not seeing how this is possible. If the transit provider’s link to another network is down, the BGP session would be down and the prefixes withdrawn.
We agree on that. My opinion is that it's someone playing silly buggers with peering. It's intermittent, but it happened for 20 minutes last night. I have seen it happen twice last year.
There are times when the cloud provider’s network is broken and they are still sending and receiving traffic over an IX but then it dies within the cloud’s network.
This does not sound like the same thing. If you would like to share some of those traces, either on or off list, I can look into it further.
Most tickets are dealt with as an immediate and visual fault but asking about root causation meets with a brick wall or some BS template response that is irrational in the context of the fault.
It might be worth escalating it with your transit provider if you feel you are not getting a good enough response.
Like the submarine cable owners of 5 cables haven't explained how there is a single point of failure for the whole continent of Africa 250km off the Cote d'Ivoire.
Different thread, but I will bite: we can only go on what the cable operators share with us in a public forum, which is that 3 cables broke due to rock slides.
If a network doesn't have diversity then yes, it will isolate their African operations when a couple of cables break. It does seem that even with “local” presence and availability zones in ZA there is still huge reliance on international connectivity. We did discuss this at the ZANOG-GP Workshop on network resilience.
I assume that the Transit provider advertises prefixes to the cloud providers regardless of having an internal fault as the route metric is the path between the two and not the full downstream path within the Transit provider?
If there is a BGP session then prefixes will be exchanged. BGP does not carry metrics like latency or packet loss, and it does not withdraw or weight prefixes based on them.
maybe it is time for SD-BGP and BaaS (BGP as a Service) :) remember to sprinkle some AI in there too
There are times when the cloud provider’s network is broken and they are still sending and receiving traffic over an IX but then it dies within the cloud’s network.
If the Transit provider's network breaks they do the same thing.
Different thread, but I will bite: we can only go on what the cable operators share with us in a public forum, which is that 3 cables broke due to rock slides.
I'll create another thread.
maybe it is time for SD-BGP and BaaS (BGP as a Service) :) remember to sprinkle some AI in there too
I was thinking of writing a script that checks reachability to Amazon and then just withdraws the prefixes on Transit B. Or maybe a hack of a DDoS scrubbing system where, if reachability to a designated set of addresses fails, the shorter prefixes are advertised on the scrubbing centre so that it acts as a failover during the outage.
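A minimal sketch of the script idea above, in Python. Everything named here is hypothetical: the probe targets (documentation addresses), the thresholds and the `TransitGuard` class. It only decides when to withdraw or re-announce; how that decision reaches the router (a BGP daemon's API, a config push, etc.) is left out. It also assumes the probes are forced out via Transit B and sourced from a test prefix that stays announced only via Transit B, since otherwise the probe result could change as soon as the production prefixes are withdrawn and the script would flap.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Target = Tuple[str, int]

# Hypothetical values for illustration only; none of these come from the thread.
PROBE_TARGETS: Sequence[Target] = [("192.0.2.10", 443), ("192.0.2.20", 443)]
FAIL_THRESHOLD = 3      # consecutive failed rounds before withdrawing
RECOVER_THRESHOLD = 5   # consecutive good rounds before re-announcing


def tcp_probe(target: Target, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection(target, timeout=timeout):
            return True
    except OSError:
        return False


class TransitGuard:
    """Tracks probe rounds and decides when to withdraw or re-announce."""

    def __init__(
        self,
        probe: Callable[[Target], bool] = tcp_probe,
        targets: Sequence[Target] = PROBE_TARGETS,
        fail_threshold: int = FAIL_THRESHOLD,
        recover_threshold: int = RECOVER_THRESHOLD,
    ) -> None:
        self.probe = probe
        self.targets = targets
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.announced = True   # assume prefixes start out announced on Transit B
        self.bad_rounds = 0
        self.good_rounds = 0

    def step(self) -> Optional[str]:
        """Run one probe round; return 'withdraw', 'announce' or None."""
        # The path counts as down only when every target is unreachable.
        reachable = any(self.probe(t) for t in self.targets)
        if reachable:
            self.good_rounds += 1
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
            self.good_rounds = 0

        if self.announced and self.bad_rounds >= self.fail_threshold:
            self.announced = False
            return "withdraw"
        if not self.announced and self.good_rounds >= self.recover_threshold:
            self.announced = True
            return "announce"
        return None
```

A caller would run `step()` on a timer and act on the returned string. The two thresholds add hysteresis, so a single lost probe does not trigger a withdrawal and a single good probe does not trigger a re-announcement.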
Most tickets are dealt with as an immediate and visual fault but asking about root causation meets with a brick wall or some BS template response that is irrational in the context of the fault. Like the submarine cable owners of 5 cables haven't explained how there is a single point of failure for the whole continent of Africa 250km off the Cote d'Ivoire. I assume that the Transit provider advertises prefixes to the cloud providers regardless of having an internal fault as the route metric is the path between the two and not the full downstream path within the Transit provider?
Sounds like you should deal with it with your wallet :-) and buy from someone else if they can't give you answers.
A support team that works with you is paramount.
Regards
Edd
Local preference
From: Ron B via zanog-discuss <zanog-discuss@lists.nog.net.za>
Sent: Tuesday, June 4, 2024 8:36:06 AM
To: zanog-discuss@lists.nog.net.za
Cc: Ron B <ronald@amastelek.com>
Subject: [zanog-discuss] Resilience
Maybe someone can help?
Assume IP Transit A and IP Transit B. Ingesting routes from both.
Transit B has a path failure but still advertises routes as preferable. They should stop advertising those routes but don’t.
How does a person fix this cock up??? There should be an easier way than dropping Transit B until they fix their path.
I might also add BGP prepend, if I understood the question well.
On Tue, Jun 4, 2024 at 7:54 PM Ben Roberts via zanog-discuss < zanog-discuss@lists.nog.net.za> wrote:
Local preference
I use both BGP prepend and local-pref on the same link; using either one on its own sometimes did not work well.
On Tue, Jun 4, 2024 at 10:56 PM donald@nog.net.za wrote:
On 4 Jun 2024, at 22:25, Amin Dayekh via zanog-discuss wrote:
I might also add BGP prepend, if I understood the question well.
Can work some of the time, but as Ben mentioned, local-pref on the receiving network trumps BGP prepend.
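To illustrate why local-pref trumps prepending, here is a simplified Python sketch of the first two steps of BGP best-path selection: the highest LOCAL_PREF wins, and AS path length only breaks a tie. The `Route` class, the `best_path` function and the documentation ASNs are made up for the example, and real implementations apply more tie-breakers (origin, MED and so on) after these two.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Route:
    via: str                  # label for the neighbour the route was learned from
    local_pref: int           # set by the receiving network's own policy
    as_path: Tuple[int, ...]  # includes any prepends added by the sender


def best_path(routes: Iterable[Route]) -> Route:
    """Pick the route with the highest local_pref, then the shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Hypothetical routes for the same prefix, using documentation ASNs.
transit_a = Route("Transit A", local_pref=100, as_path=(64500, 64510))
# Transit B's path is heavily prepended, but the receiver prefers it by policy.
transit_b = Route("Transit B", local_pref=200,
                  as_path=(64501, 64501, 64501, 64501, 64510))

chosen = best_path([transit_a, transit_b])
print(chosen.via)  # Transit B
```

With equal local-pref on both routes the shorter path via Transit A would win, which is why prepending only works some of the time: it depends on the receiving network not having a local-pref policy that overrides it.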