RIPE 92
Main room
22 May 2026
9 am:
AMEDEO BECK PECCOZ: Hello. Good morning everybody and welcome to the long awaited Friday morning session. Just to give you a quick overview of the session in case you do not know: we are going to have this session and then there will be the General Meeting results together with the coffee break, so if you want to attend the General Meeting results, you have to give up the first few minutes of your coffee break.
So, the first speaker that we have today is going to talk to us about eyeballs. Please welcome Patrick with a big applause.
(APPLAUSE.)
PATRICK SATTLER: Yeah, welcome and thank you for joining this session on Friday. This work was done together with Matthias, Lars, Tim and Johannes. I am a ... researcher working at ...
So yes, let's start right away: what is happy eyeballs? Nowadays we have a lot of options when a client connects to a service. You can connect through IPv4 or IPv6, with TLS or QUIC; the client needs to decide which record types to query for; and since ECH is also standardised, it might become more common to have to decide whether to enable ECH or not. A normal user does not care about the protocol which is used to connect to the service. He wants to get the best possible experience, but we as technicians may want to get newer standards deployed and reduce adoption times. For this there's the algorithm called happy eyeballs, which does connection parameter prioritisation in order to favour newer protocols but still keep the user happy.
So, why is it important for you? Currently there is standardisation of version 3 ongoing at the IETF, but you might still think this is a client problem. Most of you are operators, so why should you care? What I will show you in this talk is that this is important for user experience and that some of you, for example those who operate resolvers, might learn something which can be useful for you and for your users.
What we see is that different browsers and OSes behave differently and implement different versions of it, and that has a different impact on how the user experiences specific network conditions. In certain edge cases we can see there's significant user impact.
So our main contributions are that we developed different test cases for different aspects of the happy eyeballs algorithm, we evaluated existing implementations and also newly developed ones, we created a web based tool to check the happy eyeballs implementation of any browser which has JavaScript capability, and we have local measurements in a test bed to validate these.
So to give you a short background on what happy eyeballs is and how it works: version one, standardised in 2012, wants to prioritise IPv6 over IPv4, and it does that using a thing called the connection attempt delay. You start your IPv6 connection, in this case the TCP handshake, and then wait for the connection attempt delay, and only if you don't receive a response within this connection attempt delay do you fall back to IPv4. This connection attempt delay is usually set to 250 milliseconds, or recommended to be set to that, so the user will barely notice this fallback and still gets a working connection even if IPv6 is broken.
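The fallback rule described here can be written down as a small model. This is only a sketch under simplifying assumptions: the function name `winning_family` and its parameters are made up for illustration, handshake times are given as plain numbers instead of being measured, and the IPv6 attempt is assumed to keep running after the IPv4 attempt has started, so whichever handshake finishes first wins.

```python
from typing import Optional

# Recommended value mentioned in the talk; real clients may use other values.
CONNECTION_ATTEMPT_DELAY_MS = 250


def winning_family(v6_handshake_ms: Optional[float],
                   v4_handshake_ms: Optional[float],
                   cad_ms: float = CONNECTION_ATTEMPT_DELAY_MS) -> Optional[str]:
    """Return the address family a client would end up using.

    A handshake time of None means that family never answers.
    IPv6 starts at t=0; IPv4 starts only once the connection attempt
    delay has expired without an IPv6 answer.
    """
    if v6_handshake_ms is not None and v6_handshake_ms <= cad_ms:
        return "IPv6"  # answered before the fallback was even started

    never = float("inf")
    v6_done = v6_handshake_ms if v6_handshake_ms is not None else never
    # The IPv4 attempt only begins after the delay has passed.
    v4_done = cad_ms + v4_handshake_ms if v4_handshake_ms is not None else never

    if v6_done == never and v4_done == never:
        return None  # nothing is reachable
    return "IPv6" if v6_done <= v4_done else "IPv4"


if __name__ == "__main__":
    for v6 in (20, 250, 300, None):
        print(v6, "->", winning_family(v6, 20))
```

With a 20 ms IPv4 handshake, this model keeps IPv6 up to the configured delay and switches to IPv4 once IPv6 is slower than the delay plus the IPv4 handshake.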
In version 2 the IETF added additional logic to also handle the DNS resolution. Version 1 expects IPv6 and IPv4 addresses to be available; to get these we need to resolve them, and they defined that both AAAA and A records should be resolved. The AAAA query should be sent first and then the A query, and if the A response is received first, then the client should wait for something called the resolution delay to see if the AAAA response arrives. Only if it does not arrive within this resolution delay does it fall back to IPv4.
So our approach was to treat the client as a black box, so we don't know what happens inside; there are also implementations where we don't have the source code. The client is a black box, and our server is configurable per test case. Additionally, for the resolution we also implemented a custom name server. Our web‑based tool should be easy to use, so everyone can just go on it and run some measurements to get results. And it's intended to collect results in real networks on real devices with real browser set‑ups, because we know there's a variety out there and we cannot cover them in artificial set‑up environments.
But on the other hand these measurements in real networks can be influenced by caching, which is not under our control. We try to reduce it as much as possible, but we also perform local test bed measurements to see whether what we see on this web based tool is also confirmed in such an artificial environment.
Finally, our code is all OpenSource and we are also actively developing and adding new features to this tool.
We are also looking to host additional instances; this is a call for action. If anyone has some interest in this, we would like to host it somewhere else. Currently it's only hosted in Munich, and since this is a latency sensitive algorithm, it might be beneficial to have it in other places of the world to get more significant results there too.
So let's start right into the first test, the connection attempt delay. We implemented this test by adding a delay on the server side for IPv6, and we did that by setting up different IP addresses with different delays per IP address.
So that's quite straightforward: the server delays based on the IP address that is contacted.
Looking at the results, this is what our website looked like before one of my students improved the UI a lot, but it basically stays the same. On the top you see the delay which is applied to each connection, and each line is basically a new measurement run. What you see here is Chrome, and you see that it uses IPv6 up until a delay of 300 milliseconds and then falls back to IPv4. This is a static delay. We always see the same behaviour.
For Firefox, it's very similar but the connection attempt delay is not 300 milliseconds but 250 milliseconds, so Chrome and Firefox are quite simple to interpret in how they work.
Now going over to the third major browser engine developer, Safari, there we see a very different picture. In general it prefers IPv6, but as we increase the delay there's no clear picture: you see switching from v6 to v4 and then back to v6. Here we see a maximum delay of 600 milliseconds where IPv6 was still used, and I can tell you that we even observed that IPv6 was used up until a delay of two seconds. This is called dynamic connection attempt delay and is also standardised in the RFC. We also talked to Apple, and they confirmed to us that they do it dynamically. What they told us is that it's not domain‑based but actually prefix‑based, but this still does not explain what we see here; for example, for 300 milliseconds we see v4 in the first one, then v6, and then again IPv4, although the IP address is always the same; it's just different domain names to prevent caching. So there is still a pattern to be found. We don't know the algorithm since it's not public as far as we know. But yeah, this is a behaviour where you might see interesting, unexpected results compared to how Chrome and Firefox currently behave.
Now, this was the connection attempt delay, where IPv6 was raced against IPv4. Going over to the resolution delay: what we did is we implemented a custom resolver, and we encode the resource record type and delay in the domain name. Our name server then delays the response by what is configured in the domain name, and this way we can then check what the client connects to.
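To make the idea of encoding the test parameters in the query name concrete, here is a minimal sketch. The label layout (`<record type>-<delay in ms>` as the first label, for example `aaaa-50`) and the helper names are assumptions made for this illustration; the talk does not say how the actual tool lays out its names.

```python
import re
from typing import NamedTuple, Optional


class DelayRule(NamedTuple):
    rtype: str     # record type whose answer is held back, "A" or "AAAA"
    delay_ms: int  # how long the name server waits before answering


# Hypothetical layout of the first label: "<rtype>-<delay in ms>".
_RULE = re.compile(r"^(a|aaaa)-(\d+)$", re.IGNORECASE)


def parse_delay_rule(qname: str) -> Optional[DelayRule]:
    """Extract the delay rule from the first label of a query name."""
    first_label = qname.rstrip(".").split(".")[0]
    match = _RULE.match(first_label)
    if match is None:
        return None
    return DelayRule(match.group(1).upper(), int(match.group(2)))


def delay_for(qname: str, qtype: str) -> int:
    """Delay in ms the server should apply to this particular query."""
    rule = parse_delay_rule(qname)
    if rule is not None and rule.rtype == qtype.upper():
        return rule.delay_ms
    return 0  # every other record type is answered immediately


if __name__ == "__main__":
    name = "aaaa-50.run1.test.example."
    print(delay_for(name, "AAAA"), delay_for(name, "A"))
```

Putting a unique label such as `run1` in each name is one way to get the "different domain names to prevent caching" effect mentioned earlier.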
Now, first, Safari here, the first line is delaying the A record, the second line the AAAA and then it repeats basically. What we see, if you delay the A record, Safari does not care, it always used IPv6, because the AAAA record returns immediately.
If we delay the AAAA record, we see it will still use IPv6 up until a delay of 50 milliseconds and then falls back to IPv4; this is exactly what the RFC says or what it recommends so this is what one could expect reading the RFC.
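The resolution delay behaviour observed for Safari can be modelled in a few lines. This is a sketch, not any vendor's implementation: `family_after_resolution` is a made-up name, answer arrival times are passed in as numbers, and the 50 ms default is the value mentioned in the talk.

```python
from typing import Optional

RESOLUTION_DELAY_MS = 50  # value mentioned in the talk


def family_after_resolution(aaaa_ms: Optional[float],
                            a_ms: Optional[float],
                            resolution_delay_ms: float = RESOLUTION_DELAY_MS
                            ) -> Optional[str]:
    """Family a client connects with first, given DNS answer arrival times.

    None means the answer never arrives. If the A answer comes first, the
    client waits up to the resolution delay for the AAAA answer before it
    gives up on IPv6.
    """
    if aaaa_ms is None and a_ms is None:
        return None
    if a_ms is None:
        return "IPv6"
    if aaaa_ms is None:
        return "IPv4"
    if aaaa_ms <= a_ms + resolution_delay_ms:
        return "IPv6"  # AAAA arrived first or within the grace period
    return "IPv4"


if __name__ == "__main__":
    print(family_after_resolution(aaaa_ms=0, a_ms=2000))  # A delayed
    print(family_after_resolution(aaaa_ms=50, a_ms=0))    # AAAA delayed a bit
    print(family_after_resolution(aaaa_ms=60, a_ms=0))    # AAAA delayed too long
```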
Now, Firefox is again very different. What you see here is not a resolution delay. What we find is that neither Firefox nor Chrome implements a resolution delay, but Firefox has an even more specific behaviour. Both always wait for both responses to return.
So Chrome as well as Firefox issue AAAA and A queries and wait for both to return, so they already have very long measurement run times. And Firefox even fails in some cases where the A record is delayed for too long. That's what you see here with the error.
This happens even though the AAAA response returned, so it could connect over IPv6 directly.
So what we see for Firefox and Chrome, and this is not only true for Chrome but for all Chromium based browsers (we tested different ones, they don't change their behaviour with regard to happy eyeballs), is that in the current stable version they do not implement any resolution delay, and the outcome is that the DNS resolver's time‑out will decide how fast a page will load if there is a delay on the A record.
So our suggestion here is, if you operate resolvers, check your time‑outs and see if the time‑out is set to a sensible value or if you might want to lower it to improve the user experience in these edge cases where one record might be delayed or unavailable.
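Why the resolver time‑out matters for clients that wait for both answers can be seen in a deliberately simplified model. The function name and the assumption that the resolver reports a failure to the client exactly when its time‑out expires are illustrative only; real resolvers retry and behave in more complex ways.

```python
from typing import Optional


def wait_for_both_ms(aaaa_ms: Optional[float],
                     a_ms: Optional[float],
                     resolver_timeout_ms: float) -> float:
    """Time until a client that waits for both answers can start connecting.

    None means the authoritative side never answers that query, so the
    client only hears back when the resolver gives up.
    """
    def capped(answer_ms: Optional[float]) -> float:
        if answer_ms is None:
            return resolver_timeout_ms
        return min(answer_ms, resolver_timeout_ms)

    # The client proceeds only once the slower of the two lookups is done.
    return max(capped(aaaa_ms), capped(a_ms))


if __name__ == "__main__":
    # AAAA answers in 10 ms, the A answer never arrives.
    for timeout in (5000, 1000):
        print(timeout, "->", wait_for_both_ms(10, None, timeout))
```

In this model the working AAAA answer does not help at all: the start of the connection is pushed out to whatever the resolver time‑out happens to be.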
Furthermore, something I did not show you here on the slides is that we also found that there's different behaviour in certain proxy tunnel set‑ups, especially with iCloud ... hop relay with different egress operators; two of them are Akamai and Cloudflare. So even though you are using Safari, you have at least three different behaviours: without the ... what I just showed you, and then, depending on the egress operator, you will also see different behaviours for Akamai and for Cloudflare. So yes.
So that means that the client and the network configuration are relevant to what you measure and what you see.
Going on from this, we decided to also measure resolvers: how do resolvers actually connect to our authoritative name server and what do they use? First we checked which record types they query: do they even send AAAA queries, and is the AAAA query received before the A query? We did it for the resolver software BIND, Unbound and Knot and for some public resolvers. We see BIND sends both record types but always sends the A query first and the AAAA query after. But BIND will then still always use IPv6 if a AAAA record is provided. This is unlike Unbound: Unbound always sends the AAAA query first but uses IPv6 in only 44% of the cases which we observed. Maybe there's some randomness in there, we don't know exactly how it chooses; it's not always preferring IPv6, it seems to sometimes use IPv6 and sometimes IPv4. For Knot, we have yet another behaviour: it will issue either A or AAAA queries but not both at the same time.
And Knot then uses AAAA, and in the end IPv6, in not even 30% of the cases.
The next columns show the delay we observed and whether a re‑try was observed at our server. The number of IPv6 packets tells you, if a time‑out is experienced, whether it falls back to IPv4 directly or re‑tries IPv6. For Unbound and Knot, we see that it re‑tries once with IPv6; it does not have a direct IPv4 fallback.
For the public resolvers, we have a very interesting picture for Google Public DNS: for an uncached zone it does not issue a AAAA query for the name server name until it has finished resolving this query, and only after responding to the client will it issue the AAAA query. We assume this is so that future queries into the zone are able to use IPv6, but it will never use IPv6 if the zone is uncached.
Cloudflare, Quad9 and OpenDNS all send, according to our observations, AAAA queries first and then A queries, but Cloudflare and Quad9 only use IPv6 in 11% and 34% of the cases, while OpenDNS always prefers IPv6 if it's available, and it uses a time‑out of only 50 milliseconds, which is very different to what we observed with any other resolver operator. OpenDNS also implements fallback to IPv4 on the first time‑out already.
So the takeaway for us is that connection property selection and prioritisation is not a web‑only issue. It also concerns the DNS protocol, and if you think about mail and other protocols, they might also need to think about whether they want to prioritise protocols like IPv6. And DNS might at some point need to standardise whether it wants to use different DNS protocols to communicate with the authoritatives. So while happy eyeballs is standardised for web use cases, other protocols might also want to standardise such prioritisation algorithms for their exact use case.
Now, going over to happy eyeballs version 3: version 3 now includes all these more complex options which I told you about at the beginning, QUIC and ECH for example. This also means that we have a multi‑layer decision: in the prioritisation list you can choose between IPv4 and IPv6, but you can do QUIC on IPv4 and on IPv6, so you have a matrix of options and you need to prioritise these. That means higher complexity. What we did here is we implemented a number of tests, and we are currently still working on new tests, so if you have input, please tell us.
So one test is just: does your browser even support QUIC or ...? If it doesn't, the other tests don't make sense. We have a QUIC test that delays one protocol to see what happens, and the final test is the most complex: we basically delay the protocol handshake, so QUIC or TLS, and also the IP version.
Now I want to show you a short demo of exactly this from our website.
Yes, so here you see our website and you see the different tests which you can select from. We now want to look into this combined QUIC and IP version delay test. At the beginning you will also see some explanation of what it actually does, so that you can understand it, and here you have quite a number of options which you can configure too, but usually there are sensible defaults. We also have some explanation texts for you to understand what each option does. In this test I am reducing the number of tests a bit to have a shorter demo. When you press run test, it will run the tests. This is Firefox Nightly, and it will prefer QUIC over IPv6. If you delay QUIC, then it will fall back to TLS. If you delay the IP version, it will fall back from IPv6 to IPv4. And if you delay both, you get TLS over IPv4.
So this is exactly what you expect. This was not the very first version which Mozilla implemented but we were in contact with them and this is what you would expect from what's currently thought of.
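The matrix of options and the fallbacks seen in the demo can be reproduced with a toy model. This is only an illustration: the fixed preference order (QUIC before TLS, IPv6 before IPv4) and the names `candidate_order` and `pick` are assumptions, and the sorting in the version 3 draft is more involved than a simple product of two lists.

```python
from itertools import product
from typing import Iterable, List, Optional, Set, Tuple

Candidate = Tuple[str, str]  # (handshake protocol, IP version)


def candidate_order(protocols: Iterable[str] = ("QUIC", "TLS"),
                    families: Iterable[str] = ("IPv6", "IPv4")
                    ) -> List[Candidate]:
    """All protocol/family combinations, most preferred first."""
    return list(product(protocols, families))


def pick(candidates: List[Candidate], delayed: Set[str]) -> Optional[Candidate]:
    """First candidate that is not delayed in either dimension."""
    for protocol, family in candidates:
        if protocol not in delayed and family not in delayed:
            return (protocol, family)
    return None


if __name__ == "__main__":
    order = candidate_order()
    for delayed in (set(), {"QUIC"}, {"IPv6"}, {"QUIC", "IPv6"}):
        print(sorted(delayed), "->", pick(order, delayed))
```

Treating a delayed option as unusable is the simplification here; a real client races attempts with timers instead of skipping candidates outright.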
OK. Then, to conclude my talk: only Safari implements version 2; no other browser yet supports more than the connection attempt delay. But all three browsers, or rather their engine developers, are currently working on implementing version 3. Chrome has a feature to enable this. In Firefox you can enable it in the Nightly version. And Apple told us they are working on it, so we don't have more insight into that. What we see is that, if you use our measurements, it's not only the client and the OS that might be important, but also the network condition you are in: do you use proxies or not?
Then the resolver time‑outs: if no resolution delay is implemented, the resolver time‑out can heavily impact how the user experiences a service.
And our next steps are now to implement more test cases, especially with happy eyeballs version 3, where you have a huge parameter space and can have multiple different tests. And like I said, we are looking to have different vantage points. So if you have any ideas, or have some resources where we can deploy such an instance, please contact us. Now open to questions. Thank you.
(APPLAUSE.)
ANNIKA HANNING: Thank you, Patrick. OK, do we have any questions?
All right, please state your name, affiliation.
SPEAKER: Lee Howard, happily retired. First of all, brilliant work. This is fantastic, really good news for the internet, really great to see this research, thank you.
And I have two questions. One is: do you have an opinion, based on this, on what method, what variation, would provide the best user experience? The other question is a little bit more technical, maybe more interesting: in Android phones, there's a software CLAT that I believe is adding some latency when connecting through CLAT. I would love to see if that provides a different experience. So Android versus desktop.
PATRICK SATTLER: OK. So on the preference: I think it's a good idea to have something like happy eyeballs as guidance for clients on what you should prioritise, based on measurements you have done and on best practice. I think it's a good idea in general. For your comment on Android and CLAT, that's exactly why we created this web page, so everyone can bring their devices and certain network conditions. We don't have it yet, but yeah, I agree, that's exactly the reason why we did it on a web page in that simple‑to‑use way, so basically everybody can do it on their own and can bring their devices in the network conditions they are interested in to see what happens. On our new version of the web page we also have debug insights: if you hover over a specific delay, you will see when the client connected to the server, the delay between fallbacks and so on. So I hope this helps.
SPEAKER: Greg Choules, ISC. Back to slide 14, if you would please. It's not a quick one, I am just curious about your BIND results and if you want to talk to me afterwards about it, just don't know how you got to that conclusion, that was all.
PATRICK SATTLER: You mean how we measured it?
SPEAKER: How it was that you concluded that BIND sends AAAA after the A.
PATRICK SATTLER: This was a local test bed measurement. We used the current stable BIND version and just looked in tcpdump at what's sent first and also at what we see at the server first. So maybe there's a new version which we did not use; I would have to check which exact version we used. But the resolver software side is always what we deployed locally and what we observed there, so that's how we concluded it for the resolver software. It's a bit more complicated for the public resolvers, but not for the local ones.
SPEAKER: I would love to see the PCAPs and config at some point.
PATRICK SATTLER: I can try to. OK.
SPEAKER: Jen Linkova, network engineer at Google and someone who keeps disturbing... So I am not sure why you are specifically focusing on Android when CLAT is implemented in like all Apple products, right? So again, we all know that happy eyeballs v2 only exists in Safari; everything else is currently v1. For v3, however, we keep having a very long discussion, and I am yet to produce a PR for this draft. We had something like a two‑hour side meeting at the last IETF on how exactly we are going to treat the various scenarios of v6‑only and v6‑only plus CLAT, and whether we actually want to differentiate them or just consider it dual‑stack stuff, right? Because one concern is you don't want your host and happy eyeballs implementation to care whether it is actually a CLAT IPv4 address or a real IPv4 address, and you cannot always actually tell that. So it's a bit tricky and it's a work in progress, right? So if you have strong opinions, I am quite sure the working group, which is actually quite active, will be happy to hear your feedback.
ANNIKA HANNING: All right. I didn't, thank you Patrick.
(APPLAUSE.)
Next we are going to have two lightning talks, and for the first one I want you all to welcome Warren and Dhruv on stage. Yes, take it away.
WARREN KUMARI: While we are getting slides, hi I'm Warren, this is Dhruv.
DHRUV DHODY: I will start talking I guess. Hi everyone, I'm Dhruv. I'm the Internet Architecture Board chair and we are here to talk about a workshop that we did. This is the report of the workshop, which is under community review right now. We want feedback from this group, that's why we are here. Let's start.
For those who missed the BoF, this is just a quick introduction to what the IAB does. We are the body that provides long‑term technical direction; we want to make sure the internet continues to grow and evolve and that the architectural guidance and technical foresight is there. One of the things that we do as an activity is pre‑standardisation workshops. While the IETF is focused on drafts and RFCs, a workshop is a way for us to have a much longer, focused conversation, sometimes over multiple days, where we can go deep. Whereas if you have attended an IETF session, you know how we move from document to document, and the pace of discussion is very different.
So IAB workshops invite folks who are missing at an IETF meeting, and we can have a much more in‑depth discussion, and sometimes, with Chatham House rules or other mechanisms, we can have more frank conversations that are not always possible in a public forum. But at least this workshop was a virtual one, and it was recorded and all the materials are available.
This workshop is basically so that we can gather more input and sometimes explore and then hopefully that leads to more work in the IETF, either in working groups, new working group or a research group or giving more direction back to the community.
So, we had this workshop in December. We had around 67 folks participating virtually and there were 30 accepted submissions. The report is out and under community review, and all of the material is available if you want to check it out for more details.
The way we ran the workshop was that on the first day we focused on the current use cases and tried to figure out how people are actually publishing geolocation information and how they are discovering and consuming it. Then: are there ideas for improvement, what would people like to see change, what is the future direction, and where is this going? And with that let me pass it on to Warren.
WARREN KUMARI: Hi all. So I am going to be going through this fairly quickly because we would like some time for questions. So what are we talking about here? I think most people are aware of what geolocation is, or IP geolocation but a little bit of background what it was originally designed for.
Initially, the primary use for this was to try to get people to a data centre that's relatively close to them. So basically if you are in Edinburgh, we would like to get your traffic to Edinburgh or links, not serve it from a data centre in Sydney.
Once we started deploying that people started using it for a bunch of other things. Once it was deployed, people figured out if you are, for example, Papa John's and somebody searches for pizza shop, you should give the person an answer somewhere in Edinburgh and not, for example, in New York.
It then got expanded for a bunch of other uses, so if you only had the rights to distribute a video stream for a football match in Spain, and somebody is trying to access it from Germany, maybe you shouldn't provide that content to them.
It has then also sort of been bent and used for other things, like for example: on this side of a bridge you are allowed to do online gambling, but on the other side of the bridge you are in a different country and you are not. So what IP geolocation is being used for isn't really the original use case, and it's probably not well suited to the new use cases. This is basically the same thing, right: originally it was a network level thing, but people are trying to use it for enforcement as well and for much more highly targeted location info.
For example, contractual stuff and law enforcement also would potentially like higher resolution stuff for things like if you make a VoIP call to emergency services, they might want to be able to deliver the ambulance to you and not to sort of the city that you are in.
How does geolocation work? It was originally written up in this RFC and it's a CSV format, so literally prefix, country, region and city. At one point it also had postal code but that is deprecated because when we wrote it, we figured that was much too targeted, it was intended to be less granular than that sort of thing.
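A geofeed file in this CSV layout is simple to read. The sketch below assumes the format published as RFC 8805 (prefix, country, region, city, and a trailing deprecated postal code field, with `#` comment lines); the sample prefixes are documentation ranges and the `parse_geofeed` helper is made up for this example.

```python
import csv
import ipaddress
from typing import Iterator, NamedTuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class GeofeedEntry(NamedTuple):
    prefix: Network
    country: str  # ISO 3166-1 alpha-2 code
    region: str   # ISO 3166-2 code
    city: str


def parse_geofeed(text: str) -> Iterator[GeofeedEntry]:
    """Yield one entry per data line, skipping blank and comment lines."""
    data_lines = (line for line in text.splitlines()
                  if line.strip() and not line.lstrip().startswith("#"))
    for row in csv.reader(data_lines):
        # Pad short rows so missing trailing fields become empty strings.
        fields = [field.strip() for field in row] + [""] * 4
        yield GeofeedEntry(ipaddress.ip_network(fields[0]),
                           fields[1], fields[2], fields[3])


SAMPLE = """\
# example geofeed, documentation prefixes only
192.0.2.0/24,GB,GB-EDH,Edinburgh,
2001:db8::/32,DE,DE-BY,Munich,
"""

if __name__ == "__main__":
    for entry in parse_geofeed(SAMPLE):
        print(entry.prefix, entry.country, entry.city)
```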
So network providers would put their CSV or geofeed file and publish it somewhere, but that's not really helpful because if people don't know where the file is, you can't actually get the info.
So there's this other RFC which allows you to publish the URL for a geofeed file in IRR data. So somebody can download the entire RIPE database and the databases from other IRRs, walk through them, collect all the URLs, then go and fetch the files, collect all of that, and then you have a mapping from IP address to location. If you don't want to do that work yourself, there's a whole bunch of geolocation providers who will do this for you, often at a fairly substantial cost. Or Massimo Candela has a site called geolocatemuch; he has done this work and you can just download the geofeed files in one big block.
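The "walk through the database and collect all the URLs" step could look roughly like this. The sketch assumes the two ways of publishing a geofeed URL that are in common use in IRR data, a dedicated `geofeed:` attribute and a `remarks: Geofeed <url>` line; the sample objects and the `geofeed_urls` helper are invented for illustration, and fetching and validating the files is left out.

```python
import re
from typing import List

# Matches "geofeed: https://..." or "remarks: Geofeed https://..." lines.
_GEOFEED = re.compile(
    r"^(?:geofeed:\s*|remarks:\s*geofeed\s+)(https://\S+)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def geofeed_urls(rpsl_text: str) -> List[str]:
    """Return every geofeed URL found in a dump of RPSL objects."""
    return _GEOFEED.findall(rpsl_text)


SAMPLE_DUMP = """\
inetnum:        192.0.2.0 - 192.0.2.255
netname:        EXAMPLE-NET
geofeed:        https://example.net/geofeed.csv

inet6num:       2001:db8::/32
netname:        EXAMPLE-NET6
remarks:        Geofeed https://example.org/geofeed.csv
remarks:        unrelated free text
"""

if __name__ == "__main__":
    for url in geofeed_urls(SAMPLE_DUMP):
        print(url)
```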
So we had this workshop and it exposed a bunch of concerns. One of them is that it's not entirely clear what the location actually represents.
Right, if you get a thing saying this prefix is in the EICC in Edinburgh, is that where the user using the IP addresses is, or is that where the CGN the user is behind is? Is that where your network egress point is, is it where the org ID is registered, or is it just the country it's being used in?
But there were a bunch of other things that were sort of poked at as gaps. IP addresses don't really have a geography; they move around, and they weren't originally intended for this sort of use. The CSV format doesn't have anything saying when the file was last updated. So if you have got a file that's ten years old, is it still actually valid? If it was updated last week, it's probably more likely to be valid. Another one: there isn't really user consent for this. Your network publishes it, and if it's just publishing that this address is used in this country or city, that's probably fine, but now that it's being used for more targeted and higher resolution things, should the user be involved?
I think we still have some time, even though it's gotten to... that's the time? So why did we have this workshop? It's clear that IP geolocation isn't going away, but it's not really well suited for some of the current use cases that people want to use location for. An example: we seem to be seeing more satellite networks. If you are sailing a boat off the coast of Edinburgh and you are using Starlink to make a call to the Coastguard, should the Coastguard show up where the boat is? Should they show up where the satellite downlink is? Or should they show up where your house was when you bought Starlink? Clearly one of those is the right answer and the others aren't.
So there are some changes from the workshop. One of them is that it was fairly clear that people don't like the CSV format for geofeeds: it's annoying to parse and it's not very expressive. I submitted a new draft which is the same geofeed stuff but in JSON, with a new mandatory last‑updated time. A lot of people want to add additional info to that, but we are also looking at doing things like adding new mechanisms so that this can be signalled in HTTP or through an application, and then the user will hopefully have a lot more control over this. So if you are trying to buy a fishing licence, possibly all that the provider needs to know is that you are roughly in this country, whereas if you are trying to find the nearest pub, possibly you need higher granularity, and maybe browsers and phones and similar can provide you with a control to set how much info you share.
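To give a feel for what "the same geofeed data in JSON with a mandatory last‑updated time" might look like, here is a sketch. The field names (`last_updated`, `entries`, `prefix`, `country`, `region`, `city`) are purely hypothetical and are not taken from the draft; only the idea of carrying a timestamp alongside the entries comes from the talk.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, Tuple

Row = Tuple[str, str, str, str]  # prefix, country, region, city


def geofeed_to_json(rows: Iterable[Row], last_updated: datetime) -> str:
    """Serialise geofeed rows together with a last-updated timestamp."""
    document = {
        # Hypothetical key; stored as an ISO 8601 timestamp in UTC.
        "last_updated": last_updated.astimezone(timezone.utc).isoformat(),
        "entries": [
            {"prefix": prefix, "country": country,
             "region": region, "city": city}
            for prefix, country, region, city in rows
        ],
    }
    return json.dumps(document, indent=2)


def age_in_days(document_json: str, now: datetime) -> float:
    """How old the feed is, which plain CSV rows cannot tell a consumer."""
    updated = datetime.fromisoformat(json.loads(document_json)["last_updated"])
    return (now - updated).total_seconds() / 86400


if __name__ == "__main__":
    stamp = datetime(2026, 5, 22, 9, 0, tzinfo=timezone.utc)
    text = geofeed_to_json([("192.0.2.0/24", "GB", "GB-EDH", "Edinburgh")], stamp)
    print(text)
```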
So next steps: there's going to be some more work. There are some drafts coming out of this. But also, this is RIPE, and there are Address Policy discussions, there are database discussions, etc. This is where the people who actually understand addresses and how they are used and deployed show up. So we'd like your input as this work progresses. Please participate. And I think we have a tiny bit of time for questions.
Apologies for going through this so quickly.
(APPLAUSE.)
SPEAKER: Lee Howard, happily retired. I am not sure I heard in the workshop that people didn't like CSV; I heard that the current version was insufficient, so I want to recharacterise that. It is my personal opinion, and I have said this, that the first use case that you provided, using latency to infer where the nearest data centre is, that's architectural. That's not even geolocation, that's using the network to provide the performance. Great. Everything after that is about adding metadata on top and saying Edinburgh as opposed to saying 11 milliseconds. I only like that original case; all the rest are adding metadata.
WARREN KUMARI: A quick clarification on getting users to the closest thing: originally providers were publishing the geofeed so a CDN could know, when a connection comes from this address, let's use this one. But yes.
SPEAKER: ... thank you, speaking in my personal capacity. And based on the geofeed basically... other requirement and also that was a... into the Whois either in... RIPE region, but actually the coverage across all the different regions was very insignificant, the coverage was less... Is there any way we can promote to members to have geofeed information with...
WARREN KUMARI: If you go to geolocatemuch.com, or dot something, search for geolocate, Massimo has a bunch of stats published on his page, and it looks like there is very good uptake, at least of people publishing geolocation. For the most recent RFCs, where you publish an attribute in Whois, that is still fairly significant, so maybe we can chat more later about stats.
SPEAKER: Greg Choules, ISC. Regarding the point about taking the user to the nearest data centre, we see a lot of use of ECS in DNS specifically for this. There was a talk earlier in the week, I can't remember which, about effectively obfuscating your location by using IPv6 addressing, so that doesn't help an awful lot; that's going to make things worse. And just a point on whether we should be using IP addresses at all, and you kind of raised the point: why not use your GPS location? Other satellite location systems may exist.
WARREN KUMARI: Yeah, I mean what geolocation is being used for now isn't really the original use case and intent, and it feels like having some better solution, preferably where the user has a lot more control is the way to go. Right. Having different services have different amounts of granularity seems really useful. There's the slippery slope argument. Most apps on my iPhone require that I give them some sort of geoinfo or access to geolocation, and for most of them I don't really understand why they need that, but if I don't give them the access, I can't use the app. And so how do we do this in a way that people can choose what information to share, what granularity and not be stuck in the, you have to do this or you can't reach the service.
SPEAKER: All data is useful to somebody somewhere.
WARREN KUMARI: Thank you very much and come chat with us after if you like.
AMEDEO BECK PECCOZ: Thank you very much. There was another question. Sorry, your speech was too interesting.
I think that's the problem. Please ask the question. OK, now I would like to call Ondrej on stage, he is going to talk to us about ASPA, the stage is yours. (APPLAUSE.)
ONDREJ CALETKA: Hello, so let's continue in this week of ASPA, we started on Monday, we had ASPA working group and now we have ASPA closing. I am going here with the public service announcement what we did this week in the RIPE meeting which probably was not good idea.
So for RIPE 92 we did ASPA signing because ASPA is the new kid on the RPKI block, so we were eager to try it. First of all, we signed an ASPA object for our autonomous system and it was straightforward; it was the very first ASPA object our RPKI portal created when it went live in November last year. Other ISPs started signing their ASPAs as well, so this is fine. And of course we also started dropping ASPA-invalid paths in our network, because that's what you are supposed to use ASPA for. It is something nobody is doing because the support from the big router vendors is not good or virtually nonexistent. We use BIRD, and you heard Maria's talk: BIRD has ASPA support that actually works, as we understand the draft.
Yeah, and maybe after this talk, you won't be doing the same thing that we did. So first of all, how does our RIPE meeting network work? It's a stub network: we have our ASN and IP addresses, and usually we are connected to only one upstream ISP that is providing us not with a default route but with the full BGP table, so we can do ROA validation. That's what the community asked for: we basically validate the ROAs, and if we find that some prefixes are originated from invalid ASNs, we drop them and do not put them into our routing table. This has worked for years without issues, even though it's debatable whether it has any effect on this single-homed stub network.
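The ROA filtering described here can be sketched roughly as follows. This is a minimal illustration of route origin validation in the style of RFC 6811, not the meeting network's actual BIRD configuration; the `Roa` class, the function names, the prefixes and the AS numbers (documentation ranges) are all assumptions made up for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Roa:
    """A validated ROA payload: prefix, maximum length, authorised origin AS."""
    prefix: str
    max_length: int
    asn: int


def origin_validation(route_prefix, origin_asn, roas):
    """Return 'valid', 'invalid' or 'not-found' for one route."""
    route = ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # A ROA only applies if it covers the route (same family, route inside ROA prefix).
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


def import_filter(routes, roas):
    """Keep every route except those whose origin is ROA-invalid."""
    return [
        (prefix, origin)
        for prefix, origin in routes
        if origin_validation(prefix, origin, roas) != "invalid"
    ]


# Hypothetical data: AS64500 is authorised for 2001:db8::/32 up to /48.
roas = [Roa("2001:db8::/32", 48, 64500)]
routes = [("2001:db8:1::/48", 64500), ("2001:db8:2::/48", 64501), ("192.0.2.0/24", 64502)]
accepted = import_filter(routes, roas)
```

Only the route with the wrong origin AS is dropped; routes without any covering ROA stay in the table.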
And we did ASPA validation this week because it was like the obvious thing to do.
Well, and then I asked on Monday that if you see something that doesn't look right, talk to me or text me and we'll sort it out. So I got the first report from the other Ondrej that you know, the Chair of the RIPE NCC board, who is from CZ‑NIC, and he complained that he cannot reach it over IPv6. I looked it up and found that we don't have the route in our router because it's ASPA invalid. Why is it ASPA invalid? Well, because our ISP has also signed their ASPA. CZ‑NIC also did their homework and signed their ASPA. Both have a list of the providers that are providing them service, but both of them are also peering with this ASN 6939, which is well known as Hurricane Electric, and this network is quite known for providing free IPv6 transit.
So basically, because it's free, there's no contract with them, and of course they are not listed in either side's ASPA list of providers. So what ASPA sees here is basically the textbook example of a route leak by a peer: there's a peer of those two which is leaking traffic. So this is exactly the valley in the valley‑free routing concept of ASPA. Both our ISP and the destination are peering with the same party, neither of them considers it their provider, but it does act as a provider and does a good job; they deliver the packets, so it's not just a configuration mistake, they are doing the job of a provider. So ASPA is doing the right thing. This is a route leak and it was prevented by ASPA validation.
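To make the "valley" concrete, here is a simplified sketch modelled on the downstream (route received from a provider) procedure of the ASPA verification draft. It ignores AS_SETs and other checks from the draft. AS 6939 comes from the talk; the AS numbers for the ISP, the destination and their providers are placeholders, and it is an assumption of the example that AS 6939 has no ASPA of its own.

```python
def hop(customer, provider, aspa):
    """Classify one hop: does `customer` attest `provider` as its provider?"""
    providers = aspa.get(customer)
    if providers is None:
        return "no-attestation"
    return "provider+" if provider in providers else "not-provider+"


def ramp_lengths(chain, aspa):
    """Walk customer-to-provider hops from chain[0]; return (max_len, min_len)."""
    n = len(chain)
    max_len = min_len = n
    for i in range(n - 1):
        result = hop(chain[i], chain[i + 1], aspa)
        if result != "provider+" and min_len == n:
            min_len = i + 1   # the provably authorised ramp ends here
        if result == "not-provider+":
            max_len = i + 1   # the ramp cannot possibly extend past here
            break
    return max_len, min_len


def verify_downstream(as_path, aspa):
    """as_path is in AS_PATH order: neighbour first, origin last."""
    # Collapse prepending so that every hop is between two different ASes.
    chain = [a for i, a in enumerate(as_path) if i == 0 or a != as_path[i - 1]]
    n = len(chain)
    max_down, min_down = ramp_lengths(chain, aspa)        # from our neighbour
    max_up, min_up = ramp_lengths(chain[::-1], aspa)      # from the origin
    if max_up + max_down < n:
        return "invalid"      # the two ramps cannot meet: a valley
    if min_up + min_down < n:
        return "unknown"
    return "valid"


ISP, DEST, HE = 64500, 64501, 6939            # ISP and DEST are placeholders
path = [ISP, HE, DEST]
aspa_before = {ISP: {64510}, DEST: {64511}}   # neither side lists AS 6939
aspa_after = {ISP: {64510}, DEST: {64511, HE}}
state_before = verify_downstream(path, aspa_before)
state_after = verify_downstream(path, aspa_after)
```

With neither side listing AS 6939, the up-ramp from the origin and the down-ramp from the neighbour are each one AS long and cannot cover a three-AS path, so the result is invalid. Once the destination lists AS 6939 as a provider, the ramps meet and the same path verifies as valid.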
So there's the problem. The problem is that of course this is very likely not the only path from here to CZ‑NIC; there are other paths going through the approved providers. But if our ISP is not dropping ASPA invalids, they only share the best path with us, and the best one is going through this peer which is not their provider. So if we drop it, there is no alternative path we could use; we will just end up with no route.
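The effect of seeing only the upstream's best path can be sketched like this. It is a toy model under stated assumptions: best-path selection is reduced to shortest AS path, the ASPA state of each candidate is taken as already computed, and every AS number except 6939 is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    as_path: tuple      # as seen by the upstream ISP, origin last
    aspa_state: str     # assumed to be computed elsewhere


def best_path(paths):
    """Toy BGP decision process: shortest AS path wins."""
    return min(paths, key=lambda p: len(p.as_path), default=None)


def stub_route(upstream_candidates, upstream_drops_invalid, stub_drops_invalid):
    """Return the path the single-homed stub ends up with, or None."""
    candidates = [
        p for p in upstream_candidates
        if not (upstream_drops_invalid and p.aspa_state == "invalid")
    ]
    advertised = best_path(candidates)    # the upstream sends one path only
    if advertised is None:
        return None
    if stub_drops_invalid and advertised.aspa_state == "invalid":
        return None                       # no alternative was ever offered
    return advertised


via_peer = Path((6939, 64501), "invalid")
via_providers = Path((64510, 64511, 64501), "valid")
candidates = [via_peer, via_providers]

only_stub_filters = stub_route(candidates, False, True)
upstream_filters = stub_route(candidates, True, True)
nobody_filters = stub_route(candidates, False, False)
```

When only the stub filters, it ends up with no route at all, although a valid path exists inside the upstream. When the upstream filters, the valid path becomes its best path and reaches the stub.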
So how did we fix it quickly? Well, when I explained to Ondrej what exactly the problem is, he just went to the ASPA portal and added Hurricane Electric to their providers. The ramps are now touching, everything is fine, and a nice experience was that it took only ten minutes from the ASPA portal to our routing information base, so changes in the RPKI are quite fast to propagate. There's a link if you want to see how it looks; a picture is probably better than lots of words.
Anyway, the point of my talk is that yes, ASPA validation is a good idea. You definitely should validate your customers, your peers, your redundant links, all the links where, if you lose the route, you will find an alternative way. If you have a transit of last resort and you start dropping ASPA invalids, you have two options. A: if you drop them, you will end up with no route, and it's your problem to solve because your router is dropping the traffic. B: if you send the data anyway, the worst thing that can happen is that your ISP will drop it because there's some route leak, but that means it's your transit's problem, and you pay your transit to deliver packets; it's the transit of last resort, so that's basically what they are supposed to do. So for me it seems like B is actually the better option here. And in the case of our single-homed network, we basically only have the transit of last resort, our only BGP neighbour.
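One way to read this recommendation as an import policy is sketched below. The function name, the role labels and the `is_last_resort` flag are hypothetical, and the ASPA state is assumed to be computed elsewhere; it only encodes the speaker's option B, not a general best practice.

```python
def accept_route(neighbor_role, aspa_state, is_last_resort=False):
    """Decide whether to import a route, following option B from the talk."""
    if aspa_state != "invalid":
        return True                 # valid and unknown routes are always kept
    if neighbor_role == "transit" and is_last_resort:
        return True                 # option B: forward and let the transit decide
    return False                    # customers, peers, redundant links: drop


# A single-homed stub has only the last-resort transit session.
stub_keeps_invalid = accept_route("transit", "invalid", is_last_resort=True)
# A multihomed network can afford to drop on sessions that have alternatives.
peer_keeps_invalid = accept_route("peer", "invalid")
```

The design choice is that dropping is only applied where losing the route still leaves another way to reach the destination.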
So with that, I have a call to action: go and sign your ASPA, but please keep the data up to date, and validate your customers and peers; for your last resort transit, maybe not. And now I believe I have some time for comments and questions.
(APPLAUSE.)
ANNIKA HANNING: Thank you Ondrej.
ONDREJ CALETKA: I see you are already queueing from the beginning.
ANNIKA HANNING: We also have online participants so we should do this first.
SPEAKER: Somebody needs to grant the microphone to the person of the queue. Who is the Chair of this session? Somebody who is chairing the session in Meetecho has to press the green thing. Or maybe somebody could do it, just giving a tip, yes.
RUDIGER VOLK: OK. Ruediger Volk, retired ‑‑
SPEAKER: 30 seconds, Ruediger.
RUDIGER VOLK: OK, what we are seeing here is kind of when ASPA had one of the final runs of discussion decided that, well, OK, the IPv4 and the IPv6 AS topology doesn't seem to be the same. It looks like the Hurricane Electric thing in the real network has been overlooked, and that's actually the proof that the topologies are not the same and there are consequences to it unfortunately, sorry.
ONDREJ CALETKA: Maybe. I still feel like this is basically an example of a route leak, and it doesn't matter; they can provide IPv4 transit as well, nothing will change conceptually in what happened here.
ANNIKA HANNING: All right. I guess we can maybe take the next question? OK. You queued first and then we will bounce back and forth there.
SPEAKER: Job Snijders, would you mind going back to slide four.
I am a little bit disappointed in this presentation in that I feel that you attribute the problem to the wrong aspect of the graph, and I think spreading information that you should or should not do ASPA verification based on a wrong interpretation is harmful to the project.
In this scenario, the destination either misconfigured their ASPA and resolves it by configuring their ASPA correctly, or they conclude Hurricane Electric is leaking the route and disable the peering with Hurricane Electric. Concluding from either one of those options that the problem is 2121 doing ASPA verification on the route, I think that does not logically follow; the problem really is between Hurricane and the destination.
So telling the audience, hey, we should not have done ASPA verification, I am like no, either the destination should configure the ASPA correctly or Hurricane should not provide free transit without authorisation of the peers whose prefixes they are propagating. But the conclusion that the problem is with the RIPE meeting, no; that is where the symptom manifested, but the problem is on the right side of the graph, and I think framing situations like this needs to be done really carefully in order to not discourage the community from using ASPA for its intended purposes.
And your conclusion that dropping routes that are invalid based on the ROA is not a problem but with ASPA it is, I think that's not a correct conclusion. I could not reach my ROA destination but I don't complain about it because I either need to fix my ROA ‑‑ you know, so why, it's the same. If you can drop ROA invalid routes, you can do the same for ASPA. And the problem is at the remote site; there you must fix the problem, if you consider it a problem.
ONDREJ CALETKA: Yeah, I understand your point, but the biggest difference ‑‑ and that's maybe what I wanted to point out here ‑‑ is, if there is the wrong origin with ROA validation, it's wrong basically, you see it from everywhere on the internet.
SPEAKER: Only from the networks that do validation, it's the same principles.
ONDREJ CALETKA: It's not, because the problem is that there is definitely a path that is authorised between our ISP and the destination, but we will never see it because the non‑authorised one is shorter and is better, and they only send the best one into our network. So we cannot see that there are other paths over which we could actually send the data.
JOB SNIJDERS: I think the points were really.
ANNIKA HANNING: We need to hurry.
JOB SNIJDERS: You are single-homed, taking in a routing table without a default route; you are filtering out things from the routing table and you have a partial routing table as a result, and whether it's ROA based or ASPA based is irrelevant.
ANNIKA HANNING: OK. Maria, you are next. And then I think we have another participant online.
MARIA MATEJKA: I would like to thank you for illustrating one of the problems I have been complaining about in the SIDROPS working group, which is that a wrong ASPA signature is going to wreak havoc on the other side of the world, and we are just lucky that it was CZ‑NIC and not some weird network on the other side of the world, because if that happens, nobody is going to reach them. So I'd like to ask you to kindly write a report about this for the SIDROPS working group.
ONDREJ CALETKA: I will try, thank you.
ANNIKA HANNING: Next is online. Or... well please go ahead.
SPEAKER: I am not sure I was actually in line before Gert, but apologies. I have been thinking about this since Monday and I am happy that you took to the stage and made this presentation, thanks a lot. I must admit my initial thought was OK, so RIPE NCC obviously botched this and they should not try playing ISP, blah blah blah, why do they have... and so on, like people already mentioned. But in hindsight I think this was a very good experiment. And I would also have run into exactly the same problem, because we are in the same boat with these peers that I consider as peers but that provide courtesy or gratuitous transit to them all. So yeah, that's something I have thought about, but this made it more clear that we have to deal with this situation better. And I think, also regarding what Ruediger says, yeah, people are aware that IPv4 and IPv6 topologies are not congruent, but you have to be inclusive now, you simply have to include your IPv6 only transits in your ASPAs. So my question is what do we collectively do with Hurricane? Do we tell them OK, stop pretending to be a transit, you are not a transit, you are a peer, stop announcing our routes?
ONDREJ CALETKA: I think I got the point, let me answer quickly. I don't think this is something specific to Hurricane Electric. It could be any misconfiguration leak and it would be basically the same, and I would just say if it was leaked, it would be dropping... or blackholing them. The worst thing is we would send the packet to our upstream and it would be dropped further on; when we eyed it, we dropped it, and here that was the point of my message: either way the packets would be dropped.
ANNIKA HANNING: I think we still have time for another question.
SPEAKER: Gert Doring, long time IPv6 user. I think explaining this was a very valuable exercise, and I disagree with Job that it's the same as for ROA invalids, because it depends on where in the topology you look and where you do the validation. So if your ISP had done the validation, they would have said this is a leak and just routed around it. With you being all the way down the food chain, doing the validation down there and not having a default route might not have been the best thing to do, but it pointed out the problem, so it was a useful exercise. What we would recommend to stub ASes that do have peerings is to basically accept from the upstream and filter on the peerings, so you don't get the leaks and you have a last resort, because not having connectivity is not what your customers expect. And the Hurricane Electric thing was something useful 20 years ago when we had various IPv6 connectivity. And for at least ten years now, this free service with no SLAs attached, thank you very much, has been causing more harm than good.
ONDREJ CALETKA: Yes, thank you.
ANNIKA HANNING: All right, we still have a bit of time so please, go ahead.
SPEAKER: Speaking with two hats, one of a former Anycast DNS... and two with a hobby network which happens to provide internet to people. So first, there is free transit everywhere and you did the right thing, dropping the path. Is there an IS hashing upstream and is it OK? Is it right to do so but if it's just... and it has to be dropped. And second, I do filter on ASPA on my hobby network and nobody has complained about anything, because no harm is done. So if someone leaks something, I can do something else and still have connectivity. So the issue is not ASPA, it's... and having peering on transit, sorry... so it's not an ASPA issue, it's a design issue.
ANNIKA HANNING: All right.
ONDREJ CALETKA: Fair enough, I don't have a reaction to this.
ANNIKA HANNING: Unfortunately I have to cut this short. Thank you for your participation, it was great. So one thing I want to ask all of you: please rate the talks.
It's important that we get feedback. So please rate the talks. And the second thing I want to announce is yeah, there are still t‑shirts apparently left so if you are thinking maybe yeah, I already had a first t‑shirt, maybe what about a second t‑shirt, so yeah. Thank you all. Thank you Ondrej and enjoy the rest of RIPE.
(APPLAUSE.)
And a big happy birthday to Karla!