Althea Development Update #76: Automatic connection tuning
We've come a long way in the last year, and we now have a much more varied network. Last year we had a pretty simple tree topology, with no loops and no good way to take advantage of the redundancy and routing that Babel really offers.
Now we're exploring what we can do with multihomed users and connections, which brings us to what I was talking about in my last dev update.
As promised, the automatic latency tuner I was working on is out and being tested in production. The diagram above shows a 10x improvement in latency spikes after deploying our new tool.
As far as I'm aware, this is the first system of its kind deployed for the last mile. How it functions is pretty straightforward and, for the time being, flawed but still effective.
Each router passively observes the status of its neighbors' connections by scraping Babel. When the round trip time to a neighbor starts to spike, the router reduces the maximum allowed throughput on that connection by 20% in order to fight bloat.
What this is really tuning is the shaper in the Cake traffic queue discipline. As the maximum bandwidth on the link is reduced, Cake eventually becomes the bottleneck, and it can carefully prioritize traffic across the limited link instead of allowing the link to increase latency and disrupt user experience.
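To make the loop concrete, here is a minimal Python sketch of the idea, not the actual code running on the routers. The 20% backoff comes from the description here; everything else (the `LinkTuner` name, the 2x spike threshold, the moving-average baseline, the 1 Mbps floor) is an assumption made up for illustration. The `tc` command string is only built, never executed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkTuner:
    """Hypothetical per-neighbor tuner; names and thresholds are illustrative."""
    limit_mbps: float                    # current Cake shaper bandwidth
    min_limit_mbps: float = 1.0          # assumed floor so the link is never shut off
    spike_factor: float = 2.0            # assumed: RTT above 2x baseline is a spike
    backoff: float = 0.20                # the 20% reduction from the post
    smoothing: float = 0.1               # assumed weight for the baseline average
    baseline_ms: Optional[float] = None  # learned from non-spike samples

    def observe(self, rtt_ms: float) -> bool:
        """Feed one RTT sample; return True if the limit was reduced."""
        if self.baseline_ms is None:
            # First sample just seeds the baseline.
            self.baseline_ms = rtt_ms
            return False
        if rtt_ms > self.spike_factor * self.baseline_ms:
            # Spike: cut the shaper limit by 20%, but never below the floor.
            reduced = self.limit_mbps * (1 - self.backoff)
            self.limit_mbps = max(self.min_limit_mbps, reduced)
            return True
        # Only non-spike samples are folded into the baseline.
        self.baseline_ms += self.smoothing * (rtt_ms - self.baseline_ms)
        return False

    def cake_command(self, iface: str) -> str:
        """Build (but do not run) a tc command applying the current limit."""
        kbit = int(round(self.limit_mbps * 1000))
        return f"tc qdisc replace dev {iface} root cake bandwidth {kbit}kbit"
```

Feeding it a few calm samples followed by one large RTT reading drops a 100 Mbps limit to 80 Mbps, and `cake_command("eth0")` then yields the matching shaper setting. A real tuner would also need some way to raise the limit again once latency settles, which this sketch leaves out.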
There are several future improvements to be made to the design of this system, mostly around reducing the rate of false positives by observing things like how much traffic is actually traversing the link at any given time and by working with unidirectional latency rather than round trip time.
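One way the traffic-aware check might look, sketched in Python. This is only a guess at the shape of the improvement, not something that has been built: the `should_throttle` function and the 50% utilization threshold are hypothetical.

```python
def should_throttle(rtt_spiking: bool, observed_mbps: float, limit_mbps: float,
                    min_utilization: float = 0.5) -> bool:
    """Treat an RTT spike as bloat only if the link is carrying real traffic.

    min_utilization is an assumed threshold: below it, a spike is more
    likely to be noise than a full queue, so the limit is left alone.
    """
    if limit_mbps <= 0:
        return False
    utilization = observed_mbps / limit_mbps
    return rtt_spiking and utilization >= min_utilization
```

With these assumed numbers, a spike on a link moving 80 of its 100 Mbps would trigger a reduction, while the same spike on a link moving only 5 Mbps would be ignored as a likely false positive.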
But for the time being it's already providing a nice improvement for users.
Gateway-client corner case billing issues
The billing issue I talked about last week has been identified as a variation on the gateway-client corner case, an edge case that has plagued us for quite some time.
It comes up when a user wants to run Althea as a standalone device on existing backhaul by being a gateway, a relay, and a client all at the same time. This is problematic to handle for a number of reasons: the gateway must meet the somewhat conflicting goals of sending all user traffic over the encrypted exit connection while still relaying packets for other routers to the exit using the backhaul directly.
At some point during this forwarding process the gateway counts its own client traffic as a billing credit and throws the whole system off. This is pretty easy to fix by simply not using gateways as clients, something that's already universally true in our actual deployments.
Most people don't spend their time plugged directly into a server in the back room.
We just need to finish unravelling some other technical debt and we'll completely remove this corner case from v10 of the firmware.
What's new in Beta 9?
- Automated detection and resolution of latency spikes
- Support for multiple LAN ports
- Support for WAN ports with static IPs
- Logging is now easily configurable from the dashboard
- Fixes for the /neighbors endpoint
- Dependency updates for Clarity and Rita
- Reduced image size by excluding trace level logging messages