Althea Development Update #53: Timelines and building rock solid software

I’m happy to say that we’ve brought on board 3 new developers since our last update. We’ve also expanded the marketing team to add some flash to our substance.

On the subject of development, we’re merging some of our longer-term feature branches while bringing these new developers on board, so Alpha 7 will be a bit late.

Speaking of releases, here’s a rough timeline for features.

Things coming in Alpha 7

  • Exit signup refactor: It turns out there was a lot of work between a mediocre sign up solution and something we could expose to the world without concern. The new exit signup procedure is actually sufficient to verify users are human and is generally more robust to failure.
  • Live toggling for WAN ports: One of our favorite demos is to hook up a few routers and start moving connections around, quickly demonstrating the indestructibility of the mesh routing. But there was one cable you couldn't move: the gateway WAN ports used to bridge traffic out of the mesh and onto the internet were perilously finicky. If they were not set up perfectly on router start, any attempt to use them would fail. Now you can add, remove, or change the WAN connection at any time without issues.
  • Off-chain payments code: Our first valid ETH transaction back and forth is on track for Alpha 8, and we hope to be testnet ready by Alpha 9 (that’s early September for those paying attention).
  • On-chain payments code: The contract we want to use for our payment channel is already complete, but integration and the resulting testing will probably take a while. I’m estimating it will be Alpha 10 before ‘on testnet’ usage stops involving devs constantly bailing out routers that are burning through test coins.
  • Peer discovery refactor: Rust is a great language when its safety properties are used well. I’ve been going back over a lot of the original peering code with a fine-tooth comb, with the goal of higher day-to-day stability. I’m hoping we’ll see this in Alpha 8.
  • Simplified API endpoints: Early on with the router dashboard we used a /settings API to do just about everything. This was great in that it was super powerful and very quick to make, but long term we would like to avoid posting highly specific structs from the frontend. It’s a recipe for tight coupling and spaghetti code. We’ve already added simplified endpoints for exit signup and we hope to have everything else soon.
  • Improved and far more helpful dashboard: Right now the dashboard is quite bare. We brought on another front end developer to really polish it from something with the right buttons to something legitimately helpful in setting up and debugging networks.
  • OpenWRT version freeze: We keep getting caught by surprise by OpenWRT master breakage. Building off of master was fine earlier in our development cycle, but it’s past time to freeze on a commit hash that we test before we update. This way we can simply not update when the latest OpenWRT build makes a major device brick itself.

What does ‘stability’ mean for Althea?

I’ve been talking about stability a lot, which might give the impression that our Alpha releases are totally broken. That’s only partially true. While features are often incomplete, my personal router has been untouched since Alpha 5. It has completed multiple update cycles without the slightest bit of attention.

If Althea were meant to be used only in low radio noise environments with a direct fiber line for backhaul, like my apartment, we would probably consider stability a done deal.

Maybe half of the deployments we’re working with are in technologically modern areas with accessible backhaul fiber, where internet access for consumers is simply too expensive.

The other half live in some of the more challenging networking environments in the world. For these people it’s often not that internet access is overpriced, but that it’s simply not sold or is of such poor quality as to be unusable.

Ultra-high latency, reordered packets, flaky connections, and protocol non-compliant devices litter these sorts of environments.

Somewhere in between California and Australia there’s an ISP that will buffer Wireguard packets and play them back out of order 10 seconds later. All while prioritizing ICMP, HTTP, and HTTPS, making the issue almost impossible to detect.

From the perspective of a deployment in Australia this was Althea just not working.
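Part of what makes this kind of failure hard to see is that ordinary pings sail through while the tunnel traffic gets shuffled. As a minimal sketch of how reordering could be measured, assume a hypothetical probe that sends numbered packets along the same path as the tunnel traffic and records the order they arrive in. The function name and the probe itself are illustrative, not something Althea ships.

```python
def count_reordered(arrival_sequence):
    """Count packets that arrived after a packet with a higher sequence number.

    `arrival_sequence` is the list of sequence numbers in the order the
    (hypothetical) probe packets were received.
    """
    highest_seen = -1
    reordered = 0
    for seq in arrival_sequence:
        if seq < highest_seen:
            # A later-numbered packet already arrived, so this one is late.
            reordered += 1
        else:
            highest_seen = seq
    return reordered


# In-order delivery reports nothing; a shuffled stream is flagged.
in_order = count_reordered([0, 1, 2, 3, 4])
shuffled = count_reordered([0, 3, 1, 2, 4])
```

A probe like this only helps if its packets are treated the same way as the real tunnel traffic, which is exactly what ICMP was not in this case.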

Our first outage report for the Clatskanie test network was caused by supposedly IPv6-compliant point-to-point links suddenly generating spikes of 300ms latency, versus their usual 10ms, when exposed to pings to ff02::1.

From the perspective of the Clatskanie network this was Althea just not working.

Slack and many other websites firewall off TCP MSS autonegotiation. This means they hardcode the segment size of their traffic to fit a standard 1500 byte frame. It’s extra insidious because protocol-compliant websites will work flawlessly and you’ll sit around wondering what’s going on.

From the perspective of any user of that build this was Althea just not working.
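To see why a hardcoded segment size hurts, it helps to run the arithmetic. The sketch below assumes the traffic crosses a WireGuard tunnel over a 1500 byte link and uses the standard header sizes with no IP or TCP options. The function names are made up for illustration, and the exact MTUs on any given Althea link are an assumption.

```python
# Header sizes in bytes, assuming no IP or TCP options.
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20
UDP_HEADER = 8
WIREGUARD_HEADER = 32  # 4 type + 4 receiver index + 8 counter + 16 auth tag


def tunnel_mtu(link_mtu, outer_ipv6=False):
    """Largest inner packet that fits in one tunnel packet on the link."""
    outer_ip = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return link_mtu - outer_ip - UDP_HEADER - WIREGUARD_HEADER


def max_segment_size(mtu, inner_ipv6=False):
    """Largest TCP payload that fits in one packet of the given MTU."""
    inner_ip = IPV6_HEADER if inner_ipv6 else IPV4_HEADER
    return mtu - inner_ip - TCP_HEADER


def fits(segment_size, mtu, inner_ipv6=False):
    """True if a TCP segment of this size fits without fragmentation."""
    return segment_size <= max_segment_size(mtu, inner_ipv6)


# A site that assumes a plain 1500 byte path sends 1460 byte segments.
hardcoded_mss = max_segment_size(1500)
# Inside a tunnel carried over IPv6, only 1420 bytes are left per packet.
inner_mtu = tunnel_mtu(1500, outer_ipv6=True)
segment_fits = fits(hardcoded_mss, inner_mtu)
```

Under these assumptions the hardcoded 1460 byte segments are 80 bytes too large for the tunnel, while a site that honors MSS negotiation would drop to 1380 bytes and work fine.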

I think I’ve made my point. The actual ‘mesh networking’ is just the beginning. If we really want to make a tool that can withstand the sort of environments where people need Althea most, it’s not enough for it to work; it has to be indestructible.