> We have no option but to build a new circuit and we don’t want that.
That's true, you never want to rebuild the circuit. But it strikes me that the idea that this is avoidable falls into at least two of the Eight Fallacies of Distributed Computing[1], namely "The Network Is Reliable" and "Topology Doesn't Change".
If we instead assume that the network isn't reliable and that the topology does change, then instead of eliminating unreliable nodes and being conservative with topology changes, we would focus on reducing the cost of rebuilding a circuit, so that network unreliability and topology changes aren't disastrous.
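To make that concrete, here's a minimal sketch of what cheap rebuilds might look like on the client side. All the names here are hypothetical (build_circuit stands in for whatever actually constructs a three-hop circuit); this is an illustration of the idea, not Tor's real API:

    class CircuitManager:
        """Keep spare circuits pre-built so a failure means a cheap
        swap to a warm spare, not a blocking rebuild."""

        def __init__(self, build_circuit, num_spares=2):
            # build_circuit() is a hypothetical stand-in for real
            # circuit construction.
            self._build = build_circuit
            self._spares = [build_circuit() for _ in range(num_spares)]
            self.active = build_circuit()

        def on_circuit_failure(self):
            # Promote a pre-built spare immediately...
            self.active = self._spares.pop(0)
            # ...then top the pool back up. A real client would do
            # this asynchronously; it's synchronous here for brevity.
            self._spares.append(self._build())
            return self.active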
But it sounds like the Tor team has instead decided to bolster these assumptions so that they hold as nearly as possible: making the network as reliable as they can and letting the topology change as little as possible.
I don't mean this to be a harsh criticism of the Tor team. I'm an outsider, and beyond an uncompromising privacy constraint, I don't know all the constraints Tor was built under. I'm sure the tradeoffs made by the Tor team make sense within the context of their constraints. Obviously, the Tor network works well enough to have a large user base, so they have provided a good-enough solution.
But I wonder if changes could be made to Tor's design in the future that would allow nodes to be added and removed more quickly and would handle network reliability issues more gracefully, so that Tor would be faster.
One possibility that stands out to me is to pool circuits and load-balance between them, so that if one circuit begins to have issues, you are still connected over the other circuits while you build a replacement for the unreliable one. This could run into issues where an adversary correlates traffic from different circuits to unmask clients, so you'd have to be careful, but I'm not sure those problems would be insurmountable.
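As a sketch of what that pooling could look like (again with hypothetical names, and ignoring the correlation question, which is the genuinely hard part):

    import itertools

    class CircuitPool:
        """Spread traffic round-robin over several live circuits and
        replace any circuit that starts failing, so a single bad
        circuit never takes the client offline."""

        def __init__(self, build_circuit, size=3):
            self._build = build_circuit
            self._circuits = [build_circuit() for _ in range(size)]
            self._next = itertools.cycle(range(size))

        def pick(self):
            # Naive round-robin; a real balancer might weight
            # circuits by measured latency or recent error rate.
            return self._circuits[next(self._next)]

        def replace(self, bad_circuit):
            # Swap the unreliable circuit out in place while the
            # others keep carrying traffic.
            i = self._circuits.index(bad_circuit)
            self._circuits[i] = self._build()

The point is just that a failing circuit becomes a background replace() rather than a user-visible stall.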
But remember: what Tor is doing is hard. They are doing complex crypto, networking, security... the hard stuff, the real stuff. The Tor Project is a nonprofit organization with limited resources. They are doing their best. It took three years to design and implement DoS mitigation techniques, for example.
Your proposed plan could take over 10 years, even for a well-funded corporation. It might take all that time and still fail. It might create huge vulnerabilities due to code complexity. AFAIK, Tor can't risk that.
[1] https://en.wikipedia.org/wiki/Fallacies_of_distributed_compu...