It would be great to tie this into a release pipeline, where the release process is actively keeping an eye on failure rates of that service, so that bad deploys could be halted or rolled back automatically.
I was thinking this could work really well when using production integration tests. A percentage of that traffic can be dynamically routed to the newly running services, allowing the release pipeline to ensure the service is functioning correctly before routing any real users.
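The gating logic described above can be sketched in a few lines. This is just an illustration of the idea, not anything Conduit ships — the function names, thresholds, and ramp step are all hypothetical:

```python
# Hypothetical canary gate a release pipeline might run against
# topline metrics (e.g. scraped from the mesh's proxies).

def should_promote(canary_rate: float, baseline_rate: float,
                   max_regression: float = 0.01) -> bool:
    """Promote the canary only if its success rate hasn't regressed
    more than max_regression below the stable baseline."""
    return canary_rate >= baseline_rate - max_regression

def next_traffic_weight(current: float, step: float = 0.1) -> float:
    """Ramp the share of traffic sent to the canary in steps,
    capped at 100%."""
    return min(1.0, current + step)

# A canary at 99.2% vs a 99.5% baseline is within a 1% error budget:
assert should_promote(0.992, 0.995)
# A canary at 95% against the same baseline should trigger rollback:
assert not should_promote(0.95, 0.995)
```

A real pipeline would loop: shift a slice of traffic, wait for enough requests to be statistically meaningful, check the gate, then either ramp with `next_traffic_weight` or roll back.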
Oh nice, it's cool to get access to dependency info without having to implement a full distributed tracing mechanism that is invasive to the service components.
Breaking down service success rate by inbound dependency is great for debugging many typical fault conditions.
Thanks! Yeah, that's the danger of putting up a public roadmap with dates on it. We've seen a little production usage that uncovered a couple issues that we wanted to address quickly, and that shifted the timeline a bit. But we'll get there.
Conduit is a tool for solving actual, engineer-facing operational problems with as little complexity as possible, and it happens to be a service mesh.
Istio is a Big Important Service Mesh for Big Important People that does everything under the sun, and none of it well.
Neither project has real production adoption yet. (For that, you have to look at Linkerd). Istio will get adoption by spending infinite marketing dollars. Conduit will get adoption by solving actual problems.
Conduit's also significantly faster and smaller: sub-millisecond p99 latencies and a ~2 MB RSS footprint per proxy. We took some big risks initially with Rust, but it's paid off handsomely.
Did I mention focused on solving real problems with a minimum of fuss?
If you can't get topline metrics dashboards for every service you're running in Kubernetes within 60 seconds of installing Conduit, I will literally* send a team of engineers to your house to fix it right in front of your face.
How important is automatically getting Grafana dashboards, though? I'll grant it's pretty awesome for beginners, or for someone unfamiliar with Grafana, and I'm a huge fan of the functionality. But I'm not convinced it should be the feature that differentiates the two solutions.
The dashboards are a detail, but visibility into topline service metrics is absolutely critical. There's a huge difference between "I need to configure a bunch of stuff and do some complicated things first" and "I get them automatically".