It would be great to tie this into a release pipeline, where the release process is actively keeping an eye on failure rates of that service, so that bad deploys could be halted or rolled back automatically.
I was thinking this could work really well when using production integration tests. A percentage of that traffic can be dynamically routed to the newly running services, allowing the release pipeline to ensure the service is functioning correctly before routing any real users.
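The gating logic described above can be sketched in a few lines. This is just an illustration of the idea, not anything Conduit ships — the function names, thresholds, and ramp step are all hypothetical:

```python
# Hypothetical canary gate a release pipeline might run against
# topline metrics (e.g. scraped from the mesh's proxies).

def should_promote(canary_rate: float, baseline_rate: float,
                   max_regression: float = 0.01) -> bool:
    """Promote the canary only if its success rate hasn't regressed
    more than max_regression below the stable baseline."""
    return canary_rate >= baseline_rate - max_regression

def next_traffic_weight(current: float, step: float = 0.1) -> float:
    """Ramp the share of traffic sent to the canary in steps,
    capped at 100%."""
    return min(1.0, current + step)

# A canary at 99.2% vs a 99.5% baseline is within a 1% error budget:
assert should_promote(0.992, 0.995)
# A canary at 95% against the same baseline should trigger rollback:
assert not should_promote(0.95, 0.995)
```

A real pipeline would loop: shift a slice of traffic, wait for enough requests to be statistically meaningful, check the gate, then either ramp with `next_traffic_weight` or roll back.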
Oh nice, it's cool to get access to dependency info without having to implement a full distributed tracing mechanism that is invasive to the service components.
Breaking down service success rate by inbound dependency is great for debugging many typical fault conditions.
Thanks! Yeah, that's the danger of putting up a public roadmap with dates on it. We've seen a little production usage that uncovered a couple issues that we wanted to address quickly, and that shifted the timeline a bit. But we'll get there.
Conduit is a tool for solving actual, engineer-facing operational problems with as little complexity as possible, and it happens to be a service mesh.
Istio is a Big Important Service Mesh for Big Important People that does everything under the sun, and none of it well.
Neither project has real production adoption yet. (For that, you have to look at Linkerd). Istio will get adoption by spending infinite marketing dollars. Conduit will get adoption by solving actual problems.
Conduit's also significantly faster and smaller: sub-millisecond p99 latencies and a ~2 MB RSS footprint per proxy. We took some big risks initially with Rust, but it's paid off handsomely.
Did I mention focused on solving real problems with a minimum of fuss?
If you can't get topline metrics dashboards for every service you're running in Kubernetes within 60 seconds of installing Conduit, I will literally* send a team of engineers to your house to fix it right in front of your face.
How important is automatically getting Grafana dashboards, though? I'll grant it's pretty awesome for beginners, or for someone unfamiliar with Grafana, and I'm a huge fan of the functionality. But I'm not convinced it should be the feature that differentiates the two solutions.
The dashboards are a detail, but visibility into topline service metrics is absolutely critical. There's a huge difference between "I need to configure a bunch of stuff and do some complicated things first" and "I get them automatically".