Hi Jacob. I am one of the founders of Okteto (https://okteto.com/), a remote development platform for Compose and Kubernetes applications. We use Syncthing to sync code between the developer laptop and pods running in Kubernetes. I would love to know your thoughts on the strengths and weak points of Mutagen vs Syncthing for this use case.
Thanks!
Sure, that's a great question. I'll preface my response by saying that I'm a huge fan of Syncthing (and that Mutagen's use cases form a Venn diagram with those of Syncthing (as well as tools like rsync)). Technologically, all three of these tools are very similar in terms of using the rsync differential transfer algorithm, but their architectures and primary use cases differ. I think the core differentiators with Mutagen are:
Development-oriented: Mutagen's sync configuration is primarily focused on development, so it adds more granular controls for things like uni-/bi-directionality, conflict resolution, ignores, symbolic link handling, etc. It also has a permission propagation model that's focused on things like cross-platform executability propagation and preservation (i.e. between Windows and POSIX), as well as operating in multi-service environments where many different process UIDs/GIDs might be in play. It also handles weird filesystem quirks (like macOS Unicode decomposition).
Low-latency: Mutagen's goal is to reduce the latency of sync cycles (i.e. the time from a local edit to the change being reflected on the remote) to an imperceptible level. It uses a lot of tricks to do this, but the core idea is to use the best filesystem watching mechanism available on each platform and to integrate it tightly with the sync loop. On Linux, for example, Mutagen is now starting to experiment with the recently revamped fanotify[0] API to get highly scalable but low-latency watching (as opposed to the janky recursive-watching emulation on top of inotify that most tools use). It also uses tricks like rsync-diffing the metadata snapshots that it transfers to keep latency as low as possible. The eventual goal is sub-100ms sync cycles for multi-GB codebases, and I think that's pretty close.
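To make the snapshot-diffing idea concrete, here is a toy sketch (not Mutagen's actual wire format or implementation, which diffs serialized snapshots with the rsync algorithm): rather than shipping the full metadata snapshot every cycle, only the entries that changed since the last cycle cross the wire.

```python
# Illustrative sketch: send only the snapshot entries that changed,
# instead of retransmitting the whole {path: metadata} map each cycle.

def snapshot_delta(previous, current):
    """Compute a minimal delta between two {path: metadata} snapshots."""
    added = {p: m for p, m in current.items() if p not in previous}
    removed = [p for p in previous if p not in current]
    modified = {p: m for p, m in current.items()
                if p in previous and previous[p] != m}
    return {"added": added, "removed": removed, "modified": modified}

def apply_delta(previous, delta):
    """Reconstruct the current snapshot from the previous one plus a delta."""
    result = dict(previous)
    for p in delta["removed"]:
        del result[p]
    result.update(delta["added"])
    result.update(delta["modified"])
    return result

old = {"main.go": ("0644", 120), "util.go": ("0644", 80)}
new = {"main.go": ("0644", 130), "api.go": ("0644", 40)}
delta = snapshot_delta(old, new)
assert apply_delta(old, delta) == new
```

For a large codebase where only a handful of files change per edit, the delta stays tiny even when the full snapshot has millions of entries, which is what keeps per-cycle latency low.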
Git-like sync: Conceptually, Mutagen's sync algorithm is like a filesystem watcher + a repeated three-way Git merge (with the differences being that file transfers are deltified and the merge (potentially) affects both endpoints). This means it tracks content in a manner very similar to Git's content-addressable storage and branches, which is a little different from the way Syncthing does it. In my opinion, this affords more precise identification of conflicts.
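The three-way merge idea can be sketched in a few lines. This is a deliberate simplification of what Mutagen actually does (no deltified transfers, no nested directory entries), just enough to show how comparing each side against a shared ancestor snapshot separates one-sided changes from true conflicts:

```python
# Toy three-way reconciliation sketch (simplified, not Mutagen's code):
# each snapshot maps path -> content hash; a path absent from a
# snapshot simply isn't in the dict (deletions show up as None below).

def reconcile(ancestor, alpha, beta):
    """Return (changes to apply to alpha, changes for beta, conflicts)."""
    to_alpha, to_beta, conflicts = {}, {}, []
    for path in set(ancestor) | set(alpha) | set(beta):
        anc = ancestor.get(path)
        a, b = alpha.get(path), beta.get(path)
        if a == b:            # both sides agree; nothing to do
            continue
        elif a == anc:        # only beta changed: propagate to alpha
            to_alpha[path] = b
        elif b == anc:        # only alpha changed: propagate to beta
            to_beta[path] = a
        else:                 # both sides changed divergently: conflict
            conflicts.append(path)
    return to_alpha, to_beta, conflicts

anc   = {"a.txt": "h1", "b.txt": "h2"}
alpha = {"a.txt": "h1", "b.txt": "h3"}   # alpha edited b.txt
beta  = {"a.txt": "h4", "b.txt": "h5"}   # beta edited both files
to_alpha, to_beta, conflicts = reconcile(anc, alpha, beta)
assert to_alpha == {"a.txt": "h4"}       # beta's one-sided edit propagates
assert conflicts == ["b.txt"]            # divergent edits are flagged
```

Because the decision is made per-path against the recorded ancestor, a conflict is only reported when both endpoints genuinely diverged, rather than whenever two replicas merely look different.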
Distrust: Mutagen takes a more aggressive approach to mutual distrust between endpoints, working hard to ensure that a malicious endpoint can't read outside the synchronization root on the other endpoint via symbolic links or maliciously crafted paths. It does this by using POSIX *at functions to traverse the filesystem and perform operations. This avoids issues like CVE-2017-1000420. You can harden this further by using unidirectional sync and other configuration options. This makes it well-suited to cases where multiple users might be syncing to file storage on a shared system (say on a SaaS platform) (though, at least in that case, you can protect yourself and users with the filesystem namespacing afforded by containers).
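A rough sketch of what *at-style traversal buys you (Mutagen itself is written in Go and uses the raw openat/mkdirat/etc. family; Python exposes the same mechanism via the `dir_fd` parameter): every path component is opened relative to its already-validated parent with O_NOFOLLOW, so a symlink planted inside the sync root can't redirect the walk outside of it.

```python
import os

# Sketch of race-free, symlink-safe traversal using openat-style calls.
# A real implementation would also reject ".." components and absolute
# paths; this sketch only shows the O_NOFOLLOW + dir_fd mechanism.

def open_inside_root(root, relative_path):
    """Open relative_path beneath root, refusing to follow symlinks."""
    fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
    try:
        parts = relative_path.split("/")
        for component in parts[:-1]:
            # Open each intermediate directory relative to its parent fd;
            # O_NOFOLLOW makes a symlinked component fail with ELOOP.
            next_fd = os.open(component,
                              os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                              dir_fd=fd)
            os.close(fd)
            fd = next_fd
        # Open the final component the same way, relative to its parent.
        return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=fd)
    finally:
        os.close(fd)
```

The point of doing this per-component (rather than calling `realpath` once up front) is that it's race-free: there's no window between a check and the actual open in which the other endpoint could swap a directory for a symlink.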
One-sided installation and flexible topology: Mutagen's primary M.O. is injecting small "agent" binaries to remote systems via a copy mechanism (such as `scp` or `docker cp`), so you don't have to manually install it on both endpoints. This is less important to "full stack" cases like Okteto, where your tooling can handle the setup of Syncthing on the remote, but it makes working directly over SSH or in ephemeral containers significantly more convenient. And Mutagen's architecture is also really flexible, allowing it to sync files and forward traffic between any combination of local and remote endpoints (including remote-to-remote, proxied via the local Mutagen daemon).
Command-based transports: Mutagen uses the standard I/O streams of commands like `ssh` and `docker exec` as its transport (similar to tools like Git or rsync), making it easier to target remote environments with your existing tooling and configuration. Again, this is less of an issue for a case like Okteto's, but is useful in the standalone case.
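The transport pattern is easy to illustrate: spawn the command, and treat its stdin/stdout as the byte stream, exactly as with `ssh host agent` or `docker exec -i container agent`. In this self-contained sketch, `cat` stands in for the remote agent process (the class and method names are made up for illustration):

```python
import subprocess

# Sketch of a command-based transport: the child process's standard I/O
# streams become the connection, so any command that can reach the
# remote environment (ssh, docker exec, kubectl exec, ...) works.

class CommandTransport:
    def __init__(self, argv):
        self.proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE)

    def send(self, data):
        self.proc.stdin.write(data)
        self.proc.stdin.flush()

    def receive(self, n):
        return self.proc.stdout.read(n)

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()

transport = CommandTransport(["cat"])   # stand-in for ssh / docker exec
transport.send(b"ping")
assert transport.receive(4) == b"ping"  # `cat` echoes the bytes back
transport.close()
```

Because the transport is just "a command with pipes", it automatically inherits whatever authentication, config, and jump-host setup your existing `ssh`/`docker` tooling already has.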
Network forwarding: This is outside the scope of sync, but Mutagen offers OpenSSH-style TCP/UDS forwarding (with the difference from OpenSSH being that Mutagen's forwarding is persistent and managed by a background daemon). This offers support for doing things like forwarding a local socket to a remote Docker daemon over SSH, and then forwarding web application traffic over that underlying forwarding by using Mutagen over `docker exec` (or reverse forwarding, or forwarding between two remotes and bridging them via your laptop, or loads of other crazy shenanigans).
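The core of any such forwarder is a small relay loop; here's a minimal TCP-to-TCP sketch of that general shape (plain sockets only; Mutagen's actual forwarding runs this kind of relay over its daemon-managed transports and also supports UDS endpoints):

```python
import socket
import threading

# Minimal TCP forwarder sketch: accept local connections and relay
# bytes bidirectionally to a destination address. A persistent
# forwarding daemon is essentially this loop plus reconnection logic.

def forward(listener, dest_addr):
    def relay(src, dst):
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
        dst.close()

    while True:
        client, _ = listener.accept()
        upstream = socket.create_connection(dest_addr)
        # One relay thread per direction.
        threading.Thread(target=relay, args=(client, upstream),
                         daemon=True).start()
        threading.Thread(target=relay, args=(upstream, client),
                         daemon=True).start()
```

Chaining forwarders (e.g. laptop → SSH → Docker socket → container) is then just composing relays, with each hop's transport hidden behind the same byte-stream interface.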
I hope that clarifies things a bit. Ping me via email if you want an expanded comparison.
It should be possible, especially since Mutagen is already built for almost all of Go's supported architectures. The biggest issue would just be implementing the race-free filesystem traversal that's used on Windows and POSIX (or potentially living without it given that WASM would probably provide sufficient sandboxing). In any case, most of the work would take place in Mutagen's `filesystem` package, where those syscall equivalents would have to be figured out. The rest of Mutagen should compile without modification. You'd also need to define a transport for Mutagen to use, which would depend on the exact mode of operation you're looking at, but that's the easier problem to solve. Mutagen v0.15 is going to be focused on extensibility (including custom transports), so something like this may become a reality soon.
The Okteto CLI lets you develop inside a container, whether it's running locally or in a remote cluster.
The main advantages of developing inside containers are:
- replicability: development containers eliminate the need to install your dependencies locally; everything is pre-configured in your development image.
- fast inner-loop development: native builds inside your development container are faster than the docker build/redeploy cycle.
- fewer integration issues: your development container reuses the same variables, secrets, sidecars, volumes, etc. as your original deployment.
Those are really good questions; we get similar requests from our users. We don't have general answers yet, but let me explain a few solutions we have in place.
Data: data is hard. Some teams work around the problem with fixtures and a database per developer. We also have a PoC for cloning namespaces and their data using Velero (this way, you can clone staging into your own developer namespace). And some teams share a single dev database across all their frontend developers...
Service dependencies: we push teams to define their dependencies using one or more Helm charts. This way, developers have a one-click experience to deploy the full stack in their namespaces. Once the app is running, developers use the Okteto CLI to put any service into dev mode and start syncing their local code changes. Some parts of the stack might be shared by all the developers in a common namespace.
Third-party APIs: this is also a wide spectrum, but our experience says it is easier to integrate them at the cluster level than in every developer workstation.
You can achieve good isolation with a combination of RBAC, network policies, OPA rules, and runtimes like gVisor. We also monitor for suspicious activity with Falco.
For deeper security, we offer Okteto Teams, which runs on a dedicated cluster, and Okteto Enterprise, which runs on your own cluster.
One of the advantages of using Kubernetes over a remote VM (the approach followed by companies like Stripe, Slack, Ericsson, and many more) is that Kubernetes is very efficient at allocating resources: idle dev environments don't consume resources, they can be scaled to zero and restarted in a few seconds, and the same infrastructure is shared by your entire team.
Sorry if you find it patronizing, I tried to describe the problem in a general way to make it interesting for the general reader, independently of how we solve the problem.
Okteto, in particular, does not impose any IDE. It works with online IDEs, VS Code, VS Code Remote, IntelliJ, IntelliJ remote... this is one of our core values. What we move to the remote cluster is the development environment runtime.
That really depends on the size of your app. You also lose the ability to run debuggers. Not to mention how hard it is to run serverless Java applications, and Java is still the dominant enterprise programming language.
Don't get me wrong, I am a big fan of serverless. We use it in production too. But I also think it does not cover a large percentage of the deployment spectrum, at least for a few years. And serverless can also benefit from tools like Okteto.
Docker setup that allows me to test my functions locally - I have a wrapper that simulates the handoff between AWS ALB and AWS Lambda.
When ready, I merge into master, push, and they deploy to AWS through CI - I have simple Python scripts that manage Lambda Layer creation, environment variable configuration, etc.
Right, that works. I think that setup doesn't scale well for large teams, or if you are not able to run all your services locally due to resource constraints.
There are challenges with this model for sure. A big advantage is that, by routing to the dev environments, developers can interact directly with the webhook as they code, instead of having to wait for a shared environment to test integrations. This is pretty useful, especially if your application depends heavily on webhooks for its main functionality.
This is an area I'm particularly interested in. We are constantly experimenting with ways to better integrate it into our idea of remote development environments.
> you can have the dev interact directly with the webhook when they code
But how? A webhook needs to call a single endpoint (with a dns/ip pair). How can you route incoming webhooks from a 3rd party vendor to every dev machine running the service locally?
Weird that people downvoted my previous comment on this; it's a genuine problem with this "DTAP on your machine" setup, isn't it?
1. Register each dev env with the provider. Dev envs can have predictable URLs (e.g. in Okteto Cloud it's the name + namespace + domain), so you can register them directly. This works well with self-service webhooks like GitHub's.
2. Have a proxy that routes to the right dev env. This works if there's a key you can use for the routing, like a user ID or the subscription.
3. Have a proxy that duplicates traffic and sends it to all the dev envs.
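Option 2 is mostly a lookup table in front of the dev environments. Here's a minimal sketch of the routing decision (the header-free payload shape, the `account_id` key, and all hostnames are made up for illustration; a real proxy would also verify webhook signatures before routing):

```python
# Sketch of a webhook-routing proxy's core decision: pick a destination
# dev environment from a key embedded in the incoming webhook payload.

ROUTES = {
    "acct_alice": "https://api-alice.dev.example.com",
    "acct_bob":   "https://api-bob.dev.example.com",
}
DEFAULT = "https://api-staging.example.com"  # fallback shared env

def route_webhook(payload):
    """Return the destination URL for this webhook's routing key."""
    key = payload.get("account_id")
    return ROUTES.get(key, DEFAULT)

assert route_webhook({"account_id": "acct_alice"}) == ROUTES["acct_alice"]
assert route_webhook({"account_id": "unknown"}) == DEFAULT
```

The hard part in practice isn't the lookup but finding a key that's stable and present in every webhook the provider sends; when there isn't one, you fall back to option 1 or 3.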
I'm curious how you solve this in the DTAP env. Are you registering a single endpoint with the webhook? If so, how does it reach the separate dev envs? (Or do you have a single, shared dev env?)