Feature flags in a CI pipeline

_t4za · on Jan 18, 2023

I've done something like this with CircleCI and Github labels. The Continuation orb[1] allows you to dynamically generate your CI yml rather than using a static workflow. A Python script is executed which uses information about the PR (author, timestamp, labels applied to the PR, etc) and a CI template yml to add, remove, or skip jobs from the workflow. Some use cases are:

  - Replacing a job with a newer job for testing purposes, as described in the article.
  - Opting out of jobs, like "skip-deploy" or "skip-unit" labels, to speed up CI in some cases when these jobs aren't needed.
  - Making some optional jobs required depending on the files that have been modified in the PR. You can get a list of modified files in a PR using the Github API. We use this for example if a new feature necessitates a new CI job but you want this job to be optional for branches that don't have this feature code yet.

The additional complexity can make this a non-starter for some teams. In our case these use cases are more convenient for us than confusing for the eng team.

1: https://circleci.com/developer/orbs/orb/circleci/continuatio...

dottedmag · on Jan 18, 2023

It looks like an additional tool and a new remote service to talk to. Are there any CIs that don't support editable variables on scopes larger than a build?

The article talks about GH Actions, they have org/repo/env variables.

Even Jenkins has variables usable from the build.

pondidum · on Jan 18, 2023

Sure you can use the environment variables, and if you have simple uses cases this is probably the best way to go.

In this case, we're using LaunchDarkly's targeting to serve different variations to groups of users, and also using the in built monitoring LaunchDarkly provides to see how often each is requested etc.

hnbad · on Jan 18, 2023

It makes sense if you already use a service like LaunchDarkly but adding a whole new service like LaunchDarkly to your stack seems overkill for most people using feature flags. It took me a long time to understand how to work with feature flags in projects that don't require all the monitoring and overhead these services provide because all videos and articles I could find on the subject always framed them in terms of using a service like that.

At the end of the day, a feature flag is just a boolean variable determined at run time that can be used to switch code paths on or off. For most purposes that can be as simple as having a list of strings on your user model and checking if it contains a specific value. Adding a service can be valuable if you need to be able to manage those flags more intelligently or monitor them or want to do more structured A/B testing or whatever, but always tying the concept to a commercial service just obfuscates what it actually is and does.

WorldMaker · on Jan 18, 2023

Yes, the somewhat unfavorable view of LaunchDarkly is that it is If-Condition-As-A-Service. (They add a lot more value than just that, of course. Just that's the raw nutshell of what they do.)

I have had too many conversations complicated by the fact that when I say off hand "well just use a feature flag for that" or "maybe you should wrap that in a feature flag" they picture massive abstractions and infrastructure and often what I mean is simply "add a configurable variable somewhere, the environment or the user table in a database or a JSON file or something else, then add boring if statements checking that flag". Sometimes it is okay to just start with the basics.

(Again, not to disparage LaunchDarkly. Having "proper" infrastructure for it can provide a lot of nice-to-haves and there are definite benefits to the tools that they value add. Just that sometimes the marketing of tools like that does come at the cost of over-complicating the discussion of the basics of the thing.)

goodoldneon · on Jan 18, 2023

Bespoke feature flag systems can become a quagmire. You might think “we only need a boolean value”, but then you need a number, and then you need a string, and then you need a UI to manage it, and then you have a 2nd service and DB and have to expose feature flags via an API endpoint, and then you realize that getting feature flag values via HTTP calls is causing latency and reliability issues, and then and then and then.

I worked at a company that went far down that slippery slope and now there’s and horribly convoluted bespoke feature flag system that no one enjoys using and no one wants to maintain.

I’ve used LaunchDarkly and it’s really nice. The only downsides I’ve experienced are:

- UI is complicated.

- They have the concept of end users but not end “organizations” (like “set this flag to X for an entire customer organization”). There are hacky ways to do this but they feel icky.

hnbad · on Jan 18, 2023

Sure but by that logic you can outsource any component of your software from the start because it might evolve to be something more complex.

For contrast, adding another service that comes anywhere near touching user data (including user IDs or IP addresses) means doing due dilligence on another service provider, signing a data processing agreement with them, vetting their data protection claims, updating your privacy policy, adjusting your processes for users' data requests (including export, change and removal of their PII) and so on, not to mention adding another recurring expense, point of failure and third-party API dependency.

TINSTAAFL. Of course I understand for many companies (especially outside the EU) a service like this may be much closer to "free" because most companies try to get away with not doing even half of the things I described, whether or not that will come back to bite them. Especially if you're in the business of running a VC-backed startup designed for growth it's often a safer bet to ignore that your engine is on fire as long as you continue to go fast because if you can hit your target (i.e. "exit", whether via acquisition, acquihire or going public) in time that will be someone else's problem or no longer matter.

verdverm · on Jan 18, 2023

How do you handle flag changes in the middle of builds, where the flag is being checked in multiple places and the flip happens between them?

pondidum · on Jan 18, 2023

Only read the flag once! For multiple flags we tend to have the first step in the build read all flags we care about and export them to environment variables.

verdverm · on Jan 19, 2023

How do you enforce that they are only read upfront?

Why not just make them parameters if you are going to put them upfront and turn them into ENVs?

How do you deal with a commit that needs to be rerun? What if the flags have changed in between?

verdverm · on Jan 18, 2023

We add a "branch config" JSON file for each branch, which is later merged with some CUE. We put metadata and "feature flag" like values in there. This full config value is passed to all subjobs and makes it easy to change and grow our CI.

The main problem I see with real feature flags in CI is that it can make reproducing a build nearly impossible. What happens if someone flips a flag in the middle of a build? (where that flag value is checked in multiple places, so the flip happens in between the checks)

goodoldneon · on Jan 18, 2023

Would logging feature flag values be enough? Then you could look at stdout to see what the value was during each job

verdverm · on Jan 18, 2023

How would you ensure the same values after pushing a new commit or retriggering a build at the same commit? There are many times where external failures, like NPM returning a 50x for a package, that requires a build to be rerun at the same commit.

It seems impossible to ensure reproducible CI when some of that configuration lives outside of source control. Given the CI files are in source control, you can just change them there, rather than using feature flags. Using feature flags comes with a host of hard problems that you ought to just avoid by not using them.

mo_po2 · on Jan 18, 2023

Flagon's exit codes are as follows:

0 the flag queried is on (true) 1 the flag queried is off (false) 1 an error occurred querying the flag

I think that it would be nice that error and false had different exit codes.

pondidum · on Jan 18, 2023

It actually does have different error codes; I think I just forgot to update the docs:

0 - flag on 1 - flag off 2 - error

I'll update the docs soon!

tail_exchange · on Jan 18, 2023

I've been thinking about a way to add feature flags to my project's CI pipeline. What makes me afraid of adding such thing is reliability: if someone breaks the feature flag service, now the CI pipeline is broken too. I could just give them a default value if the communication fails, but then we will have builds that are skipping checks potentially due to transient errors. I would be very curious to know the guidance for using this tool in my case.

verdverm · on Jan 18, 2023

An easier solution is to add a json file per branch that holds these "flag" or CI config values. We have a dir `bfg/...` to hold them. Nice thing is they are version controlled and the dev can change values between commits like they do with any other code. This pattern has helped solve a lot of issues and we no longer have complex conditions that key off of branch name patterns.

rtpg · on Jan 18, 2023

I feel like you can just add a table to your DB and be done with it for most systems. I feel like people with large enough systems where “stick it into the DB” is not reasonable also should probably have a bespoke feature flag system.

raptorraver · on Jan 18, 2023

I’ve been looking for a way to do integration tests for our pipelines. So far haven’t found anything I could use.

jillesvangurp · on Jan 18, 2023

I use matrix tests with github actions to test my kt-search client with different versions of elastisearch and opensearch. Pretty easy to set up: https://github.com/jillesvangurp/kt-search/blob/master/.gith...

Basically it fires up elasticsearch using docker-compose and then the integration tests run against that. You could use a similar strategy to test different feature flag combinations.

For some of our private projects, we use kts to generate the github action yaml files using this: https://github.com/krzema12/github-workflows-kt

Well worth checking out if you have more complex workflows. Yaml is just horrible in terms of copy paste reuse. Also nice to get some compile time safety and auto complete with our action files.

diarrhea · on Jan 18, 2023

Pretty sure the parent meant testing the pipelines themselves, end to end.

The tooling around pipelines is awful. A single typo in some variable’s name in a later stage can take minutes to catch. The feedback cycles are very long (cloud machines are much slower than local ones) and IDE tooling is bare-bones.

Just give me one large Python file with some library to manage common actions (building up the job DAG, accessing pull requests, easy shell access, …). We’d have refactoring, Turing completeness, type safety and so much more. A core downside would be managing the complexity of DevOps scripting going berserk. Personally I’d prefer that trade off.

Denvercoder9 · on Jan 18, 2023

> The feedback cycles are very long (cloud machines are much slower than local ones) and IDE tooling is bare-bones.

I'm continuously amazed that none of the major CI providers offer standalone tooling to run and debug your CI pipelines locally. Seems like it'd be a killer feature for anyone working with complex pipelines.

verdverm · on Jan 18, 2023

The local & CI parity is one of the main points for https://dagger.io

theSage · on Jan 18, 2023

I've been building Jaypore CI for exactly this tradeoff. The config is a single python file and I run jobs using a git hook on my own laptop.

I'd love some feedback on what else I could add to this project to make life easier for people.

https://www.jayporeci.in/

jillesvangurp · on Jan 18, 2023

Testing the pipelines is a thing as well. But pretty tricky to set up. You basically need to run dockerized actions locally for that. Not impossible and I've seen some attempts to do that. But I can't really justify spending a lot of time on this stuff.

The kotlin scripting support for github actions that I mentioned addresses things like typos, refactoring, and IDE tooling. Try it, it's pretty nice and easy to use. We actually have an integrity check as part of our build that runs the kts script to verify the yaml file stored in the repository is consistent with what the script generates.

raptorraver · on Jan 18, 2023

Exactly this. We are having quite complex pipelines and almost weekly basis we introduce some regression bugs there or change the rules so that they don't work anymore properly. It takes around 20 minutes to run the whole pipeline and of course some jobs are only triggered when merging to master etc. The development flow is painful and slow (although we have found some tooling to run some parts of our pipelines locally).

marcosnils · on Jan 18, 2023

hey! Dagger (https://dagger.io/) contributor here! Dagger can help with that. If you have any questions, happy to connect!