Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Cyclic dependencies are evil (2013) (fsharpforfunandprofit.com)
129 points by nojito on Jan 30, 2021 | hide | past | favorite | 121 comments


Language constraints aside, the real world is not something that can be cleanly modeled without the notion of circular dependencies between things. Not very many real, practical activities can be truly isolated from other closely-related activities and wrapped up in some leak-proof contract.

Consider briefly the domain model of a bank. Customers rely on Accounts (I.e. have one or many). Accounts rely on Customers (i.e. have one or many). This is a simple kind of test you can apply to your language and architecture to see if you have what it takes to attack the problem domain. Most approaches lauded on HN would be a messy clusterfuck when attempting to deal with this. Now, if I can simply call CustomerService from AccountService and vice versa, there is no frustration anymore. This is the power of reflection. It certainly has other caveats, but there are advantages when it is used responsibly.

If you want to understand why functional-only business applications are not taking the world by storm, this is the reason. If it weren't for a few "messy" concepts like reflection, we would never get anything done. Having 1 rigid graph of functions and a global ordering of how we have to compile these things... My co workers would laugh me off the conference call if I proposed these constraints be imposed upon us today.


Sometimes I read conversations about software engineering concepts by career developers and I get this sense that there is this entire hidden world of other engineers that I have somehow managed never to encounter.

Every engineer I've ever met who has been doing it for more than a couple years universally decries circular dependencies. Every one of them has come to that opinion via hard won experience dealing with real world problems they encountered. And yet comments like yours reveal there are other engineers working in places where apparently the opposite is true. Whenever I see that I wonder how this sort of self sorting manages to occur.


> Every engineer I've ever met who has been doing it for more than a couple years universally decries circular dependencies. Every one of them has come to that opinion via hard won experience dealing with real world problems they encountered.

They may not realize they have them - at runtime, in the object graph, possibly indirectly. There's little difference in principle between cycles in "static" code vs. runtime state, but we often can't express them in the former, because our languages don't specify the concept of dependency at enough granularity (and some rely on a linear compilation pass).


The fact that computing is shot through with people who think that putting things in-band makes them go away, and that because special cases have nice properties therefore King Canute can abolish the general case with a wave of his hand, is a very large part of the problem with computing.


I've worked with very high level engineers at Google that thought cyclic dependencies are fine. Sometimes they are just the simplest solution and trying to design abstractions to avoid it just creates a huge mess.

If you've working with Java at Google (Guice) makes you learn to hate hiding circular dependencies behind dependency injection, since your binary gets injected by some huge tree of thousands of dependencies all that can create errors or interfere with each other. And without static checking trying to reason and fix those issues becomes really hard.

I strongly prefer using the languages actual type system to create circular dependencies that you can inspect using well known tools over that any day, better have a problem you can see than hide the same problem to satisfy some tools requirement.


I have to say that in my whole career of doing Java / Clojure, Google’s SDK are one of the worst in terms of dependency hell. It’s incredibly unpleasant to work with, lots of breaking incompatibilities with libraries, etcetera.

I may sound very snarky, but it really feels like an over engineered mess and I suspect the reason they need all these dependency tricks is the fact that it’s over engineered, not necessarily that the problem they’re solving is so unique that they need circular dependencies.


You never truly need circular dependencies, however many times letting functions at the same abstraction level call each other greatly simplifies code.


I think something has been lost along the way with injection and IoC that makes using a fully functional programming language for object wiring a compelling alternative for most engineers.


To me, the parent is describing a functional circular dependency that we all have to live with, but when we’ll design the system it will be masqueraded in a many to many association in a third object/table and we’ll proudly say “we have no circular dependencies”.

To me both are technically true, reality is messy, and we hide the mess under imaginary constructs.


Ah, but in the functional version, the messiness is both totally explicit and trivially inspectable.


Possible reason: selection bias. If everyone else is fine with circular dependencies or at least sees them as a necessary evil... are they going to extremely vocal about the current status quo when there they have no need to defend it?


In my experience, it’s generally less experienced engineers that will happily make a clusterfuck of circular dependendies.

If a more experienced engineer does, it’s generally considered a necessary evil, but not a choice taken lightly.


Some problem domains have unique characteristics. I think the closer a problem resembles nature, the more it breaks our structures we want to impose on it.


It's not clear the parent commenter actually needs a cyclic reference there, so I wouldn't draw any conclusions from it. Some of us don't believe it's necessary in that example.


Manifolds? Locally we want everything to be a straight line, but when you zoom out far enough you see it was a circle all long.


In F#, you would naturally approach this by defining a Customer independent of the Account (e.g. just containing a name and address), and an Account independent of the Customer (e.g. just containing an ID), and then a Bank which is a mapping of Account to Set<Customer>. What you see as a cyclic dependency, I see as a data type that you haven't reified.


This came up in an application I had modeling college football. A Team is a member of a Conference and a Conference is made up of Teams. But during realignment a few years ago when teams were switching all over the place it became apparent that Teams and Conferences were truly independent of each other and that their association itself was a first class object which also needed to capture the Season/Year.

Not really rocket science but some times your model is telling you something if you’re willing to listen.


This is almost exactly the motivational example that Codd used when originally describing relational algebra. He described 5 different data organization schemes for a single problem and designed relational algebra to work with all of them. This ability for the same program logic to work with many different data storage layouts should make changes like you describe less painful to implement.

(see page 2)

E. F. Codd. 1970. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377–387. DOI:https://doi.org/10.1145/362384.362685

PDF: https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf


My dream, when I'll have a few years of free time, is to design a language were the relational table is the primary data structure abstraction.


Mine too. My main gripe with object graphs is that often you need to make arbitrary decisions about what comes "first" -- should I represent my customers as a list of `(name, address)` records or a record `(names, addresses)` of lists? This is a silly decision to have to make, and an even sillier one to have to deal with when later you need to transpose the structure to more naturally fit some other task. Relational tables sidestep this issue.

Over the years I've found that these transposition problems are a huge drag on programs, and once you recognize them you start seeing them all over the place. They make simple and obvious relational tasks into dozens of lines of nested data structure traversal and manipulation.

The notion of transposition comes from the array programming world, where a multi-dimensional array/tensor can be seen as a tree structure, whose levels may be easily reordered by transposition. I sometimes use numpy arrays for general-purpose programming; it can be very powerful with a well-chosen representation. Unfortunately array concepts basically require the data to be rectangular -- `x[i]` has the same shape as `x[j]` -- which can be hard to adapt to general problems.

My dream then is a general-purpose data structure that takes the best of both the relational and array-programming worlds: the freedom and flatness of relational tables, enriched with tacit ideas like broadcasting that make array programming so ergonomic.


I’m working on a Rust library for this as my MS project. It calculates the queries inside the type system, so there’s minimal runtime cost.


It's not quite the same, but how do you feel about Datalog or Prolog? Having a set of facts and being able to make arbitrary relations in them makes for some interesting programming. Especially if you limit your data to a more rigid normal form like RDF tuples. Datafun is also a cool upcoming language in this space https://github.com/rntz/datafun


Starting from the other end, what is missing in SQL to make it more general purpose?


Generic foreign keys / Polymorphic relations. I.e. table Foo can either point to table Bar or table Xuul.

This type of relation shows up quite a bit, and either leads to application-level workarounds, or having to add many duplicate tables for the same "thing".


This is a cool idea. What would you need to be able to try this in 2 weeks, instead of a few years? What's the lean slice?


Relational algebra isn’t that hard to implement if you’re willing to sequential-scan all of the time. The massive complexity comes in the query planner: The point of adding an index is to change the space-performance tradeoff of certain operations. The system needs to be able to take advantage of these data layout changes, or there’s little benefit over storing everything in flat arrays.


This resembles sql a lot. Btw I'm surprised that few languages seem to have in their standard library a many-to-many container which supports efficient lookups in either direction.


Actually yeah that is kinda interesting. The couple times I've had to do it, always ended up hand-rolling a 'class' with hashmaps in both directions. I wonder why few standard libraries have a both way look up structure.


Which languages support this at all? Are you talking about ORM?


I'm talking about something like this

    std::relation<Foo, Bar> foobars;
    ...
    auto& bars = foobars.forward(foo);
    ...
    auto& foos = foobars.backward(bar);
Edit: Boost apparently has it, see e.g. https://stackoverflow.com/questions/1128144/c-maps-for-many-...


Boost.Bimap is a special case (and built on top) of Boost.Multiindex that allows indexing a logical table with an arbitrary subset of fields.


> a data type that you haven't reified

Insightful. Once you learn to see dependencies as implicitly revealing abstract data types you haven't yet teased out into existence, you start seeing all kinds of places where behavior that ought to be explicit and handled by an abstraction is smeared across multiple other concepts.


I've got a half-cooked blog post about this in the oven: "reify the state machine". I've seen plenty of code with lots of communicating parts, which is really a distributed implementation of a state machine. Rewriting to make the states explicit as data, and perhaps creating a central "state machine" type for that machine, can really help clarify the domain model and let you see what you're missing.


> and then a Bank which is a mapping of Account to Set<Customer>

The problem is that this only lets you get customers from accounts, and not vice-versa.

And if you add a Map<Customer, Set<Account>> to the Bank, you now have to ensure the mappings are synchronized at all times. Which, to be fair, is very straightforward as long as Bank is immutable, but still kind of annoying.

But considering how incredibly common many-to-many relationships are in business logic and especially in SQL, I'm constantly surprised that there isn't a de facto standard, efficient data structure to represent them. That speaks to how entrenched cyclic relationships are in software engineering, I suppose.


In the in-storage form (database), the mappings is synchronized at all times.

Many-to-many relationships in the purest form is described as records of ItemAId/ItemBId pairs. `List<{ account: Account, customer: Customer}>`. There isn't any cyclical relationship in this form.

From `List<{ account: Account, customer: Customer}>` one can derive `Map<Customer, Set<Account>>` and `Map<Account, Set<Customer>>`. Bank can have copy of both mappings and use them as double index, synchronizing both mappings as operations goes by, while at the same synchronizing both mappings to the in-storage form. Or not!

We're talking mostly about data-modelling in a programming language, which applies to the in-memory form. `Customer { accounts: Set<Account> }` and A `Account { customers: Set<Customer> }` is its in-memory form, a layer of indirection from its actual form, the in-storage `List<{ account: Account, customer: Customer}>`.

If informations of Accounts, Customers, and the many-to-many relationship of both are laid flat in its purest, source-of-truth form, the in-storage database entries, why are people suggesting that there is/should be a cyclical relationship in its representative form, the in-memory objects?


This is the right answer and I wish more languages would make this dramatically more ergonomic to do. Store the most basic normalized truth and query/derive what you need.


Sure, but you wrap that up if you want. A Bank can contain both maps, if you want, and you hide the methods that would update individual maps, only exposing your wrappers that update everything together.


Yeah, I'm not saying it's difficult, I'm just saying that it's weird that there isn't a standard ManyToMany<T1, T2> class in most programming languages, even though a functional (if perhaps not performance-optimal) implementation is straightforward and should be very useful at least for DB mappings.


True, I have implemented it myself multiple times, getting a bit annoyed at its absence without stopping to ask why.


Which makes it easy to answer the question “what customers are on this account” and hard to answer the question “what accounts does this customer have”. As a customer, I usually want the latter.


I don’t quite understand what you’re trying to convey with your example in regards to FP.

First of all, both data and functions are first class concepts in the functional paradigm and are not glued together as classes. So you’re not modeling your domain as „account service“ and „customer service“, but rather have data representation of these concepts and functions that query/reduce etc on those. In your example that would be a relational model, described as sets of tuples/maps, and relational functions do derive new data. There is no need for circular dependence, because your operations are generic over relational data.

I agree with the notion that the purely functional approach doesn’t dominate, for good reason. But more and more languages are incorporating FP concepts since about a decade or longer, at least the basic building blocks like closures, immutability and function composition have gained massive traction and eliminate the need for many of the complex patterns in traditional, class based OO.


I actually don't follow your example; I don't see why this is cyclic. In your example what exactly would CustomerService be responsible for that prevents it from functioning unless it has a reference to AccountService? AccountService is pretty obvious (add/remove money, link other accounts, close account, etc.) but I'm struggling to see what CustomerService would do on a per-customer basis that it would need a reference to AccountService for. The only thing I can imagine is a get/add/remove accounts operation, but (a) you need nothing but an account ID for that (not the actual AccountService class) and the caller can look up the actual accounts using the IDs if they want, and (b) you can (and I think perhaps should?) model this as a database table managed by the Account class where you're associating customer IDs with account IDs.

Another way to think about this is, how would you do this in C++? You have header files, and they shouldn't be cyclic. Who #includes who? Well in this case it seems to me AccountService.hpp should #include CustomerService.hpp, but not the reverse. Maybe CustomerService.hpp would #include AccountID.hpp? Whatever I imagine I just don't see why you'd need cyclic #includes.


One can have Customer in one header file, possibly forward declaring Account. Account in another header file possibly forward declaring Customer. Definitions in the two cpps which can include both header files. When writing such a thing one should ask oneself whether one really needs to do it that way, though. Not that it is too much of a mess but asking such question is always good.


Funny that we both think in C++. It was what popped into my head too.

AccountMgr has many Account(s). Ditto for CustomerMgr. Interact with both to taste.


In a relational database you would have a CustomerAccount table keeping track of which accounts that belongs to which customers. Customer would not know about Account, and vice versa.

And what has reflection to do with this?


Reflection allows you to put the interfaces to Account and Customer into a shared module and the business logic implementations into their own mutually independent modules, thus removing the cyclic compile time dependency.

You "wire up" the implementations at runtime, using reflection.


I still don't get it. I put the interface into a shared module and the implementations in independent modules without reflection. Why exactly would we need reflection to do that?


If I may, that sounds horrifying. Please, for the sake of my sanity and the IDE, always do things at compile time rather than runtime if you can!


It's called the "dependency inversion principle". People don't like to call it "reflection", and heap various layers like XML or dependency injection annotions on it, but technically, it comes down to reflection at runtime, as far as I've seen.

It certainly has its cost in additional complexity through indirection, but it's better than creating cyclic dependencies or giant balls of mud.

https://en.m.wikipedia.org/wiki/Dependency_inversion_princip...


No, you don’t need reflection or annotations to use dependency inversion. Why do you think so?


The question was "what has reflection got to do with it". I've used reflection for dependency inversion, so I think that's what it's got to do with it.

If you can do dependency inversion without reflection, more power to you :-) We can't do classpath scanning in the project I'm working on because of the size of the classpath, and compile time configuration using direct imports would introduce cycles, so reflection it is for us, in one form or another.


void createAccount(ICustomer c) { ... } // No dependency to concrete Customer

class Customer implements ICustomer { ... } // Customer Class can be created after Account above

createAccount(new Customer()) // Use of both, with dependency injection

Scanning for classes dynamically using reflection has nothing to do with above. And you certainly don't need any xml or framework to do it either.


If the code is implementing against interfaces, why would late binding mess up your IDE experience at all?


My heart sinks whenever I need to spend ten minutes trying to work out which of the three different implementors is actually going to be called on a particular code path. For about a week of December, my work was solely spelunking to track down which implementations of an interface in C# were dead code and so on.


Right. Often there are 2 types of interfaces in a project. The first are "natural" interfaces, that you have put some design into and are meant to be reusable. Things like Streams or Collections. When you write a function that uses one of these, you really are expecting to be able to use any implementation. If you want to go to the implementation of Stream.Read, obviously the IDE isn't going to be able to do it, it's abstract and there are any number of implementations.

You get the other kind of interface when you want to loosely-couple your code, and so you define interfaces for many classes. Often, there is only a single class that implements the interface in your project, though there may be mock implementations in your test code. Even though there is a single "real" implementation, the IDE can't/won't jump to that implementation in the same way it won't in the first case. This is frustrating though, because it would have worked if you hadn't extracted the interface for improved testability.


Is there any solution to this?


It seems like an IDE could know if there is a single implementation and could go to it, especially if the interface is not visible outside the project ("internal" in C#). In practice, the interface is in the same file as the primary implementation, so it's not as bad as it seems. More of an annoyance.


Jetbrain’s Rider as well as their ReSharper for Visual Studio have a navigate to implementation feature which I use the shortcut key for all the time.

The fact our solution has an interface for just about everything for testing reasons doesn’t slow me down even in the slightest when I’m looking through the code.


I've seen people learning about dependency inversion from a book called "Clean Architecture", then proceeding to apply it to every bit of code they write, to make it "clean". It makes code difficult to trace by reading alone. Indirection may be cheap, but it adds up.


This is perfectly possible without reflection. The interfaces already make it possible.


But functional-only business applications already did take the world by storm! Relational DBs is the defacto way data is organized, and a Junction table is the way you model the data so that a functional query (generally in SQL) can be made against it.


SQL is not functional, it mutates global state.


Yes, there is a single transactional global state.

There is always state somewhere, the question is where it should be.

Global state is the input to the program, and a new global state is the output. Given identical inputs, it will always produce the same output.

This is the entire principle behind ACID, and mutation of global state is a common approach for functional programmers regardless of whether they they are using a DB or not.

If a DB were not fully transactional, or if your queries of the DB are not deterministic themselves, then we’ve moved out of functional territory.


SELECT?


I think the OP is overly dogmatic and I agree with you that sometimes you just need circular dependencies

That said, I don't see what circular dependencies have to do with reflection, much less a functional style? For a trivial example, Rust lacks reflection but the following (functional-ish) code works just fine:

    mod ModA {
        use crate::ModB::B;
    
        pub struct A {
            ref_to_b: Option<Box<B>>
        }
        
        pub fn into_b(a: A) -> Option<Box<B>> { a.ref_to_b }
    }
    
    mod ModB {
        use crate::ModA::A;
    
        pub struct B {
            ref_to_a: Option<Box<A>>
        }
        
        pub fn into_a(b: B) -> Option<Box<A>> { b.ref_to_a }
    }
Am I missing something?


Could you give an example of a domain that naturally requires circular dependencies? I have been trying lately to practice seeing outside the functional-programming-tinted lenses, but I need a bit of help to do so :)


I think the GP's example is pretty reasonable. And like I said, I still don't really see what this has to do with FP. Clearly F# has trouble with it, but so does C++


I don't think the GP's example is reasonable, and nor would an SQL-writer. Could you give something I will find it harder to wriggle out of?


I don't understand how reflection is involved here? Don't you just have a data model (schema) that can be interacted with?


In the functional paradigm, language constraints aside, to model this "story" would need these:

- The definition of Account

- The definition of Customer

- A ledger consisting of: List of Account, List of Customer, List of Account-Customer relations

A clerk works on the records on the ledger with one hand and one pen, a metaphor for the service process working on the data in the database.

An act, such as money transfer, would be described as a function.

A customer creating a new account for himself is written as `fn createAccount(Customer, NewAccountData)` because from the perspective of the clerk/bank manager/service the customer, newAccountData, and the existing data in the ledger as objects which the clerk/bank manager/service must move around in a precise way.

The module which has 'fn createAccount` depends on the types `Account` and `Customer`.

In english it roughly sounds like,

"the success of writing the rule of creating an account for a customer depends on knowing the definition of Account and Customer."

The function is not written as `customer.accountService.createAccount(newAccountData)`, because the clerk doesn't schizophrenically pretend to be the customer and create an account for himself. The clerk just receive request from the actual customer, writes new account entry and customer-account-relation entry into the ledger, that's it.

There's simply no need to call CustomerService from AccountService and vice versa. There's no need for reflections because data types are all available at compile time.


> `customer.accountService.createAccount(newAccountData)`

I believe that most OO implementations would read

    accountService.createAccount(customer, newAccountData)
Care to elaborate if that was the main point of the clerk's schizophrenia criticism? Or if I'm misinterpreting just call me dumb, haha.


Oh, I meant to emphasis on that "customerService depends on accountService and vice versa" thingy. While in my writings "accountService is a member of customer" is a form of customer depending on accountService.

But my main point is that it should not be written that way at all.

Semantically, if we must include the subject, it should be

`theService.createAccount(Customer, NewAccountData)`

Or replace theService with CustomerAccountService or whatever.

Writing that way, with functions depending on types, avoid getting tangled from the so-called account-customer cyclical dependencies. Because account and customer don't need to know each other until a function needs to know both of them. There's no such thing as `customer.getAccounts()` because in the end the query would roughly look like `getAccountsWhereCustomerIs(CustomerId)`.

You're not dumb. It's just me mis-writing due to the fact that I'm writing this at 4 in the morning lol. Or maybe I'm misinterpreted parent comment and that confuses you. In that case, I'm the dumb.


Customers and Accounts are not cyclic.

A customer has an account. The customer is the domain root. Hence there would be no accounts if there were no customers.

A customer does need to to hve an account. An account must have a customer.

It's not cyclic in any way


Banks think the other way around - the account is important, the customer less so. Multiple entities, or no entities, might be authorized to access or control the account at various times.


I’m having a hard time understanding why you think F# is unable to handle the domain model presented.


It reads as an extremist take on Chesterton’s Fence: “the fence exists, therefore it is probably the only thing that could’ve worked.”

There is nothing in this contrived example that F# or Haskell (gasp, pure functional) wouldn’t handle with ease. It just would be modeled differently from traditional langs. I’d argue the alternative modeling forced by FP langs would be superior.


" I’d argue the alternative modeling forced by FP langs would be superior."

Then I'd like to see that. The other example of the functional someone gave more above, was not convincing to me. Much more complicated then the straight-forward simple, but evil cycle approach.


“Customers rely on Accounts (I.e. have one or many)”

Alternative model: the Customer relation couples a Person to an Account.

I think that’s a better model, as it allows Person P to be a Customer at bank B but not at Bank C, Person Q to be a Customer at both B and C, etc.

It also allows you to model Persons before you model Accounts, breaking the circularity, allows Companies to become Customers, etc.


Functional only actually can model this kind of real world circulair dependencies easily without having circular dependencies in the language level. Not sure what it even has to do with functional style in general though.


circular dependencies at runtime are not really a problem and often necessary. circular dependencies at build time/between modules are evil.


Your build time is someone else’s runtime.


You would forget everything about a customer if they closed their last account?


No, of course you would maintain a Set<Customer> as well (possibly even a Map<Customer, Account Set> for efficiency).


... if you model it that way you're not using cyclic dependencies at all and this is in fact exactly how you would model it in a functional way.


Their whole premise is that

> Customers rely on Accounts (I.e. have one or many).

I'm giving a domain relevant counter example to that premise.

And interestingly enough, that leads you to a clean way of modeling it without the cyclic dependency as you've just done.


EDIT: In fact I appear to have paraphrased the next post in the series, https://fsharpforfunandprofit.com/posts/removing-cyclic-depe... . Probably go and read that instead.

One pattern this nudges you towards is having a Domain.fs at the top of each project. Since everything in your project wants to talk in terms of the data types you've modelled, it makes sense to put them all in the same place at the top. Moreover, once you've done that, you're guided towards having your data types really being "dumb" - algebraic data types only - because to put actual behaviour into them is harder when all you have is definitions of data types (and no associated modules). You certainly can write OOsagne using the Domain.fs pattern, but it's much harder and the code really smells when you do (because the domain file gets super long).

The upshot is that your domain model appears explicitly at the start of the project, which is a big win for anyone who comes into the project and needs to learn quickly what's going on. By contrast, nearly all the C# I've ever come across has the domain spread across many files and all mixed up with implementation details.

This is certainly not the only way to write F# - I've written projects which have the domain spread across multiple files - but it's one nice way to handle a small-to-medium-sized project.


This is hexagonal architecture. I’ve yet to see anything quite as nice as it once I started using it.

The only downside to hexagonal architecture is that you start to see how mediocre most architectures are, and how frameworks often cap how good your architecture can be by virtue of their structure. Frameworks tend to resist being put on boxes, and hexagonal architecture tries to put all external dependencies in boxes.


The generic way to remove a cyclic dependency is by Dependency Inversion (not to be confused with Dependency Injection), so that a mutual dependency A <–> B is transformed into the noncircular set of dependencies { A –> BIntf, BImpl –> BIntf, BImpl –> A }. That is, at least one node in the dependency cycle is split into interface and implementation, breaking the cycle.

Compilers for languages like C# and Java can deal with (seemingly) circular dependencies within a set of classes to compile, because when A uses B they only have to consider the interface (type signature) of B, not its implementation. (Cyclic dependencies between modules are still a problem, in particular for build systems.)

I'm not really familiar with F# (so someone please correct me if necessary), but in languages that use type interference to determine function signatures, one consequence is that the interface (type signature) of the function then depends on its implementation, so you can't as easily invert the dependency as when you have explicitly declared type signatures.


F# could have out of order definitions, but the designers chose not to because it lengthens compile times considerably.


I still cringe when I think back to some C projects from long ago, where the dependency chains of headers and libraries where unfathomably byzantine.

On a related note, lately I have been looking into some biochemistry topics and their relation to computer science. It seems that cyclic dependencies are possibly a requirement for life, which makes faithful simulations of biochemical processes an interesting challenge.


I’m a compiler nerd at the moment. I have an inherent trust of things that bootstrap themselves, as if it means they are conceptually more pure. I’m not sure whether that is 100% well founded, but it sounds similar to some of these observations. :)


I think self-hosting is just a little over-rated for languages. It's good that Rust and C++ are self-hosting, you'd expect those to be languages you can write a compiler in.

But Lua and JavaScript satisfy their own niche just fine without having popular runtimes written in themselves.


Most languages don't aspire to be so conceptually-complete that they insist on being bootstrapped. Which is fine, I make my living with them.

Those that do, however, are immensely beautiful and there is more to be learned from them.


There are a few JS interpreters written in JS, don't know about Lua.


The C language and gcc are a cyclic dependency too.


Only if you don't consider time, or optional dependencies. GCC 10 _can_ be built with GCC 10, but it was probably built with GCC 9 the first time around.

If you can bootstrap it, and you can, then it's not really a hard, unbreakable cycle, right? It's just an option that's quicker than starting from tcc or whatever every time.


But how would you build the first GCC in that chain?


Using another C compiler, if none exists, create one in Assembly or other programming language in the target platform and go from there.


TCC 0.9.27 can build GCC 2.95.3. You can build TCC using GNU Mes, which itself can be built using M2-Planet, which is written in a subset of C.

https://www.gnu.org/software/mes/


Furthermore, M2-Planet can be built by another C compiler written in assembly, all the way down to hex instructions.

https://github.com/oriansj/mescc-tools-seed/


These projects are from the Bootstrappable Builds folks:

https://bootstrappable.org/ https://bootstrapping.miraheze.org/wiki/


That is an interesting observation. I wonder whether the cycling in chemistry is more about information flow then like (negative?) feedback, it helps in making a stable system.


I’ve played around with F# over the years, but have only recently started getting into seriously. The top down nature of dependencies (both within a given file and for the files within a project) is a little odd at first, but once you get used to it feels quite natural.

If nothing else, it encourages (somewhat) common ways of structuring projects and avoids a lot of bike shedding about separation of concerns and project structure I encounter in C#.


I had to sell F# (over C#) as the language for new projects to my CEO a while back.

What I really wanted to say was: "F# lets us be lazy. You write way less code, you have fewer things to keep track of, you need fewer abstractions, you can write everything in the most straightforward way possible and then extend only when necessary because refactoring is super safe."

What I _actually_ said was: "F# is an extremely rigid language. It will not let junior developers take shortcuts, because you must write your code properly or it won't compile. Your classes and functions must be written in the correct hierarchical order, you must cover every corner case. I know it's going to be stressful or boring to be forced to do things right every time and never be able to just push some sloppy cowboy fix, but it's a sacrifice we're willing to make :)"


This is one of the things that frustrates me about go. They started with this very good advice and then took it one step stupider: Disallowing circular imports.

Why is this bad? It makes it impossible in many situations to locate your public API at the top level. Let's say you have a Widget interface and APIs to do things with those Widgets. Eventually your library gets complex enough that you want to separate functionality into subdirs. Now you have a problem: You can't access the definition for Widget from the subdir because importing the top level would create a circular import!

The only way to get around this problem is to have no code at the top level, and put your public API in a subdir. Just be sure to do this when first starting your library or you'll have to break things.


You can deal with that issue by defining things in a submodule, then importing and re-exporting them in the top-level (e.g. `type Foo = subdir.Foo`). That even lets you move things into the submodule without breaking callers. Other submodules can then import from the submodule rather than the top-level.


This is how libraries are exported inside Google (use /public subdir for the public interface), and I guess with Go this got into the language. But I think Go made much worse things than this :)


Nice article. The first thing that came to mind when seeing the title was "but circular dependencies just mean that your layers are wrong and those should all be in the same layer" and funnily enough that's exactly what's explained in the article!

What would be interesting would be to organize modules into named layers--instead of just explicit 1-by-1 module dependencies (i.e. edges). I have not seen a language do that yet.

Another interesting thing that I haven't seen touched on in more mainstream languages is the idea of bidirectional interfaces. The somewhat DSL-like NesC language had this, primarily driven by the need to write device drivers that serviced interrupts (and events).


> What would be interesting would be to organize modules into named layers--instead of just explicit 1-by-1 module dependencies (i.e. edges). I have not seen a language do that yet.

Correct me if I'm wrong, but I think you're describing namespaces here


I've never tried F# so I'm not sure I'm exactly "on base" here, but I actually think this way. It's frustrating to me when dependencies happen automatically or when they can be used before they are declared.

It helps me understand what's included if I can see a nice list of all the dependencies. It helps me narrow down where the problem is if I only have to address dependencies I know have already loaded.

I learned to program in the 80's however, and it probably is a bit "old school".


The simplest case where I'd want a circular dependencies is I have a module that defines some data types and functions on those types

  mod widget

  struct Widget

  fn show(Widget) { ... }
  fn join(Widget) { ... }
  fn calc_box(Widget) { ... }
and one function is getting very long and complicated so I want to spin it out into its own module

  mod widget                   mod calc_box
  import calc_box              from widget import Widget

  struct Widget                fn calc_box(Widget) { ... }

  fn show(Widget) { ... }
  fn join(Widget) { ... }
The reason the article gives that this is "evil" is that there is no hierarchy between the modules; the cycle is really just one big supermodule. That's true... but that was basically the point. Where is the "evil" here?


In that case you can separate ’struct Widget’ into another file. This way there will be no circular dependency


I know how to break the cycle. What I'm asking is why should I break the cycle?


It depends on situation. As an example, it could be difficult to write tests when you have circular dependency. As well depends on the language there could be strange ’side effects’ which would be difficult to debug.


The mention of Lakos' book "Large-Scale C++ Software Design" reminds me of the epiphany I had when I read it. You could almost boil the entire book (and it's not a small one) down to "eliminate circular dependencies". That's trite, but Lakos walks through all the problems that large C++ projects had and lays out how all of them boil down to the dependency problem. Particularly interesting to me, at the time, was his analysis of how compile times got exponentially longer as interdependencies grew, to the point where even a small change required recompiling the entire code base.

The section in Lakos on how to identify and remove circular dependencies was worth the price of the book.


In a related vein, is there any sort of theoretical transform that can be used to convert cyclic dependencies into a tree? Similar to how lambda lifting eliminates free variables. (Other than the obvious one of treating the entire cycle as one node)


For my FE projects I employ a module based code structure. If you import a module, you get the whole thing. This lends itself to many positives but one negative is it makes it easy to run into circular dependencies. I wrote an article discussing my architecture and how I solve circular dependencies in js.

https://erock.io/scaling-js-codebase-multiple-platforms/

In the article I argue that circular dependencies are a good thing because they uncover poor code organization.


Debian has been doing some work on this:

https://wiki.debian.org/DebianBootstrap

Also the Guix and Bootstrappable Builds folks are working on this too, at a different layer:

https://bootstrappable.org/ https://bootstrapping.miraheze.org/wiki/


Funny to see those layered architectures/designs.

Tightly coupled dependencies are bad. Cyclic or not.

And if you do mange to to a cyclic dependency you are not thinking about your design property. Imo


This was one of the reasons I used lots of Interfaces back in the day with Java.


That doesn't remove circular dependencies though, it just hides them.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: