Fuzzyset – A Human-Readable Interactive Representation of a Code Library

erikpukinskis · on July 3, 2018

This is great, I love explorations of how to represent code. (My master's thesis was a vaguely similar project: http://snowedin.net/windowinthebox/)

Lately, I find the best way to make my code more comprehensible is just to break it down into smaller pieces. The 300 lines of fuzzyset is pretty good! I find 150 lines is ideal, although I rarely understand the refactoring space well enough to hit that. 300 is more common, and 600-800 is not unheard of for me. And I code in 42 character width, so those are short lines, with smokers whitespace.

Forcing your code into a single 150-line JavaScript file with a README and a published interface is a great way to force yourself to really understand the core concern. You squeeze out any and all orthogonal concerns into their own fully independent modules. It clarifies the design to the point that the documentation has less to do.

gkaemmer · on July 4, 2018

At the same time, I find that it's often hard to comprehend projects that are broken into too many little files. There's normally no guide on what pieces are where, so you don't know which order to read things. Makes it a little tougher to dive in. I just want to find the file that has the "meat", and I don't mind if that file is 1000+ lines long, especially if the code itself makes sense and is clear.

erikpukinskis · on July 4, 2018

Agree.

But I’m advocating something much stronger than that: breaking them down into small modules each with a published interface.

You can see what all any given module depends on by looking at one set of imports at the top of one file.

Each of those has a github repo and a README, so it’s easy to find out what they do...

But most importantly, the architecture of such a library will be wholly different than what it would have been if developed as a monorepo.

My point is not really that this structure of code is ergonomic. It’s that forcing yourself to do things this way forces you to clarify the purpose of modules. And to discover interfaces which are truly orthogonal.

And that makes the code easier to understand.

de_watcher · on July 4, 2018

> the file that has the "meat", and I don't mind if that file is 1000+ lines long

I prefer "meat" separated. And having clear boundaries. 1k+ lines is a bad sign.

Even if it's well-structured, why force your idea of how much I have to scroll?

And why make git diff overviews less useful? (glance over filenames)

et1337 · on July 4, 2018

Completely agree. This is a common problem with object-oriented code bases. You've got hundreds of 20 line source files and the only way to decipher them is to trace the execution all the way from main() down to the last util class.

mjdease · on July 4, 2018

> And I code in 42 character width, so those are short lines

I've never heard of an approach like this, how did you settle on 42? I'd be very interested to see an example if you have some code hosted somewhere?

My initial reaction is 42 is too short and would encourage terse non-descriptive naming and be a pain to deal with long strings.

erikpukinskis · on July 4, 2018

40 was a little too small, and 42 has been OK in practice!

I find it’s helpful because nesting any structure beyond maybe 5 levels deep is painful, so I am forced to refactor.

Names aren’t so much of an issue because of the single-file-per-repo and 150-line-target principles:

If your module only consists of 150 lines, there’s just not enough in your namespace to need long names. Essentially, you only ever work within a small, isolated piece of a huge flat namespace.

Long names are necessitated by the huge, deep API surface that the filesystem provides.
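As a rough illustration of the nesting point above, here is a hypothetical sketch (not taken from erikpukinskis's code; all function and field names are made up). Every line fits in 42 columns. The first version nests until it runs out of room, and the second splits the same logic into small named helpers.

```javascript
// Hypothetical data shape:
// teams -> members -> { active, email }

// Nested version: the indentation eats
// most of the 42 columns.
function activeEmailsNested(teams) {
  var out = []
  teams.forEach(function(team) {
    if (team.members) {
      team.members.forEach(function(m) {
        if (m.active) {
          if (m.email) {
            out.push(m.email)
          }
        }
      })
    }
  })
  return out
}

// Flattened version: each helper does
// one thing and stays shallow.
function hasEmail(member) {
  return member.active && !!member.email
}

function memberEmails(team) {
  var members = team.members || []
  return members
    .filter(hasEmail)
    .map(function(m) { return m.email })
}

function activeEmails(teams) {
  var lists = teams.map(memberEmails)
  // Flatten one level of arrays.
  return [].concat.apply([], lists)
}
```

Both functions return the same list for the same input. The flattened one never goes more than two levels deep, and the short helper names stay readable because the file contains little else.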

tzahola · on July 4, 2018

Maybe he’s programming in APL

daveFNbuck · on July 4, 2018

What's smokers whitespace?

erikpukinskis · on July 4, 2018

Oops autocorrect. Ample whitespace.

meken · on July 4, 2018

I really like this.

A while ago I was prepping for programming interviews and thought it would be fun to make some videos demonstrating some programming concepts [1] due to my perception that there was a lack of material linking code with visualization/conceptual understanding. It wasn't long, though, before I stumbled upon a website [2] that had many videos that did just that, so I got a bit discouraged and gave up.

Around this time, I was also watching videos on algorithms [3]. After watching many videos where the professors give very thoughtful hand-drawn examples, I became convinced that the hard part is truly understanding the algorithm/data structure and that the code is the "easy" part. Maybe that's just my bias talking because I was a computer science major, though.

I still think that documenting a codebase with GIFs and video demonstrations is really useful. I get kind of sad when I visit a github repo and there are none of these because it can be hard to get a sense for what the code does (even a screenshot goes a really long way). For example, I think [4] did a really good job and I modeled a recent project's github README after it.

- [1] https://youtu.be/Z35sLFyLBek

- [2] https://www.geeksforgeeks.org/fundamentals-of-algorithms/

- [3] https://ocw.mit.edu/courses/electrical-engineering-and-compu...

- [4] https://github.com/ajtoo/vscode-org-mode

no_identd · on July 8, 2018

You might want to take a look at DRAKON:

https://en.wikipedia.org/wiki/DRAKON

expiredtofu · on July 4, 2018

Wow, the github UI lookalike was really well done...I almost thought the top banner was a wikimedia donation request kinda thing. Nice job!

anc84 · on July 4, 2018

And it totally primed my brain to expect only Markdown below, which made me not keen when I ran into something I did not know how to interact with. Not a good decision imo.

codetrotter · on July 3, 2018

Is https://github.com/Glench/fuzzyset.js/blob/gh-pages/ui/index... generated from something that merges the GitHub UI and data with your own content or was it handmade so to speak?

hk__2 · on July 3, 2018

It seems to be handmade as it doesn’t really merge the GitHub UI: the README is a little bit different; the stars count is not up-to-date; nor is the latest commit; etc.

codetrotter · on July 4, 2018

Ah, I am on mobile, so I could only immediately compare the number of open issues, and I assumed that because that number matched, they all did. Then I agree it’s certainly handmade.

It would be cool if it made use of the GitHub API to get the data. That should be possible, since the GitHub API gives access to a lot of data about repositories. For example, here is a static page I made that pulls some data from the GitHub API, so that I don’t have to manually update the HTML table at the bottom of the page with the repositories that exist in the GH org it belongs to. Thanks to that, I also have the number of open issues right on the page: https://dcp-solved-with-rust.github.io/
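A minimal sketch of what that could look like, assuming the GitHub REST API's "get a repository" endpoint and a runtime with a global `fetch` (a browser, or a recent Node). The helper names `repoApiUrl`, `summarizeRepo`, and `fetchRepoSummary` are hypothetical, and this is not how the fuzzyset page or the page linked above is actually built.

```javascript
// Build the REST API URL for one repository.
function repoApiUrl(owner, repo) {
  return "https://api.github.com/repos/" +
    encodeURIComponent(owner) + "/" +
    encodeURIComponent(repo);
}

// Pick out the fields a GitHub-style header would show.
// Note: open_issues_count also includes open pull requests,
// and pushed_at is the time of the last push, which is only
// an approximation of "latest commit".
function summarizeRepo(body) {
  return {
    name: body.full_name,
    stars: body.stargazers_count,
    openIssues: body.open_issues_count,
    lastPush: body.pushed_at
  };
}

// Fetch live data. Unauthenticated requests are rate limited,
// so a real page would probably want to cache the result.
async function fetchRepoSummary(owner, repo) {
  const response = await fetch(repoApiUrl(owner, repo), {
    headers: { Accept: "application/vnd.github+json" }
  });
  if (!response.ok) {
    throw new Error("GitHub API returned " + response.status);
  }
  return summarizeRepo(await response.json());
}
```

Usage would be something like `fetchRepoSummary("Glench", "fuzzyset.js").then(console.log)`, with the result written into the star and issue counters instead of hard-coding them.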