Matthew Zito's Blog

Jigsaw Fixation

exbotanical@protonmail.com — Sun, 06 Apr 2025 00:00:00 GMT

It's been a while since I've written one of these. I realize that I maintain this sort of harmonic existential panic that I am not writing enough. Specifically, I haven't in many years produced any sort of material artifact that documents the project of my personhood; who I am right now is in the minds of the very small number of people who know me well. Or really, just one: Daryl. I guess I want this to be here in case I die, or maybe so I can read it in ten years and play a game of remembering these exact minutes (or attempting to, anyway). My programs and code _are_ effectively one such artifact, but the interpretation thereof also warrants its own artifact, and I'm too lazy and too busy to write a whole epistemology about my own programming. So I've reconvened on a resolution to use this damned blog thing and write.

## Areaware

This last month, I've found my way into a jigsaw puzzle fixation. It began when Daryl and I took my dad and his wife to the Chihuly Glass Museum in Seattle. They've a gift shop which I presumed would be akin to any other museum gift shop: perhaps a few ostensibly interesting objects which quickly lose their allure courtesy of heavy price tags. I was surprised, then, to find reasonable prices on many items and moreover to find objects that I really, _really_, REALLY liked. Specifically, these little wooden robots that fold into cubes called Cubebots. Cubebots are puzzle/sculptures based on Japanese Shinto Kumi-ki; the designer, David Weeks, has created something like 30 different designs thereof. I love tiny objects and I love puzzles, so these caught my eye.

I ended up acquiring four of them via various venues. Namely, I found a design store called Areaware for which David Weeks made the Cubebots. Cubebots come in varying difficulties: the Cubebot Classic, Julien, and Guthrie, in order of least to greatest difficulty. Areaware still had Julien, but I had to purchase the Guthrie from the RISD online store.
It is with great restraint that I acquired just four (at one point I seriously considered purchasing every single variation all at once). Here are my Cubebots.

![Cubebots; clockwise from top-left: Julien, Guthrie, Milo, Classic](images/puzz1/puzz_9999.jpg)

![Cubebot Julien](images/puzz1/puzz_999.jpg)

Areaware also creates jigsaw puzzles in collaboration with various designers and artists. Areaware puzzles are like little design objects, with wide, thick boxes bearing wrap-around images. They're incredibly satisfying to collect in series. One such series, Bryce Wilner's Gradient Puzzles, features various bichromatic or trichromatic gradients in 100, 500, or 1000 pieces. I've a penchant for monochromatic and bichromatic objects, and I furthermore very easily fall for the strange allure of finite collectibles. Needless to say, a few eBay purchases later and I've completed several Areaware puzzles.

Here are some (though not all) of the Areaware puzzles I've completed thus far:

![Pink/Orange/Yellow Gradient by Bryce Wilner](images/puzz1/puzz_1.jpg)

![Black/White Gradient by Bryce Wilner](images/puzz1/puzz_9.jpg)

!["Stack" by Dusen Dusen](content/images/puzz1/puzz_7.jpg)

!["Arc" by Dusen Dusen](images/puzz1/puzz_29.jpg)

## Two Months Later...

So, uh...I wrote everything prior to the last set of pictures about ~~one month~~ two months ago. Since then, the puzzle thing has broken out into a sort of addiction. I burned through several more Areaware puzzles, including three from an out-of-print series by a Korean photographer and designer, KangHee Kim. The series is called _Puzzle-In-Puzzle_; essentially, each puzzle contains a smaller one therein, analogous to the overlaid photographs Kim has taken.
Here are the *Puzzle-In-Puzzle*s I've found and completed thus far:

!["SeaSaw" by KangHee Kim](images/puzz1/puzz_3.jpg)

!["SeaSaw" by KangHee Kim, with box](images/puzz1/puzz_4.jpg)

!["SeaSaw" by KangHee Kim, with inner puzzles partially removed](images/puzz1/puzz_5.jpg)

![Selection from "Street Errands" by KangHee Kim](images/puzz1/puzz_10.jpg)

![Selection from "Street Errands" by KangHee Kim, with box](images/puzz1/puzz_12.jpg)

![Selection from "Street Errands" by KangHee Kim](images/puzz1/puzz_39.jpg)

That last one I found at the Ballard Goodwill, now my favorite spot for finding new puzzles. They typically have an astoundingly good selection. Later, I'll share a vintage 1977 Springbok I purchased there this last weekend.

My fixation has quickly escalated since the KangHee Kim pieces. I began puzzling every day after work and for several hours on the weekends while listening to my ever-expanding checklist of MIT courses on OpenCourseWare.

## Le Puzz

Via an email campaign they sent out, I was apprised of an even greater vendor and puzzle manufacturer called [Le Puzz](https://lepuzz.com/). I suppose it was around this time I became fully possessed by a zeal for the jigsaw.

Just _look_ at Le Puzz. They've some of the most cohesive and brilliant design and copy I've ever seen. I love that they arrange their own compositions and conduct photoshoots for their puzzles. I have nearly one dozen of these in my collection now, by way of which I've come to find that random-cut pieces, on which they pride themselves, are far superior to other, grid-like puzzles.

!["Happy Birthday" by Le Puzz, in collaboration with Bodega Cakes](images/puzz1/puzz_16.jpg)

!["Freaky Deaky" by Le Puzz, in collaboration with MYFAWNWY](images/puzz1/puzz_47.jpg)

!["HEADS" by Le Puzz](images/puzz1/puzz_52.jpg)

!["I ❤ Stickers" by Le Puzz](images/puzz1/puzz_58.jpg)

!["I ❤ Stickers" by Le Puzz, detail](images/puzz1/puzz_62.jpg)

!["I ❤ Stickers" by Le Puzz, detail](images/puzz1/puzz_63.jpg)

!["So Random" by Le Puzz](images/puzz1/puzz_65.jpg)

!["My Favorite Mistake" by Le Puzz](images/puzz1/puzz_75.jpg)

!["Juicy" by Le Puzz](images/puzz1/puzz_43.jpg)

Just look at how gorgeous this box is!

![Packaging for "Happy Birthday" by Le Puzz](images/puzz1/lepuzz_bday.jpg)

## Springbok

Reading blogs and watching videos about puzzling also led me to the gold standard of vintage puzzles: Springbok.

As I see it, there are two requisite characteristics I look for in a jigsaw puzzle. One, the pieces must be random-cut. The unpredictability, and the novel surprise when some oblong piece magically reveals itself to be an edge piece in ways you had repeatedly thought impossible, is highly satisfying. Once you become accustomed to random-cut puzzle pieces, everything else feels awful, especially those glossy, sharp, laser-cut grid pieces many new puzzle brands like to use.

Two, I prefer dense, aesthetically balanced (and often just a tad kitschy) compositions, something at which Springbok and Le Puzz (in the inspired tradition of the former) excel. There's something to be said for these puzzle companies back in the 70s and 80s, and how they took these playful risks and made such ludicrous design decisions. Springbok champions this what with their dense collages which might, for those of us born in the nineties, hearken forward to the I-Spy book series and Walter Wick's brilliant photography.

Look at this 1977 Springbok named "Just Your Type". I love the balance of the composition, and the quasi-monochromatic colorway thereof.

!["Just Your Type" by Springbok (1977)](content/images/puzz1/s-l1200.jpg)

What follows is another Springbok I have an eye out for. I like how ridiculous this one is.

!["Chip-Chip Hooray" by Springbok (1984)](content/images/puzz1/spbk_cookie.jpg)

Decades later, Springbok pieces retain a delightful tactility and are of superior quality to many newer brands' prints. I love that Springbok puzzles each have a distinct name, akin to the title of a book or a film, and how the box thereof is adorned with thematic sayings, phrases, and paraphernalia. I can certainly see where Le Puzz got their inspiration.

Here are a few of the Springbok puzzles I now own:

!["Yesteryear" by Springbok](images/puzz1/puzz_74.jpg)

!["Fan-tabulous Fifties" by Springbok](images/puzz1/puzz_44.jpg)

!["VerticalVille" by Springbok, with Tamlin aka Snoopy](images/puzz1/puzz_86.jpg)

!["VerticalVille" by Springbok, progress shot](images/puzz1/puzz_83.jpg)

That last one is the aforementioned recent find from the Ballard Goodwill.

One more thing I love about jigsaw puzzles is hunting for them in old thrift shops, garage sales, and other venues. Uncovering an unlikely trove of old Springboks is immensely satisfying. I suppose some people extrapolate this same satisfaction from record collecting, a practice in which I avidly partake, but I never have (qua _"extrapolate..."_). Namely because there are so many records, and my tastes are so specific and supposedly eclectic that I won't even waste my time looking at a stack of vinyl I find in those same aforementioned venues in which I find the jigsaw. But puzzles are common enough that you might just find one in the unlikeliest of places, and I don't have to be too fastidious about them beyond the two requisites I mentioned earlier (random-cut pieces with a good composition).

Normally, I get extremely fixated on things for only a few months. Then I grow bored with them and find obsession elsewhere.
This is how I've ended up with a fantastic zine collection, a pile of hobo literature, vintage fetish books, a DVD collection of TV Party episodes, and a pile of circuits. That said, this jigsaw puzzle thing has been going steady and strong for some 3 months. I suppose I will continue to partake (eventually intermittently) as a stress relief endeavor.

I've thought about creating a blog specifically for jigsaws, or finding a forum on which I can share my finds and swap puzzles with people. Until then, I'll intersperse my writings here with the occasional find; that is, assuming my writing panic proves a sufficient motive to return anytime soon.

Bye!

Corundum and the Dromedary

exbotanical@protonmail.com — Mon, 22 Apr 2024 08:00:00 GMT

## Of OCaml

Well, it's already April and I've allowed far too much time to pass since my last post. Recently, in equal part due to watching too many Primeagen videos and wanting to learn a more traditionally functional programming language, I decided to learn OCaml. OCaml was created in 1996 by a squad of French computer scientists as an extension of the Caml ML dialect with object-oriented features.

While there are certainly some aspects of OCaml I've come to really quite appreciate, I don't anticipate I'll be incorporating it into my usual repertoire. Also, learning OCaml solidified for me a notion that Kotlin has perhaps the most superior grammar I've ever seen in a programming language. Thus, I'd like to take a few moments to talk a bit about OCaml (and what gripes I've thus far accumulated for it), laud Kotlin, and share about my burgeoning forays into the v8 JavaScript engine.

The first thing that threw me off about OCaml was its `let` assignments. There are no statements in OCaml - everything is an expression. Variables and functions are declared with `let`, which I suppose is an FP thing where everything is a first-class expression that can be passed around. As someone who typically defaults to function declarations (for example, I _don't_ prefer function expressions in JavaScript unless absolutely necessary - and such necessity is something for which I cannot at this moment conjure up any examples), this aspect of OCaml felt strange.

Almost everything in OCaml uses type inference, because OCaml is a truly strongly-typed language (unlike TypeScript where you can wantonly lie to `tsc` and do bizarre things that fall outside the confines of traditional set theory). So here's an OCaml function:

```ocaml
let fn arg =
  let local_var = 1 in
  arg + local_var
```

OCaml will infer the argument `arg` to be of type `int`. In some cases, however, the OCaml compiler is unable to infer the type you intend (or you simply want to spell it out), so you'll use a type annotation syntax.
By the way, all OCaml functions are curried, so here's a multi-argument function:

```ocaml
let fn (arg1 : string) (arg2 : int) = arg1 ^ string_of_int arg2
```

Currying isn't anything new, but I like that OCaml builds this in by default. Partial function application is trivial and it makes every multi-argument function far more versatile.

If you've never written (or read) OCaml code, you're probably wondering what the `in` is all about. Understanding this bit of syntax was really confusing, and to be honest, the docs weren't all that helpful in this regard either. After some trial and error and online reading, I eventually figured it out. Essentially, the syntax is saying `let some_value = expr1 in expr2`. In the first example, `expr2` is simply the rest of the function body; a top-level `let`, by contrast, has no `in` at all, and its binding stays visible for the remainder of the file. As [this](https://stackoverflow.com/a/73319917) SO comment quite succinctly explains, OCaml binds the result of `expr1` to the pattern (here `some_value`), then evaluates `expr2` with the new bindings present.

In practice, that looks like

```ocaml
let x = 2 in
x + 1 (* evaluates to 3; x itself is still 2 *)
```

Yes, it's weird. OCaml's grammar is very odd.

OCaml's semicolon treatment is also very odd, and kind of stupid; it almost feels arbitrary because of the sheer complexity of the rules governing when you'll need an explicit semicolon to terminate an expression. In OCaml, semicolons are not optional - they're necessary, but only when the OCaml compiler can't figure out where an expression ends (strictly speaking, `;` is a sequencing operator that sits _between_ expressions rather than a terminator). The situations in which this can arise are surprisingly numerous, and if you use a semicolon outside one of these situations, you'll run into a compiler error.

```ocaml
let found_top_level =
  try Hashtbl.find top_level name
  with Not_found ->
    notfound := true;
    let item = { version = satisfied_v; url = satisfied.dist.tarball } in
    Hashtbl.add top_level name item;
    item
in
```

You'll also need semicolons to delimit list elements

```ocaml
dependencies = [
  ("@magister_zito/eslint-config-vue", "^0.15.0");
  ("@typescript-eslint/eslint-plugin", "^5.48.1");
  ("@typescript-eslint/parser", "^5.48.1");
  ("jsonc-eslint-parser", "^2.1.0");
  ("yaml-eslint-parser", "^1.1.0");
];
```

And record fields:

```ocaml
Lock.upsert
  (name ^ "@" ^ v_constraint)
  {
    version = satisfied_v;
    url = satisfied.dist.tarball;
    shasum = satisfied.dist.shasum;
    dependencies = satisfied.dependencies;
  };
```

As a new OCaml user, your workflow will involve a feedback loop of guessing where the semicolon needs to go whenever your LSP starts bitching at you.

In some languages, by contrast, the semicolon as a syntactical terminator is extremely rare (e.g. Python, where it _can_ be used to terminate statements but canonically is not). In others, like C, they're required at the end of every statement. In languages like JavaScript, where you have ASI (automatic semicolon insertion), you very seldom _must_ use a semicolon (like when a line begins with an array literal or an immediately-invoked function expression, aka IIFE). It's optional - you can choose to use them a la C, or only when absolutely necessary (which, again, is rare).

But OCaml...honestly, I would go so far as to say this aspect of OCaml is outright stupid. The language requires semicolons enough that it would behoove the OCaml developer experience to just require them everywhere. But alas, in OCaml, we need the lack of semicolon for the implicit `in` and for implicit returns (think Rust), et al. The language just relies on a very precarious balancing act of semicolons and a lack thereof.

Another unexpected aspect of OCaml is that `let` "assignment" is actually a pattern matching affair.
This is exemplified by the ubiquitous entry-point `let () = ...` (effectively the `main` function of an OCaml program). In OCaml, `()` is the only value of type `unit`, so `let () = ...` is a pattern match that succeeds whenever the right-hand side evaluates to `()`; because top-level bindings are evaluated in order, that expression always executes.

OCaml also has a `match` expression that I won't delve into because it's essentially the same thing as what you have in Rust, or Kotlin's `when` but with more powerful pattern-matching semantics. It's quite awesome.

What got me thinking about Kotlin was the inability to do explicit returns in OCaml. Most of the time in OCaml, if you need to early-exit from a function, you're fucked. That's not a thing in the FP paradigm; you either use exhaustive pattern-matching or you must use `if...else` branching - all branches must return the same type. I ran into this when learning OCaml and ended up with a nasty combination of `if...else` and `match` and all the nesting bullshit that comes with it.

Kotlin, however, elegantly blends the power of FP canon with the liberal expressiveness afforded by the OOP lineage in the form of labeled return statements:

```kotlin
listOf(1, 2, 3).forEach {
    if (it == 3) return@forEach
    println(it)
}
```

## Of Kotlin

It was in this moment that I realized how much I've come to appreciate Kotlin's grammar. I say _grammar_ specifically here because I am largely not too fond of JVM languages. I find Java to be rather clunky and the surrounding ecosystem, build systems, and package managers moreover are an utter catastrophe. But for whatever reason, the language designers of Kotlin nailed the grammar on almost every account. Writing Kotlin is an absolute pleasure.

Sometime last year, Kotlin swept across teams at AWS and mine adopted it for a new greenfield system we were building. I learned Java during the month between my hire and start dates at Amazon, then began using it daily therein. One year later, I was able to learn Kotlin such that I was writing code proficiently within one day.
It took perhaps a couple of weeks to master the more complex aspects, e.g. coroutines, generics, etc. Kotlin offers an incredibly efficient adoption process, mostly because its grammar is simple.

Before I proceed any further, I've a digression I must make. A few years prior to my arrival at Amazon, my team had taken a similar albeit short-lived excursion into Scala. A small portion of one of our legacy codebases was written in Scala, and one day I had to understand that portion in order to debug a Sev2 issue I had encountered while on-call. It was utterly baffling: cute, esoteric syntax. Myriad strange keywords with no analogous counterparts in any popular language. Weird lexical bindings that seem to appear out of nowhere. And worst of all, a complex grammar that would make the likes of Noam Chomsky explode.

My personal litmus test for a programming language is as follows:

> Can I understand _generally_ what this code is doing, just by reading it?

If yes, then the language's grammar is very likely sound, or at least sensible. If no, the language's grammar is very likely obscenely esoteric and over-complicated. The likelihood that it's a shitty language is also, incidentally, much higher.

I've not written Scala, but from my brief encounters with it, I'm not particularly impressed. Ostensibly - at least - it's a bad idea.

Anyway, Kotlin isn't like Scala. The grammar is simple and legible. Most of my team at AWS picked it up very quickly, which I believe is a testament to the language itself. Nowadays, I often find myself wishing for Kotlin's syntactical graces when writing code in other languages.
Just the other day I was writing some infrastructure code in TypeScript and while writing a `switch` statement, I thought _"Isn't it so unfortunate that JavaScript doesn't have a `switch` **expression**?"_

In Kotlin you can do:

```kotlin
val x = when {
    expr1 -> "a"
    expr2 -> "b"
    else -> "c"
}
```

But in dumpy old JavaScript, we're stuck with this:

```ts
let val: string

switch (expr) {
  case 1:
    val = 'a'
    break
  case 2:
    val = 'b'
    break
  default:
    val = 'c'
    break
}
```

Boo!

Kotlin also improves upon Java in myriad ways. While I'll confess that quickly bootstrapping from nothing to a runnable program in _any_ JVM language is a pain, when solely considering programming, Kotlin is a remarkable improvement over Java. There's so much ceremony and boilerplate involved with Java code, and this is further compounded by inane ideas like the Spring framework, which is essentially a collection of "magical" abstractions which are almost always more complex than is necessary for the scope of the consuming project. But with Kotlin, classes and functions can be terse and minute. Look at how dumb simple this is:

```kotlin
interface D

class X : D

class Y(val x: String) : D {
    fun main() {
        println(x)
    }
}

fun main() {
    val y = Y("gello")
    y.main()
}
```

Versus the 4 files required in Java:

```java
// D.java
public interface D {}

// X.java
public class X implements D {}

// Y.java
public class Y implements D {
    private String x;

    Y(String x) {
        this.x = x;
    }

    public void main() {
        System.out.println(this.x);
    }
}

// Main.java
public class Main {
    public static void main(String[] args) {
        Y y = new Y("gello");
        y.main();
    }
}
```

## Of Ruby

Another thing I love is that Kotlin did away with checked exceptions. I've long loathed the idea of checked exceptions for several reasons, among them that it forces you to reckon with exceptions using `try/catch` blocks which inevitably and almost invariably encourage bad programming practices.
The Kotlin docs have a [section](https://kotlinlang.org/docs/exceptions.html#checked-exceptions) dedicated to this. Kotlin also fixes Java's type system, controlling null references, eliminating array variance, implementing proper function types (not that SAM interface bullshit Java does), demarcating read-only versus mutable collections, etc.

But things haven't always been lovely like this at Amazon (this is my segue into Ruby). I'll share a little story. Once upon a time, AWS realized that writing infrastructure by hand was exceedingly difficult. Handwritten CloudFormation templates are challenging if not impossible to test; they can be verbose and repetitive and it's easy to make mistakes. The not-at-all-new notion of writing infrastructure with code (see: [Infrastructure as Code](https://aws.amazon.com/what-is/iac/) aka IaC) seemed the obvious solution.

So AWS engineers decided to write a framework for doing IaC, one that could be used by the entirety of Amazon Web Services - by this I mean the Lambda, DynamoDB, ECS, SQS, SNS, CloudWatch, IAM, et al. teams would be writing their software infrastructure using this library.

So what do the guys in the AWS build tools team decide to use for this framework that will be used by thousands of engineers to build the infrastructure on which a considerable chunk of the internet runs?

Fucking Ruby.

In a moment of brilliant clarity, the AWS build tools team concocted the bane of every AWS SDE's existence: LPT, or Live Pipeline Templates. I can't delve into the inner-workings (because it's an internal library), but this is what preceded the AWS CDK. You heard that right: before AWS CDK, every AWS team had to write all of their infra using Ruby.

I'd never used Ruby prior to my tenure at Amazon, and I will certainly never use it again thereafter. It's the absolute most disgusting, awful language.
It's dynamically typed, uses esoteric syntax, has a gazillion keywords; it's impossible to tell what is part of the standard library versus just part of the grammar (which is a glaring sign of feature bloat); the build system is a nightmare and is almost never reproducible from machine to machine; it's painfully slow; it's burdened with seemingly endless "cute" syntactic sugar. It's horrible to work with and I also learned that IDE/LSP support for Ruby is completely in the gutter. I absolutely despise Ruby; I hate it, I hate it, I hate it.

It is absolutely baffling to me that AWS engineers thought this was a good idea. In my dazed confusion, I went spelunking on Amazon's internal search and found a discussion thread where several SDEs wrote to the build tools team to inquire about the mystifying reasoning that led to their thinking that Ruby was a good choice, and after a few lengthy exchanges, they simply confessed that they regretted doing so.

## v8

Language rants aside, I've been delving into the v8 source code as of late because I want to see for myself how JavaScript can be implemented. I've read many articles and watched a few lectures on the underlying implementations of, say, Promises and the micro-task queue, but I've come to terms with the fact that I won't be satisfied until I've read the source code myself.

Most reading materials focus quite heavily on the v8 compilers and the optimizations for which v8 is well known. This is understandable, but I would like to read about the JavaScript implementation itself: how does the prototype chain really work in practice (beyond some cursory musings about linked list lookups)? Promises (beyond abstract notions of queues and state machines)? What about the translation layer? I remember reading Ryan Dahl's original implementation of Node and finding it remarkable that much of the standard library was, in fact, implemented as a thin JavaScript API in front of the underlying C++ machinations we all know about.

![v8 engine logo](images/v8.png)

As far as I can tell, v8 is a marvel of software engineering; an incredible feat and the cumulative product of some 2 or 3 decades of Lars Bak's work on Self. It's impressive - albeit intimidating - but I will nevertheless be back soon with some source code reviews of v8's JavaScript implementation. See you next time.

ucontext and coroutines

exbotanical@protonmail.com — Mon, 12 Feb 2024 00:00:00 GMT

Recommended listening: [Gétatchèw Mèkurya - Ethiopian Urban Modern Music Vol. 5 (1972)](https://www.youtube.com/watch?v=a4Uvr5tIIoU)

Here's my source code review of the `coroutine` C library by an author called cloudwu. Figuring out this code was a great exercise in understanding cooperative scheduling, coroutines/fibers, and handling stacks in user-space. We'll focus on these concepts during the review. There are some clever bits that I absolutely loved as well, and about which I'm excited to share.

`coroutine` is built atop the `ucontext` library. I've dedicated the immediately subsequent section to a brief foray into `ucontext` and its most commonly used APIs. Thereafter, we'll take what we know about `ucontext` and review the coroutine source code.

## ucontext

`ucontext` handles much of the heavy lifting needed to implement something like threads, e.g. saving registers, context switching, etc.

The core API is very simple and focused around the `ucontext_t` struct:

```c
typedef struct ucontext_t {
  struct ucontext_t *uc_link;
  stack_t uc_stack;
  mcontext_t uc_mcontext;
  // signals to block in the context
  sigset_t uc_sigmask;
  // ...
} ucontext_t;
```

It'll be helpful to first see some examples to understand what purpose each of these fields serves. We'll look at a few example programs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t ctx1;

int main(void) {
  int x = 0;
  getcontext(&ctx1);

  int y = 0;
  printf("x=%d, y=%d\n", ++x, ++y);
  setcontext(&ctx1);

  return EXIT_SUCCESS;
}
```

First, we have `getcontext`. It accepts as input a pointer to a `ucontext_t` and will take whatever registers - including the program counter - we have in the current execution context and save them in this `ctx1` context.

Next, we have `setcontext`. This will cause the program to resume execution right after the `getcontext` call. We'll therefore infinitely loop between the two calls.

Quite illustrative here are the `x` and `y` variables. The value of `x` continues to increase while `y` forever remains 1.
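
The infinite loop makes this hard to watch, so here is a bounded sketch of the same checkpoint-and-jump pattern. It's an illustration under my own assumptions rather than anything taken from a library: `count_passes`, `checkpoint`, `passes`, and `fresh` are made-up names, and marking the counter `volatile` is my way of forcing it to live in stack memory instead of a register.

```c
#include <assert.h>
#include <ucontext.h>

/* Hypothetical helper: keeps jumping back to the getcontext checkpoint
 * until `limit` passes have been made, then reports how many happened. */
int count_passes(int limit) {
  ucontext_t checkpoint;
  /* volatile (an assumption of this sketch): forces the counter to be
   * read from and written to stack memory on every access. */
  volatile int passes = 0;

  getcontext(&checkpoint); /* every setcontext below resumes right here */

  int fresh = 0; /* this initializer runs again on every pass */
  passes++;
  fresh++;
  assert(fresh == 1);

  if (passes < limit) {
    setcontext(&checkpoint); /* jump back; does not return on success */
  }
  return passes;
}
```

Calling `count_passes(5)` returns 5. `passes` plays the role of `x` and survives every jump, while `fresh` plays the role of `y` and is back at 1 on each pass.
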
This happens because `setcontext` restores only the saved registers, including the stack pointer and the program counter (hence resuming execution just after the most recent `getcontext` call); the contents of stack memory are not rolled back. `x` therefore keeps whatever value it last had in memory (assuming the compiler hasn't cached it in a register), whereas `y`'s initializer sits after the checkpoint, so it is reset to 0 on every pass before being incremented to 1.

Next, we have `makecontext` and `swapcontext`, which will be fundamental in cloudwu's coroutines implementation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t context;

void handler(int arg1, int arg2) {
  printf("Function called with arguments: %d, %d\n", arg1, arg2);
}

int main(void) {
  char stack[2048];

  getcontext(&context);
  context.uc_stack.ss_sp = stack;
  context.uc_stack.ss_size = sizeof(stack);

  makecontext(&context, (void (*)(void))handler, 2, 10, 20);

  printf("pre-swap\n");
  setcontext(&context);

  printf("Back to main function - will never run\n");

  return EXIT_SUCCESS;
}
```

In `main`, we leverage `makecontext`, which allows us to create a new context with its own stack. Notice we'll also have to allocate a stack for the context. Using this context allows us to reset the program counter to a different function (`handler` in the demo program). If you've ever written assembly, this is very analogous to a `jmp` instruction.

The third argument to `makecontext` is an integer indicating the number of arguments we want to pass to the context handler, and any subsequent `makecontext` arguments are the handler's arguments - the quantity of which, obviously, should match that count. Know that the handler arguments must be integers, which isn't particularly useful in practice. The `coroutine` library handles this in a unique way that we'll see later.

In this example, we call `setcontext` on this new context to immediately invoke `handler` with 2 integer arguments. The default behavior after `handler` finishes executing is to end the process.
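
As for the integer-only arguments: one common workaround is to split a pointer into two 32-bit halves and reassemble it inside the handler. The sketch below is mine and makes no claim about how `coroutine` does it; `struct job`, `trampoline`, and `run_job` are hypothetical names, it assumes `int` is at least 32 bits wide and pointers are at most 64, and it uses `swapcontext` (covered shortly) to hand control back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <ucontext.h>

/* Hypothetical payload that the handler should receive by pointer. */
struct job {
  int input;
  int output;
};

static ucontext_t caller_ctx;
static ucontext_t job_ctx;
static char job_stack[64 * 1024];

/* makecontext only forwards ints, so the pointer arrives as two halves. */
static void trampoline(unsigned int lo, unsigned int hi) {
  uint64_t bits = (uint64_t)lo | ((uint64_t)hi << 32);
  struct job *j = (struct job *)(uintptr_t)bits;

  j->output = j->input * 2;

  /* Hand control back explicitly instead of returning. */
  swapcontext(&job_ctx, &caller_ctx);
}

/* Runs trampoline on its own stack and returns what it computed. */
int run_job(struct job *j) {
  uint64_t bits = (uint64_t)(uintptr_t)j;

  getcontext(&job_ctx);
  job_ctx.uc_stack.ss_sp = job_stack;
  job_ctx.uc_stack.ss_size = sizeof(job_stack);
  job_ctx.uc_link = NULL;

  makecontext(&job_ctx, (void (*)(void))trampoline, 2,
              (unsigned int)(bits & 0xffffffffu),
              (unsigned int)(bits >> 32));

  /* Save our position in caller_ctx and start running trampoline. */
  swapcontext(&caller_ctx, &job_ctx);
  return j->output;
}
```

Given `struct job j = { 21, 0 };`, `run_job(&j)` returns 42. Note that `trampoline` hands control back explicitly; in the `setcontext` example above, `handler` simply returns with no context to go back to, so the process exits.
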
Thus, the string `"Back to main function - will never run\n"` will never be written to stdout. We can control this behavior by leveraging the `uc_link` field in the `ucontext_t` struct. Setting this field tells the program to `setcontext` to the context stored in `uc_link` as soon as the handler finishes executing.

In the following example we set `uc_link` to `main_context`. As soon as `handler` finishes running, execution will resume just after `getcontext(&main_context)`, causing an infinite loop.

```c
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_context;
static ucontext_t context;

void handler(int arg1, int arg2) {
  printf("Function called with arguments: %d, %d\n", arg1, arg2);
}

int main(void) {
  char stack[2048];

  getcontext(&main_context);
  getcontext(&context);

  context.uc_stack.ss_sp = stack;
  context.uc_stack.ss_size = sizeof(stack);
  context.uc_link = &main_context;

  makecontext(&context, (void (*)(void))handler, 2, 10, 20);

  printf("pre-swap\n");
  setcontext(&context);

  printf("Back to main function\n");

  return EXIT_SUCCESS;
}
```

This is really quite akin to what threads do, and I've heard that a lot of university courses teach `ucontext` for toy thread scheduler implementations. One can certainly see why - much of this is, superficially, almost perfectly analogous to `pthread_create` and `pthread_join`.

There's one more function we should look at: `swapcontext`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_context;
static ucontext_t context;

void handler(int arg1, int arg2) {
  printf("Function called with arguments: %d, %d\n", arg1, arg2);
  swapcontext(&context, &main_context);
}

int main(void) {
  char stack[2048];

  getcontext(&context);
  context.uc_stack.ss_sp = stack;
  context.uc_stack.ss_size = sizeof(stack);

  makecontext(&context, (void (*)(void))handler, 2, 10, 20);

  printf("pre-swap\n");
  swapcontext(&main_context, &context);

  printf("Back to main function\n");

  return EXIT_SUCCESS;
}
```

In this program, we made a slight modification in that we're now calling `swapcontext`.
`swapcontext` effectively calls `getcontext` on the first argument, then `setcontext` on the second. That is, we create a checkpoint of sorts at `swapcontext`, then immediately begin executing `handler`. Notice we don't need `getcontext` to initialize `main_context`, as the swap call does this for us.

After the `printf` statement in `handler`, we call `swapcontext` again and resume execution just after the initial `swapcontext` call in `main`. So the program output looks like this:

```
pre-swap
Function called with arguments: 10, 20
Back to main function
```

This should be sufficient knowledge of `ucontext` such that we can appreciate `coroutine`.

# Source Code Review: coroutine

A while back I found an implementation of coroutines written in C by someone called cloudwu. I bookmarked the library because it looked like a very concise and well-written implementation of coroutines that would be easy to digest.

Coroutines are a flavor of cooperative multi-tasking centered around suspending and resuming execution flow. You might be familiar with the concept because of Go's built-in _goroutines_, or Kotlin's `kotlinx.coroutines` library.

You can find the source code for `coroutine` [here](https://github.com/cloudwu/coroutine/tree/a6263031e6ee3d3e10e613f0c0c3af886465170e) (this is the exact revision I'll be reviewing). All code samples henceforth are authored by cloudwu, with occasional albeit slight modifications I've made to improve readability for the sake of code review (predominantly spacing), or to add annotations. I'll prefix all of my annotations with `Z:`.

We begin with the example program, main.c, found in the root level directory.
```c
#include "coroutine.h"
#include <stdio.h>

struct args {
  int n;
};

static void foo(struct schedule * S, void *ud) {
  struct args * arg = ud;
  int start = arg->n;
  int i;
  for (i=0;i<5;i++) {
    printf("coroutine %d : %d\n",coroutine_running(S) , start + i);
    coroutine_yield(S);
  }
}

static void test(struct schedule *S) {
  struct args arg1 = { 0 };
  struct args arg2 = { 100 };

  int co1 = coroutine_new(S, foo, &arg1);
  int co2 = coroutine_new(S, foo, &arg2);

  printf("main start\n");
  while (coroutine_status(S,co1) && coroutine_status(S,co2)) {
    coroutine_resume(S,co1);
    coroutine_resume(S,co2);
  }
  printf("main end\n");
}

int main() {
  struct schedule * S = coroutine_open();
  test(S);
  coroutine_close(S);

  return 0;
}
```

In `main`, we create a struct called `schedule`, then pass it into a `test` function where we initialize two coroutines and continually call `coroutine_resume` on them until they are no longer active. If you run this program, you'll see that the coroutines take turns executing the `foo` function each with their own local state. The output is as follows:

```
main start
coroutine 0 : 0
coroutine 1 : 100
coroutine 0 : 1
coroutine 1 : 101
coroutine 0 : 2
coroutine 1 : 102
coroutine 0 : 3
coroutine 1 : 103
coroutine 0 : 4
coroutine 1 : 104
main end
```

I was really quite interested to see how this logic is implemented. I decided to read the source code in order of execution. We'll therefore begin in `main`, with the `schedule` struct and `coroutine_open`.

```c
struct schedule {
  char stack[STACK_SIZE];
  ucontext_t main;
  int nco;
  int cap;
  int running;
  struct coroutine **co;
};
```

In `coroutine`, a `schedule` is an opaque struct that represents the runtime environment in which all of the coroutines execute and context switch. Despite being logically well-written, `coroutine` uses unnecessarily terse names that make it difficult to understand.
I've therefore summarized each field below to elucidate its purpose:

- `stack` is the stack for the current coroutine execution environment
- `main` is effectively a checkpoint context for the coroutine. This is where we'll resume execution when a coroutine yields control back to the scheduler.
- `nco` tracks the number of coroutines that have been added to an array thereof
- `co` is said array
- `cap` is the capacity of the coroutines array, `co`, which will be pre-initialized and realloc'd when `nco` reaches `cap`
- `running` is the index of the currently executing coroutine qua the `co` array

`coroutine_open` is a very simple initializer function for a `schedule`. The initial coroutine capacity is 16 and the `co` array's elements are initialized to 0 via `memset`. The author does this because we will later need to perform `NULL` checks against any one of these slots when inserting new coroutines.

```c
struct schedule * coroutine_open(void) {
  struct schedule *S = malloc(sizeof(*S));
  S->nco = 0;
  S->cap = DEFAULT_COROUTINE;
  S->running = -1;
  S->co = malloc(sizeof(struct coroutine *) * S->cap);
  memset(S->co, 0, sizeof(struct coroutine *) * S->cap);
  return S;
}
```

Now we have our `schedule` in `main` and we pass it down through `test`. We'll call `coroutine_new` twice, initializing two coroutines. We will first look at the `coroutine` struct, then the function:

```c
struct coroutine {
  coroutine_func func;
  void *ud;
  ucontext_t ctx;
  struct schedule * sch;
  ptrdiff_t cap;
  ptrdiff_t size;
  int status;
  char *stack;
};
```

- `func` is the coroutine handler. This is the "work" performed by the coroutine.
- `ud` is the coroutine handler arg. Later on, we will see how the library manages to pass this in.
- `ctx` is the context.
- `sch` is a pointer to the `schedule` that manages this coroutine. As far as I can tell, it's not used in the library.
- `cap` is the capacity of the buffer that holds the coroutine's saved stack
- `size` is the number of bytes of stack currently saved in that buffer
- `stack` is where we persist the coroutine's stack prior to yielding
- `status` indicates the coroutine's state - `COROUTINE_DEAD`, `COROUTINE_READY`, `COROUTINE_RUNNING`, or `COROUTINE_SUSPEND`

> `ptrdiff_t` was a type I had not encountered before. The signed integer type of the result of subtracting two pointers, it exists primarily for pointer arithmetic. Interesting!

Here's the function `coroutine_new`, annotated:

```c
int coroutine_new(struct schedule *S, coroutine_func func, void *ud) {
  // Z: Just mallocs and sets all fields to defaults
  struct coroutine *co = _co_new(S, func , ud);

  // Z: If the num coroutines in the schedule is at or above capacity...
  if (S->nco >= S->cap) {
    // Z: Reallocate to have double the capacity and memset the memory.
    int id = S->cap;
    S->co = realloc(S->co, S->cap * 2 * sizeof(struct coroutine *));
    memset(S->co + S->cap , 0 , sizeof(struct coroutine *) * S->cap);

    // Z: Assign the new coroutine at the next available index (the old capacity)
    S->co[S->cap] = co;
    // Z: Now update the capacity
    S->cap *= 2;
    ++S->nco;
    return id;
  } else {
    // Z: We have enough capacity. Find the next available index
    int i;
    for (i = 0; i < S->cap; i++) {
      // Z: Calculating this way always gives us the next available index
      // and always cycles to the beginning if we run out of cap (shouldn't
      // happen because of the realloc cond above)
      int id = (i + S->nco) % S->cap;
      if (S->co[id] == NULL) {
        S->co[id] = co;
        ++S->nco;
        return id;
      }
    }
  }

  // Z: Trigger a failure (this will be compiled out in release builds)
  assert(0);
  return -1;
}
```

Not much of interest in `_co_new`, just a struct initializer. The coroutine is initialized with a status of `COROUTINE_READY`.

`coroutine_new` is super basic, but I've annotated it nonetheless. The author has a very terse style that I'm not particularly fond of and it makes everything harder to understand than it should be.
Really, this is your standard fare initialization logic with a few interesting clever bits. One such clever bit that I like is the `assert` towards the end. When `NDEBUG` is defined, as is typical for release builds, `assert` compiles to nothing, but it serves its purpose during development as a safeguard against invariant violations that should never happen.

Okay, so back in main.c we've initialized two coroutines. Now we enter a while loop that won't break until either `coroutine_status` call returns a falsy value, i.e. `COROUTINE_DEAD`.

```c
int co1 = coroutine_new(S, foo, &arg1);
int co2 = coroutine_new(S, foo, &arg2);

printf("main start\n");
while (coroutine_status(S, co1) && coroutine_status(S, co2)) {
  coroutine_resume(S, co1);
  coroutine_resume(S, co2);
}
```

`coroutine_status` is very simple, but again, I have annotated it:

```c
int coroutine_status(struct schedule * S, int id) {
  // Z: Maintain invariant: should be a non-negative index and below the capacity
  assert(id>=0 && id < S->cap);

  // Z: if the slot contains NULL, the coroutine is no longer active
  if (S->co[id] == NULL) {
    return COROUTINE_DEAD;
  }

  return S->co[id]->status;
}
```

Assuming neither coroutine has a status of `COROUTINE_DEAD`, this loop will keep going. Inside the loop body, the program calls `coroutine_resume` on both coroutines.
`coroutine_resume` is where things get interesting:

```c
void coroutine_resume(struct schedule * S, int id) {
  assert(S->running == -1);
  assert(id >=0 && id < S->cap);

  struct coroutine *C = S->co[id];
  if (C == NULL) return;

  int status = C->status;
  switch(status) {
  case COROUTINE_READY:
    getcontext(&C->ctx);
    C->ctx.uc_stack.ss_sp = S->stack;
    C->ctx.uc_stack.ss_size = STACK_SIZE;
    C->ctx.uc_link = &S->main;
    S->running = id;
    C->status = COROUTINE_RUNNING;
    uintptr_t ptr = (uintptr_t)S;
    makecontext(&C->ctx, (void (*)(void)) mainfunc, 2, (uint32_t)ptr, (uint32_t)(ptr>>32));
    swapcontext(&S->main, &C->ctx);
    break;
  case COROUTINE_SUSPEND:
    memcpy(S->stack + STACK_SIZE - C->size, C->stack, C->size);
    S->running = id;
    C->status = COROUTINE_RUNNING;
    swapcontext(&S->main, &C->ctx);
    break;
  default:
    assert(0);
  }
}
```

It's difficult to read in its as-is state, so I'll break it down into three parts: the setup, the `COROUTINE_READY` code path, and the `COROUTINE_SUSPEND` code path. Every call to `coroutine_resume` begins with this preliminary logic, which I've annotated.

```c
// Z: We shouldn't call resume while the schedule is in a running state.
// Should be set back to -1 after every run, so it shouldn't be possible
// for user code to violate this invariant. Hence why it's an `assert`.
assert(S->running == -1);
// Z: Ditto - we bounds-check the id against the capacity.
assert(id >=0 && id < S->cap);

// Z: Last, if the coroutine at the given index is NULL, we exit.
struct coroutine *C = S->co[id];
if (C == NULL) return;
```

At this point in our systematic breakdown of main.c, we have just invoked `coroutine_resume` for the first time, so we'll hit the `COROUTINE_READY` case of the switch statement.

```c
// Z: If the coroutine is COROUTINE_READY, create the context and prepare to
// run the coroutine
case COROUTINE_READY:
  // Z: Set up the coroutine's context...
  getcontext(&C->ctx);
  // Z: ...and its stack
  C->ctx.uc_stack.ss_sp = S->stack;
  C->ctx.uc_stack.ss_size = STACK_SIZE;
  // Z: S->main is where we will return after switching to ctx.
  // Literally the swapcontext line below.
  C->ctx.uc_link = &S->main;
  S->running = id;
  C->status = COROUTINE_RUNNING;

  uintptr_t ptr = (uintptr_t)S;
  // Z: Now, this looks interesting...
  // makecontext passes its extra arguments as ints and on some architectures
  // this means 32 bit values so we pass in the low 32 bits, and the high 32
  // bits of the schedule struct pointer, then reconcile it in the context handler.
  makecontext(&C->ctx, (void (*)(void)) mainfunc, 2, (uint32_t)ptr, (uint32_t)(ptr>>32));
  // Z: Run the coroutine. We'll swap the stack and reset the program counter to C->ctx, which is `mainfunc`.
  // Calls getcontext on main and initializes it such that it resumes on the
  // line after this swapcontext call.
  swapcontext(&S->main, &C->ctx);
  break;
```

You'll notice the pointers being passed in (and used in the next code sample) as arguments to the context handler. The [manpage](https://man7.org/linux/man-pages/man3/swapcontext.3.html) for `makecontext` and `swapcontext` explains this a bit.

> On architectures where int and pointer types are the same size
> (e.g., x86-32, where both types are 32 bits), you may be able to
> get away with passing pointers as arguments to makecontext()
> following argc. However, doing this is not guaranteed to be
> portable, is undefined according to the standards, and won't work
> on architectures where pointers are larger than ints.

We'll look at `mainfunc` next, as this is where the flow of execution is about to proceed.
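As a quick aside first, the pointer-splitting in that `makecontext` call can be checked in isolation. Here's a minimal sketch, assuming pointers fit in 64 bits; `split_ptr`, `join_ptr`, and `roundtrip_ok` are hypothetical helper names of mine and are not part of `coroutine`. I go through `uint64_t` so the 32-bit shift stays well-defined even where `uintptr_t` is only 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helpers (not from cloudwu's library): split a pointer into
// two 32-bit halves, mirroring the arguments handed to makecontext.
static void split_ptr(const void *p, uint32_t *low32, uint32_t *hi32) {
  uint64_t bits = (uint64_t)(uintptr_t)p;
  *low32 = (uint32_t)bits;          // keeps only the low 32 bits
  *hi32 = (uint32_t)(bits >> 32);   // high 32 bits (0 if pointers are 32-bit)
}

// Reassemble the pointer from its halves, as the context handler must do.
static void *join_ptr(uint32_t low32, uint32_t hi32) {
  uint64_t bits = (uint64_t)low32 | ((uint64_t)hi32 << 32);
  return (void *)(uintptr_t)bits;
}

// Returns 1 if the pointer survives the split/join round trip unchanged.
static int roundtrip_ok(const void *p) {
  uint32_t low32, hi32;
  split_ptr(p, &low32, &hi32);
  return join_ptr(low32, hi32) == p;
}
```

Something like `assert(roundtrip_ok(&some_local));` should hold for any valid object pointer under the assumption above, which is all the library needs in order to smuggle the `schedule` pointer through two `int`-sized arguments.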
`mainfunc` is brief, and as usual, I've annotated it liberally:

```c
static void mainfunc(uint32_t low32, uint32_t hi32) {
  // Z: Reconcile the schedule pointer from the high and low 32 bits
  uintptr_t ptr = (uintptr_t)low32 | ((uintptr_t)hi32 << 32);
  struct schedule *S = (struct schedule *)ptr;
  // Z: Grab the id of the running coroutine from the schedule
  int id = S->running;
  // Z: ...and grab the coroutine
  struct coroutine *C = S->co[id];
  // Z: Run the coroutine's handler function, passing along the coroutine handler argument.
  // We'll want to peek back at main.c to remember what happens next.
  C->func(S,C->ud);

  // Z: If the program calls coroutine_yield, we'll context switch out of this function immediately
  // and never hit this line. Thus, if the coroutine does not yield, it is effectively done and gets
  // destroyed. This does not happen at this point in main.c.
  _co_delete(C);
  S->co[id] = NULL;
  --S->nco;
  S->running = -1;
}
```

We've hit the `C->func` line and we are now in `foo` - the coroutine handler - in main.c. Let's pick up there:

```c
static void foo(struct schedule *S, void *ud) {
  struct args *arg = ud;
  int start = arg->n;
  int i;
  for (i = 0; i < 5; i++) {
    printf("coroutine %d : %d\n", coroutine_running(S), start + i);
    coroutine_yield(S);
  }
}
```

After some work, we call `coroutine_yield`, which will save the stack and call `swapcontext` to context switch back into `coroutine_resume`, right before the `COROUTINE_READY` case's `break` statement. This ultimately lands us back in main.c, about to call `coroutine_resume` on the second coroutine.

This is the "cooperative" multi-tasking characteristic of coroutines. At some point during a coroutine's execution, we simply save the stack, and context switch to some other coroutine, picking up where we left off.
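To see the same hand-off stripped of the library's bookkeeping, here's a minimal sketch that uses raw `ucontext` calls. It assumes a platform that still ships `ucontext.h` (e.g. Linux with glibc) and, unlike `coroutine`, it gives the coroutine its own private stack rather than copying a shared one in and out. The names `counter`, `co_stack`, and `run_counter` are mine and purely illustrative.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx;       // checkpoint for the caller, akin to S->main
static ucontext_t co_ctx;         // the coroutine's own context
static char co_stack[64 * 1024];  // private stack (an assumption of this sketch)
static int yielded;               // value handed back on each yield
static int finished;              // set once the handler runs to completion

// Hypothetical handler: "yields" 0, 1, 2 by swapping back to the caller.
static void counter(void) {
  for (int i = 0; i < 3; i++) {
    yielded = i;
    swapcontext(&co_ctx, &main_ctx);  // yield: save ourselves, resume caller
  }
  finished = 1;  // returning from here follows uc_link back to main_ctx
}

// Resumes the coroutine until it finishes; returns the sum of yielded values.
static int run_counter(void) {
  int sum = 0;
  finished = 0;

  getcontext(&co_ctx);
  co_ctx.uc_stack.ss_sp = co_stack;
  co_ctx.uc_stack.ss_size = sizeof(co_stack);
  co_ctx.uc_link = &main_ctx;
  makecontext(&co_ctx, counter, 0);

  for (;;) {
    swapcontext(&main_ctx, &co_ctx);  // resume: save caller, enter coroutine
    if (finished) break;
    sum += yielded;
  }
  return sum;
}
```

Calling `run_counter()` resumes the handler four times: three resumes that each end in a yield, and a final one that lets the handler return, at which point `uc_link` carries control back to the caller. The function returns 3 (0 + 1 + 2).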
Here's `coroutine_yield`, where we suspend the running coroutine:

```c
void coroutine_yield(struct schedule * S) {
  // Z: Another quick check - self explanatory
  int id = S->running;
  assert(id >= 0);

  struct coroutine *C = S->co[id];
  // Z: This one is interesting - we assert that the address of a local variable
  // sits above the base of the shared stack, i.e. that we really are executing
  // on S->stack before we try to save it
  assert((char *)&C > S->stack);
  // Z: Persist the stack state
  _save_stack(C, S->stack + STACK_SIZE);

  C->status = COROUTINE_SUSPEND;
  S->running = -1;

  swapcontext(&C->ctx , &S->main);
}
```

Again, this `swapcontext` call will jump back to the `swapcontext` checkpoint at the end of the `COROUTINE_READY` case statement. Now the first coroutine has suspended and the second coroutine does everything we just looked at. After the second coroutine has run through all of these calls, we won't hit `mainfunc` again until the end of the coroutine's lifetime.

Back in main.c, `coroutine_resume` is called with `co1` for a second time. This time, however, the coroutine's state is `COROUTINE_SUSPEND`, so we hit the other case statement:

```c
case COROUTINE_SUSPEND:
  memcpy(S->stack + STACK_SIZE - C->size, C->stack, C->size);
  S->running = id;
  C->status = COROUTINE_RUNNING;
  swapcontext(&S->main, &C->ctx);
  break;
```

Here, we restore the coroutine's stack and `swapcontext` again. This is where the code gets very interesting: because we've saved the stack, `swapcontext` now context switches to the end of the `coroutine_yield` stack frame. We immediately exit `coroutine_yield` _into_ the `foo` handler function. The stack state is exactly as it was before, so we make another iteration of the loop and call `coroutine_yield`, save the stack again, and return to the end of the `COROUTINE_SUSPEND` block.

By the way, we should take a peek at `_save_stack`.
I've annotated it as usual:

```c
static void _save_stack(struct coroutine *C, char *top) {
  // Z: Grab a reference of where we're at in memory
  // This will be stored on the stack, so we'll use its address to figure out the size of the stack.
  char dummy = 0;

  // Z: top of the stack minus addr HERE should be less than the stack size
  assert(top - &dummy <= STACK_SIZE);

  // Z: update capacity and realloc if not enough
  if (C->cap < top - &dummy) {
    free(C->stack);
    C->cap = top - &dummy;
    C->stack = malloc(C->cap);
  }

  C->size = top - &dummy;
  // Z: starting at the address of dummy (the last thing pushed onto the stack),
  // copy memory all the way up to size
  memcpy(C->stack, &dummy, C->size);
}
```

Maybe this sort of trickery is typical among C programmers, but I thought it was rather clever: using a stack-allocated value's address to verifiably determine the current stack size.

This is the crux of the context switching logic. Both coroutines will continue to resume and yield, context switching into the other. Each time we make a context switch, we restore the stack of the running coroutine, do a little work, save the stack again, and continue the cycle ad infinitum (or until we stop calling `coroutine_resume` and `coroutine_yield`).

And I really found this quite clever: the first time we finish executing `foo` without calling `coroutine_yield`, we're suddenly in the same stack frame where `mainfunc` was initially invoked.
Execution picks up after the `C->func` call and the coroutine is deleted:

```c
static void mainfunc(uint32_t low32, uint32_t hi32) {
  uintptr_t ptr = (uintptr_t)low32 | ((uintptr_t)hi32 << 32);
  struct schedule *S = (struct schedule *)ptr;
  int id = S->running;
  struct coroutine *C = S->co[id];
  C->func(S,C->ud);

  // Z: After the final coroutine_yield, the outermost handler func
  // stack frame exits and we continue execution here
  _co_delete(C);
  S->co[id] = NULL;
  --S->nco;
  S->running = -1;
}
```

There's a few utility functions that for the sake of brevity I have not included in the review. I had the most trouble comprehending the stack persistence logic because I only have a cursory level of experience with pointer arithmetic, but once I started to get it I gained an even more acute appreciation of the code.

My favorite part of this library was the stack frame manipulation around `coroutine_yield` and `mainfunc` - it's a completely bonkers stroke of genius that sent chills down my spine the moment it all clicked for me. The fact that we just fall out of the stack frame and into the other coroutine...totally baller.

I've seen this same kind of "fallthrough" trick before (in a manner), specifically in the reactivity library Evan You implemented for the Vue frontend framework. Evan does this thing where you call a function, trigger a proxy trap inside of that function, then push the function itself onto a stack so you can start invoking it any time the proxy traps are triggered subsequently. Yes, it's nuts. In fact, the `@vue/reactivity` library is so cool, I'll have to do a separate source code review for it on this blog (stay tuned for that).

Anyway, I hope you enjoyed `coroutine`; drop me a note and let me know what you thought!

Internet Archives I

exbotanical@protonmail.com — Sat, 13 Jan 2024 00:00:00 GMT

What follows is a collection of some of my favorite videos on YouTube.

`video: https://youtu.be/389DkzjHpus?si=P0FnJQFY3wbTOQFk`

Dudes headbanging in the woods with no audible music. Pretty sure I found this video in high school. It's a classic. I've quite a fascination with headbanging, moshing, and generally just the way people behave at metal shows. My wife and I have frequented several such events in Portland, during which I coined the term "Bozie" to refer to headbangers who congregate and walk in a chained circle, drooping their heads and languidly lurching forward, round and round. The term "Bozie" I took from the German sci-fi show _Tribes of Europa_. It's set in a post-apocalyptic Europe where a bunch of German Goths are the preeminent tribe, and they've this hierarchical society wherein the lowest rung is the Bozie, entry-level thugs who wear their hair like pre-Columbians with shaved sides and a little knot at the top. Epic. Great show, too, by the way. I'm disappointed it was cancelled.

`video: https://youtu.be/AC4cACp-E2w?si=tZ_Tmv7U42PqR4sC`

120 hunting dogs being fed in France. I love the cacophony of sound, and the patterns that emerge from all their tails wagging about when the handler lets them through the gate. If I lived in France, I would go watch this frequently - maybe a few times per week. When the handler has them line up, it looks like an ocean of dogs, emitting waves as they climb atop one another in rounds. The erratic movement once they begin eating reminds me of ant swarms, such as death spirals. That static movement of homogenous color. Rather hypnotic, really.

`video: https://youtu.be/Q5UYAM0Yp40?si=CBSrE4z6unALoK8k`

I've always enjoyed this video; it's one of the many I come back to at least once per year. I like how the guy goes through all this bullshit to get the bottle only to...well, watch it. And I love his expression when he reads the message! He looks so done.
The way the scene has these rapid cuts as he breaks the bottle, as though to convey it's a passage of time that needs to be articulated - very odd, and it lends to the absurdity of this fantastic scene.

`video: https://youtu.be/C95BAULJnF8?si=WAuG2Tc8muWnzdCa`

> **Spoilers** for Mishima: A Life in Four Chapters. Go watch it first.

A scene from a great film with a great score by Philip Glass. The film is a hybrid Yukio Mishima biopic and telling of his tetralogy and assorted short stories. Mishima is a beautiful and tragic character and exemplar of the life as art in a sort of corporeal way that reminds me of artists like Chris Burden. Mishima actually committed seppuku after a failed attempt to overthrow the Japanese government, mirroring with eerie uniformity the events of his novel Runaway Horses. His work is so interesting and beautiful. I highly recommend.

`video: https://youtu.be/443n8Udr77k?si=kKTGtRArPz9jMBaY`

A four-way polyrhythm played on the drums. I have a large collection of polyrhythms - from the West African variety to selections of Jaki Liebezeit from Can. This is one of my all-time favorites.

`video: https://youtu.be/h3MxEHQk644?si=ajF0l83Rs9O6NJv6`

The documentary _Dirty Girls_, an exemplary nineties cultural artifact. Recorded at this guy's high school, Dirty Girls follows a group of girls who seldom shower and give fuck-all about their appearances. It's a sobering take, too, spared the pathological "you do you" narcissism of the 2020s; this is an organic punk ethos appropriately accompanied by Liz Phair's Girly-Sound cassette recordings.

`video: https://youtu.be/x__s-hy4hZ4?si=Z08zZbBi0LoNbtdc`

_Stacking_. This is one of my favorite phenomena to emerge from the United States' gang culture. This video in particular is Crip _staccing_ (technically, any `ck` is replaced with `cc` because `ck` is a Blood abbreviation for "crip killer").
Stacking is a means of story-telling; often gang members will congregate in a large circle and take turns stacking and telling their stories. Why this fixture of gang culture hasn't received more attention eludes me; it's fascinating and rather artful. I've a decent collection of these videos and perhaps we'll do a dedicated post about them. `video: https://youtu.be/vVZm7I1CTBs?si=fdtMLnvabvap1ut_` A deaf man and phone phreaker who can whistle at precisely 2600 Hz, the frequency used by analog phones with fully automatic switches in order to signal that a call has ended, leaving the carrier line open for exploitation. When I was really into this around 2012, there was a guy who hosted his own phone lines for the purpose of analog phreaking. He sold blue box kits (both pre-assembled and parts plus schematics) that you could then use to phreak his lines.

Choreographic Programming

exbotanical@protonmail.com — Tue, 17 Oct 2023 07:00:00 GMT

Recently, I learned about a lesser-known programming paradigm called "Choreographic Programming" that rather intrigued me. When it comes to concurrent programming _paradigms_, my knowledge has largely been focused on the technical aspects of _dealing with_ concurrency and the constructs one typically reaches for: mutexes, semaphores, condition variables, kernel threads, user-space aka "green" threads, coroutines, event loops. It wasn't until I learned about Choreographic Programming that I realized just how little I knew about actual _paradigms_ for managing concurrent programs beyond your standard fare Actor Model or reactive programming. Before we dive right into choreographies, we need to get a few things out of the way. ## Session Type Formalism In order to truly appreciate choreographies, we need to understand what a _session type_ is. And by session here, I am not referring to the networking-associated, time-delimited, stateful session we typically think of. Session in this context refers to a formal methodology and accompanying construct for specifying communication behavior in computer programs. First introduced in a [conference paper](https://link.springer.com/chapter/10.1007/3-540-58184-7_118) submitted for the 1994 International Conference on Parallel Architectures and Languages Europe, session types offer a way to describe invariants around multi-party communication protocols. Even further, session types can actually be statically verified (hence the "type" in the name). Let's consider the canonical example of an ATM. 
Here we have a typical client/server scenario with the following protocol:

```
- Client supplies an ID to the ATM
  |
  |---> If OK:
  |       |
  |       |---> Client requests DEPOSIT
  |       |       |
  |       |       |---> Client sends amount
  |       |       |
  |       |       |---> ATM responds with updated balance
  |       |
  |       |---> Client requests WITHDRAW
  |               |
  |               |---> Client sends amount to WITHDRAW
  |               |
  |               |---> ATM responds with:
  |                       |
  |                       |---> OK
  |                       |
  |                       |---> ERROR
  |
  |---> If ERROR:
          |
          |---> Session terminates
```

So how do we formalize this protocol such that we can describe it programmatically? Let's create a simple formalization:

```
ATM = recv ID; choose { OK: (ATMₐᵤₜₕ) | ERROR: (ε) }
ATMₐᵤₜₕ = offer { DEPOSIT: (recv u64; send u64; ε) | WITHDRAW: (recv u64; choose { OK: (ε) | ERROR: (ε) }) }
```

Session types consist of four primary operations:

- _send/recv_, indicating sending and receiving messages of a particular type from the other party; and
- _choose/offer_, indicating a branching point at which the host party can choose to enter into one of several sub-protocols.

For the latter, our grammar describes such a sub-protocol where we either become authorized to use the ATM (`ATMₐᵤₜₕ`), or the ATM returns an `ERROR`. Note, here we're describing communication _behavior_ but not necessarily implementation. For example, in the first `ERROR` case we don't know the reasons for which a user's authentication might be rejected; we just know that rejection is a possible outcome. Also note, the semicolon in the grammar indicates a sequence of actions and the epsilon indicates the termination of the communication itself.

What we've done here is describe the protocol from the perspective of the server (the ATM). In session typing, we also describe the protocol from the point-of-view of the client. This, in session typing, is known as the _dual_. We need the dual so we can ultimately implement both and verify them against the session type.
Let's throw together another formalization for the dual, in this case the client:

```
CLIENT = send ID; offer { OK: (CLIENTₐᵤₜₕ) | ERROR: (ε) }
CLIENTₐᵤₜₕ = choose { DEPOSIT: (send u64; recv u64; ε) | WITHDRAW: (send u64; offer { OK: (ε) | ERROR: (ε) }) }
```

You'll notice that `CLIENT` looks quite a bit like the `ATM`, but with the four operations reversed. And this is precisely the point! From this we can construct a _dual session type_, which ensures each party's protocol is consistent with the other's. In practice, recursion is often used here, but I don't want to digress too much further than I already have. You can easily envision this if you imagine that `ATM` instead allowed the authenticated `CLIENT` to `choose` again instead of terminating.

So how does this look in practice? Suppose we had two channels; we would actually model these with types:

```rust
type AtmDeposit = Recv<u64, Send<u64, Close>>;
type AtmWithdraw = Recv<u64, Choose<Close, Close>>;
type AtmServer = Recv<Id, Choose<Offer<AtmDeposit, AtmWithdraw>, Close>>;
type AtmClient = AtmServer::Dual;
```

Where `Dual` here would allow us to swap, say, a `recv` for a `send` depending on whether we're referring to the ATM or the client (and `Id` stands in for whatever type the ID takes).

## Choreographies: The Basics

Okay, now we're adequately equipped with the requisite knowledge to understand Choreographic Programming. The idea of Choreographic Programming was solidified in computer scientist Fabrizio Montesi's titular [2013 PhD thesis](https://www.fabriziomontesi.com/files/choreographic_programming.pdf). The choreography model existed prior to this thesis, but several issues severely limited its viability in practical application to distributed systems, namely that choreographies were typically prescribed to a single duality of client/server whereas distributed systems are multi-party, and that when applied to multi-party systems these original choreographies encounter a melange of race condition issues. The thesis is largely concerned with presenting proofs that demonstrate these limitations can be overcome.
But for now, let's learn the short version. ### Distributed Stuff is Hard The thesis points out that the large majority of issues in concurrent distributed systems are focused around endpoint safety. Basically, it's challenging to ensure several nodes each abide by the constraints of the protocol in which they partake. This could be something as simple as many REST servers needing to communicate in a common payload format, or something as practical as several nodes participating in the internet using compliant HTTP. Programming the communication flows between nodes is difficult, Montesi points out, because the prevailing programming paradigms focus on implementing systems discretely. That is, you don't write code for a client and a server at once; you go and implement a server, then you (or someone else) implement a client (or clients) that talk to it. The paper summarizes this, stating _"expressing [communication flows] by defining the sending/receiving actions of each endpoint is difficult"_. ### Endpoint Projection aka EPP Choreographic programming alleviates this by allowing one to write the intended communication behavior of concurrent programs as a single logical unit. This is done by way of a procedure dubbed "Endpoint Projection", or EPP. Where a choreography is a single implementation of session types that describe a protocol, EPP is the process of compiling that choreography into multiple targets (for each participant). Imagine writing the code to implement an auth protocol between a server and its clients, then compiling that code into separate modules that can then be used by the server and client applications. In EPP, a single program produces multiple modules for each role in the protocol described in that single program. ### Thoughts All of this has me thinking about the distributed systems we build at AWS. Obviously, AWS is a massive company with an even more massive ecosystem. 
I suppose an onlooker might think we've begun to figure out such issues as the aforementioned, but we really haven't. Because of the strict SLAs and high availability requirements, AWS systems easily wind up being comprised of many, many sub-systems and nodes therein. The way we ensure such sub-systems follow protocol isn't any more special than the standard open source solutions out there right now. If you're interested in reading about one of the tools involved here, you can find the docs [here](https://smithy.io/2.0/index.html). Now because I brought this up, I'm apparently obligated to state for the record that this blog comprises my thoughts and opinions; in no way whatsoever are they representative of my employer. Not that this is particularly ambiguous. There's a lot more interesting stuff in the thesis, and I'd recommend checking out at least one of the few Choreographic libraries out there such as [Choral](https://www.choral-lang.org/). A lot of these are oriented towards the emergent class of specialty langs designed for distributed programming. Now that it's top-of-mind, I've been meaning to check out [Unison](https://www.unison-lang.org/), which sounds quite interesting. Next time! --- _This post's photograph is a picture of assorted people, among them Merce Cunningham — my favorite choreographer. I'll update this with a link once I post something about dance choreography._

Chromophores

exbotanical@protonmail.com — Sun, 02 Jul 2023 07:00:00 GMT

I've relocated across the United States twice in the last 2 years. My records have slowly decremented an entire Goldmine Standard grade, each move introducing myriad opportunities for newly-acquired crushed corners and sleeve creases. This is really unfortunate because I've slowly accumulated a very deliberate collection of rare pieces, nearly all of them first pressings. I noticed this a few weeks ago after we moved in to our new residence; my first press copy of [Teenage Head in My Refrigerator](https://www.discogs.com/master/84997-The-Deep-Freeze-Mice-Teenage-Head-In-My-Refrigerator) by the Deep Freeze Mice has been sitting in this shitty, flimsy saran wrap sleeve; I've several box sets with no protection whatsoever, and my [Zs 33](https://www.discogs.com/release/3168387-Zs-33) 7-inch is slowly developing a ring in the front of the sleeve (the hand-screened linen is also bending on the left side). It's absolutely disgusting, and I'm pissed at myself for ever allowing things to get to this point. Thus, I embarked on a research endeavor to actually learn - after some 15 years - how to properly store my records. First, I learn I should be storing the records themselves in dedicated inner sleeves. I've come across records that are packaged with a rice paper sleeve, but I had hitherto largely discounted this. You want your inner sleeve to be no less than 2mm thick (3mm is better, if you can find it). A decent inner sleeve is comprised of high-density polyethylene (HDPE) and rice paper. Next, you've your outer sleeve. A decent outer sleeve is larger than the record jacket on all four sides - this mitigates any dings or otherwise devastating damage that can occur both on the shelf and while handling. An acceptable outer sleeve has passed the Photographic Activity Test (PAT) to verify it accords with the ISO-18916 standard. Essentially, the PAT and ISO-18916 comprise the standard for archival quality enclosures. 
The former explores chemical interactions between photographs and a given material (i.e. some plastic) after prolonged contact by utilizing two detectors: one screens for oxidation and reduction reactions which can cause image fade, silver mirroring, and red or gold spots; the other screens for chromophores, which can cause yellowing. In short, the ideal outer sleeve is a cast polypropylene that has passed the PAT and is accompanied by lab test results to prove it. The next order of business - where does one find such sleeves? I found mine at a small Canadian company named Vinyl Storage Solutions. I decided to re-sleeve my records using the dual-pocket system. In short, VSS manufactures sleeves into which you seal your record's jacket; the opposing side contains _another_ sleeve into which you store the record itself (in its inner sleeve). I re-sleeved some 7" and 10" records today; here's a few photos. ![Angus Maclise / Tony Conrad - Dreamweapon III. This is the 1st edition, of which 500 copies in a silkscreened brown cover with purple print were made](images/angus_maclise_vinyl.jpg) ![The Incredible String Band - The Hangman's Beautiful Daughter](images/sleeve.jpg) I'm really quite pleased with this investment. Still a bit sore I didn't do this sooner, though. ## New Additions ### Zs Arms, Untitled EP, Karate Bump The sleeves arrived just in time for June's new additions. I found a record dealer in New York who had a NM copy of [Zs' _Arms_](https://www.discogs.com/release/1451533-Zs-Arms). This is the record that officially sold me on Zs' immeasurable brilliance. _Arms_ was recorded in May 2006 during one of Zs' sextet periods. Truly, I believe this particular incarnation of Zs is the group at their zenith. The band once described themselves as > "primarily concerned with making music that challenges the physical and mental limitations of both performer and listener. 
Manipulating extended technique, unique instrumental synthesis, and near telepathic communication, Zs aims to create works that envelop the listener and unfold sonically over time, evoking unspoken past, present, and future rites and ritual." Indeed. This contemplation of endurance in art reminds me of Matthew Barney's _Drawing Restraint_ series, a collection of performative drawings and sculptures whereby Matthew Barney, the artist, is restrained by way of various physical obstacles through which he laboriously attempts to engage in art-making. ![Matthew Barney, Drawing Restraint 2, 1988. Photograph: Michael Rees](images/drawing_restraint.jpg) _Arms_ has these brilliant moments of sustained synergy between otherwise incredibly disparate instruments (which happen to very often be paired together). I love the moniker "brutal chamber" as used to describe it - all too accurate. This copy is one among 500 pressed on white vinyl, including a small insert. The seller also happened to have a VG+ copy of Zs' [_Untitled_ 10"](https://www.discogs.com/release/1156121-Zs-Untitled) on clear vinyl, as well as the [_Karate Bump_ EP](https://www.discogs.com/release/1051260-Zs-Karate-Bump). I'm not too keen on CDs as a medium but I'll purchase them if they're integral to some sub-collection I'm working on (like the Zs catalog). Plus _Karate Bump_ was $2. ### Extra Life: Secular Works Vol 2 I noticed in the _Untitled_ 10" accreditations _"Written by: Charlie Looker"_. I didn't know that, but cool because I've also really been digging Extra Life's _Secular Works Vol 2_ the past several weeks. If you're not familiar with Charlie Looker, he was an integral part of the New York experimental scene during the aughts and 10s. He's especially interested in early Western vocal music and turned me on to David Munrow's _Music Of The Gothic Era_. This comes through in Extra Life (and moreover, Looker's corpus generally) in very unexpected ways. 
_Vol 2_ just kind of came out of nowhere last year; I didn't even know about it until now. Here, give this a listen: `video: https://www.youtube-nocookie.com/embed/-_UmJI6xhkw` I do _love_ the vocal orchestration from 6:00 and beyond. I picked up a copy of Vol 2 as well as a first-press copy of Vol 1. Vol 2 was pressed on the same pink vinyl as Vol 1. ![Extra Life's Secular Works Vol 1](images/extra_life_vinyl.jpg) ![Extra Life's Secular Works Vol 2](images/extra_life_vol2.jpg) ![Extra Life's Secular Works pink vinyl](images/extra_life_vinyl_back.jpg) ## Other Happenings in June ### BNNT I also stumbled upon Polish collective BNNT, by way of Zs guitarist Patrick Higgins' involvement. BNNT has been around since the early 10s, but I'm only learning about them now as I recently began another bout of listening to 'aughts era art music. `video: https://www.youtube-nocookie.com/embed/jv3z3ySsTS4` ### Sinbad I picked up a British series from 2012 called Sinbad, with which I've desperately fallen in love. I went so far as to order the DVD because I legitimately loved this show and lamented that moment at which I viewed the final episode of its first and inevitable final season. The production is assuredly less than stellar, and the writing is at times somewhat frenzied. It's one of those Xena/Hercules-tier serials that should have endured for years and hundreds of episodes, but alas - it aired in 2012 when TV show renewals weren't handed out as liberally as they had a decade prior. Onward, into the age of penal streaming whereby new shows get PIP'd (yes, a FAANG joke) before they can even establish a cultural presence. ![Elliot Knight as Sinbad, et al](images/sinbad.jpg) In case you didn't know, I often enjoy proverbially "bad" TV as much as I enjoy good film. I've no patience for pretense and will always keep it real with you folks. ### WebAssembly I've been reading _WebAssembly in Action_ by Gerard Gallant. 
What I like about this book is it focuses on WebAssembly with C, which inherently affords a more in-depth discourse on WebAssembly's inner machinations. Equipped with a thorough understanding of WASM, I aim to write a frontend C library, drawing inspiration from projects like [Yew](https://github.com/yewstack/yew) and [Choo](https://github.com/choojs/choo). Such a lib would make for a nice companion to my C server framework [Ys](https://github.com/exbotanical/ys).

My aim is a virtual DOM rendered over a websocket connection. I'll implement a small routing component using a stack, and a reactive state component using some sort of bespoke proxy. I'm not yet sure how to implement this in C short of designing an accompanying runtime so I can manage an evented layer atop which the state would be managed (which I do not wish to do). I'll probably pull the V8 source code again and take a look at how they've implemented JavaScript proxies. The C programming language offers some interesting - if not obscure - faculties, too, so I've no doubt I'll find something with which I'll be satisfied.

### ChatGPT Posts Are Still a Scourge

All that said, it's June 2023 and ChatGPT posts (and moreover ChatGPT everything) remain an absolute scourge. You can't read any programming, comp-sci, or even comp-sci-adjacent publication or forum (apparently including this blog) without being inundated with these. The legacy media and auxiliary hype-machines are still on their "omg this will replace X" kick. There's even a class of new companies foolishly built on the premise of GPT proxies.

One of the more obnoxious sentiments is that ChatGPT and its brethren will replace software engineers. First, that's completely outside the confines of what these language models and prompt engineering solutions are intended to do.
Second, thinking this is akin to thinking that because airplanes can fly faster than birds, they'll eventually outcompete birds for their food supply and drive them into extinction. Ha! I appreciate what computer scientist Jaron Lanier had to say about this:

> "Is AI really capable of outsmarting us and taking over the world? "OK! Well, your question makes no sense," Lanier says in his gentle sing-song voice. "You’ve just used the set of terms that to me are fictions. I’m sorry to respond that way, but it’s ridiculous … it’s unreal." This is the stuff of sci-fi movies such as The Matrix and Terminator, he says."

Anyway, I quickly alleviated this headache at the very least on HackerNews by writing a quick and dirty Grease/Tamper/WhateverMonkey script:

```js
// ==UserScript==
// @name         ChatGPT-HN-posts-be-gone
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  whatever
// @author       Matthew Zito
// @match        https://news.ycombinator.com/*
// @exclude      https://news.ycombinator.com/jobs*
// @exclude      https://news.ycombinator.com/item*
// @icon         https://www.google.com/s2/favicons?sz=64&domain=ycombinator.com
// @grant        none
// ==/UserScript==

;(function () {
  'use strict'

  cleanHackerNews()
})()

function cleanHackerNews() {
  console.info('cleaning HackerNews...')

  // querySelectorAll never returns null, so check the length instead.
  // we don't want to mess with a complex regex; just return if it's not a feed page
  const ranks = document.querySelectorAll('span.rank')
  if (ranks.length === 0) return

  // The first rank reads like "31."; parse it so we can renumber from there
  let rankCount = parseInt(ranks.item(0).textContent, 10) || 1

  const posts = document.querySelectorAll('tr.athing')
  if (posts.length === 0) {
    console.info('unable to select HackerNews posts; this is a bug')
    return
  }

  posts.forEach(post => {
    const title = post.querySelector('span.titleline > a')
    if (!title) {
      console.info('unable to select title; this is probably a bug')
      return
    }

    if (title.textContent.includes('GPT')) {
      post.remove()
    }
  })

  // Fix counts, skipping the posts we just detached from the page
  posts.forEach(post => {
    if (!post.isConnected) return

    const rank = post.querySelector('span.rank')
    if (!rank) {
      console.info('unable to select rank; this is probably a bug')
      return
    }

    rank.textContent = `${rankCount++}.`
  })
}
```

I can't promise this will be maintainable given it relies on the markup structure of HN's UI, but it was sufficient for me throughout June.

Anyway, that's it for June. I've established a goal to write more, and moreover to use this blog I spent a non-trivial amount of time programming. See you next month.

A Complete Guide to Make

exbotanical@protonmail.com — Sat, 04 Mar 2023 08:00:00 GMT

Make is a build automation tool originally created by Stuart Feldman at Bell Labs in 1976. It should really say something to you that a build tool created almost 50 years ago (at the time of writing this post, anyway) is still among the most widely-used tools for building large-scale C and C++ programs (this tends to be the case, though Make doesn't care what language your project uses). Projects such as the Linux kernel, the GNU Compiler Collection (gcc), Git, and the Python programming language use Make. Needless to say, knowing how to use Make is a valuable skill. The only problem with this is Make is notoriously difficult to learn. There's no scarcity of criticism surrounding its complexity; many guides and tutorials only further confuse the eager learner. I was one such eager learner, and I failed to learn Make several times. But lately, I feel like I've finally got the hang of it, and I'd like to pass along what I've learned to you so perhaps you won't endure the same frustrations I did. > Note: This guide assumes you are using GNU Make. GNU Make was created by Richard Stallman and Roland McGrath in 1987 as part of the GNU Project. It's the standard implementation of Make these days and adds extensions over the original Make, many of which we'll learn about in this post. ## A Gentle Introduction Make is a build automation tool that is commonly used to build software projects. It reads a file called a "Makefile" that specifies the rules for building the project, and then automatically builds the project by executing the necessary commands. Here's a simple example of a Makefile: ```makefile program: main.o utils.o gcc main.o utils.o -o program main.o: main.c gcc -c main.c utils.o: utils.c gcc -c utils.c ``` This Makefile specifies that the `program` executable should be built from the `main.o` and `utils.o` object files. It also specifies how to build each of these object files from their respective source files (`main.c` and `utils.c`). 
To execute it, you'd run `make` in the directory where the Makefile resides. The syntax of a Makefile is based on rules that define how to build a target (usually a file or an executable) from its dependencies (usually other files or object files). Each rule consists of a target, its dependencies, and the commands needed to build the target from its dependencies. Let's take a closer look at the syntax used in Makefiles. ## Makefile Syntax 101 As mentioned in the (hopefully) gentle introduction, a Makefile rule consists of a target, its dependencies, and the commands needed to build the target from said dependencies. Here's the typical structure of a Makefile rule: ```makefile target: dependency1 dependency2 command1 command2 ``` In this rule, `target` is the name of the file or executable that we want to build, and `dependency1` and `dependency2` are the files or object files that `target` depends on. The commands `command1` and `command2` are the shell commands that are executed to build `target` from its dependencies. Note that the commands in a rule _must_ be indented with a tab character (not spaces), or execution will fail (with the mildly cryptic `Makefile:: *** missing separator. Stop`). If you're frustrated by this weird, seemingly arbitrary restriction, you're not the first. To quote the seminal [UNIX-HATERS Handbook](https://en.wikipedia.org/wiki/The_UNIX-HATERS_Handbook): > _"The problem with Dennis’ Makefile is that when he added the comment line, he inadvertently inserted a space before the tab character at the beginning of line 2. The tab character is a very important part of the syntax of Makefiles. All command lines (the lines beginning with cc in our example) must start with tabs. After he made his change, line 2 didn’t, hence the error."_ > > _"So what?"" you ask, "What’s wrong with that?"_ > > _"There is nothing wrong with it, by itself. 
It’s just that when you consider how other programming tools work in Unix, using tabs as part of the syntax is like one of those pungee stick traps in The Green Berets: the poor kid from Kansas is walking point in front of John Wayne and doesn’t see the trip wire. After all, there are no trip wires to watch out for in Kansas corn fields. WHAM!"_

So...yeah, gotta watch out for that. Also, you can have multiple commands in a rule, each on a separate line.

Back to the earlier example of a Makefile:

```makefile
program: main.o utils.o
	gcc main.o utils.o -o program

main.o: main.c
	gcc -c main.c

utils.o: utils.c utils.h
	gcc -c utils.c
```

In the first rule, `program` is the name of the executable that we want to build, and it depends on `main.o` and `utils.o`. The command in that rule uses `gcc` to link the two object files together into the `program` executable. The other two rules specify how to build the `main.o` and `utils.o` object files from their respective source files. Note that `utils.o` now also lists `utils.h` as a dependency, so editing the header triggers a rebuild of `utils.o`.

The experienced developer at this point may notice we're repeating file names quite a bit here. One simple typo could bring your entire build to a screeching halt. Fortunately, Make allows us to leverage variables and macros in Makefiles to make them more flexible.

## Variables and Macros

As mentioned, Makefiles support the use of variables and macros, which can make them more flexible and easier to maintain. Here's an example of how to define a variable in a Makefile:

```makefile
CC = gcc
```

In this example, `CC` is a variable that is assigned the value `gcc`. We can then use this variable in a rule like this:

```makefile
program: main.o utils.o
	$(CC) main.o utils.o -o program
```

This rule uses `$(CC)`, which expands to the value `gcc`. This makes the Makefile more flexible, because we can easily change the value of `CC` to use a different compiler.

We can also define variables with more complex values.
For example: ```makefile CFLAGS = -Wall -Werror -O2 ``` ### Best Practices on = versus := By the way, in Make there are two main ways to set variables: with `=` and with `:=`. The difference between them is _when_ they are evaluated. `=`-assigned variables are evaluated when they are _used_, whereas `:=`-assigned variables are evaluated when they are _defined_ (i.e. immediately). Here's an example to demonstrate the difference: ```makefile FOO := $(BAR) BAR := hello all: @echo $(FOO) ``` In this example, `FOO` is set using `:=`, which means it is evaluated immediately. At the time it is evaluated, `BAR` has not yet been set, so `FOO` is set to an empty string. Therefore, the output of the `echo` command will be an empty string. If we swap the two lines that set `FOO` and `BAR`: ```makefile BAR := hello FOO := $(BAR) all: @echo $(FOO) ``` Now `BAR` is set before `FOO`, so `FOO` will be set to `hello` and the output of the echo command will be `hello`. As a best practice, it is generally recommended to use `:=` for variables that don't depend on other variables, and `=` for variables that do. We'll talk about that `@` before the `echo` command in just a few. But first, let's step up the complexity a tad and look at the more advanced features of Make. ### Optional Assignment In Make, `?=` is a conditional variable assignment operator. It assigns the value to the variable if the variable is not already set, but if the variable is already set, then it keeps the existing value and does not override it. The syntax for this is: ```makefile VARIABLE ?= value ``` For example: ```makefile SOME_VAR ?= default_value target: @echo "SOME_VAR is $(SOME_VAR)" ``` Invoking `make target` here would yield `SOME_VAR is default_value`. However, if we set `SOME_VAR` _before_ invoking `make` e.g `SOME_VAR=custom_value make target`, `make` will instead output `SOME_VAR is custom_value`. 
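To see `=`, `:=`, and `?=` side by side, here's a small sketch written as a shell session that creates a throwaway Makefile and runs it twice. The variable names (`LAZY`, `EAGER`, `NAME`), the `show` target, and the scratch directory are all made up for illustration, and it assumes GNU Make is on your `PATH`. The `target: ; recipe` form keeps the recipe on the rule line, so the snippet doesn't depend on a literal tab surviving copy and paste.

```shell
#!/bin/sh
set -eu

# Throwaway directory; every name below is hypothetical.
tmp=$(mktemp -d)

cat > "$tmp/Makefile" <<'EOF'
# '=' stores the text $(NAME) and expands it each time LAZY is used
LAZY = $(NAME)
# ':=' expands $(NAME) once, right here
EAGER := $(NAME)
# '?=' assigns only if NAME has no value yet
NAME ?= default
show: ; @echo "lazy=[$(LAZY)] eager=[$(EAGER)] name=[$(NAME)]"
EOF

# Make sure a stray NAME in the environment doesn't skew the first run.
unset NAME

plain=$(cd "$tmp" && make -s show)
from_env=$(cd "$tmp" && NAME=env make -s show)

echo "$plain"     # lazy=[default] eager=[] name=[default]
echo "$from_env"  # lazy=[env] eager=[env] name=[env]
```

In the first run `EAGER` is empty because `NAME` had no value when the `:=` line was read, while `LAZY` picks up the `?=` default that comes later. With `NAME` already set in the environment, all three agree and `?=` leaves the existing value alone.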
## Pattern-matching Rules Pattern rules are used to define how to build a target from a set of source files that match a particular pattern. Here's an example: ```makefile %.o: %.c $(CC) $(CFLAGS) -c $< -o $@ ``` In this rule, the `%.o` pattern matches any object file, and the `%.c` pattern matches any C source file. The `$<` and `$@` variables are used to refer to the first dependency and the target, respectively. This rule specifies how to build any object file from its corresponding C source file. Personally, I was a little confused by this at first, so here's a working example: ``` project/ ├── Makefile ├── src/ │ ├── file1.c │ ├── file2.c │ └── file3.c ``` ```makefile CC := gcc CFLAGS := -Wall -Wextra -pedantic -std=c17 TARGET := program %.o: src/%.c $(CC) $(CFLAGS) -c $< -o $@ $(TARGET): file1.o file2.o file3.o ar rcs $@ $^ ``` To build `TARGET`, we rely on three dependencies: `file1.o`, `file2.o`, and `file3.o`. These don't exist yet, so Make will look for a matching rule that tells it how to build those dependencies. It finds the matching rule in `%.o`, which relies on any files with a `.c` extension inside of the `src` directory. Since we do indeed have `.c` files in `src`, Make knows it can start at this `%.o` rule. Make will invoke the `%.o` rule for _each_ `src/%.c` file. The output of running `make` here will be: ```bash gcc -Wall -Wextra -pedantic -std=c17 -c src/file1.c -o file1.o gcc -Wall -Wextra -pedantic -std=c17 -c src/file2.c -o file2.o gcc -Wall -Wextra -pedantic -std=c17 -c src/file3.c -o file3.o ar rcs program file1.o file2.o file3.o ``` In a subsequent section, we'll talk about Make built-ins, which will enable us to avoid writing out every object file in the dependencies for the `TARGET` rule. ## Phony Phony targets were especially confusing to me when I first learned Make. I don't know why, but pretty much every guide or tutorial I've read makes this subject so much more confusing than it needs to be. 
Phony targets are just used to define targets that are not associated with files, but rather with actions that need to be performed. Here's an example:

```makefile
.PHONY: clean
clean:
	rm -f *.o program
```

In this rule, `clean` is a _phony_ target that specifies how to remove all object files and the `program` executable. Note that we use the `.PHONY` directive to tell Make that `clean` is not a file, but rather a phony target. If we didn't do this, and we happened to have a file in the root directory named `clean`, running `make clean` would yield `make: 'clean' is up to date` because the file named `clean` exists (i.e. Make thinks it has been "built" already).

By the way, you can specify many phony targets in a single line, like:

```makefile
.PHONY: clean test whatever
```

## Conditional Directives

Makefiles support conditional directives that allow us to specify different rules depending on the value of a variable or the existence of a file. You'll probably see this used most often for versioning and cross-platform compatibility support (where building for a different platform means using different source files). Here's a simple, straightforward example:

```makefile
ifdef DEBUG
CFLAGS = -g -Wall
else
CFLAGS = -O2 -Wall
endif
```

In this example, we use the `ifdef` directive to check if the `DEBUG` variable is defined. If it is defined, we set the `CFLAGS` variable to include debugging symbols (`-g`). Otherwise, we set it to optimize the code (`-O2`). To set the `DEBUG` variable, we could have explicitly defined it inside the Makefile, exported it in the shell's environment, or passed it on the command line (`make DEBUG=1`).

At this point, it'd be a good idea to look at other Make built-ins...

## Make Built-ins

Make has many built-in functions for manipulating strings, file names, and lists, running shell commands, and so on. Let's look at a few of the most common and useful ones.
For a full accounting of these functions, see the [GNU Make documentation](https://www.gnu.org/software/make/manual/make.html).

### wildcard

The `wildcard` function can be used to search for files that match a certain pattern. For example:

```makefile
SOURCES := $(wildcard *.c)
```

This will set `SOURCES` to a space-separated list of all files in the current directory that end in `.c`.

### patsubst

The `patsubst` function can be used to perform pattern substitution on a string. For example:

```makefile
SOURCES := $(wildcard *.c)
OBJECTS := $(patsubst %.c,%.o,$(SOURCES))
```

This will set `OBJECTS` to a space-separated list of all files in `SOURCES`, but with the `.c` extension replaced with `.o`.

Earlier I hinted we could leverage `wildcard` and `patsubst` to make our working example less repetitive. Here's the updated example:

```
project/
├── Makefile
├── src/
│   ├── file1.c
│   ├── file2.c
│   └── file3.c
```

```makefile
CC := gcc
CFLAGS := -Wall -Wextra -pedantic -std=c17
TARGET := program

# Grab all .c files in src/
SOURCES := $(wildcard src/*.c)
# Make a list of all .c filenames from src/ but with .o (e.g. src/file1.o)
OBJECTS := $(patsubst %.c,%.o,$(SOURCES))

# Run this rule for each source file
src/%.o: src/%.c
	$(CC) $(CFLAGS) -c $< -o $@

# TARGET depends on all of the object files - one for each .c file in src/
$(TARGET): $(OBJECTS)
	ar rcs $@ $^

# Our phony rule for cleaning up the build artifacts
.PHONY: clean
clean:
	rm program src/*.o
```

Note that the pattern rule is now `src/%.o: src/%.c`. `OBJECTS` holds paths like `src/file1.o`, so the rule's target pattern has to include the `src/` prefix to match them.

### foreach

The `foreach` function can be used to iterate over a list of values and perform an action on each one. For example:

```makefile
DIRECTORIES := src include lib

make-dirs:
	$(foreach dir, $(DIRECTORIES), mkdir -p $(dir);)
```

This will create the directories `src`, `include`, and `lib` if they do not already exist. The breakdown here is:

```makefile
target:
	$(foreach arg, args-list, command $(arg);)
```

### ifeq

Strictly speaking, `ifeq` is a conditional directive rather than a function. It conditionally includes a block of Makefile code. For example:

```makefile
ifeq ($(CC), gcc) # Are we using gcc?
CFLAGS += -std=c99
endif
```

This will add the `-std=c99` flag to the `CFLAGS` variable if the `CC` variable is set to `gcc`. Pretty straightforward.

## Running External Commands from Make

In an earlier example, we saw usage of the `echo` command inside of a rule. It's often useful to run external commands as part of a build process. We can do this for many simple commands by simply referencing the command. You'll typically want to prefix the command with `@`, which tells Make to run the command _without_ first printing the command itself to stdout:

```makefile
all:
	@echo "beginning build..."
```

But what if you need output from some external command _inside_ of the Makefile? For example, let's suppose we want to include the current date and time in the output of a build. We can use the `$(shell)` function to execute the date command and capture its output:

```makefile
BUILD_DATE := $(shell date)

all:
	@echo "Build completed on $(BUILD_DATE)"
```

Here, the `$(shell)` function is used to execute the `date` command and assign its output to the `BUILD_DATE` variable. The variable can then be used in a rule to include the date and time in the output.

Again, the `@` symbol before the echo command will prevent the command itself from being printed to the terminal. This is useful for keeping the output of your Makefile clean and concise. If you have a lot of commands being run, you may not want to clutter the output with the commands themselves.

## Bending the Rules

Let's look at some cool things we can do with rules.

### The $(MAKE) Directive

Sometimes it's necessary for one rule in a Makefile to call another rule. This can be done using the `$(MAKE)` directive (strictly speaking, a built-in variable that expands to the `make` program currently running), which tells Make to invoke itself recursively with the specified rule.

For example, let's say we have two rules, `build` and `deploy`, and we want the `deploy` rule to invoke the `build` rule before executing.
Here's how we would do that: ```makefile build: ./do_build deploy: $(MAKE) build ./do_deploy ``` Here, the deploy rule calls the `$(MAKE)` directive with the build rule as its argument. Make will then recursively invoke itself with the `build` rule, and once that is complete it will continue with the `deploy` rule. ### Private Rules? Now suppose we have some setup logic we need to perform before running several different targets: ```makefile build: # build commands test: # test commands release: # packaging commands setup_files: # commands to setup files needed by build, test, and release ``` Do we really need users running `setup_files` from the command-line via `make setup_files`? What if we want to have this rule because it's shared — and it's convenient to only write it once — but we don't necessarily want others to be able to invoke it through `make`? Well, unfortunately, Make does not have private rules. However, there's a trick you can use to approximate the same behavior: ```makefile build: --setup_files # build commands test: --setup_files # test commands release: --setup_files # packaging commands # private rule --setup_files: # commands to setup files needed by build, test, and release ``` By prefixing the "private" rule with `--`, we take advantage of the fact that command-line flags are passed with two hyphens — thus, invoking `make --setup_files` results in `make: unrecognized option '--setup_files'`. Not too shabby. ### Passing Arguments to Rules It can be very useful to pass arguments from one rule to another rule in a Makefile. This can be done using inline variable declarations and the `$(var)` syntax to expand them. In the following example, the rule `foo` accepts an argument `arg`, which is set inline to the string `hello world` by the rule `bar`. 
```makefile foo: @echo "Foo argument: $(arg)" bar: @$(MAKE) foo arg="hello world" ``` To pass arguments to a rule from _outside_ of the Makefile, simply reference a variable that you've set via the command-line: ```makefile BIN_NAME ?= program # BIN_NAME defaults to "program" all: gcc main.c -o $(BIN_NAME) ``` Invoking `make` would produce a binary called `program`, while invoking `BIN_NAME=app make` would produce a binary named `app`. ## Using Multiple Makefiles In larger projects, it is common to split the Makefile into multiple files for better organization. This can be done using the `include` directive. Let's assume we have the following directory structure: ``` project/ ├── Makefile ├── src/ │ ├── file1.c │ ├── file2.c │ └── file3.c └── include/ ├── header1.h ├── header2.h └── header3.h ``` Now suppose we'd like to split the Makefile into multiple smaller Makefiles and include them in our main Makefile. For example, we can create a `sources.mk` file that specifies the source files and a `headers.mk` file that specifies the header files. Then, we can include these files in the main Makefile using the `include` directive like so ```makefile # Makefile # Include the sources and headers Makefiles include sources.mk include headers.mk # Compiler and linker flags CC := gcc CFLAGS := -Wall -Wextra -pedantic -std=c17 LDFLAGS := -L. # Target executable TARGET := program # Object files OBJS := $(SRCS:.c=.o) # Rule to build the executable $(TARGET): $(OBJS) $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ -lutil # Rules to build object files from source files %.o: %.c $(HEADERS) $(CC) $(CFLAGS) -c $< -o $@ .PHONY: clean clean: rm -f $(OBJS) $(TARGET) ``` ```makefile # sources.mk # Source files SRCS := src/file1.c src/file2.c src/file3.c ``` ```makefile # headers.mk # Header files HEADERS := include/header1.h include/header2.h include/header3.h ``` In this example, the `sources.mk` file specifies the source files, and the `headers.mk` file specifies the header files. 
These files are then included in the main Makefile using the `include` directive. The main Makefile defines the compiler and linker flags, the target executable, and the object files. It also includes the rules to build the executable and the object files from the source files.

Using the `include` directive allows you to split your Makefile into smaller, more manageable pieces, which can make it easier to maintain and understand your build system.

Note that `include` more specifically tells Make to suspend reading the current Makefile and read one or more other Makefiles before continuing. This means any rules placed in `sources.mk` or `headers.mk` are treated exactly as if they had been written at the `include` line: they can be matched and run like any other rule. And because our `include` lines come first, the first target defined in an included file would become the default goal. Keep this in mind when composing Makefiles together.

## Conclusion

This has been a distillation of things that took me several tries and many projects to grasp. Ultimately, the best way to learn any tool is to get your hands dirty, so I recommend applying the knowledge in this guide by using Make in your next project. Remember, while Make is most often used for C projects, it's language-agnostic. You could even leverage Make to automate something that has nothing to do with code (sort of analogous to how various government agencies [use Git](https://government.github.com/community/#:~:text=Government%20agencies%20at%20the%20national,GitHub%20to%20share%20and%20collaborate.))!

And if you really want to step up to the current industry standard, I recommend looking at [CMake](https://cmake.org/), a build system _generator_ that is often used to _generate Makefiles_.
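As a parting sketch of the language-agnostic point: nothing below involves a compiler, just text files and timestamps. The file names and scratch directory are hypothetical, and it assumes GNU Make is on your `PATH`. As before, the `target: ; recipe` form is used so the snippet doesn't rely on a literal tab.

```shell
#!/bin/sh
set -eu

# Throwaway directory and file names, all hypothetical.
tmp=$(mktemp -d)
cd "$tmp"

printf 'milk\neggs\n' > groceries.txt
printf 'call the landlord\n' > chores.txt

cat > Makefile <<'EOF'
# today.txt is rebuilt only when one of its inputs is newer than it
today.txt: groceries.txt chores.txt ; cat $^ > $@
EOF

make -s today.txt
first=$(cat today.txt)

# -q ("question" mode) runs no recipes; it exits 0 only if the target is up to date.
if make -q today.txt; then fresh=yes; else fresh=no; fi

# Pretend the output is very old, so both inputs are now newer than it.
touch -t 200001010000 today.txt
if make -q today.txt; then stale=no; else stale=yes; fi

echo "$first"
echo "up to date after the first build: $fresh"
echo "needs a rebuild after backdating: $stale"
```

Make only compares modification times here, which is why it works just as well for notes, reports, or images as it does for object files.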

Rules to Die on a Hill By: A Decisive JavaScript Style Guide

exbotanical@protonmail.com — Tue, 19 Apr 2022 07:00:00 GMT

I recently made one of the most significant changes of my career[^1] - switching from tabs to spaces. But why stop there? Today, I rationalize my code style decisions over the years. These rationalizations aren't going to work for everyone (hence _A_ Decisive Style Guide, not _The_ Decisive Style Guide), but my hope is this guide gives you a starting point for thinking about these ostensibly mundane concerns.

[^1]: No, not really.

> It doesn't matter as long as you're consistent.
>
> – _Everyone ever_

Across the frontend ecosystem, we so often hear that _"It doesn't matter which one you choose - just be consistent"_, but for me this is and always has been _not good enough_. The reality is, when you are working on a highly-collaborative project at scale, it _does_ matter. Yes, your style decisions _matter_. Whether you use tabs or spaces, semicolons or not - these things affect your projects and the people who work on them.

My take is your code style should be driven by concerted and deliberate decision-making that is equally utilitarian and appropriate for the technologies being used, and the people using them. That is, it's not the chosen style itself that matters, but the process by which you arrived at that decision. Let's begin!

> Disclaimer: Some programming languages such as Go have style codified into the language.
> Other languages such as C necessitate style by virtue of the compiler (e.g. semicolons). ECMAScript, however, notoriously has not codified code style into its language specification, thus this guide is concerned only with JavaScript and TypeScript codebases.

## Tabs versus Spaces

Tabs have canonically been used for indentation and are the default indent character across UNIX systems. This tradition hails from terminals and teletypes wherein the character meant 'move to the right 8 columns'. The resulting ASCII tab character doubles as a compression mechanism: a single tab byte stands in for what would otherwise be 8 space characters in the file.
Tabs also support visual configurability. For instance, I can adjust my IDE or text editor such that tabs have the appearance of 2 spaces. Meanwhile, another developer on my team might prefer 4 spaces and adjust their local environment in kind. In source control, the indentations are encoded as a tab character (decimal character code of `9`), ensuring a source of truth in bytes but not necessarily in appearance.

If tabs are arguably _designed for indentation_, why should we prefer spaces? Well, that visual configurability turns out to be as much a bane as it is a boon. Tabs might appear as 2 spaces in one environment and 8 in another. Meanwhile 2 spaces is always just...2 spaces.

In JavaScript, we're less concerned with the tab character as an entity that affects the interpreter. What's more important is _how_ the character appears. The implied problem here is different programs have different settings for tabs. In my editor, the character appears to be expressed over 2 columns. Meanwhile, in source control it's 4. Furthermore, in my _other_ text editor, tabs are 8 columns. Unlike in Python, indentation carries no meaning for a JavaScript engine, so appearance is all that is at stake.

**The Verdict**

Prefer two spaces as it is:

- a deliberate indicator of indentation
- visually consistent across editors and source control
- compact (nobody enjoys scrolling a mile to the right when reading nested control-flow)

**Supporting Tools**

- [eslint no-mixed-spaces-and-tabs](https://eslint.org/docs/rules/no-mixed-spaces-and-tabs#no-mixed-spaces-and-tabs)
- [eslint no-tabs](https://eslint.org/docs/rules/no-tabs#no-tabs)
- [prettier tabs](https://prettier.io/docs/en/options.html#tabs)
- [editorconfig indent_style, indent_size](https://editorconfig.org/)

## Semicolons

Why do we have semicolons in the first place? Well, requiring them makes compilers easier to write! But why do we use them _in JavaScript_?

> Because C uses them.

Ha. Yeah, okay.
You're probably familiar with ASI (Automatic Semicolon Insertion), but in case you aren't, ASI is a parse-time convenience whereby the JavaScript parser automatically inserts semicolons. Because ASI ensures JavaScript statements contain semicolons where necessary, their use by the programmer is largely optional.

Here are a few other languages for which semicolons are optional:

- Python
- Go
- Ruby
- Groovy
- Scala

As far as ASI in JavaScript is concerned, here's the gist of it:

_Insert when..._

_a. The parser encounters a token disallowed by the formal grammar, **and** that token is preceded by a line break or is a closing brace._

```js
x = 1 y = 2
// Uncaught SyntaxError: Unexpected identifier
```

Here, no line break separates `1` from `y`, so no semicolon is inserted and the parser gives up.

_b. A line break is found directly after one of the following tokens (or, in the case of postfix `++` / `--`, directly before the operator)._

- `continue`
- `break`
- `return`
- `throw`
- `yield`
- postfix `++` / `--`

The preceding list enumerates what are known as _restricted productions_. You see, part of JavaScript's ASI algorithm concerns syntactic forms (so-called _restricted productions_) which forbid a newline character from occurring at a certain point. Note this passage from the [ECMAScript 2016 spec](https://262.ecma-international.org/7.0/#sec-rules-of-automatic-semicolon-insertion):

> If the phrase “[no LineTerminator here]” appears in the right-hand side of a production of the syntactic grammar, it indicates that the production is a restricted production: it may not be used if a LineTerminator occurs in the input stream at the indicated position.

Restricted productions are why the following returns `undefined`: the line break after `return` triggers a semicolon insertion, and the braces that follow are parsed as a block rather than an object literal.

```js
;(() => {
  return
  {
    x: 'y'
  }
})()
```

Whereas this next example returns `{ x: 'y' }`.

```js
;(() => {
  return {
    x: 'y',
  }
})()
```

For further reading, see the [full three rules](https://tc39.es/ecma262/#sec-rules-of-automatic-semicolon-insertion) for ASI in the spec.

> "Programs are meant to be read by humans and only incidentally for computers to execute."
>
> – _Donald Knuth_

Today's compilers are smart enough to handle multi-line statements, and today's programmers are more than capable of recognizing EOLs via consistent whitespace formatting (which you should be using). Let's omit semicolons for the sake of brevity, only including them where necessary.

Ah, and here's a simple heuristic for the _where necessary_ part:

> Use a _leading_ semicolon when the line begins with one of the following characters: `+-[(/`

For example, here's some code where we'll need to use a semicolon no matter what.

```js
let fn = function () {
  /* ... */
}
[1, 2, 3].forEach(console.log)
// Parsed as: function () { /* ... */ }[(1, 2, 3)].forEach(console.log)
// TypeError (V8: Cannot read properties of undefined (reading 'forEach'))
```

Hazards like this will bite you in the ass regardless of whether you use semicolons, so you'll still have to remember this rule.

**The Verdict**

Semicolons in JavaScript are superfluous, except when they're not. In those situations, ASI can still bite you either way. Let's instead be deliberate with our use of semicolons, employing them only when necessary.

**Supporting Tools**

- [eslint semi](https://eslint.org/docs/rules/semi)
- [typescript-eslint semi](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/docs/rules/semi.md)
- [prettier semicolons](https://prettier.io/docs/en/options.html#semicolons)

## Double vs Single Quotes

I'm always a fan of concision, so you can probably guess I prefer single quotes in my JavaScript. On your standard-fare [QWERTY keyboard](https://en.wikipedia.org/wiki/QWERTY), double quotes require a keypress combination of `Shift`+`'`. Contrast that with single quotes, which require a single keypress.

A common argument in favor of double quotes is that they avoid the need to escape apostrophes within a string literal. However, the number of extra keystrokes needed to accommodate escaping a quote character is negligible when considering the number of keystrokes you'll conserve by using single quotes.
My rule here is to simply use double quotes when I'm typing a string literal containing a single quote.

```js
const str1 = 'this is a string that required fewer keystrokes to type'
const str2 =
  "this is a string that contains a ' character. instead of using \\ to escape it, I use double quotes."
```

Using double quotes only when escaping `'` characters also has the added benefit of conveying intention. The rare occasion of a double-quoted string in your codebase will stand out immediately as a string literal that contains quote characters.

As an aside, prefer back-ticks for interpolated or multi-line strings.

**The Verdict**

Prefer single quotes, unless typing a string literal that contains single quote characters, then use double quotes. Use back-ticks for string interpolations, multi-line strings, and the occasional [tagged template function](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates).

JSX properties should use double quotes, however, both to maintain parity with common HTML conventions and to demarcate JSX templates from JavaScript business logic.

**Supporting Tools**

- [eslint quotes](https://eslint.org/docs/rules/quotes.html)
- [typescript-eslint quotes](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/docs/rules/quotes.md)
- [prettier quotes](https://prettier.io/docs/en/options.html#quotes)

eslint (the `avoidEscape` option permits double quotes for strings that contain a single quote, per the rule above):

```json
{
  "rules": {
    "quotes": ["error", "single", { "avoidEscape": true }]
  },
  "overrides": [
    {
      "files": ["**/*.{ts,tsx}"],
      "rules": {
        "@typescript-eslint/quotes": ["error", "single", { "avoidEscape": true }],
        "quotes": "off"
      }
    }
  ]
}
```

prettier:

```json
{
  "singleQuote": true
}
```

## Line Length

Line length, in this context not to be confused with exact characters per line, describes the approximate width in columns before a line should begin to wrap. I believe this is a rather important rule to die on a hill by as it dramatically affects code readability.
A line of code should not endure so long that, on your average monitor, the reader must scroll horizontally.

As for actual width, here's a [great article](https://javinpaul.medium.com/does-column-width-of-80-make-sense-in-2018-50c161fbdcf6) that points out the [archaic 80-column rule](https://en.wikipedia.org/wiki/Punched_card) is especially anachronistic and not exactly grounded in today's technologies. That said, I still stick with 80 characters.

As you may have noticed, I just said _average monitor_ moments ago without clarifying an exact or even approximate monitor size or resolution. I kept it vague because, statistics aside, I'm not going to presume what monitor size prevails across my team at work or peers online. You should discuss this rule with your peers to decide what works best. So long as your decided width accommodates the majority of monitors without the need for tons of scrolling, you're compliant with this guide.

**The Verdict**

80 columns for me, but this number should be whatever width lets you and your team use visual real estate effectively without causing horizontal overflow.

**Supporting Tools**

- [eslint max-len](https://eslint.org/docs/rules/max-len#max-len)
- [prettier printWidth](https://prettier.io/docs/en/options.html#print-width)

eslint:

```json
{
  "max-len": ["error", { "code": 80 }]
}
```

prettier:

```json
{
  "printWidth": 80
}
```

> Note the eslint and prettier rules mentioned above are not equivalent, as noted in prettier's `printWidth` documentation.

## Bracket Spacing

How often do you see this in modern JavaScript codebases?

```js
const obj = {a: 1, b: 2}

function x(){
  console.log({obj})
}
```

Yeah, I don't see it much either. There's a reason for that: _minification_. Compact bracket spacing is a remnant of a JavaScript ecosystem where whitespace mattered when sending files over the wire. Fortunately, almost every framework and build tool has minification integrated into it.
Whether you're building a full-fledged React app or minifying your vanilla JS with Terser, you're stripping whitespace around brackets by the time your code is in production. Minification is an easy opt-in across most modern JavaScript toolchains. What we're left with is a style that is generally less readable and no longer buys us anything.

**The Verdict**

Yes, download size matters. Unless you're not using minification, prefer whitespace between brackets.

**Supporting Tools**

- [eslint object-curly-spacing](https://eslint.org/docs/rules/object-curly-spacing)
- [typescript-eslint object-curly-spacing](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/docs/rules/object-curly-spacing.md)
- [prettier bracketSpacing](https://prettier.io/docs/en/options.html#bracket-spacing)

eslint:

```json
{
  "rules": {
    "object-curly-spacing": ["error", "always"]
  },
  "overrides": [
    {
      "files": ["**/*.{ts,tsx}"],
      "rules": {
        "@typescript-eslint/object-curly-spacing": ["error", "always"],
        "object-curly-spacing": "off"
      }
    }
  ]
}
```

prettier:

```json
{
  "bracketSpacing": true
}
```

## Trailing Comma

This is yet another one on which I've completely flipped in recent years. In my estimation, the following is rather awkward.

```js
const points = {
  x: 12,
  y: 13,
  z: 14,
}
```

Previously, my argument for omitting trailing commas was that the omission more plainly conveys that a given property is the last in an object. Looking back, I think _Wow, what an absurd argument_, as though we can't perceive that by the fact that _the last property is the last property_. We don't need an additional visual aid to convey that.

And so, I've changed my tune. Sure, the trailing comma looks a bit awkward, but my inner John Stuart Mill says the utility of the trailing comma far outweighs the cleanliness (rather, lack thereof). That is, there's a utilitarian argument to be made for the trailing comma.

First, ease of access.
This has personally bitten me innumerable times while writing and updating code. Consider this theme object from this blog's code back when I was still anti-trailing comma. It changed very often.

```js
const darkTheme = {
  colors: {
    font: {
      primary: 'rgb(206, 166, 186)',
      secondary: 'rgb(206, 166, 186)',
      hover: 'rgb(47, 43, 69)'
    },
    bg: {
      primary: 'rgb(25, 23, 37)',
      secondary: 'rgb(47, 43, 69)',
      tertiary: 'rgb(214, 102, 149)'
    },
    border: {
      primary: 'rgb(100, 102, 140)'
    },
    link: 'rgb(75, 187, 172)',
    scroll: {
      fg: 'rgb(214, 102, 149)',
      bg: 'rgb(100, 102, 140)'
    }
  }
}
```

I can't tell you how many times I tried to add or amend a property here and had to backtrack with my keyboard to include a missing comma. I can attest to the convenience of trailing commas from anecdotal experience, for sure.

Not convinced? Let's look at the second and more important reason to favor the trailing comma: _Git diffs_.

If you're contributing to an open source or enterprise codebase, you're undoubtedly using a version control tool such as Git. If you're not...uh, I'd love to [hear from you](mailto:exbotanical@protonmail.com).

Let's see what happens in a Git diff when you add a single line to a no-trailing-commas codebase:

```diff
 const darkTheme = {
   colors: {
     font: {
       primary: 'rgb(206, 166, 186)',
       secondary: 'rgb(206, 166, 186)',
       hover: 'rgb(47, 43, 69)'
     },
     bg: {
       primary: 'rgb(25, 23, 37)',
       secondary: 'rgb(47, 43, 69)',
       tertiary: 'rgb(214, 102, 149)'
     },
     border: {
       primary: 'rgb(100, 102, 140)'
     },
     link: 'rgb(75, 187, 172)',
     scroll: {
-      fg: 'rgb(214, 102, 149)'
+      fg: 'rgb(214, 102, 149)',
+      bg: 'rgb(100, 102, 140)'
     }
   }
 }
```

I don't know about you, but seeing a line of code that was already there pop out _twice_ while reviewing a PR is just obnoxious. I can't immediately discern whether the penultimate line was actually an addition.
Duplicate this several times over across a single PR and you've got a mess on your hands that is difficult to read at best and greedy for cognitive overhead at worst.

Meanwhile, the diff for a trailing comma version of this codebase would look like this:

```diff
 const darkTheme = {
   colors: {
     font: {
       primary: 'rgb(206, 166, 186)',
       secondary: 'rgb(206, 166, 186)',
       hover: 'rgb(47, 43, 69)',
     },
     bg: {
       primary: 'rgb(25, 23, 37)',
       secondary: 'rgb(47, 43, 69)',
       tertiary: 'rgb(214, 102, 149)',
     },
     border: {
       primary: 'rgb(100, 102, 140)',
     },
     link: 'rgb(75, 187, 172)',
     scroll: {
       fg: 'rgb(214, 102, 149)',
+      bg: 'rgb(100, 102, 140)',
     },
   },
 }
```

That's better. I know immediately that `scroll.bg` was the addition here.

Much like the aforementioned legacy bracket spacing argument, you may recall that trailing commas in object literals were once not valid JavaScript. Beginning with ECMAScript 5, however, trailing commas in object literals [are legal](https://tc39.es/ecma262/multipage/ecmascript-language-expressions.html#prod-ObjectLiteral). Near-ubiquitous transpilers like [Babel](https://babeljs.io/) will remove the trailing comma in transpiled code, so you don't need to worry about this one in legacy browsers.

You may also prefer this rule for arrays:

```js
const arr = [
  1,
  2,
  3,
]
```

And function parameters:

```js
function x(first, middle, last,) {
  /* ... */
}
```

Though, a comma must not appear after a 'rest' element:

```js
function x(first, middle, last, ...all) {
  /* ... */
}
```

**The Verdict**

Prefer trailing commas in object literals (and optionally, arrays and function parameters). It makes it easier to add properties and results in cleaner, more readable diffs. If targeting legacy browsers, use a transpiler such as Babel to ensure trailing commas are stripped from transpiled code.
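If you're wary that the extra comma might change behavior, here's a minimal sketch (the names are illustrative only) showing that a trailing comma adds neither an array element nor a function parameter. Note that trailing commas in parameter lists assume an ES2017-or-later runtime.

```js
// A trailing comma is ignored: no extra element, no extra parameter.
const arr = [
  1,
  2,
  3,
]

function x(first, middle, last,) {
  return [first, middle, last]
}

console.log(arr.length) // 3
console.log(x.length) // 3 (number of declared parameters)

// Caveat: two consecutive commas are an elision, not a trailing comma,
// and the elision does leave a hole in the array.
const withHole = [1, 2, 3, ,]
console.log(withHole.length) // 4
```

The last case is the one typo worth guarding against when adopting this style.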
**Supporting Tools**

- [eslint comma-dangle](https://eslint.org/docs/rules/comma-dangle)
- [eslint comma-spacing](https://eslint.org/docs/rules/comma-spacing)
- [eslint comma-style](https://eslint.org/docs/rules/comma-style#comma-style)
- [prettier trailing-commas](https://prettier.io/docs/en/options.html#trailing-commas)

eslint:

```json
{
  "comma-dangle": ["error", "always"],
  "comma-spacing": ["error", { "after": true, "before": false }],
  "comma-style": ["error", "last"]
}
```

prettier:

```json
{
  "trailingComma": "all"
}
```

## Conclusion

There you have it, my rules to die on a hill by. I actually have many, many more, but this article is already long enough.

Even if you disagree with my takes on these contentious issues, my hope is you'll draw inspiration to adopt a more decisive approach to code style when maintaining a JavaScript or TypeScript codebase. As a frontend lead, it's my job to think about these things so my team doesn't have to. Of course, the final decision should always be a collective one, or at least your team should feel comfortable suggesting a change.

I'm a big believer in static analysis tools and I can confidently say that proper tooling can make or break a collaborative codebase. Despite my sardonic perspective on the _just be consistent_ adage, I should clarify that, yes, what ultimately matters is that you do whatever you decide consistently.

### Shared Configurations

I've codified the above rules plus many more sensible defaults into shared configurations for eslint and prettier. They're installable via NPM, and you're welcome to use them in your projects (or copy mine and amend them to your liking).

If using prettier in conjunction with eslint, please remember to use [eslint-config-prettier](https://prettier.io/docs/en/integrating-with-linters.html) to mitigate conflicts between the two tools.
To use my [eslint configurations](https://github.com/exbotanical/eslint-config):

JavaScript:

```bash
npm i -D eslint @magister_zito/eslint-config-javascript
```

```json
// .eslintrc
{
  "extends": ["@magister_zito/javascript"]
}
```

TypeScript:

```bash
npm i -D eslint @magister_zito/eslint-config-typescript
```

```json
// .eslintrc
{
  "extends": ["@magister_zito/typescript"]
}
```

React:

```bash
npm i -D eslint @magister_zito/eslint-config-react
```

```json
// .eslintrc
{
  "extends": ["@magister_zito/react"]
}
```

Vue:

```bash
npm i -D eslint @magister_zito/eslint-config-vue
```

```json
// .eslintrc
{
  "extends": ["@magister_zito/vue"]
}
```

To use my [prettier configurations](https://github.com/exbotanical/prettier-config):

```bash
npm i -D prettier @magister_zito/prettier-config
```

```json
// .prettierrc
"@magister_zito/prettier-config"
```

To integrate prettier with eslint:

```bash
npm i -D eslint-config-prettier
```

And amend `.eslintrc`:

```diff
 {
   "extends": [
-    "@magister_zito/typescript"
+    "@magister_zito/typescript",
+    "prettier"
   ]
 }
```

Perhaps I'll write a post about writing your own extensible eslint configurations. Interested? Drop me a note and [let me know](mailto:exbotanical@protonmail.com).