[{"content":"The previous post pushed the Claude Code internals off to a follow-up. This is that follow-up.\nThe first time you open Claude Code to customize it, the surface feels scattered. settings.json, CLAUDE.md, slash commands, subagents, hooks, plugins — the same intent has too many possible homes, and the question \u0026ldquo;where does this go?\u0026rdquo; blocks you before anything else. Dump everything into CLAUDE.md and it bloats; split it and you lose track of which file fires when.\nPick one axis and the problem almost dissolves. When does it step in? This post walks through Claude Code\u0026rsquo;s customization surface rearranged along that axis — into four layers.\nLayer Structure Layer When it steps in Responsibility CLAUDE.md + Rules Always (loaded into every turn) Tacit conventions and guardrails Agents When a Skill or model delegates to it Context-isolated specialist roles Skills When I call it Reusable workflow recipes Hooks Automatically, before/after tool use Validation, automation, safety rails That table is arguably the whole post. The rest is how each layer settles onto this axis, and how all four interlock inside a single workflow.\nCLAUDE.md + Rules The knowledge Claude loads into context every turn. It applies even when nothing is called. Two tiers.\nCLAUDE.md is the top-level context at the project/user level. It can live at ./CLAUDE.md, .claude/CLAUDE.md, or ~/.claude/CLAUDE.md — when multiple exist, they merge in hierarchy order. My CLAUDE.md holds language-independent behavioral rules.\n# CLAUDE.md (excerpt) - On rejected approach: stop immediately and ask for direction - Change scope: only change what was explicitly requested - No commits: never commit until explicitly asked - Propose approach first: for changes touching 3+ files or affecting architecture, propose the approach before writing code rules/*.md is the lower tier, split per language and domain. 
Drop .md files into ~/.claude/rules/ or .claude/rules/ and they\u0026rsquo;re discovered recursively. A paths frontmatter field enables scoped rules that only apply to matching file patterns.\n# rules/go.md (excerpt) - No Get prefix: GetName() ❌ → Name() ✅ - Error wrapping: bare return err forbidden. fmt.Errorf(\u0026#34;context: %w\u0026#34;, err) - No panic in libraries: only in main/tests # rules/typescript.md (excerpt) - Destructuring-first: function params ({ server, db }: Config) - Braces required: if (x) return; ❌ → if (x) { return; } ✅ - Data shapes → type, implementation contracts → interface - Use enum: prefer enum over as const objects # rules/code-principles.md (excerpt, language-agnostic) - fail-fast: raise/throw immediately on validation failure - Guard clauses: early return instead of nested if - Immutability-first: limit mutations to explicit scopes - Pure functions preferred: push side effects to call boundaries - No Any types: use generics, unions, concrete types Splitting language rules into rules/ instead of putting them in CLAUDE.md keeps CLAUDE.md from swelling.\nThe test: \u0026ldquo;does this need to be in effect regardless of whether anything is called?\u0026rdquo; If yes, it belongs in this layer.\nAgents Agents are invocation units, but they don\u0026rsquo;t run directly in the main session. They\u0026rsquo;re defined at ~/.claude/agents/\u0026lt;name\u0026gt;.md and are delegated to by a Skill (covered next) or by the model itself. The key word is context isolation. An Agent opens its own context window, drills into a single responsibility, and hands back only the result.\nTake the architect agent — it handles architecture analysis and root-cause debugging. When some skill dispatches architect, the agent reviews the design in a separate context and returns a conclusion. Everything the agent processed stays out of the main session. 
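As a rough illustration, an agent definition is a single markdown file with YAML frontmatter. The field names below follow the subagent file format, but the description, tool list, model choice, and prompt body are invented for this sketch, not the author's actual architect definition:

```markdown
---
name: architect
description: Architecture analysis and root-cause debugging in an isolated context
tools: Read, Grep, Glob
model: opus
---

You are an architecture reviewer. Analyze the design under discussion,
identify structural risks, and report back only your conclusion and the
key findings that support it.
```

Whatever such an agent reads or greps along the way stays in its own window; only the report it hands back re-enters the main session.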
Or verify-agent: it runs build → typecheck → lint → test as an isolated pipeline and reports only pass or fail. refactor-cleaner removes dead code and unused imports in isolation. code-reviewer, security-reviewer, and database-reviewer each inspect implementation output through a different lens.\nThe instinct for separating Skills and Agents is this. A Skill orchestrates; an Agent is the deep execution of one responsibility. Breaking work into stages and wiring the right tools into each stage is Skill work. When one of those stages needs to run independently without polluting the main context, that slot is filled by an Agent. The diagrams in the next section show exactly how these agents get dispatched.\nThe test: \u0026ldquo;does this need to be separated from the main context?\u0026rdquo; If not, a Skill is enough.\nSkills Skills are workflow recipes I call explicitly. Each one lives at ~/.claude/skills/\u0026lt;name\u0026gt;/SKILL.md and is triggered with /skill-name. Prompt, allowed tools, and model assignment are all bundled into a single file. Instead of retyping the same instructions every time, a recipe steps in: \u0026ldquo;for this situation, use this skill.\u0026rdquo;\n/code The natural home for Skills is repetitive development work. The heaviest skill in my setup is /code. It inspects the input and auto-detects two paths: pass it a text description and it enters design (Brainstorming); pass it a .claude/plans/ directory and it enters implementation (Pipeline).\nWhen given a text description, the Brainstorming path runs. It explores requirements, decides whether to split the work, and writes a design document (DESIGN.md), per-sub-task plan files (NN-\u0026lt;task\u0026gt;.md), and a dependency graph (_dag.yaml) under .claude/plans/\u0026lt;topic\u0026gt;/. It runs the architect agent for a design review and stops. 
The output becomes the Pipeline path\u0026rsquo;s input.\nflowchart TD I[\"Idea input · /code (Brainstorming)\"] --\u003e C[\"Context collection · project type · CLAUDE.md · git log\"] C --\u003e R[\"Requirements exploration · AskUserQuestion 1:1\"] R --\u003e A[\"2-3 approaches + recommendation\"] A --\u003e PM[\"pm-code-agent · split decision\"] PM --\u003e|SINGLE| D1[\"DESIGN.md + _dag.yaml · 01-main.md\"] PM --\u003e|SPLIT| D2[\"DESIGN.md + _dag.yaml · NN-task.md × N\"] D1 --\u003e AR[\"architect agent review\"] D2 --\u003e AR AR --\u003e|NEEDS REVISION| D2 AR --\u003e|APPROVED| S[\"status: draft → ready\"] S --\u003e O[(\".claude/plans/\u0026lt;topic\u0026gt;/\")] Passing a .claude/plans/\u0026lt;topic\u0026gt;/ directory switches to the Pipeline path. The interesting part is that the details are kept out of SKILL.md and live in references/stage-*.md, pulled in with Read only at the moment they\u0026rsquo;re needed. Normal turns only carry the orchestrator in context, and each stage document loads only when that stage begins.\nSub-tasks from _dag.yaml are topologically sorted, and each one passes through five stages.\nflowchart TD I[(\"/code input · .claude/plans/\u0026lt;topic\u0026gt;/\")] --\u003e L[\"Load _dag.yaml · topo-sort + status gate\"] L --\u003e PRE[\"Stage Pre · architect agent + planner agent\"] PRE --\u003e IMP[\"Stage Impl · parallel build (agent team)\"] IMP --\u003e POST[\"Stage Post (parallel) · code-reviewer · security-reviewer · database-reviewer · verify-agent\"] POST --\u003e|FAIL| FIX[\"Stage Fix · verify-agent auto-repair\"] FIX --\u003e POST POST --\u003e|PASS| CLEAN[\"Stage Clean · refactor-cleaner agent\"] CLEAN --\u003e DONE[\"status: ready → done\"] DONE --\u003e N{\"Next sub-task?\"} N --\u003e|yes| PRE N --\u003e|no| R[\"Final report\"] Pre (stage-pre.md) — calls architect and planner in order to do structural analysis and produce the execution plan. The result is appended to the sub-task document as ## Plan. Impl (stage-impl.md) — runs parallel implementation based on the plan. 
Simple work is handled directly by the leader; complex work spawns a team of agent members. Post (stage-post.md) — calls code-reviewer, security-reviewer, database-reviewer, and verify-agent in parallel for comprehensive review. PASS / NEEDS ATTENTION / FAIL is decided here. Fix (stage-fix.md, conditional) — if Post returns FAIL, verify-agent runs to auto-fix fixable errors, then Post is re-run. Bound by retry-policy.md: max 3 attempts by default, with \u0026ldquo;same error twice in a row → stall detection.\u0026rdquo; Clean (stage-clean.md) — refactor-cleaner removes dead code, unused imports, and duplication. Failure here is treated as non-critical — a warning is logged and the pipeline continues. If cleanup happens to break the build, Post catches it on the next run. When a sub-task clears all five stages, its NN-\u0026lt;task\u0026gt;.md frontmatter gets promoted from status: ready to status: done. That transition is the re-run guard — running /code again against the same plan skips any task already marked done and only picks up the rest.\nThis skill shows the basic pattern of the Skill layer. Auto-detecting paths based on input, passing stateful files between paths, and offloading details from SKILL.md to references to keep context lean.\n/github-ship — Branch to Merge in One Pipeline Once implementation is done, the code needs to land. /github-ship bundles everything from branch creation to merge into a single pipeline. It\u0026rsquo;s the consolidation of what used to be three separate skills: /git-branch, /git-commit, and /github-pr-push.\nIt runs five phases.\nBranch — analyzes the changes by concern, decides whether to split into multiple PRs, and creates a convention-aligned branch. Commit — reviews the staged diff, splits by concern, and writes convention-conformant messages. Push \u0026amp; PR — runs static analysis (lint/typecheck), pushes, and creates a PR via gh pr create. 
Review — scales review intensity by change size (TRIVIAL/SMALL/MEDIUM/LARGE), fires review agents in parallel. If issues are found: fix → push → re-review loop. Merge — once all review issues are resolved, offers squash or merge commit, then merges. When /code finishes with all sub-tasks passing, it automatically asks whether to run /github-ship. Approve, and implementation flows seamlessly into PR merge.\nTwo PR-related skills remain standalone.\n/github-pr-review \u0026lt;PR number\u0026gt; — deep-reviews an existing PR. Uses the same agents as github-ship Phase 4, but callable independently. /github-pr-respond — walks through review comments on a PR, confirming whether to address each, and posts replies. CLI over MCP There are two main ways to wire tools into Claude Code: MCP servers and CLI invocation. In the git/GitHub space both options exist. Yet all the skills above call CLIs like gh and git via Bash(...:*) in their allowed-tools. The reason is context savings.\nThe moment an MCP server connects, its entire tool catalog becomes resident in context. A few hundred tokens per tool, and a single server exposing around twenty tools consumes thousands of tokens \u0026ldquo;doing nothing.\u0026rdquo; Attach three such servers and 4,000+ tokens are gone before you type a single character (Scott Spence). CLI tools only spend tokens when invoked. In real comparisons, CLI reduced token usage by roughly 68% against MCP on the same workload (BSWEN — MCP vs CLI), and monthly operating costs came out 4 to 32× apart in another sample (BSWEN — Token usage).\nAnthropic is aware of this cost and has introduced lazy-loading optimizations like Tool Search, which reportedly cut overall agent tokens by 46.9% when MCP is in use (Joe Njenga, Medium). Even so, in domains where mature CLIs already exist — git and GitHub are the obvious ones — the skill + Bash combo is still the lightest option. 
That\u0026rsquo;s why github-ship and the other git/GitHub skills in my setup don\u0026rsquo;t touch MCP and route everything through the CLI.\nThe test: the inner axis of the Skill layer is \u0026ldquo;do I call it, or can the model call it on its own?\u0026rdquo; The disable-model-invocation flag draws that line — write operations that are risky or hard to reverse stay locked, while anything that needs to auto-trigger for everyday productivity stays open.\nHooks Hooks are never invoked. They react to tool events and run automatically. Register them in settings.json under hooks as PreToolUse / PostToolUse, and shell scripts step in before or after specific tool calls.\nHooks cover two slots.\nSafety rails — block dangerous commands before they execute. remote-command-guard.sh hooks in before a Bash call (PreToolUse), checks categories like rm -rf, curl | sh, and reads of /etc/passwd, and blocks with exit 2 if anything matches. Automation — running formatters after an edit (format-file.sh), nudging for security review when a sensitive file is touched (security-auto-trigger.sh), masking secrets in every tool\u0026rsquo;s output (output-secret-filter.sh). The \u0026ldquo;things you don\u0026rsquo;t want to do by hand but must do every time\u0026rdquo; slot. Permission allow/deny pairs with Hooks. The permissions.deny list in settings.json is a static filter. Declare a pattern like Bash(*rm -rf*) and matching commands never reach the tool call at all. Hooks layer dynamic checks on top. The context-dependent risks that static filters miss (specific redirect targets, conditional combinations) get judged by the script. 
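In the spirit of remote-command-guard.sh, the dynamic check reduces to a small matcher. This is a hedged sketch: the pattern list is illustrative rather than the author's actual categories, and a real PreToolUse hook would read the tool-call JSON on stdin (e.g. with jq) and exit with status 2 to make Claude Code refuse the command:

```shell
# guard CMD — return 2 (the "block" signal for a PreToolUse hook) when the
# command string matches a dangerous category, 0 otherwise.
# Illustrative patterns only; a real hook script parses the JSON payload
# from stdin and uses its exit status to allow or block the tool call.
guard() {
  case "$1" in
    *"rm -rf"*|*"curl"*"| sh"*|*"/etc/passwd"*)
      echo "blocked: dangerous command pattern: $1" >&2
      return 2
      ;;
  esac
  return 0
}
```

The exit status is the whole protocol: non-interactive, fired before the Bash tool runs, and invisible unless it blocks.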
Static declarations plus dynamic inspection: two tiers bundled into one layer that serves as the safety rail.\nThe test: \u0026ldquo;does this need to intervene automatically on a tool event?\u0026rdquo; The no-invocation requirement is what separates Hooks from Skills and Agents.\nIntegrated Workflow Taken one at a time, each layer\u0026rsquo;s responsibility stays sharp. But in practice all four run inside the same workflow. Here\u0026rsquo;s one flow as an example.\nI type /code \u0026quot;design a new auth module\u0026quot; → a Skill starts moving on the Brainstorming path. The Skill dispatches architect and planner internally → Agents perform design analysis and step decomposition inside isolated contexts. Throughout all of this, every turn has language-specific rules and code principles loaded into context → Rules are quietly in effect. Once design is done, /code asks whether to switch to Pipeline. Approve, and it starts writing files → Edit / Write fires over and over. Each time, Hooks react. format-file.sh runs the formatter, code-quality-reminder.sh nudges error handling and immutability checks, and for security-related files security-auto-trigger.sh requests review. At the end of the pipeline, /code calls verify-agent again → build, typecheck, lint, and tests run in isolation and only the result comes back. If everything passes, /code asks whether to run /github-ship → approve, and another Skill takes over: branch → commit → push → review → merge in one shot. All four layers participate in a single task, yet none of their responsibilities overlap: what I called (Skills), what the Skill delegated (Agents), what was always in place (Rules), and what reacted to an event (Hooks).\nSummary When deciding where a new piece of configuration goes, four questions are enough.\nMust it always be in effect? → CLAUDE.md + Rules Must it react automatically to a tool event? → Hooks Is it a workflow that runs when invoked? 
→ Skills Must it run in a separate context? → Agents If more than one applies, split the responsibility. Full config reference: .dotfiles/claude/.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/devenv/claude-code-config-layers/","summary":"settings.json, CLAUDE.md, slash commands, subagents, hooks. Claude Code\u0026rsquo;s customization surface settles into four layers once you pick one axis: when does each one step in?","title":"Claude Code Config in Four Layers"},{"content":"Working seriously with Claude Code pulled my CLI-based editor habit way up. The flow of \u0026ldquo;open a terminal from a GUI editor\u0026rdquo; flipped into \u0026ldquo;open everything from inside a terminal.\u0026rdquo;\nThis post covers the dotfiles that produce the screen above. It\u0026rsquo;s also about how the same environment moves to another machine: one git clone, one bash setup.sh, done.\nThe deeper Claude Code configuration — skills, agents, hooks, MCP — will be covered in separate posts, so I only touch it lightly here.\nStack The terminal emulator is alacritty. tmux splits the screen into three inside an alacritty window. zsh runs as the shell in each pane. The editor in the top-left pane is nvim on a LazyVim base, and Claude Code runs as an AI pair in the right 30% pane.\nTool Role alacritty terminal emulator tmux session/window/pane multiplexer zsh shell nvim (LazyVim) editor (top-left pane) Claude Code AI pair (right 30% pane) This post walks through them in that order — alacritty, tmux, zsh, nvim, Claude Code — covering what each does and why it was chosen.\nalacritty alacritty is a cross-platform, GPU-accelerated terminal emulator written in Rust. It keeps its own feature set minimal, delegating splits and session management to other tools.\nI picked alacritty as the terminal emulator. Three reasons.\nGPU rendering — OpenGL-based rendering, so input latency stays low. Config-as-code — every setting lives in a single alacritty.toml. 
No hunting around for where things were saved. Simplicity — alacritty intentionally omits tabs, splits, and sessions. That space is for tmux to fill. The last point is the key one. By delegating splits and sessions to tmux instead of letting alacritty own them, the same abstraction works identically on macOS and Linux. The simpler the layer below, the more portable the layer above — that\u0026rsquo;s how I saw it.\nThe default shipped config is nearly empty. Colors, fonts, window decorations — until the user fills them in, alacritty is as close as it gets to a \u0026ldquo;raw terminal.\u0026rdquo; That empty state is where config-as-code begins.\nMy customizations are simple. Window decorations turned off to hide the macOS title bar. Padding removed so no pixel is wasted. The font set to a nerd font variant so nvim\u0026rsquo;s devicons render. Colors set to catppuccin mocha, written directly into the toml. No external yml, no includes — one file, done.\nThe keybindings translate Cmd key combinations into ESC sequences.\n[keyboard] bindings = [ { chars = \u0026#34;\\u001Bh\u0026#34;, key = \u0026#34;H\u0026#34;, mods = \u0026#34;Command\u0026#34; }, { chars = \u0026#34;\\u001Bl\u0026#34;, key = \u0026#34;L\u0026#34;, mods = \u0026#34;Command\u0026#34; }, { chars = \u0026#34;\\u001Bw\u0026#34;, key = \u0026#34;W\u0026#34;, mods = \u0026#34;Command\u0026#34; }, ] There is one macOS-specific issue. Cmd+H gets intercepted at the OS menu level as \u0026ldquo;Hide Application.\u0026rdquo; That key is supposed to be translated by alacritty\u0026rsquo;s keybindings into ESC+h (vim\u0026rsquo;s M-h) and forwarded to nvim, but if AppKit consumes it first, the translation never happens. That is why setup-macos.sh adds one extra line at the end.\ndefaults write org.alacritty NSUserKeyEquivalents -dict-add \u0026#34;Hide Alacritty\u0026#34; \u0026#34;\u0026#34; That single line is what makes alacritty\u0026rsquo;s nvim integration work on macOS. 
The config file cannot solve this problem, so the decision lives in the setup script.\ntmux tmux is a terminal multiplexer. It manages multiple sessions, windows, and panes inside a single terminal, and keeps processes alive even after detaching.\nSplitting the screen on top of alacritty is tmux\u0026rsquo;s job. That\u0026rsquo;s why alacritty has no tabs; tmux\u0026rsquo;s sessions / windows / panes fill that role instead.\nThe config stays close to defaults. The prefix stays at C-b (C-a collides with readline\u0026rsquo;s line-start and interferes in the shell). Copy mode runs on vim keys, and the ESC delay between nvim and tmux is removed. That last setting is small but nvim users notice the difference immediately.\ntmux waits briefly after receiving ESC to decide whether it starts a prefix or meta sequence. set -gs escape-time 0 removes that wait, and mode switches in nvim happen instantly.\nTwo keybinding decisions are directly relevant. The shortcut that opens a Claude Code pane on the right, and the shortcut that normalizes that layout in one key. The nc function described below is a higher-level tool that combines these splits into a single function call.\nbind i split-window -fh -p 30 -c \u0026#34;#{pane_current_path}\u0026#34; \u0026#34;claude\u0026#34; bind o split-window -v -l 15% -c \u0026#34;#{pane_current_path}\u0026#34; prefix i creates a Claude Code pane on the right; prefix o creates a shell pane at the bottom. These are the manual equivalents of what nc automates.\nAnother important decision is unifying pane navigation between vim and tmux behind the same keys. Pressing C-h/j/k/l without prefix, nvim moves to its left split if focused, or tmux moves to the left pane if in a shell pane. The tmux side checks whether the current pane is running a vim-family process and branches automatically. The boundary between tools disappears. 
No prefix key is needed to move between three panes.\nis_vim=\u0026#34;ps -o state= -o comm= -t \u0026#39;#{pane_tty}\u0026#39; \\ | grep -iqE \u0026#39;^[^TXZ ]+ +(\\\\S+\\\\/)?g?(view|n?vim?x?)(diff)?$\u0026#39;\u0026#34; bind-key -n \u0026#39;C-h\u0026#39; if-shell \u0026#34;$is_vim\u0026#34; \u0026#39;send-keys C-h\u0026#39; \u0026#39;select-pane -L\u0026#39; bind-key -n \u0026#39;C-j\u0026#39; if-shell \u0026#34;$is_vim\u0026#34; \u0026#39;send-keys C-j\u0026#39; \u0026#39;select-pane -D\u0026#39; bind-key -n \u0026#39;C-k\u0026#39; if-shell \u0026#34;$is_vim\u0026#34; \u0026#39;send-keys C-k\u0026#39; \u0026#39;select-pane -U\u0026#39; bind-key -n \u0026#39;C-l\u0026#39; if-shell \u0026#34;$is_vim\u0026#34; \u0026#39;send-keys C-l\u0026#39; \u0026#39;select-pane -R\u0026#39; is_vim inspects the pane\u0026rsquo;s process. If a vim-family editor is running, the keystroke goes to nvim; otherwise tmux handles the pane switch.\nzsh zsh is a Bash-compatible shell with strong completion, extended globbing, and a plugin ecosystem. It has been the default shell on macOS since Catalina.\nzsh has two halves. A .zshrc that handles PATH and environment, and an aliases.zsh that holds functions and aliases. ZDOTDIR points at ~/.config/zsh, and .zshrc sources every *.zsh in that directory.\nZDOTDIR=$HOME/.config/zsh for _zsh_conf in $ZDOTDIR/*.zsh(N); do source \u0026#34;$_zsh_conf\u0026#34; done Thanks to this pattern, I can split aliases / functions / plugin configuration into separate files. New function? Drop a new .zsh file. .zshrc itself almost never gets touched.\nThe nc function is defined as follows.\nfunction nc() { if [[ -z \u0026#34;$TMUX\u0026#34; ]]; then echo \u0026#34;Not inside a tmux session. 
Run from within tmux.\u0026#34; return 1 fi local target=\u0026#34;${1:-$PWD}\u0026#34; local dir if [[ -d \u0026#34;$target\u0026#34; ]]; then dir=\u0026#34;$(realpath \u0026#34;$target\u0026#34;)\u0026#34; else dir=\u0026#34;$(realpath \u0026#34;$(dirname \u0026#34;$target\u0026#34;)\u0026#34;)\u0026#34; fi local nvim_pane nvim_pane=\u0026#34;$(tmux display-message -p \u0026#39;#{pane_id}\u0026#39;)\u0026#34; tmux split-window -h -c \u0026#34;$dir\u0026#34; -l 30% \u0026#34;claude; exec $SHELL\u0026#34; tmux select-pane -L tmux split-window -v -c \u0026#34;$dir\u0026#34; -l 15% tmux select-pane -t \u0026#34;$nvim_pane\u0026#34; nvim \u0026#34;$@\u0026#34; } Line by line.\nThe first guard checks we\u0026rsquo;re inside tmux. Calling nc outside tmux is meaningless since the splits have nowhere to go. Print one line, return 1.\nNext is deciding the working directory. No argument means $PWD. If the target is a directory, use it as-is; if it\u0026rsquo;s a file, use its parent. realpath makes it absolute. This dir becomes the cwd for all three panes. Open a file in nvim and run git status in the side pane and you\u0026rsquo;ll see the same repo.\nThen, save the current pane\u0026rsquo;s ID. Focus needs to come back to nvim after the splits, but pane ids shift mid-split, so we grab it ahead of time.\nNow the actual two splits.\nFirst — tmux split-window -h -c \u0026quot;$dir\u0026quot; -l 30% \u0026quot;claude; exec $SHELL\u0026quot; creates a 30%-wide pane on the right and runs claude in it. The exec $SHELL switches the pane to a shell after claude exits, so it does not close immediately.\nSecond — tmux select-pane -L jumps back to the left (the original nvim slot), and tmux split-window -v -c \u0026quot;$dir\u0026quot; -l 15% cuts that left side top and bottom. The bottom 15% becomes a small terminal pane.\nFinally, launch nvim. Move focus back to the saved nvim_pane (now the top-left) and run nvim \u0026quot;$@\u0026quot;. 
If the argument is a file, that file opens; if it\u0026rsquo;s a directory, LazyVim\u0026rsquo;s dashboard appears.\nThe result is this layout, produced by a single nc call.\n┌───────────────────────────┬──────────────┐ │ │ │ │ nvim │ claude │ │ (LazyVim) │ (30%) │ │ │ │ ├───────────────────────────┤ │ │ shell (15%) │ │ └───────────────────────────┴──────────────┘ A few helper aliases wrap this function.\nalias zrc=\u0026#34;nc ~/.config/zsh/\u0026#34; alias nvimrc=\u0026#34;nc ~/.config/nvim/\u0026#34; alias alc=\u0026#34;nc ~/.config/alacritty/\u0026#34; alias tlc=\u0026#34;nc ~/.tmux.conf\u0026#34; So one zrc opens the zsh config directory in nvim with Claude Code already sitting in the side pane. Edit a config file, get a review in the next pane over.\nOther aliases include f (pick a file via fzf and open in nvim), g (lazygit), ?? (fabric-ai), ? (w3m search). nc is the center of this post because one function defines where five tools are placed, all at once.\nnvim Neovim is a refactored fork of Vim, adding asynchronous plugins, a built-in LSP client, and Lua-based configuration. LazyVim is a configuration framework on top, providing sensible defaults and a modular extras system.\nThe editor is nvim on a LazyVim base. Instead of writing init.lua from scratch, I chose to inherit LazyVim\u0026rsquo;s reasonable defaults. LSP, treesitter, finder, mason-based LSP installation — all wired up out of the box; building the same thing by hand takes days. And because language extras toggle line-by-line inside lazyvim.json, adding a new language is one line plus :Lazy sync, which installs LSP / treesitter / formatter in one shot. Currently 32 extras are active, covering 14 languages alongside coding, editor, formatting, and test tooling.\nCustomization splits into two places. lua/config/* is where I override LazyVim\u0026rsquo;s defaults (keymap, option, autocmd overrides); lua/plugins/* is where new plugins or extra options for LazyVim extras land. 
Per-language settings are bundled into a single plugins/language/\u0026lt;lang\u0026gt;.lua. Removing go support means deleting that one file.\nlua/plugins/language/ ├── go.lua ├── html.lua ├── java.lua ├── markdown.lua └── typescript.lua I won\u0026rsquo;t go deeper into nvim itself. Keymaps, LSP configuration, debugger integration, snacks.nvim picker, harpoon2 workflow are each a separate post. This post\u0026rsquo;s scope ends at \u0026ldquo;the pattern of building modules on a LazyVim base.\u0026rdquo;\nClaude Code The right 30% pane belongs to Claude Code. It installs through one brew cask line (cask \u0026quot;claude-code\u0026quot;), and nc creates its position. Its role on the screen is simple: while you edit on the left, it pairs with you on the right with the same directory in context.\nThe dotfiles claude/ module actually carries more. settings, agents, hooks, rules, skills all get stowed under ~/.claude/, and they tune Claude Code\u0026rsquo;s behavior in fine grain. agents handle task delegation, hooks handle automation at file-save time, skills hold reusable workflows, rules hold per-language and shared code conventions.\nBut this post\u0026rsquo;s slot ends at \u0026ldquo;five tools in one screen.\u0026rdquo; Each Claude Code component — settings, agents, hooks, skills, MCP, output styles — will be covered in separate posts.\nThe one thing to take away from this post is simple. Claude Code runs in the right 30% pane that nc creates. Nothing more, nothing less. The layout starts the tool; the tool operates within the layout.\nLimits and Trade-offs A few cases where this setup doesn\u0026rsquo;t fit.\nWork that depends on a GUI debugger. If browser devtools or a heavyweight IDE\u0026rsquo;s visual debugger is your daily tool, a terminal-centric layout will keep pulling you between two worlds. This setup rests on the assumption that code editing + shell + AI pair cover 99% of the work.\nPair programming over screen share. 
When you show your screen to a colleague, the intent behind nvim keybindings often doesn\u0026rsquo;t read. Seeing dd delete a line can be confusing for viewers unfamiliar with vim. If pair programming is frequent, a GUI editor has a lower communication cost.\nDifferences from the Linux setup. The same dotfiles work on Linux too, but the parts this post doesn\u0026rsquo;t cover — Hyprland window manager, Kime input method, Linux-specific packages — are separate. On macOS, alacritty handles the terminal emulator role; on Linux, Hyprland takes part of it.\nThe missing pieces, called out honestly. The input method (kime), the window manager (hypr), the keymapper (karabiner) — they fall outside this post\u0026rsquo;s scope, and I left them out. Input method choice does ultimately matter for Korean-speaking developers, but I judged that to be a separate post.\nBootstrap The bootstrap has two phases. OS-specific package installation and stow-based dotfile symlinks.\nThe entry point is setup.sh. It branches on uname -s and sources either setup-macos.sh or setup-linux.sh. On macOS, the order goes like this.\nXcode Command Line Tools — install if missing, otherwise pass. Homebrew — install via the official script if missing, otherwise pass. brew bundle — installs everything in Brewfile in one shot. alacritty, tmux, neovim, the zsh plugin manager (zinit), search tools like fzf/fd/ripgrep/zoxide, lazygit, gh, claude-code — all together. alacritty defaults write — clears the NSUserKeyEquivalents entry so macOS doesn\u0026rsquo;t intercept Cmd+H at the AppKit menu level. This is what lets alacritty receive Cmd+H and translate it into vim\u0026rsquo;s M-h. Once OS-specific install is done, control returns to setup.sh proper. 
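That entry-point dispatch can be sketched as follows, assuming only the file names the post mentions (setup-macos.sh, setup-linux.sh); the helper function is illustrative, not the repo's actual script:

```shell
# pick_setup_script UNAME — map a `uname -s` value to the OS-specific
# setup script that setup.sh should source. Returns 1 for unsupported OSes.
pick_setup_script() {
  case "$1" in
    Darwin) echo "setup-macos.sh" ;;   # Homebrew, brew bundle, defaults write
    Linux)  echo "setup-linux.sh" ;;   # distro packages
    *)      echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# In setup.sh proper this would be roughly:
#   source "$(pick_setup_script "$(uname -s)")"
```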
There, GNU stow symlinks the dotfiles, module by module.\nCOMMON_MODULES=(claude git lazygit nvim tmux obsidian zsh) for m in \u0026#34;${COMMON_MODULES[@]}\u0026#34;; do stow --restow \u0026#34;$m\u0026#34; done Each module directory has the shape \u0026lt;module\u0026gt;/.config/\u0026lt;tool\u0026gt;/.... For example, nvim/.config/nvim/init.lua gets symlinked by stow to ~/.config/nvim/init.lua. It follows the XDG_CONFIG_HOME convention directly, which makes adding or removing a module a clean per-directory operation.\nOn macOS, alacritty and karabiner are stowed in addition, and finally a second-brain Obsidian vault is cloned and registered as obsidian-cli\u0026rsquo;s default vault. After those two phases, one exec zsh activates the full configuration.\nClosing Starting from where we began: one nc call, a screen split into three, five tools each handling their role. Behind that call is a dotfiles repo that reproduces with one bash setup.sh. The repo is a set of decisions about what role alacritty / tmux / zsh / nvim / Claude Code each play.\nThe repository is here.\ngit clone https://github.com/byunghak/.dotfiles.git ~/.dotfiles cd ~/.dotfiles \u0026amp;\u0026amp; bash setup.sh Two lines to move the same environment to a new machine.\nFollow-up posts will cover Claude Code\u0026rsquo;s internals (settings, agents, hooks, skills, MCP) one at a time. If this post covered the layout itself, the next ones cover the decisions within each tool.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/devenv/macos-dev-environment/","summary":"alacritty + tmux + nvim + zsh + Claude Code, all in a single screen. A dotfiles repo that reproduces the whole thing with one line of setup.sh.","title":"macOS Dev Environment: Dotfiles"},{"content":"I threw away the AirPods I\u0026rsquo;d been using for over a year — a prize I won at a company event.\nIn 2022, I joined my current company as a backend engineer. 
There was a lot to learn in this new domain, and I focused on tackling each task one at a time.\nBuild a feature, resolve an issue, move on to the next sprint. The cycle itself wasn\u0026rsquo;t the problem.\nThe problem was that somewhere along the way, I had closed my ears.\nWhen sharing technical context or the reasoning behind decisions with colleagues, I thought I was communicating — but in reality, the message often didn\u0026rsquo;t land. In code reviews, technical discussions, incident responses — I was poor at translating what was in my head into a form others could understand, and poor at listening to others on their terms.\nWorking in my own bubble became a habit, and burnout followed.\nAfter much deliberation, I proposed a three-month sabbatical.\nI didn\u0026rsquo;t want to simply rest and call it done. I wanted to reflect on what I was lacking and spend the time improving.\nI used to hate documentation at work. I knew it mattered, but committing thoughts to writing always felt like a chore.\nBut communication isn\u0026rsquo;t a skill you build in your head. It builds slowly — by putting things out, trying to convey them, and sitting with how they land.\nSo I\u0026rsquo;m starting this blog. Not to write well — to practice the act of writing and sharing itself.\nTechnical things, things I\u0026rsquo;ve felt at work — I want to keep putting my thoughts out there, and build the muscle of sharing.\nI used to think \u0026ldquo;a good engineer = someone who knows technology well.\u0026rdquo; That\u0026rsquo;s not wrong, but it\u0026rsquo;s an incomplete definition.\nIf what I know never reaches my team, that knowledge might as well not exist.\nA good engineer isn\u0026rsquo;t someone who knows technology well, but someone who can share that knowledge with their team.\nThat\u0026rsquo;s why I threw away my AirPods. 
To listen — and to reach out.\n","permalink":"https://wid-blog.pages.dev/en/posts/career/dable/starting-sabbatical/","summary":"A good engineer isn\u0026rsquo;t someone who knows technology well, but someone who can share that knowledge with their team.","title":"I Threw Away My AirPods"},{"content":"Late last year, a conversation started about improving our DSP Fallback performance by introducing a CTR prediction model.\nFallback kicks in when the main DSP decides there\u0026rsquo;s no ad to serve. Its purpose is to raise fill rate — the ratio of impressions to ad slots.\nI was a backend engineer. I had no background in AI.\nThe expectation was that wiring the surrounding systems would take more work than the model itself, so the project landed on my plate.\nThis post is a record of the technical decisions I made, the reasoning behind them, and what I learned as an AI non-expert building ML infrastructure.\nModel Choice: Logistic Regression The model was Logistic Regression.\nSince the goal was improving ad CTR, we just needed to learn whether a given impression would be clicked — a binary classification problem.\nInternally, both LR and LightGBM were recommended — the two models commonly used in ad platforms. But this was an initial version, and I didn\u0026rsquo;t want to take on complex tuning and operational burden from day one.\nSo I picked the simpler option: LR.\nLanguage and Framework Choice I went with Python and sklearn. For both the training batch and the inference server.\nI initially considered ONNX + Go. Our backend was evaluating a migration from Node to Go for performance reasons, and a new project felt like a place where I could start with Go. For inference, pushing the model through ONNX would give me framework independence and better performance.\nBut the internal ML operating environment was Python-centric. The reference examples, the shareable code, the deployment patterns — all in Python. 
When you need advice and reviews, working in the same language felt like the right call. I set aside the performance angle and chose continuity of operations.\nThe framework choice followed similar logic. I knew ONNX had better inference performance than sklearn, but for a lightweight model like LR, that gain wouldn\u0026rsquo;t move the needle. sklearn felt sufficient for training and saving, and forcing a heavy pipeline onto a light model seemed like overengineering to me.\nML Lifecycle Architecture: Split Into Three Components I divided the ML Lifecycle into three components.\nTraining batch: Periodically trains the LR model and pushes the trained model to the model store. Model store: Built on MLflow. Keeps versioned copies of models written by the training batch. Inference server: Loads the latest model from the store and serves real-time predictions. flowchart LR A[\"Training batch\"] --\u003e|\"① push model ② move champion alias\"| B[\"Model store (MLflow)\"] A --\u003e|\"③ call Argo Rollouts API\"| C[\"Inference server\"] B -.-\u003e|\"④ load champion on pod startup\"| C The flow is simple: training batch → model store → inference server. The three components connect only through model files, and the training schedule runs independently from inference.\nInside the Training Batch: The Promotion Gate The training batch wasn\u0026rsquo;t just \u0026ldquo;train → save.\u0026rdquo; Once training finished, the model had to pass through a Promotion Gate — a quality check — before the champion alias would move.\nflowchart LR A[\"Data loading\"] --\u003e B[\"Preprocessing\"] --\u003e C[\"Training\"] --\u003e D[\"Evaluation\"] --\u003e E{\"Promotion Gate\"} E --\u003e|\"PASS\"| F[\"Update champion alias + trigger rollout\"] E --\u003e|\"FAIL\"| G[\"Keep current champion\"] The criteria were simple. If the trained model\u0026rsquo;s evaluation metrics crossed the predefined thresholds, it passed; otherwise, it failed. 
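A check like that is small enough to sketch. The metric names and threshold values below are hypothetical stand-ins, not the project's actual configuration:

```python
# Minimal sketch of a Promotion Gate: pass only if every evaluation
# metric clears its predefined threshold. Metric names ("val_auc",
# "val_logloss") and thresholds are assumptions for illustration.
THRESHOLDS = {"val_auc_min": 0.75, "val_logloss_max": 0.40}

def promotion_gate(metrics: dict) -> bool:
    """True = PASS (promote the new model); False = FAIL (keep current champion)."""
    return (
        metrics["val_auc"] >= THRESHOLDS["val_auc_min"]
        and metrics["val_logloss"] <= THRESHOLDS["val_logloss_max"]
    )
```

The point of keeping it this dumb is that the gate, not a human, decides whether the alias moves.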
On pass, the champion alias moved to the new version and a rollout was triggered. On failure, the new model was only logged to the registry while the current champion kept serving traffic.\nThis meant a degraded model couldn\u0026rsquo;t accidentally reach production — without any code changes.\nDeployment: Argo Rollouts For getting new models into the inference server, I used Argo Rollouts — a natural choice since we were already running on k8s.\nsequenceDiagram participant T as Training batch participant R as MLflow participant I as Inference server pod T-\u003e\u003eR: ① Register new model T-\u003e\u003eR: ② Move champion alias to new version T-\u003e\u003eI: ③ Call Argo Rollouts API Note over I: ④ Rollout replaces pods one by one I-\u003e\u003eR: ⑤ New pod loads the champion model R--\u003e\u003eI: Model + metadata Note over I: ⑥ Service resumes with the new model MLflow\u0026rsquo;s alias feature lets you tag a model version with a name like \u0026ldquo;champion\u0026rdquo; to point at the current production model. When the training batch passes the Promotion Gate, it moves the champion alias to the new version and then calls the Argo Rollouts API to trigger a deployment. The rollout replaces inference server pods one at a time, and each new pod loads whichever model has the champion alias on startup before entering the service.\nLooking Back The LR + sklearn + MLflow combination was simple, and it ran light and fast.\nWhat I regret most was choosing Python + sklearn. The inference server is currently a FastAPI-based Python service running on k8s. Each pod runs on a single core and loads its own copy of the model. As features grew, inference cost climbed and the number of pods grew with it. If we had gone with ONNX + Go and used multiple cores inside a single process, the same load could probably have been handled with fewer pods. 
At the time, I judged that continuity of operations was the right call — but the cost of that decision showed up in the operational phase.\nStarting out, my biggest worry was \u0026ldquo;can I do this without an AI background?\u0026rdquo; By the end, I found that what I needed was a bit different. It wasn\u0026rsquo;t ML algorithms or infrastructure expertise — what mattered more was how precisely I understood the domain, and being able to judge which features to combine and how. Reading data and spotting patterns — that analysis skill — turned out to be just as important.\nReferences Revisiting Logistic Regression Choosing a Model Training Framework: sklearn vs ONNX MLflow: Filling the Gap in the ML Lifecycle ","permalink":"https://wid-blog.pages.dev/en/posts/career/dable/dsp-fallback-ctr-ml-lifecycle/","summary":"Building my first ML Lifecycle — a three-tier architecture for DSP Fallback CTR prediction — as a backend engineer without an AI background. The technical decisions I made, and what I learned through running it.","title":"My Journey Building an LR-based ML Lifecycle"},{"content":"MLflow fills the boundary between experiment and model in the ML lifecycle. On the experiment side it remembers \u0026ldquo;what parameters trained what.\u0026rdquo; On the model side it holds \u0026ldquo;which version is production right now.\u0026rdquo; That boundary is not the exclusive problem of large ML teams. It shows up just as clearly when you\u0026rsquo;re running a single Logistic Regression model.\nThis post looks at MLflow itself: what it is, where it is positioned in the ML lifecycle, and which pieces a lightweight team can pick.\nML Lifecycle ML projects roughly move through four stages.\nExperiment — you look at the data, try parameters, train models. You log metrics and go back to try again. Model — once you have something worth keeping, you declare \u0026ldquo;this is our model.\u0026rdquo; A version and a lineage attach to it. 
Deployment — you put that model into the serving environment. Rollout, rollback, traffic shifting all live here. Monitoring — you watch the live model for drift and degradation. Each stage has its own problem. Experiment struggles with remembering what was tried. Model struggles with agreeing on which one is real right now. Deployment struggles with swapping one thing for another. Monitoring struggles with deciding when to retrain.\nMLflow mostly fills the first two. It reaches into deployment and monitoring, but its center of gravity is in experiment and model.\nThe model_v3_final.pkl Problem To see why a tool belongs at this boundary, first look at what breaks when you try to manage models with just a filesystem.\nIt starts simple. You upload model.pkl to S3 and the inference server reads it. Every time training finishes, you overwrite the file.\nThen you need to roll back. Yesterday\u0026rsquo;s version. But the file has already been overwritten. So you start splitting: model_v2.pkl, model_v3.pkl. Before long you get model_v3_final.pkl. Then model_v3_final_really.pkl.\nThese names leave three things unsolved.\nLineage — there is no way to trace model_v3_final.pkl back to the code, the data, and the parameters that produced it. You cannot reproduce it even with the same code. Alias — \u0026ldquo;which model is production right now\u0026rdquo; gets managed with a filename convention outside the code. Does the inference server read latest.pkl? Does it take a version via env var? Every decision becomes ad hoc. Reproducibility — a few months later you want to repeat an experiment, but nothing remembers the parameters and the code from that run. To fix these you eventually need a metadata layer on top of the \u0026ldquo;filename\u0026rdquo; layer. That is the gap MLflow fills.\nThe Four Pieces of MLflow MLflow is four independent components in one package. You pick which ones you use.\nTracking It records a training session as a run. 
Parameters, metrics, and artifacts (model files, plots, logs) all attach to the run. Multiple runs group under an experiment.\nimport mlflow with mlflow.start_run(): mlflow.log_param(\u0026#34;C\u0026#34;, 0.1) mlflow.log_metric(\u0026#34;val_auc\u0026#34;, 0.782) mlflow.sklearn.log_model(model, \u0026#34;model\u0026#34;) This chunk is the seed of lineage. Months later you can ask \u0026ldquo;val_auc was 0.78, what were the parameters?\u0026rdquo; and have an answer.\nModel Registry If Tracking records how it was trained, Registry records which result you want to declare yours. You promote one of the logged artifacts into a registered model and a version number attaches. v1, v2, v3 stack up automatically.\nOn top of those versions you can attach an alias. The champion alias is a mutable reference that points to a specific version. When a new version passes validation, you move the champion alias. No code changes, no filename rules, just one alias moving. The \u0026ldquo;model that production points at\u0026rdquo; is replaced.\nmlflow.register_model(\u0026#34;runs:/\u0026lt;run-id\u0026gt;/model\u0026#34;, name=\u0026#34;ctr-model\u0026#34;) client.set_registered_model_alias(\u0026#34;ctr-model\u0026#34;, \u0026#34;champion\u0026#34;, version=7) Registry clears the entire model_v3_final.pkl problem. Lineage auto-links to runs, aliases replace filename rules, and reproduction is a matter of looking up a run id.\nOne important constraint: using Registry requires a database backend. File storage alone (./mlruns) does not expose the registry API. Even if you want to start light, you have to stand up at least SQLite, or PostgreSQL / MySQL for real use. Since MLflow 3.7.0 the default backend switched to SQLite, which lowers the first entry barrier a little.\nModels This piece standardizes what \u0026ldquo;a model file\u0026rdquo; means. Each framework (sklearn, pytorch, xgboost) gets a flavor, and the same model can be saved under multiple flavors. 
A saved model can be loaded without the original framework code.\nModels is the portability layer between experiment and deployment. Where Tracking and Registry deal with which model is it, Models deals with how is it serialized.\nProjects An MLproject file plus conda or docker config wraps everything so that \u0026ldquo;anyone running it gets the same environment.\u0026rdquo; mlflow run . sets up the environment and runs training.\nThis is the least used of the four. Teams with their own batch execution standard usually don\u0026rsquo;t add an MLproject layer; they keep their own standard.\nLifecycle Mapping flowchart LR E[Experiment] --\u003e|\"Tracking(run, param, metric)\"| M[Model] M --\u003e|\"Registry(version, alias)\"| D[Deployment] M -.-\u003e|\"Models(flavor)\"| D P[Projects] -.-\u003e|\"runtime env\"| E D --\u003e Mo[Monitoring] Tracking: inside the experiment stage Registry: in the model box between experiment and deployment Models: the portability axis from model into deployment Projects: an optional reproducibility layer on experiment Monitoring is not covered by MLflow directly. It\u0026rsquo;s another tool\u0026rsquo;s job. That diagram shows MLflow\u0026rsquo;s scope most concisely. Four pieces, each in its own slot, and your project picks which slots to fill.\nChoosing Tracking and Registry Picture a lightweight LR model running in production. The combination you most often see, out of the four pieces, is two: Tracking and Registry.\nWhy Tracking. The training batch re-runs LR every cycle, and each run has different parameters and validation metrics. You need to trace, later, which run produced which number. The records pile up faster than filenames can describe them. Tracking fills exactly that gap.\nWhy Registry. Only the models that pass the validation step should become \u0026ldquo;champion.\u0026rdquo; The inference server loads that champion. 
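On the consumer side, MLflow resolves an alias at load time through a models:/\u0026lt;name\u0026gt;@\u0026lt;alias\u0026gt; URI. A minimal sketch of the pod-boot load, reusing the ctr-model name from the Registry example earlier:

```python
def champion_uri(name: str, alias: str = "champion") -> str:
    # An alias URI resolves to whichever version the alias points at
    # when the model is loaded, not when the URI string is built.
    return f"models:/{name}@{alias}"

# On pod startup (requires a reachable MLflow tracking server):
#   import mlflow.sklearn
#   model = mlflow.sklearn.load_model(champion_uri("ctr-model"))
```

Because the URI names the alias rather than a version, moving the champion alias is the only write needed to change what new pods load.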
If you manage this with filename conventions, the server ends up polling latest.pkl and you get a race where an unvalidated model reaches production before validation finishes. Aliases remove that race. The actor pulling the deploy trigger and the object being deployed are cleanly separated.\nMoving the alias and swapping the inference server pods are two different events. Once the alias moves, a deployment tool (for example, Argo Rollouts) triggers the pod replacement. Rollouts starts new pods; each new pod, on boot, loads the model that champion currently points at and joins the service. MLflow says \u0026ldquo;which one is champion,\u0026rdquo; and the deployment tool handles \u0026ldquo;how to place it into service.\u0026rdquo;\nThis separation is the point. MLflow does not need to do everything. It just needs to fill its boundary.\nComponents Not Used Models format comes along for free when you log models through Tracking. You don\u0026rsquo;t pick it explicitly, but you get its benefits. Registry can return the model as runs:/\u0026lt;id\u0026gt;/model because of this format.\nProjects is often skipped. If a team already has a stable batch execution standard, adding an MLproject layer is duplication. When a batch runs inside a single framework, the reproducibility win from Projects is small.\nServing is also optional. MLflow offers its own serving endpoint (mlflow models serve), but handling lightweight-model inference directly in an existing server with sklearn is often lighter and easier to integrate with existing infrastructure. Delegating the serving layer to MLflow is rarely justified.\nUsing two pieces out of four is not \u0026ldquo;half using\u0026rdquo; MLflow. 
Filling only the boundary you need and leaving the rest to other tools is, if anything, closer to how this tool is meant to be used.\nClosing The word was \u0026ldquo;boundary.\u0026rdquo; That boundary is where meta-information (when, how, with what, which one is real right now) starts piling up faster than filenames can describe it. MLflow is the lightweight metadata layer at that point. How lightweight depends on you.\nIt isn\u0026rsquo;t a tool for large ML teams only. Even running a single LR, the same boundary shows up. When it does, you fill the slots you need.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/ml/mlflow/","summary":"Which slot of the ML lifecycle each MLflow component fills, and which pieces a lightweight team can pick.","title":"MLflow and the ML Lifecycle"},{"content":"sklearn and ONNX aren\u0026rsquo;t answers to the same question. The moment you line them up with \u0026ldquo;what should I use to train my LR?\u0026rdquo; the comparison turns into an illusion. One is a framework for training models. The other is a format for shipping already-trained models. They don\u0026rsquo;t operate at the same layer.\nThis post isn\u0026rsquo;t about choosing between sklearn and ONNX. It\u0026rsquo;s about why \u0026ldquo;sklearn or ONNX?\u0026rdquo; isn\u0026rsquo;t a well-formed question to begin with.\nPrerequisites Search \u0026ldquo;sklearn vs ONNX\u0026rdquo; and the two tools come back stacked side by side as if they were competing for the same role. Pros and cons, benchmarks, usage examples — all arranged as parallel choices. That arrangement is what creates the illusion.\nsklearn is a library that takes data and trains models. LogisticRegression, RandomForest, GradientBoosting — training algorithms and their implementations. When training finishes, you save the resulting model as a .pkl file and reload it in a Python process to run predictions. 
Training through serving, the entire workflow stays within the Python ecosystem.\nONNX has no training algorithms. What ONNX provides is a framework-neutral way of representing a model that has already been trained. A transformer trained in PyTorch and a logistic regression trained in sklearn can both be converted into the same ONNX graph. From there, any compatible runtime can execute that graph.\nPut plainly — one is a trainer, the other is transport. Asking which to pick between \u0026ldquo;a trainer and a transport\u0026rdquo; is a malformed question. Either they move together, or the transport isn\u0026rsquo;t needed at all.\nsklearn sklearn does two things at once. It trains models, and it stores those models as Python objects you can reload later.\nfrom sklearn.linear_model import LogisticRegression import joblib model = LogisticRegression() model.fit(X_train, y_train) joblib.dump(model, \u0026#34;model.pkl\u0026#34;) That .pkl file follows Python\u0026rsquo;s native serialization format. Nothing outside Python can read it. You need the same sklearn version and the same NumPy version installed to reload it safely. In return, training, storage, and serving connect in a single pipeline with no seams.\nMost ML code trains in Python and serves from a Python process. If nothing in that path demands another layer, sklearn\u0026rsquo;s native storage format is the shortest route.\nONNX ONNX is a framework-neutral intermediate representation (IR). It records a model\u0026rsquo;s compute graph in a standardized opset, and a separate runtime such as ONNX Runtime reads that graph and executes it.\nInserting this one extra step unlocks a few things.\nLanguage boundary — a model trained in PyTorch or sklearn can run for inference in C++, C#, Java, or Rust. No Python needed. Hardware boundary — ONNX Runtime provides graph optimizations and hardware-specific execution providers. The same model runs on CPU, CUDA GPU, TensorRT, CoreML, and more. 
Framework boundary — when the team has PyTorch models and TensorFlow models mixed together and wants a single serving stack, ONNX becomes the common denominator. If those boundaries actually exist in your project, the ONNX layer justifies its cost. If they don\u0026rsquo;t, the layer is nothing more than an extra step in the pipeline.\nPerformance gains are conditional \u0026ldquo;ONNX Runtime is faster\u0026rdquo; is a claim you hear often. It\u0026rsquo;s half-true.\nONNX Runtime can apply graph optimizations (operator fusion, constant folding) and plug into hardware accelerators (CUDA, TensorRT, OpenVINO). In those cases, it can run a given model faster than the native framework. The important word is can.\nFor those gains to actually show up, at least one of the following usually has to be present.\nA GPU or dedicated accelerator A non-Python runtime that sidesteps the GIL A graph large enough that optimization yields meaningful gains Logistic regression meets none of these conditions. It\u0026rsquo;s a single dot product between the weight vector and the input vector. Graph fusion has almost nothing to fuse. On a CPU, expecting a meaningful latency difference between ONNX Runtime and sklearn for LR inference isn\u0026rsquo;t realistic.\nSo \u0026ldquo;ONNX is faster\u0026rdquo; is a sentence that isn\u0026rsquo;t actually true until you also specify which model and which environment.\nWhen an ONNX layer is worth it Rather than abstract decision rules, it\u0026rsquo;s more useful to list the concrete conditions under which adding ONNX clearly pays off.\nTraining language and serving language differ. Training runs in Python; inference has to run inside a C++/Java/Go service. ONNX bridges the gap. GPU or edge inference is required. The model is large, latency requirements are tight, or it has to live on an edge device. ONNX Runtime\u0026rsquo;s execution providers support those targets. Multiple frameworks need to converge on one serving stack. 
PyTorch, sklearn, and TensorFlow models all have to run on the same inference server. ONNX becomes the common format. Training code and serving infrastructure have different lifecycles. You want the training code refactored and version-bumped frequently, but the serving binary has to stay stable. ONNX gives you a fixed point in between. If none of those match your situation, what you actually get from adding ONNX is an extra conversion step, opset version compatibility to worry about, and float/double precision edge cases to debug. Cost without the payoff.\nLightweight LR Scenario Consider a lightweight LR model running on a Python training plus Python serving path. GPU inference isn\u0026rsquo;t needed. The model is the size of a single weight vector. There\u0026rsquo;s no plan to run models from other frameworks alongside it. None of the four conditions above applies.\nIn that setup, the real decision isn\u0026rsquo;t \u0026ldquo;should we use ONNX?\u0026rdquo; — it\u0026rsquo;s \u0026ldquo;does an ONNX layer belong in this architecture?\u0026rdquo; It doesn\u0026rsquo;t. sklearn\u0026rsquo;s native .pkl storage is the shortest path from training to serving.\nSummary Back to the starting question. \u0026ldquo;sklearn or ONNX?\u0026rdquo; isn\u0026rsquo;t in a form that can be answered. The two tools don\u0026rsquo;t operate at the same layer.\nThat question has to be split in two. One half is \u0026ldquo;which library should I train with?\u0026rdquo; — a choice between sklearn, PyTorch, XGBoost, and other training frameworks. The other half is \u0026ldquo;what format should the trained model ship in?\u0026rdquo; — which can be each framework\u0026rsquo;s native storage format, or ONNX.\nOnce you split it, \u0026ldquo;do I need an ONNX layer?\u0026rdquo; becomes independent of the training framework question. And for most lightweight models, that question closes fast with a \u0026ldquo;no\u0026rdquo;. 
There\u0026rsquo;s no reason to add a layer where none is needed.\nTwo tools that aren\u0026rsquo;t answers to the same question give awkward answers whenever you force them into the same question. Rewrite the question first, and the answers follow naturally.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/ml/model-training-frameworks/","summary":"sklearn and ONNX aren\u0026rsquo;t competing at the same layer. Once you separate their roles, the real question becomes \u0026lsquo;do I need an ONNX layer at all?\u0026rsquo;","title":"Choosing a Model Training Framework: sklearn vs ONNX"},{"content":"When picking a baseline for CTR prediction, the candidates are many. Gradient Boosting, Neural Networks, and Logistic Regression. Among them, LR is still often selected as the baseline. There are reasons for that.\nThis post revisits those reasons from first principles: what LR is, and why it is so often chosen for this role.\nThree Properties The reasons LR has long served as the baseline in CTR prediction come down to three.\nLightweight. The model is a single dot product. Training and inference both scale linearly with the number of features.\nInterpretable. Every coefficient directly indicates \u0026ldquo;how much this feature contributes to the outcome.\u0026rdquo;\nProbability output. It outputs values between 0 and 1. In ads, you multiply those directly against a bid.\nThe rest of the post explains why these three are \u0026ldquo;structurally\u0026rdquo; inherent to LR.\nFrom Linear to Sigmoid The most direct way to understand Logistic Regression is to start from linear regression.\nLinear regression outputs a weighted sum of the inputs.\n$$ z = w \\cdot x + b $$The problem is that $z$ ranges over all real numbers. To produce a probability like CTR, the output must lie between 0 and 1. 
Linear regression does not guarantee that.\nThe sigmoid function solves this.\n$$ \\sigma(z) = \\frac{1}{1 + e^{-z}} $$Sigmoid smoothly compresses the entire real line into $(0, 1)$. No matter how large the input, it approaches 1; no matter how small, it approaches 0. Pass the output of linear regression through sigmoid, and you get a probability.\nThis simple composition is all there is to Logistic Regression: a linear model with a probability layer on top.\nOne thing worth noting. The probability output is nonlinear, but the decision boundary, the surface that separates the two sides at probability 0.5, remains linear. The hyperplane $w \\cdot x + b = 0$ is itself the boundary. LR is \u0026ldquo;a linear classifier with probabilities bolted on.\u0026rdquo;\nlog-loss Once the model structure is fixed, training becomes \u0026ldquo;finding good $w$ and $b$.\u0026rdquo; We need a criterion for \u0026ldquo;good.\u0026rdquo;\nLinear regression uses MSE. LR does not. Why?\nLR\u0026rsquo;s output is a probability. There is a more suitable choice of loss for probabilistic models: log-loss (a.k.a. cross-entropy).\n$$ L = -\\frac{1}{N} \\sum_{i=1}^{N} \\left[ y_i \\log \\hat{y}_i + (1 - y_i) \\log (1 - \\hat{y}_i) \\right] $$When the label is 1, the loss shrinks as $\\log \\hat{y}$ grows; when the label is 0, the loss shrinks as $\\log(1 - \\hat{y})$ grows. The closer the predicted probability gets to the truth, the closer the loss gets to zero.\nLog-loss is convex for LR. No local minima. The optimization converges to the global optimum. This property is the mathematical reason LR trains quickly on large-scale data.\nThe Structural Roots of the Three Faces The three characteristics from the overview, lightweight, interpretable, probability output, all follow from the structure above.\nLightweight A trained LR model is ultimately a weight vector $w$ and a bias $b$. Inference is one dot product and one sigmoid. 
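That whole inference step fits in a few lines. A sketch with NumPy, combining the two formulas above:

```python
import numpy as np

def predict_ctr(w: np.ndarray, b: float, x: np.ndarray) -> float:
    """LR inference as described: one dot product, one sigmoid."""
    z = float(w @ x) + b             # linear part: z = w · x + b
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid squashes z into (0, 1)
```

With w = 0 and b = 0 the output is exactly 0.5, which is the decision boundary from the previous section.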
Whether you have a million features or ten million, the computation scales linearly with the feature count. Compared to the many multiplications and nonlinearities in tree ensembles or neural networks, LR requires far less computation.\nInterpretable A coefficient $w_i$ means \u0026ldquo;when feature $i$ increases by one unit, the log-odds shift by $w_i$.\u0026rdquo; The sign indicates direction; the magnitude indicates influence. When you want to know \u0026ldquo;which feature contributes positively to clicks\u0026rdquo; in the ad domain, LR answers with a single table of coefficients. This satisfies the accountability requirements on the operations side.\nProbability Output Many classifiers output only a ranking score. LR outputs a calibrated probability. Ad expected-value math requires multiplying that number directly: predicted CTR × bid = expected revenue. A score that is not a probability cannot be used directly in the bidding formula.\nCTR Prediction CTR prediction as a problem has three characteristics.\nSparse. Most features are one-hot-encoded categoricals. Out of millions of dimensions, only a handful are 1; the rest are 0.\nHigh-dimensional. The cross-product of ad, user, and context spreads across millions to hundreds of millions.\nLarge-scale. Training data accumulates in large daily volumes.\nLR aligns with all three. The dot product of a sparse vector only needs to touch the non-zero entries, so the computation scales with the actual count of populated features, not the raw dimensionality. Training is easy to distribute via the SGD family. Inference fits inside the tight latency budget of real-time bidding.\nWhen bringing up a CTR model for the first time, these characteristics become decisive. You need to establish a baseline quickly, covering the training pipeline, serving, and monitoring, and validate the entire lifecycle first. 
A more complex model delays that validation itself.\nLimits and What Comes Next Having seen why LR serves as the baseline, we should also see why it is eventually replaced.\nThe biggest limit is the absence of nonlinear interactions. Products of features, conditional effects, complex combinations. LR cannot discover those on its own. A human has to define them in advance through feature engineering. As feature combinations grow, the engineering cost increases and operations become constrained by feature-design reviews.\nSo when do you move on? When data and operational headroom reach a point \u0026ldquo;feature engineering can no longer absorb.\u0026rdquo; Gradient Boosting Decision Trees learn interactions on their own. Neural networks go further, converting high-cardinality categoricals into continuous vectors through embeddings. Both directions address exactly LR\u0026rsquo;s limits.\nThat said, LR remains a reasonable starting point. Without a baseline, if you start with a complex model, you cannot distinguish the model\u0026rsquo;s contribution from the pipeline\u0026rsquo;s. The numbers LR provides become the reference line for every comparison that follows.\nClosing Choosing the old model had its reasons.\nThose reasons are in the structure. The composition of a linear model and sigmoid, the convexity of log-loss, the efficiency in sparse, high-dimensional spaces. Together, these three keep LR as the baseline for CTR prediction.\nEven when the time comes to move to the next model, the numbers LR provided remain as the baseline.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/ml/logistic-regression/","summary":"The structure and characteristics of Logistic Regression, and why an old model still holds the baseline position in CTR prediction.","title":"Revisiting Logistic Regression"},{"content":"I understood Go\u0026rsquo;s concurrency model conceptually. 
Goroutines are lightweight threads, channels handle communication, the sync package provides synchronization. But I had never compared the patterns side by side with code and benchmarks.\nI decided to implement and benchmark them myself. One project, three approaches: mutex, channel, and lock-free.\nMutex The first implementation was a concurrency-safe map using sync.RWMutex. Writes use Lock(), reads use RLock(), allowing multiple goroutines to access the map concurrently.\nAfter implementing it, I benchmarked against Go\u0026rsquo;s standard sync.Map. I created three scenarios: contended writes on the same key, disjoint writes across different keys per goroutine, and a read-heavy workload at 90% reads.\nOn contended same-key writes, both performed similarly. But on disjoint key writes, sync.Map was 2-3x faster, and on read-heavy workloads, 33% faster. The results matched exactly what the sync.Map documentation states as its optimization conditions. Conversely, on concentrated same-key writes, sync.Map only used more memory with no performance advantage.\nChannel For the channel pattern, I implemented data flow control. FanOut distributes data from one input channel to multiple output channels. It uses a select statement to send to whichever output channel is ready first.\nTurnOut routes from multiple inputs to multiple outputs while handling shutdown signals through a quit channel. Including the quit channel in the select statement lets the loop handle both data processing and graceful shutdown naturally. I also implemented the cleanup process of draining remaining data after closing channels.\nGenerics ([T any]) made the implementations reusable across types.\nLock-free This was the most interesting part. I implemented two lock-free patterns.\nSpinningCAS implements a lock using atomic.CompareAndSwapInt32. When another goroutine holds the lock, instead of entering a wait queue, it spins by repeating the CAS operation. runtime.Gosched() proved critical here. 
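The spin-lock idea can be sketched as follows; the type and method names are mine, not necessarily the project's:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// SpinLock is a CAS-based lock: 0 = free, 1 = held.
type SpinLock struct{ state int32 }

func (l *SpinLock) Lock() {
	// Retry until we atomically flip 0 -> 1.
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		runtime.Gosched() // yield so the lock holder can actually run
	}
}

func (l *SpinLock) Unlock() {
	atomic.StoreInt32(&l.state, 0)
}

func main() {
	var (
		lock    SpinLock
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			lock.Lock()
			counter++ // short critical section: where spinning beats parking
			lock.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counter) // prints 100
}
```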
Without yielding the CPU during the spin loop, other goroutines couldn\u0026rsquo;t execute, creating a near-deadlock situation. One line of code changed the entire behavior.\nI benchmarked SpinningCAS against the standard sync.Mutex. On a high-contention scenario incrementing a single shared variable, SpinningCAS was about 7x faster. Mutex carries the overhead of parking and unparking goroutines in a wait queue, while CAS retries immediately. The numbers confirmed that spinning wins on short critical sections.\nTicketStorage addresses cases requiring ordering guarantees. atomic.AddUint64 issues ticket numbers, and each goroutine spins with CAS until its ticket comes up. It guarantees fairness (FIFO) but trades off longer wait times under high contention.\nRetrospective Understanding concurrency patterns conceptually and experiencing them through benchmarks were different things.\nThe biggest lesson was benchmark methodology. I initially wrote benchmarks that spawned a fixed number of goroutines, and results varied between runs. Switching to Go\u0026rsquo;s b.RunParallel, which lets the framework auto-calibrate iteration counts, stabilized results and made pattern differences clear. Benchmark code accuracy determines result quality.\nsync.Map is not \u0026ldquo;always a faster map\u0026rdquo; — its advantage appeared only under the conditions stated in the official documentation. SpinningCAS dominated Mutex on short critical sections, but longer sections or lower contention could reverse the result. Each tool has optimal conditions, and verifying those conditions is what benchmarks are for.\nThe experience of runtime.Gosched() changing behavior with a single line also stayed with me. In concurrent code, a theoretically correct implementation can behave differently in practice.\nKnowing concurrency patterns conceptually versus implementing them and facing the numbers. 
This project confirmed the difference.\nReferences concurrency-go GitHub Repository Go Concurrency Model ","permalink":"https://wid-blog.pages.dev/en/posts/career/personal/concurrency-go-retrospective/","summary":"A record of implementing and benchmarking three Go concurrency patterns — mutex, channel, and lock-free — to build hands-on understanding.","title":"concurrency-go Retrospective"},{"content":"Go\u0026rsquo;s concurrency model builds on CSP (Communicating Sequential Processes). The core philosophy is one line:\n\u0026ldquo;Do not communicate by sharing memory; instead, share memory by communicating.\u0026rdquo;\nInstead of locking shared memory, pass data through channels. Goroutines handle execution, Channels handle communication, and the sync/atomic packages provide auxiliary synchronization.\nGoroutine A goroutine is Go\u0026rsquo;s lightweight execution unit. It is not an OS thread. The Go runtime multiplexes many goroutines onto a small number of OS threads.\ngo func() { // This function runs in a new goroutine }() A single go keyword creates one. The initial stack is only a few kilobytes, and the runtime grows and shrinks it automatically as needed. OS threads become impractical at a few thousand; goroutines scale to hundreds of thousands in the same address space.\nGMP Scheduler The Go runtime uses an M:N scheduling model, known as the GMP model.\nflowchart TB subgraph Runtime[\"Go Runtime\"] subgraph P1[\"P (Processor)\"] LRQ1[\"Local Queue: G1, G2, G3\"] end subgraph P2[\"P (Processor)\"] LRQ2[\"Local Queue: G4, G5\"] end GRQ[\"Global Queue: G6, G7...\"] end subgraph OS[\"OS\"] M1[\"M (OS Thread)\"] M2[\"M (OS Thread)\"] M3[\"M (OS Thread)\"] end P1 --\u003e M1 P2 --\u003e M2 GRQ -.-\u003e|\"Stolen when P's local queue is empty\"| P1 G (Goroutine). A lightweight execution unit carrying a function and its stack.\nM (Machine). An OS thread. Executes instructions on the actual CPU.\nP (Processor). A logical processor. 
Provides the context needed to run goroutines. GOMAXPROCS controls the number of Ps, defaulting to the CPU core count.\nEach P has a local queue. When a goroutine is created, it enters the current P\u0026rsquo;s local queue. An M attaches to a P and executes goroutines from its local queue one by one. If a goroutine blocks on a system call, the runtime moves the other goroutines on that P to a different M to keep them running.\nThe overhead averages about three cheap instructions per function call.\nChannel A Channel is a typed communication mechanism for passing data between goroutines.\nUnbuffered Channel ch := make(chan int) Both sender and receiver must be ready for the transfer to complete. The sender blocks until the receiver takes the value; the receiver blocks until the sender sends. Communication and synchronization happen simultaneously.\nsequenceDiagram participant G1 as Goroutine 1 participant Ch as Channel (unbuffered) participant G2 as Goroutine 2 G1-\u003e\u003eCh: Send (blocks) Note over G1,Ch: Waits until G2 receives G2-\u003e\u003eCh: Receive Ch--\u003e\u003eG1: Send completes Ch--\u003e\u003eG2: Value delivered Buffered Channel ch := make(chan int, 10) // buffer size 10 Sends complete immediately if buffer space is available. The sender blocks only when the buffer is full. Buffered channels can serve as semaphores to limit concurrency.\nDirectionality Specifying channel direction clarifies a function\u0026rsquo;s intent.\nfunc producer(out chan\u0026lt;- int) { // send-only out \u0026lt;- 42 } func consumer(in \u0026lt;-chan int) { // receive-only val := \u0026lt;-in } select The select statement executes whichever channel operation is ready. 
It handles waiting on multiple channels, timeouts, and non-blocking operations.\nselect { case msg := \u0026lt;-ch1: handle(msg) case ch2 \u0026lt;- response: // send completed case \u0026lt;-quit: return default: // no channel ready } Including default makes the select non-blocking when no channel is ready.\nKey Patterns flowchart LR subgraph FanOut[\"Fan-Out\"] IN1[\"Input\"] --\u003e W1[\"Worker 1\"] IN1 --\u003e W2[\"Worker 2\"] IN1 --\u003e W3[\"Worker 3\"] end subgraph FanIn[\"Fan-In\"] R1[\"Result 1\"] --\u003e OUT1[\"Output\"] R2[\"Result 2\"] --\u003e OUT1 R3[\"Result 3\"] --\u003e OUT1 end Fan-Out. Multiple goroutines read from a single channel to distribute work.\nFan-In. Results from multiple channels merge into one.\nPipeline. Processing stages connected by channels. Each stage reads from an input channel, processes, and sends to an output channel.\nsync Package Channels are not always the best choice. For simple shared state protection, the sync package fits well.\nMutex. Ensures only one goroutine enters a critical section. Controlled with Lock() and Unlock().\nRWMutex. Multiple goroutines read concurrently; writes are exclusive. Effective when reads far outnumber writes.\nWaitGroup. Waits for multiple goroutines to finish. Add() increments the counter, Done() decrements it, Wait() blocks until zero.\nOnce. Runs a function exactly once. Used for initialization.\natomic Package The sync/atomic package provides atomic operations on integers and pointers. It reads and writes single variables safely without locks.\nCompareAndSwap (CAS) is the foundation of lock-free algorithms. If the current value equals the expected value, it swaps in the new value and returns true. 
Otherwise, it returns false and does nothing.\nvar counter int64 // Safe increment from multiple goroutines atomic.AddInt64(\u0026amp;counter, 1) // CAS: swap only if expected value matches atomic.CompareAndSwapInt64(\u0026amp;counter, oldVal, newVal) These are lower-level tools than the sync package. Suitable for simple counters and flags, but Mutex or Channel is better for complex synchronization.\nSelection Criteria Scenario Tool Passing data between goroutines Channel Work distribution, result collection Channel (fan-out/fan-in) Protecting shared state (read/write) sync.RWMutex Limiting concurrency Buffered Channel Waiting for multiple goroutines sync.WaitGroup Simple counters/flags sync/atomic The Go wiki summarizes it this way: channels suit ownership transfer, work distribution, and async result delivery. Mutexes suit caches and shared resource access control. Both are valid tools — choose based on the situation.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/language/go-concurrency-model/","summary":"Go\u0026rsquo;s concurrency model builds on CSP, providing Goroutines and Channels as core tools. An overview of how each works and when to choose what.","title":"Go Concurrency Model"},{"content":"I used Kafka at work. Producing messages, consuming them, checking monitoring dashboards — that was the extent of it. I had never configured a cluster from scratch or made decisions from topic design to consumer group strategy.\nHexagonal Architecture was similar. I understood the concept and followed port/adapter patterns in existing code, but I had never structured layers from an empty project.\nI wanted to build it myself. So I decided to create a chat system.\nWhy Chat Chat aligns naturally with Kafka\u0026rsquo;s pub/sub model. 
Publishing messages and delivering them to subscribers mirrors the core behavior of a chat system.\nReal-time communication over WebSocket, event-driven architecture, message synchronization across multiple instances — I decided a single project could cover all three.\nTechnology Choices Go + Java I built the chat service in Go. Lightweight goroutine-based concurrency suited a WebSocket server well. The user authentication service used Java with Spring WebFlux. The Spring Security ecosystem provided solid OAuth2 + JWT support, and I was already familiar with the framework.\nThe API Gateway used Kotlin with Spring Cloud Gateway. It ran on the same reactive stack as the user-service, maintaining consistency within the Java ecosystem.\nMongoDB Chat messages fit naturally into a document structure. Rooms and messages resembled unstructured data, and I expected frequent schema changes.\nI started with Redis. It worked well for quick prototyping, but I switched to MongoDB when message persistence became necessary.\nKafka KRaft I configured Kafka in KRaft mode — Kafka managing its own metadata without depending on ZooKeeper. No need to operate a separate ZooKeeper cluster, which simplified the infrastructure.\nI set up a 3-node cluster using Docker Compose, with each node serving as both controller and broker.\nArchitecture Evolution The project was not designed all at once. It evolved incrementally through pull requests.\nStarting Point I started with two services: user-service (Java) and chat-service (Go). The chat-service handled WebSocket connections, room management, message storage, and broadcasting. Redis served as the data store.\nRedis → MongoDB Messages needed persistent storage. Redis was not suitable due to its in-memory nature, so I replaced it with MongoDB. 
During this process, I experienced the benefit of only needing to swap the repository layer — a direct advantage of Hexagonal Architecture.\nHexagonal Architecture Cleanup I restructured the user-service first. Packages that had been loosely organized were rearranged into domain/entity, port/driving, port/driven, adapter/driving, and adapter/driven. I then applied the same structure to the chat-service.\nKafka Integration I implemented the Kafka producer first, then added the consumer. This is when I encountered concurrency issues.\nRace conditions occurred when users joined or left chat rooms while messages were being broadcast simultaneously. I introduced a two-level lock strategy in the RoomManager: an RWMutex at the RoomManager level for room list access, and a separate RWMutex per LiveRoom for participant access. This reduced contention.\nService Separation As the chat-service grew, I split it into messenger-service and message-service. The messenger-service handles Kafka producer/consumer and WebSocket connections. The message-service handles message storage and retrieval.\nFat Domain Initially, domain entities only held data. I moved domain logic into entities and introduced the use case pattern in the application layer. Each use case has a single Handle method, responsible for one business operation.\nKafka as Chat Message Broker The message flow:\nsequenceDiagram participant C as WebSocket Client participant S as SendUseCase participant DB as MongoDB participant K as Kafka participant B as MessageBroker participant R as RoomManager C-\u003e\u003eS: Send message S-\u003e\u003eDB: Store message S-\u003e\u003eK: Kafka publish K-\u003e\u003eB: Consumer receives B-\u003e\u003eS: OnReceive callback S-\u003e\u003eR: Broadcast R-\u003e\u003eC: WebSocket delivery SendUseCase directly implements the MessageSubscriber interface and registers itself with the MessageBroker — the Observer pattern. 
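The observer wiring can be sketched like this; the MessageSubscriber interface and OnReceive method follow the post, while the rest of the names and signatures are illustrative:

```go
package main

import "fmt"

type Message struct{ RoomID, Body string }

// MessageSubscriber is the callback a consumer-side component implements.
type MessageSubscriber interface {
	OnReceive(msg Message)
}

// Broker fans every consumed message out to all registered subscribers.
type Broker struct {
	subscribers []MessageSubscriber
}

func (b *Broker) Subscribe(s MessageSubscriber) {
	b.subscribers = append(b.subscribers, s)
}

// onConsume stands in for the Kafka consumer loop's delivery callback.
func (b *Broker) onConsume(msg Message) {
	for _, s := range b.subscribers {
		s.OnReceive(msg) // each subscriber broadcasts to its room's clients
	}
}

// sendUseCase registers itself as a subscriber, as described above.
type sendUseCase struct{}

func (u *sendUseCase) OnReceive(msg Message) {
	fmt.Printf("broadcast to room %s: %s\n", msg.RoomID, msg.Body)
}

func main() {
	b := &Broker{}
	b.Subscribe(&sendUseCase{})
	b.onConsume(Message{RoomID: "r1", Body: "hello"})
	// prints: broadcast to room r1: hello
}
```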
When the consumer receives a message, it calls OnReceive on all registered subscribers. Each subscriber uses the RoomManager to deliver the message to every WebSocket client in the corresponding room.\nThe advantage is horizontal scaling. When multiple chat service instances run, a message from one instance reaches other instances through Kafka. Users connected to different instances can still exchange messages within the same room.\nRetrospective I started this project because I wanted hands-on experience with Kafka.\nI confirmed that Hexagonal Architecture works naturally in Go. Go\u0026rsquo;s implicit interfaces made defining ports and implementing adapters straightforward. Assembling dependencies directly in the main function without a DI framework turned out to be explicit and easy to trace.\nConcurrency control taught me the most. I initially protected the entire room list with a single RWMutex, which created a bottleneck. Switching to a two-level strategy — separate locks for room list access and per-room participant access — showed a clear difference in benchmarks. Understanding concurrency in theory and experiencing it through benchmarks were different things entirely.\nThere are regrets. Test coverage was insufficient. One key benefit of Hexagonal Architecture is easy testing by swapping ports with mocks, but I did not write enough tests to take full advantage of this.\nI also configured gRPC but never applied it to inter-service communication. All services currently communicate over REST. gRPC integration remains for the next iteration.\nI started because I wanted to work with Kafka directly, and I gained more than that. 
Architecture design, concurrency control, service decomposition — encountering them together within a single system was a different experience from studying each one separately.\nReferences chat-services GitHub Repository Implementing Hexagonal Architecture in Go Kafka Fundamentals and KRaft Mode ","permalink":"https://wid-blog.pages.dev/en/posts/career/personal/chat-services-retrospective/","summary":"A record of designing and building a chat system as a personal project to gain hands-on experience with Kafka and Hexagonal Architecture.","title":"chat-services Retrospective"},{"content":"Kafka is a distributed event streaming platform. It provides a structure for publishing and subscribing to large volumes of events in real time. It serves real-time data pipelines, event-driven architectures, log aggregation, and more.\nThis post covers Kafka\u0026rsquo;s core concepts and explains why KRaft mode — which removes the ZooKeeper dependency — was introduced.\nTopics and Partitions Topics In Kafka, messages are published to a Topic. A topic is a logical category of messages. Topics are created per event type: order-events, user-signups, and so on.\nA topic is an append-only log that stores messages. Once written, messages are immutable. They are deleted when the retention period expires.\nPartitions A single topic is divided into multiple Partitions. Partitions are the core unit that provides both parallelism and ordering guarantees.\nflowchart LR subgraph Topic[\"Topic: order-events\"] P0[\"Partition 0msg0, msg3, msg6...\"] P1[\"Partition 1msg1, msg4, msg7...\"] P2[\"Partition 2msg2, msg5, msg8...\"] end Messages within a partition maintain order. Across partitions, no ordering is guaranteed. 
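Per-entity ordering comes from how keyed messages are routed: a hash of the key modulo the partition count. A sketch of that routing — Kafka's Java client actually defaults to murmur2, so the FNV hash here is purely illustrative:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a message key to a partition index.
// The same key always hashes to the same partition, which is
// what preserves ordering for a single entity.
func partitionFor(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(numPartitions))
}

func main() {
	// Both "order-42" messages land on the same partition.
	for _, key := range []string{"order-42", "order-42", "order-7"} {
		fmt.Printf("%s -> partition %d\n", key, partitionFor(key, 3))
	}
}
```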
Messages with the same key are assigned to the same partition, ensuring event ordering for a specific entity (e.g., a particular order).\nIncreasing the partition count increases throughput, since each additional partition can be consumed in parallel by another consumer in the group.\nOffset Each message within a partition has a unique Offset number, starting from 0 and incrementing sequentially. Offsets serve as the reference point for tracking \u0026ldquo;how far a consumer has read.\u0026rdquo;\nProducer A Producer publishes messages to a topic.\nWhen a producer sends a message, it must decide which partition to target. Three approaches exist.\nKey-based partitioning. When a message has a key, a hash of the key determines the partition. The same key always maps to the same partition. This is used when event ordering for a specific user or order is required.\nRound robin. Without a key, messages are distributed across partitions in sequence. Suitable when ordering is unnecessary and even load distribution is desired.\nCustom partitioner. Custom partitioning logic can be implemented. Used when specific business rules dictate partition selection.\nAcks The producer can configure the level of acknowledgment required from brokers.\nacks=0: No acknowledgment. Fastest, but messages can be lost. acks=1: Leader broker acknowledges after writing. Messages can still be lost if the leader fails before replication. acks=all: All ISR (In-Sync Replicas) acknowledge. Safest, but increases latency. Consumer A Consumer reads messages from a topic. Unlike the producer\u0026rsquo;s \u0026ldquo;push,\u0026rdquo; consumers \u0026ldquo;pull\u0026rdquo; messages themselves, processing at their own pace.\nConsumers commit the offset of messages they have read. Committed offsets are stored in an internal Kafka topic (__consumer_offsets). When a consumer restarts, it resumes from the last committed offset.\nConsumer Groups Multiple consumers can be grouped into a Consumer Group. 
Within the same group, each partition is assigned to exactly one consumer.\nflowchart LR subgraph Topic[\"Topic (3 Partitions)\"] P0[\"P0\"] P1[\"P1\"] P2[\"P2\"] end subgraph Group[\"Consumer Group A\"] C1[\"Consumer 1\"] C2[\"Consumer 2\"] C3[\"Consumer 3\"] end P0 --\u003e C1 P1 --\u003e C2 P2 --\u003e C3 If the number of consumers exceeds the number of partitions, the excess consumers remain idle. To increase throughput, increase the partition count first.\nWhen consumers join or leave a group, Rebalancing occurs — the process of reassigning partitions. During rebalancing, message processing for that group pauses temporarily.\nMultiple Consumer Groups Different consumer groups read the same topic independently, each managing its own offsets.\nflowchart LR subgraph Topic[\"Topic (3 Partitions)\"] P0[\"P0\"] P1[\"P1\"] P2[\"P2\"] end subgraph GA[\"Group A (Order Processing)\"] A1[\"Consumer A1\"] A2[\"Consumer A2\"] end subgraph GB[\"Group B (Analytics)\"] B1[\"Consumer B1\"] end P0 --\u003e A1 P1 --\u003e A2 P2 --\u003e A1 P0 --\u003e B1 P1 --\u003e B1 P2 --\u003e B1 Multiple consumer groups subscribing to a single topic is the pub/sub pattern. A common example: an order processing system and an analytics system independently consuming the same events.\nBrokers and Clusters Broker A Broker is a Kafka server instance. It receives messages, persists them to disk, and delivers them to consumers. Multiple brokers form a Cluster.\nEach partition is assigned to one broker as the Leader. Producers and consumers communicate with the leader broker.\nReplication Partitions are replicated across multiple brokers. 
If the leader fails, one of the followers is promoted to the new leader.\nflowchart TB subgraph Cluster[\"Kafka Cluster\"] subgraph B1[\"Broker 1\"] P0L[\"P0 (Leader)\"] P1F[\"P1 (Follower)\"] end subgraph B2[\"Broker 2\"] P0F[\"P0 (Follower)\"] P1L[\"P1 (Leader)\"] end subgraph B3[\"Broker 3\"] P0F2[\"P0 (Follower)\"] P1F2[\"P1 (Follower)\"] end end P0L -.-\u003e|replication| P0F P0L -.-\u003e|replication| P0F2 P1L -.-\u003e|replication| P1F P1L -.-\u003e|replication| P1F2 ISR (In-Sync Replicas) is the set of replicas synchronized with the leader. If a follower falls behind, it is removed from the ISR. With acks=all, writes are acknowledged only after all ISR replicas have recorded the message.\nmin.insync.replicas sets the minimum ISR count. With a replication factor of 3 and min ISR of 2, writes succeed even if one broker fails. If two brokers fail, writes are rejected to protect data consistency.\nZooKeeper and Its Limitations Before Kafka 3.3, ZooKeeper managed cluster metadata: broker lists, topic/partition configurations, controller election, and ACL information.\nThe ZooKeeper-based architecture had several problems.\nOperational overhead of a separate system. A ZooKeeper cluster (typically 3-5 nodes) must be operated alongside the Kafka cluster. Monitoring, upgrades, and incident response targets double.\nMetadata propagation bottleneck. Brokers fetch metadata from ZooKeeper, so as partition counts grow, metadata synchronization takes longer. This slows controller failover recovery in large clusters.\nDual consensus problem. ZooKeeper runs its own consensus algorithm (ZAB), while Kafka separately operates ISR-based replication. The two systems can temporarily fall out of sync.\nKRaft Mode KRaft (Kafka Raft) removes ZooKeeper and lets Kafka manage metadata internally. Production use became available in Kafka 3.3, and ZooKeeper mode was removed starting from 4.0.\nIn KRaft, some brokers take on the Controller role. 
Controller nodes use the Raft consensus algorithm to agree on a metadata log. Metadata is stored in an internal Kafka topic, eliminating the need for a separate system.\nKey changes from ZooKeeper mode:\nNo ZooKeeper cluster. The operational target reduces to Kafka alone. Metadata is managed as an event log. Brokers subscribe to the metadata log and maintain their own state. Propagation is faster than polling from ZooKeeper. Controller failover speeds up. The Raft protocol elects a new leader who takes over the metadata log. Summary Kafka\u0026rsquo;s core consists of topics, partitions, and consumer groups. Partitions provide parallelism and ordering guarantees. Consumer groups enable horizontal scaling. Broker replication ensures fault tolerance.\nKRaft mode removed ZooKeeper as an external dependency from this structure. Kafka now handles metadata consensus and management on its own.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/infra/kafka-fundamentals-kraft/","summary":"Core Kafka concepts (topics, partitions, consumer groups, replication) and the background behind KRaft mode, which removes the ZooKeeper dependency.","title":"Kafka Fundamentals and KRaft Mode"},{"content":"The core of Hexagonal Architecture (Ports \u0026amp; Adapters) is dependency direction control. It isolates all external dependencies behind interfaces (Ports) so that business logic never depends on frameworks or databases.\nGo\u0026rsquo;s implicit interfaces and package structure make this pattern a natural fit.\nHexagonal Architecture This pattern, proposed by Alistair Cockburn, divides an application into three areas.\nDomain. The core layer containing business rules. It depends on no external technology.\nPort. The interface between the application and the outside world. Two kinds exist:\nDriving port (inbound): Entry points from outside into the application. Defines what the application offers. 
Driven port (outbound): Interfaces through which the application requests external systems. Defines what the application needs. Adapter. The implementation of a Port. Driving adapters (HTTP handlers, gRPC handlers) receive external requests and call ports. Driven adapters (DB repositories, message brokers) implement port interfaces to communicate with external systems.\nDependencies always point inward: Adapter → Port → Domain. Domain knows nothing about Port, and Port knows nothing about Adapter.\nflowchart LR subgraph Adapter[\"Adapter\"] DA[\"Driving AdapterREST, gRPC\"] DRA[\"Driven AdapterDB, Kafka\"] end subgraph Port[\"Port\"] DP[\"Driving Port\"] DRP[\"Driven Port\"] end subgraph Core[\"Domain + Application\"] D[\"Entity\"] A[\"UseCase / Service\"] end DA --\u003e|calls| DP DP -.-\u003e|defines| A A --\u003e|uses| DRP DRP -.-\u003e|implements| DRA A --\u003e|contains| D Go Directory Structure A directory structure commonly used when applying Hexagonal Architecture in Go:\ninternal/ ├── domain/ │ ├── entity/ # Business entities │ └── service/ # Domain services ├── port/ │ ├── driving/ # Inbound interfaces │ └── driven/ # Outbound interfaces ├── application/ │ ├── usecase/ # Business operation units │ ├── dto/ # Data transfer objects │ └── mapper/ # entity ↔ dto conversion └── adapter/ ├── driving/ # REST handler, gRPC handler └── driven/ # DB repository, message broker The internal/ package prevents direct access from external modules, naturally encapsulating the application\u0026rsquo;s internals.\nPort Ports are defined as Go interfaces.\nDriving Port Entry points from outside into the application. 
Defining one interface per use case gives each interface a single responsibility.\n// port/driving/messenger.go type JoinRoomUseCase interface { Handle(ctx context.Context, req dto.JoinRequest) error } type SendMessageUseCase interface { Handle(ctx context.Context, req dto.SendRequest) error } Driven Port Interfaces through which the application requests external systems.\n// port/driven/message.go type MessageRepository interface { Create(ctx context.Context, message entity.Message) error FindByRoom(ctx context.Context, roomID string, cursor string, limit int) ([]entity.Message, error) } type MessageBroker interface { Publish(ctx context.Context, message entity.Message) error Subscribe(subscriber MessageSubscriber) } Implicit Interfaces Go interfaces are satisfied implicitly. If an adapter has the methods defined by a port interface, it satisfies that interface without any explicit declaration — no implements keyword like Java.\nThis characteristic suits Hexagonal Architecture well. A driven adapter implementing a driven port does not need to import the port package. Dependencies stay separated at the code level too.\nTo guarantee interface compliance at compile time, a common convention exists:\nvar _ driven.MessageRepository = (*MongoMessageRepository)(nil) This single line verifies at compile time that MongoMessageRepository satisfies driven.MessageRepository.\nAdapter Driving Adapter An HTTP handler is a typical driving adapter. 
It receives external requests and calls the driving port (use case).\n// adapter/driving/rest/handler.go type Handler struct { sendUseCase driving.SendMessageUseCase } func NewHandler(uc driving.SendMessageUseCase) *Handler { return \u0026amp;Handler{sendUseCase: uc} } func (h *Handler) Send(c *gin.Context) { var req dto.SendRequest if err := c.ShouldBindJSON(\u0026amp;req); err != nil { c.JSON(http.StatusBadRequest, gin.H{\u0026#34;error\u0026#34;: err.Error()}) return } if err := h.sendUseCase.Handle(c.Request.Context(), req); err != nil { c.JSON(http.StatusInternalServerError, gin.H{\u0026#34;error\u0026#34;: err.Error()}) return } c.Status(http.StatusOK) } The handler depends only on the driving port interface. It has no knowledge of what implementation sits behind it.\nDriven Adapter A DB repository is a typical driven adapter. It implements the driven port interface.\n// adapter/driven/persistence/repository.go type MongoMessageRepository struct { collection *mongo.Collection } func NewMongoMessageRepository(db *mongo.Database) *MongoMessageRepository { return \u0026amp;MongoMessageRepository{ collection: db.Collection(\u0026#34;messages\u0026#34;), } } func (r *MongoMessageRepository) Create(ctx context.Context, message entity.Message) error { doc := orm.FromMessage(message) _, err := r.collection.InsertOne(ctx, doc) if err != nil { return fmt.Errorf(\u0026#34;insert message: %w\u0026#34;, err) } return nil } ORM models and domain entities use separate structs. orm.FromMessage() and ToDomain() methods handle conversion, keeping domain entities independent of the database schema.\nDomain and Application Entity Domain entities contain business rules. 
Fields are unexported (lowercase) with getter methods.\n// domain/entity/message.go type Message struct { id string roomID string userID string body string sentAt time.Time } func NewMessage(roomID, userID, body string) Message { return Message{ id: uuid.New().String(), roomID: roomID, userID: userID, body: body, sentAt: time.Now(), } } func (m Message) ID() string { return m.id } func (m Message) RoomID() string { return m.roomID } func (m Message) Body() string { return m.body } Unexported fields prevent direct external modification. Creation only happens through the NewMessage constructor, protecting domain invariants.\nUseCase A use case handles one business operation. It implements a driving port and depends on driven ports.\n// application/usecase/send.go type SendUseCase struct { repo driven.MessageRepository broker driven.MessageBroker } func NewSendUseCase(repo driven.MessageRepository, broker driven.MessageBroker) *SendUseCase { return \u0026amp;SendUseCase{repo: repo, broker: broker} } func (uc *SendUseCase) Handle(ctx context.Context, req dto.SendRequest) error { message := entity.NewMessage(req.RoomID, req.UserID, req.Body) if err := uc.repo.Create(ctx, message); err != nil { return fmt.Errorf(\u0026#34;save message: %w\u0026#34;, err) } if err := uc.broker.Publish(ctx, message); err != nil { return fmt.Errorf(\u0026#34;publish message: %w\u0026#34;, err) } return nil } The use case depends only on driven port interfaces. 
Whether the backing store is MongoDB or PostgreSQL, any implementation of MessageRepository can be swapped in.\nDependency Injection In Go, assembling dependencies directly in the main function without a DI framework is the common approach.\nfunc main() { // driven adapters db := mongodb.Connect(os.Getenv(\u0026#34;MONGO_URI\u0026#34;)) messageRepo := repository.NewMongoMessageRepository(db) broker := messaging.NewKafkaBroker(kafkaConfig) // use case (inject driven ports) sendUseCase := usecase.NewSendUseCase(messageRepo, broker) // driving adapter (inject driving port) handler := rest.NewHandler(sendUseCase) // start server server := rest.NewServer(handler) server.Run(\u0026#34;:8080\u0026#34;) } The dependency graph appears explicitly in one place. Tracing which implementation is injected into which interface requires nothing more than reading the code.\nIn Java/Spring, @Component and @Autowired let the framework inject dependencies automatically. In Go, this process is manual — but the dependency flow stays explicit and easy to trace.\nSummary Hexagonal Architecture implementation varies by language idiom. In Go, implicit interfaces, the internal package, and manual DI align well with this pattern. Define ports as interfaces, implement them in adapters, and assemble everything in main. Dependency direction appears directly in the code structure, no framework required.\n","permalink":"https://wid-blog.pages.dev/en/posts/tech/architecture/go-hexagonal-architecture/","summary":"Core concepts of Hexagonal Architecture and its idiomatic implementation in Go using implicit interfaces and package structure for dependency direction control.","title":"Implementing Hexagonal Architecture in Go"},{"content":"I wanted hands-on experience with a low-level language. Managing memory through language rules, not a runtime. I chose Rust and followed the Rust Book Chapter 20 — a multithreaded HTTP server. 
About 200 lines, using only the standard library with no external crates.\nMemory Management Rust has no garbage collector. Instead, the ownership system determines when memory is freed at compile time.\nEvery value has exactly one owner. When the owner goes out of scope, the value is automatically dropped. Assigning a value to another variable moves ownership, and the original variable becomes unusable. The compiler enforces this.\nlet s1 = String::from(\u0026#34;hello\u0026#34;);
let s2 = s1; // ownership moves
// s1 is no longer usable — compile error\nValues can be borrowed without transferring ownership. Through references (\u0026amp;). Multiple immutable references can coexist, but only one mutable reference (\u0026amp;mut) is allowed at a time. This rule prevents data races at compile time.\nIn GC-based languages, the runtime handles memory reclamation. Rust delegates that decision to the compiler. Memory safety with zero runtime cost.\nThread Pool The core of the server is its thread pool. When a TCP connection arrives, work is distributed to worker threads.\nWork distribution uses channels. A single sender dispatches jobs, and multiple workers share the receiver. The problem was that Rust\u0026rsquo;s Receiver does not implement Clone. Sharing one receiver across multiple threads required a different approach.\nArc\u0026lt;Mutex\u0026lt;Receiver\u0026lt;T\u0026gt;\u0026gt;\u0026gt; was the answer. Arc enables multiple threads to own the same value through reference counting. Mutex ensures only one thread accesses the receiver at a time. This was where ownership rules extended naturally from single-threaded to concurrent contexts.\nlet (sender, receiver) = mpsc::channel();
let receiver = Arc::new(Mutex::new(receiver));

for id in 0..size {
    let receiver = Arc::clone(\u0026amp;receiver);
    // each worker shares the receiver\u0026#39;s reference count
}\nArc::clone() does not copy the value. It only increments the reference count. 
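The count bump is directly observable via Arc::strong_count. A standalone sketch of the same sharing pattern (the channel and worker here are illustrative, not the post's actual pool code):

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let (_sender, receiver) = mpsc::channel::<String>();
    let receiver = Arc::new(Mutex::new(receiver));
    assert_eq!(Arc::strong_count(&receiver), 1); // the original Arc

    let worker_rx = Arc::clone(&receiver); // bumps the count; no copy of the receiver
    assert_eq!(Arc::strong_count(&receiver), 2);

    let handle = thread::spawn(move || {
        // the worker locks the shared receiver, as in the pool
        let _guard = worker_rx.lock().unwrap();
    });
    handle.join().unwrap();

    // the worker's clone was dropped when its thread finished
    assert_eq!(Arc::strong_count(&receiver), 1);
}
```

Writing Arc::clone(&receiver) instead of receiver.clone() is the idiomatic way to signal at the call site that only a pointer is being duplicated.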
The type system makes this distinction explicit.\nGraceful Shutdown The Drop trait is called automatically when a value goes out of scope. I used it to clean up workers when the pool is destroyed.\nOrdering mattered. First, send a Terminate message to every worker. Then join each thread. Reversing this order risks deadlock — blocking on the first worker\u0026rsquo;s join while the remaining workers never receive the shutdown signal.\n// 1. send termination signals first
for _ in \u0026amp;self.workers {
    self.sender.send(Message::Terminate)?;
}

// 2. then join
for worker in \u0026amp;mut self.workers {
    if let Some(thread) = worker.thread.take() {
        thread.join()?;
    }
}\nDeclaring worker.thread as Option\u0026lt;JoinHandle\u0026lt;()\u0026gt;\u0026gt; was an idiomatic Rust pattern. take() extracts the handle, leaving None in its place. This prevents double-joining the same thread at the type level.\nTrait Bounds The thread pool\u0026rsquo;s generic type has three constraints.\nPool\u0026lt;T: FnOnce() + Send + \u0026#39;static\u0026gt;\nFnOnce means the closure is called exactly once. A job runs once on one worker and that is it. Send guarantees the closure can be safely transferred to another thread. 'static constrains the closure\u0026rsquo;s referenced values to live for the entire program. Since a thread\u0026rsquo;s lifetime is unpredictable, this prevents borrowed references from being freed prematurely.\nRemove any one of these three and the code will not compile. In Go, passing a closure to a goroutine has no such constraints. Race conditions are detected at runtime with the -race flag instead. Rust moves that verification to the compiler.\nRetrospective This project started from wanting to experience a low-level language. What I actually experienced was \u0026ldquo;safety enforced by the compiler\u0026rdquo; more than \u0026ldquo;low-level.\u0026rdquo;\nWithout composing Arc\u0026lt;Mutex\u0026lt;T\u0026gt;\u0026gt;, multiple threads cannot share a receiver. 
Without specifying FnOnce + Send + 'static, a closure cannot be sent to a thread. Without declaring Option\u0026lt;JoinHandle\u0026gt;, take() is unavailable. The compiler explains through error messages why each combination is necessary, and resolving them guarantees concurrency safety.\nI started because I wanted to work with a low-level language. What I learned was the experience of a type system catching concurrency bugs before runtime.\nReferences rust-server GitHub Repository ","permalink":"https://wid-blog.pages.dev/en/posts/career/personal/rust-server-retrospective/","summary":"A record of implementing the multithreaded HTTP server from Rust Book Chapter 20, experiencing how ownership and concurrency safety are enforced at the type level.","title":"rust-server Retrospective"},{"content":"Backend engineer. Writing about technology and lessons learned along the way.\n","permalink":"https://wid-blog.pages.dev/en/about/","summary":"\u003cp\u003eBackend engineer. Writing about technology and lessons learned along the way.\u003c/p\u003e","title":"About"}]