Making a Log Viewer 12× Faster: A Go Optimization Guide
A hands-on Go pprof tutorial: every command, every output, and every line of annotated source code from optimizing a real terminal log viewer.
A performance story from sembed-engine: the search visited the same number of nodes and returned the same answers, but became much faster by changing what the CPU had to do for every distance calculation.
Duplicate detection looks like a solved problem: use a hash set. A benchmark suite of 4050 measurements across finite batches and streaming workloads shows the fastest strategy can be 148x faster than a hash set, or 90,000x slower, depending on what you are deduplicating and what guarantees you need.
A dev log about implementing a Vamana-style ANN index from the DiskANN paper, why my first version was slower than brute force, and what it taught me about reading algorithms with implementation in mind.
Diagnose a Kafka throughput drop after v3.9.0. Learn why min.insync.replicas slows consumers and how to fix performance safely.
I recently looked into my blog’s performance and was surprised to find my pages were downloading over 3.6 MB of JavaScript and render-blocking CSS on every load. Here is the step-by-step breakdown of how I reduced my payload and removed JavaScript.
A complete guide to installing Boost in any C++ project. Covers CMake find_package, FetchContent, vcpkg manifest mode, Conan, manual g++ linking, and building from source with b2.
std::unordered_set looks like O(1), but in hot C++ loops the memory access pattern can dominate. Here is a duplicate-filtering benchmark with dense ids, sparse ids, Boost dynamic_bitset, sort+unique, and the case where unordered_set wins.