<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>/home/rook1e</title><description>Rook1e&apos;s home directory.</description><link>https://rook1e.com/</link><item><title>AI Coding Notes 1</title><link>https://rook1e.com/en/posts/ai-coding-1/</link><guid isPermaLink="true">https://rook1e.com/en/posts/ai-coding-1/</guid><description>Reflections on my recent experience coding with AI.</description><pubDate>Sat, 03 Jan 2026 08:08:49 GMT</pubDate><content:encoded>&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Outsource the grunt work of CRUD to a Coding Agent&lt;/strong&gt;, freeing up your energy for discussing solutions, breaking down tasks, reviewing code, and tackling hard problems. Enjoy the pure joy of creation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The loom of software development has arrived&lt;/strong&gt;. AI Agents won&apos;t fully replace programmers, and probably won&apos;t reduce the number of jobs in the long run either, but they will reshape how the entire industry operates. We may see software engineering practices, programming languages, and system architectures better suited for AI Agents, driving digitalization across all aspects of society at greater scale, lower cost, and higher efficiency.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Burnout comes easier&lt;/strong&gt;. AI thinks and generates far faster than human eyes can read and minds can process. Auditing every line of AI-generated code and context-switching between multiple task sessions is mentally exhausting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Claude Code + Opus/Sonnet 4.5 is still SOTA&lt;/strong&gt;, but the moat isn&apos;t that deep.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GLM 4.6 is an excellent workhorse model&lt;/strong&gt;: fast, cheap, and generates at about 80% quality — perfect as an executor after you&apos;ve designed the approach/plan with a SOTA model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://github.com/sst/opencode&quot;&gt;OpenCode&lt;/a&gt; is a solid open-source alternative to Claude Code&lt;/strong&gt;. OpenCode has many strengths, such as built-in LSP, easy model switching, and a built-in HTTP API. But the downsides are equally significant — it&apos;s not stable enough, and the system prompts still need polish. Promising for the future.&lt;/li&gt;
&lt;li&gt;Stick with mainstream tech stacks.&lt;/li&gt;
&lt;li&gt;Don&apos;t buy annual subscriptions for AI products.&lt;/li&gt;
&lt;/ol&gt;
</content:encoded></item><item><title>Indie Hacking Memo</title><link>https://rook1e.com/en/posts/indie-hacking-memo/</link><guid isPermaLink="true">https://rook1e.com/en/posts/indie-hacking-memo/</guid><description>Over a year into indie hacking, I haven&apos;t launched a successful product, but I&apos;ve learned a lot.</description><pubDate>Sun, 03 Aug 2025 14:11:13 GMT</pubDate><content:encoded>&lt;p&gt;Over a year into indie hacking, I haven&apos;t launched a successful product, but I&apos;ve learned a lot.&lt;/p&gt;
&lt;h2&gt;Use Boring Tech Stacks&lt;/h2&gt;
&lt;p&gt;Most indie hackers I&apos;ve observed come from a programmer background, and &amp;quot;the best tech stack for indie hacking&amp;quot; is a recurring topic in various communities.&lt;/p&gt;
&lt;p&gt;In this circle, the cool kids on the block use Next.js, like Next.js + Prisma + Shadcn UI + NextAuth + Supabase. They&apos;ll share how smooth the DX is and how quickly they can build a beautiful UI, but beneath the surface:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;These so-called &amp;quot;full-stack frameworks&amp;quot; are built for the frontend, with extremely limited backend capabilities.&lt;/li&gt;
&lt;li&gt;These tech stacks iterate very quickly, unnecessarily increasing learning and migration costs.&lt;/li&gt;
&lt;li&gt;A huge number of JS dependencies are like ticking time bombs.&lt;/li&gt;
&lt;li&gt;Serverless architecture is restrictive; even scheduled tasks might require workarounds (I know Vercel has this feature, but even Pro accounts have quantity limits).&lt;/li&gt;
&lt;li&gt;The &lt;a href=&quot;https://x.com/zemotion/status/1798558292681343039&quot;&gt;surprise&lt;/a&gt; from Vercel bills.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, users don&apos;t care about your code. Simple tech stacks can build very successful projects:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pieter Levels likes to put all his frontend and backend code into &lt;a href=&quot;https://x.com/levelsio/status/1308406118314635266&quot;&gt;one huge index.php&lt;/a&gt;, but his projects print money.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.levels.fyi/blog/scaling-to-millions-with-google-sheets.html&quot;&gt;Levels.fyi uses Google Sheets as a backend to serve millions of users&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, if your technical background leans towards the backend, you don&apos;t have to use Shadcn UI. Just stick to a backend-first approach and &lt;a href=&quot;https://boringtechnology.club/&quot;&gt;use boring technology&lt;/a&gt;: use the backend tech stack you&apos;re most comfortable with, use a templating engine for server-side rendering, and deploy to a VPS (remember to put it behind Cloudflare CDN).&lt;/p&gt;
&lt;p&gt;After all, writing code is just the first step.&lt;/p&gt;
&lt;h2&gt;MVP Should Only Include One Core Feature&lt;/h2&gt;
&lt;p&gt;An MVP, as the name suggests, must be &lt;strong&gt;minimal&lt;/strong&gt;, built with the least possible effort:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Solve only one pain point at a time.&lt;/li&gt;
&lt;li&gt;Focus only on the core feature; it doesn&apos;t even necessarily require writing code.&lt;/li&gt;
&lt;li&gt;The UI can be rough; simple and elegant is enough.&lt;/li&gt;
&lt;li&gt;Don&apos;t design caching, message queues, etc., and don&apos;t use K8s.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Performance, stability, and a refined UI are sweet problems for the future. Refactoring the codebase can wait until your MRR meets expectations.&lt;/p&gt;
&lt;h2&gt;Charge From Day One&lt;/h2&gt;
&lt;p&gt;Pricing strategy is also a long-debated, subjective topic.&lt;/p&gt;
&lt;p&gt;Offering a free trial to lower the barrier to entry seems reasonable, but in practice, it&apos;s a different story:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It attracts customers who only want freebies.&lt;/li&gt;
&lt;li&gt;It adds an extra conversion step: traffic -&amp;gt; &lt;strong&gt;trial users&lt;/strong&gt; -&amp;gt; paying users.&lt;/li&gt;
&lt;li&gt;People don&apos;t value free things.&lt;/li&gt;
&lt;li&gt;Suggestions from free users may be less valuable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Instead of offering a free trial, charge from day one, but offer a money-back guarantee if the user is not satisfied within XX days:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It conveys confidence to the user (&amp;quot;This product will definitely solve your pain point&amp;quot;) and provides a safety net (&amp;quot;Even if you don&apos;t like it, you can get an unconditional refund within 14 days&amp;quot;).&lt;/li&gt;
&lt;li&gt;It pre-filters high-risk users through payment channels like Stripe.&lt;/li&gt;
&lt;li&gt;If no one pays, it indicates a false demand or that you haven&apos;t found your niche yet.&lt;/li&gt;
&lt;li&gt;Genuinely ask for user feedback during refunds; this feedback is more valuable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Regarding pricing, I prefer Tibo&apos;s &lt;a href=&quot;https://www.tmaker.io/what-is-the-ideal-pricing-for-a-saas&quot;&gt;perspective&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;A low price DOESN’T compensate for delivering LOW value.&lt;/li&gt;
&lt;li&gt;I price my SaaS in a range: $29-$99, and decide what to build, and how to build it based on that.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Fail Fast, Grow Fast&lt;/h2&gt;
&lt;p&gt;One of the advantages of indie hacking is the low cost of experimentation. And indie hacking itself has a very high failure rate.&lt;/p&gt;
&lt;p&gt;Even Pieter Levels only made money from &lt;a href=&quot;https://x.com/levelsio/status/1457315274466594817&quot;&gt;4 out of his first 70 projects&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Be a Salesperson, a Founder, Not Just a Developer&lt;/h2&gt;
&lt;p&gt;Writing code is the simplest part of indie hacking because the input-output is stable and predictable, especially with AI boosting efficiency now. Beyond coding, how to build connections and trust with customers and eventually get them to pay is a difficult question with no standard answer.&lt;/p&gt;
&lt;p&gt;After launching the product, day after day you need to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reply to customer emails and DMs.&lt;/li&gt;
&lt;li&gt;Manage personal and product social media, newsletters, etc.&lt;/li&gt;
&lt;li&gt;Optimize cold reach content strategies and discover new potential user groups.&lt;/li&gt;
&lt;li&gt;Do SEO.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are all things that might not generate significant revenue for months but are essential, and they are also things that most tech people are not good at. Not to mention subsequent tasks like company registration, taxes, and data compliance.&lt;/p&gt;
&lt;p&gt;So don&apos;t limit yourself. Not only should you maintain the code as a developer, but you should also manage your business as an entrepreneur.&lt;/p&gt;
</content:encoded></item><item><title>RawWeb Updates: SimHash and Meilisearch</title><link>https://rook1e.com/en/posts/rawweb-updates-simhash-meilisearch/</link><guid isPermaLink="true">https://rook1e.com/en/posts/rawweb-updates-simhash-meilisearch/</guid><description>Over the past two weeks, I&apos;ve made two significant changes to RawWeb: introduced SimHash for document deduplication and migrated from Elasticsearch to Meilisearch to lower operational costs. The migration and cleanup of 56k similar documents went smoothly, but I also ran into some memory and performance challenges with Meilisearch.</description><pubDate>Mon, 14 Apr 2025 03:29:48 GMT</pubDate><content:encoded>&lt;p&gt;Over the past two weeks, I&apos;ve made two significant changes to &lt;a href=&quot;https://rawweb.org/&quot;&gt;RawWeb&lt;/a&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Introduced SimHash for document deduplication.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Migrated from Elasticsearch to Meilisearch&lt;/strong&gt; to lower operational costs.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The implementation went smoothly. I successfully migrated and cleaned up 56k similar documents, but I also ran into some memory and performance challenges with Meilisearch.&lt;/p&gt;
&lt;h2&gt;Document Deduplication&lt;/h2&gt;
&lt;p&gt;Previously, I simply used URLs as the unique constraint, which often led to discovering a large number of duplicate documents during maintenance.&lt;/p&gt;
&lt;p&gt;Common reasons for duplication:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Non-standardized URLs
&lt;ul&gt;
&lt;li&gt;Inconsistent case sensitivity&lt;/li&gt;
&lt;li&gt;Inconsistent trailing slashes&lt;/li&gt;
&lt;li&gt;Useless query parameters&lt;/li&gt;
&lt;li&gt;Different number or order of query parameters&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;A single blog having multiple domains&lt;/li&gt;
&lt;li&gt;Blogs changing their path structure, e.g., from &lt;code&gt;/blog/1&lt;/code&gt; to &lt;code&gt;/posts/1&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
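&lt;p&gt;As an illustration (this is not RawWeb&apos;s actual code), a minimal normalization pass can remove much of the URL-level duplication; the helper name and the tracking-parameter list below are hypothetical:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// trackingParams lists common query parameters that don't affect content.
// Illustrative set only; a real crawler would maintain a longer list.
var trackingParams = map[string]bool{
	"utm_source": true, "utm_medium": true, "utm_campaign": true,
	"ref": true, "fbclid": true,
}

// NormalizeURL applies the kinds of rules described above: lowercase the
// host, drop the trailing slash, strip tracking parameters, and re-encode
// the remaining query so parameter order no longer matters.
func NormalizeURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.Host = strings.ToLower(u.Host)
	u.Path = strings.TrimSuffix(u.Path, "/")

	q := u.Query()
	for k := range q {
		if trackingParams[k] {
			q.Del(k)
		}
	}
	// Encode() sorts keys, fixing inconsistent parameter order.
	u.RawQuery = q.Encode()
	u.Fragment = ""
	return u.String(), nil
}

func main() {
	s, _ := NormalizeURL("https://Example.com/posts/1/?utm_source=x&b=2&a=1")
	fmt.Println(s) // prints https://example.com/posts/1?a=1&b=2
}
```

&lt;p&gt;With this, &lt;code&gt;?b=2&amp;amp;a=1&lt;/code&gt; and &lt;code&gt;?a=1&amp;amp;b=2&lt;/code&gt; normalize to the same key.&lt;/p&gt;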
&lt;p&gt;URL-based deduplication is simple to implement but has significant limitations and can&apos;t cover all edge cases. To solve the duplication problem fundamentally, we need to work with the content itself.&lt;/p&gt;
&lt;h3&gt;SimHash&lt;/h3&gt;
&lt;p&gt;SimHash is a locality-sensitive hashing algorithm. It reflects the features of different parts of a text and allows for efficient similarity assessment using the &lt;a href=&quot;https://en.wikipedia.org/wiki/Hamming_distance&quot;&gt;Hamming distance&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For example, consider the following two strings. Their SimHash values differ by only a few bits, while their md5 hashes are completely different.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;data&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;simhash&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;md5&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;i think &lt;strong&gt;a&lt;/strong&gt; is the best&lt;/td&gt;
&lt;td&gt;10000110110000000000000000101001&lt;/td&gt;
&lt;td&gt;846ff6bebe901ead008e9c0e01a87470&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;i think &lt;strong&gt;b&lt;/strong&gt; is the best&lt;/td&gt;
&lt;td&gt;10000010110000000000100001101010&lt;/td&gt;
&lt;td&gt;ba1d2dc00d0a23dbb2001d570f03fb19&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Compared to other text-similarity methods, SimHash&apos;s advantages are lightweight computation, storage, and comparison. It also has a strong endorsement: Google&apos;s web crawler uses SimHash to identify near-duplicate web pages.&lt;/p&gt;
&lt;h3&gt;Calculating Hash and Hamming Distance&lt;/h3&gt;
&lt;p&gt;Implementing SimHash is simple. I relied entirely on Claude Sonnet 3.5 to implement the basic SimHash and Hamming distance calculations, then used test cases to evaluate the results.&lt;/p&gt;
&lt;p&gt;Key points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use a 64-bit hash value. This is a balanced length.&lt;/li&gt;
&lt;li&gt;Use the fnv hash algorithm. It&apos;s simple, efficient, and distributes well.&lt;/li&gt;
&lt;li&gt;Split the hash value into four 16-bit segments for database storage. By the pigeonhole principle, two 64-bit hashes within Hamming distance 3 must agree on at least one 16-bit segment, so an indexed equality match on the segments can pre-filter candidates before computing the full distance.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package simhash

import (
	&amp;quot;backend/core/pkgs/simhash/tokenizer&amp;quot;
	&amp;quot;hash/fnv&amp;quot;
)

const (
	// HASH_BITS represents the number of bits in a SimHash value
	HASH_BITS = 64
)

// CalculateSimHash calculates the SimHash value of a text
// SimHash is a hashing algorithm used for calculating text similarity, effectively detecting the degree of similarity between texts
// Algorithm steps:
// 1. Tokenize the text
// 2. Calculate the hash value for each token
// 3. Merge all token hash values into a feature vector
// 4. Derive the final SimHash value from the feature vector
func CalculateSimHash(text string) uint64 {
	if text == &amp;quot;&amp;quot; {
		return 0
	}

	words := tokenizer.Tokenize(text)

	weights := make([]int, HASH_BITS)

	// Calculate hash for each token and update weights
	for _, word := range words {
		hash := getHash(word)
		// Update weights based on each bit position
		for i := 0; i &amp;lt; HASH_BITS; i++ {
			if (hash &amp;amp; (1 &amp;lt;&amp;lt; uint(i))) != 0 {
				weights[i]++
			} else {
				weights[i]--
			}
		}
	}

	// Generate the final simhash value
	var simhash uint64
	for i := 0; i &amp;lt; HASH_BITS; i++ {
		if weights[i] &amp;gt; 0 {
			simhash |= (1 &amp;lt;&amp;lt; uint(i))
		}
	}

	return simhash
}

// getHash calculates the hash value of a string
func getHash(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// HammingDistance calculates the Hamming distance between two simhash values
// Hamming distance is the number of different characters at corresponding positions in two equal-length strings
// In SimHash, a smaller Hamming distance indicates higher similarity between two texts
func HammingDistance(hash1, hash2 uint64) int {
	xor := hash1 ^ hash2
	distance := 0

	// Count the number of different bits (Brian Kernighan algorithm, better performance)
	for xor != 0 {
		distance++
		xor &amp;amp;= xor - 1
	}

	return distance
}

// SplitSimHash splits a simhash value into four 16-bit parts
func SplitSimHash(hash uint64) [4]uint16 {
	return [4]uint16{
		uint16((hash &amp;gt;&amp;gt; 48) &amp;amp; 0xFFFF),
		uint16((hash &amp;gt;&amp;gt; 32) &amp;amp; 0xFFFF),
		uint16((hash &amp;gt;&amp;gt; 16) &amp;amp; 0xFFFF),
		uint16(hash &amp;amp; 0xFFFF),
	}
}

// MergeSimHash merges four 16-bit parts into a single simhash value
func MergeSimHash(parts [4]uint16) uint64 {
	return (uint64(parts[0]) &amp;lt;&amp;lt; 48) | (uint64(parts[1]) &amp;lt;&amp;lt; 32) | (uint64(parts[2]) &amp;lt;&amp;lt; 16) | uint64(parts[3])
}

// IsSimilar determines whether two texts are similar
// threshold represents the similarity threshold, typically a value between 3-10
// Returns true if the two texts are similar, false if not
func IsSimilar(hash1, hash2 uint64, threshold int) bool {
	return HammingDistance(hash1, hash2) &amp;lt;= threshold
}
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- pgsql

CREATE OR REPLACE FUNCTION hamming_distance(
    simhash1_1 SMALLINT,
    simhash1_2 SMALLINT,
    simhash1_3 SMALLINT,
    simhash1_4 SMALLINT,
    simhash2_1 SMALLINT,
    simhash2_2 SMALLINT,
    simhash2_3 SMALLINT,
    simhash2_4 SMALLINT
) RETURNS INTEGER
	PARALLEL SAFE
AS $$
BEGIN
    RETURN bit_count((simhash1_1 # simhash2_1)::BIT(16)) +
           bit_count((simhash1_2 # simhash2_2)::BIT(16)) +
           bit_count((simhash1_3 # simhash2_3)::BIT(16)) +
           bit_count((simhash1_4 # simhash2_4)::BIT(16));
END;
$$ LANGUAGE plpgsql IMMUTABLE;
&lt;/code&gt;&lt;/pre&gt;
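&lt;p&gt;A usage sketch of this function (the table and column names are hypothetical, not RawWeb&apos;s actual schema):&lt;/p&gt;

```sql
-- Hypothetical schema: documents(id, simhash_1..simhash_4 SMALLINT).
-- Find candidates within Hamming distance 3 of a new document's segments.
SELECT id
FROM documents
WHERE hamming_distance(
        simhash_1, simhash_2, simhash_3, simhash_4,
        :new_1, :new_2, :new_3, :new_4
      ) <= 3;
```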
&lt;h3&gt;Further Optimizing Tokenization&lt;/h3&gt;
&lt;p&gt;The tokens generated by the tokenizer are the basic units for calculating SimHash. The higher the quality of tokenization, the more accurately SimHash can reflect the document&apos;s features.&lt;/p&gt;
&lt;p&gt;I tested several types of tokenizers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Based on language features
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/meilisearch/charabia&quot;&gt;Charabia&lt;/a&gt; works quite well, maintained by the Meilisearch team.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/go-ego/gse&quot;&gt;gse&lt;/a&gt; seems sufficient for Chinese tokenization based on test cases, but the overall experience isn&apos;t as good as Charabia.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Whitespace: Suitable for languages like English that use spaces as separators, but requires additional implementation for normalization, stopword removal, etc.&lt;/li&gt;
&lt;li&gt;Unicode: Intended as a tokenizer for CJK languages, but the tokenization quality was not ideal.&lt;/li&gt;
&lt;li&gt;N-gram: Considered as a general-purpose tokenizer, but the quality fluctuates significantly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Overall, Charabia produced the best results. However, it&apos;s a Rust project, while the RawWeb backend stack is Go. This requires using CGO to call Charabia (calling via Go&apos;s &lt;code&gt;exec&lt;/code&gt; package is at least 10x slower than calling via CGO), which introduces cross-compilation complexity.&lt;/p&gt;
&lt;p&gt;I&apos;m not familiar with Rust or CGO, so most of the following code was generated by Claude Sonnet 3.5/3.7, with some adjustments based on the actual situation.&lt;/p&gt;
&lt;p&gt;First, expose Charabia&apos;s &lt;code&gt;Tokenize&lt;/code&gt; method with a simple Rust function:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-rust&quot;&gt;use charabia::Tokenize;
use libc::{c_char};
use serde_json::json;
use std::ffi::{CStr, CString};
use std::ptr;

fn tokenize_string(input: &amp;amp;str) -&amp;gt; Vec&amp;lt;String&amp;gt; {
    input
        .tokenize()
        .filter(|token| token.is_word())
        .map(|token| token.lemma().to_string().trim().to_string())
        .filter(|token| !token.is_empty())
        .collect()
}

/// Tokenizes the input string and returns a JSON string containing the tokens
///
/// # Safety
///
/// This function is unsafe because it deals with raw pointers
#[no_mangle]
pub unsafe extern &amp;quot;C&amp;quot; fn tokenize(input: *const c_char) -&amp;gt; *mut c_char {
	// C stuff ...

    // Tokenize the input
    let tokens = tokenize_string(input_str);

	// C stuff ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Cargo.toml:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-toml&quot;&gt;# ...

[lib]
name = &amp;quot;charabia_rs&amp;quot;
crate-type = [&amp;quot;cdylib&amp;quot;, &amp;quot;staticlib&amp;quot;]

[dependencies]
charabia = { version = &amp;quot;0.9.3&amp;quot;, default-features = false, features = [
    &amp;quot;chinese-segmentation&amp;quot;, # disable chinese-normalization (https://github.com/meilisearch/charabia/issues/331)
    #&amp;quot;german-segmentation&amp;quot;,
    &amp;quot;japanese&amp;quot;,
] }

# ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For local testing, just run &lt;code&gt;cargo build --release&lt;/code&gt;. But cross-platform compilation is much more complicated. Fortunately, the &lt;a href=&quot;https://ziglang.org/&quot;&gt;Zig&lt;/a&gt; toolchain greatly simplifies C cross-compilation, eliminating the need for musl libc!&lt;/p&gt;
&lt;p&gt;Install Zig and &lt;a href=&quot;https://github.com/rust-cross/cargo-zigbuild&quot;&gt;zigbuild&lt;/a&gt;, then compile:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;cargo zigbuild --release --target aarch64-unknown-linux-gnu
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After compiling the Rust code into a &lt;code&gt;.so&lt;/code&gt; file, call its exported method in RawWeb. We need to link against the correct &lt;code&gt;.so&lt;/code&gt; during cross-compilation, and load the &lt;code&gt;.so&lt;/code&gt; file from &lt;code&gt;./lib&lt;/code&gt; when the application starts in production:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// #cgo linux,amd64 LDFLAGS: -L${SRCDIR}/charabia-rs/target/x86_64-unknown-linux-gnu/release -lcharabia_rs -Wl,-rpath,./lib
// #cgo linux,arm64 LDFLAGS: -L${SRCDIR}/charabia-rs/target/aarch64-unknown-linux-gnu/release -lcharabia_rs -Wl,-rpath,./lib
// #cgo LDFLAGS: -L${SRCDIR}/charabia-rs/target/release -lcharabia_rs
// #include &amp;lt;stdlib.h&amp;gt;
// #include &amp;lt;stdint.h&amp;gt;
//
// typedef void* charabia_result_t;
//
// extern char* tokenize(const char* input);
// extern void free_tokenize_result(char* ptr);
import &amp;quot;C&amp;quot;

// Tokenize tokenizes the given text using the Rust implementation via cgo
func Tokenize(text string) []string {
	// C stuff ...

	cResult := C.tokenize(cText)

	// C stuff ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Also using Zig for cross-compiling the Go code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;CGO_ENABLED=1 GOOS=linux GOARCH=arm64 CC=&amp;quot;zig cc -target aarch64-linux&amp;quot; CXX=&amp;quot;zig c++ -target aarch64-linux&amp;quot; go build ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, during deployment, place &lt;code&gt;libcharabia_rs.so&lt;/code&gt; into the &lt;code&gt;./lib/&lt;/code&gt; directory so it can be loaded.&lt;/p&gt;
&lt;h3&gt;Filtering Similar Content&lt;/h3&gt;
&lt;p&gt;According to the literature (e.g. Google&apos;s near-duplicate detection paper below), a Hamming distance of at most 3 on a 64-bit SimHash can generally identify similar content.&lt;/p&gt;
&lt;p&gt;However, due to limitations in tokenization quality, content length, etc., I observed false positives even with a Hamming distance threshold of 1 in my test cases. Additionally, my server has low specs. Calculating and comparing the Hamming distance for one document against 700k records takes about 1.2 seconds. At this rate, a full comparison would take 10 days, which is unacceptable.&lt;/p&gt;
&lt;p&gt;Therefore, for now, I only filtered documents with identical hash values. This avoids calculating Hamming distance and allows the search to use database indexes, making it very fast. Ultimately, I cleaned up 56,000 similar documents. This number was much higher than I expected. Given that I encountered SimHash collisions during testing, I reasonably suspect there might be quite a few false positives among them. Further optimization of tokenization and token weighting is needed.&lt;/p&gt;
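&lt;p&gt;Filtering identical hashes then becomes a plain indexed lookup; a sketch with a hypothetical schema:&lt;/p&gt;

```sql
-- Hypothetical schema. With a composite index on the four segments,
-- exact-duplicate groups are found without any Hamming-distance call.
SELECT simhash_1, simhash_2, simhash_3, simhash_4,
       array_agg(id) AS duplicate_ids
FROM documents
GROUP BY simhash_1, simhash_2, simhash_3, simhash_4
HAVING count(*) > 1;
```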
&lt;h3&gt;References&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://moz.com/devblog/near-duplicate-detection&quot;&gt;Near-Duplicate Detection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://ben-whitmore.com/simhash-and-solving-the-hamming-distance-problem-explained/&quot;&gt;SimHash and solving the hamming distance problem: explained&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/33026.pdf&quot;&gt;Detecting Near-Duplicates for Web Crawling - Google Research&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Migrating to Meilisearch&lt;/h2&gt;
&lt;p&gt;The full-text search engine I previously used was Elasticsearch. It&apos;s feature-rich and battle-tested in countless production environments.&lt;/p&gt;
&lt;p&gt;As RawWeb&apos;s features and data volume stabilized, I realized I wasn&apos;t using most of Elasticsearch&apos;s capabilities, yet I still had to bear the extra operational costs (actually, it had been very stable since deployment, but if such a behemoth encountered problems one day, I wasn&apos;t sure if I had the ability or energy to fix them). Also, the &lt;code&gt;elasticsearch-go&lt;/code&gt; client is very difficult to use.&lt;/p&gt;
&lt;p&gt;Meilisearch is a more lightweight alternative, with most features working out of the box. The migration process was very smooth, although a few unexpected issues popped up.&lt;/p&gt;
&lt;h3&gt;Multilingual Documents&lt;/h3&gt;
&lt;p&gt;Referencing &lt;a href=&quot;https://w3techs.com/technologies/overview/content_language&quot;&gt;W3Techs&apos; statistics on content languages on the internet&lt;/a&gt;, RawWeb specifically tags content in English, Chinese, German, French, Spanish, Russian, and Japanese to enable filtering search results by language.&lt;/p&gt;
&lt;p&gt;In Elasticsearch, I used separate fields like &lt;code&gt;content_en&lt;/code&gt;, &lt;code&gt;content_zh&lt;/code&gt;, etc., with dedicated tokenizers. Theoretically, this step could be simplified in Meilisearch because it can automatically detect content language. However, I ended up splitting the content into multiple indexes because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;RawWeb&apos;s existing natural-language-detection module samples text based on document title and content length and switches precision modes automatically, which is more efficient than full-text detection.&lt;/li&gt;
&lt;li&gt;To filter search results by language, I need to add a &lt;code&gt;lang&lt;/code&gt; field in Meilisearch to mark the document&apos;s language. So, besides Meilisearch, I still need to perform natural language detection once.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Splitting different languages into separate indexes also aligns with Meilisearch&apos;s official recommendations.&lt;/p&gt;
&lt;h3&gt;Issue 1: High Storage Space Usage&lt;/h3&gt;
&lt;p&gt;The PostgreSQL database size is about 2.4GB. After importing documents, the Meilisearch database size grew to about 23GB (with &lt;code&gt;searchableAttributes&lt;/code&gt; and &lt;code&gt;filterableAttributes&lt;/code&gt; configured correctly).&lt;/p&gt;
&lt;p&gt;Initially, I didn&apos;t realize the implication about disk usage mentioned in the &lt;a href=&quot;https://www.meilisearch.com/docs/learn/engine/storage#measured-disk-usage&quot;&gt;documentation&lt;/a&gt;, which led to the hard drive filling up. Fortunately, hard drive space is the cheapest cloud resource, so expanding it wasn&apos;t expensive.&lt;/p&gt;
&lt;p&gt;Besides this, there&apos;s another potential issue: Meilisearch doesn&apos;t release disk space after deleting documents (&lt;a href=&quot;https://www.meilisearch.com/docs/learn/engine/storage#database-size&quot;&gt;docs&lt;/a&gt;). Reclaiming space might require using snapshots (&lt;a href=&quot;https://github.com/meilisearch/meilisearch/discussions/3156#discussioncomment-4262368&quot;&gt;related discussion&lt;/a&gt;).&lt;/p&gt;
&lt;h3&gt;Issue 2: Memory Usage Limit Ineffective&lt;/h3&gt;
&lt;p&gt;Meilisearch is deployed on a low-spec server with 2 vCPUs and 4GB RAM. Since Elasticsearch previously ran fine on a server with the same configuration, I assumed Meilisearch would be smooth sailing too. After indexing all documents, I went to sleep peacefully (I later realized they were probably just queued in Meilisearch&apos;s task queue at that time).&lt;/p&gt;
&lt;p&gt;I woke up to find the server&apos;s CPU maxed out and disk read speeds exceeding 1GB/s, causing the entire system to freeze. After a forced reboot, I checked the system logs and the only abnormality found was an OOM error from the Meilisearch container. I then used &lt;code&gt;MEILI_MAX_INDEXING_MEMORY&lt;/code&gt; to limit indexing memory usage to 2GB. However, the next day, it experienced OOM and maxed-out CPU again.&lt;/p&gt;
&lt;p&gt;Looking through the documentation, I found the &lt;code&gt;MEILI_EXPERIMENTAL_REDUCE_INDEXING_MEMORY_USAGE&lt;/code&gt; parameter. Although experimental, I tried it and found it worked really well. CPU and disk I/O were no longer aggressive.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/24/meilisearch.png&quot; alt=&quot;monitor&quot;&gt;&lt;/p&gt;
&lt;h3&gt;Issue 3: Very Slow Document Deletion Leading to Task Backlog&lt;/h3&gt;
&lt;p&gt;To clean up documents in Meilisearch that were deleted from the database, the operation for each batch during synchronization was:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Delete documents in Meilisearch within the range &lt;code&gt;id &amp;gt;= ? AND id &amp;lt;= ?&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Add new documents.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After deploying, data synchronization ran into problems. Investigation revealed that Meilisearch had accumulated over 13k tasks:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;//GET /tasks?statuses=enqueued,processing

{&amp;quot;results&amp;quot;:[{&amp;quot;uid&amp;quot;:16354,&amp;quot;batchUid&amp;quot;:null,&amp;quot;indexUid&amp;quot;:&amp;quot;items_es&amp;quot;,&amp;quot;status&amp;quot;:&amp;quot;enqueued&amp;quot;,&amp;quot;type&amp;quot;:&amp;quot;documentAdditionOrUpdate&amp;quot;,&amp;quot;canceledBy&amp;quot;:null,&amp;quot;details&amp;quot;:{&amp;quot;receivedDocuments&amp;quot;:7,&amp;quot;indexedDocuments&amp;quot;:null},&amp;quot;error&amp;quot;:null,&amp;quot;duration&amp;quot;:null,&amp;quot;enqueuedAt&amp;quot;:&amp;quot;2025-04-12T04:59:27.183657254Z&amp;quot;,&amp;quot;startedAt&amp;quot;:null,&amp;quot;finishedAt&amp;quot;:null},...],&amp;quot;total&amp;quot;:13385,&amp;quot;limit&amp;quot;:20,&amp;quot;from&amp;quot;:16354,&amp;quot;next&amp;quot;:16334}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Observing task execution, I found that document deletion operations were extremely slow. A range deletion involving up to 1,000 documents took nearly 20 minutes. Furthermore, because deletion and addition operations were interspersed during synchronization, Meilisearch couldn&apos;t automatically merge adjacent tasks.&lt;/p&gt;
&lt;p&gt;I couldn&apos;t find similar issues online; this seems to be a problem unique to my setup. I suspect it&apos;s because of the limited memory and the fact that Hetzner&apos;s expanded storage performance is only 1/10th of the original disk performance (&lt;a href=&quot;https://pcr.cloud-mercato.com/providers/hetzner/flavors/cpx11/performance/storage-bandwidth&quot;&gt;benchmark&lt;/a&gt;). I&apos;ll retest this once I get sponsorship funds to upgrade to a higher-spec server.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The main goals of this update have been achieved. I will continue debugging and optimizing.&lt;/p&gt;
&lt;p&gt;Additionally, the implementation process exposed some operational issues, such as excessive downtime and lack of server resource alerts (I removed New Relic monitoring during the last refactor). These will be targets for the next round of optimization.&lt;/p&gt;
</content:encoded></item><item><title>The first three iterations of RawWeb.org&apos;s tech stack</title><link>https://rook1e.com/en/posts/the-first-three-iterations-of-rawweb/</link><guid isPermaLink="true">https://rook1e.com/en/posts/the-first-three-iterations-of-rawweb/</guid><description>RawWeb.org is a search engine project I launched in 2024-08. The initial goal was to help more people discover personal digital gardens that are often overlooked by mainstream search engines, while also exploring some tech stacks I was interested in through hands-on practice.</description><pubDate>Sat, 08 Feb 2025 04:25:26 GMT</pubDate><content:encoded>&lt;p&gt;&lt;a href=&quot;https://rawweb.org/&quot;&gt;RawWeb.org&lt;/a&gt; is a search engine project I launched in 2024-08. The initial goal was to help more people discover personal digital gardens that are often overlooked by mainstream search engines. I also wanted to explore some interesting tech stacks through practical implementation.&lt;/p&gt;
&lt;p&gt;Currently, it has indexed 17k sites and 615k articles. Feel free to &lt;a href=&quot;https://rawweb.org/feeds&quot;&gt;submit&lt;/a&gt; your favorite independent blogs.&lt;/p&gt;
&lt;p&gt;This article only represents my personal experience and views.&lt;/p&gt;
&lt;h2&gt;Middleware&lt;/h2&gt;
&lt;p&gt;PostgreSQL is used as the database instead of SQLite, because I might need Pg&apos;s rich extensions in the future. Redis is used for caching, and RabbitMQ as the message queue.&lt;/p&gt;
&lt;p&gt;Additionally, a search engine requires crawler and full-text search capabilities.&lt;/p&gt;
&lt;p&gt;Elasticsearch is used for full-text search. The reason for not implementing inverted indexing myself or using lightweight solutions like Meilisearch is that ES has better Chinese tokenizers.&lt;/p&gt;
&lt;p&gt;To reduce potential risks and development complexity, the crawler only obtains data from websites&apos; RSS feeds. Therefore, the crawler is simply implemented as an HTTP requester and RSS parser.&lt;/p&gt;
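Stripped down, that crawler is little more than an HTTP fetch plus XML decoding. A minimal sketch of the parsing half in Go, using only the standard library (the struct shape and field names here are illustrative, not the actual RawWeb code; a real crawler also needs Atom support and HTTP timeouts):

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Minimal RSS 2.0 item shape; real feeds carry more fields.
type item struct {
	Title   string `xml:"title"`
	Link    string `xml:"link"`
	PubDate string `xml:"pubDate"`
}

type rss struct {
	Channel struct {
		Title string `xml:"title"`
		Items []item `xml:"item"`
	} `xml:"channel"`
}

// parseFeed decodes a raw RSS document into its items.
func parseFeed(data []byte) ([]item, error) {
	var doc rss
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	return doc.Channel.Items, nil
}

func main() {
	feed := []byte(`<rss version="2.0"><channel><title>blog</title>
<item><title>Hello</title><link>https://example.com/a</link></item>
</channel></rss>`)
	items, err := parseFeed(feed)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(items), items[0].Title)
}
```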
&lt;p&gt;To keep things simple, all of the above components are deployed as single nodes, without any tuning tricks (I wouldn&apos;t know how anyway).&lt;/p&gt;
&lt;h2&gt;Multi-language Content Support&lt;/h2&gt;
&lt;p&gt;This is a search engine capable of indexing content in multiple languages, where tokenization quality determines search result quality.&lt;/p&gt;
&lt;p&gt;To configure specialized tokenizers for different languages, multiple fields are set up in Elasticsearch, such as &lt;code&gt;content-en&lt;/code&gt;, &lt;code&gt;content-zh&lt;/code&gt;, to store content in different languages.&lt;/p&gt;
&lt;p&gt;This involves:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Natural language detection&lt;/li&gt;
&lt;li&gt;Routing content to dedicated fields with specialized tokenizers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;First, clean the raw content:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Parse HTML, remove useless tags like style, script;&lt;/li&gt;
&lt;li&gt;Remove code, URLs, and other content as much as possible to avoid affecting language detection accuracy;&lt;/li&gt;
&lt;li&gt;Remove HTML and XML tags to get plain text;&lt;/li&gt;
&lt;li&gt;Remove excess whitespace characters.&lt;/li&gt;
&lt;/ol&gt;
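The cleaning steps above can be sketched with a few stdlib regexes (a rough illustration under my own assumptions, not the actual implementation; a real pipeline would use a proper HTML parser rather than regexes):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Drop whole elements whose contents would skew language detection
	// (styles, scripts, code blocks).
	reBlocks = regexp.MustCompile(`(?is)<(script|style|pre|code)\b.*?</\s*(script|style|pre|code)\s*>`)
	reTags   = regexp.MustCompile(`(?s)<[^>]*>`)  // any remaining HTML/XML tag
	reURLs   = regexp.MustCompile(`https?://\S+`) // bare URLs in the text
	reSpace  = regexp.MustCompile(`\s+`)          // runs of whitespace
)

// cleanForDetection reduces raw HTML to plain text suitable as
// language-detection input.
func cleanForDetection(html string) string {
	s := reBlocks.ReplaceAllString(html, " ")
	s = reTags.ReplaceAllString(s, " ")
	s = reURLs.ReplaceAllString(s, " ")
	return strings.TrimSpace(reSpace.ReplaceAllString(s, " "))
}

func main() {
	in := `<p>Hello <b>world</b></p><style>p{color:red}</style> see https://example.com`
	fmt.Println(cleanForDetection(in))
}
```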
&lt;p&gt;Then identify the content&apos;s language. There are two approaches:&lt;/p&gt;
&lt;p&gt;The first is &lt;a href=&quot;https://github.com/pemistahl/lingua-go&quot;&gt;lingua&lt;/a&gt;, which has implementations in Python, Go, and other languages. It has excellent performance and accuracy, and allows selective loading of language models. The downside is that it increases the executable size by about 100MB.&lt;/p&gt;
&lt;p&gt;The second is Elasticsearch&apos;s built-in &lt;a href=&quot;https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-lang-ident.html&quot;&gt;lang_ident_model_1&lt;/a&gt;, which requires creating a pipeline to call. In testing, the accuracy was good but performance was an issue. With the same data, it was &lt;a href=&quot;https://x.com/rook1e_stdout/status/1830871816279335045&quot;&gt;4 times slower&lt;/a&gt; than the Python version of lingua running on lower-spec hardware. I suspect this is because lang_ident_model_1 needs to test all supported languages, while lingua only needs to load a few language models.&lt;/p&gt;
&lt;p&gt;Considering performance and flexibility, lingua was ultimately chosen. Lingua has high- and low-accuracy modes; low accuracy offers roughly 2x the throughput with no significant accuracy loss for inputs over 120 characters. So currently a hybrid of the two modes is used, with the title plus a sample of the content as input. In actual testing, detecting one article takes only about 100μs.&lt;/p&gt;
&lt;p&gt;Once the content&apos;s language is determined, the best tokenizer can be chosen for it. Based on W3Techs&apos; estimate of internet content distribution, dedicated tokenizers are configured for the most mainstream languages - Chinese, English, Spanish, Russian, German, French, and Japanese - while other languages use the default tokenizer.&lt;/p&gt;
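The routing step then reduces to a small lookup. A sketch (the post names only content-en and content-zh; the other field names and the default-field fallback are my assumptions, following the same pattern):

```go
package main

import "fmt"

// Per-language Elasticsearch fields. Only content-en and content-zh
// appear in the post; the rest are assumed to follow the same pattern.
var langFields = map[string]string{
	"zh": "content-zh",
	"en": "content-en",
	"es": "content-es",
	"ru": "content-ru",
	"de": "content-de",
	"fr": "content-fr",
	"ja": "content-ja",
}

// fieldFor routes a detected ISO 639-1 code to its dedicated field,
// falling back to a default-tokenizer field for every other language.
func fieldFor(lang string) string {
	if f, ok := langFields[lang]; ok {
		return f
	}
	return "content" // assumed name for the default-tokenizer field
}

func main() {
	fmt.Println(fieldFor("zh"), fieldFor("ko"))
}
```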
&lt;h2&gt;Backend&lt;/h2&gt;
&lt;p&gt;The crawler is a simple Go program. The main backend went through three iterations with Django, Nest.js, and Go.&lt;/p&gt;
&lt;h3&gt;v1 - Django&lt;/h3&gt;
&lt;p&gt;Tech stack:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Django v5&lt;/li&gt;
&lt;li&gt;django-ninja as API endpoint&lt;/li&gt;
&lt;li&gt;huey as task queue, though I only used it for managing scheduled tasks&lt;/li&gt;
&lt;li&gt;uv as package manager&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Django had been recommended to me multiple times before, and I wanted to learn a batteries-included framework through this project. Since it ships with Django Admin, I used it for prototype development.&lt;/p&gt;
&lt;p&gt;Django&apos;s documentation quality is among the best I&apos;ve seen, making it very pleasant to read. But since the project has a separate frontend and backend and doesn&apos;t use built-in components like auth and views, Django&apos;s &amp;quot;batteries&amp;quot; didn&apos;t reduce my workload, and the overall development experience wasn&apos;t particularly exciting.&lt;/p&gt;
&lt;p&gt;Given the framework&apos;s stability and its thriving community, I would probably like Django if I were a dynamic language enthusiast. Unfortunately, I&apos;ve been deeply influenced by Go&apos;s philosophy, and Django&apos;s level of &amp;quot;magic&amp;quot; exceeded my comfort zone, like building query conditions out of field name + double underscore + method name. BTW, it&apos;s hard to imagine I once wanted to learn RoR.&lt;/p&gt;
&lt;p&gt;Finally, after development was complete, the load test results were far below my expectations, even with all built-in plugins disabled, async used wherever possible, and Uvicorn serving the app. So I started looking into rebuilding with Node.js.&lt;/p&gt;
&lt;h3&gt;v2 - Nest.js&lt;/h3&gt;
&lt;p&gt;Tech stack:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;TypeScript&lt;/li&gt;
&lt;li&gt;Components wrapped by Nest&lt;/li&gt;
&lt;li&gt;Prisma as ORM&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Since the main latency in a search request comes from waiting for Elasticsearch, with the web service mainly acting as a request forwarder, this I/O-intensive scenario is very suitable for Node.js.&lt;/p&gt;
&lt;p&gt;Popular frameworks include Nest.js and Adonis.js, and I ultimately chose the more popular Nest. Don&apos;t ask why not Express or Fastify - they&apos;re not full-fledged frameworks.&lt;/p&gt;
&lt;p&gt;Nest seems more like a dependency injector plus multiple officially maintained components (modules). Although it includes common components like cache and message queue, from my observation, most are wrappers around third-party libraries, so Nest users don&apos;t need to piece things together themselves. However, even with official wrappers, I was still unfortunately affected by underlying library changes (&lt;a href=&quot;https://github.com/nestjs/cache-manager/issues/516&quot;&gt;cache-manager@6&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;For developers with Java/Spring background, Nest might be great. But for me, Nest&apos;s various decorators, pipes, and other concepts created a heavy mental burden. When switching back to a Nest project after two or three months, I needed to review the documentation to confirm their usage.&lt;/p&gt;
&lt;p&gt;Additionally, while the documentation appears comprehensive, its quality is far below Django&apos;s. For example, I couldn&apos;t understand the module lifecycle part from the documentation alone, and finally had to rely on an article analyzing the source code to roughly figure it out.&lt;/p&gt;
&lt;p&gt;Exploring new technology is always good, but choosing Nest for this project was a mistake because the project&apos;s complexity was even less than the complexity Nest introduced.&lt;/p&gt;
&lt;h3&gt;v3 - Go&lt;/h3&gt;
&lt;p&gt;Tech stack:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Echo as API endpoint&lt;/li&gt;
&lt;li&gt;GORM Gen as ORM&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After those two experiences, batteries-included frameworks have lost their mystique for me, at least for now. Having come full circle, I found my true love was still the original one: Go.&lt;/p&gt;
&lt;p&gt;I previously had two main complaints about web development with Go:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The syntax is too basic, making CRUD uncomfortable&lt;/li&gt;
&lt;li&gt;Lack of good ORM or SQL builder&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Fortunately, both issues have been largely resolved.&lt;/p&gt;
&lt;p&gt;Thanks to the development of LLMs and AI IDEs, Go&apos;s basic syntax is no longer a disadvantage but has become somewhat of an advantage (to me): LLMs can very easily understand the code, and AI completion is very accurate.&lt;/p&gt;
&lt;p&gt;Regarding ORM, a quite popular opinion in the Go community is that &amp;quot;ORM is harmful,&amp;quot; preferring approaches like sqlc generating Go code from SQL, or sqlx directly using SQL. ORM indeed sometimes makes simple things complex - for example, Prisma only recently started supporting &lt;a href=&quot;https://www.prisma.io/blog/prisma-6-better-performance-more-flexibility-and-type-safe-sql#pick-the-best-join-strategy&quot;&gt;true JOIN&lt;/a&gt;. However, a well-designed, type-safe ORM can greatly improve CRUD experience.&lt;/p&gt;
&lt;p&gt;GORM Gen made me fall in love with GORM again. Through code generation, it not only achieves type safety but, more importantly, can generate Go code from custom SQL, meaning I have almost full SQL capabilities.&lt;/p&gt;
&lt;p&gt;Thus, this code refactoring with Go was very enjoyable, except for the disastrous official Elasticsearch SDK.&lt;/p&gt;
&lt;p&gt;Go also reduced the infra burden: no more multi-stage builds in the Dockerfile (with no CI server or GitHub Actions, the previous two tech stacks required building Docker images on the production server after pushing code).&lt;/p&gt;
&lt;p&gt;To keep things simple, I also removed RabbitMQ, instead using a database table to store tasks and providing an API for the crawler to sync data. Since Redis might be simplified away in the future, I didn&apos;t use it as the message queue here.&lt;/p&gt;
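Conceptually, the replacement is just a table of task rows that workers claim atomically. A rough in-memory sketch of the pattern (illustrative only, not RawWeb's actual schema; with Postgres the same claim would typically be a single query guarded by SELECT ... FOR UPDATE SKIP LOCKED so concurrent workers never grab the same row):

```go
package main

import (
	"fmt"
	"sync"
)

// Task mirrors a row in the hypothetical tasks table.
type Task struct {
	ID     int
	Status string // "pending" -> "running" -> "done"
}

// store stands in for the database table; the mutex plays the role
// that row locking plays in Postgres.
type store struct {
	mu    sync.Mutex
	tasks []*Task
}

// ClaimNext atomically takes the oldest pending task, or nil if none.
func (s *store) ClaimNext() *Task {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, t := range s.tasks {
		if t.Status == "pending" {
			t.Status = "running"
			return t
		}
	}
	return nil
}

func main() {
	s := &store{tasks: []*Task{
		{ID: 1, Status: "pending"},
		{ID: 2, Status: "pending"},
	}}
	fmt.Println(s.ClaimNext().ID, s.ClaimNext().ID, s.ClaimNext() == nil)
}
```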
&lt;h3&gt;Alternatives&lt;/h3&gt;
&lt;p&gt;There are some interesting options I passed on but might try in the future:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;C# &amp;amp; .NET. I&apos;ve heard C# is very enjoyable to write, and .NET is a great enterprise framework. But I&apos;m not interested in OOP, and I&apos;m worried Microsoft might again make risky moves in its .NET open source work (&lt;a href=&quot;https://github.com/dotnet/sdk/issues/22247&quot;&gt;Hot Reload removed from dotnet watch - Why?&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Elixir &amp;amp; Phoenix. Elixir&apos;s features seem very suitable for high-concurrency scenarios, and the development experience is very good. But I currently don&apos;t have the energy to learn functional programming.&lt;/li&gt;
&lt;/ul&gt;
&lt;details&gt;
&lt;summary&gt;Easter egg&lt;/summary&gt;
&lt;p&gt;Are you looking for Rust? Haha, I&apos;ll never learn it for web development.&lt;/p&gt;
&lt;/details&gt;
&lt;h2&gt;Frontend&lt;/h2&gt;
&lt;p&gt;The frontend uses my favorite SvelteKit, compiled into hybrid SSG and SPA pages. UI components are from shadcn-svelte.&lt;/p&gt;
&lt;p&gt;React is good, but I equally dislike most things in its ecosystem, especially Next.js. I don&apos;t understand why the community keeps getting &amp;quot;richer&amp;quot; while making developers more miserable. Svelte is currently my painkiller, and I recommend you try it too.&lt;/p&gt;
&lt;h2&gt;Infrastructure&lt;/h2&gt;
&lt;p&gt;To avoid vendor lock-in, only generic infra technologies are used:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Backend services are orchestrated with Docker Compose, compiled and deployed to VPS by a simple Shell script&lt;/li&gt;
&lt;li&gt;Main backend services run on Hetzner&apos;s Arm VPS, currently two Debian instances with 2 vCPU + 4G RAM (great value for money; feel free to register via my &lt;a href=&quot;https://hetzner.cloud/?ref=YVojp0f9kRad&quot;&gt;aff&lt;/a&gt; link and you&apos;ll get €20 in credit)&lt;/li&gt;
&lt;li&gt;Crawler service is on another budget VPS&lt;/li&gt;
&lt;li&gt;Web pages, CDN, DNS are on Cloudflare&lt;/li&gt;
&lt;li&gt;Monitoring service uses self-hosted Uptime Kuma, and some services are connected to New Relic&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Future plans include setting up a Prometheus + Grafana observability system to visualize metrics like search volume and new indexing volume.&lt;/p&gt;
</content:encoded></item><item><title>Setting Up a Lightweight Remote Linux Dev Environment (Fedora 38)</title><link>https://rook1e.com/en/posts/lightweight-remote-linux-devenv-fedora38/</link><guid isPermaLink="true">https://rook1e.com/en/posts/lightweight-remote-linux-devenv-fedora38/</guid><description>Tips for configuring and optimizing Fedora Workstation 38 as a remote development environment, including remote desktop setup and disabling GNOME to save resources.</description><pubDate>Fri, 14 Jul 2023 09:02:18 GMT</pubDate><content:encoded>&lt;p&gt;First, pick a distro. It needs comprehensive and up-to-date package repositories, but since I won&apos;t necessarily use it every day, I&apos;d prefer to avoid rolling releases. This time I went with Fedora Workstation 38, then trimmed down GNOME and other services to achieve:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Idle memory usage under 300MB&lt;/li&gt;
&lt;li&gt;Desktop environment available on demand&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Basic Setup&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Enable sshd:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo systemctl start sshd
sudo systemctl enable sshd
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Rename home directory folders to English:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;export LANG=en_US
xdg-user-dirs-gtk-update
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After the conversion, set the system language back to &lt;code&gt;zh_CN&lt;/code&gt;, restart, and when prompted at login, choose not to convert and not to ask again.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Remove unnecessary &lt;a href=&quot;https://docs.fedoraproject.org/en-US/workstation-working-group/third-party-repos/#_included_software&quot;&gt;third-party repositories&lt;/a&gt;:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo rm /etc/yum.repos.d/_copr\:copr.fedorainfracloud.org\:phracek\:PyCharm.repo
sudo rm /etc/yum.repos.d/rpmfusion-nonfree-steam.repo
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Install Go + nvim development tools:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo dnf install vim neovim go gcc-c++
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Remove Software Update Services&lt;/h2&gt;
&lt;p&gt;Right after booting, memory usage was already at 1.5GB, with packagekitd taking up a large chunk.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.freedesktop.org/software/PackageKit/pk-intro.html&quot;&gt;PackageKit&lt;/a&gt; is a generic abstraction layer over package managers like dnf and apt. But since I only use dnf and don&apos;t need &amp;quot;advanced&amp;quot; tools like gnome-software -- not to mention its &lt;a href=&quot;https://www.reddit.com/r/Fedora/comments/ts5tgd/is_fedora_doing_something_to_reduce_packagekits/&quot;&gt;heavy resource consumption&lt;/a&gt; -- I removed both components:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo dnf remove gnome-software PackageKit
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This saved 600MB of memory.&lt;/p&gt;
&lt;h2&gt;Disable GNOME Desktop Environment&lt;/h2&gt;
&lt;p&gt;Most of the time I connect via SSH, and only use RDP remote desktop on rare occasions. So I disabled &lt;a href=&quot;https://en.wikipedia.org/wiki/GNOME_Display_Manager&quot;&gt;gdm (GNOME Display Manager)&lt;/a&gt; to save resources (&lt;a href=&quot;https://superuser.com/a/444003&quot;&gt;reference&lt;/a&gt;):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo systemctl stop gdm
sudo systemctl disable gdm
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start it manually when needed:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo systemctl start gdm
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This saved another 600MB+ of memory.&lt;/p&gt;
&lt;h2&gt;Remote Desktop&lt;/h2&gt;
&lt;p&gt;When you need the desktop environment, start gdm first, then connect remotely.&lt;/p&gt;
&lt;h3&gt;Auto-login + Unlocking Remote Login Password&lt;/h3&gt;
&lt;p&gt;In this version, remote desktop is built into GNOME and requires an active user session, so you&apos;d need to log in via VNC first.&lt;/p&gt;
&lt;p&gt;For convenience, you can set up GNOME auto-login (&lt;a href=&quot;https://help.gnome.org/admin/system-admin-guide/stable/login-automatic.html.en&quot;&gt;documentation&lt;/a&gt;):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;# /etc/gdm/custom.conf

[daemon]
AutomaticLoginEnable=True
AutomaticLogin={{ username }}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, you&apos;ll find that after each auto-login, the remote desktop password gets reset to a random one. This happens because service passwords on the system are stored encrypted in a keyring. The default keyring&apos;s unlock password is the login password, and it gets unlocked together during a normal login. But with auto-login, no password is entered to unlock it, so GNOME can&apos;t read the encrypted remote login password and generates a new random one instead.&lt;/p&gt;
&lt;p&gt;Setting the default keyring password to empty would leave all stored passwords in plaintext, which is insecure.&lt;/p&gt;
&lt;p&gt;Following a &lt;a href=&quot;https://askubuntu.com/a/1409857&quot;&gt;community solution&lt;/a&gt;, create a password-free insecure keyring specifically for storing the RDP password:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Install the keyring management tool. After installation, you can find the &amp;quot;Passwords and Keys&amp;quot; application:&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;sudo dnf install seahorse
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;2&quot;&gt;
&lt;li&gt;In the tool, check the default &amp;quot;Login&amp;quot; keyring. It should already contain a remote desktop password entry named &amp;quot;GNOME Remote Desktop RDP credentials&amp;quot;. Delete this entry.&lt;/li&gt;
&lt;li&gt;Create a new keyring with an empty password, and set it as the default keyring.&lt;/li&gt;
&lt;li&gt;Restart the system to apply the new default keyring.&lt;/li&gt;
&lt;li&gt;Set the remote desktop password. Check the newly created keyring again -- it should now contain the &amp;quot;GNOME Remote Desktop RDP credentials&amp;quot; entry. From now on, remote desktop will use this keyring to read and set passwords.&lt;/li&gt;
&lt;li&gt;Restore the default keyring back to the original &amp;quot;Login&amp;quot; keyring, then restart the system.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Note: You need to change the default keyring so that GNOME creates the password entry. Such entries appear as &amp;quot;Password or Key&amp;quot;, whereas manually created password entries show up as &amp;quot;Stored Note&amp;quot; and won&apos;t be used by GNOME.&lt;/p&gt;
&lt;h3&gt;Using Remote Desktop While Locked&lt;/h3&gt;
&lt;p&gt;By design, GNOME remote desktop mirrors the local screen -- when the local screen is locked, the remote desktop connection is dropped.&lt;/p&gt;
&lt;p&gt;This differs from Windows, where connecting via remote desktop automatically locks the local desktop. The GNOME team hasn&apos;t officially responded to this &lt;a href=&quot;https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/3212&quot;&gt;feature request&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For now, I&apos;ve simply disabled the auto-lock screen.&lt;/p&gt;
</content:encoded></item><item><title>Deep Dive into Linux TProxy</title><link>https://rook1e.com/en/posts/linux-tproxy/</link><guid isPermaLink="true">https://rook1e.com/en/posts/linux-tproxy/</guid><description>TProxy (Transparent Proxy) is a kernel-supported transparent proxying mechanism introduced in Linux 2.6.28. Unlike NAT, which modifies the packet&apos;s destination address for redirection, TProxy merely replaces the socket held by the packet&apos;s skb, without modifying packet headers.</description><pubDate>Fri, 23 Jun 2023 09:06:25 GMT</pubDate><content:encoded>&lt;p&gt;This was my first foray into the kernel networking stack. If you spot any errors, feel free to let me know (&lt;a href=&quot;mailto:rook1e404@outlook.com&quot;&gt;email&lt;/a&gt;) and I will annotate corrections in the article.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;TProxy (&lt;strong&gt;T&lt;/strong&gt;ransparent &lt;strong&gt;Proxy&lt;/strong&gt;) is a kernel-supported transparent proxying mechanism introduced in Linux &lt;a href=&quot;https://kernelnewbies.org/Linux_2_6_28#Network:_Transparent_proxying.2C_new_drivers.2C_DSA...&quot;&gt;2.6.28&lt;/a&gt;. Unlike NAT, which modifies the packet&apos;s destination address for redirection, TProxy merely replaces the socket held by the packet&apos;s &lt;a href=&quot;https://elixir.bootlin.com/linux/v6.1.34/source/include/linux/skbuff.h#L692&quot;&gt;skb&lt;/a&gt;, without modifying packet headers.&lt;/p&gt;
&lt;p&gt;Terminology note: TProxy is the general name for the feature, while TPROXY is the name of an iptables extension.&lt;/p&gt;
&lt;h2&gt;IP_TRANSPARENT&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;IP_TRANSPARENT&lt;/code&gt; option allows a socket to treat any non-local address as a local address, enabling it to bind to non-local addresses and masquerade as a non-local address when sending and receiving data.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;int opt = 1;
setsockopt(sockfd, SOL_IP, IP_TRANSPARENT, &amp;amp;opt, sizeof(opt));
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For example, a gateway (&lt;code&gt;192.168.0.1&lt;/code&gt; / &lt;code&gt;123.x.x.94&lt;/code&gt;) acting as a transparent proxy intercepts the connection between a client (&lt;code&gt;192.168.0.200&lt;/code&gt;) and a remote server (&lt;code&gt;157.x.x.149&lt;/code&gt;). It connects to the remote server on behalf of the client, while also masquerading as the remote server when communicating with the client:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;$ netstat -atunp
Proto Recv-Q Send-Q Local Address           Foreign Address            State       PID/Program name
tcp        0      0 123.x.x.94:37338        157.x.x.149:443            ESTABLISHED 2904/proxy
tcp        0      0 ::ffff:157.x.x.149:443  ::ffff:192.168.0.200:56418 ESTABLISHED 2904/proxy
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Inbound Redirection&lt;/h2&gt;
&lt;h3&gt;Why Replace the Socket&lt;/h3&gt;
&lt;p&gt;When the kernel networking stack receives a packet, it looks up the most closely matching socket from the corresponding protocol&apos;s hash table based on the packet&apos;s 5-tuple, then places the packet into that socket&apos;s receive queue. Taking UDP as an example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// https://elixir.bootlin.com/linux/v6.1.34/source/net/ipv4/udp.c#L2405
int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
		   int proto)
{
	// ...
	sk = skb_steal_sock(skb, &amp;amp;refcounted);
	if (sk) {
		// ...
		ret = udp_unicast_rcv_skb(sk, skb, uh);
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;static inline struct sock *
skb_steal_sock(struct sk_buff *skb, bool *refcounted)
{
	if (skb-&amp;gt;sk) {
		struct sock *sk = skb-&amp;gt;sk;
		// ...
		return sk;
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;static int udp_unicast_rcv_skb(struct sock *sk, struct sk_buff *skb,
			       struct udphdr *uh)
{
	// ...
	ret = udp_queue_rcv_skb(sk, skb);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Netfilter hooks execute before the protocol stack, so modifying &lt;code&gt;skb-&amp;gt;sk&lt;/code&gt; in netfilter determines which socket&apos;s receive queue the packet will ultimately be placed into.&lt;/p&gt;
&lt;h3&gt;Kernel Implementation&lt;/h3&gt;
&lt;p&gt;Based on kernel v6.1.34, using the iptables TPROXY module implementation as an example. The nftables &lt;a href=&quot;https://elixir.bootlin.com/linux/v6.1.34/source/net/netfilter/nft_tproxy.c#L21&quot;&gt;implementation&lt;/a&gt; is essentially the same.&lt;/p&gt;
&lt;h4&gt;Core Logic&lt;/h4&gt;
&lt;p&gt;The main processing flow is in &lt;code&gt;tproxy_tg4()&lt;/code&gt; from &lt;a href=&quot;https://elixir.bootlin.com/linux/v6.1.34/source/net/netfilter/xt_TPROXY.c&quot;&gt;&lt;code&gt;net/netfilter/xt_TPROXY.c&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Extract headers from the skb:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;static unsigned int
tproxy_tg4(struct net *net, struct sk_buff *skb, __be32 laddr, __be16 lport,
	   u_int32_t mark_mask, u_int32_t mark_value)
{
	const struct iphdr *iph = ip_hdr(skb);
	struct udphdr _hdr, *hp;
	struct sock *sk;

	hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &amp;amp;_hdr);
	if (hp == NULL)
		return NF_DROP;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then begin searching for a socket (&lt;code&gt;sk&lt;/code&gt; in the code) to replace the packet skb&apos;s original socket.&lt;/p&gt;
&lt;p&gt;If a previous packet with the same 4-tuple was already redirected, then the proxy should have already established a connection with the client, and the current packet should also be redirected to that connection:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	/* check if there&apos;s an ongoing connection on the packet
	 * addresses, this happens if the redirect already happened
	 * and the current packet belongs to an already established
	 * connection */
	sk = nf_tproxy_get_sock_v4(net, skb, iph-&amp;gt;protocol,
				   iph-&amp;gt;saddr, iph-&amp;gt;daddr,
				   hp-&amp;gt;source, hp-&amp;gt;dest,
				   skb-&amp;gt;dev, NF_TPROXY_LOOKUP_ESTABLISHED);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set the default redirection destination — unprocessed packets should all be redirected here. The rule-specified address takes priority; otherwise, the primary address of the receiving network device is used:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	laddr = nf_tproxy_laddr4(skb, laddr, iph-&amp;gt;daddr);
	if (!lport)
		lport = hp-&amp;gt;dest;
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;__be32 nf_tproxy_laddr4(struct sk_buff *skb, __be32 user_laddr, __be32 daddr)
{
	const struct in_ifaddr *ifa;
	struct in_device *indev;
	__be32 laddr;

	if (user_laddr)
		return user_laddr;

	laddr = 0;
	indev = __in_dev_get_rcu(skb-&amp;gt;dev);

	in_dev_for_each_ifa_rcu(ifa, indev) {
		if (ifa-&amp;gt;ifa_flags &amp;amp; IFA_F_SECONDARY)
			continue;

		laddr = ifa-&amp;gt;ifa_local;
		break;
	}

	return laddr ? laddr : daddr;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Forward SYN packets to the proxy to establish new connections instead of reusing TIME_WAIT connections. My guess is that this allows the proxy to more easily synchronize the state of both sides of the connection (client &amp;lt;-&amp;gt; proxy &amp;lt;-&amp;gt; remote):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	/* UDP has no TCP_TIME_WAIT state, so we never enter here */
	if (sk &amp;amp;&amp;amp; sk-&amp;gt;sk_state == TCP_TIME_WAIT)
		/* reopening a TIME_WAIT connection needs special handling */
		sk = nf_tproxy_handle_time_wait4(net, skb, laddr, lport, sk);
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/**
 * nf_tproxy_handle_time_wait4 - handle IPv4 TCP TIME_WAIT reopen redirections
 * @skb:	The skb being processed.
 * @laddr:	IPv4 address to redirect to or zero.
 * @lport:	TCP port to redirect to or zero.
 * @sk:		The TIME_WAIT TCP socket found by the lookup.
 *
 * We have to handle SYN packets arriving to TIME_WAIT sockets
 * differently: instead of reopening the connection we should rather
 * redirect the new connection to the proxy if there&apos;s a listener
 * socket present.
 *
 * nf_tproxy_handle_time_wait4() consumes the socket reference passed in.
 *
 * Returns the listener socket if there&apos;s one, the TIME_WAIT socket if
 * no such listener is found, or NULL if the TCP header is incomplete.
 */
struct sock *
nf_tproxy_handle_time_wait4(struct net *net, struct sk_buff *skb,
			 __be32 laddr, __be16 lport, struct sock *sk)
{
	const struct iphdr *iph = ip_hdr(skb);
	struct tcphdr _hdr, *hp;

	hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &amp;amp;_hdr);
	if (hp == NULL) {
		inet_twsk_put(inet_twsk(sk));
		return NULL;
	}

	if (hp-&amp;gt;syn &amp;amp;&amp;amp; !hp-&amp;gt;rst &amp;amp;&amp;amp; !hp-&amp;gt;ack &amp;amp;&amp;amp; !hp-&amp;gt;fin) {
		/* SYN to a TIME_WAIT socket, we&apos;d rather redirect it
		 * to a listener socket if there&apos;s one */
		struct sock *sk2;

		sk2 = nf_tproxy_get_sock_v4(net, skb, iph-&amp;gt;protocol,
					    iph-&amp;gt;saddr, laddr ? laddr : iph-&amp;gt;daddr,
					    hp-&amp;gt;source, lport ? lport : hp-&amp;gt;dest,
					    skb-&amp;gt;dev, NF_TPROXY_LOOKUP_LISTENER);
		if (sk2) {
			nf_tproxy_twsk_deschedule_put(inet_twsk(sk));
			sk = sk2;
		}
	}

	return sk;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If no established connection was matched, use the listening-state redirection destination socket:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	else if (!sk)
		/* no, there&apos;s no established connection, check if
		 * there&apos;s a listener on the redirected addr/port */
		sk = nf_tproxy_get_sock_v4(net, skb, iph-&amp;gt;protocol,
					   iph-&amp;gt;saddr, laddr,
					   hp-&amp;gt;source, lport,
					   skb-&amp;gt;dev, NF_TPROXY_LOOKUP_LISTENER);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, verify that the new socket meets the transparent proxy requirements, then replace the packet skb&apos;s original socket:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	/* NOTE: assign_sock consumes our sk reference */
	if (sk &amp;amp;&amp;amp; nf_tproxy_sk_is_transparent(sk)) {
		/* This should be in a separate target, but we don&apos;t do multiple
		   targets on the same rule yet */
		skb-&amp;gt;mark = (skb-&amp;gt;mark &amp;amp; ~mark_mask) ^ mark_value;
		nf_tproxy_assign_sock(skb, sk);
		return NF_ACCEPT;
	}

	return NF_DROP;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/* assign a socket to the skb -- consumes sk */
static inline void nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk)
{
	skb_orphan(skb);
	skb-&amp;gt;sk = sk;
	skb-&amp;gt;destructor = sock_edemux;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Socket Matching&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;nf_tproxy_get_sock_v4()&lt;/code&gt; is a simple wrapper around the generic TCP/UDP socket matching methods.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// https://elixir.bootlin.com/linux/v6.1.34/source/net/ipv4/netfilter/nf_tproxy_ipv4.c#L75
/*
 * This is used when the user wants to intercept a connection matching
 * an explicit iptables rule. In this case the sockets are assumed
 * matching in preference order:
 *
 *   - match: if there&apos;s a fully established connection matching the
 *     _packet_ tuple, it is returned, assuming the redirection
 *     already took place and we process a packet belonging to an
 *     established connection
 *
 *   - match: if there&apos;s a listening socket matching the redirection
 *     (e.g. on-port &amp;amp; on-ip of the connection), it is returned,
 *     regardless if it was bound to 0.0.0.0 or an explicit
 *     address. The reasoning is that if there&apos;s an explicit rule, it
 *     does not really matter if the listener is bound to an interface
 *     or to 0. The user already stated that he wants redirection
 *     (since he added the rule).
 *
 * Please note that there&apos;s an overlap between what a TPROXY target
 * and a socket match will match. Normally if you have both rules the
 * &amp;quot;socket&amp;quot; match will be the first one, effectively all packets
 * belonging to established connections going through that one.
 */
struct sock *
nf_tproxy_get_sock_v4(struct net *net, struct sk_buff *skb,
		      const u8 protocol,
		      const __be32 saddr, const __be32 daddr,
		      const __be16 sport, const __be16 dport,
		      const struct net_device *in,
		      const enum nf_tproxy_lookup_t lookup_type)
{
	struct inet_hashinfo *hinfo = net-&amp;gt;ipv4.tcp_death_row.hashinfo;
	struct sock *sk;
	switch (protocol) {
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;TCP has corresponding matching methods for both states. The only extra step is incrementing the reference count on listening-state sockets, to prevent them from being freed while the packet is still being processed:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	case IPPROTO_TCP: {
		struct tcphdr _hdr, *hp;

		hp = skb_header_pointer(skb, ip_hdrlen(skb),
					sizeof(struct tcphdr), &amp;amp;_hdr);
		if (hp == NULL)
			return NULL;

		switch (lookup_type) {
		case NF_TPROXY_LOOKUP_LISTENER:
			sk = inet_lookup_listener(net, hinfo, skb,
						  ip_hdrlen(skb) + __tcp_hdrlen(hp),
						  saddr, sport, daddr, dport,
						  in-&amp;gt;ifindex, 0);

			if (sk &amp;amp;&amp;amp; !refcount_inc_not_zero(&amp;amp;sk-&amp;gt;sk_refcnt))
				sk = NULL;
			/* NOTE: we return listeners even if bound to
			 * 0.0.0.0, those are filtered out in
			 * xt_socket, since xt_TPROXY needs 0 bound
			 * listeners too
			 */
			break;
		case NF_TPROXY_LOOKUP_ESTABLISHED:
			sk = inet_lookup_established(net, hinfo, saddr, sport,
						     daddr, dport, in-&amp;gt;ifindex);
			break;
		default:
			BUG();
		}
		break;
		}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;UDP requires additional checks to determine whether the match result is usable:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;	case IPPROTO_UDP:
		sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
				     in-&amp;gt;ifindex);
		if (sk) {
			int connected = (sk-&amp;gt;sk_state == TCP_ESTABLISHED);
			int wildcard = (inet_sk(sk)-&amp;gt;inet_rcv_saddr == 0);

			/* NOTE: we return listeners even if bound to
			 * 0.0.0.0, those are filtered out in
			 * xt_socket, since xt_TPROXY needs 0 bound
			 * listeners too
			 */
			if ((lookup_type == NF_TPROXY_LOOKUP_ESTABLISHED &amp;amp;&amp;amp;
			      (!connected || wildcard)) ||
			    (lookup_type == NF_TPROXY_LOOKUP_LISTENER &amp;amp;&amp;amp; connected)) {
				sock_put(sk);
				sk = NULL;
			}
		}
		break;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are two qualifying conditions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;connected&lt;/code&gt; indicates whether the socket is &amp;quot;connected&amp;quot;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wildcard&lt;/code&gt; indicates whether the &lt;a href=&quot;https://elixir.bootlin.com/linux/v6.1.34/source/include/net/inet_sock.h#L193&quot;&gt;bind address&lt;/a&gt; is INADDR_ANY (&lt;code&gt;0.0.0.0&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, the condition &lt;code&gt;!connected || wildcard&lt;/code&gt; is puzzling, because when &lt;code&gt;connected&lt;/code&gt; is true, &lt;code&gt;wildcard&lt;/code&gt; is necessarily false, making &lt;code&gt;|| wildcard&lt;/code&gt; redundant.&lt;/p&gt;
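&lt;p&gt;A quick truth-table check confirms this (a standalone Go sketch, not kernel code): under the invariant that a connected UDP socket is never wildcard-bound, &lt;code&gt;!connected || wildcard&lt;/code&gt; always evaluates to the same value as plain &lt;code&gt;!connected&lt;/code&gt;.&lt;/p&gt;

```go
package main

import "fmt"

// dropCheckFull is the condition as written in the kernel;
// dropCheckReduced drops the "|| wildcard" term.
func dropCheckFull(connected, wildcard bool) bool    { return !connected || wildcard }
func dropCheckReduced(connected, wildcard bool) bool { return !connected }

func main() {
	// All states allowed by the invariant: connected implies !wildcard.
	states := [][2]bool{
		{false, false}, // unconnected, bound to an exact IP
		{false, true},  // unconnected, bound to 0.0.0.0
		{true, false},  // connected, never wildcard
	}
	for _, s := range states {
		full := dropCheckFull(s[0], s[1])
		reduced := dropCheckReduced(s[0], s[1])
		fmt.Printf("connected=%v wildcard=%v: full=%v reduced=%v\n", s[0], s[1], full, reduced)
		if full != reduced {
			panic("conditions differ")
		}
	}
}
```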
&lt;p&gt;When a UDP socket &lt;code&gt;connect()&lt;/code&gt;s to a target, it enters the connected state. If it was not previously bound to an exact local IP, then during &lt;code&gt;connect()&lt;/code&gt; a route lookup selects a local address to serve as both the source address of outgoing packets and the local bind address, and assigns it to the &lt;code&gt;inet_rcv_saddr&lt;/code&gt; field. Only a disconnect sets &lt;code&gt;inet_rcv_saddr&lt;/code&gt; back to INADDR_ANY:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// https://elixir.bootlin.com/linux/v6.1.34/source/net/ipv4/datagram.c#L64
int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
{
	//...

	if (!inet-&amp;gt;inet_saddr)
		inet-&amp;gt;inet_saddr = fl4-&amp;gt;saddr;	/* Update source address */
	if (!inet-&amp;gt;inet_rcv_saddr) {
		inet-&amp;gt;inet_rcv_saddr = fl4-&amp;gt;saddr;
		if (sk-&amp;gt;sk_prot-&amp;gt;rehash)
			sk-&amp;gt;sk_prot-&amp;gt;rehash(sk);
	}

	// ...

	sk-&amp;gt;sk_state = TCP_ESTABLISHED;

	// ...
}

int __udp_disconnect(struct sock *sk, int flags)
{
	struct inet_sock *inet = inet_sk(sk);
	/*
	 *	1003.1g - break association.
	 */

	sk-&amp;gt;sk_state = TCP_CLOSE;

	// ...

	if (!(sk-&amp;gt;sk_userlocks &amp;amp; SOCK_BINDADDR_LOCK)) {
		inet_reset_saddr(sk);

	// ...
}

static __inline__ void inet_reset_saddr(struct sock *sk)
{
	inet_sk(sk)-&amp;gt;inet_rcv_saddr = inet_sk(sk)-&amp;gt;inet_saddr = 0;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Therefore, a connected UDP socket&apos;s &lt;code&gt;inet_rcv_saddr&lt;/code&gt; is always an exact IP address and can never be INADDR_ANY.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6006db84a91838813cdad8a6622a4e39efe9ea47&quot;&gt;commit&lt;/a&gt; that added these qualifying conditions mentions that &lt;code&gt;nf_tproxy_get_sock_v4()&lt;/code&gt; is also used by the iptables socket extension. I suspect this might be a historical artifact.&lt;/p&gt;
&lt;h3&gt;Usage&lt;/h3&gt;
&lt;p&gt;Using the iptables &lt;a href=&quot;https://ipset.netfilter.org/iptables-extensions.man.html#lbDW&quot;&gt;TPROXY extension&lt;/a&gt; as an example:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Specify the redirection destination with &lt;code&gt;--on-port&lt;/code&gt;/&lt;code&gt;--on-ip&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Since the packet&apos;s destination address is not modified, the routing decision after PREROUTING will still forward the packet to the FORWARD chain because the destination is not a local address. Therefore, policy routing is needed to steer the packet into the INPUT chain&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;ip rule add fwmark 0x233 table 100
ip route add local default dev lo table 100

iptables -t mangle -A PREROUTING -p udp -j TPROXY --on-ip 127.0.0.1 --on-port 10000 --tproxy-mark 0x233
iptables -t mangle -A PREROUTING -p tcp -j TPROXY --on-ip 127.0.0.1 --on-port 10000 --tproxy-mark 0x233
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This assigns the socket bound to &lt;code&gt;:10000&lt;/code&gt; to matching packets, while also setting the 0x233 fwmark. Policy routing is configured so that all packets carrying the 0x233 fwmark use routing table 100. Per the &lt;a href=&quot;https://www.man7.org/linux/man-pages/man8/ip-route.8.html&quot;&gt;documentation&lt;/a&gt;, the &lt;code&gt;local&lt;/code&gt; route type in table 100 means &amp;quot;the destinations are assigned to this host. The packets are looped back and delivered locally&amp;quot;, &lt;s&gt;and packets sent from the loopback device are all treated as destined for the local host&lt;/s&gt;, thereby preventing them from being forwarded out.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;2025/03/31 Update:&lt;/p&gt;
&lt;p&gt;&amp;quot;Packets sent from the loopback device are all treated as destined for the local host&amp;quot; is incorrect. The real key is the routing rule &lt;code&gt;ip route add local default dev lo table 100&lt;/code&gt;, where &lt;code&gt;local&lt;/code&gt; forces the packet to be received locally. So when the packet comes back out of lo and reaches the Routing decision after PREROUTING again, it is considered destined for the local host and is delivered to INPUT.&lt;/p&gt;
&lt;p&gt;Therefore, the inbound/outbound flow works like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Inbound traffic -&amp;gt; PREROUTING, fwmark is added to the packet -&amp;gt; Routing decision finds the fwmark matches a routing rule, &lt;code&gt;local&lt;/code&gt; forces local delivery -&amp;gt; forwarded to lo -&amp;gt; comes back out of lo as inbound traffic again -&amp;gt; PREROUTING -&amp;gt; Routing decision determines the packet is destined for the local host -&amp;gt; INPUT&lt;/li&gt;
&lt;li&gt;Outbound traffic -&amp;gt; OUTPUT, fwmark is added -&amp;gt; Routing decision finds the fwmark matches a routing rule... (the rest follows the same flow as inbound)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h3&gt;Using &lt;code&gt;-m socket&lt;/code&gt; for Traffic Splitting to Improve Performance&lt;/h3&gt;
&lt;p&gt;I have not found a definitive explanation for this; what follows is my own understanding and speculation.&lt;/p&gt;
&lt;p&gt;The comment in &lt;code&gt;nf_tproxy_get_sock_v4()&lt;/code&gt; mentions this point:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/*
 * Please note that there&apos;s an overlap between what a TPROXY target
 * and a socket match will match. Normally if you have both rules the
 * &amp;quot;socket&amp;quot; match will be the first one, effectively all packets
 * belonging to established connections going through that one.
*/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After a packet redirected by TProxy establishes a connection, the networking stack holds a mapping between the packet&apos;s original 5-tuple and the socket. Subsequent packets for that connection match the socket through the stack&apos;s normal processing. This is the same socket that TPROXY&apos;s &lt;code&gt;sk = nf_tproxy_get_sock_v4(..., NF_TPROXY_LOOKUP_ESTABLISHED)&lt;/code&gt; would match, and it is already the redirected one, so replacing it again is unnecessary.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;2024/06/17 Update&lt;/strong&gt;: Analysis of the performance difference.&lt;/p&gt;
&lt;p&gt;In TProxy, &lt;code&gt;nf_tproxy_assign_sock&lt;/code&gt; is executed to replace the sk. The &lt;code&gt;skb_orphan&lt;/code&gt; call within it invokes the skb destructor &lt;code&gt;sock_edemux&lt;/code&gt;, which calls &lt;a href=&quot;https://github.com/torvalds/linux/blob/v6.0/net/ipv4/inet_hashtables.c#L335&quot;&gt;&lt;code&gt;sock_gen_put&lt;/code&gt;&lt;/a&gt; to decrement the sk&apos;s reference count. But for &amp;quot;already-redirected connections,&amp;quot; this is entirely redundant, because the old and new sk are the same.&lt;/p&gt;
&lt;p&gt;In contrast, the socket module only needs to call &lt;code&gt;sock_gen_put&lt;/code&gt; when the found sk differs from the one associated with the skb.&lt;/p&gt;
&lt;p&gt;Therefore, the redundant and frequent invocations of &lt;code&gt;sock_gen_put&lt;/code&gt; in TProxy can impact performance to some degree.&lt;/p&gt;
&lt;p&gt;Additionally, since TProxy and socket were &lt;a href=&quot;https://github.com/torvalds/linux/commits/d2f26037a38ada4a5d40d1cf0b32bc5289f50312/&quot;&gt;committed together&lt;/a&gt;, I speculate that the developers intended transparent proxying to be a collaborative effort between these two modules: socket handles established connections, while TProxy handles new connections. This also explains why TProxy does not check &lt;code&gt;sk != skb-&amp;gt;sk&lt;/code&gt; when replacing the sk — perhaps precisely because the developers assumed that TProxy mostly handles new connections that have not been redirected yet, and the established connection check is just a safety fallback.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;It is relatively uncommon for proxy programs to &lt;code&gt;connect()&lt;/code&gt; to the client for UDP, so only TCP is used as an example here:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;iptables -t mangle -N tproxy_divert
iptables -t mangle -A tproxy_divert -j MARK --set-mark 0x233
iptables -t mangle -A tproxy_divert -j ACCEPT

iptables -t mangle -A PREROUTING -p tcp -m socket -j tproxy_divert
iptables -t mangle -A PREROUTING -p tcp -j TPROXY --on-port 10000 --on-ip 127.0.0.1 --tproxy-mark 0x233
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Retrieving the Original Destination Address&lt;/h2&gt;
&lt;h3&gt;TCP&lt;/h3&gt;
&lt;p&gt;Use &lt;code&gt;getsockname()&lt;/code&gt; to obtain the &amp;quot;local&amp;quot; address of the client socket, which is the packet&apos;s original destination address:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;client_fd = accept(server_fd, (struct sockaddr *)&amp;amp;client_addr, &amp;amp;addr_len);

getsockname(client_fd, (struct sockaddr *)orig_dst, &amp;amp;addr_len);
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;UDP&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Use &lt;code&gt;setsockopt(..., SOL_IP, IP_RECVORIGDSTADDR, ...)&lt;/code&gt; to set the socket option so that &lt;code&gt;recvmsg()&lt;/code&gt; provides IP_ORIGDSTADDR ancillary data containing the packet&apos;s original destination address. Because TProxy does not modify the packet, this information is taken directly from the IP header:&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// /net/ipv4/ip_sockglue.c
static void ip_cmsg_recv_dstaddr(struct msghdr *msg, struct sk_buff *skb)
{
	struct sockaddr_in sin;
	const struct iphdr *iph = ip_hdr(skb);
	__be16 *ports = (__be16 *)skb_transport_header(skb);

	if (skb_transport_offset(skb) + 4 &amp;gt; (int)skb-&amp;gt;len)
		return;

	/* All current transport protocols have the port numbers in the
	 * first four bytes of the transport header and this function is
	 * written with this assumption in mind.
	 */

	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = iph-&amp;gt;daddr;
	sin.sin_port = ports[1];
	memset(sin.sin_zero, 0, sizeof(sin.sin_zero));

	put_cmsg(msg, SOL_IP, IP_ORIGDSTADDR, sizeof(sin), &amp;amp;sin);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;2&quot;&gt;
&lt;li&gt;Use &lt;code&gt;recvmsg()&lt;/code&gt; to read the packet and its ancillary data&lt;/li&gt;
&lt;li&gt;The ancillary data with level SOL_IP and type IP_ORIGDSTADDR contains the original destination address&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Complete example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;#include &amp;lt;arpa/inet.h&amp;gt;
#include &amp;lt;netinet/in.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
#include &amp;lt;sys/socket.h&amp;gt;
#include &amp;lt;sys/types.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;

#define MAX_BUF_SIZE 1024
#define SRC_ADDR INADDR_ANY
#define SRC_PORT 9999

int main() {
  int sockfd;
  struct sockaddr_in bind_addr, client_addr;
  char buffer[MAX_BUF_SIZE];

  if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) &amp;lt; 0) {
    perror(&amp;quot;socket&amp;quot;);
    exit(EXIT_FAILURE);
  }

  int opt = 1;
  if (setsockopt(sockfd, SOL_IP, IP_TRANSPARENT, &amp;amp;opt, sizeof(opt)) &amp;lt; 0) {
    perror(&amp;quot;IP_TRANSPARENT&amp;quot;);
    exit(EXIT_FAILURE);
  }

  // bind
  memset(&amp;amp;bind_addr, 0, sizeof(bind_addr));
  bind_addr.sin_family = AF_INET;
  bind_addr.sin_addr.s_addr = htonl(SRC_ADDR);
  bind_addr.sin_port = htons(SRC_PORT);
  if (bind(sockfd, (struct sockaddr *)&amp;amp;bind_addr, sizeof(bind_addr)) &amp;lt; 0) {
    perror(&amp;quot;bind&amp;quot;);
    exit(EXIT_FAILURE);
  }

  // recvmsg
  if (setsockopt(sockfd, SOL_IP, IP_RECVORIGDSTADDR, &amp;amp;opt, sizeof(opt)) &amp;lt; 0) {
    perror(&amp;quot;IP_RECVORIGDSTADDR&amp;quot;);
    exit(EXIT_FAILURE);
  }
  while (1) {
    memset(buffer, 0, sizeof(buffer));
    struct msghdr msgh = {0};
    struct iovec iov[1];
    iov[0].iov_base = buffer;
    iov[0].iov_len = sizeof(buffer);
    msgh.msg_iov = iov;
    msgh.msg_iovlen = 1;
    msgh.msg_name = &amp;amp;client_addr;
    msgh.msg_namelen = sizeof(client_addr);
    char cmsgbuf[CMSG_SPACE(sizeof(struct sockaddr_in))]; // must fit a sockaddr_in, not just an int
    msgh.msg_control = cmsgbuf;
    msgh.msg_controllen = sizeof(cmsgbuf);
    if (recvmsg(sockfd, &amp;amp;msgh, 0) &amp;lt; 0) {
      perror(&amp;quot;recvmsg&amp;quot;);
      continue;
    }

    struct cmsghdr *cmsg;
    for (cmsg = CMSG_FIRSTHDR(&amp;amp;msgh); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&amp;amp;msgh, cmsg)) {
      if (cmsg-&amp;gt;cmsg_level == IPPROTO_IP &amp;amp;&amp;amp; cmsg-&amp;gt;cmsg_type == IP_ORIGDSTADDR) {
        struct sockaddr_in *addr = (struct sockaddr_in *)CMSG_DATA(cmsg);
        printf(&amp;quot;Original DST ADDR: %s\n&amp;quot;, inet_ntoa(addr-&amp;gt;sin_addr));
        break;
      }
    }
    printf(&amp;quot;Data: %s\n&amp;quot;, buffer);
  }

  close(sockfd);

  return 0;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://docs.kernel.org/networking/tproxy.html&quot;&gt;Official documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://powerdns.org/tproxydoc/tproxy.md.html&quot;&gt;Linux transparent proxy support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://blog.mmf.moe/post/tproxy-investigation/&quot;&gt;TProxy Investigation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://blog.cloudflare.com/how-we-built-spectrum/#revealingthemagictrick&quot;&gt;Abusing Linux&apos;s firewall: the hack that allowed us to build Spectrum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://ovear.info/post/509&quot;&gt;What is the socket module in iptables-extensions?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://vvl.me/2018/06/from-ss-redir-to-linux-nat/&quot;&gt;From ss-redir&apos;s Implementation to Linux NAT&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://arthurchiao.art/blog/linux-net-stack-implementation-rx-zh&quot;&gt;Linux Networking Stack: Receiving Data (RX): Principles and Kernel Implementation (2022)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;https://stackoverflow.com/a/5814636/12812480&lt;/li&gt;
&lt;li&gt;https://stackoverflow.com/a/44206723/12812480&lt;/li&gt;
&lt;li&gt;https://github.com/kristrev/tproxy-example&lt;/li&gt;
&lt;li&gt;https://github.com/KatelynHaworth/go-tproxy&lt;/li&gt;
&lt;/ul&gt;
</content:encoded></item><item><title>Hijacking Golang Compilation</title><link>https://rook1e.com/en/posts/go-build-hijacking/</link><guid isPermaLink="true">https://rook1e.com/en/posts/go-build-hijacking/</guid><description>This article briefly analyzes the Go compilation process and demonstrates a build hijacking attack based on the go build --toolexec mechanism.</description><pubDate>Wed, 03 Nov 2021 08:22:46 GMT</pubDate><content:encoded>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;a href=&quot;https://paper.seebug.org/1749/&quot;&gt;Seebug&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A while ago, I studied 0x7F&apos;s &amp;quot;&lt;a href=&quot;https://paper.seebug.org/1713/&quot;&gt;DLL Hijacking and Its Applications&lt;/a&gt;&amp;quot;, which mentioned using DLL hijacking to hijack compilers for supply chain attacks. This reminded me that certain mechanisms in Go could also be leveraged to achieve build hijacking, so I did some research and testing.&lt;/p&gt;
&lt;h2&gt;The Compilation Process&lt;/h2&gt;
&lt;p&gt;First, let&apos;s understand what &lt;code&gt;go build&lt;/code&gt; does.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

func main() {
	print(&amp;quot;i&apos;m testapp!&amp;quot;)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using this simple program as an example, &lt;code&gt;go build -x main.go&lt;/code&gt; compiles and prints the compilation process (due to space constraints, the most basic dependencies are not force-recompiled):&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/14/build-cmd.png&quot; alt=&quot;go build cmd&quot;&gt;&lt;/p&gt;
&lt;p&gt;The above commands can be summarized as:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create a temporary directory&lt;/li&gt;
&lt;li&gt;Generate configuration files needed by compile, run compile to produce &lt;a href=&quot;https://en.wikipedia.org/wiki/Object_file&quot;&gt;object files&lt;/a&gt; &lt;code&gt;***.a&lt;/code&gt; (other build tools perform similar operations)&lt;/li&gt;
&lt;li&gt;Write the build ID&lt;/li&gt;
&lt;li&gt;Repeat steps 2 and 3 to compile all dependencies&lt;/li&gt;
&lt;li&gt;Generate configuration files needed by link, run link to combine the object files into an executable&lt;/li&gt;
&lt;li&gt;Write the build ID&lt;/li&gt;
&lt;li&gt;Move the linked executable to the current directory and delete the temporary directory&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A few interesting things can be observed from these commands.&lt;/p&gt;
&lt;p&gt;Each compilation stage is handled by a separate &lt;a href=&quot;https://pkg.go.dev/cmd&quot;&gt;tool program&lt;/a&gt;, such as compile, link, and asm. These tool programs can be accessed via &lt;code&gt;go tool&lt;/code&gt;, and I&apos;ll refer to the ones used for compilation as build tools.&lt;/p&gt;
&lt;p&gt;The commands contain large sections of &lt;code&gt;packagefile xxx/xxx=xxx.a&lt;/code&gt; entries that specify the mapping between code dependencies and object files. These mappings are written into &lt;code&gt;importcfg/importcfg.link&lt;/code&gt; as configuration files for compile/link.&lt;/p&gt;
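&lt;p&gt;For reference, an importcfg is just a list of directives mapping import paths to object files; a minimal sketch might look like this (the paths are illustrative, not taken from a real build):&lt;/p&gt;

```text
# import config
packagefile fmt=$WORK/b002/_pkg_.a
packagefile os/exec=$WORK/b031/_pkg_.a
packagefile runtime=$WORK/b009/_pkg_.a
```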
&lt;p&gt;Additionally, temporary directories in the form of &lt;code&gt;$WORK/b001&lt;/code&gt; are created. Before running the build tools, &lt;code&gt;go build&lt;/code&gt; resolves all dependency relationships, creates corresponding actions for each package based on those dependencies, and ultimately forms an action graph. Executing these actions in order completes the compilation, with each action corresponding to a temporary directory. For example, compiling a program with &lt;code&gt;go build -a -work&lt;/code&gt; (&lt;code&gt;-a&lt;/code&gt; forces recompilation of everything, &lt;code&gt;-work&lt;/code&gt; preserves temporary directories):&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/14/build-temp.png&quot; alt=&quot;build temp&quot;&gt;&lt;/p&gt;
&lt;p&gt;The figure shows the temporary directories used by each action. For instance, b062 contains the compilation configuration file &lt;code&gt;importcfg&lt;/code&gt; and the compiled object file &lt;code&gt;_pkg_.a&lt;/code&gt;, while the last action&apos;s directory b001 contains not only compilation artifacts but also the link configuration &lt;code&gt;importcfg.link&lt;/code&gt; and the link result &lt;code&gt;exe/a.out&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In summary, here are the key takeaways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The main work of &lt;code&gt;go build&lt;/code&gt;: analyze dependencies, compile source code into object files, and link object files into an executable&lt;/li&gt;
&lt;li&gt;Object files and configuration files are stored in temporary directories (b001 is the last one and where the executable is produced); temporary directories can be preserved with the &lt;code&gt;-work&lt;/code&gt; flag&lt;/li&gt;
&lt;li&gt;Build tools are invoked to handle different stages of compilation&lt;/li&gt;
&lt;li&gt;Later actions depend on the results of earlier actions&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The compilation process is quite &amp;quot;decentralized,&amp;quot; which creates opportunities for us:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The build tools are &lt;a href=&quot;https://github.com/golang/go/tree/master/src/cmd&quot;&gt;open source&lt;/a&gt;, so they can be modified and replaced in the &lt;code&gt;go env GOTOOLDIR&lt;/code&gt; directory&lt;/li&gt;
&lt;li&gt;Leveraging the &lt;code&gt;go build -toolexec&lt;/code&gt; mechanism&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Both approaches share a similar idea. This article explores the second approach.&lt;/p&gt;
&lt;h2&gt;Hijacking the Build&lt;/h2&gt;
&lt;p&gt;While researching &lt;a href=&quot;https://paper.seebug.org/1586/&quot;&gt;code obfuscation&lt;/a&gt; some time ago, I learned about the &lt;code&gt;-toolexec&lt;/code&gt; mechanism of &lt;code&gt;go build&lt;/code&gt;. Here&apos;s the relevant excerpt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Keen readers may have noticed an interesting detail: the actual target in the assembled command is not the build tool itself, but &lt;code&gt;cfg.BuildToolexec&lt;/code&gt;. Tracing back to its definition reveals that it&apos;s set by the &lt;code&gt;go build -toolexec&lt;/code&gt; parameter. The official description is:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;-toolexec &apos;cmd args&apos;
a program to use to invoke toolchain programs like vet and asm.
For example, instead of running asm, the go command will run
  &apos;cmd args /path/to/asm &amp;lt;arguments for asm&amp;gt;&apos;.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That is, the program specified by &lt;code&gt;-toolexec&lt;/code&gt; is used to run the build tools. This can essentially be seen as a hook mechanism — by specifying our own program with this parameter, we can invoke the build tools through it during compilation, thereby intervening in the build process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So our goal is to implement a tool similar to garble, which I&apos;ll call a wrapper. By inserting &lt;code&gt;-toolexec &amp;quot;/path/to/wrapper&amp;quot;&lt;/code&gt; into the project&apos;s build script or wherever build commands exist, the wrapper will find a suitable location (tentatively the top of &lt;code&gt;main.main()&lt;/code&gt;) to insert the payload when the build command runs.&lt;/p&gt;
&lt;p&gt;First, we need to locate the target source file.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;/path/to/wrapper /opt/homebrew/Cellar/go/1.17.2/libexec/pkg/tool/darwin_arm64/compile -o $WORK/b042/_pkg_.a -trimpath &amp;quot;$WORK/b042=&amp;gt;&amp;quot; -shared -p strings -std -complete -buildid ygbMG98G6g0UHH5pai26/ygbMG98G6g0UHH5pai26 -goversion go1.17.2 -importcfg $WORK/b042/importcfg -pack /opt/homebrew/Cellar/go/1.17.2/libexec/src/strings/builder.go /opt/homebrew/Cellar/go/1.17.2/libexec/src/strings/compare.go
...(omitted)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is a command executed by &lt;code&gt;go build -toolexec &amp;quot;/path/to/wrapper&amp;quot;&lt;/code&gt;, where the target source file paths for compile are appended at the end. After extracting the file paths, we determine whether a file contains &lt;code&gt;main.main()&lt;/code&gt; based on its content. There are many ways to do this — for instance, simply checking if it starts with &lt;code&gt;package main&lt;/code&gt; and contains &lt;code&gt;func main(){&lt;/code&gt;, or more rigorously by parsing the AST and checking the following characteristics:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/14/main-ast.png&quot; alt=&quot;main.main() ast&quot;&gt;&lt;/p&gt;
&lt;p&gt;Since all files in a single compile command belong to the same package, we can skip the remaining files as soon as one doesn&apos;t meet the criteria.&lt;/p&gt;
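&lt;p&gt;The wrapper&apos;s first step boils down to plain argument inspection (a sketch with hypothetical helper names; a real wrapper would afterwards exec the original command): the first argument is the path to the real tool, and the trailing &lt;code&gt;.go&lt;/code&gt; arguments are the compile targets.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isCompile reports whether a toolexec invocation is running the
// compile tool (args[0] is the path to the real tool binary).
func isCompile(args []string) bool {
	return len(args) > 0 && filepath.Base(args[0]) == "compile"
}

// goFiles extracts the .go source files passed to compile.
func goFiles(args []string) []string {
	var files []string
	for _, a := range args[1:] {
		if strings.HasSuffix(a, ".go") {
			files = append(files, a)
		}
	}
	return files
}

func main() {
	// Example argv as the wrapper would receive it via os.Args[1:].
	args := []string{
		"/usr/local/go/pkg/tool/linux_amd64/compile",
		"-o", "/tmp/b001/_pkg_.a", "-p", "main",
		"-importcfg", "/tmp/b001/importcfg", "-pack",
		"main.go",
	}
	fmt.Println(isCompile(args), goFiles(args))
	// After inspection, a real wrapper would run the original
	// command, e.g. exec.Command(args[0], args[1:]...).
}
```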
&lt;p&gt;In summary, the first step filters by the following conditions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The invoked tool is compile&lt;/li&gt;
&lt;li&gt;The file has a &lt;code&gt;.go&lt;/code&gt; extension&lt;/li&gt;
&lt;li&gt;The AST shows the package name is main, and there exists an &lt;code&gt;ast.FuncDecl&lt;/code&gt; named main in Decls&lt;/li&gt;
&lt;/ol&gt;
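&lt;p&gt;Condition 3 can be checked with &lt;code&gt;go/parser&lt;/code&gt; and &lt;code&gt;go/ast&lt;/code&gt; (a sketch; the helper name &lt;code&gt;hasMainMain&lt;/code&gt; is made up for illustration):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// hasMainMain reports whether src declares package main with a
// top-level func main() that has no receiver.
func hasMainMain(src string) bool {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil || f.Name.Name != "main" {
		return false
	}
	for _, decl := range f.Decls {
		if fd, ok := decl.(*ast.FuncDecl); ok &&
			fd.Name.Name == "main" && fd.Recv == nil {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasMainMain("package main\nfunc main() {}"))   // true
	fmt.Println(hasMainMain("package util\nfunc main() {}"))   // false
	fmt.Println(hasMainMain("package main\nfunc helper() {}")) // false
}
```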
&lt;p&gt;With the target source file located, the next step is to insert the payload by modifying the AST.&lt;/p&gt;
&lt;p&gt;Based on the AST diagram from the previous step, each statement in &lt;code&gt;main()&lt;/code&gt; is parsed as an &lt;code&gt;ast.Stmt&lt;/code&gt; interface type, stored in &lt;code&gt;Body.List&lt;/code&gt;. So we construct AST nodes following the format of concrete statements, such as:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;var cmd = `exec.Command(&amp;quot;open&amp;quot;, &amp;quot;/System/Applications/Calculator.app&amp;quot;).Run()`
payloadExpr, err := parser.ParseExpr(cmd)
// handle err
payloadExprStmt := &amp;amp;ast.ExprStmt{
  X: payloadExpr,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Insert the payload node into &lt;code&gt;main()&lt;/code&gt;&apos;s &lt;code&gt;Body.List&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Method 1
ast.Inspect(f, func(n ast.Node) bool {
  switch x := n.(type) {
  case *ast.FuncDecl:
    if x.Name.Name == &amp;quot;main&amp;quot; &amp;amp;&amp;amp; x.Recv == nil {
      stmts := make([]ast.Stmt, 0, len(x.Body.List)+1)
      stmts = append(stmts, payloadExprStmt)
      stmts = append(stmts, x.Body.List...)
      x.Body.List = stmts
      return false
    }
  }
  return true
})

// Method 2
pre := func(cursor *astutil.Cursor) bool {
  switch cursor.Node().(type) {
  case *ast.FuncDecl:
    if fd := cursor.Node().(*ast.FuncDecl); fd.Name.Name == &amp;quot;main&amp;quot; &amp;amp;&amp;amp; fd.Recv == nil {
      return true
    }
    return false
  case *ast.BlockStmt:
    return true
  case ast.Stmt:
    if _, ok := cursor.Parent().(*ast.BlockStmt); ok {
      cursor.InsertBefore(payloadExprStmt)
    }
  }
  return true
}
post := func(cursor *astutil.Cursor) bool {
  if _, ok := cursor.Parent().(*ast.BlockStmt); ok {
    return false
  }
  return true
}
f = astutil.Apply(f, pre, post).(*ast.File)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, save the modified AST as a file, replace the file path in the original compile command, and execute the command.&lt;/p&gt;
&lt;p&gt;Simple enough — it seems like everything works smoothly at this point. However, testing reveals an error: &lt;code&gt;os/exec&lt;/code&gt; cannot be found:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;/var/folders/z5/1_qfr0f55x97c63p412hprzw0000gn/T/gobuild_cache_1747406166/main.go:5:2: could not import &amp;quot;os/exec&amp;quot;: open : no such file or directory
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Recall the &amp;quot;Compilation Process&amp;quot; section above: both the compilation and linking stages require the object files that were compiled earlier for their dependencies. Moreover, the dependency analysis and action graph construction are completed by &lt;code&gt;go build&lt;/code&gt; before running the build tools and cannot be hijacked via &lt;code&gt;-toolexec&lt;/code&gt;. So inserting a dependency into the AST&apos;s import nodes doesn&apos;t modify the existing dependency relationships or action graph, meaning there&apos;s no object file available for &lt;code&gt;os/exec&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Since the action graph is missing &lt;code&gt;os/exec&lt;/code&gt; and its dependencies, we can complete the missing actions ourselves — that is, compile the corresponding object files and add them to importcfg.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/14/exec-importcfg-diff.png&quot; alt=&quot;exec-package-diff&quot;&gt;&lt;/p&gt;
&lt;p&gt;Comparing the importcfg files reveals that there are more transitive dependencies than expected. Fortunately, they&apos;re all recorded in importcfg, so we run a separate &lt;code&gt;go build&lt;/code&gt; on a simplified payload program:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import &amp;quot;os/exec&amp;quot;

func main() {
	exec.Command(&amp;quot;xxx&amp;quot;).Run()
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By adding the &lt;code&gt;-work&lt;/code&gt; flag to preserve the temporary directory from this build, we can read the importcfg in temporary directory b001 to obtain the object file paths for &lt;code&gt;os/exec&lt;/code&gt;&apos;s dependencies, and then append these configuration entries to the original importcfg as needed.&lt;/p&gt;
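&lt;p&gt;Supplementing the importcfg can be sketched as simple line merging (the helper &lt;code&gt;mergeImportcfg&lt;/code&gt; is hypothetical and the paths illustrative): append any &lt;code&gt;packagefile&lt;/code&gt; entries from the payload build&apos;s importcfg that the original one lacks.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// mergeImportcfg appends "packagefile" entries from extra that are
// missing from orig, keyed by import path (the part before '=').
func mergeImportcfg(orig, extra string) string {
	have := map[string]bool{}
	for _, line := range strings.Split(orig, "\n") {
		if strings.HasPrefix(line, "packagefile ") {
			path, _, _ := strings.Cut(strings.TrimPrefix(line, "packagefile "), "=")
			have[path] = true
		}
	}
	out := strings.TrimRight(orig, "\n")
	for _, line := range strings.Split(extra, "\n") {
		if !strings.HasPrefix(line, "packagefile ") {
			continue
		}
		path, _, _ := strings.Cut(strings.TrimPrefix(line, "packagefile "), "=")
		if !have[path] {
			out += "\n" + line
			have[path] = true
		}
	}
	return out + "\n"
}

func main() {
	orig := "# import config\npackagefile fmt=/tmp/b002/_pkg_.a\n"
	extra := "packagefile fmt=/tmp/c002/_pkg_.a\npackagefile os/exec=/tmp/c031/_pkg_.a\n"
	fmt.Print(mergeImportcfg(orig, extra))
}
```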
&lt;p&gt;Trying again, we can see the payload is successfully inserted.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/14/toolexec.gif&quot; alt=&quot;wrapper demo&quot;&gt;&lt;/p&gt;
&lt;p&gt;Additionally, you may notice that the tests above all use the &lt;code&gt;-a&lt;/code&gt; flag. This is because &lt;code&gt;go build&lt;/code&gt; has caching and incremental compilation mechanisms — a normal &lt;code&gt;go build&lt;/code&gt; might hit the cache and not invoke the tools at all. So we need to add the &lt;code&gt;-a&lt;/code&gt; flag to force recompilation of all dependencies, or run &lt;code&gt;go clean -cache&lt;/code&gt; before building to clear the cache, or change the GOCACHE environment variable to point to a new directory.&lt;/p&gt;
&lt;p&gt;Finally, let&apos;s recap the steps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;During compile:
&lt;ol&gt;
&lt;li&gt;Locate the target file&lt;/li&gt;
&lt;li&gt;Compile a simplified payload to obtain importcfg and its dependency artifacts&lt;/li&gt;
&lt;li&gt;Supplement &lt;code&gt;importcfg&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Insert the payload into the AST and save to a temporary file&lt;/li&gt;
&lt;li&gt;Modify the file path in the original compile command and execute it&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;During link:
&lt;ol&gt;
&lt;li&gt;Locate the target file&lt;/li&gt;
&lt;li&gt;Supplement &lt;code&gt;importcfg.link&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Execute the link command&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The approach demonstrated in this article leverages the &lt;code&gt;-toolexec&lt;/code&gt; mechanism of &lt;code&gt;go build&lt;/code&gt; to let a tool intervene in the compilation process and insert a payload into temporary files.&lt;/p&gt;
&lt;p&gt;From a practical standpoint, many challenges remain — for example, how to covertly insert &lt;code&gt;-toolexec&lt;/code&gt; and &lt;code&gt;-a&lt;/code&gt; into build scripts. Without suitable camouflage techniques, directly modifying or replacing the build tools &lt;code&gt;compile&lt;/code&gt; and &lt;code&gt;link&lt;/code&gt; themselves, following the approach described in this article, might be a better choice.&lt;/p&gt;
&lt;p&gt;The code related to this article is available at &lt;a href=&quot;https://github.com/0x2E/go-build-hijacking&quot;&gt;go-build-hijacking&lt;/a&gt;. I&apos;ll continue to add improvements as new ideas come up. Feel free to reach out via issues or email.&lt;/p&gt;
&lt;h2&gt;Ref&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://maori.geek.nz/how-go-build-works-750bb2ba6d8e&quot;&gt;How &amp;quot;go build&amp;quot; Works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://hao.io/2020/01/golang%E7%BC%96%E8%AF%91%E5%99%A8%E6%BC%AB%E8%B0%88%EF%BC%881%EF%BC%89%E7%BC%96%E8%AF%91%E5%99%A8%E5%92%8C%E8%BF%9E%E6%8E%A5%E5%99%A8&quot;&gt;golang Compiler Essentials (1): Compiler and Linker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://xiaomi-info.github.io/2019/11/13/golang-compiler-principle/&quot;&gt;Inside Golang&apos;s Compiler Principles&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded></item><item><title>A First Look at Golang Code Obfuscation</title><link>https://rook1e.com/en/posts/golang-obfuscation/</link><guid isPermaLink="true">https://rook1e.com/en/posts/golang-obfuscation/</guid><description>This article explores Golang code obfuscation techniques by analyzing the implementation of the burrowers/garble project. Due to the scarcity of related resources, most of the content is based on source code analysis.</description><pubDate>Wed, 19 May 2021 07:22:36 GMT</pubDate><content:encoded>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;a href=&quot;https://paper.seebug.org/1586/&quot;&gt;Seebug&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In recent years, Golang has surged in popularity. Thanks to its excellent performance, high development efficiency, and cross-platform capabilities, it has been widely adopted in software development. While enjoying the conveniences Golang brings, developers also need to think about how to protect their code and increase the difficulty of reverse engineering.&lt;/p&gt;
&lt;p&gt;Due to mechanisms like reflection in Golang, a large amount of information such as file paths and function names must be packed into the binary. This information cannot be stripped, so we consider obfuscating the code to raise the bar for reverse engineering.&lt;/p&gt;
&lt;p&gt;This article primarily explores Golang code obfuscation techniques by analyzing the implementation of the &lt;a href=&quot;https://github.com/burrowers/garble&quot;&gt;burrowers/garble&lt;/a&gt; project. Due to the scarcity of related resources, most of the content here is based on source code analysis. If there are any errors, please feel free to point them out in the comments or via email.&lt;/p&gt;
&lt;h2&gt;Prerequisites&lt;/h2&gt;
&lt;h3&gt;The Compilation Process&lt;/h3&gt;
&lt;p&gt;Go&apos;s compilation process can be abstracted as:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Lexical analysis: converting a character sequence into a token sequence&lt;/li&gt;
&lt;li&gt;Syntax analysis: parsing tokens into an AST&lt;/li&gt;
&lt;li&gt;Type checking&lt;/li&gt;
&lt;li&gt;Generating intermediate code&lt;/li&gt;
&lt;li&gt;Generating machine code&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This article will not delve into compiler theory in detail. For further reading, I recommend &lt;a href=&quot;https://draveness.me/golang/docs/part1-prerequisite/ch02-compile/golang-compile-intro/&quot;&gt;Go Language Design and Implementation - Compilation Principles&lt;/a&gt; and &lt;a href=&quot;https://github.com/golang/go/tree/master/src/cmd/compile&quot;&gt;Introduction to the Go compiler&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let&apos;s explore the compilation process more intuitively from the source code perspective. The implementation of &lt;code&gt;go build&lt;/code&gt; is in &lt;code&gt;src/cmd/go/internal/work/build.go&lt;/code&gt;. Ignoring the handling of compiler type selection, environment information, etc., we focus only on the core part:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func runBuild(ctx context.Context, cmd *base.Command, args []string) {
	...
  var b Builder
  ...
  pkgs := load.PackagesAndErrors(ctx, args)
  ...
	a := &amp;amp;Action{Mode: &amp;quot;go build&amp;quot;}
	for _, p := range pkgs {
		a.Deps = append(a.Deps, b.AutoAction(ModeBuild, depMode, p))
	}
	...
	b.Do(ctx, a)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;Action&lt;/code&gt; struct here represents a single action. Each action has a description, an associated package, dependencies (Deps), and other information. All related actions together form an action graph.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// An Action represents a single action in the action graph.
type Action struct {
	Mode     string         // description of action operation
	Package  *load.Package  // the package this action works on
	Deps     []*Action      // actions that must happen before this one
	Func     func(*Builder, context.Context, *Action) error // the action itself (nil = no-op)
	...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After creating action &lt;code&gt;a&lt;/code&gt; as the &amp;quot;root vertex,&amp;quot; it iterates over the packages specified for compilation, creating an action for each one. This creation process is recursive — during creation, it analyzes each package&apos;s dependencies and creates actions for them as well. For example, the &lt;code&gt;src/cmd/go/internal/work/action.go (b *Builder) CompileAction&lt;/code&gt; method:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;for _, p1 := range p.Internal.Imports {
	a.Deps = append(a.Deps, b.CompileAction(depMode, depMode, p1))
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The final &lt;code&gt;a.Deps&lt;/code&gt; serves as the &amp;quot;starting points&amp;quot; of the action graph. Once the action graph is constructed, action &lt;code&gt;a&lt;/code&gt; is used as the &amp;quot;root&amp;quot; for a depth-first traversal, where dependent actions are sequentially added to the task queue and then executed concurrently via &lt;code&gt;action.Func&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Each type of action has a designated method for its &lt;code&gt;Func&lt;/code&gt;, which is the core part of the action. For example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;a := &amp;amp;Action{
  Mode: &amp;quot;build&amp;quot;,
  Func: (*Builder).build,
  ...
}

a := &amp;amp;Action{
  Mode: &amp;quot;link&amp;quot;,
  Func: (*Builder).link,
  ...
}
...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Digging further, you&apos;ll find that aside from some necessary preprocessing, &lt;code&gt;(*Builder).link&lt;/code&gt; calls the &lt;code&gt;BuildToolchain.ld&lt;/code&gt; method, and &lt;code&gt;(*Builder).build&lt;/code&gt; calls methods like &lt;code&gt;BuildToolchain.symabis&lt;/code&gt;, &lt;code&gt;BuildToolchain.gc&lt;/code&gt;, &lt;code&gt;BuildToolchain.asm&lt;/code&gt;, and &lt;code&gt;BuildToolchain.pack&lt;/code&gt; to implement the core functionality. &lt;code&gt;BuildToolchain&lt;/code&gt; is of the &lt;code&gt;toolchain&lt;/code&gt; interface type, which defines the following methods:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// src/cmd/go/internal/work/exec.go
type toolchain interface {
	// gc runs the compiler in a specific directory on a set of files
	// and returns the name of the generated output file.
	gc(b *Builder, a *Action, archive string, importcfg, embedcfg []byte, symabis string, asmhdr bool, gofiles []string) (ofile string, out []byte, err error)
	// cc runs the toolchain&apos;s C compiler in a directory on a C file
	// to produce an output file.
	cc(b *Builder, a *Action, ofile, cfile string) error
	// asm runs the assembler in a specific directory on specific files
	// and returns a list of named output files.
	asm(b *Builder, a *Action, sfiles []string) ([]string, error)
	// symabis scans the symbol ABIs from sfiles and returns the
	// path to the output symbol ABIs file, or &amp;quot;&amp;quot; if none.
	symabis(b *Builder, a *Action, sfiles []string) (string, error)
	// pack runs the archive packer in a specific directory to create
	// an archive from a set of object files.
	// typically it is run in the object directory.
	pack(b *Builder, a *Action, afile string, ofiles []string) error
	// ld runs the linker to create an executable starting at mainpkg.
	ld(b *Builder, root *Action, out, importcfg, mainpkg string) error
	// ldShared runs the linker to create a shared library containing the pkgs built by toplevelactions
	ldShared(b *Builder, root *Action, toplevelactions []*Action, out, importcfg string, allactions []*Action) error

	compiler() string
	linker() string
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Go implements this interface separately for the gc and gccgo compilers. &lt;code&gt;go build&lt;/code&gt; selects between them during program initialization:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func init() {
	switch build.Default.Compiler {
	case &amp;quot;gc&amp;quot;, &amp;quot;gccgo&amp;quot;:
		buildCompiler{}.Set(build.Default.Compiler)
	}
}

func (c buildCompiler) Set(value string) error {
	switch value {
	case &amp;quot;gc&amp;quot;:
		BuildToolchain = gcToolchain{}
	case &amp;quot;gccgo&amp;quot;:
		BuildToolchain = gccgoToolchain{}
  ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here we only look at the gc compiler portion in &lt;code&gt;src/cmd/go/internal/work/gc.go&lt;/code&gt;. Taking the &lt;code&gt;gc&lt;/code&gt; method as an example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func (gcToolchain) gc(b *Builder, a *Action, archive string, importcfg, embedcfg []byte, symabis string, asmhdr bool, gofiles []string) (ofile string, output []byte, err error) {
	// ...
	// Assemble arguments
	// ...

	args := []interface{}{cfg.BuildToolexec, base.Tool(&amp;quot;compile&amp;quot;), &amp;quot;-o&amp;quot;, ofile, &amp;quot;-trimpath&amp;quot;, a.trimpath(), gcflags, gcargs, &amp;quot;-D&amp;quot;, p.Internal.LocalPrefix}

	// ...

	output, err = b.runOut(a, base.Cwd, nil, args...)
	return ofile, output, err
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;At a high level, the &lt;code&gt;gc&lt;/code&gt; method doesn&apos;t actually perform the compilation work itself. Its main role is to assemble a command that invokes the binary located at &lt;code&gt;base.Tool(&amp;quot;compile&amp;quot;)&lt;/code&gt;. These programs can be called Go compilation tools, located in the &lt;code&gt;pkg/tool&lt;/code&gt; directory with source code in &lt;code&gt;src/cmd&lt;/code&gt;. Similarly, the other methods also call their respective compilation tools to perform the actual compilation work.&lt;/p&gt;
&lt;p&gt;Attentive readers may notice an interesting detail: the actual executable in the assembled command is not the compilation tool itself, but &lt;code&gt;cfg.BuildToolexec&lt;/code&gt;. Tracing this to its definition reveals it is set by the &lt;code&gt;go build -toolexec&lt;/code&gt; flag. The official description is:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;-toolexec &apos;cmd args&apos;
  a program to use to invoke toolchain programs like vet and asm.
  For example, instead of running asm, the go command will run
  &apos;cmd args /path/to/asm &amp;lt;arguments for asm&amp;gt;&apos;.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In other words, &lt;code&gt;-toolexec&lt;/code&gt; specifies a program to run the compilation tools. This can be thought of as a hook mechanism — by using this flag to specify our own program, we can intervene in the compilation process by having our program invoke the compilation tools. The garble project analyzed below uses exactly this approach. Here&apos;s a command excerpt from the compilation process (&lt;code&gt;go build -n&lt;/code&gt; outputs the executed commands) to help illustrate. For example, if we specify &lt;code&gt;-toolexec=/home/atom/go/bin/garble&lt;/code&gt;, then the actual command executed during compilation is:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;/home/atom/go/bin/garble /usr/local/go/pkg/tool/linux_amd64/compile -o $WORK/b016/_pkg_.a -trimpath &amp;quot;/usr/local/go/src/sync=&amp;gt;sync;$WORK/b016=&amp;gt;&amp;quot; -p sync -std -buildid FRNt7EHDh77qHujLKnmK/FRNt7EHDh77qHujLKnmK -goversion go1.16.4 -D &amp;quot;&amp;quot; -importcfg $WORK/b016/importcfg -pack -c=4 /usr/local/go/src/sync/cond.go /usr/local/go/src/sync/map.go /usr/local/go/src/sync/mutex.go /usr/local/go/src/sync/once.go /usr/local/go/src/sync/pool.go /usr/local/go/src/sync/poolqueue.go /usr/local/go/src/sync/runtime.go /usr/local/go/src/sync/runtime2.go /usr/local/go/src/sync/rwmutex.go /usr/local/go/src/sync/waitgroup.go
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To summarize, &lt;code&gt;go build&lt;/code&gt; invokes compilation tools like &lt;code&gt;compile&lt;/code&gt; by assembling commands, and we can use the &lt;code&gt;go build -toolexec&lt;/code&gt; flag to specify a program that &amp;quot;intervenes&amp;quot; in the compilation process.&lt;/p&gt;
&lt;h3&gt;go/ast&lt;/h3&gt;
&lt;p&gt;In Golang, AST types and methods are defined by the &lt;code&gt;go/ast&lt;/code&gt; standard library. The garble project analyzed later involves extensive type assertions and type switches with &lt;code&gt;go/ast&lt;/code&gt;, so it&apos;s important to have a general understanding of these types. Most types are defined in &lt;code&gt;src/go/ast/ast.go&lt;/code&gt;, where the comments are quite detailed. For convenience, I&apos;ve put together a relationship diagram. The branches in the diagram represent inheritance relationships, and all types are based on the &lt;code&gt;Node&lt;/code&gt; interface:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/go-ast-types.png&quot; alt=&quot;go/ast types&quot;&gt;&lt;/p&gt;
&lt;p&gt;This article doesn&apos;t intend to dive deep into ASTs, but I believe a basic understanding should be sufficient for the rest of this article. If you find it difficult to follow, I recommend reading &lt;a href=&quot;https://github.com/chai2010/go-ast-book/&quot;&gt;Introduction to Go Syntax Trees — A Journey into Building Your Own Programming Language and Compiler!&lt;/a&gt; to fill in any gaps, or using the online tool &lt;a href=&quot;https://yuroyoro.github.io/goast-viewer/index.html&quot;&gt;goast-viewer&lt;/a&gt; to visualize ASTs for analysis.&lt;/p&gt;
&lt;h2&gt;Tool Analysis&lt;/h2&gt;
&lt;p&gt;Among open-source Go code obfuscation projects, the two with the most stars are &lt;a href=&quot;https://github.com/burrowers/garble&quot;&gt;burrowers/garble&lt;/a&gt; and &lt;a href=&quot;https://github.com/unixpickle/gobfuscate&quot;&gt;unixpickle/gobfuscate&lt;/a&gt;. The former has more up-to-date features, so this article primarily analyzes garble, version &lt;a href=&quot;https://github.com/burrowers/garble/tree/8edde922ee5189f1d049edb9487e6090dd9d45bd&quot;&gt;8edde922ee5189f1d049edb9487e6090dd9d45bd&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;Features&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Supports modules, Go 1.16+&lt;/li&gt;
&lt;li&gt;Does not handle the following cases:
&lt;ul&gt;
&lt;li&gt;CGO&lt;/li&gt;
&lt;li&gt;Items marked as &lt;code&gt;ignoreObjects&lt;/code&gt;:
&lt;ul&gt;
&lt;li&gt;Types of arguments passed to &lt;code&gt;reflect.ValueOf&lt;/code&gt; or &lt;code&gt;reflect.TypeOf&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Functions used in &lt;code&gt;go:linkname&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Exported methods&lt;/li&gt;
&lt;li&gt;Types and variables imported from unobfuscated packages&lt;/li&gt;
&lt;li&gt;Constants&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The runtime package and its dependencies (&lt;a href=&quot;https://github.com/burrowers/garble/issues/193&quot;&gt;support obfuscating the runtime package #193&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Go plugins&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Hashes the names of eligible packages, functions, variables, types, etc.&lt;/li&gt;
&lt;li&gt;Replaces strings with anonymous functions&lt;/li&gt;
&lt;li&gt;Removes debug information and symbol tables&lt;/li&gt;
&lt;li&gt;Can output obfuscated Go code via the &lt;code&gt;-debugdir&lt;/code&gt; option&lt;/li&gt;
&lt;li&gt;Can specify different seeds to produce different obfuscation results&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At a high level, garble can be divided into two modes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Active mode&lt;/strong&gt;: When the first command argument matches one of garble&apos;s presets, it means garble was invoked directly by the user. In this phase, it configures settings based on arguments, retrieves dependency package information, and then persists the configuration. If the command is &lt;code&gt;build&lt;/code&gt; or &lt;code&gt;test&lt;/code&gt;, it adds &lt;code&gt;-toolexec=path/to/garble&lt;/code&gt; to set itself as the launcher for compilation tools, leading to launcher mode.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Launcher mode&lt;/strong&gt;: It &amp;quot;intercepts&amp;quot; the three tools — compile/asm/link — performing source code obfuscation and modifying runtime arguments before the compilation tools run, then finally runs the tools to compile the obfuscated code.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Fetching and modifying arguments takes up a significant amount of code. For easier analysis, later sections will gloss over these details. Interested readers can consult the official documentation to learn about each argument&apos;s purpose.&lt;/p&gt;
&lt;h3&gt;Constructing the Target List&lt;/h3&gt;
&lt;p&gt;The target list is constructed in active mode. Here&apos;s an excerpt of the key code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// listedPackage contains the &apos;go list -json -export&apos; fields obtained by the
// root process, shared with all garble sub-processes via a file.
type listedPackage struct {
	Name       string
	ImportPath string
	ForTest    string
	Export     string
	BuildID    string
	Deps       []string
	ImportMap  map[string]string
	Standard   bool

	Dir     string
	GoFiles []string

	// The fields below are not part of &apos;go list&apos;, but are still reused
	// between garble processes. Use &amp;quot;Garble&amp;quot; as a prefix to ensure no
	// collisions with the JSON fields from &apos;go list&apos;.

	GarbleActionID []byte

	Private bool
}

func setListedPackages(patterns []string) error {
  args := []string{&amp;quot;list&amp;quot;, &amp;quot;-json&amp;quot;, &amp;quot;-deps&amp;quot;, &amp;quot;-export&amp;quot;, &amp;quot;-trimpath&amp;quot;}
  args = append(args, cache.BuildFlags...)
  args = append(args, patterns...)
  cmd := exec.Command(&amp;quot;go&amp;quot;, args...)
  ...
  cache.ListedPackages = make(map[string]*listedPackage)
  for ...{
    var pkg listedPackage
    ...
    cache.ListedPackages[pkg.ImportPath] = &amp;amp;pkg
    ...
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The core mechanism uses the &lt;code&gt;go list&lt;/code&gt; command, where the &lt;code&gt;-deps&lt;/code&gt; flag is officially described as:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The -deps flag causes list to iterate over not just the named packages but also all their dependencies. It visits them in a depth-first post-order traversal, so that a package is listed only after all its dependencies. Packages not explicitly listed on the command line will have the DepOnly field set to true.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This traversal is actually quite similar to how &lt;code&gt;go build&lt;/code&gt; creates actions, as analyzed earlier. Through this command, garble can obtain all dependency information for the project (including transitive dependencies), iterating over and storing them in &lt;code&gt;cache.ListedPackages&lt;/code&gt;. Additionally, it marks whether each dependency package is under the &lt;code&gt;env.GOPRIVATE&lt;/code&gt; directory — only files under this directory will be obfuscated (with the exception that some parts of runtime are processed when the &lt;code&gt;-tiny&lt;/code&gt; flag is used). You can set the environment variable &lt;code&gt;GOPRIVATE=&amp;quot;*&amp;quot;&lt;/code&gt; to expand the scope for better obfuscation results. Regarding the scope of obfuscation, garble&apos;s author is also working on improvements: &lt;a href=&quot;https://github.com/burrowers/garble/issues/276&quot;&gt;idea: break away from GOPRIVATE? #276&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;At this point, the obfuscation targets have been identified. Along with some configuration-saving operations, the active mode&apos;s tasks are essentially complete, and it can then execute the assembled command, leading to launcher mode.&lt;/p&gt;
&lt;p&gt;In launcher mode, the three compilation tools — compile/asm/link — are intercepted to &amp;quot;intervene in the compilation process.&amp;quot; The quotes are intentional because garble doesn&apos;t actually perform any compilation work itself. Like &lt;code&gt;go build&lt;/code&gt;, it acts as a middleman, modifying source code or the arguments passed to the compilation tools, ultimately relying on these three tools to do the actual compilation. Let&apos;s analyze each one.&lt;/p&gt;
&lt;h3&gt;compile&lt;/h3&gt;
&lt;p&gt;The implementation is in the &lt;code&gt;main.go transformCompile&lt;/code&gt; function. Its main job is processing Go files and modifying command arguments. The &lt;code&gt;go build -n&lt;/code&gt; flag outputs the executed commands, and we can pass this flag when using garble to get a more intuitive view of the compilation process. Here&apos;s an excerpt:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;/home/atom/go/bin/garble /usr/local/go/pkg/tool/linux_amd64/compile -o $WORK/b016/_pkg_.a -trimpath &amp;quot;/usr/local/go/src/sync=&amp;gt;sync;$WORK/b016=&amp;gt;&amp;quot; -p sync -std -buildid FRNt7EHDh77qHujLKnmK/FRNt7EHDh77qHujLKnmK -goversion go1.16.4 -D &amp;quot;&amp;quot; -importcfg $WORK/b016/importcfg -pack -c=4 /usr/local/go/src/sync/cond.go /usr/local/go/src/sync/map.go /usr/local/go/src/sync/mutex.go /usr/local/go/src/sync/once.go /usr/local/go/src/sync/pool.go /usr/local/go/src/sync/poolqueue.go /usr/local/go/src/sync/runtime.go /usr/local/go/src/sync/runtime2.go /usr/local/go/src/sync/rwmutex.go /usr/local/go/src/sync/waitgroup.go
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This command uses the &lt;code&gt;compile&lt;/code&gt; tool to compile files like &lt;code&gt;cond.go&lt;/code&gt; into intermediate code. When garble detects that the current compilation tool is &lt;code&gt;compile&lt;/code&gt;, it &amp;quot;intercepts&amp;quot; it and performs obfuscation and other tasks before the tool runs. Let&apos;s analyze the key parts.&lt;/p&gt;
&lt;p&gt;First, the input Go files are parsed into ASTs:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;var files []*ast.File
for _, path := range paths {
  file, err := parser.ParseFile(fset, path, nil, parser.ParseComments)
  if err != nil {
    return nil, err
  }
  files = append(files, file)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then type checking is performed — this is also a step in normal compilation. If type checking fails, it means the files cannot be compiled successfully, and the program exits.&lt;/p&gt;
&lt;p&gt;Since the type names of nodes involved in reflection (&lt;code&gt;reflect.ValueOf&lt;/code&gt; / &lt;code&gt;reflect.TypeOf&lt;/code&gt;) may be used in subsequent logic, their names cannot be obfuscated:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;if fnType.Pkg().Path() == &amp;quot;reflect&amp;quot; &amp;amp;&amp;amp; (fnType.Name() == &amp;quot;TypeOf&amp;quot; || fnType.Name() == &amp;quot;ValueOf&amp;quot;) {
  for _, arg := range call.Args {
    argType := tf.info.TypeOf(arg)
    tf.recordIgnore(argType, tf.pkg.Path())
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This introduces an important map that persists throughout each compile lifecycle, recording all objects that cannot be obfuscated: types used in reflection arguments, identifiers used in constant expressions and &lt;code&gt;go:linkname&lt;/code&gt;, and variables and types imported from unobfuscated packages:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// ignoreObjects records all the objects we cannot obfuscate. An object
// is any named entity, such as a declared variable or type.
//
// So far, this map records:
//
//  * Types which are used for reflection; see recordReflectArgs.
//  * Identifiers used in constant expressions; see RecordUsedAsConstants.
//  * Identifiers used in go:linkname directives; see handleDirectives.
//  * Types or variables from external packages which were not
//    obfuscated, for caching reasons; see transformGo.
ignoreObjects map[types.Object]bool
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&apos;s use the case of identifying &amp;quot;identifiers used in constant expressions&amp;quot; with the &lt;code&gt;ast.GenDecl&lt;/code&gt; type as an example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// RecordUsedAsConstants records identifiers used in constant expressions.
func RecordUsedAsConstants(node ast.Node, info *types.Info, ignoreObj map[types.Object]bool) {
	visit := func(node ast.Node) bool {
		ident, ok := node.(*ast.Ident)
		if !ok {
			return true
		}

		// Only record *types.Const objects.
		// Other objects, such as builtins or type names,
		// must not be recorded as they would be false positives.
		obj := info.ObjectOf(ident)
		if _, ok := obj.(*types.Const); ok {
			ignoreObj[obj] = true
		}

		return true
	}

	switch x := node.(type) {
	...
	// in a const declaration all values must be constant representable
	case *ast.GenDecl:
		if x.Tok != token.CONST {
			break
		}
		for _, spec := range x.Specs {
			spec := spec.(*ast.ValueSpec)

			for _, val := range spec.Values {
				ast.Inspect(val, visit)
			}
		}
	}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Suppose the code to be obfuscated is:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package obfuscate

const (
	H2 string = &amp;quot;a&amp;quot;
	H4 string = &amp;quot;a&amp;quot; + H2
	H3 int    = 123
	H5 string = &amp;quot;a&amp;quot;
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can see that the identifier used in a constant expression is &lt;code&gt;H2&lt;/code&gt;. Let&apos;s walk through the determination process in the code. First, the entire &lt;code&gt;const&lt;/code&gt; block matches the &lt;code&gt;ast.GenDecl&lt;/code&gt; type. Then it iterates over its Specs (each definition), and for each spec, iterates over its Values (the expressions on the right side of the equals sign). It then uses &lt;code&gt;ast.Inspect()&lt;/code&gt; to traverse each element in &lt;code&gt;val&lt;/code&gt;, executing &lt;code&gt;visit()&lt;/code&gt;. If an element node&apos;s type is &lt;code&gt;ast.Ident&lt;/code&gt; and the object it points to is of type &lt;code&gt;types.Const&lt;/code&gt;, that object is recorded in &lt;code&gt;tf.recordIgnore&lt;/code&gt;. It&apos;s a bit convoluted, so let&apos;s print the AST:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/ignoreObjects-example.png&quot; alt=&quot;ignoreObjects-example&quot;&gt;&lt;/p&gt;
&lt;p&gt;We can clearly see that &lt;code&gt;H2&lt;/code&gt; in &lt;code&gt;H4 string = &amp;quot;a&amp;quot; + H2&lt;/code&gt; fully meets the criteria and should be recorded in &lt;code&gt;tf.recordIgnore&lt;/code&gt;. The upcoming analysis will involve many type assertions and type switches, which may look complex but are fundamentally similar to the process we just analyzed — we just need to write a demo and print the AST to understand it easily.&lt;/p&gt;
&lt;p&gt;Back to &lt;code&gt;main.go transformCompile&lt;/code&gt;. Next, the current package name is obfuscated and written into the command arguments and source files, provided the package is not &lt;code&gt;main&lt;/code&gt; and lies within the &lt;code&gt;env.GOPRIVATE&lt;/code&gt; scope. The next step processes comments and source code. There&apos;s special handling for runtime and CGO here, which we can safely ignore; let&apos;s look directly at the handling for regular Go code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// transformGo obfuscates the provided Go syntax file.
func (tf *transformer) transformGo(file *ast.File) *ast.File {
	if opts.GarbleLiterals {
		file = literals.Obfuscate(file, tf.info, fset, tf.ignoreObjects)
	}

	pre := func(cursor *astutil.Cursor) bool {...}
	post := func(cursor *astutil.Cursor) bool {...}

	return astutil.Apply(file, pre, post).(*ast.File)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First it obfuscates literals, then recursively processes each node of the AST, and finally returns the processed AST. These parts share a similar approach, all using &lt;code&gt;astutil.Apply(file, pre, post)&lt;/code&gt; for recursive AST processing, where &lt;code&gt;pre&lt;/code&gt; and &lt;code&gt;post&lt;/code&gt; functions are called before and after visiting child nodes, respectively. Much of this code consists of rather tedious filtering operations, so here&apos;s just a brief analysis:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;literals.Obfuscate pre&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Skips the following cases: values that need to be inferred, those containing non-basic types, types that need to be inferred (implicit type definitions), and constants marked in &lt;code&gt;ignoreObj&lt;/code&gt;. For constants that pass the filter, their token is changed from &lt;code&gt;const&lt;/code&gt; to &lt;code&gt;var&lt;/code&gt; to facilitate later replacement with anonymous functions. However, if any single constant in a &lt;code&gt;const&lt;/code&gt; block cannot be changed to &lt;code&gt;var&lt;/code&gt;, the entire block remains unmodified.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;literals.Obfuscate post&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Replaces string, byte slice, or array values with anonymous functions. The effect is shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/obfuscated-literals.png&quot; alt=&quot;obfuscated-literals&quot;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;transformGo pre&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Skips nodes with names containing &lt;code&gt;_&lt;/code&gt; (unnamed) or &lt;code&gt;_C / _cgo&lt;/code&gt; (cgo code). For embedded fields, it finds the actual object to process, then further filters based on the object&apos;s type:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;types.Var&lt;/code&gt;: Skips non-global variables. For fields, the struct&apos;s type name is used as a hash salt. If the field&apos;s parent struct is unobfuscated, it&apos;s recorded in &lt;code&gt;tf.ignoreObjects&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;types.TypeName&lt;/code&gt;: Skips non-global types. If the type was not obfuscated at its definition site, it&apos;s skipped.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;types.Func&lt;/code&gt;: Skips exported methods, &lt;code&gt;main&lt;/code&gt;/&lt;code&gt;init&lt;/code&gt;/&lt;code&gt;TestMain&lt;/code&gt; functions, and test functions.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If a node passes the filter, its name is hashed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;transformGo post&lt;/code&gt;: Hashes import paths.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At this point, source code obfuscation is complete. All that remains is to write the new code to a temporary directory and substitute the temporary file paths into the command in place of the originals. The new compile command is now ready; executing it compiles the obfuscated code with the standard compilation tools.&lt;/p&gt;
&lt;h3&gt;asm&lt;/h3&gt;
&lt;p&gt;This is relatively simple and only applies to private packages. The core operations are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Adding the temporary directory path to the beginning of the &lt;code&gt;-trimpath&lt;/code&gt; argument&lt;/li&gt;
&lt;li&gt;Replacing called function names with their obfuscated versions. In Go assembly files, called function names are preceded by &lt;code&gt;·&lt;/code&gt;, which is used as the search pattern.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;link&lt;/h3&gt;
&lt;p&gt;This is also relatively simple. The core operations are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Replacing the package name (&lt;code&gt;pkg&lt;/code&gt;) and variable name (&lt;code&gt;name&lt;/code&gt;) marked by the &lt;code&gt;-X pkg.name=str&lt;/code&gt; argument with their obfuscated versions&lt;/li&gt;
&lt;li&gt;Clearing the &lt;code&gt;-buildid&lt;/code&gt; argument to prevent build ID leakage&lt;/li&gt;
&lt;li&gt;Adding the &lt;code&gt;-w -s&lt;/code&gt; flags to remove debug information, the symbol table, and the DWARF symbol table&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Obfuscation Results&lt;/h2&gt;
&lt;p&gt;Let&apos;s write a small piece of code and compile it twice: once with &lt;code&gt;go build .&lt;/code&gt; and once with &lt;code&gt;go env -w GOPRIVATE=&amp;quot;*&amp;quot; &amp;amp;&amp;amp; garble -literals build .&lt;/code&gt;. As you can see, the simple code on the left becomes much harder to read after obfuscation:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/obfuscated-show-1.png&quot; alt=&quot;obfuscated-show-1&quot;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/obfuscated-show-2.png&quot; alt=&quot;obfuscated-show-2&quot;&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s also load them into IDA and parse with &lt;a href=&quot;https://github.com/0xjiayu/go_parser&quot;&gt;go_parser&lt;/a&gt;. In the unobfuscated file, information like file names and function names is clearly visible, and the code logic is fairly clean:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/obfuscated-show-ida-1.png&quot; alt=&quot;obfuscated-show-ida-1&quot;&gt;&lt;/p&gt;
&lt;p&gt;After obfuscation, function names and other information are replaced with garbled text. Moreover, since strings have been replaced with anonymous functions, the code logic is much more confusing:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.rook1e.com/posts/13/obfuscated-show-ida-2.png&quot; alt=&quot;obfuscated-show-ida-2&quot;&gt;&lt;/p&gt;
&lt;p&gt;For larger projects with more dependencies, the confusion introduced by obfuscation is even greater. Since third-party dependency packages are obfuscated as well, reverse engineers can no longer infer the code&apos;s logic from recognizable imported packages.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This article explored, from a source code perspective, how Go&apos;s build process invokes the compiler toolchain, and walked through the &lt;a href=&quot;https://github.com/burrowers/garble&quot;&gt;burrowers/garble&lt;/a&gt; project to see how &lt;code&gt;go/ast&lt;/code&gt; can be used to obfuscate code. After obfuscation, the code&apos;s logical structure and the information retained in the binary become much harder to read, significantly raising the bar for reverse engineering.&lt;/p&gt;
</content:encoded></item></channel></rss>