Mar 25, 2026
5 min
Imagine you are building an application that relies on a word bank. One design choice that arises is whether this data should live in an external text file or be baked directly into the compiled binary. Modern development often favors the flexibility of external assets, but performance purists will argue for the raw speed of embedded data.
We know intuitively that memory access beats disk I/O, but by how much? More importantly, is the performance difference significant enough to choose one solution over the other? Let’s visualize the results quantitatively and put those numbers into perspective.
First, a clarification: we are measuring only data availability latency, the time elapsed from process start until the entire word bank is accessible as a contiguous byte slice in memory.
```go
package main

import (
	_ "embed" // blank import required to enable the //go:embed directive
	"fmt"
	"os"
	"time"
)

//go:embed wordbank.txt
var embeddedData []byte

func main() {
	// Embedded path: the data already lives in the binary's data segment.
	startEmbed := time.Now()
	var embedChecksum byte
	for _, b := range embeddedData {
		embedChecksum += b
	}
	durationEmbed := time.Since(startEmbed).Nanoseconds()

	// External path: read the same file from disk at runtime.
	startExt := time.Now()
	externalData, err := os.ReadFile("wordbank.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	var externalChecksum byte
	for _, b := range externalData {
		externalChecksum += b
	}
	durationExt := time.Since(startExt).Nanoseconds()

	fmt.Printf("%d %d\n", durationEmbed, durationExt)

	// The checksums force every byte to actually be touched, so the
	// compiler cannot optimize the loops away.
	_, _ = embedChecksum, externalChecksum
}
```
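The input files for the size tiers in the table that follows can be generated with a small helper. This is a hypothetical sketch, not part of the original benchmark; file names and the random-bytes approach are assumptions (any content works, since the benchmark only sums bytes).

```go
package main

import (
	"crypto/rand"
	"fmt"
	"os"
)

// sizes mirrors the tiers used in the benchmark table.
var sizes = map[string]int{
	"1KB":   1 << 10,
	"100KB": 100 << 10,
	"1MB":   1 << 20,
	"10MB":  10 << 20,
}

func main() {
	for name, n := range sizes {
		buf := make([]byte, n)
		if _, err := rand.Read(buf); err != nil {
			panic(err)
		}
		// Hypothetical naming scheme; rename one to wordbank.txt
		// before compiling the benchmark above.
		path := fmt.Sprintf("wordbank_%s.txt", name)
		if err := os.WriteFile(path, buf, 0o644); err != nil {
			panic(err)
		}
	}
}
```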
| Size | Embedded (avg) | External (avg) | Speedup |
|---|---|---|---|
| 1KB | 0.19 µs | 28.88 µs | 14792.90% |
| 100KB | 0.18 µs | 70.77 µs | 38619.20% |
| 1MB | 0.17 µs | 529.80 µs | 316259.20% |
| 10MB | 0.30 µs | 4753.46 µs | 1598240.60% |
It is important to note that hardware and software state are the largest variables here. Swap the storage for an NVMe SSD or disable the file system cache and these numbers might look significantly different. Still, the gap is wide enough that, regardless of such variables, we can assume at least a 1000% speedup as a very conservative lower bound.
Now that we have the results, let’s ground them in reality. While there is an order-of-magnitude difference in performance, especially as the resource size increases, it often matters less than the numbers suggest.
Consider a comprehensive English dictionary of ~450,000 words. At an average of 5-8 characters per word, the file sits at roughly 2 MB to 4 MB. Even on modest hardware, an external fetch of a 4 MB file might take only 2–5 ms.
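The back-of-the-envelope arithmetic behind that 2 MB to 4 MB estimate can be sketched directly, assuming one newline separator per word:

```go
package main

import "fmt"

// Rough sizing for a ~450,000-word dictionary at 5-8 characters
// per word, plus one newline byte each.
const (
	wordCount = 450_000
	lowBytes  = wordCount * (5 + 1) // 2,700,000 bytes
	highBytes = wordCount * (8 + 1) // 4,050,000 bytes
)

func main() {
	fmt.Printf("estimated size: %.2f MB to %.2f MB\n",
		float64(lowBytes)/1e6, float64(highBytes)/1e6)
	// → estimated size: 2.70 MB to 4.05 MB
}
```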
To put that number in perspective: a single frame of a 60 FPS interface lasts about 16.7 ms, and human perception of an "instant" response is usually placed around 100 ms. Since loading a word bank is typically a one-time upfront cost at startup, a 5 ms delay is invisible to the user.
If performance were the only metric, embedding would win every time. However, the decision is more nuanced. Embedding bloats the compiled binary but simplifies distribution to a single artifact. Loading external files adds latency but allows for flexibility, user configuration, and updating the data without recompiling.
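The two approaches are not mutually exclusive. One common middle ground is to ship an embedded default and prefer an external file when one is present. A minimal sketch, assuming a hypothetical `loadWordBank` helper; the in-source default stands in for data that would normally come from a `//go:embed` directive (omitted so this compiles without a `wordbank.txt` on disk):

```go
package main

import (
	"fmt"
	"os"
)

// defaultWordBank stands in for embedded data. In a real build this
// would be populated by //go:embed wordbank.txt instead.
var defaultWordBank = []byte("apple\nbanana\ncherry\n")

// loadWordBank prefers an external file when it exists, falling back
// to the built-in default. Hypothetical helper for illustration.
func loadWordBank(path string) []byte {
	if data, err := os.ReadFile(path); err == nil {
		return data
	}
	return defaultWordBank
}

func main() {
	words := loadWordBank("wordbank.txt")
	fmt.Printf("loaded %d bytes\n", len(words))
}
```

This keeps the binary self-sufficient while still letting users swap in their own word bank.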
This is not to say that being blazingly fast does not matter; rather, the lesson is to carefully consider the tradeoffs of each decision and understand what your project needs to prioritize. In an age where producing code is cheap, critically evaluating a problem and having the knowledge to make a decision is an invaluable skill.