feat: add line-by-line mode as default, stream without loading files into memory #328

Merged: oriongonza merged 8 commits into master from optimize on Feb 25, 2026

oriongonza (Collaborator) commented Feb 21, 2026

Implements line-by-line processing as the default mode, replacing the previous behavior of loading entire files into memory via mmap. Adds --across (-A) for the old whole-file behavior when needed.

Changes

  • Default mode is now line-by-line: processes files as a stream, never loading the whole thing into memory
  • New --across / -A flag: opt-in to the old whole-file mmap behavior, which is faster when memory isn't a concern
  • Chunked I/O: reads in 8KB chunks with a line buffer that spans chunk boundaries, which reduces syscall overhead while keeping memory use bounded
  • Refactored codebase: split into sd (library) and sd-cli (binary) crates

Closes

Closes #96: massive memory usage on large files
Closes #100: stdin now streams, works with journalctl -f | sd ... and similar
Closes #154: memory allocation failure on files too large to fit in RAM
Closes #286: sd now streams by default; the documented caveat no longer applies
Closes #302: output is emitted as lines are processed, not buffered until EOF

Also closes #290. I'm back.

Benchmarks

1M lines (~36MB), foo → qux:

| Command | Time |
| --- | --- |
| `sd -A 'foo' 'qux'` (across, whole-file) | 33ms |
| `sd 'foo' 'qux'` (line-by-line, default) | 106ms |
| `sed s/foo/qux/g` | 120ms |

Line-by-line is about 1.1x faster than sed here (106ms vs 120ms) and about 3.2x slower than across mode, while using a fraction of across mode's memory (~74MB).

oriongonza changed the title from "perf: optimize line-by-line mode with chunked reading" to "feat: add line-by-line mode as default, stream without loading files into memory" on Feb 21, 2026

oriongonza and others added 7 commits February 25, 2026 12:08

This made no sense because we don't intend to ever release `sd` as a crate

Add a new processing mode that handles input line by line instead of
reading entire files into memory. This fixes several long-standing issues:
- OOM on large files (O(line_size) memory instead of O(file_size))
- stdin waits for EOF (output now flushed per line, enables streaming)
- `^` matches phantom empty line after trailing `\n`
- `\s+$` eats newlines because `\s` sees `\n` across line boundaries

The implementation strips `\n` before passing each line to the replacer,
then restores it, so regex never sees newline characters. Files without
trailing newlines are preserved as-is. In-place file modification uses
the same atomic temp-file-and-rename pattern as the existing code path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

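
The strip-and-restore step described in this commit message can be sketched as follows. This is a minimal illustration, not sd's actual code: `replace_lines` is a hypothetical helper, and the `replace` closure stands in for the regex replacer. It assumes `\n` line endings only and flushes after every line so a streaming consumer sees output before EOF.

```rust
use std::io::{self, BufRead, Write};

/// Applies `replace` to each line of `input` and writes the result to `output`.
/// The trailing `\n` is removed before the call and written back afterwards,
/// so `replace` never sees a newline. Hypothetical helper, not sd's API.
pub fn replace_lines<R, W, F>(mut input: R, mut output: W, replace: F) -> io::Result<()>
where
    R: BufRead,
    W: Write,
    F: Fn(&[u8]) -> Vec<u8>,
{
    let mut line = Vec::new();
    loop {
        line.clear();
        // Reads up to and including the next `\n`; returns 0 at EOF.
        if input.read_until(b'\n', &mut line)? == 0 {
            return Ok(());
        }
        let had_newline = line.last() == Some(&b'\n');
        if had_newline {
            line.pop();
        }
        output.write_all(&replace(&line))?;
        // Only restore a newline that was actually there, so a final line
        // without a trailing newline is preserved as-is.
        if had_newline {
            output.write_all(b"\n")?;
        }
        // Flush per line so output is visible before the input ends.
        output.flush()?;
    }
}
```

With a closure that trims trailing whitespace (a stand-in for `\s+$`), the newline between lines survives because the closure only ever receives the line body.
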
Line-by-line processing is now the default behavior. This provides
better defaults for common use cases: lower memory usage, streaming
stdin output, and predictable regex anchor behavior.

For patterns that need to match across line boundaries (e.g. replacing
\n or multi-line patterns), use the new --across / -A flag which
restores the previous whole-file behavior.
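
To illustrate why patterns containing `\n` need `--across`, here is a small sketch that models the two modes with literal string replacement instead of regex. The function names `across` and `line_by_line` are hypothetical and only mirror the mode names used in this PR.

```rust
/// Whole-input mode: the replacement sees the full text, newlines included.
pub fn across(input: &str, from: &str, to: &str) -> String {
    input.replace(from, to)
}

/// Line-by-line mode: each line is processed without its trailing `\n`,
/// which is put back afterwards, so `from` can never match a newline.
pub fn line_by_line(input: &str, from: &str, to: &str) -> String {
    input
        .split_inclusive('\n')
        .map(|line| match line.strip_suffix('\n') {
            Some(body) => format!("{}\n", body.replace(from, to)),
            // Last line without a trailing newline: nothing to restore.
            None => line.replace(from, to),
        })
        .collect()
}
```

For the input `"a\nb\n"`, replacing `"\n"` with `","` changes the text in the whole-input model but leaves it untouched in the line-by-line model.
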

Pre-validates all input files before modifying any, matching the
atomicity guarantees of the mmap-based code path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

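
The "validate everything first, then modify" idea, combined with the temp-file-and-rename pattern mentioned earlier, might look roughly like this. All names here (`validate_all`, `write_atomically`, `replace_in_files`) and the `.tmp` suffix are assumptions for illustration; the sketch also reads each file whole for brevity and uses literal replacement, whereas the PR streams lines and uses regex.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};

/// Checks every path up front and returns the first error found,
/// before any file has been modified.
pub fn validate_all(paths: &[PathBuf]) -> io::Result<()> {
    for path in paths {
        if !fs::metadata(path)?.is_file() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                format!("{} is not a regular file", path.display()),
            ));
        }
        File::open(path)?; // confirm the file can be opened for reading
    }
    Ok(())
}

/// Writes `contents` to a sibling temp file, then renames it over `path`,
/// so readers see either the old file or the new one, never a partial write.
pub fn write_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut tmp = path.as_os_str().to_owned();
    tmp.push(".tmp"); // assumed suffix; a real tool would pick a unique name
    let tmp = PathBuf::from(tmp);
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)
}

/// Validates all inputs first, then rewrites each file in turn.
pub fn replace_in_files(paths: &[PathBuf], from: &str, to: &str) -> io::Result<()> {
    validate_all(paths)?;
    for path in paths {
        // Whole-file read for brevity only.
        let text = fs::read_to_string(path)?;
        write_atomically(path, text.replace(from, to).as_bytes())?;
    }
    Ok(())
}
```

If any path in the list is missing or unreadable, `replace_in_files` returns an error before the first file is touched.
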
Add benchmark results comparing line-by-line (default) and across (-A)
modes on a 1M line (~36MB) test file:
- Line-by-line is ~2-3x slower than across mode for throughput
- Still faster than sed for regex replacements
- Memory usage: 3 MB (line-by-line) vs 74 MB (across)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace per-line read_until() calls with chunked reading (8KB chunks)
and a line buffer that spans chunk boundaries. This reduces syscall
overhead and improves CPU cache locality.

Benchmark results on 1M line file (~36MB):
- Before: 357ms (2.84x slower than across mode, slower than sed)
- After:  106ms (3.19x slower than across mode, 1.1x faster than sed)

The trade-off between modes is:
- Across mode: fastest (33ms), uses more memory (~74MB)
- Line-by-line: now much faster (106ms), bounded memory usage
- Line-by-line still respects memory limits for streaming use cases
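
A rough sketch of the chunked reading described in this commit message: fixed 8KB reads, with a `pending` buffer carrying any partial line across chunk boundaries. `replace_chunked` and `CHUNK_SIZE` are hypothetical names, the `replace` closure stands in for the regex replacer, and flushing once per chunk is an assumption rather than something the PR states.

```rust
use std::io::{self, Read, Write};

const CHUNK_SIZE: usize = 8 * 1024;

/// Reads `input` in fixed-size chunks, applies `replace` to each complete
/// line (without its `\n`), and writes the result to `output`.
pub fn replace_chunked<R, W, F>(mut input: R, mut output: W, replace: F) -> io::Result<()>
where
    R: Read,
    W: Write,
    F: Fn(&[u8]) -> Vec<u8>,
{
    let mut chunk = [0u8; CHUNK_SIZE];
    // Holds the start of a line whose end has not been read yet.
    let mut pending: Vec<u8> = Vec::new();
    loop {
        let n = match input.read(&mut chunk) {
            Ok(0) => break,
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        let mut rest = &chunk[..n];
        while let Some(pos) = rest.iter().position(|&b| b == b'\n') {
            let (head, tail) = rest.split_at(pos);
            if pending.is_empty() {
                // Fast path: the whole line lies inside this chunk.
                output.write_all(&replace(head))?;
            } else {
                // The line started in an earlier chunk; join the pieces.
                pending.extend_from_slice(head);
                output.write_all(&replace(&pending))?;
                pending.clear();
            }
            output.write_all(b"\n")?;
            rest = &tail[1..]; // skip the newline itself
        }
        // Whatever is left is an incomplete line; keep it for the next chunk.
        pending.extend_from_slice(rest);
        output.flush()?;
    }
    // Final line without a trailing newline is written as-is.
    if !pending.is_empty() {
        output.write_all(&replace(&pending))?;
    }
    output.flush()
}
```

Memory use in this sketch is bounded by the chunk size plus the longest line, which matches the O(line_size) behavior the PR describes.
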

fix build, tests, and lint regressions

remove file-mapping code paths and dependency
@oriongonza oriongonza merged commit 4a7b216 into master Feb 25, 2026
4 of 9 checks passed

ofek commented Feb 25, 2026

Awesome work!

varenc commented Feb 26, 2026

Thank you for this!

cstyles added a commit to cstyles/dotfiles that referenced this pull request Jun 30, 2026
`sd` was updated recently to process line-by-line instead of over the
entire file:

chmln/sd#328

As a result, the todo file after `sd` processes it contains a `break`
for every single line of the original, instead of just a single `break`
before the first line. The simplest fix is to use the new `-A` flag
which re-enables the old whole-file behavior.
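
The behavior change described in this commit message can be modeled with a small sketch. The exact `sd` command used in the dotfiles is not shown here, so `prepend_break` is only an assumed stand-in for a replacement anchored at the start of its input, and the function names are hypothetical.

```rust
/// Stand-in for a replacement anchored at the start of whatever text it is given.
fn prepend_break(text: &str) -> String {
    format!("break\n{}", text)
}

/// Whole-file model: the anchor matches once, at the start of the file.
pub fn across(input: &str) -> String {
    prepend_break(input)
}

/// Line-by-line model: every line is its own input, so the anchor
/// matches at the start of each line.
pub fn line_by_line(input: &str) -> String {
    input.split_inclusive('\n').map(prepend_break).collect()
}
```

For a two-line input, the whole-file model yields one `break` line and the line-by-line model yields two, which is the difference the `-A` flag undoes.
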