This file provides guidance LLM agents and contributors when working with code in this repository.
Ark is an R kernel for Jupyter applications, primarily created to serve as the interface between R and the Positron IDE. It is compatible with all frontends implementing the Jupyter protocol.
The project includes:
- A Jupyter kernel for structured interaction between R and a frontend
- An LSP server for intellisense features (completions, jump-to-definition, diagnostics)
- A DAP server for step-debugging of R functions
The codebase is organized as a Rust workspace containing multiple crates:
- ark: The main R Kernel implementation
- harp: Rust wrappers for R objects and interfaces
- libr: Bindings to R (dynamically loaded using
dlopen/LoadLibrary) - amalthea: A Rust framework for building Jupyter and Positron kernels
- echo: A toy kernel for testing the kernel framework
- stdext: Extensions to Rust's standard library used by the other projects
# Build the entire project
cargo build
# Build in release mode
cargo build --release
# On Windows: If Positron is running with a debug build of ark,
# Windows file locking prevents overwriting ark.exe. For interim "progress"
# checks during development, just check the specific crate you're working on
# instead:
cargo check --package ark
# You'll have to quit Positron to do `cargo build`, though, on Windows.We use nextest to run tests, which we invoke through our just runner. just test expands to cargo nextest run. Arguments are forwarded and nextest generally supports the same arguments as cargo test.
# Run all tests
just test
# Run specific tests
# Much better output than `cargo test <testname>`
just test <test_name>
# Run tests for a specific crate
just test -p arkjust clippycargo +nightly fmt --allThis requires the nightly toolchain because .rustfmt.toml uses nightly-only options.
Integration tests for the kernel and DAP server live in crates/ark/tests/ and use the test utilities from crates/ark_test/.
Key components:
-
DummyArkFrontend: A mock Jupyter frontend that communicates with the kernel over ZMQ sockets. UseDummyArkFrontend::lock()to acquire it (only one per process). -
DapClient: A DAP client for testing the debugger. Obtained viafrontend.dap_client()after starting the kernel.
Stream handling:
All recv_iopub_* methods automatically skip and buffer stream messages. This means:
- Non-stream assertions read cleanly without worrying about interleaved streams
- Stream content is accumulated in internal buffers
- Use
assert_stream_stdout_contains()orassert_stream_stderr_contains()to check stream content - Stream assertions must be placed BEFORE
recv_iopub_idle()within each busy/idle window recv_iopub_idle()acts as a synchronization point that flushes stream buffers and panics if streams were received but not asserted
Common patterns:
// Lock the frontend and send an execute request
let frontend = DummyArkFrontend::lock();
frontend.send_execute_request("1 + 1", ExecuteRequestOptions::default());
frontend.recv_iopub_busy();
frontend.recv_iopub_execute_input();
frontend.recv_iopub_execute_result();
frontend.recv_iopub_idle();
frontend.recv_shell_execute_reply();
// Asserting on stream output (order-sensitive: assert in expected order)
frontend.send_execute_request("cat('hello\\nworld\\n')", ExecuteRequestOptions::default());
frontend.recv_iopub_busy();
frontend.recv_iopub_execute_input();
frontend.assert_stream_stdout_contains("hello");
frontend.assert_stream_stdout_contains("world");
frontend.recv_iopub_idle(); // Flushes stream buffers, panics on unasserted streams
frontend.recv_shell_execute_reply();
// Use drain_streams() to consume buffered stream output so `recv_iopub_idle()` won't panic
let streams = frontend.drain_streams();
assert!(streams.stdout().contains("expected"));Debugging tests:
Log messages (from the log crate) are not shown in test output. Use eprintln! for printf-style debugging.
Enable message tracing to see timestamped DAP and IOPub message flows:
ARK_TEST_TRACE=1 just test test_name # All messages
ARK_TEST_TRACE=dap just test test_name # DAP events only
ARK_TEST_TRACE=iopub just test test_name # IOPub messages onlyThe following R packages are required for tests:
- data.table
- dplyr
- rstudioapi
- tibble
- haven
- R6
After building, you can install the Jupyter kernel specification with:
./target/debug/ark --install
# or in release mode
./target/release/ark --installSome of the files below crates/amalthea/src/comm/ are automatically generated from comms specified in the Positron front end.
Such files always have // @generated at the top and SHOULD NEVER be edited "by hand".
If changes are needed in these files, that must happen in the separate Positron source repository and the comms for R and Python must be regenerated.
-
Do not use
bail!. Instead use an explicitreturn Err(anyhow!(...)). -
When a
matchexpression is the last expression in a function, omitreturnkeywords in match arms. Let the expression evaluate to the function's return value. -
For error messages and logging, prefer direct formatting syntax:
Err(anyhow!("Message: {err}"))instead ofErr(anyhow!("Message: {}", err)). This also applies tolog::error!andlog::warn!andlog::info!macros. For logging errors specifically, use Debug formatting{err:?}to get more detailed error information. -
Use
log::trace!instead oflog::debug!. -
Use fully qualified result types (
anyhow::Result) instead of importing them. -
You can log
Result::Errby using the.log_err()method from the extension traitstdext::ResultExt. Add some.context()if that would be helpful, but never do it for errors that are quite unexpected, such as from.send()to a channel (that would be too verbose). -
Avoid
.unwrap()and.expect(). For truly unrecoverable errors, use an explicit match with apanic!branch. For recoverable errors, use.log_err()or propagate with?. -
Avoid unnecessary
.clone(). Prefer returning&stroverStringfrom accessors so callers only allocate when they need to. Reorder operations to avoid cloning (e.g., build a response before consuming the source data). ForArc, useArc::clone(&x)instead ofx.clone()to make the cheap clone obvious. -
Keep
Cargo.tomldependencies in alphabetical order. -
All dependencies must be declared in the workspace root's
Cargo.toml. Individual crates reference them withdep.workspace = true. -
When writing tests, prefer simple assertion macros without custom error messages:
- Use
assert_eq!(actual, expected);instead ofassert_eq!(actual, expected, "custom message"); - Use
assert!(condition);instead ofassert!(condition, "custom message");
- Use
-
In tests, prefer exact assertions over fuzzy ones. Use
assert_eq!(stack[0].name, "foo()")rather thanassert!(names.contains(&"foo()"))when ordering and completeness matter. -
When you extract code in a function (or move things around) that function goes below the calling function. A general goal is to be able to read linearly from top to bottom with the relevant context and main logic first. The code should be organised like a call stack. Of course that's not always possible, use best judgement to produce the clearest code organization.
-
Keep the main logic as unnested as possible. Favour Rust's
let ... elsesyntax to return early or continue a loop in theelseclause, overif let. -
Always prefer importing with
useinstead of qualifying with::, unless specifically requested in these instructions or by the user, or you see existing::usages in the file you're editing. -
When two code paths do analogous things, make them structurally parallel so the symmetry is visible.
-
Don't let comments drift from the code. If behaviour changes, update nearby comments. If a file is renamed, update its header comment.
-
Use the new async closure syntax, e.g.
async move || { ... }instead of|| async move { ... }.