GitHub - anvaka/ngraph.leiden: Leiden community detection algorithm in graphs

ngraph.leiden

Leiden/Louvain community detection for ngraph.graph. Fast, deterministic (seeded), and flexible: undirected/directed modularity, CPM with resolution, multilayer aggregation, fixed nodes, and custom weights/sizes.

Quick online demo: https://anvaka.github.io/ngraph.leiden/

If you prefer command line: cat graph.dot | npx ngraph.leiden - prints node/community membership

Install

Install the published package in your project:

npm i ngraph.leiden

This repository uses Node 18+ for development. To work on the repo locally:

npm install
npm test
npm run build

Quick start

import createGraph from 'ngraph.graph'
import { detectClusters } from 'ngraph.leiden'

const g = createGraph()
// Undirected graphs: just add a single link; adapter will symmetrize for you
g.addNode('a'); g.addNode('b'); g.addNode('c'); g.addNode('d')
g.addLink('a','b')
g.addLink('c','d')

const result = detectClusters(g, { randomSeed: 42 })

// Get a community id for a node
console.log('a ->', result.getClass('a'))

// Group membership as Map<communityId, nodeId[]>
console.log(result.getCommunities())

// Partition quality for the chosen objective
console.log('Q =', result.quality())

CommonJS:

const createGraph = require('ngraph.graph')
const { detectClusters } = require('ngraph.leiden')

API

Signature

detectClusters(input, options?) => Clusters

Input

input: either
- an ngraph.graph instance; or
- a multilayer array: [{ graph, weight?, linkWeight?, nodeSize? }] — all layers must share the same node ids (see Multilayer graphs).

Return value: Clusters

getClass(nodeId): number — community id for nodeId.
getCommunities(): Map<number, string[]> — nodes grouped by community id.
quality(): number — objective value for the final partition. With quality: 'cpm', see CPM on how it’s computed and options.cpmMode.
toJSON(): { membership: Record<id, number>, meta: { levels: number, quality: number, options: object } }.

Options

Each option below is optional. Defaults are shown in backticks.

quality

What: Objective to optimize.
Values: 'modularity' | 'cpm'. Default: 'modularity'.
Why: Use 'modularity' for classic community detection; use 'cpm' (Constant Potts Model) when you need explicit control over community granularity via resolution or when modularity’s resolution limit is an issue.
When: Prefer 'cpm' for very large graphs or when you want consistent small communities; otherwise 'modularity' is a strong default.

resolution (CPM)

What: The gamma parameter for CPM.
Value: number. Default: 1.0.
Why: Larger resolution yields more/smaller communities; smaller yields fewer/larger.
When: Tune to match desired granularity; try 0.1, 1.0, 5.0 as starting points.

directed

What: Enable Leicht–Newman directed modularity.
Value: boolean. Default: false.
Why: For directed networks where in/out strengths matter.
When: Set true for directed graphs. For undirected, keep false and add reciprocal links in your graph.

randomSeed

What: Seed for the internal RNG used for node order shuffles.
Value: number. Default: 42.
Why: Ensures deterministic, reproducible partitions.
When: Set explicitly for reproducible experiments or tests.

candidateStrategy

What: Controls which target communities are considered when moving a node.
Values: 'neighbors' (default), 'all', 'random', 'random-neighbor'.
Why: Trade-off between speed and search breadth.
When:
- 'neighbors': fastest and typical; consider only adjacent communities.
- 'all': exhaustive but slow on large graphs.
- 'random': sample from all communities (broader search at bounded cost).
- 'random-neighbor': sample from neighbor communities (cheap, slightly more exploratory).

allowNewCommunity

What: Allow moves that create a fresh singleton community.
Value: boolean. Default: false.
Why: Can help escape local minima in specific structures.
When: Rarely needed; try enabling if you see over-merged communities.

maxCommunitySize

What: Upper bound on a community’s total size.
Value: number. Default: Infinity.
How measured: Uses nodeSize (default 1 per node). A move that would exceed this bound is skipped.
Why: Enforce capacity or balance constraints.
When: Useful for constrained clustering or to avoid giant communities.

refine

What: Enable Leiden-style refinement between coarsening levels.
Value: boolean. Default: true.
Why: Improves partitions by re-optimizing within coarse communities.
When: Keep true for best quality; turn off for speed-sensitive runs.

fixedNodes

What: Nodes that must remain in their initial communities at the finest level.
Value: Set or Array of node ids.
Why: Respect domain constraints or anchor communities.
When: Use to pin landmarks, seeds, or known group members.

preserveLabels

What: Control how community ids are compacted after renumbering.
Values: false (default), true (preserve old order), or Map<oldId, order>.
Why: Stable ids across runs or alignment to a predefined ordering.
When: Advanced; mostly for downstream integration/visualization.

linkWeight

What: Function to read an edge’s weight.
Signature: (link) => number. Default: link.data?.weight ?? 1.
Why: Use custom attributes (e.g., frequency, strength) as weights.

Example:

const res = detectClusters(g, { linkWeight: l => Math.max(0, l.data?.w ?? 0) })

nodeSize

What: Function to read a node’s size (used by CPM and maxCommunitySize).
Signature: (node) => number. Default: node.data?.size ?? 1.
Why: Make CPM size-aware or weight capacity by node importance.

Example:

const res = detectClusters(g, { nodeSize: n => n.data?.pop ?? 1, quality: 'cpm' })

maxLevels, maxLocalPasses

What: Internal performance knobs for the multi-level loop and local passes.
Values: number. Defaults are conservative and usually fine.
Why/When: Only tune if you profile and identify a need.

Vocabulary

Modularity: Measures how much more densely connected nodes are within communities than expected at random. Directed modularity (Leicht–Newman) uses in/out strengths. Higher is better; can suffer from a resolution limit on very large graphs.
CPM (Constant Potts Model): Maximizes internal edge weight minus gamma × a penalty for community size. Gamma (resolution) controls granularity: larger gamma → more/smaller communities; smaller gamma → fewer/larger. Supports custom node sizes via nodeSize.

CPM

CPM is a resolution-tunable objective. During optimization this package uses node sizes (via nodeSize), so gains reflect community size; with default nodeSize=1 this matches “unit-count” CPM. If you supply custom node sizes, the optimization becomes size-aware. The quality() reporter supports:

options.cpmMode: 'unit' | 'size-aware' (affects only how quality() is computed, not the move heuristic). If you use custom node sizes and want "unit-count" reporting, keep cpmMode='unit'.

Multilayer graphs

Pass an array of layers: [{ graph, weight?, linkWeight?, nodeSize? }]. Edges are aggregated by summing weight * linkWeight(link) per layer. All layers must have the same set of node ids. Node sizes default to the first layer’s nodeSize.

Example:

const result = detectClusters([
  { graph: layer1, weight: 1.0 },
  { graph: layer2, weight: 0.2 },
], { quality: 'modularity', randomSeed: 7 })

When to use it

Multiplex networks: combine different relation types (e.g., friendship + collaboration).
Temporal smoothing: blend snapshots with weights to reduce short-term noise.
Heterogeneous signals: fuse a strong–sparse layer with a weak–dense layer.
Denoising: add a lightly weighted prior layer to stabilize communities.

Directed graphs

Set { directed: true } to use Leicht–Newman directed modularity. Edges are taken as-is; no reciprocal links are needed.

Undirected handling

If directed: false (default), the adapter symmetrizes edges so you don’t need to add reciprocal links to your ngraph.graph.
If both directions exist between two nodes with possibly different weights, the adapter averages them: w_undirected = (w_ij + w_ji) / 2, then emits symmetric edges with that weight. This preserves total weight and keeps results consistent whether you supplied one or both directions.

Constraints and ergonomics

Fixed nodes: keep given nodes in their initial communities at the finest level (also respected during refinement).
Community size limit: maxCommunitySize prevents moves that would exceed the given total nodeSize.
Determinism: randomSeed controls shuffle order; repeated runs with the same seed and inputs are deterministic.
Negative weights and self-loops are supported; use negative weights with care as modularity assumptions may not hold theoretically.

Examples

Modularity with custom weights

const g = createGraph()
g.addNode('x'); g.addNode('y'); g.addNode('z')
g.addLink('x','y', { weight: 2 }); g.addLink('y','x', { weight: 2 })
g.addLink('y','z', { weight: 0.3 }); g.addLink('z','y', { weight: 0.3 })

const res = detectClusters(g, { quality: 'modularity', randomSeed: 1, linkWeight: l => l.data?.weight ?? 1 })

CPM with resolution and node sizes

const res = detectClusters(g, {
  quality: 'cpm',
  resolution: 0.5,
  nodeSize: n => n.data?.size ?? 1,
  randomSeed: 3,
})

Fix a subset of nodes

const res = detectClusters(g, { fixedNodes: new Set(['x','z']), randomSeed: 11, refine: true })

Build and test

Build library bundles: npm run build
Run tests: npm test

CLI

This package ships a small CLI for quick community detection (via npx or installed locally).

Input format is auto-detected by file extension (.dot/.gv/.json) or by content (tries JSON first, then DOT). Use --format to override.

Usage examples

From a DOT file

npx ngraph.leiden --in graph.dot --out membership.json

From stdin

cat graph.dot | npx ngraph.leiden --membership-only

From a JSON edgelist (either an array of {source,target,weight?} or {nodes,links})

cat edges.json | npx ngraph.leiden > out.json

Options

--directed — treat input as directed
--quality modularity|cpm (default modularity)
--resolution <gamma> — CPM resolution parameter
--candidate-strategy neighbors|all|random|random-neighbor
--max-levels, --max-local-passes, --random-seed
--max-community-size <num>
--allow-new-community, --no-refine
--fixed <file> — newline-separated node ids to keep fixed at level 0
--membership-only — print only the id->community map

Output

Defaults to JSON on stdout; use --out <file> to write to a file.
Formats:
- JSON (default): full object {membership, meta} or mapping only with --membership-only.
- CSV: --out-format csv prints header nodeId,communityId.
- DOT: --out-format dot overlays community="..." on nodes using ngraph.todot.

Quality-only evaluation

Evaluate modularity/CPM for an existing membership mapping without running detection:

npx ngraph.leiden \
  --in graph.dot --format dot \
  --evaluate --membership membership.json \
  --quality modularity

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
benchmarks		benchmarks
bin		bin
demo		demo
src		src
test		test
.gitignore		.gitignore
README.md		README.md
deploy.sh		deploy.sh
package-lock.json		package-lock.json
package.json		package.json
vite.config.js		vite.config.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ngraph.leiden

Install

Quick start

API

quality

resolution (CPM)

directed

randomSeed

candidateStrategy

allowNewCommunity

maxCommunitySize

refine

fixedNodes

preserveLabels

linkWeight

nodeSize

maxLevels, maxLocalPasses

Vocabulary

CPM

Multilayer graphs

Directed graphs

Constraints and ergonomics

Examples

Build and test

CLI

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ngraph.leiden

Install

Quick start

API

quality

resolution (CPM)

directed

randomSeed

candidateStrategy

allowNewCommunity

maxCommunitySize

refine

fixedNodes

preserveLabels

linkWeight

nodeSize

maxLevels, maxLocalPasses

Vocabulary

CPM

Multilayer graphs

Directed graphs

Constraints and ergonomics

Examples

Build and test

CLI

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages