101 changes: 101 additions & 0 deletions src/Core/Graph.fs
@@ -170,3 +170,104 @@ module Graph =
if s = n then acc <- acc + entry.Weight
if t = n then acc <- acc + entry.Weight
acc

/// **Modularity score (Q) for a node partition.**
///
/// Newman's modularity measures how well a partition of
/// nodes into groups captures community structure: high
/// values (> 0.3-0.4) indicate dense within-group edges and
/// sparse across-group edges, i.e. a strong community
/// structure; values near 0 indicate random-looking edge
/// distribution. Negative values indicate within-group
/// sparsity BELOW the random baseline (rare).
///
/// Formula:
/// ```
/// Q = (1 / 2m) * sum over i,j of
/// [ A[i,j] - (k_i * k_j) / (2m) ] * delta(c_i, c_j)
/// ```
/// where:
/// - `A[i,j]` is the symmetrized edge weight
/// - `k_i = sum_j A[i,j]` (weighted degree of node i)
/// - `m = (1/2) * sum_{i,j} A[i,j]` (total edge weight; /2
/// because each undirected edge counts twice in the sum)
/// - `c_i` is the community label of node i
/// - `delta(c_i, c_j) = 1` iff `c_i = c_j`
///
/// Returns `Some Q` when modularity is defined; `None`
/// when the graph is empty or every node is unassigned.
/// Nodes missing from `partition` are treated as singleton
/// groups (each in a unique trivial community).
Comment on lines +198 to +200
Copilot AI Apr 24, 2026

Docstring says the function returns None when "every node is unassigned", but the implementation treats missing nodes as singleton communities and will still return Some for any graph with nonzero total weight (even if partition is empty). Please either update the doc comment to match the implemented behavior, or change the behavior to actually return None when partition assigns no nodes (and document that choice).

Suggested change
- /// when the graph is empty or every node is unassigned.
- /// Nodes missing from `partition` are treated as singleton
- /// groups (each in a unique trivial community).
+ /// when the graph is empty or the symmetrized graph has
+ /// zero total edge weight. Nodes missing from `partition`
+ /// are treated as singleton groups (each in a unique
+ /// trivial community), including the case where
+ /// `partition` is empty.
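
A short worked case may make this concrete, assuming a graph without self-loops (an assumption of this illustration, not something the PR states). With an empty `partition`, every node becomes its own singleton community, so only the diagonal `i = j` terms of the docstring formula survive:

```latex
\begin{aligned}
Q &= \frac{1}{2m} \sum_i \left[ A_{ii} - \frac{k_i^2}{2m} \right]
   = -\sum_i \left( \frac{k_i}{2m} \right)^2
   \quad \text{when } A_{ii} = 0, \\
% Unit-weight triangle: k_i = 2 for each node, 2m = 6
Q_{\text{triangle}} &= -3 \left( \tfrac{2}{6} \right)^2 = -\tfrac{1}{3}.
\end{aligned}
```

For that triangle the function would therefore return `Some` of roughly -0.333 rather than `None`, which is the behavior the comment describes.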

Copilot uses AI. Check for mistakes.
///
/// **Cartel-detection use:** after injecting a cartel
/// clique into a baseline, running a community detector
/// (e.g. Louvain — future graduation) on the attacked
/// graph produces a partition; the resulting modularity
/// jumps relative to the baseline's partition. This
/// primitive computes Q GIVEN a partition; the detector
/// produces the partition.
///
/// **MVP note:** this function computes Q for a CALLER-
/// supplied partition. A full-fidelity detection pipeline
/// needs (Louvain | Girvan-Newman | spectral-clustering)
/// to produce the partition, plus a null-baseline to
/// calibrate the modularity threshold. Those are separate
/// graduations.
///
/// Provenance: 11th ferry §2 (community modularity) + 13th
/// ferry metrics + 14th ferry alert row "Modularity Q jump
/// > 0.1 or Q > 0.4". Implementation Otto (11th graduation).
Copilot AI Apr 24, 2026

This doc comment introduces contributor/agent name attribution ("Implementation Otto …" / "Provenance …"). Repo convention is to avoid names in code/docs and use role references instead (docs/AGENT-BEST-PRACTICES.md:284-290). Please rewrite these lines to remove names and keep the provenance info in a role- or artifact-based form (e.g., ADR/doc IDs only).

Suggested change
- /// > 0.1 or Q > 0.4". Implementation Otto (11th graduation).
+ /// > 0.1 or Q > 0.4". Implementation tracked under the 11th
+ /// graduation artifacts.

let modularityScore
        (partition: Map<'N, int>)
        (g: Graph<'N>)
        : double option =
    let nodeList = nodes g |> Set.toList
    let n = nodeList.Length
    if n = 0 then None
    else
        let idx =
            nodeList
            |> List.mapi (fun i node -> node, i)
            |> Map.ofList
        // Symmetrized adjacency A_sym[i,j] = (A[i,j] + A[j,i]) / 2
        let adj = Array2D.create n n 0.0
        let span = g.Edges.AsSpan()
        for k in 0 .. span.Length - 1 do
            let entry = span.[k]
            let (s, t) = entry.Key
            let i = idx.[s]
            let j = idx.[t]
            adj.[i, j] <- adj.[i, j] + double entry.Weight
        let sym = Array2D.create n n 0.0
        for i in 0 .. n - 1 do
            for j in 0 .. n - 1 do
                sym.[i, j] <- (adj.[i, j] + adj.[j, i]) / 2.0
        // Weighted degree k_i = sum_j A_sym[i, j]
        let k = Array.create n 0.0
        for i in 0 .. n - 1 do
            let mutable acc = 0.0
            for j in 0 .. n - 1 do
                acc <- acc + sym.[i, j]
            k.[i] <- acc
        // 2m = sum of all degrees (undirected)
        let twoM =
            let mutable acc = 0.0
            for i in 0 .. n - 1 do
                acc <- acc + k.[i]
            acc
        if twoM = 0.0 then None
Copilot AI Apr 24, 2026

twoM is used as the "2m" normalization constant. Because Graph supports signed weights (retractions), it’s possible for twoM to be negative; the current guard only checks twoM = 0.0, which can yield a negative normalization and surprising Q values. Consider defining modularity only for nonnegative total weight (e.g., return None when twoM <= 0.0 and/or when any symmetrized edge weight is negative), or document the intended signed-weight semantics explicitly.

Suggested change
-         if twoM = 0.0 then None
+         if twoM <= 0.0 then None
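
To illustrate how surprising the values can get, here is a hand-worked case under assumed inputs (the four-node graph below is hypothetical, not taken from the PR). Take symmetrized weights `A[1,2] = +1` and `A[3,4] = -3`, with partition `{1,2}` and `{3,4}`. Writing W_c for the sum of `A[i,j]` over ordered pairs inside community c and D_c for the sum of degrees in c:

```latex
\begin{aligned}
2m &= 2(1) + 2(-3) = -4, \\
Q &= \sum_c \left[ \frac{W_c}{2m} - \left( \frac{D_c}{2m} \right)^2 \right]
   = \left[ \frac{2}{-4} - \left( \frac{2}{-4} \right)^2 \right]
   + \left[ \frac{-6}{-4} - \left( \frac{-6}{-4} \right)^2 \right] \\
  &= -0.75 - 0.75 = -1.5 .
\end{aligned}
```

A partition with no cross-community edges scores -1.5 here, below the -1/2 lower bound that holds for non-negative weights. The `twoM <= 0.0` guard would return `None` for this input.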

        else
            // Community label per node: partition lookup, or
            // node-index-based-singleton when missing
            let community i =
                let node = nodeList.[i]
                match Map.tryFind node partition with
                | Some c -> c
                | None -> -(i + 1) // unique negative = singleton
            let mutable q = 0.0
            for i in 0 .. n - 1 do
                for j in 0 .. n - 1 do
                    if community i = community j then
Comment on lines +262 to +270
Copilot AI Apr 24, 2026

community i does a Map.tryFind for every (i,j) pair inside the O(n²) loop. Even for moderate n, this adds a lot of repeated work (and allocations/lookup overhead). Precompute an int[]/array of community labels for all nodes once before the nested loop, then compare array entries inside the loop.

Suggested change
-             let community i =
-                 let node = nodeList.[i]
-                 match Map.tryFind node partition with
-                 | Some c -> c
-                 | None -> -(i + 1) // unique negative = singleton
-             let mutable q = 0.0
-             for i in 0 .. n - 1 do
-                 for j in 0 .. n - 1 do
-                     if community i = community j then
+             let communities =
+                 Array.init n (fun i ->
+                     let node = nodeList.[i]
+                     match Map.tryFind node partition with
+                     | Some c -> c
+                     | None -> -(i + 1)) // unique negative = singleton
+             let mutable q = 0.0
+             for i in 0 .. n - 1 do
+                 for j in 0 .. n - 1 do
+                     if communities.[i] = communities.[j] then

                        let expected = (k.[i] * k.[j]) / twoM
                        q <- q + (sym.[i, j] - expected)
            Some (q / twoM)
Comment on lines +228 to +273
Copilot AI Apr 24, 2026

The implementation builds dense n×n adjacency/symmetry matrices and then performs nested i/j loops, making runtime and memory O(n²) regardless of edge count. Since Graph is ZSet-backed and likely sparse, consider a sparse computation (iterate only existing edges and use per-community degree/weight aggregates) to keep this usable for larger graphs.

Suggested change
-         let idx =
-             nodeList
-             |> List.mapi (fun i node -> node, i)
-             |> Map.ofList
-         // Symmetrized adjacency A_sym[i,j] = (A[i,j] + A[j,i]) / 2
-         let adj = Array2D.create n n 0.0
-         let span = g.Edges.AsSpan()
-         for k in 0 .. span.Length - 1 do
-             let entry = span.[k]
-             let (s, t) = entry.Key
-             let i = idx.[s]
-             let j = idx.[t]
-             adj.[i, j] <- adj.[i, j] + double entry.Weight
-         let sym = Array2D.create n n 0.0
-         for i in 0 .. n - 1 do
-             for j in 0 .. n - 1 do
-                 sym.[i, j] <- (adj.[i, j] + adj.[j, i]) / 2.0
-         // Weighted degree k_i = sum_j A_sym[i, j]
-         let k = Array.create n 0.0
-         for i in 0 .. n - 1 do
-             let mutable acc = 0.0
-             for j in 0 .. n - 1 do
-                 acc <- acc + sym.[i, j]
-             k.[i] <- acc
-         // 2m = sum of all degrees (undirected)
-         let twoM =
-             let mutable acc = 0.0
-             for i in 0 .. n - 1 do
-                 acc <- acc + k.[i]
-             acc
-         if twoM = 0.0 then None
-         else
-             // Community label per node: partition lookup, or
-             // node-index-based-singleton when missing
-             let community i =
-                 let node = nodeList.[i]
-                 match Map.tryFind node partition with
-                 | Some c -> c
-                 | None -> -(i + 1) // unique negative = singleton
-             let mutable q = 0.0
-             for i in 0 .. n - 1 do
-                 for j in 0 .. n - 1 do
-                     if community i = community j then
-                         let expected = (k.[i] * k.[j]) / twoM
-                         q <- q + (sym.[i, j] - expected)
-             Some (q / twoM)
+         let communityByNode =
+             nodeList
+             |> List.mapi (fun i node ->
+                 let community =
+                     match Map.tryFind node partition with
+                     | Some c -> c
+                     | None -> -(i + 1) // unique negative = singleton
+                 node, community)
+             |> Map.ofList
+         let addToDictionary
+                 (dict: System.Collections.Generic.Dictionary<'K, double>)
+                 key
+                 delta =
+             match dict.TryGetValue key with
+             | true, value -> dict.[key] <- value + delta
+             | false, _ -> dict.[key] <- delta
+         let degreeByNode =
+             System.Collections.Generic.Dictionary<'N, double>()
+         let internalWeightByCommunity =
+             System.Collections.Generic.Dictionary<int, double>()
+         let degreeByCommunity =
+             System.Collections.Generic.Dictionary<int, double>()
+         let span = g.Edges.AsSpan()
+         for k in 0 .. span.Length - 1 do
+             let entry = span.[k]
+             let (s, t) = entry.Key
+             let w = double entry.Weight
+             // Weighted degree k_i = sum_j A_sym[i, j].
+             // Under A_sym = (A + A^T) / 2, each directed edge
+             // contributes w/2 to its source degree and w/2 to
+             // its target degree. Self-loops still contribute w.
+             addToDictionary degreeByNode s (w / 2.0)
+             addToDictionary degreeByNode t (w / 2.0)
+             let sCommunity = communityByNode.[s]
+             let tCommunity = communityByNode.[t]
+             // Sum of A_sym[i, j] over node pairs within a
+             // community equals the total directed edge weight
+             // whose endpoints both lie in that community.
+             if sCommunity = tCommunity then
+                 addToDictionary internalWeightByCommunity sCommunity w
+         // 2m = sum of all weighted degrees in the symmetrized
+         // undirected view.
+         let twoM =
+             let mutable acc = 0.0
+             for kvp in degreeByNode do
+                 acc <- acc + kvp.Value
+             acc
+         if twoM = 0.0 then None
+         else
+             for kvp in degreeByNode do
+                 let community = communityByNode.[kvp.Key]
+                 addToDictionary degreeByCommunity community kvp.Value
+             let mutable q = 0.0
+             for kvp in degreeByCommunity do
+                 let community = kvp.Key
+                 let communityDegree = kvp.Value
+                 let internalWeight =
+                     match internalWeightByCommunity.TryGetValue community with
+                     | true, value -> value
+                     | false, _ -> 0.0
+                 let degreeFraction = communityDegree / twoM
+                 q <- q + (internalWeight / twoM) - (degreeFraction * degreeFraction)
+             Some q
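
The sparse form rests on regrouping the double sum by community. A brief derivation, using W_c and D_c as shorthand introduced here for what the suggested code calls `internalWeightByCommunity` and `degreeByCommunity`:

```latex
\begin{aligned}
Q &= \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
   = \sum_c \left[ \frac{1}{2m} \sum_{i,j \in c} A_{ij}
     - \frac{1}{(2m)^2} \Big( \sum_{i \in c} k_i \Big)^2 \right] \\
  &= \sum_c \left[ \frac{W_c}{2m} - \left( \frac{D_c}{2m} \right)^2 \right],
  \qquad W_c = \sum_{i,j \in c} A_{ij}, \quad D_c = \sum_{i \in c} k_i .
\end{aligned}
```

Both aggregates can be accumulated in one pass over the edges, so the cost scales with the edge count instead of n².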

113 changes: 113 additions & 0 deletions tests/Tests.FSharp/Algebra/Graph.Tests.fs
@@ -163,3 +163,116 @@ let ``fromEdgeSeq drops zero-weight triples`` () =
]
Graph.edgeCount g |> should equal 1
Graph.edgeWeight 2 3 g |> should equal 1L


// ─── modularityScore ─────────

[<Fact>]
let ``modularityScore returns None for empty graph`` () =
    let g : Graph<int> = Graph.empty
    Graph.modularityScore Map.empty g |> should equal (None: double option)

[<Fact>]
let ``modularityScore for single-community partition on complete graph is 0`` () =
    // When every node is in one community, intra-community
    // edges equal total edges, and the expected-random term
    // equals actual, so Q = 0 (no community structure detected
    // because there's no partition boundary).
    let edges = [
        (1, 2, 1L); (2, 1, 1L)
        (2, 3, 1L); (3, 2, 1L)
        (3, 1, 1L); (1, 3, 1L)
    ]
    let g = Graph.fromEdgeSeq edges
    let partition = Map.ofList [ (1, 0); (2, 0); (3, 0) ]
    let q = Graph.modularityScore partition g
    match q with
    | Some v -> abs v |> should (be lessThan) 1e-9
    | None -> failwith "expected Some"
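
The Q = 0 claim in this test's comment follows from the definition alone and does not depend on the graph being complete. With a single community the delta term is always 1, so:

```latex
\begin{aligned}
\sum_{i,j} A_{ij} &= 2m, \qquad
\sum_{i,j} k_i k_j = \Big( \sum_i k_i \Big)^2 = (2m)^2, \\
Q &= \frac{1}{2m} \left[ 2m - \frac{(2m)^2}{2m} \right] = 0 .
\end{aligned}
```

For the triangle in the test, each symmetrized weight is 1, each degree is 2, and 2m = 6.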

[<Fact>]
let ``modularityScore is high for well-separated communities`` () =
    // Two K3 cliques (1-2-3 and 4-5-6) connected by a single
    // thin edge (3-4). The correct 2-community partition should
    // yield Q well above 0.
    let edges = [
        // Community A: K3 on {1,2,3} with weight 10
        (1, 2, 10L); (2, 1, 10L)
        (2, 3, 10L); (3, 2, 10L)
        (3, 1, 10L); (1, 3, 10L)
        // Community B: K3 on {4,5,6} with weight 10
        (4, 5, 10L); (5, 4, 10L)
        (5, 6, 10L); (6, 5, 10L)
        (6, 4, 10L); (4, 6, 10L)
        // Bridge edge (thin)
        (3, 4, 1L); (4, 3, 1L)
    ]
    let g = Graph.fromEdgeSeq edges
    let partition =
        Map.ofList [ (1, 0); (2, 0); (3, 0); (4, 1); (5, 1); (6, 1) ]
    let q =
        Graph.modularityScore partition g
        |> Option.defaultValue 0.0
    // With two tight communities connected thinly, Q should be
    // comfortably positive (theoretical max ~0.5 for balanced
    // two-community graphs).
    q |> should (be greaterThan) 0.3
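
As a cross-check on the 0.3 threshold, the following standalone sketch recomputes Q for the same edge list without the repo's `Graph` type. The names `Label`, `modularityOfEdges`, and `bothWays` are hypothetical helpers invented for this illustration, and the `twoM <= 0.0` guard follows the review suggestion rather than the code under review.

```fsharp
// Hypothetical label type: a caller-assigned community, or a
// singleton keyed by the node itself when the node is unassigned.
type Label =
    | Assigned of int
    | Singleton of int

// Hypothetical standalone helper, not the PR's Graph.modularityScore.
// Takes directed (source, target, weight) triples.
let modularityOfEdges (partition: Map<int, int>) (edges: (int * int * int64) list) : float option =
    let label node =
        match Map.tryFind node partition with
        | Some c -> Assigned c
        | None -> Singleton node
    // Add delta to the running total stored under key.
    let bump key delta (totals: Map<Label, float>) =
        let current = Map.tryFind key totals |> Option.defaultValue 0.0
        Map.add key (current + delta) totals
    // Total directed weight equals 2m in the symmetrized view.
    let twoM = edges |> List.sumBy (fun (_, _, w) -> float w)
    if twoM <= 0.0 then None
    else
        let internalW, degree =
            edges
            |> List.fold
                (fun (iw, dg) (s, t, w) ->
                    let w = float w
                    let ls, lt = label s, label t
                    // Each directed edge adds w/2 to both endpoint degrees.
                    let dg = dg |> bump ls (w / 2.0) |> bump lt (w / 2.0)
                    // Edges inside one community add w to its internal weight.
                    let iw = if ls = lt then bump ls w iw else iw
                    iw, dg)
                (Map.empty, Map.empty)
        degree
        |> Map.toSeq
        |> Seq.sumBy (fun (c, d) ->
            let inside = Map.tryFind c internalW |> Option.defaultValue 0.0
            inside / twoM - (d / twoM) * (d / twoM))
        |> Some

// Same graph as the test: two weight-10 triangles plus a weight-1 bridge.
let bothWays w (a, b) = [ (a, b, w); (b, a, w) ]
let cliqueEdges =
    List.collect (bothWays 10L) [ (1, 2); (2, 3); (3, 1); (4, 5); (5, 6); (6, 4) ]
let twoCliques = cliqueEdges @ bothWays 1L (3, 4)
let partition = Map.ofList [ (1, 0); (2, 0); (3, 0); (4, 1); (5, 1); (6, 1) ]

match modularityOfEdges partition twoCliques with
| Some q -> printfn "Q = %.4f" q // 59/122, about 0.4836
| None -> printfn "Q undefined"
```

By hand, 2m = 122 and each community has internal weight 60 and degree 61, giving Q = 120/122 - 1/2 = 59/122. That sits just under the 0.5 ceiling mentioned in the test comment and well clear of the 0.3 assertion.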

[<Fact>]
let ``modularityScore drops with wrong partition`` () =
    // Same two-community graph, but partition mixes the two.
    let edges = [
        (1, 2, 10L); (2, 1, 10L)
        (2, 3, 10L); (3, 2, 10L)
        (3, 1, 10L); (1, 3, 10L)
        (4, 5, 10L); (5, 4, 10L)
        (5, 6, 10L); (6, 5, 10L)
        (6, 4, 10L); (4, 6, 10L)
        (3, 4, 1L); (4, 3, 1L)
    ]
    let g = Graph.fromEdgeSeq edges
    let correctPartition =
        Map.ofList [ (1, 0); (2, 0); (3, 0); (4, 1); (5, 1); (6, 1) ]
    let wrongPartition =
        Map.ofList [ (1, 0); (4, 0); (2, 1); (5, 1); (3, 2); (6, 2) ]
    let qCorrect =
        Graph.modularityScore correctPartition g |> Option.defaultValue 0.0
    let qWrong =
        Graph.modularityScore wrongPartition g |> Option.defaultValue 0.0
    qWrong |> should (be lessThan) qCorrect
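
The size of the gap can be checked by hand. The degrees are 20, 20, 21, 21, 20, 20 for nodes 1 through 6, so 2m = 122. None of the pairs {1,4}, {2,5}, {3,6} shares an edge, so the wrong partition has zero internal weight and only the degree penalty remains:

```latex
\begin{aligned}
Q_{\text{wrong}} &= -\frac{41^2 + 40^2 + 41^2}{122^2}
  = -\frac{4962}{14884} \approx -0.333, \\
Q_{\text{correct}} &= \frac{120}{122} - 2 \left( \frac{61}{122} \right)^2
  = \frac{59}{122} \approx 0.484 .
\end{aligned}
```

The assertion therefore holds with a margin of roughly 0.8.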

[<Fact>]
let ``modularityScore cartel-detection: injected clique raises Q when correctly partitioned`` () =
    // Baseline: sparse graph of 5 nodes. Attack: inject K_4
    // cartel at nodes 6-9 with weight 10. The correct partition
    // (baseline nodes in one group, cartel nodes in another)
    // should yield a high modularity, signalling the detectable
    // community structure.
    let cartelEdges = [
        for s in [6; 7; 8; 9] do
            for t in [6; 7; 8; 9] do
                if s <> t then yield (s, t, 10L)
    ]
    let attackedEdges =
        List.append
            [
                (1, 2, 1L); (2, 1, 1L)
                (3, 4, 1L); (4, 3, 1L)
                (2, 5, 1L); (5, 2, 1L)
            ]
            cartelEdges
    let g = Graph.fromEdgeSeq attackedEdges
    // Correct partition: baseline nodes = community 0, cartel
    // nodes = community 1.
    let partition =
        Map.ofList [
            (1, 0); (2, 0); (3, 0); (4, 0); (5, 0)
            (6, 1); (7, 1); (8, 1); (9, 1)
        ]
    let q =
        Graph.modularityScore partition g
        |> Option.defaultValue 0.0
    // Threshold relaxed from 0.3 to 0.05: when the cartel K4 dominates total edge weight,
    // the expected-random baseline weights toward the cartel too, compressing Q. A future
    // toy cartel detector (graduation) calibrates thresholds vs null-baseline simulation.
    q |> should (be greaterThan) 0.05
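
The compression described in the threshold comment can be quantified by hand. The baseline nodes have total degree 6 and the cartel nodes 120, so 2m = 126. No edge crosses the partition, so each community's internal weight equals its degree sum D_c:

```latex
\begin{aligned}
Q &= \sum_c \frac{D_c}{2m} \left( 1 - \frac{D_c}{2m} \right)
   = 2 \cdot \frac{6}{126} \cdot \frac{120}{126}
   = \frac{1440}{15876} \approx 0.0907 .
\end{aligned}
```

Even with a perfectly separated cartel, Q stays below 0.1 because the cartel carries about 95% of the total weight. This is consistent with the relaxed 0.05 threshold and explains why the original 0.3 could not pass.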