x and y reducers for group and hexbin (#1916)
* x and y reducers for group

* x and y reducers for hexbin
mbostock authored Nov 7, 2023
1 parent 4cf4d73 commit c6c1bcd
Showing 10 changed files with 554 additions and 29 deletions.
2 changes: 2 additions & 0 deletions docs/transforms/group.md
@@ -366,6 +366,8 @@ The following named reducers are supported:
* *deviation* - the standard deviation
* *variance* - the variance per [Welford’s algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
* *identity* - the array of values
+* *x* - the group’s *x* value (when grouping on *x*)
+* *y* - the group’s *y* value (when grouping on *y*)
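
As a rough, standalone illustration of what the *x* reducer yields, the sketch below groups values by key and hands each reducer the group's extent. It does not import Plot: `groupOnX`, `countReducer`, and the sample data are hypothetical names introduced here, and only the `reduceIndex(index, values, extent)` shape is taken from this commit.

```javascript
// Standalone sketch (not Plot's implementation): group on x, then reduce each group.
// `groupOnX` and `countReducer` are hypothetical helpers for illustration only.
const xReducer = {
  reduceIndex(I, V, {x}) {
    return x; // the group's x value; the grouped values are ignored
  }
};

const countReducer = {
  reduceIndex(I) {
    return I.length; // the number of elements in the group
  }
};

function groupOnX(X, outputs) {
  // Collect the indexes belonging to each distinct x value, in input order.
  const groups = new Map();
  X.forEach((x, i) => {
    if (!groups.has(x)) groups.set(x, []);
    groups.get(x).push(i);
  });
  // Apply every output reducer to every group, passing the extent {x}.
  return Array.from(groups, ([x, I]) =>
    Object.fromEntries(
      Object.entries(outputs).map(([name, reducer]) => [name, reducer.reduceIndex(I, X, {x})])
    )
  );
}

const species = ["Adelie", "Gentoo", "Adelie", "Chinstrap", "Adelie"];
const grouped = groupOnX(species, {y: countReducer, fill: xReducer});
console.log(grouped);
```

With Plot itself, the corresponding call would presumably look like `Plot.groupX({y: "count", fill: "x"}, {x: "species"})`, which lets an output channel such as **fill** reuse the group's key.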

In addition, a reducer may be specified as:

17 changes: 13 additions & 4 deletions docs/transforms/hexbin.md
@@ -174,9 +174,9 @@ Plot.plot({

The *options* must specify the **x** and **y** channels. The **binWidth** option (default 20) defines the distance between centers of neighboring hexagons in pixels. If any of **z**, **fill**, or **stroke** is a channel, the first of these channels will be used to subdivide bins.

-The *outputs* options are similar to the [bin transform](./bin.md); each output channel receives as input, for each hexagon, the subset of the data which has been matched to its center. The outputs object specifies the aggregation method for each output channel.
+The *outputs* options are similar to the [bin transform](./bin.md); for each hexagon, an output channel value is derived by reducing the corresponding binned input channel values. The *outputs* object specifies the reducer for each output channel.

-The following aggregation methods are supported:
+The following named reducers are supported:

* *first* - the first value, in input order
* *last* - the last value, in input order
@@ -195,13 +195,22 @@ The following aggregation methods are supported:
* *variance* - the variance per [Welford’s algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
* *mode* - the value with the most occurrences
* *identity* - the array of values
-* a function to be passed the array of values for each bin and the extent of the bin
+* *x* - the hexagon’s *x* center
+* *y* - the hexagon’s *y* center
+
+In addition, a reducer may be specified as:
+
+* a function to be passed the array of values for each bin and the center of the bin
+* an object with a *reduceIndex* method
+
+In the last case, the **reduceIndex** method is repeatedly passed three arguments: the index for each bin (an array of integers), the input channel’s array of values, and the center of the bin (an object {data, x, y}); it must then return the corresponding aggregate value for the bin.
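
The two custom reducer forms described above can be sketched without Plot by calling them on a hand-built bin. The sample `weights`, the `bin` object, and the `fromFunction` adapter are assumptions made for illustration; in particular, `fromFunction` only suggests how a function reducer might be expressed through `reduceIndex` and is not taken from Plot's source.

```javascript
// Standalone sketch of the two custom reducer forms; no Plot import needed.
// The bin below is hand-built sample data, not actual output of the hexbin transform.
const weights = [50, 60, 70, 80];
const bin = {index: [1, 3], extent: {data: weights, x: 120, y: 45}};

// Form 1: a function receiving the bin's values and the bin's center.
const maxReducer = (values, center) => Math.max(...values);

// Form 2: an object whose reduceIndex receives the bin's index, the full
// array of channel values, and the bin's center {data, x, y}.
const labelReducer = {
  reduceIndex(index, values, {x, y}) {
    const total = index.reduce((sum, i) => sum + values[i], 0);
    return `${total} @ (${x}, ${y})`;
  }
};

// Hypothetical adapter: wraps a function reducer so that it can be called
// like a reduceIndex implementation (an assumption, not Plot's code).
function fromFunction(f) {
  return {
    reduceIndex(index, values, center) {
      return f(index.map((i) => values[i]), center);
    }
  };
}

const maxValue = fromFunction(maxReducer).reduceIndex(bin.index, weights, bin.extent);
const label = labelReducer.reduceIndex(bin.index, weights, bin.extent);
console.log(maxValue, label);
```

The function form is convenient when only the bin's values matter; the object form avoids copying values per bin and gives direct access to the index.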

Most reducers require binding the output channel to an input channel; for example, if you want the **y** output channel to be a *sum* (not merely a count), there should be a corresponding **y** input channel specifying which values to sum. If there is not, *sum* will be equivalent to *count*.
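
The fallback from *sum* to *count* can be pictured with a small standalone sketch. It assumes the reducer is chosen by checking whether an input channel was supplied; `pickSum`, `countOnly`, and `sumValues` are hypothetical names, not part of Plot's API.

```javascript
// Standalone sketch, assuming the reducer is selected based on whether an
// input channel exists. All names here are hypothetical.
const countOnly = {reduceIndex: (I) => I.length};
const sumValues = {reduceIndex: (I, V) => I.reduce((s, i) => s + V[i], 0)};

function pickSum(values) {
  // Without an input channel there is nothing to sum, so count instead.
  return values == null ? countOnly : sumValues;
}

const binIndex = [0, 2, 3]; // indexes of the data points in one bin
const r = [5, 1, 2, 10]; // input channel values for all data points

const withInput = pickSum(r).reduceIndex(binIndex, r);
const withoutInput = pickSum(undefined).reduceIndex(binIndex, undefined);
console.log(withInput, withoutInput);
```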

## hexbin(*outputs*, *options*) {#hexbin}

```js
Plot.dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))
```

-Bins (hexagonally) on **x** and **y**. Also groups on the first channel of **z**, **fill**, or **stroke**, if any.
+Bins hexagonally on **x** and **y**. Also groups on the first channel of **z**, **fill**, or **stroke**, if any.
36 changes: 35 additions & 1 deletion src/transforms/group.d.ts
@@ -38,8 +38,42 @@ export interface GroupOutputOptions<T = Reducer> {
z?: ChannelValue;
}

+/**
+ * How to reduce grouped values; one of:
+ *
+ * - a generic reducer name, such as *count* or *first*
+ * - *x* - the group’s **x** value (when grouping on **x**)
+ * - *y* - the group’s **y** value (when grouping on **y**)
+ * - a function that takes an array of values and returns the reduced value
+ * - an object that implements the *reduceIndex* method
+ *
+ * When a reducer function or implementation is used with the group transform,
+ * it is passed the group extent {x, y} as an additional argument.
+ */
+export type GroupReducer = Reducer | GroupReducerFunction | GroupReducerImplementation | "x" | "y";
+
+/**
+ * A shorthand functional group reducer implementation: given an array of input
+ * channel *values*, and the current group’s *extent*, returns the corresponding
+ * reduced output value.
+ */
+export type GroupReducerFunction<S = any, T = S> = (values: S[], extent: {x: any; y: any}) => T;
+
+/** A group reducer implementation. */
+export interface GroupReducerImplementation<S = any, T = S> {
+  /**
+   * Given an *index* representing the contents of the current group, the input
+   * channel’s array of *values*, and the current group’s *extent*, returns the
+   * corresponding reduced output value. If no input channel is supplied (e.g.,
+   * as with the *count* reducer) then *values* may be undefined.
+   */
+  reduceIndex(index: number[], values: S[], extent: {x: any; y: any}): T;
+  // TODO scope
+  // TODO label
+}
+
/** Output channels (and options) for the group transform. */
-export type GroupOutputs = ChannelReducers | GroupOutputOptions;
+export type GroupOutputs = ChannelReducers<GroupReducer> | GroupOutputOptions<GroupReducer>;

/**
* Groups on the first channel of **z**, **fill**, or **stroke**, if any, and
46 changes: 42 additions & 4 deletions src/transforms/group.js
@@ -76,10 +76,10 @@ function groupn(
inputs = {} // input channels and options
) {
// Compute the outputs.
-outputs = maybeOutputs(outputs, inputs);
-reduceData = maybeReduce(reduceData, identity);
-sort = sort == null ? undefined : maybeOutput("sort", sort, inputs);
-filter = filter == null ? undefined : maybeEvaluator("filter", filter, inputs);
+outputs = maybeGroupOutputs(outputs, inputs);
+reduceData = maybeGroupReduce(reduceData, identity);
+sort = sort == null ? undefined : maybeGroupOutput("sort", sort, inputs);
+filter = filter == null ? undefined : maybeGroupEvaluator("filter", filter, inputs);

// Produce x and y output channels as appropriate.
const [GX, setGX] = maybeColumn(x);
@@ -287,6 +287,32 @@ function invalidReduce(reduce) {
throw new Error(`invalid reduce: ${reduce}`);
}

+export function maybeGroupOutputs(outputs, inputs) {
+  return maybeOutputs(outputs, inputs, maybeGroupOutput);
+}
+
+function maybeGroupOutput(name, reduce, inputs) {
+  return maybeOutput(name, reduce, inputs, maybeGroupEvaluator);
+}
+
+function maybeGroupEvaluator(name, reduce, inputs) {
+  return maybeEvaluator(name, reduce, inputs, maybeGroupReduce);
+}
+
+function maybeGroupReduce(reduce, value) {
+  return maybeReduce(reduce, value, maybeGroupReduceFallback);
+}
+
+function maybeGroupReduceFallback(reduce) {
+  switch (`${reduce}`.toLowerCase()) {
+    case "x":
+      return reduceX;
+    case "y":
+      return reduceY;
+  }
+  throw new Error(`invalid group reduce: ${reduce}`);
+}
+
export function maybeSubgroup(outputs, inputs) {
for (const name in inputs) {
const value = inputs[name];
@@ -399,6 +425,18 @@ function reduceProportion(value, scope) {
: {scope, reduceIndex: (I, V, basis = 1) => sum(I, (i) => V[i]) / basis};
}

+const reduceX = {
+  reduceIndex(I, X, {x}) {
+    return x;
+  }
+};
+
+const reduceY = {
+  reduceIndex(I, X, {y}) {
+    return y;
+  }
+};
+
export function find(test) {
if (typeof test !== "function") throw new Error(`invalid test function: ${test}`);
return {
3 changes: 2 additions & 1 deletion src/transforms/hexbin.d.ts
@@ -1,5 +1,6 @@
import type {ChannelReducers, ChannelValue} from "../channel.js";
import type {Initialized} from "./basic.js";
+import type {GroupReducer} from "./group.js";

/** Options for the hexbin transform. */
export interface HexbinOptions {
@@ -43,4 +44,4 @@ export interface HexbinOptions {
*
* To draw empty hexagons, see the hexgrid mark.
*/
-export function hexbin<T>(outputs?: ChannelReducers, options?: T & HexbinOptions): Initialized<T>;
+export function hexbin<T>(outputs?: ChannelReducers<GroupReducer>, options?: T & HexbinOptions): Initialized<T>;
30 changes: 14 additions & 16 deletions src/transforms/hexbin.js
@@ -2,7 +2,7 @@ import {map, number, valueof} from "../options.js";
import {applyPosition} from "../projection.js";
import {sqrt3} from "../symbol.js";
import {initializer} from "./basic.js";
-import {hasOutput, maybeGroup, maybeOutputs, maybeSubgroup} from "./group.js";
+import {hasOutput, maybeGroup, maybeGroupOutputs, maybeSubgroup} from "./group.js";

// We don’t want the hexagons to align with the edges of the plot frame, as that
// would cause extreme x-values (the upper bound of the default x-scale domain)
@@ -16,9 +16,8 @@ export function hexbin(outputs = {fill: "count"}, {binWidth, ...options} = {}) {
const {z} = options;

// TODO filter e.g. to show empty hexbins?
-// TODO disallow x, x1, x2, y, y1, y2 reducers?
binWidth = binWidth === undefined ? 20 : number(binWidth);
-outputs = maybeOutputs(outputs, options);
+outputs = maybeGroupOutputs(outputs, options);

// A fill output means a fill channel; declaring the channel here instead of
// waiting for the initializer allows the mark constructor to determine that
@@ -65,15 +64,15 @@ export function hexbin(outputs = {fill: "count"}, {binWidth, ...options} = {}) {
const binFacet = [];
for (const o of outputs) o.scope("facet", facet);
for (const [f, I] of maybeGroup(facet, G)) {
-for (const bin of hbin(I, X, Y, binWidth)) {
+for (const {index: b, extent} of hbin(data, I, X, Y, binWidth)) {
binFacet.push(++i);
-BX.push(bin.x);
-BY.push(bin.y);
-if (Z) GZ.push(G === Z ? f : Z[bin[0]]);
-if (F) GF.push(G === F ? f : F[bin[0]]);
-if (S) GS.push(G === S ? f : S[bin[0]]);
-if (Q) GQ.push(G === Q ? f : Q[bin[0]]);
-for (const o of outputs) o.reduce(bin);
+BX.push(extent.x);
+BY.push(extent.y);
+if (Z) GZ.push(G === Z ? f : Z[b[0]]);
+if (F) GF.push(G === F ? f : F[b[0]]);
+if (S) GS.push(G === S ? f : S[b[0]]);
+if (Q) GQ.push(G === Q ? f : Q[b[0]]);
+for (const o of outputs) o.reduce(b, extent);
}
}
binFacets.push(binFacet);
@@ -106,7 +105,7 @@ export function hexbin(outputs = {fill: "count"}, {binWidth, ...options} = {}) {
});
}

-function hbin(I, X, Y, dx) {
+function hbin(data, I, X, Y, dx) {
const dy = dx * (1.5 / sqrt3);
const bins = new Map();
for (const i of I) {
@@ -127,11 +126,10 @@ function hbin(I, X, Y, dx) {
const key = `${pi},${pj}`;
let bin = bins.get(key);
if (bin === undefined) {
-bins.set(key, (bin = []));
-bin.x = (pi + (pj & 1) / 2) * dx + ox;
-bin.y = pj * dy + oy;
+bin = {index: [], extent: {data, x: (pi + (pj & 1) / 2) * dx + ox, y: pj * dy + oy}};
+bins.set(key, bin);
}
-bin.push(i);
+bin.index.push(i);
}
return bins.values();
}
273 changes: 273 additions & 0 deletions test/output/hexbinFillX.svg