Skip to content

Conversation

@danieljbruce
Copy link
Contributor

@danieljbruce danieljbruce commented Jul 3, 2025

Description

This PR adds source code changes with plumbing so that client side metric collection can be done for unary calls. No client side metrics are actually collected yet for readModifyWriteRow or checkAndMutate calls, but future PRs will use this plumbing to collect metrics for those calls. A test is provided that makes a fake method on the table to demonstrate how a method would collect these metrics to serve as an example.

Impact

Doesn't change the way the client library works. Just provides support for when we open a PR that collects client side metrics for unary calls.

Testing

A unit test is created that adds a fake method to a table that does a basic unary readModifyWriteRow call. It ensures that the metrics make it to the test metrics handler correctly.

Additional Information

Changes:

src/interceptor.ts: This file contains code for attaching interceptors to a call that will call the correct metrics collector methods when data arrives

src/client-side-metrics/operation-metrics-collector.ts: The types are cleaned up in this file. None of the functionality has changed in this file

system-test/read-modify-write-row-interceptors.ts: This adds a test that ensures the plumbing for collecting metadata trailers and headers works as intended for unary calls.

danieljbruce and others added 5 commits July 2, 2025 17:26
This commit introduces a new system test for client-side metrics related to
the ReadModifyWriteRow operation.

The test (`system-test/isolated-read-modify-write-row-metrics.ts`) is designed
to be self-contained:
- It uses a `TestMetricsHandler` to capture metrics locally.
- It simulates the gRPC call for ReadModifyWriteRow to avoid reliance on
  a running emulator for its core assertions regarding metrics collection.
- It verifies that zone, cluster, and server timing are correctly extracted
  from simulated metadata and reported to the metrics handler.

The previous non-isolated test file for this functionality has been removed.

A minor modification was also made to `system-test/bigtable.ts` to ensure
the global Bigtable client (used in suite-level before/after hooks) is
initialized with a dummy project ID and emulator endpoint. This addresses
potential 'project ID not found' errors when an emulator is used,
although current test environment issues indicate the emulator is not connected.
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. api: bigtable Issues related to the googleapis/nodejs-bigtable API. labels Jul 3, 2025
google-labs-jules bot and others added 23 commits July 3, 2025 19:07
Refactors the system test for ReadModifyWriteRow client-side metrics
to use a mock gRPC server and custom interceptors.

Key changes in `system-test/read-modify-write-row-interceptors.ts`:
- A mock Bigtable gRPC service is implemented, providing a controllable
  `ReadModifyWriteRow` endpoint that can return specific responses,
  initial metadata (for server-timing), and trailing metadata (for
  zone/cluster via x-cbt-op-details).
- A custom gRPC interceptor provider, `createMetricsInterceptorProvider`,
  is introduced. This interceptor is attached to the `bigtable.request`
  call and invokes the lifecycle methods of an `OperationMetricsCollector`.
- The test now creates a `fakeReadModifyWriteRow` method on a Table
  instance. This method makes a `bigtable.request` call (configured with
  the custom interceptor) to the mock server.
- Assertions verify that the `TestMetricsHandler` correctly receives metrics
  populated by the `OperationMetricsCollector` via the interceptor pipeline,
  including data derived from the mock server's metadata (zone, cluster,
  server timing).

This approach allows for more thorough testing of the metrics collection
mechanism, including the interceptor integration, in a controlled environment.
This commit refactors the ReadModifyWriteRow client metrics system test
(`system-test/read-modify-write-row-interceptors.ts`) to improve how the
`OperationMetricsCollector` lifecycle is managed in conjunction with the
mock gRPC server and custom interceptor.

Changes:
- The `createMetricsInterceptorProvider` function has been simplified. The
  interceptor it creates is now solely responsible for relaying in-flight
  gRPC data (initial metadata, response messages, final status with trailing
  metadata) to the `OperationMetricsCollector` by calling `onMetadataReceived`,
  `onResponse`, and `onStatusMetadataReceived`.
- The `fakeReadModifyWriteRow` method, which orchestrates the test call,
  now explicitly manages the `OperationMetricsCollector`'s broader lifecycle.
  It calls `onOperationStart` and `onAttemptStart` *before* the `bigtable.request()`
  call to the mock server. *After* the request promise settles (either resolves
  or rejects), it calls `onAttemptComplete` and `onOperationComplete` with the
  final status of the RPC attempt.

This revised structure more accurately reflects the division of responsibilities:
  - The calling code (simulated by `fakeReadModifyWriteRow`) manages the
    overall operation and attempt lifecycle with the collector.
  - The gRPC interceptor observes and passes along the data specific to the
    RPC attempt as it flows through.

This should lead to more accurate testing of the metrics collection pipeline.
This commit adds detailed console.log statements throughout the
`system-test/read-modify-write-row-interceptors.ts` test file.

Logging has been added to:
- The `fakeReadModifyWriteRow` method to trace its execution flow,
  particularly around the `await bigtable.request()` call.
- The `createMetricsInterceptorProvider` function and the methods of the
  gRPC `InterceptingCall` object it returns (e.g., `start`,
  `onReceiveMetadata`, `onReceiveMessage`, `onReceiveStatus`).
- The `MockBigtableService.ReadModifyWriteRow` method to track when the
  mock server is hit and what it's attempting to send.

These logs are intended to help diagnose potential issues with the
execution order, such as the `await bigtable.request()` call completing
prematurely or interceptor hooks not being triggered as expected.
This will be useful once the test suite's environment allows the test
file to be reached (i.e., global hooks requiring emulator connection
can pass).
…tps://github.com/googleapis/nodejs-bigtable into feat/read-modify-write-row-metrics-isolated-test

# Conflicts:
#	system-test/read-modify-write-row-interceptors.ts
};
}

export function withInterceptors(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show me how you expect to use this? Does something like this need to be called at the start of each request?

Copy link
Contributor Author

@danieljbruce danieljbruce Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a method call we write something like this:

        bigtable.request(
          {
            client: 'BigtableClient',
            method: 'readModifyWriteRow',
            reqOpts: {
              tableName: table.name,
              rowKey: Buffer.from(ROW_KEY),
              rules: [
                {
                  familyName: COLUMN_FAMILY,
                  columnQualifier: Buffer.from(COLUMN),
                  appendValue: Buffer.from('-wood'),
                },
              ],
              appProfileId: undefined,
            },
            gaxOpts: withInterceptors({}, metricsCollector),
          },
          (err: ServiceError | null, resp?: any) => {
            if (err) {
              reject(err);
            } else {
              resolve(resp);
            }
          },
        );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the tests below. The interceptors are added onto gax options.

);
assert.ok(attemptCompleteData.attemptLatency >= 0);
assert(attemptCompleteData.serverLatency);
assert.ok(attemptCompleteData.serverLatency >= 0);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where we prove we are getting the server latency from the metadata headers.

assert.strictEqual(
operationCompleteData.metricsCollectorData.cluster,
CLUSTER,
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where we prove we are getting the Zone/Cluster from the trailer.

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: xl Pull request size is extra large. labels Jul 14, 2025
Copy link

@daniel-sanche daniel-sanche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good for the most part, but I have a few small comments

}
if (!gaxOptions.otherArgs.options.interceptors) {
gaxOptions.otherArgs.options.interceptors = [interceptor];
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need to add the interceptor, even if other interceptors were added?

Copy link
Contributor Author

@danieljbruce danieljbruce Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That really depends on how we want to design the system. Here are the pros and cons of adding the interceptor.

Pros:

  • Existing users that are already using interceptors, likely a small minority of our users, will automatically be using the new interceptor now so they don't have to add the CSM interceptor to their interceptors

Cons:

  • It means the user doesn't have full control of the interceptors which reduces flexibility

Yeah, I guess the pros outweigh the cons so we can add this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not sure I like this change anymore. If the interceptors are not an array (although they should be) then trying to push another interceptor onto interceptors will throw an error. I don't want users to experience that just in case they are using interceptors incorrectly. As you can see in the screenshot, the compiler doesn't guarantee any particular structure for the interceptors property.

image

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I'm understanding this right: with the current code, if the user passes in their own interceptors, they would effectively be disabling csm, right?

What else could they be passing other than an array? There must be some docstring somewhere telling us what we need to support

Copy link
Contributor Author

@danieljbruce danieljbruce Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the user passes in their own interceptors, they would effectively be disabling csm, right?

Yes, unless they explicitly passed in the CSM interceptor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this says "Additional arguments to be passed to the API calls." That's not helpful enough so I'll keep looking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the code examples pass interceptors in as an array. If you feel strongly about adding it to the existing interceptors then we can make that change, but I just don't like how the compiler doesn't guarantee that it is an array.

Copy link

@daniel-sanche daniel-sanche Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check the type of gaxOptions.otherArgs.options.interceptors, and only append if it is a list? And ignore unexpected input types?

I really don't think we should silently disable CSM if the user passes in an interceptor. That would present as a bug, and it could be difficult to track down

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I made this change.

},
onReceiveStatus: (status, nextStat) => {
if (status.code === GrpcStatus.OK && savedReceiveMessage) {
collector.onResponse(); // Call onResponse for successful unary calls with a message

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the right place for onResponse? Why not put this in onReceiveMessage?

Also, isn't onResponse only used by read rows? should createMetricsInterceptorProvider have a bool argument indicating whether to call onResponse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, isn't onResponse only used by read rows?

This is a good point. I don't think we need the call to onResponse at all because this interceptor should only be used for unary calls. For streaming calls like readRows, if we really did want to call onResponse earlier like right when the client receives the message rather than after it passes through gax then we should have a different interceptor that doesn't do the status and metadata parse. Unless we plan on making changes to the streaming calls to change the way they do status and metadata parsing?

My suggestion is to remove the onResponse call for now since the intent is to only use these interceptors for unary calls.

Then as a backlog item we can decide:

  1. If streaming calls should use interceptors as well. If not no changes are required
  2. I did an experiment and confirmed that onReceiveMessage seems to be the right place to do the onResponseCall
  3. Make the withInterceptors method accept a parameter that picks the right interceptor. ie. if ReadRows then call onResponse

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, yeah IIRC onResponse is only for read_rows, so probably don't need it here

I think interceptors are optional for streams. If it would make the code more clean, and let us strip out some extra logic, we should go for it. But if it's easier to grab the metadata using wrappers, that's fine too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the code which appears to always send back an OK code when there are partial failure errors?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was that last comment meant for the other thread, where we were discussing Partial Errors?

I don't think that section of code tells us much on its own though. Where is the error being handled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was that last comment meant for the other thread, where we were discussing Partial Errors?

Yes I think so. I see it contains the link to the same code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think interceptors are optional for streams

Yes. I think so too. And even if not then this isn't a blocker. I think no further action is required here.

nextMd(metadata);
},
onReceiveMessage: (message, nextMsg) => {
savedReceiveMessage = message; // Still need to know if a message was received for onResponse
Copy link

@daniel-sanche daniel-sanche Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this variable still needed? If not, is this event onReceiveMessage handler still needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This variable has been removed now.

Copy link

@daniel-sanche daniel-sanche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my only remaining concern is that users passing in interceptors would disable CSM.

We need to either document that and explain how to manually pass the CSM interceptor, or make sure we append the interceptor to whatever the user passes in

},
onReceiveMessage: (message, nextMsg) => {
nextMsg(message);
},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this block actually needed? If we left this off, doesn't it just call nextMesg(message) automatically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I tested it and we can ignore this block.

},
sendMessage: (message, next) => next(message),
halfClose: next => next(),
cancel: next => next(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these unimplemented callbacks needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. These aren't necessary either. If you remove them then messages still get sent etc. They were just copy/paste from the logging interceptor code.

@danieljbruce danieljbruce added the automerge Merge the pull request once unit tests and other checks pass. label Jul 23, 2025
Copy link

@daniel-sanche daniel-sanche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if (!gaxOptions.otherArgs.options.interceptors) {
gaxOptions.otherArgs.options.interceptors = [interceptor];
} else {
if (Array.isArray(gaxOptions.otherArgs.options.interceptors)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this would scan better as an else if

@gcf-merge-on-green gcf-merge-on-green bot merged commit c267ede into main Jul 23, 2025
20 of 23 checks passed
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Jul 23, 2025
@gcf-merge-on-green gcf-merge-on-green bot deleted the feat/read-modify-write-row-metrics-isolated-test branch July 23, 2025 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigtable Issues related to the googleapis/nodejs-bigtable API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants