-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Converting cosmos exception into json format #18092
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Converting cosmos exception into json format #18092
Conversation
|
/azp run java - cosmos - ci |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run java - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Naveen for the PR. This is helpful.
@simplynaveen20 Could you please add a sample exception stacktrace to the PR description?
also:
Serializing diagnostics is expensive and will have perf impact.
Now that we are doing that in the exception stacktrace for each exception we should exclude the business logic errors:
- NotFound (with no substatus code, i.e., for READ_SESSION_NOT_AVAILABLE we need diagnostics),
- Conflict,
- PreconditionFailed.
These are business logic failures we don't need to the diagnostics for this.
NotFound with READ_SESSION_NOT_AVAILABLE substatus code is a consistency failure not business logic we should have diagnostics for that.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosDiagnostics.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosException.java
Show resolved
Hide resolved
Before this PR we were still serializing diagnostics and putting in string on exception message. In this PR we putting diagnostics object in exception parent ObjectNode and then serializing in json at the last. So for computation on diagnostics serialization still remain the same. |
|
/azp run java - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run java - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
moderakh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Feel free to capture the following work in a separate PR (work item)
materializing diagnostics in every exception stacktrace for every exception is expensive and has perf impact, we should exclude the business logic errors:
NotFound (with no substatus code, i.e., for READ_SESSION_NOT_AVAILABLE we need diagnostics),
Conflict,
PreconditionFailed.
These are business logic failures we don't need to the diagnostics for this.
NotFound with READ_SESSION_NOT_AVAILABLE substatus code is a consistency failure not business logic we should have diagnostics for that.
Currently CosmosException toString() and getMessage() method return a mix of non-json text and json diagnostic information.
This is causing problems during debugging the issue when the customer reports the exception.
This PR will convert both toString() and getMessage() into json format. Added test coverage to avoid future contract break.
Examples
toString()-
{
"ClassName": "NotFoundException",
"userAgent": "azsdk-java-cosmos/4.10.0 Windows10/10.0 JRE/11.0.6",
"statusCode": 404,
"resourceAddress": "rntbd://127.0.0.1:10253/apps/DocDbApp/services/DocDbServer20/partitions/a4cb4960-38c8-11e6-8106-8cdcd42c33be/replicas/1p/",
"error": "{"Errors":["Resource Not Found. Learn more: https://aka.ms/cosmosdb-tsg-not-found"]}",
"innerErrorMessage": "["Resource Not Found. Learn more: https://aka.ms/cosmosdb-tsg-not-found"]",
"causeInfo": null,
"responseHeaders": "{x-ms-current-replica-set-size=1, x-ms-last-state-change-utc=Tue, 15 Dec 2020 19:26:43.217 GMT, x-ms-session-token=0:-1#1632, lsn=1632, x-ms-request-charge=1.0, x-ms-schemaversion=1.10, x-ms-transport-request-id=2, x-ms-number-of-read-regions=0, x-ms-current-write-quorum=1, x-ms-cosmos-quorum-acked-llsn=1632, x-ms-quorum-acked-lsn=1632, x-ms-activity-id=42c69559-3f0c-11eb-9f3c-59b9e4f788a5, x-ms-xp-role=0, x-ms-global-Committed-lsn=-1, x-ms-cosmos-llsn=1632, x-ms-serviceversion= version=2.11.0.0}",
"cosmosDiagnostics": {
"userAgent": "azsdk-java-cosmos/4.10.0 Windows10/10.0 JRE/11.0.6",
"requestLatencyInMs": 392,
"requestStartTimeUTC": "2020-12-15T19:32:25.730376600Z",
"requestEndTimeUTC": "2020-12-15T19:32:26.122406500Z",
"connectionMode": "DIRECT",
"responseStatisticsList": [
{
"storeResult": {
"storePhysicalAddress": null,
"lsn": 1632,
"globalCommittedLsn": -1,
"partitionKeyRangeId": "0",
"isValid": true,
"statusCode": 404,
"subStatusCode": 0,
"isGone": false,
"isNotFound": true,
"isInvalidPartition": false,
"requestCharge": 1,
"itemLSN": -1,
"sessionToken": "-1#1632",
"exception": "["Resource Not Found. Learn more: https://aka.ms/cosmosdb-tsg-not-found"]",
"transportRequestTimeline": [
{
"eventName": "created",
"startTimeUTC": "2020-12-15T19:32:26.025410700Z",
"durationInMicroSec": 1993
},
{
"eventName": "queued",
"startTimeUTC": "2020-12-15T19:32:26.027404500Z",
"durationInMicroSec": 0
},
{
"eventName": "channelAcquisitionStarted",
"startTimeUTC": "2020-12-15T19:32:26.027404500Z",
"durationInMicroSec": 36003
},
{
"eventName": "pipelined",
"startTimeUTC": "2020-12-15T19:32:26.063407800Z",
"durationInMicroSec": 13000
},
{
"eventName": "transitTime",
"startTimeUTC": "2020-12-15T19:32:26.076407900Z",
"durationInMicroSec": 2998
},
{
"eventName": "received",
"startTimeUTC": "2020-12-15T19:32:26.079406600Z",
"durationInMicroSec": 32000
},
{
"eventName": "completed",
"startTimeUTC": "2020-12-15T19:32:26.111407100Z",
"durationInMicroSec": 8000
}
],
"rntbdRequestLengthInBytes": 463,
"rntbdResponseLengthInBytes": 283,
"requestPayloadLengthInBytes": 0,
"responsePayloadLengthInBytes": null,
"channelTaskQueueSize": 1,
"pendingRequestsCount": 1,
"serviceEndpointStatistics": {
"availableChannels": 0,
"acquiredChannels": 0,
"executorTaskQueueSize": 0,
"inflightRequests": 1,
"lastSuccessfulRequestTime": "2020-12-15T19:32:26.026Z",
"lastRequestTime": "2020-12-15T19:32:26.026Z",
"createdTime": "2020-12-15T19:32:26.026406900Z",
"isClosed": false
}
},
"requestResponseTimeUTC": "2020-12-15T19:32:26.122406500Z",
"requestResourceType": "Document",
"requestOperationType": "Read"
}
],
"supplementalResponseStatisticsList": [],
"addressResolutionStatistics": {
"42ded84d-3f0c-11eb-9f3c-59b9e4f788a5": {
"startTimeUTC": "2020-12-15T19:32:25.900406600Z",
"endTimeUTC": "2020-12-15T19:32:25.917410400Z",
"targetEndpoint": "https://127.0.0.1:8081/addresses/?$resolveFor=dbs%2FRuR-AA%3D%3D%2Fcolls%2FRuR-AKEqke8%3D%2Fdocs&$filter=protocol%20eq%20rntbd&$partitionKeyRangeIds=0",
"errorMessage": null,
"inflightRequest": false
}
},
"regionsContacted": [
"https://127.0.0.1:8081/"
],
"retryContext": {
"retryCount": 0,
"statusAndSubStatusCodes": null,
"retryLatency": 0
},
"metadataDiagnosticsContext": {
"metadataDiagnosticList": [
{
"metaDataName": "CONTAINER_LOOK_UP",
"startTimeUTC": "2020-12-15T19:32:25.750382200Z",
"endTimeUTC": "2020-12-15T19:32:25.789377600Z",
"durationinMS": 38
},
{
"metaDataName": "PARTITION_KEY_RANGE_LOOK_UP",
"startTimeUTC": "2020-12-15T19:32:25.857394800Z",
"endTimeUTC": "2020-12-15T19:32:25.894406900Z",
"durationinMS": 37
},
{
"metaDataName": "SERVER_ADDRESS_LOOKUP",
"startTimeUTC": "2020-12-15T19:32:25.900406600Z",
"endTimeUTC": "2020-12-15T19:32:25.917410400Z",
"durationinMS": 17
}
]
},
"serializationDiagnosticsContext": {
"serializationDiagnosticsList": null
},
"gatewayStatistics": null,
"systemInformation": {
"usedMemory": "79083 KB",
"availableMemory": "4074261 KB",
"systemCpuLoad": "(2020-12-15T19:32:22.653618900Z 33.7%), (2020-12-15T19:32:22.990406700Z 33.7%)"
},
"clientCfgs": {
"id": -2,
"numberOfClients": 4,
"connCfg": {
"rntbd": "(cto:PT5S, rto:PT5S, icto:PT0S, ieto:PT1H, mcpe:130, mrpc:30, cer:false)",
"gw": "(cps:1000, rto:PT5S, icto:null, p:false)",
"other": "(ed: true, cs: false)"
},
"consistencyCfg": "(consistency: null, mm: true, prgns: [])"
}
}
}
getMessage()-
{
"innerErrorMessage": "["Resource Not Found. Learn more: https://aka.ms/cosmosdb-tsg-not-found"]",
"cosmosDiagnostics": {
"userAgent": "azsdk-java-cosmos/4.10.0 Windows10/10.0 JRE/11.0.6",
"requestLatencyInMs": 392,
"requestStartTimeUTC": "2020-12-15T19:32:25.730376600Z",
"requestEndTimeUTC": "2020-12-15T19:32:26.122406500Z",
"connectionMode": "DIRECT",
"responseStatisticsList": [
{
"storeResult": {
"storePhysicalAddress": null,
"lsn": 1632,
"globalCommittedLsn": -1,
"partitionKeyRangeId": "0",
"isValid": true,
"statusCode": 404,
"subStatusCode": 0,
"isGone": false,
"isNotFound": true,
"isInvalidPartition": false,
"requestCharge": 1,
"itemLSN": -1,
"sessionToken": "-1#1632",
"exception": "["Resource Not Found. Learn more: https://aka.ms/cosmosdb-tsg-not-found"]",
"transportRequestTimeline": [
{
"eventName": "created",
"startTimeUTC": "2020-12-15T19:32:26.025410700Z",
"durationInMicroSec": 1993
},
{
"eventName": "queued",
"startTimeUTC": "2020-12-15T19:32:26.027404500Z",
"durationInMicroSec": 0
},
{
"eventName": "channelAcquisitionStarted",
"startTimeUTC": "2020-12-15T19:32:26.027404500Z",
"durationInMicroSec": 36003
},
{
"eventName": "pipelined",
"startTimeUTC": "2020-12-15T19:32:26.063407800Z",
"durationInMicroSec": 13000
},
{
"eventName": "transitTime",
"startTimeUTC": "2020-12-15T19:32:26.076407900Z",
"durationInMicroSec": 2998
},
{
"eventName": "received",
"startTimeUTC": "2020-12-15T19:32:26.079406600Z",
"durationInMicroSec": 32000
},
{
"eventName": "completed",
"startTimeUTC": "2020-12-15T19:32:26.111407100Z",
"durationInMicroSec": 8000
}
],
"rntbdRequestLengthInBytes": 463,
"rntbdResponseLengthInBytes": 283,
"requestPayloadLengthInBytes": 0,
"responsePayloadLengthInBytes": null,
"channelTaskQueueSize": 1,
"pendingRequestsCount": 1,
"serviceEndpointStatistics": {
"availableChannels": 0,
"acquiredChannels": 0,
"executorTaskQueueSize": 0,
"inflightRequests": 1,
"lastSuccessfulRequestTime": "2020-12-15T19:32:26.026Z",
"lastRequestTime": "2020-12-15T19:32:26.026Z",
"createdTime": "2020-12-15T19:32:26.026406900Z",
"isClosed": false
}
},
"requestResponseTimeUTC": "2020-12-15T19:32:26.122406500Z",
"requestResourceType": "Document",
"requestOperationType": "Read"
}
],
"supplementalResponseStatisticsList": [],
"addressResolutionStatistics": {
"42ded84d-3f0c-11eb-9f3c-59b9e4f788a5": {
"startTimeUTC": "2020-12-15T19:32:25.900406600Z",
"endTimeUTC": "2020-12-15T19:32:25.917410400Z",
"targetEndpoint": "https://127.0.0.1:8081/addresses/?$resolveFor=dbs%2FRuR-AA%3D%3D%2Fcolls%2FRuR-AKEqke8%3D%2Fdocs&$filter=protocol%20eq%20rntbd&$partitionKeyRangeIds=0",
"errorMessage": null,
"inflightRequest": false
}
},
"regionsContacted": [
"https://127.0.0.1:8081/"
],
"retryContext": {
"retryCount": 0,
"statusAndSubStatusCodes": null,
"retryLatency": 0
},
"metadataDiagnosticsContext": {
"metadataDiagnosticList": [
{
"metaDataName": "CONTAINER_LOOK_UP",
"startTimeUTC": "2020-12-15T19:32:25.750382200Z",
"endTimeUTC": "2020-12-15T19:32:25.789377600Z",
"durationinMS": 38
},
{
"metaDataName": "PARTITION_KEY_RANGE_LOOK_UP",
"startTimeUTC": "2020-12-15T19:32:25.857394800Z",
"endTimeUTC": "2020-12-15T19:32:25.894406900Z",
"durationinMS": 37
},
{
"metaDataName": "SERVER_ADDRESS_LOOKUP",
"startTimeUTC": "2020-12-15T19:32:25.900406600Z",
"endTimeUTC": "2020-12-15T19:32:25.917410400Z",
"durationinMS": 17
}
]
},
"serializationDiagnosticsContext": {
"serializationDiagnosticsList": null
},
"gatewayStatistics": null,
"systemInformation": {
"usedMemory": "79083 KB",
"availableMemory": "4074261 KB",
"systemCpuLoad": "(2020-12-15T19:32:22.653618900Z 33.7%), (2020-12-15T19:32:22.990406700Z 33.7%)"
},
"clientCfgs": {
"id": -2,
"numberOfClients": 4,
"connCfg": {
"rntbd": "(cto:PT5S, rto:PT5S, icto:PT0S, ieto:PT1H, mcpe:130, mrpc:30, cer:false)",
"gw": "(cps:1000, rto:PT5S, icto:null, p:false)",
"other": "(ed: true, cs: false)"
},
"consistencyCfg": "(consistency: null, mm: true, prgns: [])"
}
}
}