
Getting "Can not find message descriptor by type_url" error when calling client.logging_api.write_entries() #945

Open
minherz opened this issue Oct 10, 2024 · 6 comments
Labels: api: logging (Issues related to the googleapis/python-logging API.) · type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@minherz
Contributor

minherz commented Oct 10, 2024

Environment details

Cloud Run v2 running a job as described in the "Import logs from storage to logging" solution architecture.

  • OS type and version: N/A
  • Python version: current version defined in the buildpack
  • pip version: current version defined in the buildpack
  • google-cloud-logging version: 3.5.*

Steps to reproduce

  1. Deploy architecture
  2. Import previously exported files. Because these files are the customer's property, they cannot be shared; I've sent a request to identify the specific object(s).

Code example

Stack trace

{
  "textPayload": "Task #1, failed: Failed to parse serviceData field: Can not find message descriptor by type_url: type.googleapis.com/google.cloud.bigquery.logging.v1.AuditData at LogEntry.protoPayload.serviceData.",
  "insertId": "6707b2f0000123f332576c49",
  "resource": {
    "type": "cloud_run_job",
    "labels": {
      "job_name": "log-reingest",
      "project_id": "oxydincproject",
      "location": "europe-west1"
    }
  },
  "timestamp": "2024-10-10T10:56:48.074739Z",
  "labels": {
    "run.googleapis.com/task_index": "0",
    "instanceId": "007989f2a1edc07a9422045e0489e77a3082305d4721358c83378156601ca21ae45476480ada98bdc376be7f04cbf17182042fba56df74748bf7585f9f7d78e5a359b046d6",
    "run.googleapis.com/task_attempt": "0",
    "run.googleapis.com/execution_name": "log-reingest-8frwp"
  },
  "logName": "projects/oxydincproject/logs/[run.googleapis.com](http://run.googleapis.com/)%2Fstderr",
  "receiveTimestamp": "2024-10-10T10:56:48.077400529Z"
}

The code of the solution can be found at python-docs-samples/logging/import-logs.

See below for the minimal code sample that reproduces the problem.

@minherz added the "type: bug" and "triage me" labels on Oct 10, 2024
@product-auto-label bot added the "api: logging" label on Oct 10, 2024
@minherz
Contributor Author

minherz commented Oct 14, 2024

After a few tests, I managed to create a minimal code sample that demonstrates this behavior. Run the following code to see the error:

import sys
from google.cloud import logging_v2

TEST_ENTRY = {
    "logName": "placeholder",
    "protoPayload": {
        "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
        "authenticationInfo": {
            "principalEmail": "service-org-12345@gcp-sa-scc-notification.iam.gserviceaccount.com"
        },
        "authorizationInfo": [
            {
                "granted": True,
                "permission": "bigquery.tables.update",
                "resource": "projects/someproject/datasets/sccFindings/tables/findings",
                "resourceAttributes": {}
            }
        ],
        "serviceData": {
            '@type': 'type.googleapis.com/google.cloud.bigquery.logging.v1.AuditData',
            'tableUpdateRequest': {
                'resource': {
                    'info': {},
                    'schemaJson': '{}',
                    'tableName': {
                        'datasetId': 'sccFindings',
                        'projectId': 'someproject',
                        'tableId': 'findings'
                    },
                    'updateTime': '2024-08-20T15:01:48.399Z',
                    'view': {}
                }
            }
        },
        "methodName": "google.cloud.bigquery.v2.TableService.PatchTable",
        "requestMetadata": {
            "callerIp": "private",
            "destinationAttributes": {},
            "requestAttributes": {}
        },
        "resourceName": "projects/someproject/datasets/sccFindings/tables/findings",
        "serviceName": "bigquery.googleapis.com",
        "status": {}
    },
    "resource": {
        "labels": {
            "dataset_id": "sccFindings",
            "project_id": "someproject"
        },
        "type": "bigquery_dataset"
    },
    "severity": "NOTICE",
}

def main():
    client = logging_v2.Client()
    TEST_ENTRY["logName"] = f"projects/{client.project}/logs/test_writing_logs"
    # Raises: "Can not find message descriptor by type_url:
    # type.googleapis.com/google.cloud.bigquery.logging.v1.AuditData"
    client.logging_api.write_entries([TEST_ENTRY])

# Start script
if __name__ == "__main__":
    try:
        main()
    except Exception as err:
        print(f"Task failed: {err}")
        sys.exit(1)

@minherz
Contributor Author

minherz commented Oct 14, 2024

If the protobuf payload uses the metadata field instead of the serviceData field, the code writes the log entry successfully; see the sketch below.

It looks like the assumption mentioned in the code comment does not hold for the type.googleapis.com/google.cloud.bigquery.logging.v1.AuditData protobuf type.
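
A minimal sketch of that workaround, inside main() from the sample above (if I read the AuditLog proto correctly, metadata is declared as a google.protobuf.Struct, so its contents parse as plain JSON and no descriptor lookup is needed for the @type key):

# Move the deprecated serviceData object into metadata before writing.
payload = dict(TEST_ENTRY["protoPayload"])
payload["metadata"] = payload.pop("serviceData")
TEST_ENTRY["protoPayload"] = payload
client.logging_api.write_entries([TEST_ENTRY])  # succeeds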

@minherz
Contributor Author

minherz commented Oct 15, 2024

Changing the deprecated serviceData field to metadata resolves the problem. I am unsure how this patch should be applied.

@minherz
Contributor Author

minherz commented Oct 21, 2024

See b/374328640 for additional information.

@gkevinzheng
Contributor

@minherz I've been looking into the failure message, and it looks like it's being raised in google.protobuf.json_format.ParseDict, in this code:

def _CreateMessageFromTypeUrl(type_url, descriptor_pool):
  """Creates a message from a type URL."""
  db = symbol_database.Default()
  pool = db.pool if descriptor_pool is None else descriptor_pool
  type_name = type_url.split('/')[-1]
  try:
    message_descriptor = pool.FindMessageTypeByName(type_name)
  except KeyError as e:
    raise TypeError(
        'Can not find message descriptor by type_url: {0}'.format(type_url)
    ) from e
  message_class = message_factory.GetMessageClass(message_descriptor)
  return message_class()

Based on what our code does, it looks like you can add the following to your code to resolve the issue:

from google.protobuf import symbol_database

symbol_database.Default().RegisterMessage(<message>)

The only open question is what the correct value of <message> should be here: some class that extends google.protobuf.message.Message and has that type URL.
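
For instance (an assumption on my part, not something verified in this thread): if the generated types for that URL are installed, e.g. from the google-cloud-bigquery-logging package, importing them may be enough on its own, because generated _pb2 modules register their file descriptors in the default descriptor pool that ParseDict consults:

from google.protobuf import descriptor_pool

# Hypothetical: requires `pip install google-cloud-bigquery-logging`.
# The import's side effect is registering the AuditData file descriptor
# in the default pool.
import google.cloud.bigquery_logging_v1  # noqa: F401

pool = descriptor_pool.Default()
print(pool.FindMessageTypeByName(
    "google.cloud.bigquery.logging.v1.AuditData").full_name)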

@minherz
Contributor Author

minherz commented Oct 24, 2024

Thank you, @gkevinzheng. The problem is that this particular type is part of the official Google Cloud collection of types.
The application cannot guess how many such types will be required. Moreover, as I explained in the workaround, this type is parsed successfully when the payload uses the non-deprecated field. Please have a look at the bug referenced in my previous comment. As far as I understand, it proposes a more comprehensive solution that does not require registering each Google protobuf type in each application.

I am unsure whether the proposed solution has to be implemented in this client library or whether it belongs in a bundle the library already uses.

For what it's worth, given that the protoPayload payload type is used only for logs generated by Google Cloud services, customers of this library should not be expected to register every Google Cloud protobuf type they plan to use, just in case.
