Skip to content

Conversation

@swathipil
Copy link
Member

fixes: #20818

@swathipil swathipil requested a review from yunhaoling as a code owner October 1, 2021 08:52
@ghost ghost added the Schema Registry label Oct 1, 2021
@swathipil
Copy link
Member Author

/azp run python - schemaregistry - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

if self._auto_register_schemas
else self._schema_registry_client.get_schema_id
)
self._user_input_schema_cache = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache topic: do we need this cache since we now use lru_cache?
as OOB discussion: this is to store schema -> normalized schema pair.
I think we could either remove this cache first or wrap the logic with a lru_cache decorated method.

Comment on lines +31 to +34
:ivar message: The error message.
:vartype message: str
:ivar error: The error condition, if available.
:vartype error: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why "error" ivar here? since the AzureError already has inner_exception ivar

Comment on lines +37 to +40
def __init__(self, message, **kwargs):
self.message = message
self.error = kwargs.get("error")
super(SchemaParseException, self).__init__(self.message, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could remove the __init__ method completely if there's no reason to have error ivar different than the inner_exception

# --------------------------------------------------------------------------
from azure.core.exceptions import AzureError

class SchemaParseException(AzureError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name sounds very generic.
Do we want this error to be library specific or a general one that could be shared among different serializer libraries in the future -- which means it's to be placed under schema registry client library like azure.schemaregistry.serializer.exceptions.SchemaParseException.

I'm leaning towards to having a common shared error, less types

cached_schema.type
)
)
data_bytes = self._avro_serializer.serialize(value, cached_schema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be safe here, we could abstract the underlying avro serializer library

  1. parse schema
  • _fast_avro_serializer.parse_schema, normal_avro_serializer.parse_schema
  • try:
    _fast_avro_serializer.parse_schema()
    exception Exception:
    raise SchemaParseException()
  1. serialize
    try:
    _fast_avro_serializer.serialize()
    except Exception:
    SerializationException()
  2. deserialize
    DeserializationException

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could make a matrix for comparison:
fastavro avro
missing property: raising Keyerror raise SchemaParseException
wrong format: ...

but I'm afraid it would take much time to traverse all the cases (especially comers cases which are difficult to think about, and may not be worth the efforts)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Schema Registry] add tests in avro serializer to check that types are not leaked

2 participants