Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface Partitioner is inefficient #75

Open
dcsobral opened this issue Jul 6, 2018 · 0 comments
Open

Interface Partitioner is inefficient #75

dcsobral opened this issue Jul 6, 2018 · 0 comments

Comments

@dcsobral
Copy link

dcsobral commented Jul 6, 2018

The Partitioner interface design is inefficient. The generatePartitionedPath takes a topic, which is immutable per task, plus an encodedPartition, which is per-record. That leads to things such as confluentinc/kafka-connect-hdfs#224, in which generatePartitionedPath is ignored and a separate method is called that reproduces the behavior of DefaultPartitioner's implementation, minus the partition information.

Instead, a base method should take only topic, and encodedPartition should take that static result and the SinkRecord.

dcsobral added a commit to RichRelevance/kafka-connect-storage-common that referenced this issue Jul 10, 2018
The issue being fixed is that the way this interface is currently
designed leads to Partitioner being effectively un-extensible, unless
they don't need any parameters at all except those defined by the
connectors using the partitioner.

The new design puts the onus of turning properties into a configuration
map on the partitioner class. Because those classes use recommenders and
also use common storage configuration, the interface was extended with
two recommender getters, with a default implementation.

TimestampExtractor also gets the same treatment, as it is used from some
of the Partitioner implementations.

While this does break API, the existing API is difficult to extend to
begin with, which is the reason for the change.
dcsobral added a commit to RichRelevance/kafka-connect-storage-common that referenced this issue Jul 12, 2018
The issue being fixed is that the way this interface is currently
designed leads to Partitioner being effectively un-extensible, unless
they don't need any parameters at all except those defined by the
connectors using the partitioner.

The new design puts the onus of turning properties into a configuration
map on the partitioner class. Because those classes use recommenders and
also use common storage configuration, the interface was extended with
two recommender getters, with a default implementation.

TimestampExtractor also gets the same treatment, as it is used from some
of the Partitioner implementations.

While this does break API, the existing API is difficult to extend to
begin with, which is the reason for the change.
dcsobral added a commit to RichRelevance/kafka-connect-storage-common that referenced this issue Jul 12, 2018
The issue being fixed is that the way this interface is currently
designed leads to Partitioner being effectively un-extensible, unless
they don't need any parameters at all except those defined by the
connectors using the partitioner.

The new design puts the onus of turning properties into a configuration
map on the partitioner class. Because those classes use recommenders and
also use common storage configuration, the interface was extended with
two recommender getters, with a default implementation.

TimestampExtractor also gets the same treatment, as it is used from some
of the Partitioner implementations.

While this does break API, the existing API is difficult to extend to
begin with, which is the reason for the change.
dcsobral added a commit to RichRelevance/kafka-connect-storage-common that referenced this issue Jul 31, 2018
The issue being fixed is that the way this interface is currently
designed leads to Partitioner being effectively un-extensible, unless
they don't need any parameters at all except those defined by the
connectors using the partitioner.

The new design puts the onus of turning properties into a configuration
map on the partitioner class. Because those classes use recommenders and
also use common storage configuration, the interface was extended with
two recommender getters, with a default implementation.

TimestampExtractor also gets the same treatment, as it is used from some
of the Partitioner implementations.

While this does break API, the existing API is difficult to extend to
begin with, which is the reason for the change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant