-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interface Partitioner is inefficient #75
Comments
dcsobral
added a commit
to RichRelevance/kafka-connect-storage-common
that referenced
this issue
Jul 10, 2018
The issue being fixed is that the way this interface is currently designed leads to Partitioner being effectively un-extensible, unless they don't need any parameters at all except those defined by the connectors using the partitioner. The new design puts the onus of turning properties into a configuration map on the partitioner class. Because those classes use recommenders and also use common storage configuration, the interface was extended with two recommender getters, with a default implementation. TimestampExtractor also gets the same treatment, as it is used from some of the Partitioner implementations. While this does break API, the existing API is difficult to extend to begin with, which is the reason for the change.
dcsobral
added a commit
to RichRelevance/kafka-connect-storage-common
that referenced
this issue
Jul 12, 2018
The issue being fixed is that the way this interface is currently designed leads to Partitioner being effectively un-extensible, unless they don't need any parameters at all except those defined by the connectors using the partitioner. The new design puts the onus of turning properties into a configuration map on the partitioner class. Because those classes use recommenders and also use common storage configuration, the interface was extended with two recommender getters, with a default implementation. TimestampExtractor also gets the same treatment, as it is used from some of the Partitioner implementations. While this does break API, the existing API is difficult to extend to begin with, which is the reason for the change.
dcsobral
added a commit
to RichRelevance/kafka-connect-storage-common
that referenced
this issue
Jul 12, 2018
The issue being fixed is that the way this interface is currently designed leads to Partitioner being effectively un-extensible, unless they don't need any parameters at all except those defined by the connectors using the partitioner. The new design puts the onus of turning properties into a configuration map on the partitioner class. Because those classes use recommenders and also use common storage configuration, the interface was extended with two recommender getters, with a default implementation. TimestampExtractor also gets the same treatment, as it is used from some of the Partitioner implementations. While this does break API, the existing API is difficult to extend to begin with, which is the reason for the change.
dcsobral
added a commit
to RichRelevance/kafka-connect-storage-common
that referenced
this issue
Jul 31, 2018
The issue being fixed is that the way this interface is currently designed leads to Partitioner being effectively un-extensible, unless they don't need any parameters at all except those defined by the connectors using the partitioner. The new design puts the onus of turning properties into a configuration map on the partitioner class. Because those classes use recommenders and also use common storage configuration, the interface was extended with two recommender getters, with a default implementation. TimestampExtractor also gets the same treatment, as it is used from some of the Partitioner implementations. While this does break API, the existing API is difficult to extend to begin with, which is the reason for the change.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The
Partitioner
interface design is inefficient. ThegeneratePartitionedPath
takes a topic, which is immutable per task, plus anencodedPartition
, which is per-record. That leads to things such as confluentinc/kafka-connect-hdfs#224, in whichgeneratePartitionedPath
is ignored and a separate method is called that reproduces the behavior ofDefaultPartitioner
's implementation, minus the partition information.Instead, a base method should take only topic, and
encodedPartition
should take that static result and theSinkRecord
.The text was updated successfully, but these errors were encountered: