
Limit Exceeded Error After Upgrading KPL to 0.15.11 Due to New Metrics Publishing #601

Open
vinitm798 opened this issue Oct 17, 2024 · 3 comments


@vinitm798

After upgrading the Kinesis Producer Library (KPL) from version 0.13.1 to 0.15.11 (also tested on 0.14.12 and 0.14.5), we started encountering a Limit Exceeded error. Upon investigation, we identified that the issue is related to new KPL metrics published by the newer versions, specifically:

  • UserRecordsPerPutRecordsRequest
  • KinesisRecordsPerPutRecordsRequest

(Screenshots of the two new metrics in CloudWatch were attached to the original issue.)

These metrics were not published in KPL version 0.13.1, and their presence has caused throttling and quota-limit issues.

Steps to Reproduce:

  • Upgrade the KPL from 0.13.1 to 0.15.11.
  • Observe that the new KPL metrics UserRecordsPerPutRecordsRequest and KinesisRecordsPerPutRecordsRequest are now published.
  • Experience throttling and limit-exceeded errors.
  • Revert to KPL 0.13.1 and observe that these metrics are no longer published and the issue no longer occurs.

Expected Behavior:
The application should function without hitting throttling or quota limits after upgrading the KPL API.

Actual Behavior:
New KPL metrics introduced in versions after 0.13.1 cause the application to exceed limits and experience throttling, leading to a Limit Exceeded error.

AWS Support Response: This issue has been confirmed as a known bug by AWS Support. They are currently working on a fix, but no release timeline has been provided.

Workaround: Reverting to KPL 0.13.1 stops these metrics from being published and mitigates the issue.

Additional Context: We have raised a ticket with AWS and are waiting for the release of a fix for this known issue.

The following error was encountered:

[2024-10-16 11:08:23.511755] [0x000022eb][0x00007f7088388700] [error] [metrics_manager.cc:145] Metrics upload failed. | Code: Throttling | Message: Rate exceeded | Request was: Action=PutMetricData&Namespace=KinesisProducerLibrary&MetricData.member.1.MetricName=AllErrors&MetricData.member.1.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.1.StatisticValues.SampleCount=150&MetricData.member.1.StatisticValues.Sum=0&MetricData.member.1.StatisticValues.Minimum=0&MetricData.member.1.StatisticValues.Maximum=0&MetricData.member.1.Unit=Count&MetricData.member.2.MetricName=BufferingTime&MetricData.member.2.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.2.StatisticValues.SampleCount=150&MetricData.member.2.StatisticValues.Sum=9996&MetricData.member.2.StatisticValues.Minimum=1&MetricData.member.2.StatisticValues.Maximum=101&MetricData.member.2.Unit=Milliseconds&MetricData.member.3.MetricName=KinesisRecordsDataPut&MetricData.member.3.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.3.StatisticValues.SampleCount=150&MetricData.member.3.StatisticValues.Sum=280384&MetricData.member.3.StatisticValues.Minimum=567&MetricData.member.3.StatisticValues.Maximum=7251&MetricData.member.3.Unit=Bytes&MetricData.member.4.MetricName=KinesisRecordsPerPutRecordsRequest&MetricData.member.4.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.4.StatisticValues.SampleCount=27&MetricData.member.4.StatisticValues.Sum=150&MetricData.member.4.StatisticValues.Minimum=1&MetricData.member.4.StatisticValues.Maximum=15&MetricData.member.4.Unit=Count&MetricData.member.5.MetricName=KinesisRecordsPut&MetricData.member.5.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.5.StatisticValues.SampleCount=150&MetricData.member.5.StatisticValues.Sum=150&MetricData.member.5.StatisticValues.Minimum=1&MetricData.member.5.StatisticValues.Maximum=1&MetricData.member.5.Unit=Count&MetricData.member.6.MetricName=RequestTime&MetricData.member.6.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.6.StatisticValues.SampleCount=27&MetricData.member.6.StatisticValues.Sum=516&MetricData.member.6.StatisticValues.Minimum=10&MetricData.member.6.StatisticValues.Maximum=62&MetricData.member.6.Unit=Milliseconds&MetricData.member.7.MetricName=RetriesPerRecord&MetricData.member.7.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.7.StatisticValues.SampleCount=150&MetricData.member.7.StatisticValues.Sum=0&MetricData.member.7.StatisticValues.Minimum=0&MetricData.member.7.StatisticValues.Maximum=0&MetricData.member.7.Unit=Count&MetricData.member.8.MetricName=UserRecordsDataPut&MetricData.member.8.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.8.StatisticValues.SampleCount=150&MetricData.member.8.StatisticValues.Sum=280384&MetricData.member.8.StatisticValues.Minimum=567&MetricData.member.8.StatisticValues.Maximum=7251&MetricData.member.8.Unit=Bytes&MetricData.member.9.MetricName=UserRecordsPending&MetricData.member.9.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.9.StatisticValues.SampleCount=296&MetricData.member.9.StatisticValues.Sum=37&MetricData.member.9.StatisticValues.Minimum=0&MetricData.member.9.StatisticValues.Maximum=8&MetricData.member.9.Unit=Count&MetricData.member.10.MetricName=UserRecordsPerKinesisRecord&MetricData.member.10.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.10.StatisticValues.SampleCount=150&MetricData.member.10.StatisticValues.Sum=150&MetricData.member.10.StatisticValues.Minimum=1&MetricData.member.10.StatisticValues.Maximum=1&MetricData.member.10.Unit=Count&MetricData.member.11.MetricName=UserRecordsP
erPutRecordsRequest&MetricData.member.11.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.11.StatisticValues.SampleCount=27&MetricData.member.11.StatisticValues.Sum=150&MetricData.member.11.StatisticValues.Minimum=1&MetricData.member.11.StatisticValues.Maximum=15&MetricData.member.11.Unit=Count&MetricData.member.12.MetricName=UserRecordsPut&MetricData.member.12.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.12.StatisticValues.SampleCount=150&MetricData.member.12.StatisticValues.Sum=150&MetricData.member.12.StatisticValues.Minimum=1&MetricData.member.12.StatisticValues.Maximum=1&MetricData.member.12.Unit=Count&MetricData.member.13.MetricName=UserRecordsReceived&MetricData.member.13.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.13.StatisticValues.SampleCount=150&MetricData.member.13.StatisticValues.Sum=150&MetricData.member.13.StatisticValues.Minimum=1&MetricData.member.13.StatisticValues.Maximum=1&MetricData.member.13.Unit=Count&MetricData.member.14.MetricName=AllErrors&MetricData.member.14.Dimensions.member.1.Name=StreamName&MetricData.member.14.Dimensions.member.1.Value=push-messages-wnssls-iot1-shared&MetricData.member.14.Timestamp=2024-10-16T11%3A08%3A23Z&MetricData.member.14.StatisticValues.SampleCount=150&MetricData.member.14.StatisticValues.Sum=0&MetricData.member.14.StatisticValues.Minimum=0&MetricData.member.14.StatisticValues.Maximum=0&MetricData.member.14.Unit=Count&Version=2010-08-01

@buddhike

Hi @vinitm798, thanks for reporting this issue. Have you tried changing the metrics granularity as suggested in this document?
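
For anyone else hitting this, a minimal sketch of what that configuration change looks like with the Java KPL is below. It is illustrative only: the region value is a placeholder, and the granularity/level values shown ("global", "summary") are simply the coarser documented options, not a confirmed fix for this bug.

```java
import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.KinesisProducerConfiguration;

public class ReducedMetricsProducer {
    public static void main(String[] args) {
        KinesisProducerConfiguration config = new KinesisProducerConfiguration()
                .setRegion("us-east-1")            // placeholder region
                // "global" drops the per-stream (and per-shard) dimensions, so each
                // PutMetricData call carries far fewer distinct metric datums
                .setMetricsGranularity("global")
                // "summary" publishes only the aggregate metrics; "none" turns off
                // CloudWatch publishing from the KPL entirely
                .setMetricsLevel("summary");

        KinesisProducer producer = new KinesisProducer(config);
        // ... producer.addUserRecord(...) calls as usual ...
        producer.flushSync();
        producer.destroy();
    }
}
```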

@vinitm798
Author

vinitm798 commented Jan 14, 2025

Hi @buddhike,
yes, I tried reducing the metrics granularity, but we can still see these errors.

@buddhike

Thanks for confirming that, @vinitm798 🙏🏾. I believe PutMetricData is an account-level quota. You can check your utilisation by going to Service Quotas > AWS Services > Amazon CloudWatch > Rate of PutMetricData Requests. If the utilisation graph shows that it's over-utilised, you should be able to request a quota increase. Have you already tried this option?
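
If it helps, here is a rough sketch of checking that quota programmatically with the AWS SDK for Java v2 Service Quotas client, as an alternative to the console path above. Treat it as an assumption-laden example: the "monitoring" service code for CloudWatch and the quota-name match are things to verify against your own account.

```java
import software.amazon.awssdk.services.servicequotas.ServiceQuotasClient;
import software.amazon.awssdk.services.servicequotas.model.ListServiceQuotasRequest;
import software.amazon.awssdk.services.servicequotas.model.ServiceQuota;

public class CheckPutMetricDataQuota {
    public static void main(String[] args) {
        // Service Quotas client from the AWS SDK for Java v2 (servicequotas module)
        try (ServiceQuotasClient quotas = ServiceQuotasClient.create()) {
            ListServiceQuotasRequest request = ListServiceQuotasRequest.builder()
                    .serviceCode("monitoring")   // assumed service code for Amazon CloudWatch
                    .build();

            // Print any CloudWatch quota whose name mentions PutMetricData
            for (ServiceQuota quota : quotas.listServiceQuotas(request).quotas()) {
                if (quota.quotaName().contains("PutMetricData")) {
                    System.out.printf("%s (%s): %.0f%n",
                            quota.quotaName(), quota.quotaCode(), quota.value());
                }
            }
        }
    }
}
```

Once the quota code is known, an increase can be requested from the Service Quotas console or via the RequestServiceQuotaIncrease API.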
