Option to filter tags by service or resource type in aws_tagging_resource table #2466

thomasklemm wants to merge 1 commit into turbot:main
Conversation
Force-pushed from c7120ff to 8e75ea2
Hi @thomasklemm, I was reviewing the code changes and had a few suggestions for improvement:
Thanks!
Force-pushed from 8e75ea2 to 3c2a4eb
Hi @thomasklemm, just checking in to see if you had a chance to review the suggestions and comments above.
Hi @ParthaI, thanks for the detailed suggestions! I've made them locally; I need to test them in our cluster and will update the PR after.
Hi @thomasklemm, just checking in to see if you’ve had a chance to test it out and push the changes based on your findings?
Force-pushed from 3c2a4eb to 9779b47
Hi @ParthaI, made the changes and adjusted the initial query examples to match the […]. Tried these queries and all looks correct:

```sql
select arn from aws_tagging_resource where resource_types = '["ec2", "rds", "s3:bucket"]' order by arn;
select count(arn) from aws_tagging_resource where resource_types = '["ec2", "rds", "s3:bucket"]';
select arn from aws_tagging_resource where resource_types = '["ec2:instance", "rds:db", "s3:bucket"]' order by arn;
select count(arn) from aws_tagging_resource where resource_types = '["ec2:instance", "rds:db", "s3:bucket"]';
select arn from aws_tagging_resource where resource_types = '[]' order by arn;
select arn from aws_tagging_resource order by arn;
select count(arn) from aws_tagging_resource where resource_types = '[]';
select count(arn) from aws_tagging_resource;
```
Pull Request Overview
This PR introduces a new filter option for the aws_tagging_resource table that allows filtering resources by specific AWS service or resource types using a JSON array of strings.
- Added documentation in aws_tagging_resource.md to explain the new filtering option with examples.
- Modified aws/table_aws_tagging_resource.go to support a new key column "resource_types" and to parse the JSON array of resource types from query qualifiers.
- Enhanced error handling for invalid JSON input in resource_types.
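The parsing and validation step described above can be sketched as below. This is a simplified illustration, not the PR's exact code: `parseResourceTypes` and its error text are assumed names, and the real table code reads the value via the plugin SDK's `GetJsonbValue()` rather than taking a plain string.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// parseResourceTypes decodes the raw JSONB value of the "resource_types"
// qualifier, e.g. `["ec2:instance", "s3:bucket"]`, into a slice of strings.
// Any input that is not a JSON array of strings yields an error so the user
// can correct their query.
func parseResourceTypes(raw string) ([]string, error) {
	var types []string
	if err := json.Unmarshal([]byte(raw), &types); err != nil {
		return nil, errors.New("failed to parse 'resource_types' qualifier: value must be a JSON array of strings")
	}
	return types, nil
}

func main() {
	types, err := parseResourceTypes(`["ec2", "rds:db", "s3:bucket"]`)
	fmt.Println(types, err)
}
```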
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/tables/aws_tagging_resource.md | Updated documentation to describe resource type filters. |
| aws/table_aws_tagging_resource.go | Added key column support, JSON parsing, and error handling for resource_types filter. |
Comments suppressed due to low confidence (1)
aws/table_aws_tagging_resource.go:127
- [nitpick] Consider renaming 'resource_types' to 'rawResourceTypes' to better distinguish it from the parsed slice 'resourceTypes' and to align with Go naming conventions.
resource_types := d.EqualsQuals["resource_types"].GetJsonbValue()
@ParthaI There are two length constraints mentioned in the API docs: 100 array items, and 256 characters in total for the string that gets sent to the API. Wondering if we should handle that here and raise an error to the user? From what I can see, AWS just starts to ignore resource types after the character limit (but I need to verify this better).
Hello @thomasklemm, thank you for sharing the detailed information.

Approach 1: We can pass the […]

Approach 2: Alternatively, if the […]

In my opinion, I’d prefer Approach 1, as it aligns with the default behavior of the AWS CLI. Please let me know your thoughts. Thanks!
@ParthaI I think the API is not returning an error in either case (array > 100 entries, complete string > 256 characters) based on what I observed earlier; it just silently drops the additional items. The string limit is actually quite easy to hit if you're querying for more than 20-25 services at the same time; with complete resource types, much less is actually usable. I think it might make sense to raise an error in the code to allow the user to adjust their query, or even better to do the chunking you describe in Approach 2. Is there another place in the AWS plugin where this strategy is being used?

Another thing I just noticed: I think the caching isn't working in the case where […]
We have implemented a similar pattern (though not exactly the same) in the […]

Since we are using […]
Hello @thomasklemm, did you get a chance to take a look at the above comment? |
Force-pushed from 9779b47 to 9fdcdbf
Pull Request Overview
This PR introduces filtering functionality by resource types for the aws_tagging_resource table. Key changes include:
- Updating documentation to describe the new JSON array filter for resource types.
- Adding a "resource_types" column and KeyColumns configuration to support filtering.
- Implementing JSON parsing, batching of resource type qualifiers, and deduplication of results based on ARN.
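The batching step listed above can be sketched as follows. `maxTypesPerCall` and `chunkResourceTypes` are illustrative names for this sketch; the actual implementation operates on the parsed qualifier values before calling the Resource Groups Tagging API.

```go
package main

import "fmt"

// The GetResources API accepts at most 100 resource type filters per call,
// so larger filter lists must be split across requests.
const maxTypesPerCall = 100

// chunkResourceTypes splits the requested resource types into batches of at
// most `size` entries so each API call stays within the limit.
func chunkResourceTypes(types []string, size int) [][]string {
	var batches [][]string
	for len(types) > 0 {
		n := size
		if len(types) < n {
			n = len(types)
		}
		batches = append(batches, types[:n])
		types = types[n:]
	}
	return batches
}

func main() {
	// 250 requested types split into batches of 100, 100, and 50.
	batches := chunkResourceTypes(make([]string, 250), maxTypesPerCall)
	fmt.Println(len(batches))
}
```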
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| docs/tables/aws_tagging_resource.md | Added documentation for filtering resources by resource types. |
| aws/table_aws_tagging_resource.go | Implemented parsing, batching, and deduplication for the new filter. |
Comments suppressed due to low confidence (2)
aws/table_aws_tagging_resource.go:122
- [nitpick] Consider renaming 'resource_types' to 'resourceTypesJSON' to follow Go's camelCase naming conventions and to clearly differentiate the qualifier value from other variables.
resource_types := d.EqualsQuals["resource_types"].GetJsonbValue()
aws/table_aws_tagging_resource.go:222
- [nitpick] Consider renaming 'currentItems' to 'batchCount' to improve clarity and align with idiomatic Go naming practices.
currentItems := 0
@ParthaI I have now confirmed that the 256 character limit mentioned in the API docs doesn't exist in reality; not sure why it made it into the docs. However, the 100 items limit does exist, so I adjusted the implementation to do automatic batching if more than 100 services/resource types get provided to the resource types filter. Locally it's working very well, but I'd like to confirm it in our production environment too, with access to larger AWS organizations, to see if there are any issues. Without the batching, this error would get returned: […]
Thanks, @thomasklemm, for diving deeper into this. The implementation looks good. Please let me know once you've pushed the changes to this PR, and I’ll do a final review. Thanks again for all your efforts!
Hello @thomasklemm, just checking: did you get a chance to push your latest changes to this PR?
Force-pushed from ae2f94e to e1a1f5e
Force-pushed from e1a1f5e to a3d7ff5
Add support for filtering tagged resources by service or resource type through a new `resource_types` column that accepts a JSON array of filter strings.

- Add `resource_types` column to filter resources by type (e.g., `["ec2:instance", "s3:bucket"]`)
- Implement automatic batching to handle the AWS API's 100-item limit for resource type filters
- Add comprehensive documentation with examples for common resource type queries
- The column accepts service-wide filters (e.g., `"lambda"`) or specific resource types (e.g., `"lambda:function"`)
- Large filter lists are automatically split into multiple API requests (100 items per batch)
- Results are deduplicated by ARN to prevent duplicate entries across batches
- Full backward compatibility maintained: existing queries work unchanged

Implementation details:

- Parse resource type filters from query qualifiers using `GetJsonbValue()`
- Split filters into batches respecting the 100-item API limit
- Process each batch with separate API calls
- Track seen resources by ARN to avoid duplicates
- Stream results as they're received for optimal performance

This feature enables more targeted queries for large AWS environments, reducing API calls and improving query performance when working with specific resource types.

Co-authored-by: ParthaI <parthai@turbot.com>
Force-pushed from a3d7ff5 to 6b05500
Pull Request Overview
This PR introduces support for filtering the aws_tagging_resource table by resource types, allowing users to limit query results based on specific AWS service or resource type identifiers.
- Updated documentation with examples demonstrating the valid JSON array syntax for resource type filters.
- Implemented new functions for parsing filters, batching resource type parameters to comply with the API limit, and handling deduplication across multiple API calls.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/tables/aws_tagging_resource.md | Adds detailed examples and usage instructions for filtering resource types in the documentation. |
| aws/table_aws_tagging_resource.go | Introduces parsing, batching, fetching, and deduplication logic for resource type filters in queries. |
Comments suppressed due to low confidence (1)
aws/table_aws_tagging_resource.go:155
- [nitpick] Consider updating the error message to match documented examples (e.g. using 'rds:db' if that is the expected format) for consistency between the error feedback and the documentation.
return nil, errors.New("failed to parse 'resource_types' qualifier: value must be a JSON array of strings, e.g. [\"ec2:instance\", \"s3:bucket\", \"rds\"]")
```go
Name:       "resource_types",
Require:    plugin.Optional,
Operators:  []string{"="},
CacheMatch: query_cache.CacheMatchExact,
```
@ParthaI As far as I can tell, there's no caching being applied for this query; the query times are almost the same when you rerun a query, so it behaves very differently than when another table returns cached data. Any idea how to debug this further? Or should I just remove this line then?
Hi @ParthaI, I've had a chance to test this in production against AWS organizations of different sizes and have been getting good results for smaller organizations (< 100k tags returned). However, I'm stuck on a case where a small number of duplicates gets returned. Not sure yet why; it could be that the AWS API is returning the same resource from multiple regions, which the code here doesn't catch, since it only deduplicates when the same ARN gets returned across multiple requests in the same batch. Hooking into the underlying matrix of requests against different regions and deduplicating across them looks quite a bit more involved; I have the feeling that it doesn't make sense to go this way, and rather to recommend to the user to put a […]
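The per-scan ARN deduplication being discussed can be sketched as below. `dedupeByARN` is a hypothetical helper working on plain ARN strings rather than the plugin's row structs, and, as noted in the thread, this first-occurrence-wins approach does not catch the same ARN arriving from separate per-region scans.

```go
package main

import "fmt"

// dedupeByARN keeps only the first occurrence of each ARN seen within one
// scan, so resources returned by multiple batches are streamed once.
func dedupeByARN(arns []string) []string {
	seen := make(map[string]struct{}, len(arns))
	out := make([]string, 0, len(arns))
	for _, arn := range arns {
		if _, ok := seen[arn]; ok {
			continue // already streamed this resource in an earlier batch
		}
		seen[arn] = struct{}{}
		out = append(out, arn)
	}
	return out
}

func main() {
	fmt.Println(dedupeByARN([]string{"arn:a", "arn:b", "arn:a", "arn:c"}))
}
```

A cross-region duplicate would need the `seen` map to be shared across the per-region scans, which is why deduplicating in SQL on the user's side is the simpler fallback.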
Thanks, @thomasklemm, for the detailed information about your testing. I'll take a look at the PR!
Hello @thomasklemm, I’ve reviewed the code changes, and everything looks great. Thank you for the clean implementation! 🙌

I’ve been testing the changes from your PR branch, and so far I haven’t encountered any duplicate rows. As you mentioned, this could be due to the relatively small number of resources (fewer than 500) in my environment. I'd like to investigate this further. Could you please help me with the following details? […]
Appreciate your thorough testing and detailed observations. Thank you again! Looking forward to your response so I can dive deeper into this. 🙏
Hi @thomasklemm, have you had a chance to review my above comment?
Hi @thomasklemm, thank you for your contribution and for taking the time to open this PR! Since we haven’t heard back in a while, we’ll go ahead and close this for now to keep things tidy. If you’d like to revisit this work in the future, feel free to reopen the PR or open a new one; we’d be happy to review it when you’re ready. We really appreciate your effort and hope to collaborate again soon! Thanks 👍!!

Adds the option to filter the `aws_tagging_resource` table by resource types, e.g. `ec2:instance`, `s3:bucket`, `auditmanager`, for limiting the response to only Amazon EC2 instances, Amazon S3 buckets, or any AWS Audit Manager resource.

Integration test logs

Logs

Example queries