Add table aws_network_firewall_log #114

Open
wants to merge 8 commits into
base: main

Priyanka-Chatterjee-2000
Contributor

Example query results

Results
Add example SQL query results here (please include the input queries as well)

@Priyanka-Chatterjee-2000 added the "review" label (triggers a workflow to review and validate queries) on Mar 30, 2025

SQL Query Evaluation Results for aws_network_firewall_log

Daily Network Firewall Activity Trends ❌

Query

Daily Network Firewall Activity Trends

Count Network Firewall log entries per day to identify network activity trends. This helps monitor overall firewall activity and detect unusual spikes in traffic.

select
  strftime(event_timestamp, '%Y-%m-%d') as traffic_date,
  count(*) as log_count
from
  aws_network_firewall_log
group by
  traffic_date
order by
  traffic_date asc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema: Use event.timestamp instead of event_timestamp
STRUCT type columns use dot notation: Use event.timestamp to access the nested field
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
Space before and after each -> and ->> N/A
SQL query syntax uses valid DuckDB syntax: Replace strftime with date_trunc('day', event.timestamp)
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc
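
The DuckDB-syntax suggestion above (date_trunc in place of strftime) could be applied roughly as follows. This is only a sketch: it assumes event_timestamp is a top-level TIMESTAMP column, as the submitted query treats it. The suggested event.timestamp path would apply only if event were a STRUCT with a timestamp field, which nothing in this pull request confirms.

```sql
-- Sketch only: bucket log entries by day with date_trunc instead of strftime.
-- Assumes event_timestamp is a top-level TIMESTAMP column (unconfirmed).
select
  date_trunc('day', event_timestamp) as traffic_date,
  count(*) as log_count
from
  aws_network_firewall_log
group by
  traffic_date
order by
  traffic_date asc;
```

Unlike strftime, which produces a formatted string, date_trunc keeps a temporal type, so the result sorts chronologically without relying on the string format.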

Top 10 Source IPs Generating Traffic ❌

Query

Top 10 Source IPs Generating Traffic

Identify the top 10 source IP addresses that generated the most network traffic. This helps detect potential high-traffic sources that may indicate misconfigured applications or suspicious activity.

select
  tp_source_ip,
  count(*) as log_count
from
  aws_network_firewall_log
where
  tp_source_ip is not null
group by
  tp_source_ip
order by
  log_count desc
limit 10;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema: Replace tp_source_ip with event.src_ip
STRUCT type columns use dot notation: Use event.src_ip instead of tp_source_ip
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
Space before and after each -> and ->> N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc
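
One way to act on the schema suggestion above is to read the source IP from the event column instead of tp_source_ip. The sketch below assumes event is a JSON column, as the other queries in this pull request treat it, and therefore uses the ->> operator wrapped in parentheses; if event is actually a STRUCT, the suggested event.src_ip dot notation would be used instead.

```sql
-- Sketch only: top 10 source IPs, reading src_ip from the event column.
-- Assumes event is JSON (as in the other queries here), not a STRUCT.
select
  (event ->> 'src_ip') as source_ip,
  count(*) as log_count
from
  aws_network_firewall_log
where
  (event ->> 'src_ip') is not null
group by
  source_ip
order by
  log_count desc
limit 10;
```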

Traffic Distribution by Protocol ❌

Query

Traffic Distribution by Protocol

Analyze traffic distribution by protocol to understand what types of network protocols are most frequently used in your environment.

Identify Traffic from a Suspicious IP ✅

Query

Identify Traffic from a Suspicious IP

Check if a specific IP is sending or receiving traffic. This is useful for investigating potential threats or monitoring known suspicious IPs.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'src_port' as source_port,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'dest_port' as destination_port,
  event ->> 'proto' as protocol,
  firewall_name
from
  aws_network_firewall_log
where
  event ->> 'src_ip' = '192.0.2.100'
  or event ->> 'dest_ip' = '192.0.2.100'
order by
  event_timestamp desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON operators in parentheses, e.g., (event ->> 'src_ip')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column: Replace event_timestamp with tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist: Include availability_zone column
Non-aggregated queries should place columns related to where the resources exist last: Move firewall_name and availability_zone to the end of the SELECT statement
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Detect Potentially Suspicious DNS Traffic ❌

Query

Detect Potentially Suspicious DNS Traffic

Identify DNS traffic that might indicate command and control (C2) communications or data exfiltration attempts over DNS.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'dest_port' as destination_port,
  event ->> 'proto' as protocol,
  firewall_name
from
  aws_network_firewall_log
where
  event ->> 'app_proto' = 'dns'
  and event ->> 'dest_port' = '53'
  and event ->> 'proto' = 'UDP'
order by
  event_timestamp desc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON access operations in parentheses, e.g., (event ->> 'src_ip')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column: Replace event_timestamp with tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist: Include availability_zone in the SELECT statement
Non-aggregated queries should place columns related to where the resources exist last: Move availability_zone to the end of the SELECT statement
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause: Remove 'proto' from SELECT as it's fixed to 'UDP' in the WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A
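
Taken together, the suggestions in this evaluation point to a query along the following lines. This is a sketch under the assumption that tp_timestamp and availability_zone exist as top-level columns (the checklist and the other queries in this pull request refer to them); it applies only the changes the review asks for.

```sql
-- Sketch only: DNS query with the review suggestions applied.
-- JSON extractions are parenthesized, tp_timestamp leads the select list,
-- the fixed-value proto column is dropped, and location columns come last.
select
  tp_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'dest_port') as destination_port,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'app_proto') = 'dns'
  and (event ->> 'dest_port') = '53'
  and (event ->> 'proto') = 'UDP'
order by
  tp_timestamp desc;
```

The parentheses matter in comparisons: DuckDB documents that the arrow operators have low precedence, so an unparenthesized extraction followed by = may not group the way it reads.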

Monitor Network Firewall Alerts ✅

Query

Monitor Network Firewall Alerts

Identify and analyze alert events generated by the Network Firewall. This helps monitor security rules and detect potential threats.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'alert' ->> 'action' as alert_action,
  event ->> 'alert' ->> 'signature' as alert_signature,
  event ->> 'alert' ->> 'severity' as alert_severity,
  event ->> 'alert' ->> 'category' as alert_category,
  firewall_name
from
  aws_network_firewall_log
where
  event ->> 'alert' is not null
order by
  event_timestamp desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON access operations in parentheses, e.g., (event ->> 'src_ip')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Monitor Traffic Through Specific Firewall ✅

Query

Monitor Traffic Through Specific Firewall

Retrieve network traffic flows through a specific Network Firewall. This helps analyze firewall behavior and troubleshoot connectivity issues.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'src_port' as source_port,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'dest_port' as destination_port,
  event ->> 'proto' as protocol,
  event ->> 'app_proto' as app_protocol,
  availability_zone
from
  aws_network_firewall_log
where
  firewall_name = 'main-firewall'
order by
  event_timestamp desc
limit 100;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON column accesses in parentheses, e.g., (event ->> 'src_ip')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query: Consider updating the title to reflect the 100 limit, e.g., "Monitor Last 100 Traffic Flows Through Specific Firewall"
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Netflow Traffic Analysis ❌

Query

Netflow Traffic Analysis

Analyze netflow data to understand traffic patterns, including bytes transferred and packet counts.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'proto' as protocol,
  event ->> 'netflow' ->> 'bytes' as bytes,
  event ->> 'netflow' ->> 'packets' as packets,
  event ->> 'netflow' ->> 'start_time' as start_time,
  event ->> 'netflow' ->> 'end_time' as end_time,
  firewall_name
from
  aws_network_firewall_log
where
  event ->> 'event_type' = 'netflow'
  and event ->> 'netflow' is not null
order by
  event_timestamp desc
limit 100;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON access operations in parentheses, e.g., (event ->> 'src_ip')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ❌
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query: Update title to reflect the 100 limit, e.g., "Top 100 Netflow Traffic Analysis"
Description explains what the query does
Description explains why a user would run the query: Add a sentence explaining the benefits of analyzing netflow data
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column: Replace event_timestamp with tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist: Include availability_zone in the SELECT statement
Non-aggregated queries should place columns related to where the resources exist last: Move firewall_name and availability_zone to the end of the SELECT statement
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A
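
A possible revision of the netflow query that follows the syntax and column-selection suggestions above is sketched here. It assumes tp_timestamp and availability_zone are available as top-level columns. One change goes beyond the review and is purely an illustrative choice: the intermediate step uses -> (which keeps the value as JSON) before the final ->> (which returns text), instead of chaining ->> twice as the submitted query does.

```sql
-- Sketch only: netflow query with parenthesized JSON access,
-- tp_timestamp first, and location columns last.
-- The -> then ->> chain is a substitution made for this sketch.
select
  tp_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'proto') as protocol,
  (event -> 'netflow' ->> 'bytes') as bytes,
  (event -> 'netflow' ->> 'packets') as packets,
  (event -> 'netflow' ->> 'start_time') as start_time,
  (event -> 'netflow' ->> 'end_time') as end_time,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'event_type') = 'netflow'
  and (event -> 'netflow') is not null
order by
  tp_timestamp desc
limit 100;
```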

TLS Connection Analysis ❌

Query

TLS Connection Analysis

Monitor TLS connections and any related errors to identify potential SSL/TLS issues or suspicious encrypted traffic.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'sni' as server_name_indication,
  event ->> 'tls_inspected' as tls_inspected,
  event ->> 'tls_error' ->> 'error_message' as tls_error,
  firewall_name
from
  aws_network_firewall_log
where
  event ->> 'app_proto' = 'tls'
order by
  event_timestamp desc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON access operations in parentheses, e.g., (event ->> 'src_ip')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column: Replace event_timestamp with tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist: Include availability_zone in the SELECT statement
Non-aggregated queries should place columns related to where the resources exist last: Move firewall_name and availability_zone to the end of the SELECT statement
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Unusually Large Data Transfers ✅

Query

Unusually Large Data Transfers

Identify unusually large data transfers based on bytes transferred. This helps detect potential data exfiltration or abnormal network usage.

select
  event_timestamp,
  event ->> 'src_ip' as source_ip,
  event ->> 'dest_ip' as destination_ip,
  event ->> 'netflow' ->> 'bytes' as bytes,
  event ->> 'netflow' ->> 'packets' as packets,
  event ->> 'proto' as protocol,
  firewall_name
from
  aws_network_firewall_log
where
  (event ->> 'netflow' ->> 'bytes')::bigint > 1000000 -- 1MB
order by
  bytes desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc
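
One point the checklist above does not cover: the ->> operator returns text, so the bytes alias in the submitted query is a string, and order by bytes desc would most likely compare values lexicographically (for example, '999' sorting above '1000000'). A hedged sketch of one way to address this is to cast in the select list as well, so the sort is numeric. Everything else is left as submitted.

```sql
-- Sketch only: cast the extracted values so the sort is numeric.
-- The extraction chain and threshold are unchanged from the submitted query.
select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'netflow' ->> 'bytes')::bigint as bytes,
  (event ->> 'netflow' ->> 'packets')::bigint as packets,
  (event ->> 'proto') as protocol,
  firewall_name
from
  aws_network_firewall_log
where
  (event ->> 'netflow' ->> 'bytes')::bigint > 1000000 -- 1 MB (decimal)
order by
  bytes desc;
```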

High-Volume Network Traffic by Source IP ✅

Query

High-Volume Network Traffic by Source IP

Find network sources generating a high number of connections, which helps detect possible denial-of-service (DoS) attacks or heavy application usage.

select
  event ->> 'src_ip' as source_ip,
  count(*) as connection_count,
  date_trunc('hour', event_timestamp) as traffic_hour
from
  aws_network_firewall_log
where
  event ->> 'src_ip' is not null
group by
  source_ip,
  traffic_hour
having
  count(*) > 50
order by
  connection_count desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis N/A
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Destination Port Usage Analysis ✅

Query

Destination Port Usage Analysis

Analyze traffic by destination port to understand which services are most frequently accessed across your network.

select
  event ->> 'dest_port' as destination_port,
  event ->> 'app_proto' as app_protocol,
  count(*) as connection_count
from
  aws_network_firewall_log
where
  event ->> 'dest_port' is not null
group by
  destination_port,
  app_protocol
order by
  connection_count desc
limit 20;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis: Wrap JSON access in parentheses: (event ->> 'dest_port') and (event ->> 'app_proto')
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query: Consider updating the title to "Top 20 Destination Port Usage Analysis"
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

…rove readability by adding parentheses around JSON extraction.
@Priyanka-Chatterjee-2000 added and removed the "review" label (triggers a workflow to review and validate queries) on Mar 31, 2025
@Priyanka-Chatterjee-2000 marked this pull request as ready for review on March 31, 2025 09:47

SQL Query Evaluation Results for aws_network_firewall_log

Daily Network Firewall Activity Trends ✅

Query

Daily Network Firewall Activity Trends

Count Network Firewall log entries per day to identify network activity trends. This helps monitor overall firewall activity and detect unusual spikes in traffic.

select
  strftime(event_timestamp, '%Y-%m-%d') as traffic_date,
  count(*) as log_count
from
  aws_network_firewall_log
group by
  traffic_date
order by
  traffic_date asc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
Space before and after each -> and ->> N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Top 10 Source IPs Generating Traffic ❌

Query

Top 10 Source IPs Generating Traffic

Identify the top 10 source IP addresses that generated the most network traffic. This helps detect potential high-traffic sources that may indicate misconfigured applications or suspicious activity.

select
  tp_source_ip,
  count(*) as log_count
from
  aws_network_firewall_log
where
  tp_source_ip is not null
group by
  tp_source_ip
order by
  log_count desc
limit 10;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema: Replace tp_source_ip with event.src_ip
STRUCT type columns use dot notation: Use event.src_ip instead of tp_source_ip
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
Space before and after each -> and ->> N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Traffic Distribution by Protocol ✅

Query

Traffic Distribution by Protocol

Analyze traffic distribution by protocol to understand what types of network protocols are most frequently used in your environment.

select
  (event ->> 'proto') as protocol,
  count(*) as traffic_count
from
  aws_network_firewall_log
group by
  protocol
order by
  traffic_count desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Identify Traffic from a Suspicious IP ✅

Query

Identify Traffic from a Suspicious IP

Check if a specific IP is sending or receiving traffic. This is useful for investigating potential threats or monitoring known suspicious IPs.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'src_port') as source_port,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'dest_port') as destination_port,
  (event ->> 'proto') as protocol,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'src_ip') = '192.0.2.100'
  or (event ->> 'dest_ip') = '192.0.2.100'
order by
  event_timestamp desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Detect Potentially Suspicious DNS Traffic ✅

Query

Detect Potentially Suspicious DNS Traffic

Identify DNS traffic that might indicate command and control (C2) communications or data exfiltration attempts over DNS.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'dest_port') as destination_port,
  (event ->> 'proto') as protocol,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'app_proto') = 'dns'
  and (event ->> 'dest_port') = '53'
  and (event ->> 'proto') = 'UDP'
order by
  event_timestamp desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Monitor Network Firewall Alerts ✅

Query

Monitor Network Firewall Alerts

Identify and analyze alert events generated by the Network Firewall. This helps monitor security rules and detect potential threats.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'alert' ->> 'action') as alert_action,
  (event ->> 'alert' ->> 'signature') as alert_signature,
  (event ->> 'alert' ->> 'severity') as alert_severity,
  (event ->> 'alert' ->> 'category') as alert_category,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'alert') is not null
order by
  event_timestamp desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Monitor Traffic Through Specific Firewall ✅

Query

Monitor Traffic Through Specific Firewall

Retrieve network traffic flows through a specific Network Firewall. This helps analyze firewall behavior and troubleshoot connectivity issues.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'src_port') as source_port,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'dest_port') as destination_port,
  (event ->> 'proto') as protocol,
  (event ->> 'app_proto') as app_protocol,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  firewall_name = 'main-firewall'
order by
  event_timestamp desc
limit 100;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Netflow Traffic Analysis ❌

Query

Netflow Traffic Analysis

Analyze netflow data to understand traffic patterns, including bytes transferred and packet counts.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'proto') as protocol,
  (event ->> 'netflow' ->> 'bytes') as bytes,
  (event ->> 'netflow' ->> 'packets') as packets,
  (event ->> 'netflow' ->> 'start_time') as start_time,
  (event ->> 'netflow' ->> 'end_time') as end_time,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'event_type') = 'netflow'
  and (event ->> 'netflow') is not null
order by
  event_timestamp desc
limit 100;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation Suggestion: use dot notation for accessing STRUCT fields: event.src_ip, event.dest_ip, event.proto, event.netflow.bytes, event.netflow.packets, event.netflow.start_time, event.netflow.end_time
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query Suggestion: update the title to reflect the 100 limit, e.g., "Top 100 Netflow Traffic Analysis"
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column Suggestion: replace event_timestamp with tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A
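
A possible revision that applies the suggestions above is sketched here. It assumes `event` is a STRUCT exposing the nested fields named in the dot-notation suggestion, that `event.event_type` exists alongside them, and that `tp_timestamp` is populated for every row. The title is the example given in the title check. If `event` turns out to be a JSON column, the original `->>` expressions would stay and only the timestamp column and title would change.

```sql
-- Top 100 Netflow Traffic Analysis
-- Sketch only: assumes event is a STRUCT with the nested fields listed in the review.
select
  tp_timestamp,
  event.src_ip as source_ip,
  event.dest_ip as destination_ip,
  event.proto as protocol,
  event.netflow.bytes as bytes,
  event.netflow.packets as packets,
  event.netflow.start_time as start_time,
  event.netflow.end_time as end_time,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  event.event_type = 'netflow'
  and event.netflow is not null
order by
  tp_timestamp desc
limit 100;
```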

TLS Connection Analysis ✅

Query

TLS Connection Analysis

Monitor TLS connections and any related errors to identify potential SSL/TLS issues or suspicious encrypted traffic.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'sni') as server_name_indication,
  (event ->> 'tls_inspected') as tls_inspected,
  (event ->> 'tls_error' ->> 'error_message') as tls_error,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'app_proto') = 'tls'
order by
  event_timestamp desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A

Unusually Large Data Transfers ✅

Query

Unusually Large Data Transfers

Identify unusually large data transfers based on bytes transferred. This helps detect potential data exfiltration or abnormal network usage.

select
  event_timestamp,
  (event ->> 'src_ip') as source_ip,
  (event ->> 'dest_ip') as destination_ip,
  (event ->> 'netflow' ->> 'bytes') as bytes,
  (event ->> 'netflow' ->> 'packets') as packets,
  (event ->> 'proto') as protocol,
  firewall_name,
  availability_zone
from
  aws_network_firewall_log
where
  (event ->> 'netflow' ->> 'bytes')::bigint > 1000000 -- 1MB
order by
  bytes desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp as the first column
Non-aggregated queries should include columns related to where the resources exist
Non-aggregated queries should place columns related to where the resources exist last
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A
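
One caveat on the sort in this query: in DuckDB, `->>` returns text, so `order by bytes desc` on the uncast alias would compare strings rather than numbers, and '9500000' would sort ahead of '10000000'. A minimal, self-contained sketch of sorting on the cast value instead is shown below. The `flows` rows are made-up sample data standing in for the table, and `->` is used for the intermediate step so the nested object stays JSON.

```sql
-- Hypothetical sample rows standing in for aws_network_firewall_log.
with flows as (
  select '{"netflow": {"bytes": 2000000}}'::json as event
  union all
  select '{"netflow": {"bytes": 10000000}}'::json
  union all
  select '{"netflow": {"bytes": 9500000}}'::json
)
select
  -- Cast in the select list so the alias used by order by is numeric.
  (event -> 'netflow' ->> 'bytes')::bigint as bytes
from
  flows
order by
  bytes desc;
```

With the cast applied in the select list, the `bytes` alias is a bigint and the rows come back in numeric order, largest first.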

High-Volume Network Traffic by Source IP ✅

Query

High-Volume Network Traffic by Source IP

Find network sources generating a high number of connections, which helps detect possible denial-of-service (DoS) attacks or heavy application usage.

select
  (event ->> 'src_ip') as source_ip,
  count(*) as connection_count,
  date_trunc('hour', event_timestamp) as traffic_hour
from
  aws_network_firewall_log
where
  (event ->> 'src_ip') is not null
group by
  source_ip,
  traffic_hour
having
  count(*) > 50
order by
  connection_count desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Destination Port Usage Analysis ✅

Query

Destination Port Usage Analysis

Analyze traffic by destination port to understand which services are most frequently accessed across your network.

select
  (event ->> 'dest_port') as destination_port,
  (event ->> 'app_proto') as app_protocol,
  count(*) as connection_count
from
  aws_network_firewall_log
where
  (event ->> 'dest_port') is not null
group by
  destination_port,
  app_protocol
order by
  connection_count desc
limit 20;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators
JSON type columns are wrapped in parenthesis
Space before and after each -> and ->>
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Title contains limit value if in query N/A
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp as the first column N/A
Non-aggregated queries should include columns related to where the resources exist N/A
Non-aggregated queries should place columns related to where the resources exist last N/A
Non-aggregated queries should only include tp_index if missing index information in other columns N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc
