Add table alb_connection_log #106

Open
wants to merge 7 commits into
base: main

Conversation

Priyanka-Chatterjee-2000
Contributor

Example query results

Results
Add example SQL query results here (please include the input queries as well)

@Priyanka-Chatterjee-2000 Priyanka-Chatterjee-2000 marked this pull request as ready for review February 27, 2025 14:35

SQL Query Evaluation Results for aws_alb_conection_log

Daily Connection Trends ❌

Query

Daily Connection Trends

Count connections per day to identify traffic patterns over time. This query provides a comprehensive view of daily connection volume, helping you understand usage patterns, peak hours, and potential seasonal variations in network traffic.

select
  strftime(timestamp, '%Y-%m-%d') as connection_date,
  count(*) as connection_count
from
  aws_alb_connection_log
group by
  connection_date
order by
  connection_date asc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema The column 'timestamp' is not present in the provided schema. Use the correct timestamp column name from the schema.
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc
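
One possible revision that addresses the failed schema check is sketched below. It assumes the table exposes the common Tailpipe column `tp_timestamp` in place of `timestamp`; the schema given to the evaluation is not shown in this thread, so treat the column name as an assumption to verify against the actual table.

```sql
-- Sketch only: assumes tp_timestamp is the event-time column of this table.
select
  strftime(tp_timestamp, '%Y-%m-%d') as connection_date,
  count(*) as connection_count
from
  aws_alb_connection_log
group by
  connection_date
order by
  connection_date asc;
```

Only the source column changes. The timestamp is still bucketed by day before aggregation, so the result shape matches the original query.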

Top 10 Clients by Connection Count ✅

Query

Top 10 Clients by Connection Count

List the top 10 client IP addresses making connection attempts. This query helps identify the most active clients, potential sources of high traffic, and can assist in network security monitoring and capacity planning.

select
  client_ip,
  count(*) as connection_count
from
  aws_alb_connection_log
group by
  client_ip
order by
  connection_count desc
limit 10;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Connection Distribution by Listener Port ✅

Query

Connection Distribution by Listener Port

Analyze how connections are distributed across listener ports. Understanding port-level connection patterns can help optimize network configuration, identify potential bottlenecks, and ensure balanced traffic across different services.

select
  listener_port,
  count(*) as connection_count
from
  aws_alb_connection_log
group by
  listener_port
order by
  connection_count desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

TLS Protocol Distribution ✅

Query

TLS Protocol Distribution

Analyze the distribution of TLS protocols used by clients. This query provides insights into the security and encryption standards of incoming connections, helping identify potential security upgrades or legacy system interactions.

select
  tls_protocol,
  count(*) as connection_count,
  round(count(*) * 100.0 / sum(count(*)) over (), 2) as percentage
from
  aws_alb_connection_log
where
  tls_protocol is not null
group by
  tls_protocol
order by
  connection_count desc;
SQL syntax checks ✅
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Failed TLS Handshakes ❌

Query

Failed TLS Handshakes

Identify connections with TLS handshake verification failures. This query helps detect potential security issues, misconfigured clients, or network problems that prevent successful encrypted connections.

select
  timestamp,
  tp_index as client_ip,
  conn_trace_id,
  client_port,
  tls_protocol,
  tls_cipher,
  tls_verify_status
from
  aws_alb_connection_log
where
  tls_verify_status like 'Failed:%'
order by
  timestamp desc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema Verify that all columns exist in the schema. The provided schema is empty.
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp first
Non-aggregated queries should include tp_index Include tp_index in the SELECT statement
Include resource location columns when available Consider including account_id or other resource location columns if available
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A
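
A revised query along the lines of the suggestions above might look like the following sketch. It assumes `tp_timestamp` and `tp_index` exist as common Tailpipe columns, and that `client_ip` is a real column (it is used by the Top 10 Clients query, which passed the schema check). The original `tp_index as client_ip` alias is dropped here on the assumption that it was unintentional.

```sql
-- Sketch only: tp_timestamp, tp_index, and client_ip are assumed column names.
select
  tp_timestamp,
  tp_index,
  client_ip,
  client_port,
  conn_trace_id,
  tls_protocol,
  tls_cipher,
  tls_verify_status
from
  aws_alb_connection_log
where
  tls_verify_status like 'Failed:%'
order by
  tp_timestamp desc;
```

No resource location column such as `account_id` is added, because the thread does not confirm that one exists for this log type.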

Deprecated TLS Protocols ❌

Query

Deprecated TLS Protocols

Detect usage of deprecated or insecure TLS protocols. This query helps identify outdated SSL/TLS protocols that may pose security risks, allowing you to upgrade and maintain robust encryption standards.

select
  tls_protocol,
  tls_cipher,
  count(*) as connection_count
from
  aws_alb_connection_log
where
  tls_protocol in ('TLSv1.1', 'TLSv1', 'SSLv3', 'SSLv2') -- Insecure protocols
group by
  tls_protocol,
  tls_cipher
order by
  connection_count desc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema The columns tls_protocol, tls_cipher are not present in the provided schema. Verify the correct column names for this log type.
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Slow TLS Handshakes ❌

Query

Slow TLS Handshakes

Top 10 connections with unusually high TLS handshake latency. This query helps identify performance bottlenecks in the TLS negotiation process, which can impact overall connection establishment times and user experience.

select
  timestamp,
  tp_index as client_ip,
  conn_trace_id,
  client_port,
  tls_protocol,
  tls_cipher,
  tls_handshake_latency
from
  aws_alb_connection_log
where
  tls_handshake_latency > 1 -- Handshakes taking longer than 1 second
order by
  tls_handshake_latency desc
limit 10;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema The schema provided is empty. Ensure all columns used in the query exist in the actual schema.
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp first Move 'timestamp' to be the first column in the SELECT statement.
Non-aggregated queries should include tp_index
Include resource location columns when available Consider including 'account_id' or similar column if available in the schema.
Avoid selecting columns with fixed values in WHERE clause
Sorting strategy checks ❌
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc Consider ordering by 'timestamp desc' as the primary or secondary sort criteria to show most recent logs first.
Aggregated queries ordered by count desc or time asc N/A
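
The column selection and sorting suggestions above could be applied roughly as follows. This is a sketch that assumes `tp_timestamp`, `tp_index`, and `client_ip` are valid columns and that `tls_handshake_latency` is numeric and measured in seconds, as the original comment implies.

```sql
-- Sketch only: column names and the latency unit are assumptions.
select
  tp_timestamp,
  tp_index,
  client_ip,
  client_port,
  conn_trace_id,
  tls_protocol,
  tls_cipher,
  tls_handshake_latency
from
  aws_alb_connection_log
where
  tls_handshake_latency > 1 -- handshakes taking longer than 1 second
order by
  tls_handshake_latency desc,
  tp_timestamp desc
limit 10;
```

Latency stays the primary sort key so the query still returns the ten slowest handshakes. `tp_timestamp desc` is added only as a secondary key, which is one of the two options the suggestion allows.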

TLS Cipher Usage ❌

Query

TLS Cipher Usage

Analyze the distribution of TLS ciphers used by connections. This query provides detailed insights into the encryption methods clients are using, helping assess cryptographic diversity and potential security improvements.

select
  tls_cipher,
  tls_protocol,
  count(*) as connection_count,
  round(count(*) * 100.0 / sum(count(*)) over (), 3) as percentage
from
  aws_alb_connection_log
group by
  tls_cipher,
  tls_protocol
order by
  connection_count desc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema Verify that 'tls_cipher', 'tls_protocol' exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ✅
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc

Connection Failure Rate by Time Period ❌

Query

Connection Failure Rate by Time Period

Analyze the rate of connection failures over time. This query helps identify temporal patterns in connection failures, potentially revealing systemic issues, network problems, or security-related connection challenges.

select
  strftime(timestamp, '%Y-%m-%d %H:00:00') as hour,
  count(*) as total_connections,
  sum(case when tls_verify_status like 'Failed:%' then 1 else 0 end) as failed_connections,
  round(sum(case when tls_verify_status like 'Failed:%' then 1 else 0 end) * 100.0 / count(*), 2) as failure_rate
from
  aws_alb_connection_log
group by
  hour
order by
  hour desc;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema Verify that 'timestamp' and 'tls_verify_status' columns exist in the schema
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp Remove 'timestamp' from the query and use 'tp_timestamp' instead
Non-aggregated queries should have tp_timestamp first N/A
Non-aggregated queries should include tp_index N/A
Include resource location columns when available N/A
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc N/A
Aggregated queries ordered by count desc or time asc
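
A minimal sketch of the suggested change, assuming `tp_timestamp` can replace `timestamp` as the source of the hourly bucket and that `tls_verify_status` exists in the schema:

```sql
-- Sketch only: tp_timestamp and tls_verify_status are assumed column names.
select
  strftime(tp_timestamp, '%Y-%m-%d %H:00:00') as hour,
  count(*) as total_connections,
  sum(case when tls_verify_status like 'Failed:%' then 1 else 0 end) as failed_connections,
  round(sum(case when tls_verify_status like 'Failed:%' then 1 else 0 end) * 100.0 / count(*), 2) as failure_rate
from
  aws_alb_connection_log
group by
  hour
order by
  hour desc;
```

The raw `tp_timestamp` is not selected; it is only used to derive the hourly bucket. Note that the failure rate counts TLS verification failures only, since that is the only failure signal the query reads.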

Connection Trace Correlation ❌

Query

Connection Trace Correlation

Link connection logs to access logs using the connection trace ID. This query enables deep investigation of connection lifecycle by correlating low-level connection details with HTTP access information.

select
  c.timestamp as connection_timestamp,
  c.client_ip,
  c.client_port,
  c.tls_protocol,
  c.tls_handshake_latency,
  a.timestamp as access_timestamp,
  a.request_url,
  a.request_http_method,
  a.request_http_version,
  a.elb_status_code
from
  aws_alb_connection_log c
join
  aws_alb_access_log a
on
  c.conn_trace_id = a.conn_trace_id
order by
  c.timestamp desc
limit 10;
SQL syntax checks ❌
Criteria Pass/Fail Suggestions
Use 2 space indentation
Query should end with a semicolon
Keywords should be in lowercase
Each clause is on its own line
All columns exist in the schema Verify that all columns exist in the schema. The provided schema is empty.
STRUCT type columns use dot notation N/A
JSON type columns use -> and ->> operators N/A
JSON type columns are wrapped in parenthesis N/A
SQL query syntax uses valid DuckDB syntax
Title and description checks ✅
Criteria Pass/Fail Suggestions
Title uses title case
Title accurately describes the query
Description explains what the query does
Description explains why a user would run the query
Description is concise
Query relevance checks ✅
Criteria Pass/Fail Suggestions
Provides useful insights for this log type
Relevant to security, operational, or performance monitoring
Column selection checks ❌
Criteria Pass/Fail Suggestions
Aggregated queries should not include tp_index/tp_timestamp N/A
Non-aggregated queries should have tp_timestamp first Move tp_timestamp (c.timestamp) to be the first column in the SELECT statement
Non-aggregated queries should include tp_index Include tp_index in the SELECT statement
Include resource location columns when available Include account_id or other resource location columns if available
Avoid selecting columns with fixed values in WHERE clause N/A
Sorting strategy checks ✅
Criteria Pass/Fail Suggestions
Non-aggregated queries default to tp_timestamp desc
Aggregated queries ordered by count desc or time asc N/A
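
The suggestions above could be applied to the join as in the sketch below. It assumes both tables expose `tp_timestamp` and `tp_index`, and that the `aws_alb_access_log` columns referenced in the original query (`request_url`, `request_http_method`, `request_http_version`, `elb_status_code`, `conn_trace_id`) exist as written. None of this is confirmed by the thread.

```sql
-- Sketch only: all tp_* columns and access log columns are assumptions.
select
  c.tp_timestamp as connection_timestamp,
  c.tp_index,
  c.client_ip,
  c.client_port,
  c.tls_protocol,
  c.tls_handshake_latency,
  a.tp_timestamp as access_timestamp,
  a.request_url,
  a.request_http_method,
  a.request_http_version,
  a.elb_status_code
from
  aws_alb_connection_log c
join
  aws_alb_access_log a
on
  c.conn_trace_id = a.conn_trace_id
order by
  c.tp_timestamp desc
limit 10;
```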

@cbruno10 cbruno10 requested a review from Copilot March 19, 2025 13:15

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request adds support for querying AWS ALB connection logs by introducing a new table implementation and corresponding documentation. The changes include:

  • Adding documentation for the table’s configuration, usage examples, and query examples.
  • Implementing the Go modules for log entry parsing and table handling.
  • Registering the new table in the AWS plugin.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| docs/tables/aws_alb_conection_log/index.md | New documentation for table configuration and examples (note: folder name appears to have a typo). |
| tables/alb_connection_log/alb_connection_log.go | Implements the data structure and map initialization for ALB connection logs. |
| tables/alb_connection_log/alb_connection_log_table.go | Defines the table metadata, source configuration, and row enrichment. |
| aws/plugin.go | Registers the new ALB connection log table. |
| docs/tables/aws_alb_conection_log/queries.md | Provides various SQL query examples for exploring ALB connection log data. |

Comments suppressed due to low confidence (1)

docs/tables/aws_alb_conection_log/queries.md:1

  • The folder name 'aws_alb_conection_log' used in the docs appears to be misspelled; consider renaming it to 'aws_alb_connection_log' to match the actual table name.
## Activity Examples

This table sets the following defaults for the [aws_s3_bucket source](https://hub.tailpipe.io/plugins/turbot/aws/sources/aws_s3_bucket#arguments):

| Argument | Default |
|--------------|---------|

Copilot AI Mar 19, 2025

The folder name 'aws_alb_conection_log' appears to be misspelled; consider renaming it to 'aws_alb_connection_log' to maintain consistency with the table name.

3 participants