Add Utility Functions for URL Manipulation and Query Parameter Handling #40

Merged 1 commit on Nov 18, 2024

Conversation

@andremartinsds (Contributor) commented Nov 15, 2024

This PR includes several utility functions for handling URL manipulation in Go. Each function serves a specific purpose to parse, modify, and build URLs with query parameters and other components.

Summary by CodeRabbit

  • New Features
    • Introduced a new package for URL utilities, including functions for building URLs, adding query parameters, validating URLs, extracting domains, and retrieving query parameters.
  • Tests
    • Added a comprehensive suite of unit tests to ensure the reliability and functionality of the new URL utilities, covering various scenarios and error handling.
  • Chores
    • Updated module dependencies, including an upgrade to the golang.org/x/text module and the addition of a new indirect dependency on golang.org/x/net.

coderabbitai bot (Contributor) commented Nov 15, 2024

Walkthrough

The changes introduce a new package named url, which includes several utility functions for constructing, manipulating, and validating URLs. The package provides methods such as BuildURL, AddQueryParams, IsValidURL, ExtractDomain, and GetQueryParam. Additionally, a corresponding test file url_test.go is created to implement unit tests for these functions, ensuring their correctness through a table-driven testing approach. The go.mod file is updated to reflect changes in module dependencies.

Changes

File | Change Summary
url/url.go | Added a new package url with functions: BuildURL, AddQueryParams, IsValidURL, ExtractDomain, and GetQueryParam.
url/url_test.go | Added unit tests for the functions in the url package, including tests for error handling.
go.mod | Updated dependency version for golang.org/x/text from v0.14.0 to v0.20.0 and added an indirect dependency on golang.org/x/net version v0.31.0.

Possibly related issues

  • [FEATURE] URL utilities (#35): The changes in this PR directly address the proposal to introduce a set of URL utility functions, including those for building, validating, and manipulating URLs, as outlined in the issue description.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (5)
url/url_test.go (1)

1-3: Enhance package documentation.

The current package documentation is too brief. Consider adding:

  • A detailed description of the package's purpose
  • List of available functions with their descriptions
  • Usage examples for common scenarios

Example improvement:

/*
-Package url defines url utilities helpers.
+Package url provides utility functions for URL manipulation and validation.
+
+Available functions:
+  - BuildURL: Constructs a URL from scheme, host, path, and query parameters
+  - AddQueryParams: Adds query parameters to an existing URL
+  - IsValidURL: Validates a URL string against allowed schemes
+  - ExtractDomainWithSubdomain: Extracts domain with subdomain from a URL
+  - ExtractDomain: Extracts the main domain from a URL
+  - GetQueryParam: Retrieves a specific query parameter value
+
+Example Usage:
+    url, _ := BuildURL("https", "example.com", "/path", map[string]string{"key": "value"})
+    // Returns: "https://example.com/path?key=value"
*/
url/url.go (4)

1-3: Enhance package documentation

The current package documentation is too brief. Consider adding more details about:

  • The package's primary purpose
  • Common use cases
  • Any assumptions or limitations
  • Examples of how the package fits into larger applications
 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation, including functions for
+building URLs, managing query parameters, validating URLs, and extracting
+domain information. It wraps Go's net/url package to provide higher-level
+operations commonly needed in web applications.
+
+Common use cases include:
+- Constructing URLs with query parameters
+- Validating URLs against allowed schemes
+- Extracting domain information from URLs
+- Managing query parameters in existing URLs
 */

164-183: Simplify URL validation logic

The function can be simplified by:

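  1. Using a map for scheme lookup instead of iterating the slice (note that building the map from the slice on every call is itself O(n), so the O(1) lookup only pays off when the map is built once)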
  2. Simplifying the return statement as suggested by static analysis
+// defaultAllowedSchemes is initialized once and reused
+var defaultAllowedSchemes = map[string]bool{
+    "http": true,
+    "https": true,
+}

 func IsValidURL(urlStr string, allowedReqSchemes []string) bool {
     parsedURL, err := url.Parse(urlStr)
     if err != nil {
         return false
     }
 
-    allowedSchemes := false
-    for _, scheme := range allowedReqSchemes {
-        if parsedURL.Scheme == scheme {
-            allowedSchemes = true
-            break
-        }
-    }
-
-    if !allowedSchemes {
-        return false
+    // Convert slice to map for O(1) lookup
+    allowedSchemes := make(map[string]bool, len(allowedReqSchemes))
+    for _, scheme := range allowedReqSchemes {
+        allowedSchemes[scheme] = true
     }
 
-    return true
+    return allowedSchemes[parsedURL.Scheme]
 }
🧰 Tools
🪛 golangci-lint

178-178: S1008: should use 'return allowedSchemes' instead of 'if !allowedSchemes { return false }; return true'

(gosimple)
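
To make the trade-off in this suggestion concrete, here is a minimal sketch in which the scheme set is built once at package level, so each call performs only a map lookup. The names defaultSchemes and hasAllowedScheme are hypothetical and are not part of the PR; the sketch relies only on net/url from the standard library.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultSchemes is a hypothetical allow-list built once at package
// initialization, so validation never rebuilds it per call.
var defaultSchemes = map[string]struct{}{
	"http":  {},
	"https": {},
}

// hasAllowedScheme reports whether rawURL parses and its scheme is in allowed.
func hasAllowedScheme(rawURL string, allowed map[string]struct{}) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	_, ok := allowed[u.Scheme]
	return ok
}

func main() {
	fmt.Println(hasAllowedScheme("https://example.com/path", defaultSchemes))
	fmt.Println(hasAllowedScheme("ftp://example.com/file", defaultSchemes))
	fmt.Println(hasAllowedScheme("example.com", defaultSchemes))
}
```

For an allow-list of two or three schemes, a plain loop over the slice is likely just as fast; the map mainly helps readability and removes the flag variable that the linter complains about.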


287-300: Add consistent edge case handling

For consistency with ExtractDomainWithSubdomain, consider handling special cases like IP addresses and localhost similarly.

 func ExtractDomain(urlStr string) (string, error) {
     parsedURL, err := url.ParseRequestURI(urlStr)
     if err != nil {
         return "", err
     }
 
     host := parsedURL.Hostname()
     if host == "" {
         return "", errors.New("parameter not found")
     }
 
+    // Handle special cases consistently with ExtractDomainWithSubdomain
+    if net.ParseIP(host) != nil {
+        return "", errors.New("IP addresses not supported")
+    }
+    if host == "localhost" {
+        return host, nil
+    }
+
     return host, nil
 }

1-351: Consider adding comprehensive test coverage

While the implementation looks solid, consider adding:

  1. Table-driven tests for each function
  2. Edge case testing (empty strings, malformed URLs, etc.)
  3. Benchmark tests for performance-critical functions
  4. Examples that can be verified by go test

Would you like me to help generate a comprehensive test suite for this package?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 98bbcc4 and 1796128.

📒 Files selected for processing (2)
  • url/url.go (1 hunks)
  • url/url_test.go (1 hunks)
🧰 Additional context used
🪛 golangci-lint
url/url.go

178-178: S1008: should use 'return allowedSchemes' instead of 'if !allowedSchemes { return false }; return true'

(gosimple)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (6)
go.mod (1)

3-3: Consider updating the Go version requirement

The module currently targets Go 1.18, which is no longer supported upstream. Consider upgrading to a newer release (Go 1.23 was the current release at the time of this review) to benefit from:

  • Performance improvements
  • Better security features
  • New language features that could help with URL manipulation
  • Better compatibility with recent versions of dependencies
url/url.go (3)

1-3: Enhance package documentation

Consider expanding the package documentation to include:

  • Main purpose and use cases
  • Common usage patterns
  • Any assumptions or limitations
 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation, including construction,
+validation, and parameter handling. It supports common operations such as:
+  - Building URLs with query parameters
+  - Validating URLs against allowed schemes
+  - Extracting domains and query parameters
+  - Adding query parameters to existing URLs
+
+The package ensures proper URL encoding and validation while handling
+common edge cases and security concerns.
 */

66-67: Use raw strings for regular expressions

Using raw strings for regular expressions improves readability and reduces escape character complexity.

-		re := regexp.MustCompile("^[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)+$")
+		re := regexp.MustCompile(`^[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)+$`)

-		re := regexp.MustCompile("^[a-zA-Z]+(\\/[a-zA-Z]+)*$")
+		re := regexp.MustCompile(`^[a-zA-Z]+(/[a-zA-Z]+)*$`)

Also applies to: 76-77
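
A small sketch of why the two spellings are interchangeable: the interpreted string with doubled backslashes and the raw string produce the same pattern text, so switching to the raw form changes readability only. The constant and variable names are made up for this example; the pattern is the host pattern from the diff above.

```go
package main

import (
	"fmt"
	"regexp"
)

const (
	// Interpreted string: every backslash must be doubled.
	escapedHostPattern = "^[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)+$"
	// Raw string: the pattern is written exactly as the regexp engine sees it.
	rawHostPattern = `^[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)+$`
)

var hostRe = regexp.MustCompile(rawHostPattern)

func main() {
	fmt.Println(escapedHostPattern == rawHostPattern) // the two literals are the same string
	fmt.Println(hostRe.MatchString("api.example.com"))
	fmt.Println(hostRe.MatchString("localhost")) // no dot, so this pattern rejects it
}
```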

🧰 Tools
🪛 golangci-lint

66-66: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


89-91: Standardize error handling patterns

Error handling is inconsistent across functions. Some wrap the original error while others create new ones. Consider standardizing the approach:

 	parsedUrl, err := url.Parse(urlMounted)
 	if err != nil {
-		return "", errors.New("URL could not be parsed")
+		return "", fmt.Errorf("failed to parse URL: %w", err)
 	}

 	parsedURL, err := url.Parse(urlStr)
 	if err != nil {
-		return "", errors.New("URL could not be parsed")
+		return "", fmt.Errorf("failed to parse URL: %w", err)
 	}

 	parsedURL, err := url.Parse(urlStr)
 	if err != nil {
-		return "", errors.New("URL could not be parsed")
+		return "", fmt.Errorf("failed to parse URL: %w", err)
 	}

 	parsedURL, err := url.Parse(urlStr)
 	if err != nil {
-		return "", err
+		return "", fmt.Errorf("failed to parse URL: %w", err)
 	}

Also applies to: 144-147, 274-276, 327-329

url/url_test.go (2)

196-241: Clarify test case names in TestIsValidURL

Some test case names in TestIsValidURL are prefixed with "success - is valid URL" but expect false as the result, indicating that the URL is invalid. For clarity and maintainability, consider updating the test case names to accurately reflect whether the URL is expected to be valid or invalid.

For example:

  • Change "success - is valid URL" to "failure - invalid scheme" when want is false due to an invalid scheme.
  • Use descriptive names like "failure - empty URL string" or "failure - missing scheme".
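
The naming convention above can be sketched as a table of cases whose names state the expected outcome. This is written as a plain program so it runs on its own; in a real _test.go file the loop body would sit inside t.Run(tc.name, ...). The function isValidURL is a stand-in that assumes the behavior described in this review, not the PR's actual implementation.

```go
package main

import (
	"fmt"
	"net/url"
)

// isValidURL is an assumed stand-in for the package's IsValidURL.
func isValidURL(rawURL string, schemes []string) bool {
	if rawURL == "" {
		return false
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	for _, s := range schemes {
		if s != "" && u.Scheme == s {
			return true
		}
	}
	return false
}

type validityCase struct {
	name    string
	rawURL  string
	schemes []string
	want    bool
}

// Each name says whether the case is expected to pass and why.
var validityCases = []validityCase{
	{"success - valid https URL", "https://example.com", []string{"http", "https"}, true},
	{"failure - invalid scheme", "ftp://example.com", []string{"http", "https"}, false},
	{"failure - empty URL string", "", []string{"http", "https"}, false},
	{"failure - missing scheme", "example.com", []string{"http", "https"}, false},
}

// failedCases returns the names of cases whose result differs from want.
func failedCases() []string {
	var failed []string
	for _, tc := range validityCases {
		if got := isValidURL(tc.rawURL, tc.schemes); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", len(failedCases()))
}
```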

397-422: Rename TestGetQueryParamError and its test cases for clarity

The test function TestGetQueryParamError contains a test case named "success - get query param with error", which is misleading. Since this function is intended to test error scenarios, consider:

  • Renaming the test function to TestGetQueryParam_Errors to reflect that it focuses on error cases.
  • Updating the test case name to something like "error - parameter not found" to clearly indicate the purpose of the test.

This improves readability and helps other developers quickly understand the intent of the tests.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1796128 and 2629c9c.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod (1 hunks)
  • url/url.go (1 hunks)
  • url/url_test.go (1 hunks)
🧰 Additional context used
🪛 golangci-lint
url/url.go

66-66: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


76-76: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


149-149: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


231-231: S1008: should use 'return allowedSchemes' instead of 'if !allowedSchemes { return false }; return true'

(gosimple)

🔇 Additional comments (4)
go.mod (2)

7-7: Verify the indirect dependency requirement

The addition of golang.org/x/net seems appropriate for URL manipulation functionality. However, since it's marked as indirect, verify if it should be a direct dependency instead, given that the PR adds URL-related utilities.

#!/bin/bash
# Description: Check if golang.org/x/net is used directly in the codebase

# Search for direct imports of net package
rg -l 'import.*"golang.org/x/net'

# Search for URL-related code that might need direct net package usage
ast-grep --pattern 'import ($$$"net/url"$$$)'

5-5: Verify compatibility with the golang.org/x/text upgrade

The upgrade from v0.14.0 to v0.20.0 spans several minor releases (both versions are v0.x, so this is not a major version change). Please ensure there are no breaking changes that could affect existing functionality.

✅ Verification successful

The previous attempt didn't provide the changelog information and had a syntax error. Let's try a different approach to get the release information:


Based on the results, I can now generate a final response. The search results show that:

  1. Only strings/strings.go uses the golang.org/x/text package, specifically for cases and language packages
  2. The usage is minimal and focused on text case manipulation
  3. The actual implementation shows no complex usage that would be prone to breaking changes

The golang.org/x/text upgrade appears safe for this codebase

The codebase has very limited usage of golang.org/x/text, only using basic case conversion functionality in strings/strings.go through the cases and language packages. These are stable APIs that rarely introduce breaking changes. The upgrade should be safe as:

  • Usage is minimal and focused on core functionality
  • Only one file depends on this package
  • The used packages (cases and language) are fundamental packages with stable APIs
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for breaking changes in golang.org/x/text upgrade

# Get the changelog or release notes
gh api repos/golang/text/releases | jq -r '.[] | select(.tag_name | contains("v0.")) | select(.tag_name >= "v0.14.0" and .tag_name <= "v0.20.0") | "\(.tag_name)\n\(.body)\n"'

# Check for usage of potentially affected packages
rg -l "text\." 

Length of output: 2829


Script:

#!/bin/bash
# Let's check for specific usage of golang.org/x/text packages
rg -l "golang.org/x/text"

# Check actual usage of text package functions
rg "text\." -A 2 -B 2

Length of output: 3044


Script:

#!/bin/bash
# Try to get releases information with a simpler query
gh api repos/golang/text/releases/latest

# Also check the current version in go.mod for confirmation
rg "golang.org/x/text" go.mod

Length of output: 1890

url/url.go (2)

272-288: LGTM! Well-implemented domain extraction

The implementation is efficient and handles domain extraction correctly using the publicsuffix package.


325-339: LGTM! Well-implemented query parameter retrieval

The implementation correctly handles parameter existence and empty value cases.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (3)
url/url.go (1)

1-12: Consider adding package-level constants for regex patterns

To improve maintainability and reusability, consider extracting commonly used regex patterns into package-level constants.

Example implementation:

+// Common regex patterns used across the package
+var (
+    hostPattern    = regexp.MustCompile(`^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])(\.[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])*$`)
+    pathPattern    = regexp.MustCompile(`^[\w-]+(?:\/[\w-]+)*$`)
+    paramPattern   = regexp.MustCompile(`^[a-zA-Z0-9-]+$`)
+)
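
A short sketch of the same idea in runnable form: patterns compiled once at package initialization and reused by small helpers, instead of calling regexp.MustCompile on every invocation. The helper names isValidPath and isValidParam are hypothetical; the patterns are the path and parameter patterns quoted elsewhere in this review.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once when the package is initialized; a *regexp.Regexp is safe
// for concurrent use, so the values can be shared by all callers.
var (
	pathPattern  = regexp.MustCompile(`^[a-zA-Z]+(/[a-zA-Z]+)*$`)
	paramPattern = regexp.MustCompile(`^[a-zA-Z0-9-]+$`)
)

// isValidPath reports whether p consists of alphabetic segments separated by slashes.
func isValidPath(p string) bool { return pathPattern.MatchString(p) }

// isValidParam reports whether s contains only alphanumerics and hyphens.
func isValidParam(s string) bool { return paramPattern.MatchString(s) }

func main() {
	fmt.Println(isValidPath("users/profile"))
	fmt.Println(isValidPath("users/42"))
	fmt.Println(isValidParam("page-1"))
	fmt.Println(isValidParam("a b"))
}
```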
url/url_test.go (2)

1-4: Add more comprehensive package documentation.

The current package documentation is too brief. Consider adding:

  • Purpose and scope of the package
  • Examples of common use cases
  • Links to related packages or documentation
 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation and validation.
+
+It offers functions for:
+- Building URLs with query parameters
+- Adding query parameters to existing URLs
+- Validating URLs against allowed schemes
+- Extracting domains from URLs
+- Retrieving specific query parameters
 */

1-417: Add test coverage for edge cases across all test functions.

While the test coverage is good, there are missing edge cases:

  1. URLs with fragments (#)
  2. URLs with authentication (user:pass@)
  3. URLs with non-ASCII characters in paths
  4. URLs with repeated query parameters
  5. URLs with special characters in query values
  6. URLs with IPv4/IPv6 hosts

Would you like me to provide example test cases for these scenarios?
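
To show what tests for these cases would need to assert, the sketch below runs one URL containing userinfo, an IPv6 host, a repeated parameter, an escaped query value, and a fragment through net/url. The struct urlParts and the function inspectURL are hypothetical names for this illustration only.

```go
package main

import (
	"fmt"
	"net/url"
)

// urlParts collects the components that the listed edge cases exercise.
type urlParts struct {
	Host     string
	Port     string
	User     string
	Fragment string
	Tags     []string
	Q        string
}

// inspectURL parses rawURL and extracts the parts relevant to the edge cases.
func inspectURL(rawURL string) (urlParts, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return urlParts{}, fmt.Errorf("failed to parse URL: %w", err)
	}
	q := u.Query()
	return urlParts{
		Host:     u.Hostname(),      // IPv6 brackets and port are stripped
		Port:     u.Port(),
		User:     u.User.Username(), // safe even when no userinfo is present
		Fragment: u.Fragment,
		Tags:     q["tag"],   // repeated parameters keep every value
		Q:        q.Get("q"), // Get returns the decoded first value
	}, nil
}

func main() {
	parts, err := inspectURL("https://user:pass@[::1]:8443/docs?tag=a&tag=b&q=a%26b#intro")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", parts)
}
```

A helper that reads only Query().Get would silently drop the second tag value, which is the kind of behavior the repeated-parameter test case is meant to pin down.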

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2629c9c and d27e458.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod (1 hunks)
  • url/url.go (1 hunks)
  • url/url_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
🧰 Additional context used
🪛 golangci-lint
url/url.go

66-66: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


76-76: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


154-154: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)

🔇 Additional comments (4)
url/url.go (4)

148-170: Previous review comments are still applicable

The existing review comments about optimizing query parameter processing and improving URL validation are still valid for this function.

🧰 Tools
🪛 golangci-lint

154-154: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


206-223: Implementation looks good!

The function is well-implemented with proper error handling and scheme validation.


259-275: Excellent implementation using publicsuffix!

The function makes good use of the publicsuffix package for reliable domain extraction and includes proper error handling.


312-326: Clean implementation with proper error handling!

The function correctly handles parameter existence checks and returns appropriate errors.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
url/url.go (2)

1-3: Enhance package documentation

The current package documentation is too brief. Consider adding more details about the package's purpose, main features, and usage examples.

 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation, including functions for:
+- Building and validating URLs
+- Managing query parameters
+- Extracting domains
+- Supporting various URL schemes (http, https, ws, wss, ftp)
+
+This package is designed to simplify common URL-related operations while ensuring
+proper validation and encoding of URL components.
 */

204-221: Optimize validation order

Consider checking scheme validity before attempting to parse the URL for better performance.

 func IsValidURL(urlStr string, allowedReqSchemes []string) bool {
 	if urlStr == "" {
 		return false
 	}
+
+	// Check whether the URL starts with an allowed scheme before parsing.
+	// Note: this prefix check is case-sensitive, whereas url.Parse lowercases the scheme.
+	matched := false
+	for _, scheme := range allowedReqSchemes {
+		if scheme == "" {
+			return false
+		}
+		if strings.HasPrefix(urlStr, scheme+"://") {
+			matched = true
+			break
+		}
+	}
+	if !matched {
+		return false
+	}
+
-	parsedURL, err := url.Parse(urlStr)
-	if err != nil {
-		return false
-	}
-	for _, scheme := range allowedReqSchemes {
-		if scheme == "" {
-			return false
-		}
-		if parsedURL.Scheme == scheme {
-			return true
-		}
-	}
-	return false
+	// The parsed value is no longer needed, so discard it to avoid an unused variable.
+	_, err := url.Parse(urlStr)
+	return err == nil
 }
url/url_test.go (2)

1-4: Enhance package documentation with more details.

The current documentation is too brief. Consider adding:

  • Purpose and main features of the package
  • Examples of common use cases
  • List of available functions and their purposes
 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation and validation.
+
+It offers functions for:
+- Building URLs with query parameters
+- Adding query parameters to existing URLs
+- Validating URLs against allowed schemes
+- Extracting domains from URLs
+- Retrieving specific query parameters
+
+Example usage:
+    url, _ := BuildURL("https", "example.com", "path", map[string]string{"key": "value"})
+    // Returns: https://example.com/path?key=value
 */

73-73: Fix grammatical error in error message.

The error message is grammatically incorrect and unclear.

-			want: "path is permitted with a-z character and multiple path segments",
+			want: "path must contain only a-z characters and can include multiple path segments",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d27e458 and 6294dba.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod (1 hunks)
  • url/url.go (1 hunks)
  • url/url_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
🧰 Additional context used
🪛 golangci-lint
url/url.go

66-66: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


76-76: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)

🔇 Additional comments (6)
url/url.go (4)

86-103: LGTM! Efficient URL construction implementation

The implementation correctly uses the url.URL struct for URL construction and handles query parameters efficiently.
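
For readers unfamiliar with the approach being praised, here is a minimal sketch of building a URL from a url.URL struct and url.Values. The function buildURL is a simplified illustration that skips the validation the PR's BuildURL performs; it is not the merged implementation.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildURL assembles a URL from its parts and lets net/url handle encoding.
func buildURL(scheme, host, path string, query map[string]string) string {
	q := url.Values{}
	for k, v := range query {
		q.Set(k, v)
	}
	u := url.URL{
		Scheme:   scheme,
		Host:     host,
		Path:     path,
		RawQuery: q.Encode(), // Encode sorts keys and escapes values
	}
	return u.String()
}

func main() {
	fmt.Println(buildURL("https", "example.com", "search/results", map[string]string{
		"q":    "go url",
		"page": "2",
	}))
}
```

Because Encode sorts by key, the output is deterministic even though Go map iteration order is not, which keeps string comparisons in tests stable.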


153-157: LGTM! Good URL scheme validation

The switch statement provides a clean and efficient way to validate URL schemes.


257-273: LGTM! Robust domain extraction implementation

The function correctly uses the publicsuffix package for reliable domain extraction and provides clear error messages.


310-324: LGTM! Proper parameter existence check

The implementation correctly distinguishes between missing parameters and empty values.

url/url_test.go (2)

8-45: Past review comments are still applicable.

The existing review comments about improving test case names, adding more scenarios, and fixing error handling are still valid.


387-412: Past review comments are still applicable.

The existing review comment about refactoring this function to follow the table-driven test pattern and fixing the test name is still valid.

Enhance test coverage for AddQueryParams

Add comprehensive test cases for URL validation

Expand test coverage for domain extraction

Refactor test cases to remove redundancy and add edge cases

Improve empty parameter value handling

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (6)
url/url.go (5)

1-3: Enhance package documentation

Consider adding more comprehensive package documentation that includes:

  • Overview of the package's purpose
  • Common use cases
  • List of available functions
  • Any assumptions or limitations
 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation and validation.
+
+This package offers a set of functions to:
+  - Construct and validate URLs
+  - Add and retrieve query parameters
+  - Extract domains from URLs
+  - Validate URL schemes
+
+The utilities ensure proper URL encoding and handle common edge cases
+while maintaining RFC compliance.
 */

61-84: Improve error message consistency and validation

The error handling could be improved in several ways:

  1. Error messages should be consistent in format (some use "is required" while others use different formats)
  2. Query parameters should be validated before URL construction
 func BuildURL(scheme, host, path string, query map[string]string) (string, error) {
 	var errMessage []string
 	if scheme == "" {
-		errMessage = append(errMessage, "scheme is required")
+		errMessage = append(errMessage, "scheme: required")
 	}
 	if host != "" {
 		re := regexp.MustCompile(`^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])(\.[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])*$`)
 		if !re.MatchString(host) {
-			errMessage = append(errMessage, "the host is not valid")
+			errMessage = append(errMessage, "host: invalid format")
 		}
 	}
 	if host == "" {
-		errMessage = append(errMessage, "host is required")
+		errMessage = append(errMessage, "host: required")
 	}
 
 	if path != "" {
 		re := regexp.MustCompile("^[a-zA-Z]+(\\/[a-zA-Z]+)*$")
 		if !re.MatchString(path) {
-			errMessage = append(errMessage, "path is permitted with a-z character and multiple path segments")
+			errMessage = append(errMessage, "path: must contain only alphabetic characters")
 		}
 	}
+
+	// Validate query parameters
+	for key, value := range query {
+		if key == "" {
+			errMessage = append(errMessage, "query: empty key not allowed")
+		}
+		if value == "" {
+			errMessage = append(errMessage, fmt.Sprintf("query: empty value not allowed for key '%s'", key))
+		}
+	}
🧰 Tools
🪛 golangci-lint

76-76: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)


160-163: Improve query parameter validation

The current implementation has several areas for improvement:

  1. Use raw string notation for regex to avoid double escaping
  2. Provide more descriptive error messages
  3. Separate key and value validation for better error reporting
-		re := regexp.MustCompile("^[a-zA-Z0-9-]+$")
+		re := regexp.MustCompile(`^[a-zA-Z0-9-]+$`)
-		if !re.MatchString(value) || !re.MatchString(key) || value == "" {
-			return "", errors.New("the query parameter is not valid")
+		if !re.MatchString(key) {
+			return "", fmt.Errorf("invalid query parameter key '%s': must contain only alphanumeric characters and hyphens", key)
+		}
+		if value == "" {
+			return "", fmt.Errorf("empty value not allowed for query parameter '%s'", key)
+		}
+		if !re.MatchString(value) {
+			return "", fmt.Errorf("invalid query parameter value '%s': must contain only alphanumeric characters and hyphens", value)
 		}
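
The suggestion above can be read as a standalone validator that reports which part of the pair is invalid. The sketch below assumes a hypothetical function validateQueryParam and uses %q instead of quoting by hand; it is not the code that was merged.

```go
package main

import (
	"fmt"
	"regexp"
)

// paramPattern allows only alphanumeric characters and hyphens.
var paramPattern = regexp.MustCompile(`^[a-zA-Z0-9-]+$`)

// validateQueryParam checks key and value separately so that the error
// message names the exact part that failed.
func validateQueryParam(key, value string) error {
	if !paramPattern.MatchString(key) {
		return fmt.Errorf("invalid query parameter key %q: must contain only alphanumeric characters and hyphens", key)
	}
	if value == "" {
		return fmt.Errorf("empty value not allowed for query parameter %q", key)
	}
	if !paramPattern.MatchString(value) {
		return fmt.Errorf("invalid query parameter value %q: must contain only alphanumeric characters and hyphens", value)
	}
	return nil
}

func main() {
	fmt.Println(validateQueryParam("page", "2"))
	fmt.Println(validateQueryParam("pa ge", "2"))
	fmt.Println(validateQueryParam("page", ""))
	fmt.Println(validateQueryParam("page", "a&b"))
}
```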

212-220: Optimize scheme validation

The current implementation iterates through the allowed schemes sequentially. Consider using a map for O(1) lookup time.

 func IsValidURL(urlStr string, allowedReqSchemes []string) bool {
 	if urlStr == "" {
 		return false
 	}
+
+	// Create a map of allowed schemes for O(1) lookup
+	allowedSchemes := make(map[string]struct{}, len(allowedReqSchemes))
+	for _, scheme := range allowedReqSchemes {
+		if scheme == "" {
+			return false
+		}
+		allowedSchemes[scheme] = struct{}{}
+	}
+
 	parsedURL, err := url.Parse(urlStr)
 	if err != nil {
 		return false
 	}
-	for _, scheme := range allowedReqSchemes {
-		if scheme == "" {
-			return false
-		}
-		if parsedURL.Scheme == scheme {
-			return true
-		}
-	}
-	return false
+
+	_, ok := allowedSchemes[parsedURL.Scheme]
+	return ok
 }

258-270: Improve error messages with context

The error messages could be more descriptive by including the problematic URL and specific reasons for failure.

 	parsedURL, err := url.Parse(urlStr)
 	if err != nil {
-		return "", errors.New("URL could not be parsed")
+		return "", fmt.Errorf("failed to parse URL '%s': %w", urlStr, err)
 	}
 
 	host, err := publicsuffix.EffectiveTLDPlusOne(parsedURL.Hostname())
 	if err != nil {
-		return "", errors.New("could not extract public suffix")
+		return "", fmt.Errorf("failed to extract domain from '%s': %w", urlStr, err)
 	}
 
 	if host == "" {
-		return "", errors.New("parameter not found")
+		return "", fmt.Errorf("no domain found in URL '%s'", urlStr)
 	}
url/url_test.go (1)

1-4: Enhance package documentation.

The current documentation is too brief. Consider adding:

  • Purpose and main features of the package
  • Usage examples for key functions
  • Any important notes or limitations
 /*
-Package url defines url utilities helpers.
+Package url provides utilities for URL manipulation and validation.
+
+It offers functions for:
+- Building URLs with query parameters
+- Adding query parameters to existing URLs
+- Validating URLs against allowed schemes
+- Extracting domains from URLs
+- Retrieving specific query parameters
+
+Example:
+    url, _ := BuildURL("https", "example.com", "path", map[string]string{"key": "value"})
+    // Returns: https://example.com/path?key=value
 */
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6294dba and 67abf73.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod (1 hunks)
  • url/url.go (1 hunks)
  • url/url_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
🧰 Additional context used
📓 Learnings (1)
url/url.go (1)
Learnt from: andremartinsds
PR: kashifkhan0771/utils#40
File: url/url.go:76-79
Timestamp: 2024-11-16T00:59:45.523Z
Learning: In the `BuildURL` function within `url/url.go`, the current path validation regex is appropriate and should not be changed without a specific reason.
🪛 golangci-lint
url/url.go

76-76: S1007: should use raw string (`...`) with regexp.MustCompile to avoid having to escape twice

(gosimple)

🔇 Additional comments (1)
url/url.go (1)

310-324: LGTM!

The implementation is clean, efficient, and handles edge cases properly:

  • Proper URL parsing
  • Efficient parameter lookup
  • Correct handling of missing parameters
  • Clear error messages

@kashifkhan0771 kashifkhan0771 merged commit 62e0ef4 into kashifkhan0771:main Nov 18, 2024
1 check passed
@kashifkhan0771 (Owner) commented

Thanks @andremartinsds for the contribution!

@kashifkhan0771 kashifkhan0771 linked an issue Nov 19, 2024 that may be closed by this pull request
Successfully merging this pull request may close these issues.

[FEATURE] URL utilities