Add support for offline response scanning #23

Closed
wants to merge 2 commits into from


confuciussayuhm

Pull Request Template

Description

This PR introduces offline analysis capabilities to humble.py, allowing users to analyze HTTP headers from raw response files without making live requests. This makes the tool more versatile, particularly for analyzing historical responses or working in environments with limited connectivity.

Key changes:

  • Added new -if/--input-file parameter for offline analysis
  • Implemented parse_offline_headers() to process raw HTTP response files
  • Created determine_scheme_safety() for better URL scheme validation
  • Added get_display_url() for consistent URL representation
  • Enhanced error handling for invalid URLs and file processing
  • Updated documentation and help text to reflect new functionality

Fixes # (no specific issue referenced)

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Tests performed to verify the changes include:

  • Offline Analysis Testing

    • Tested with valid HTTP response files containing various header combinations
    • Verified parsing of status codes and headers from raw response files
    • Validated error handling for malformed input files
  • URL Handling Testing

    • Tested URL scheme safety validation with HTTP/HTTPS URLs
    • Verified handling of invalid URLs and empty URL scenarios
    • Confirmed proper display of URLs in analysis output

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
    • Added docstrings for new functions
    • Included detailed comments explaining offline analysis logic
  • I have made corresponding changes to the documentation
    • Updated help text and usage information
    • Added documentation for new command line parameters
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

@rfc-st rfc-st self-assigned this Oct 29, 2024
@rfc-st rfc-st added the enhancement New feature or request label Oct 29, 2024
@rfc-st
Owner

rfc-st commented Oct 31, 2024

Hi, @confuciussayuhm

First of all, thanks for this PR and for your time! I have a few comments about it (the first ones that come to mind):

  • When you mention “raw response files”, do you mean a text file containing certain HTTP response headers, like those in the following image, that 'humble' would parse via the proposed new parameters? (This example shows Google's headers: https://imgur.com/a/HqKpGj0.) Would this text file need a specific format to be parsed correctly? Do you have a concrete example?

  • If this is what you want: having reviewed your PR, note that the HTTP response header analysis in 'humble' is based on the following functions and logic: print_missing_headers(), check_frame_options(), get_fingerprint_headers(), print_fingerprint_headers(), all the checks based on the response from the requests library (identified with the comment “# Section '3. Deprecated HTTP Response Headers/Protocols and Insecure Values'”), print_empty_headers() and print_browser_compatibility(). All of them must be taken into account when analyzing HTTP response headers.

  • To correctly calculate and display the statistics of the analyses performed (parameter '-a'), each analysis must have a URL associated with it (e.g., the function url_analytics()).

  • Similarly, certain insecure header checks require knowledge of the scheme (http:// or https://) of the URL being analyzed.
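
To illustrate why the scheme matters, here is a minimal sketch of one scheme-dependent check. The function name `scheme_dependent_findings` and its messages are hypothetical and are not taken from 'humble'; the only facts relied on are that Strict-Transport-Security is expected on HTTPS responses and that browsers ignore it when it arrives over plain HTTP.

```python
from typing import Dict, List
from urllib.parse import urlparse


def scheme_dependent_findings(url: str, headers: Dict[str, str]) -> List[str]:
    """Hypothetical helper: report findings that depend on the URL scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ("http", "https"):
        # Without a usable scheme the check below cannot be evaluated.
        raise ValueError(f"URL needs an http:// or https:// scheme: {url!r}")

    # Header names are case-insensitive, so compare them in lowercase.
    names = {name.lower() for name in headers}
    has_hsts = "strict-transport-security" in names

    findings: List[str] = []
    if scheme == "https" and not has_hsts:
        findings.append("Strict-Transport-Security is missing")
    elif scheme == "http" and has_hsts:
        findings.append(
            "Strict-Transport-Security is ignored by browsers over plain HTTP"
        )
    return findings
```

The same header set therefore yields different findings depending on the URL, which is why a headers-only file is not enough for this kind of check.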

Let's do the following: please show me some examples of "raw response files" and I will investigate how to integrate their parsing into 'humble': the files must include request status codes and the parsed URL to which those HTTP response headers correspond.

Thanks!

@rfc-st rfc-st added the question Further information is requested label Oct 31, 2024
@confuciussayuhm
Author

Hi @rfc-st,

Thank you for taking the time to review my PR. To answer your questions:

  • A response like those logged by Burp (I want humble to be able to parse saved responses after the live testing site has been taken down).
  • Noted, I assume I missed this logic.
  • If the offline flag is used, humble could prompt the user to enter the original URL.
  • If the offline flag is used, humble could prompt the user to enter the original URL.
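
The prompting idea in the last two points could be sketched roughly as follows. Everything here is an assumption made for illustration: the function name `ask_original_url`, the prompt text, and the retry limit are hypothetical and do not describe how 'humble' behaves.

```python
from typing import Callable
from urllib.parse import urlparse


def ask_original_url(read_line: Callable[[str], str] = input,
                     attempts: int = 3) -> str:
    """Hypothetical helper: ask for the URL the saved response came from.

    `read_line` defaults to the built-in input(); passing another callable
    makes the function usable without a terminal.
    """
    for _ in range(attempts):
        candidate = read_line("Original URL of the saved response: ").strip()
        parsed = urlparse(candidate)
        # Require both a web scheme and a host so later checks know the scheme.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return candidate
        print("Please enter a full URL starting with http:// or https://")
    raise ValueError("no valid URL provided")
```

Accepting the URL as a command-line argument instead of prompting would keep the tool scriptable; the prompt is shown only because it is the option suggested above.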

Here's a sample raw response file:
Request:

curl --path-as-is -i -s -k -X $'GET' \
    -H $'Host: normandy.cdn.mozilla.net' -H $'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:131.0) Gecko/20100101 Firefox/131.0' -H $'Accept: application/json' -H $'Accept-Language: en-US,en;q=0.5' -H $'Accept-Encoding: gzip, deflate, br' -H $'Priority: u=4' -H $'Te: trailers' -H $'Connection: keep-alive' \
    $'https://normandy.cdn.mozilla.net/api/v1/'

Response in Burp:

HTTP/2 200 OK
Server: nginx
Content-Length: 598
Allow: GET, HEAD, OPTIONS
Content-Security-Policy: object-src 'none'; base-uri 'none'; block-all-mixed-content; frame-src 'none'; default-src 'self' https://normandy.cdn.mozilla.net/; worker-src 'none'; form-action 'self'; report-uri /__cspreport__
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000
Via: 1.1 google
Date: Wed, 23 Oct 2024 19:41:21 GMT
Cache-Control: public, max-age=86400
Content-Type: application/json
Vary: Accept, Origin
Age: 4235
Alt-Svc: clear

{"action-list":"https://normandy.cdn.mozilla.net/api/v1/action/","action-signed":"https://normandy.cdn.mozilla.net/api/v1/action/signed/","approvalrequest-list":"https://normandy.cdn.mozilla.net/api/v1/approval_request/","classify-client":"https://classify-client.services.mozilla.com/api/v1/classify_client/","extension-list":"https://normandy.cdn.mozilla.net/api/v1/extension/","recipe-list":"https://normandy.cdn.mozilla.net/api/v1/recipe/","recipe-signed":"https://normandy.cdn.mozilla.net/api/v1/recipe/signed/","reciperevision-list":"https://normandy.cdn.mozilla.net/api/v1/recipe_revision/"}
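
A file in the format above (status line, header lines, blank line, body) can be split with a few lines of Python. This is only a sketch under that assumed format; the function name `parse_raw_response` is hypothetical and is neither the `parse_offline_headers()` from this PR nor the parser that ended up in 'humble'.

```python
from typing import Dict, Tuple


def parse_raw_response(text: str) -> Tuple[int, Dict[str, str]]:
    """Hypothetical helper: return (status_code, headers) from a raw dump."""
    lines = text.lstrip().splitlines()
    if not lines or not lines[0].startswith("HTTP/"):
        raise ValueError("missing HTTP status line")

    # Status line looks like "HTTP/2 200 OK" or "HTTP/1.1 404 Not Found".
    parts = lines[0].split(maxsplit=2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError(f"malformed status line: {lines[0]!r}")
    status = int(parts[1])

    headers: Dict[str, str] = {}
    for line in lines[1:]:
        if not line.strip():
            break  # the first blank line separates headers from the body
        # Split on the first colon only: values such as dates contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Simplification: a repeated header name overwrites the earlier value.
        headers[name.strip()] = value.strip()
    return status, headers
```

Note that the response itself does not carry the requested URL, so the URL still has to be supplied separately for the URL-dependent parts of the analysis.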

@rfc-st
Owner

rfc-st commented Oct 31, 2024

Hi, @confuciussayuhm

Cool!

Let me look into the specific format of curl and Burp Suite responses (and maybe, why not, other tools!) when they request the HTTP response headers of a URL, so that 'humble' can parse and analyze them (HTTP status code, URL and HTTP header/value).

Great suggestion! I'll keep you posted in this thread.

Thanks.

Best regards,

@rfc-st
Owner

rfc-st commented Nov 1, 2024

Hi @confuciussayuhm,

Please do a 'git pull' to get the latest changes:

  • New '-if' parameter: analyzes the HTTP response headers, and their values, from a text file.
  • Includes a check that the file from the previous point exists.
  • Added messages, in English and Spanish, for various errors that may arise when processing this file.
  • And, of course, mentioning you!
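
For readers who want to see the shape of such a parameter, here is a minimal argparse sketch with a file-existence check. It is an illustration only and not the code that was committed: the program name `humble-sketch`, the `-u` option, the help texts and the error message are assumptions, and only the `-if` flag itself comes from this thread.

```python
import argparse
import os
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="humble-sketch")
    # Assumed companion option: the URL the saved headers belong to.
    parser.add_argument("-u", dest="url", help="URL the headers belong to")
    # "if" is a Python keyword, so store the value under an explicit dest.
    parser.add_argument("-if", dest="input_file",
                        help="text file with raw HTTP response headers")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Fail early, with a readable message, when the file does not exist.
    if args.input_file and not os.path.isfile(args.input_file):
        parser.error(f"input file not found: {args.input_file}")
    return args
```

With this sketch, `parse_args(["-if", "response.txt", "-u", "https://example.com"])` returns a namespace when `response.txt` exists and exits with argparse's usual usage error otherwise.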

Try these changes, please ... and if you miss anything, let me know in this thread.

Thanks!

Regards,

@confuciussayuhm confuciussayuhm closed this by deleting the head repository Nov 1, 2024
@confuciussayuhm
Author

Looking good man! Thanks for the addition!
