Bug in WeaviateHook: HTTP port logic ignores connection port for non-HTTPS connections #55433

@ArthurKretzer

Description

Apache Airflow Provider(s)

weaviate

Versions of Apache Airflow Providers

apache-airflow-providers-weaviate: 3.2.2

Apache Airflow version

3.0.5

Operating System

Ubuntu 22.05 (WSL2)

Deployment

Astronomer

Deployment details

Using Astronomer dev deployment locally.

astro dev start

What happened

The WeaviateHook incorrectly defaults to port 80 for HTTP connections, ignoring the port attribute specified in the Airflow Connection settings.

When trying to connect to a Weaviate instance running on a custom port (e.g., 8080) without HTTPS, the hook attempts to connect to port 80, resulting in a Connection refused error.

The error log clearly shows the wrong port being used:

ERROR - Error testing Weaviate connection: Connection to Weaviate failed. Details: Error: [Errno 111] Connection refused. 
Is Weaviate running and reachable at http://weaviate:80?

What you think should happen instead

The WeaviateHook should respect the port field defined in the Airflow Connection.

Given a connection configured with host: "weaviate" and port: 8080, the hook should attempt to connect to http://weaviate:8080.

How to reproduce

Create a Weaviate connection in Airflow with the following configuration (specifically, a non-standard port and use_https set to false):

JSON

{
  "connection_id": "weaviate_default",
  "conn_type": "weaviate",
  "description": "Weaviate vector DB",
  "host": "weaviate",
  "login": null,
  "schema": null,
  "port": 8080,
  "password": null,
  "extra": "{\"use_https\": false, \"grpc_host\": \"weaviate:8080\", \"grpc_port\": 50051, \"grpc_secure\": false}"
}

Use the WeaviateHook to test the connection from within an Airflow environment (e.g., airflow kerberos or a simple Python script):

Python

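from airflow.providers.weaviate.hooks.weaviate import WeaviateHook

# Uses the connection defined above (plain HTTP, port 8080).
hook = WeaviateHook(conn_id="weaviate_default")
# Expected to report a connection failure against port 80 instead of 8080.
hook.test_connection()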

Observe the logs: they show a connection error and a message indicating that the hook is trying to reach port 80 instead of 8080.

Anything else

The root cause of this bug is located in the get_conn method of the WeaviateHook (airflow/providers/weaviate/src/airflow/providers/weaviate/hooks/weaviate.py).

The offending line is line 144: http_port=conn.port or 443 if http_secure else 80,

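In Python, a conditional expression binds more loosely than `or`, so this line is evaluated as (conn.port or 443) if http_secure else 80. When http_secure is False, the expression always resolves to 80, completely ignoring the value of conn.port. When http_secure is True, conn.port is honored, which is why the bug only appears for non-HTTPS connections.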
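The grouping can be checked without an Airflow installation. The snippet below is a minimal sketch: `buggy_http_port` is a hypothetical stand-in that mirrors the expression quoted above, not a function from the provider.

```python
import ast


def buggy_http_port(port, http_secure):
    """Hypothetical stand-in mirroring the expression on line 144."""
    return port or 443 if http_secure else 80


# Ask the parser how it groups the expression: the outermost node is the
# conditional expression, and the `or` sits inside its "true" branch.
tree = ast.parse("port or 443 if http_secure else 80", mode="eval")
top_level_node = type(tree.body).__name__      # IfExp
true_branch_node = type(tree.body.body).__name__  # BoolOp

print(top_level_node, true_branch_node)
print(buggy_http_port(8080, False))  # 80: the custom port is dropped
print(buggy_http_port(8080, True))   # 8080: honored only when HTTPS is on
print(buggy_http_port(None, True))   # 443
```
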

Suggested Fix:

The logic should be corrected by parenthesizing the conditional expression, so that it serves only as the fallback when conn.port is not set:

Python

From
http_port=conn.port or 443 if http_secure else 80,

To
http_port=conn.port or (443 if http_secure else 80),

Or, more explicitly:

Python

http_port = conn.port if conn.port else (443 if http_secure else 80)
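To illustrate the intended behavior, here is a small sketch that runs the proposed expression over the four combinations of port and scheme. `resolve_http_port` is a hypothetical helper written for this example; the hook itself computes the value inline.

```python
def resolve_http_port(port, http_secure):
    """Hypothetical helper: an explicit port wins, else the scheme default."""
    return port or (443 if http_secure else 80)


cases = [
    # (conn.port, http_secure, expected http_port)
    (8080, False, 8080),  # the case from this report
    (8080, True, 8080),
    (None, False, 80),
    (None, True, 443),
]

for port, http_secure, expected in cases:
    result = resolve_http_port(port, http_secure)
    assert result == expected, (port, http_secure, result)
    print(f"port={port!r} http_secure={http_secure} -> {result}")
```

Both suggested forms treat any falsy port (None or 0) as "not set", so the scheme default applies in those cases.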

Interestingly, the logic for grpc_port in the same method is correct, as it uses extras.pop("grpc_port", ...) which correctly prioritizes the user-defined value from the extra field. The http_port logic fails to do the same for the standard port field.
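The `dict.pop(key, default)` pattern can be sketched with the extra field from the reproduction steps. The default passed to `pop` is elided in the description above, so the fallback used here (443 or 80 depending on `grpc_secure`) is only an assumption for illustration.

```python
import json

# The "extra" value from the connection in the reproduction steps.
extra = (
    '{"use_https": false, "grpc_host": "weaviate:8080", '
    '"grpc_port": 50051, "grpc_secure": false}'
)
extras = json.loads(extra)

grpc_secure = extras.pop("grpc_secure", False)

# dict.pop returns the stored value when the key exists, so the
# user-defined port takes priority over the default.
# ASSUMPTION: the default expression is illustrative, not taken from the hook.
grpc_port = extras.pop("grpc_port", 443 if grpc_secure else 80)
print(grpc_port)  # 50051

# When the key is absent, the default is used instead.
fallback_port = {}.pop("grpc_port", 443 if grpc_secure else 80)
print(fallback_port)  # 80
```
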

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
