Description
Apache Airflow Provider(s)
weaviate
Versions of Apache Airflow Providers
apache-airflow-providers-weaviate: 3.2.2
Apache Airflow version
3.0.5
Operating System
Ubuntu 22.05 (WSL2)
Deployment
Astronomer
Deployment details
Using Astronomer dev deployment locally.
astro dev start
What happened
The WeaviateHook incorrectly defaults to port 80 for HTTP connections, ignoring the port attribute specified in the Airflow Connection settings.
When trying to connect to a Weaviate instance running on a custom port (e.g., 8080) without HTTPS, the hook attempts to connect to port 80, resulting in a Connection refused error.
The error log clearly shows the wrong port being used:
ERROR - Error testing Weaviate connection: Connection to Weaviate failed. Details: Error: [Errno 111] Connection refused.
Is Weaviate running and reachable at http://weaviate:80?
What you think should happen instead
The WeaviateHook should respect the port field defined in the Airflow Connection.
Given a connection configured with host: "weaviate" and port: 8080, the hook should attempt to connect to http://weaviate:8080.
How to reproduce
Create a Weaviate connection in Airflow with the following configuration (specifically, a non-standard port and use_https set to false):
JSON
{
"connection_id": "weaviate_default",
"conn_type": "weaviate",
"description": "Weaviate vector DB",
"host": "weaviate",
"login": null,
"schema": null,
"port": 8080,
"password": null,
"extra": "{\"use_https\": false, \"grpc_host\": \"weaviate:8080\", \"grpc_port\": 50051, \"grpc_secure\": false}"
}
Use the WeaviateHook to test the connection from within an Airflow environment (e.g., airflow kerberos or a simple Python script):
Python
from airflow.providers.weaviate.hooks.weaviate import WeaviateHook
hook = WeaviateHook(conn_id="weaviate_default")
hook.test_connection()
Observe the logs: they show a connection error and a message indicating that the hook is trying to connect to port 80 instead of 8080.
Anything else
The root cause of this bug is in the get_conn method of the WeaviateHook (airflow/providers/weaviate/src/airflow/providers/weaviate/hooks/weaviate.py).
The specific line is: 144 -> http_port=conn.port or 443 if http_secure else 80,
Due to Python's operator precedence (a conditional expression binds more loosely than or), this line is evaluated as (conn.port or 443) if http_secure else 80. When http_secure is False, the expression always resolves to 80, completely ignoring the value of conn.port.
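The precedence issue can be reproduced without Airflow or Weaviate. The following is a minimal sketch: buggy_http_port is a hypothetical stand-in that copies only the expression quoted above, with port playing the role of conn.port.

```python
def buggy_http_port(port, http_secure):
    """Hypothetical stand-in for the quoted expression (not the hook itself).

    Python parses this as: (port or 443) if http_secure else 80
    """
    return port or 443 if http_secure else 80


# The configured port is honoured only when http_secure is True.
print(buggy_http_port(8080, True))    # 8080
print(buggy_http_port(None, True))    # 443
# With http_secure False, the configured port is discarded.
print(buggy_http_port(8080, False))   # 80
print(buggy_http_port(None, False))   # 80
```

The third call corresponds to the connection from the reproduction steps: port 8080 with use_https set to false resolves to 80.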
Suggested Fix:
The logic should be corrected by adding parentheses so that the conditional expression only supplies the default when conn.port is not set:
Python
From
http_port=conn.port or 443 if http_secure else 80,
To
http_port=conn.port or (443 if http_secure else 80),
Or, more explicitly:
Python
http_port = conn.port if conn.port else (443 if http_secure else 80)
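As a quick check of the proposed expression, here is a minimal sketch that isolates it in a hypothetical helper, resolve_http_port (this name does not exist in the provider), and asserts the expected port for each combination of inputs. It could serve as a starting point for a regression test, assuming the real test would exercise get_conn rather than a standalone function.

```python
def resolve_http_port(port, http_secure):
    """Hypothetical helper mirroring the proposed fix.

    An explicitly configured port wins; otherwise fall back to the
    scheme default (443 for HTTPS, 80 for HTTP).
    """
    return port or (443 if http_secure else 80)


# (configured port, http_secure) -> expected port
cases = {
    (8080, False): 8080,  # the case from this report
    (8080, True): 8080,
    (None, False): 80,
    (None, True): 443,
}

for (port, secure), expected in cases.items():
    assert resolve_http_port(port, secure) == expected

print("all cases pass")
```

Note that, because the expression relies on truthiness, a port of 0 is treated the same as an unset port and falls back to the scheme default.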
Interestingly, the logic for grpc_port in the same method is correct, as it uses extras.pop("grpc_port", ...) which correctly prioritizes the user-defined value from the extra field. The http_port logic fails to do the same for the standard port field.
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct