Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add interface to override get_column_from_field | get_sqlalchemy_type function behavior #503

Open
8 tasks done
243f6a8885a308d313198a2e037 opened this issue Nov 20, 2022 · 1 comment
Labels
feature New feature or request

Comments

@243f6a8885a308d313198a2e037
Copy link

243f6a8885a308d313198a2e037 commented Nov 20, 2022

First Check

  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the SQLModel documentation, with the integrated search.
  • I already searched in Google "How to X in SQLModel" and didn't find any information.
  • I already read and followed all the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to SQLModel but to Pydantic.
  • I already checked if it is not related to SQLModel but to SQLAlchemy.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

# current workaround: force extending sqlmodel.main.get_sqlalchemy_type

from typing import Any, Callable

import sqlmodel.main
from pydantic import BaseModel, ConstrainedStr
from pydantic.fields import ModelField
from sqlalchemy import String
from typing_extensions import TypeAlias

_GetSqlalchemyTypeProtocol: TypeAlias = Callable[[ModelField], Any]


def _override_get_sqlalchemy_type(
    original_get_sqlalchemy_type: _GetSqlalchemyTypeProtocol,
) -> _GetSqlalchemyTypeProtocol:
    def _extended_get_sqlalchemy_type(field: ModelField) -> Any:
        if issubclass(field.type_, BaseModel):
            # TODO use sqlalchemy.JSON or CHAR(N) for "known to be short" models
            raise NotImplementedError(field.type_)
        if issubclass(field.type_, ConstrainedStr):
            # MAYBE add CHECK constraint for field.type_.regex
            length = field.type_.max_length
            if length is not None:
                return String(length=length)
            return String()
        return original_get_sqlalchemy_type(field)

    return _extended_get_sqlalchemy_type


sqlmodel.main.get_sqlachemy_type = _override_get_sqlalchemy_type(
    sqlmodel.main.get_sqlachemy_type
)
# MAYBE get_sqlachemy_type -> get_sqlalchemy_type (sqlmodel > 0.0.8)
# cf. <https://github.com/tiangolo/sqlmodel/commit/267cd42fb6c17b43a8edb738da1b689af6909300>

Description

Problem:

  • We want to decide database column types deterministically by model field.
  • Sometimes SQLModel does not provide expected column type, and it is (in some cases) impossible because requirements of SQLModel users are not always the same (e.g. DBMS dialects, strictness of constraints, choice of CHAR vs VARCHAR vs TEXT vs JSON, TIMESTAMP vs DATETIME)

Wanted Solution

Allow user to use customized get_column_from_field | get_sqlalchemy_type function to fit with their own requirements.

Add parameter to model config like sa_column_builder: Callable[[ModelField], Column] = get_column_from_field.

Function get_column_from_field would be better split by the following concerns, to be used as a part of customized sa_column_builder implementation:

  1. Deciding the column type (currently done in get_sqlalchemy_type)
  2. Applying pydantic field options to column type (e.g. nullable, min, max, min_length, max_length, regex, ...)
  3. Applying column options (e.g. primary_key, index, foreign_key, unique, ...)

Possible effects on other issues/PRs:


p.s-1

Conversion rule between Field/column value may become necessary, mainly to serialize field value to column value.
(e.g. Classes inheriting BaseModel cannot be stored directly into sqlalchemy.JSON because it is not JSON or dict. We avoid this by adding json_serializer to create_engine. Deserialize part has no problem because JSON str -> BaseModel will be done by pydantic validation for now (pydantic v1))

def _json_serializer(value: Any) -> str:
    if isinstance(value, BaseModel):
        return value.json()
    return json.dumps(value)

p.s-2

IMO using sqlmodel.sql.sqltypes.AutoString() in alembic revision file is not good from the sight of future migration constancy, and this is one of the reason I overridden get_sqlalchemy_type function.

Wanted Code

################################################################
# expected: any of the following `Foo` / `Bar`


def _custom_sa_column_builder(field: ModelField) -> Column:
    ...


class Foo(SQLModel, table=True):
    class SQLModelConfig:
        sa_column_builder: Callable[[ModelField], Column] = _custom_sa_column_builder
    ...


class Bar(SQLModel, table=True, sa_column_builder=_custom_sa_column_builder):
    ...

Alternatives

  • Write a function that returns sa_column and call it in sqlmodel.Field declaration
    • -> Not easy to apply pydantic-side constraints (e.g. nullable, ConstrainedStr, ...)

Operating System

Linux

Operating System Details

No response

SQLModel Version

0.0.8

Python Version

3.10.7

Additional Context

No response

@bkanuka
Copy link

bkanuka commented Feb 19, 2023

I used this idea to extend / monkeypatch SQLModel to work with STRUCT and ARRAY from BigQuery:

# Monkeypatch sqlmodel.main.get_sqlalchemy_type to work with BigQuery's STRUCT and ARRAY types:

from typing import Any, Callable, List

import sqlalchemy.sql
import sqlmodel.main
from sqlmodel.main import get_sqlachemy_type as original_get_sqlalchemy_type
from pydantic import BaseModel, ConstrainedStr
from pydantic.fields import ModelField
from sqlalchemy import String
from sqlalchemy_bigquery import STRUCT, ARRAY
from typing_extensions import TypeAlias

_GetSqlalchemyTypeProtocol: TypeAlias = Callable[[ModelField], Any]


def extended_get_sqlalchemy_type(field: ModelField) -> Any:
    # Add support for STRUCT
    if issubclass(field.type_, BaseModel):
        return STRUCT(
            *((f.name, extended_get_sqlalchemy_type(f)) for f in field.type_.__fields__.values())
        )

    # Add support for ARRAY
    if issubclass(field.type_, List):
        if field.sub_fields:
            return ARRAY(extended_get_sqlalchemy_type(field.sub_fields[0]))
        return ARRAY()

    # Add support for ConstrainedStr
    if issubclass(field.type_, ConstrainedStr):
        length = field.type_.max_length
        if length is not None:
            return String(length=length)
        return String()

    return original_get_sqlalchemy_type(field)


def patch_sqlmodel() -> None:
    sqlmodel.main.get_sqlachemy_type = extended_get_sqlalchemy_type

patch_sqlmodel()

Now submodels can be used in SQLModel models and they get converted to STRUCT. e.g.:

class Service(SQLModel):
    id: str = Field(description="The ID of the Google Cloud Platform service that offers the SKU.")
    description: str = Field(description="The Google Cloud Platform service that offers the SKU.")

class ServiceTable(SQLModel, table=True):
    service: Service = Field()

I generally try to avoid this kind of patching shenanigans but this has greatly simplified my models and completely gotten rid of the need for sa_column in my SQLModel models

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants