Skip to content

Conversation

@JoaquimEsteves
Copy link
Contributor

@JoaquimEsteves JoaquimEsteves commented Mar 10, 2025

Closes #383

Note: I don't have a postgres DB to play with this patch - I'm kind of hopping that the tests are enough. I can report back tomorrow when I'm back at $WORK

Another thing I could do (since I have my hands in the dough) is to add an option to the cli that makes it so that we don't import typing.List, we could very well get away with just using list[<attribute here>].

Indeed typing.List has been deprecated since python 3.9.
I could also do the same for Optional but maybe that's best left to some other PR.

@coveralls
Copy link

coveralls commented Mar 10, 2025

Coverage Status

coverage: 97.074% (+0.02%) from 97.055%
when pulling b0c1186 on JoaquimEsteves:master
into 924181e on agronholm:master.

Copy link
Collaborator

@sheinbergon sheinbergon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your contribution. The result creates a conflicting import where sqlalchemy ARRAY and postgres dialect ARRAY, are imported. You need to adapt your code to make sure only one type surfaces

@JoaquimEsteves
Copy link
Contributor Author

Thank you for your review. I'll implement the necessary changes tomorrow evening.

@JoaquimEsteves
Copy link
Contributor Author

JoaquimEsteves commented Mar 11, 2025

Should have addressed all of the points.

  • ✂️ typing.List is gone
  • ✂️ new comments are gone
  • Simplified the function; although looking at it I kind of prefer the old recursive function, because there's a pre and a post-script component to the wrapped type I couldn't think of a nicer way to structure the function. Thought about making it an iterator or making the column_python_type a list that we can then join at the end, but wasn't happy with either result and so went for the most boring one. If you can think of anything better feel free to suggest or edit the code directly.

Note: That the bug mentioned here seems to have been part of the code for a while, the tests just never checked for an import with the same name from some other module - probably because it just doesn't happen.
I thought about fixing it by changing the way the imports are handled (if ARRAY is already in the name-space give it another name, but then hmmm, what name could that be?) .
I think that that might be slightly out of scope for what I wanted.

Copy link
Collaborator

@sheinbergon sheinbergon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much, much better.

One small comment. We still need to solve the double import issue

@sheinbergon
Copy link
Collaborator

Should have addressed all of the points.

  • ✂️ typing.List is gone
  • ✂️ new comments are gone
  • Simplified the function; although looking at it I kind of prefer the old recursive function, because there's a pre and a post-script component to the wrapped type I couldn't think of a nicer way to structure the function. Thought about making it an iterator or making the column_python_type a list that we can then join at the end, but wasn't happy with either result and so went for the most boring one. If you can think of anything better feel free to suggest or edit the code directly.

Note: That the bug mentioned here seems to have been part of the code for a while, the tests just never checked for an import with the same name from some other module - probably because it just doesn't happen. I thought about fixing it by changing the way the imports are handled (if ARRAY is already in the name-space give it another name, but then hmmm, what name could that be?) . I think that that might be slightly out of scope for what I wanted.

Again, thank you for your efforts. The PR is much better. But, we cannot merge it while the dual ARRAY import remains.
If you need more help working on it, I can provide guidance. If you want me to "take over" and try and tackle it myself, let me know.

@JoaquimEsteves
Copy link
Contributor Author

But, we cannot merge it while the dual ARRAY import remains. If you need more help working on it, I can provide guidance. If you want me to "take over" and try and tackle it myself, let me know.

I can certainly take another crack at it.

Although I do think I would need some help on the dual ARRAY imports. When I tested it out at $WORK it didn't seem to have a double import, so I just thought it was bad test-data.

It could very well be a "skill-issue" on my side, I'm not quite familiar with the code yet.

@sheinbergon
Copy link
Collaborator

But, we cannot merge it while the dual ARRAY import remains. If you need more help working on it, I can provide guidance. If you want me to "take over" and try and tackle it myself, let me know.

I can certainly take another crack at it.

Although I do think I would need some help on the dual ARRAY imports. When I tested it out at $WORK it didn't seem to have a double import, so I just thought it was bad test-data.

It could very well be a "skill-issue" on my side, I'm not quite familiar with the code yet.

Once you're done with the final fix suggested, let me know and I will take a look

@JoaquimEsteves
Copy link
Contributor Author

JoaquimEsteves commented Mar 11, 2025

Once you're done with the final fix suggested, let me know and I will take a look

Done!

Edit: Feel free to ping me when you've gotten closer to the double-array problem.
The closest I got was the comment on line src/sqlacodegen/generators.py:272 telling me that the types have already been adapted - I just can't figure out where 😅

I think the following test demonstrates the problem.

# Fails because postgres_array adds a second ARRAY import
def test_table_with_arrays(generator: CodeGenerator) -> None:
    _ = Table(
        "with_items",
        generator.metadata,
        Column("id", INTEGER, primary_key=True),
        Column("int_items_not_optional", ARRAY(INTEGER()), nullable=False),
        Column("postgres_array", postgresql.ARRAY(INTEGER()), nullable=False),
        Column("str_matrix", ARRAY(VARCHAR(), dimensions=2)),
    )

    validate_code(
        generator.generate(),
        """\
from typing import Optional

from sqlalchemy import ARRAY, INTEGER, Integer, VARCHAR
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass


class WithItems(Base):
    __tablename__ = 'with_items'

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    int_items_not_optional: Mapped[list[int]] = mapped_column(ARRAY(INTEGER()))
    postgres_array: Mapped[list[int]] = mapped_column(ARRAY(INTEGER()))
    str_matrix: Mapped[Optional[list[list[str]]]] = mapped_column(ARRAY(VARCHAR(), dimensions=2))
""",
    )

@sheinbergon
Copy link
Collaborator

@

Once you're done with the final fix suggested, let me know and I will take a look

Done!

Edit: Feel free to ping me when you've gotten closer to the double-array problem. The closest I got was the comment on line src/sqlacodegen/generators.py:272 telling me that the types have already been adapted - I just can't figure out where 😅

I think the following test demonstrates the problem.

# Fails because postgres_array adds a second ARRAY import
def test_table_with_arrays(generator: CodeGenerator) -> None:
    _ = Table(
        "with_items",
        generator.metadata,
        Column("id", INTEGER, primary_key=True),
        Column("int_items_not_optional", ARRAY(INTEGER()), nullable=False),
        Column("postgres_array", postgresql.ARRAY(INTEGER()), nullable=False),
        Column("str_matrix", ARRAY(VARCHAR(), dimensions=2)),
    )

    validate_code(
        generator.generate(),
        """\
from typing import Optional

from sqlalchemy import ARRAY, INTEGER, Integer, VARCHAR
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass


class WithItems(Base):
    __tablename__ = 'with_items'

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    int_items_not_optional: Mapped[list[int]] = mapped_column(ARRAY(INTEGER()))
    postgres_array: Mapped[list[int]] = mapped_column(ARRAY(INTEGER()))
    str_matrix: Mapped[Optional[list[list[str]]]] = mapped_column(ARRAY(VARCHAR(), dimensions=2))
""",
    )

Maybe I'm not seeing things correctly, but after the recent iteration, I'm not seeing the double ARRAY import anymore. I'm also not seeing any misplaced fixes. @JoaquimEsteves am I missing anything?

@sheinbergon sheinbergon requested a review from agronholm March 12, 2025 08:14
@sheinbergon
Copy link
Collaborator

sheinbergon commented Mar 12, 2025

@agronholm this went some review iteration. I pretty much approve what's going in @JoaquimEsteves contribution. Please have another look

Copy link
Owner

@agronholm agronholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What bugs me is the inconsistency. We're now suddenly using list but only in this new code. If we switch to using the generic built-ins, we should do the switch everywhere. And why did you duplicate the column type rendering code?

@sheinbergon
Copy link
Collaborator

sheinbergon commented Mar 12, 2025

What bugs me is the inconsistency. We're now suddenly using list but only in this new code. If we switch to using the generic built-ins, we should do the switch everywhere. And why did you duplicate the column type rendering code?

IMO We can do it in another PR. I don't think adding changes unrelated to Array/List types is a good idea.

@JoaquimEsteves
Copy link
Contributor Author

JoaquimEsteves commented Mar 12, 2025

Hello, thanks for the second look.

We're now suddenly using list but only in this new code. If we switch to using the generic built-ins, we should do the switch everywhere

I second the idea that adding more generic built-ins can be done in future PRs. For example, I didn't change the Optional because the T | None syntax is python 3.10+ only. I wanted to focus only on lists since they were a very common occurrence in the databases I use.

I didn't re-use the render_column_type because it returns sqlalchemy types instead of python types; so Mapped[Integer] instead of the correct Mapped[int]. The function looked quite tricky, so I just wrapped the old code that was on render_column_attributes into some function (I'll readily admit this could very much be a skill issue on my side).

The double array problem went away once I fixed the test data. Running my patch on a postgres DB with arrays didn't yield any double imports.

@agronholm
Copy link
Owner

I didn't re-use the render_column_type because it returns sqlalchemy types instead of python types; so Mapped[Integer] instead of the correct Mapped[int].

Right, of course. But it seems to me like this should be an overrideable method in the generator then, render_annotation() or something. We can probably refactor it that way in a follow-up, but for the moment please at least rename the local function to avoid any misconceptions about its purpose.

@JoaquimEsteves
Copy link
Contributor Author

Ahoy hoy,

Renamed the local function from render_col_type -> render_python_type.

If there's anything else feel free to ping :)

Copy link
Owner

@agronholm agronholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @sheinbergon ?

@sheinbergon
Copy link
Collaborator

@agronholm @JoaquimEsteves let's also document this change in CHANGES.rst?

@agronholm
Copy link
Owner

@agronholm @JoaquimEsteves let's also document this change in CHANGES.rst?

Oh, right, that's still missing.

@JoaquimEsteves
Copy link
Contributor Author

How about:

**3.?.?**

- Array types are no longer partially unknown.

I don't know which version - should I put **UNRELEASED** as the following commit did?

e34718c

@agronholm
Copy link
Owner

I don't know which version - should I put **UNRELEASED** as the following commit did?

e34718c

Exactly that. I actually just pushed a PR template with just such instructions.

@agronholm
Copy link
Owner

How about:

**3.?.?**

- Array types are no longer partially unknown.

Perhaps something like "Type annotations for ARRAY column attributes now include the Python type of the array elements"

@agronholm
Copy link
Owner

Great, works for me.

@JoaquimEsteves
Copy link
Contributor Author

JoaquimEsteves commented Mar 12, 2025

Grand - I'll wait for @sheinbergon to approve, feel free to merge at your convenience.

@sheinbergon sheinbergon self-requested a review March 13, 2025 05:58
Copy link
Collaborator

@sheinbergon sheinbergon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

@sheinbergon sheinbergon merged commit b60293f into agronholm:master Mar 13, 2025
8 checks passed
@sheinbergon sheinbergon added this to the 3.1 milestone Mar 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Problem with Array generic type mapping

4 participants