feat(data frame): Support polars #1474
We currently do not have special support in the browser... so we let the JSON to-string method handle it. But we should definitely add support at some point!
|
It seems like the JSON string method converts it to an int.
Opened an issue here: #1477
Thanks for the ping! Happy to help here; please do let me know if there's anything missing which you'd need or which would make your work easier. As a potential consumer, I'd be particularly interested in your thoughts on the
|
From pairing w/ Barret, we're able to serialize pandas datetime series to ISO 8601 strings. There's a note in the code about possible results from pandas

# investigating the different outputs from infer_dtype
import pandas as pd
from datetime import date, datetime

val_dt = pd.to_datetime(["2020-01-01"])
val_per = pd.Period('2012-1-1', freq='D')
col_per = pd.Series([val_per]) # period
col_date = [date.today()] # date
col_datetime = [datetime.now()] # datetime
col_datetime64 = pd.Series(pd.to_datetime(["2020-01-01"])) # datetime64
pd.api.types.infer_dtype(col_date)
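
To make the investigation above easier to rerun, here is a minimal sketch that calls `infer_dtype` on each sample column and then converts the datetime64 column to ISO 8601 strings. Using `Timestamp.isoformat()` for the conversion is an assumption for illustration; it is not necessarily how the PR serializes values.

```python
from datetime import date, datetime

import pandas as pd

# Sample columns, keyed by the label used in the comments above.
samples = {
    "period": pd.Series([pd.Period("2012-1-1", freq="D")]),
    "date": [date.today()],
    "datetime": [datetime.now()],
    "datetime64": pd.Series(pd.to_datetime(["2020-01-01"])),
}

# infer_dtype returns a short string describing the kind of values found.
inferred = {name: pd.api.types.infer_dtype(col) for name, col in samples.items()}
print(inferred)

# Assumption: one way to get ISO 8601 strings is Timestamp.isoformat().
iso_strings = samples["datetime64"].map(lambda ts: ts.isoformat()).tolist()
print(iso_strings)  # ['2020-01-01T00:00:00']
```

Each inferred label matches the comment next to the corresponding column in the snippet above.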
|
From pairing w/ @schloerke, we discovered that pandas.DataFrame.to_json has an interesting serialization strategy for custom objects.

import pandas as pd
from dataclasses import dataclass

class C:
    x: int

    def __init__(self, x: int):
        self.x = x

    def __str__(self):
        return f"I am C({self.x})"

@dataclass
class D:
    y: int

df = pd.DataFrame({"x": [C(1), D(2)]})
df.to_json()
#> {"x":{"0":{"x":1},"1":{"y":2}}}

Notice that it somehow serialized C(1) to {"x": 1}. This is because it seems to use

df.to_json(default_handler=str)
#> {"x":{"0":"I am C(1)","1":"D(y=2)"}}

Notice that the outputs are now the result of calling

pd.DataFrame({"x": [{"A": 1}, [8, 9]]}).to_json()
#> {"x":{"0":{"A":1},"1":[8,9]}}
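
Building on the `default_handler` observation above, the sketch below passes a custom handler instead of `str`. The `to_jsonable` helper is hypothetical (it is not part of the PR or of pandas); it expands dataclass instances with `dataclasses.asdict` and falls back to `str()` for everything else.

```python
import dataclasses
import json

import pandas as pd


class C:
    def __init__(self, x: int):
        self.x = x

    def __str__(self) -> str:
        return f"I am C({self.x})"


@dataclasses.dataclass
class D:
    y: int


def to_jsonable(obj: object) -> object:
    """Hypothetical handler: expand dataclass instances, else use str()."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return dataclasses.asdict(obj)
    return str(obj)


df = pd.DataFrame({"x": [C(1), D(2)]})

# Same call as in the comment above, parsed back into Python objects.
as_str = json.loads(df.to_json(default_handler=str))
print(as_str)  # {'x': {'0': 'I am C(1)', '1': 'D(y=2)'}}

# The custom handler decides per object how it should be represented.
custom = json.loads(df.to_json(default_handler=to_jsonable))
print(custom)
```

An explicit handler keeps the output independent of whatever attribute-walking fallback pandas applies when no handler is given.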
|
Alright, handing off to @schloerke. There are two outstanding pieces:
More on swapping in narwhals

Currently, For example, @schloerke mentioned needing a

from functools import singledispatch

import narwhals as nw
import pandas as pd
import polars as pl

# DataFrameLike is assumed to be defined elsewhere as the union of supported frame types
@singledispatch
def get_column_names(data: DataFrameLike) -> list[str]:
    raise TypeError()

@get_column_names.register
def _(data: pd.DataFrame) -> list[str]:
    # note that technically column names don't have to be strings in Pandas
    # so you might add validation, etc. here
    return list(data.columns)

@get_column_names.register
def _(data: pl.DataFrame) -> list[str]:
    return data.columns

@get_column_names.register
def _(data: nw.DataFrame) -> list[str]:
    return data.columns

Notice that once narwhals is fully wired up everywhere, we can just always wrap inputs to the functions with

The number 1 reason IMO for not going directly to narwhals is that the requirements on the shiny side need to be fleshed out. I think it'll help to flesh out support for pandas DataFrames, and add test cases against Polars and Pandas, to indicate when a refactor has succeeded (or is blocked).
* main:
  fix(tests): dynamically determine the path to the shiny app (posit-dev#1485)
  tests(deploys): use a stable version of html tools instead of main branch (posit-dev#1483)
  feat(data frame): Support basic cell styling (posit-dev#1475)
  fix: support static files on pyodide / py.cafe under a prefix (posit-dev#1486)
  feat: Dynamic theming (posit-dev#1358)
  Add return type for `_task()` (posit-dev#1484)
  tests(controls): Change API from controls to controller (posit-dev#1481)
  fix(docs): Update path to reflect correct one (posit-dev#1478)
  docs(testing): Add quarto page for testing (posit-dev#1461)
  fix(test): Remove unused testrail reporting from nightly builds (posit-dev#1476)
* main:
  test(controllers): Refactor column sort and filter methods for Dataframe class (posit-dev#1496)
  Follow up to posit-dev#1453: allow user roles when normalizing a dictionary (posit-dev#1495)
  fix(layout_columns): Fix coercion of scalar row height to list for python <= 3.9 (posit-dev#1494)
  Add `shiny.ui.Chat` (posit-dev#1453)
  docs(Theme): Fix example and clarify usage (posit-dev#1491)
  chore(pyright): Pin pyright version to `1.1.369` to avoid CI failures (posit-dev#1493)
  tests(dataframe): Add additional tests for dataframe (posit-dev#1487)
  bug(data frame): Export `render.StyleInfo` (posit-dev#1488)
|
From chatting with @schloerke, I glanced over the code just now and it LGTM. I think the I didn't know enough about a lot of the shiny bits to have an opinion on things outside
polars
* main:
  api(playwright): Code review of complete playwright API (posit-dev#1501)
  fix: Move `www/shared/py-shiny` to `www/py-shiny` (posit-dev#1499)
* main:
  feat(data frame): Support `polars` (#1474)
  api(playwright): Code review of complete playwright API (#1501)
  fix: Move `www/shared/py-shiny` to `www/py-shiny` (#1499)
  test(controllers): Refactor column sort and filter methods for Dataframe class (#1496)
  Follow up to #1453: allow user roles when normalizing a dictionary (#1495)
  fix(layout_columns): Fix coercion of scalar row height to list for python <= 3.9 (#1494)
  Add `shiny.ui.Chat` (#1453)
  docs(Theme): Fix example and clarify usage (#1491)
  chore(pyright): Pin pyright version to `1.1.369` to avoid CI failures (#1493)

This PR addresses #1439 by generalizing pandas.DataFrame-specific logic to include Polars. It adds a module for DataFrame-specific logic (_tbl_data.py) and simple tests for each piece.

From pairing with @schloerke, it seems like these next steps might be useful:
_tbl_data.py
_tbl_data.py logic (cc @MarcoGorelli)

Notes:
str back up to str, so certain htmltools tags can't be identified in a Polars Series.