Confusion about the function signature of python from_pandas #45105

Open
zhuwenxing opened this issue Dec 24, 2024 · 1 comment

@zhuwenxing

Describe the bug, including details regarding any error messages, version, and platform.


    @classmethod
    def from_pandas(cls, cls_1, df, Schema_schema=None, preserve_index=None, nthreads=None, columns=None, bool_safe=True): # real signature unknown; restored from __doc__
        """
        Table.from_pandas(cls, df, Schema schema=None, preserve_index=None, nthreads=None, columns=None, bool safe=True)
        
                Convert pandas.DataFrame to an Arrow Table.
        
                The column types in the resulting Arrow Table are inferred from the
                dtypes of the pandas.Series in the DataFrame. In the case of non-object
                Series, the NumPy dtype is translated to its Arrow equivalent. In the
                case of `object`, we need to guess the datatype by looking at the
                Python objects in this Series.
        
                Be aware that Series of the `object` dtype don't carry enough
                information to always lead to a meaningful Arrow type. In the case that
                we cannot infer a type, e.g. because the DataFrame is of length 0 or
                the Series only contains None/nan objects, the type is set to
                null. This behavior can be avoided by constructing an explicit schema
                and passing it to this function.
        
                Parameters
                ----------
                df : pandas.DataFrame
                schema : pyarrow.Schema, optional
                    The expected schema of the Arrow Table. This can be used to
                    indicate the type of columns if we cannot infer it automatically.
                    If passed, the output will have exactly this schema. Columns
                    specified in the schema that are not found in the DataFrame columns
                    or its index will raise an error. Additional columns or index
                    levels in the DataFrame which are not specified in the schema will
                    be ignored.
                preserve_index : bool, optional
                    Whether to store the index as an additional column in the resulting
                    ``Table``. The default of None will store the index as a column,
                    except for RangeIndex which is stored as metadata only. Use
                    ``preserve_index=True`` to force it to be stored as a column.
                nthreads : int, default None
                    If greater than 1, convert columns to Arrow in parallel using
                    indicated number of threads. By default, this follows
                    :func:`pyarrow.cpu_count` (may use up to system CPU count threads).
                columns : list, optional
                   List of column to be converted. If None, use all columns.
                safe : bool, default True
                   Check for overflows or other unsafe conversions.
        
                Returns
                -------
                Table
        
                Examples
                --------
                >>> import pyarrow as pa
                >>> import pandas as pd
                >>> df = pd.DataFrame({'n_legs': [2, 4, 5, 100],
                ...                    'animals': ["Flamingo", "Horse", "Brittle stars", "Centipede"]})
                >>> pa.Table.from_pandas(df)
                pyarrow.Table
                n_legs: int64
                animals: string
                ----
                n_legs: [[2,4,5,100]]
                animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
        """
        pass


Component(s)

Python

@zhuwenxing
Author

pyarrow version: 18.1.0
