```python
def get_data_frame(self):
    if not PANDAS:
        raise Exception('Import Missing - Failed to import DataFrame from pandas.')
    return DataFrame(self.data['data'], columns=self.data['headers'])
```
For example, the `LeagueDashPlayerShotLocations` endpoint returns a hierarchical column structure, as you can see in the screenshot pulled from the Slack discussion.
When we look at `headers`, we see that in this case, it's actually a list of dicts, each with a `columnNames` key which points to a list of column names.
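The two header shapes can be told apart with a simple check. This is a minimal sketch, assuming flat headers are a list of strings and nested headers are a list of dicts with a `columnNames` key, as described above. The helper name `has_nested_headers` and the sample values are hypothetical and not part of nba_api.

```python
def has_nested_headers(headers):
    """Return True when every entry in `headers` is a dict with a 'columnNames' key.

    Hypothetical helper for illustration; not an nba_api function.
    """
    return bool(headers) and all(
        isinstance(h, dict) and "columnNames" in h for h in headers
    )


# Made-up sample values, for illustration only.
flat_headers = ["PLAYER_ID", "PLAYER_NAME", "FGM"]
nested_headers = [
    {"columnNames": ["Zone A"]},
    {"columnNames": ["PLAYER_ID", "FGM", "FGA"]},
]
```

A check like this would let the DataFrame construction branch on the header shape instead of always passing `headers` straight through as `columns`.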
I don't have a great recommendation for how to fix this in a stable, extensible way. I was able to put together a hacky one-off fix by manually creating a `MultiIndex` object and passing that in as the `columns` argument when creating the DataFrame, but that solution depended on manually mapping the hierarchy between columns, and won't work if there are different numbers/sets of columns.
```python
import pandas as pd
from nba_api.stats import endpoints

response = endpoints.LeagueDashPlayerShotLocations(distance_range='5ft Range', per_mode_detailed='PerGame')

# There are 3 unique columns for each floor zone:
# FGM, FGA, FG_PCT
lists = [[c] * 3 for c in response.get_dict()['resultSets']['headers'][0]['columnNames']]

# We don't want to nest the first 5 columns under a "floor zone" node, so manually add 5 other values for a custom index.
all_columns = ['player_data'] * 5
for l in lists:
    all_columns += l

# Note: could do the column logic above with this shorter, less readable code:
# all_columns = ['player_data']*5 + [j for i in lists for j in i]

# True = pass
len(response.get_dict()['resultSets']['headers'][1]['columnNames']) == len(all_columns)

# From the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html
# The MultiIndex object is the hierarchical analogue of the standard Index object which typically stores the axis labels in pandas objects.
# You can think of MultiIndex as an array of tuples where each tuple is unique.
col = pd.MultiIndex.from_arrays([all_columns,
                                 response.get_dict()['resultSets']['headers'][1]['columnNames']])

player_shooting_df = pd.DataFrame(data=response.get_dict()['resultSets']['rowSet'],
                                  columns=col)
```
The structure of the response data makes it a little tricky to fix, since the first 5 columns in the `columns` object (`'PLAYER_ID', 'PLAYER_NAME', 'TEAM_ID', 'TEAM_ABBREVIATION', 'AGE'`) don't actually belong to any of the column hierarchies in the `SHOT_CATEGORY` list.
I'm happy to help out; I just don't have a great sense of how we should fix this.
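One possible direction, as a rough sketch rather than a proposed patch: derive the top-level labels from the nested headers instead of hard-coding `*3` and `*5`. It assumes `headers[0]['columnNames']` holds the group labels and `headers[1]['columnNames']` holds the leaf column names, as in the snippet above. The function name `build_column_tuples`, its `n_ungrouped` and `filler` parameters, and the sample payload are all hypothetical.

```python
def build_column_tuples(headers, n_ungrouped, filler="player_data"):
    """Pair each leaf column with a top-level label.

    Hypothetical helper, not part of nba_api. Assumes headers[0] holds
    the group labels and headers[1] holds the leaf column names, and
    that every group spans the same number of leaf columns.
    """
    groups = headers[0]["columnNames"]
    leaves = headers[1]["columnNames"]
    n_grouped = len(leaves) - n_ungrouped
    if not groups or n_grouped < 0 or n_grouped % len(groups) != 0:
        raise ValueError("leaf columns do not divide evenly among groups")
    span = n_grouped // len(groups)  # leaf columns per group

    # Leading columns that belong to no group get a filler label.
    top = [filler] * n_ungrouped
    for group in groups:
        top.extend([group] * span)
    return list(zip(top, leaves))


# Made-up miniature payload, for illustration only.
headers = [
    {"columnNames": ["Zone A", "Zone B"]},
    {"columnNames": ["PLAYER_ID", "PLAYER_NAME", "FGM", "FGA", "FGM", "FGA"]},
]
column_tuples = build_column_tuples(headers, n_ungrouped=2)
```

The resulting tuples could be passed to `pd.MultiIndex.from_tuples` and used as the `columns` argument. The caller still has to know how many leading columns are ungrouped, so this only removes the hard-coded span, not the underlying ambiguity.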
This came up in the Slack channel, and I figured it'd be worth reporting here. See the discussion here: https://nbaapi.slack.com/archives/C012E7UH022/p1617120838011400
Issue:
When the API response data includes a nested axis index, `Endpoint.get_data_frames()` throws an error.
Details:
The root of the issue comes from how the `get_data_frames` method constructs a DataFrame from response data. It assumes that `self.data['headers']` is a 1d list of column names, but this isn't always the case.