Game Attendance Data Availability? #68

Closed
msussman opened this issue Mar 3, 2018 · 26 comments

Comments

@msussman

msussman commented Mar 3, 2018

I wanted to ask whether attendance data is available through the API.

@panzarino
Owner

@msussman I'll poke around and see what data MLB provides.

@ajbowler
Contributor

ajbowler commented Mar 4, 2018

I'm unable to find anything in the XML files around attendance, though the venue name is widely available.

@ajbowler
Contributor

ajbowler commented Mar 4, 2018

I'm digging around in their new Stats API (statsapi.mlb.com/docs), but the lack of documentation isn't getting me very far.

I did find attendance in their live game endpoint, buried in an info property. Here's a Dodgers @ Marlins game from 2016:

 "info": [
        {
          "label": "Game Scores",
          "value": "Kershaw 41; Fernandez 84."
        },
        {
          "label": "HBP",
          "value": "Suzuki, I (by Fields, J)."
        },
        {
          "label": "Pitches-strikes",
          "value": "Kershaw 66-46; Coleman 10-6; Norris, B 15-7; Howell 1-1; Fields, J 9-3; Avilan 20-12; Ravin 17-11; Liberatore 7-5; Fernandez 102-66; Barraclough 14-11; Rodney 16-11; Ramos 11-8."
        },
        {
          "label": "Groundouts-flyouts",
          "value": "Kershaw 3-1; Coleman 1-1; Norris, B 1-1; Howell 0-0; Fields, J 0-0; Avilan 3-0; Ravin 1-0; Liberatore 0-2; Fernandez 4-1; Barraclough 2-0; Rodney 0-0; Ramos 0-1."
        },
        {
          "label": "Batters faced",
          "value": "Kershaw 14; Coleman 3; Norris, B 4; Howell 1; Fields, J 2; Avilan 5; Ravin 3; Liberatore 3; Fernandez 27; Barraclough 4; Rodney 3; Ramos 3."
        },
        {
          "label": "Inherited runners-scored",
          "value": "Howell 2-1; Fields, J 2-1; Avilan 3-0; Ramos 2-0."
        },
        {
          "label": "Umpires",
          "value": "HP: Brian Knight. 1B: Tony Randazzo. 2B: Bill Miller. 3B: Tom Woodring."
        },
        {
          "label": "Weather",
          "value": "77 degrees, roof closed."
        },
        {
          "label": "Wind",
          "value": "0 mph, None."
        },
        {
          "label": "First pitch",
          "value": "7:11 PM."
        },
        {
          "label": "T",
          "value": "3:08."
        },
        {
          "label": "Att",
          "value": "22,940."
        },
        {
          "label": "Venue",
          "value": "Marlins Park"
        },
        {
          "label": "September 9, 2016"
        }
      ],

It might be available through some smaller endpoints, since this one is ENORMOUS, but finding out would require more research. As it is, I'm thinking this would be a good feature for mlbgame v3 (Stats API usage), since this info doesn't appear to be available in their XML files.
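For anyone who wants to pull attendance out of that info list, here is a minimal sketch. It assumes the list has already been decoded from JSON into Python dicts; the helper names `info_value` and `parse_attendance` are made up for illustration and are not part of mlbgame or the Stats API.

```python
# Trimmed copy of the "info" list quoted above.
info = [
    {"label": "T", "value": "3:08."},
    {"label": "Att", "value": "22,940."},
    {"label": "Venue", "value": "Marlins Park"},
    {"label": "September 9, 2016"},  # some entries have no "value" key
]


def info_value(entries, label):
    """Return the value of the first entry with this label, or None."""
    for entry in entries:
        if entry.get("label") == label:
            return entry.get("value")
    return None


def parse_attendance(text):
    """Turn a string such as '22,940.' into an int; None if it has no digits."""
    if text is None:
        return None
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


attendance = parse_attendance(info_value(info, "Att"))
print(attendance)
```

Looking entries up by label rather than by position seems safer here, since nothing in the sample suggests the order of the list is guaranteed.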

@trevor-viljoen
Contributor

trevor-viljoen commented Mar 4, 2018

It's available through rawboxscore.xml. Here's a random example: http://gd2.mlb.com/components/game/mlb/year_2017/month_04/day_16/gid_2017_04_16_milmlb_cinmlb_1/rawboxscore.xml

<boxscore wind="16 mph, R to L" game_type="R" venue_name="Great American Ball Park" attendance="12,625" home_sport_code="mlb" official_scorer="Mike Cameron" game_pk="490277" date="April 16, 2017" status_ind="F" home_league_id="104" elapsed_time="2:56" game_id="2017/04/16/milmlb-cinmlb-1" venue_id="2602" start_time="1:10 PM" weather="73 degrees, overcast" gameday_sw="P">
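A rough sketch of reading those top-level attributes with the standard library. The URL pattern in `rawboxscore_url` is inferred from the single example link above and may not hold for every game; both function names are hypothetical, and the sample element is a trimmed, self-closed copy of the tag quoted above so that the snippet runs without network access.

```python
import xml.etree.ElementTree as ET


def rawboxscore_url(game_id):
    """Build a URL shaped like the example above (pattern is an assumption)."""
    # game_id is expected to look like "2017_04_16_milmlb_cinmlb_1".
    year, month, day = game_id.split("_")[:3]
    return (
        "http://gd2.mlb.com/components/game/mlb/"
        f"year_{year}/month_{month}/day_{day}/gid_{game_id}/rawboxscore.xml"
    )


# Trimmed copy of the <boxscore> tag quoted above, closed so it parses alone.
SAMPLE = (
    '<boxscore wind="16 mph, R to L" venue_name="Great American Ball Park" '
    'attendance="12,625" game_pk="490277" elapsed_time="2:56" '
    'start_time="1:10 PM" weather="73 degrees, overcast"/>'
)


def boxscore_attributes(xml_text):
    """Return the root element's attributes as a plain dict of strings."""
    return dict(ET.fromstring(xml_text).attrib)


attrs = boxscore_attributes(SAMPLE)
print(attrs["attendance"], attrs["venue_name"])
```

Note that every attribute comes back as a string, including the comma in the attendance figure, so any numeric conversion has to happen separately.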

@panzarino
Owner

@trevor-viljoen If you have time, could you go ahead and add that in a PR? I don't really have much time, but I could get to it if you can't.

@trevor-viljoen
Contributor

@panzarino I'll try to find some time to do it this week. I'll also take a look at rawboxscore vs boxscore and see how different they are from each other. The fix might be as simple as using rawboxscore instead of boxscore.

@Pertempto
Contributor

Is anyone working on this? I've done some work with MLB's rawboxscore.xml and boxscore.xml in another project. I'd be interested in doing this.

@ajbowler
Contributor

Have at it, I could use another release soon.

@Pertempto
Contributor

Which class should the attendance attribute go in? Each of the classes is associated with a corresponding XML file in the MLB API: Overview goes with linescore.xml, GameBoxScore goes with boxscore.xml, and GameScoreboard goes with scoreboard.xml. I think attendance data would fit best in the Overview class. Where were you expecting it to go?

@panzarino
Owner

@Pertempto I think that it would fit well with the other stats provided by the Overview class.

@Pertempto
Contributor

Great! I'm working on it now, and I'll probably have a pull request in the next few hours.

@Pertempto
Contributor

Pertempto commented Mar 16, 2018

I've implemented the attendance feature, but I was wondering whether I should add all the top-level attributes from rawboxscore.xml to the game overviews. Here is an example. This would add useful data like the weather, wind, elapsed time, and exact start time. The only problem I found is that rawboxscore.xml includes a venue_name attribute, while the Overview class already has a venue attribute. Is it bad to have two attributes with the same value? Maybe I should add all the attributes from rawboxscore.xml and remove the venue_name attribute.

@panzarino
Owner

@Pertempto It would be great if you could add that. It is fine to have duplicate attributes, just report everything.
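To illustrate the "keep the duplicates" approach, here is a small sketch that merges the rawboxscore.xml attributes into an existing set of overview fields. It uses plain dicts as a stand-in, since the internals of the Overview class are not shown in this thread; `merge_rawboxscore` is a hypothetical helper, and the rule that existing overview keys win on a name clash is an assumption, not something decided above.

```python
def merge_rawboxscore(overview, raw_attrs):
    """Return a new dict with raw_attrs added; existing overview keys win."""
    merged = dict(raw_attrs)   # start from the rawboxscore.xml attributes
    merged.update(overview)    # assumption: never overwrite existing fields
    return merged


# Stand-in for fields the overview already has (illustrative values only).
overview = {"venue": "Great American Ball Park"}

# A few of the top-level rawboxscore.xml attributes quoted earlier.
raw_attrs = {
    "venue_name": "Great American Ball Park",
    "attendance": "12,625",
    "weather": "73 degrees, overcast",
    "elapsed_time": "2:56",
    "start_time": "1:10 PM",
}

merged = merge_rawboxscore(overview, raw_attrs)
print(sorted(merged))
```

Both `venue` and `venue_name` survive the merge, which matches the suggestion to report everything rather than drop one of them.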

@Pertempto
Contributor

@panzarino Where am I supposed to "report everything"? Is it as simple as adding the new attributes to the Overview docstring or do I need to document these new attributes somewhere else as well?

@Pertempto
Contributor

I've created a pull request with all the new attributes. #71

@panzarino
Owner

Merged

@msussman
Author

msussman commented Mar 25, 2018 via email

@ajbowler
Contributor

I could definitely use a 2.5. The only missing piece of the LED board before the regular season starts is the probable starters, which is just waiting for a new tag.

@panzarino
Owner

I'll work on updating a few things and hopefully pushing out a new release in the coming days.

@msussman
Author

msussman commented Mar 25, 2018 via email

@panzarino
Owner

@msussman Sorry to disappoint, but I just looked at my schedule and I have almost no time this week so I'll have to push it back to sometime next week.

@msussman
Author

msussman commented Mar 26, 2018 via email

@msussman
Author

@panzarino, I just updated to the new release and am finding some issues with the attendance implementation.

  1. Mixed data types: when attendance data is missing due to a doubleheader, the attendance attribute is set to 0 (e.g. game id 2011/07/02/pitmlb-wasmlb-1, 'attendance': 0), but when it's populated it's a string (e.g. game id 2011/07/02/pitmlb-wasmlb-2, 'attendance': '39,636').

  2. Game Ids not found with Game_Overview method: I had pulled all games for the Nationals 2010-2018 previously without issue, but when I tried with the new release I'm getting quite a few games where the Game_Overview method returns this error "ValueError: Could not find a game with that id." Example - 2010_04_03_bosmlb_wasmlb_1
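Until the first issue is fixed in the library, a caller-side workaround could look like the sketch below. `normalize_attendance` is a hypothetical helper, not part of mlbgame, and treating 0 as "not reported" is an assumption based on the doubleheader example above.

```python
def normalize_attendance(value):
    """Return attendance as an int, or None when it is missing."""
    if value is None:
        return None
    if isinstance(value, int):
        # Assumption: 0 means "not reported", as in the doubleheader case.
        return value or None
    # Strings such as '39,636': keep only the digits.
    digits = "".join(ch for ch in str(value) if ch.isdigit())
    if not digits:
        return None
    return int(digits) or None


print(normalize_attendance(0))          # missing value from game 1
print(normalize_attendance("39,636"))   # populated value from game 2
```

Returning None rather than 0 for a missing figure keeps it from being mistaken for a real count when averaging attendance across games.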

@panzarino
Owner

@Pertempto could you look into this?

@Pertempto
Contributor

@panzarino I just updated my local repo; hopefully I'll be able to look at it next week.

@Pertempto
Contributor

Sorry, but I was busy last week and wasn't able to work on this. Someone else might want to do this because I can't promise that I'll have the time.
