Comments #1

Open
SarvagyaVaish opened this issue Feb 15, 2014 · 49 comments

@SarvagyaVaish
Owner

Leave your comments here...

@xissy

xissy commented Feb 15, 2014

wow, this is amazing. inspired by your practical ML approach.

@iandanforth

You should get a tapsterbot! https://github.com/hugs/tapsterbot

@Aaron1011

This is incredible!

@joeyslater

That's what's up man.

@dend

dend commented Feb 16, 2014

Awesome job, Survy!

@halfdan

halfdan commented Feb 16, 2014

Nice job - please add a proper reference to the source of the pseudo code though. It's clearly taken out of a publication.

@Giszmo

Giszmo commented Feb 16, 2014

I never did image analysis but I assume it to be trivial to do with a camera for your android bot. You said the image (screenshot) takes 2s to get to the PC? A cam should be much much faster. The image analysis would basically just need to scan the right side of the screen for green-notgreen-green. The timing is constant.
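
A minimal sketch of the green / not-green / green scan described above, in Python. It assumes one vertical column of RGB pixels has already been captured from the camera or screen; `is_pipe_green` and its thresholds are hypothetical placeholders, not values measured from the game.

```python
from typing import Optional, Sequence, Tuple

RGB = Tuple[int, int, int]


def is_pipe_green(pixel: RGB) -> bool:
    """Hypothetical colour test: the green channel clearly dominates red and blue."""
    r, g, b = pixel
    return g > 120 and g > r + 40 and g > b + 40


def find_gap(column: Sequence[RGB]) -> Optional[Tuple[int, int]]:
    """Return (first_row, last_row) of the first not-green run that has
    green both above and below it, or None if no such run exists."""
    start = None          # first row of the candidate gap
    seen_green = False    # have we passed the upper pipe yet?
    for row, pixel in enumerate(column):
        if is_pipe_green(pixel):
            if start is not None:
                return (start, row - 1)   # lower pipe reached: gap is complete
            seen_green = True
        elif seen_green and start is None:
            start = row                   # first not-green row after the upper pipe
    return None
```

The midpoint of the returned rows would be the target height for the bird; a real capture would probably need some tolerance for noise and lighting.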

@ztl2004

ztl2004 commented Feb 16, 2014

Dude, this is fantastic, and it's what I've been thinking about for a long time. I noticed that you want to do this on mobile. I've studied the iOS private APIs and I've done screen capture and touch simulation. Do you think there is a possibility that we could work it out together?

@bolte-17

Any thought to adding the bird's current velocity (or, as a proxy, the time since the last tap) to the state space? That seems to be the only missing parameter.

@ztl2004

ztl2004 commented Feb 16, 2014

but I think it's hard to get


@cbbayburt

Actually, simulating the game's dynamics might lead to a simpler and more precise solution. The game doesn't really involve sophisticated decision steps that require ML. Since it is really a pure physics problem, a simpler solution is possible, though it depends on some simple observations:

  • Through observation, I found that keeping the bird level requires a tapping period of 600ms.
  • Let's say the bird's jump height is hb. So in the original game, every tap makes the bird go up hb units, while every 600ms its height goes down hb units.
  • Ascending and descending are achieved by simply modifying the tapping period (shorter to ascend, longer to descend).
  • The actual ascent per tap can be calculated as: dh = hb - (hb*ptap/600). From here, the tapping period required to achieve a specific ascent/descent 'dh' can be calculated as: ptap = 600 - (600 * dh / hb)


So the algorithm would be:

hb: The bird's jump height for a single tap, in other words, the amplitude of the bird's harmonic motion in level flight (it is constant and can be measured in pixels).
hBird: Height of the middle point of the bird's harmonic motion.
hObstacle: Height of the middle point of the space between the pipes.
ptap: Waiting period before the next tap.
dh: The height difference between the bird and the obstacle path.

for each immediate uncleared obstacle:
  while(obstacle_not_cleared)
    dh <- hObstacle - hBird
    ptap <- 600 - (600 * dh / hb)
    if ptap < 0 then ptap <- 0  //Gonna fall, tap immediately
    sleep(ptap)
    tap()
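
The loop above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested bot: `read_heights`, `obstacle_cleared` and `tap` are hypothetical callbacks that would have to be supplied by whatever screen-capture and input mechanism is used, and heights are taken to increase upward, as the formula implies (with screen coordinates that grow downward the sign of dh would flip).

```python
import time
from typing import Callable, Tuple


def tap_period_ms(h_obstacle: float, h_bird: float, hb: float,
                  level_period_ms: float = 600.0) -> float:
    """ptap = 600 - (600 * dh / hb), clamped so it never goes below zero."""
    dh = h_obstacle - h_bird
    ptap = level_period_ms - level_period_ms * dh / hb
    return max(0.0, ptap)


def clear_obstacle(read_heights: Callable[[], Tuple[float, float]],
                   obstacle_cleared: Callable[[], bool],
                   tap: Callable[[], None],
                   hb: float,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Tap with an adjusted period until the current obstacle is cleared.

    read_heights returns (h_obstacle, h_bird). Returns the number of taps sent.
    """
    taps = 0
    while not obstacle_cleared():
        h_obstacle, h_bird = read_heights()
        sleep(tap_period_ms(h_obstacle, h_bird, hb) / 1000.0)  # ms -> seconds
        tap()
        taps += 1
    return taps
```

With hb = 50, a bird level with the gap waits the full 600ms, a bird 25 units too low waits 300ms, and a bird 25 units too high waits 900ms.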

This algorithm can make the flappy lips fly forever. For Android, instead of requesting .png screenshots, which really take about 1-2 seconds, you can analyze specific pixels in the raw frame buffer (some unix device file like /dev/graphics/fb0), which gives you enough speed to run the algorithm. But for that, you obviously need a rooted device.
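
Reading a single pixel from a raw frame buffer dump could look roughly like this in Python. It is only a sketch: it assumes rows are stored contiguously with no padding and that each pixel takes `bytes_per_pixel` bytes. The real stride and pixel format are device-specific and would have to be checked on the target device.

```python
def read_pixel(fb_path: str, x: int, y: int, width: int,
               bytes_per_pixel: int = 4) -> bytes:
    """Return the raw bytes of the pixel at column x, row y.

    Assumes an unpadded row-major layout: offset = (y * width + x) * bytes_per_pixel.
    """
    offset = (y * width + x) * bytes_per_pixel
    with open(fb_path, "rb") as fb:
        fb.seek(offset)               # jump straight to the pixel, no full-frame read
        return fb.read(bytes_per_pixel)
```

Because only a few bytes are read per sample, checking a handful of pixels this way avoids encoding and transferring a whole screenshot.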

@SarvagyaVaish
Owner Author

Analyzing the "specific pixels in the raw frame buffer" is worth a shot! Thanks!
And I agree with the solution being nicer if I simulated the game dynamics, but I wanted to approach the problem using machine learning. Thanks for the solution though.

@savraj

savraj commented Feb 16, 2014

I'd love a deeper walkthrough of this -- maybe a youtube video.

@metaylor

This is very cool. Good idea to pick a popular game and show that ML can solve it! I'm going to bring up your project as a discussion topic in the graduate reinforcement learning class I'm currently teaching.

http://www.eecs.wsu.edu/~taylorm/14_580/index.html

@SarvagyaVaish
Owner Author

That is awesome!! I am honored. Thanks :)
May I ask how you found the link?

@metaylor

My brother pointed me to it. I'm not sure how he found out about it though.

Best,
Matt


Matt Taylor
http://eecs.wsu.edu/~taylorm


@ztl2004

ztl2004 commented Feb 17, 2014

maybe reddit


@billhao

billhao commented Feb 17, 2014

this is very cool!

@ataugeron

"Get this to work on a mobile phone!! If anyone has any ideas, please let me know in the comments :)"

Did you try using monkeyrunner (Python, Android) or UIAutomation (JavaScript, iOS)?

@SarvagyaVaish
Owner Author

monkeyrunner took about 1-2 seconds to get a screenshot.. so not responsive enough.
Haven't tried UIAutomation, but do you know if the response time is any better?

@cxt120

cxt120 commented Feb 17, 2014

Does the training only work on a specific map?

@SarvagyaVaish
Owner Author

There is no "map". There is randomness as far as the pipe height is concerned, but the game is basically just one never ending randomized "map" of pipes coming towards you.

@cooperjay

i just another working method over here : http://flappybirdhack.hol.es/

@thebino

thebino commented Mar 3, 2014

How do you want to grab /dev/graphics/fb0 and use the resulting image for the calculation? Do you want to write something like a TestCase with event injection on the WindowManager?

@Eniac-Xie

Is Q[s,a] just a large array? Or a function, like a BP neural network?

@SarvagyaVaish
Owner Author

Yeah. Q is a multi-dimensional array representing the entire state space.
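
As a rough illustration of what a table-based Q looks like, here is a minimal Python sketch. It is not the project's actual code: the two-index state (discretized horizontal and vertical distance to the pipe), the action encoding, and the `alpha` and `gamma` defaults are all assumptions made for the example.

```python
from typing import List, Tuple

ACTIONS = 2  # assumed encoding: 0 = do nothing, 1 = tap

State = Tuple[int, int]  # (x_index, y_index) after discretization


def make_q(nx: int, ny: int) -> List[List[List[float]]]:
    """Q[x][y][a]: one value per (state, action) pair, initialised to zero."""
    return [[[0.0] * ACTIONS for _ in range(ny)] for _ in range(nx)]


def q_update(q: List[List[List[float]]], s: State, a: int, reward: float,
             s_next: State, alpha: float = 0.7, gamma: float = 1.0) -> None:
    """Standard Q-learning step:
    Q[s,a] <- Q[s,a] + alpha * (reward + gamma * max_a' Q[s',a'] - Q[s,a])
    """
    x, y = s
    x2, y2 = s_next
    target = reward + gamma * max(q[x2][y2])
    q[x][y][a] += alpha * (target - q[x][y][a])


def best_action(q: List[List[List[float]]], s: State) -> int:
    """Greedy action for state s (ties resolve to the lower action index)."""
    x, y = s
    return max(range(ACTIONS), key=lambda a: q[x][y][a])
```

Since the vertical distance can be negative, it would need an offset (or a dictionary keyed by state) before being used as a list index.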

@Eniac-Xie

I'm a little curious. I think the bird's speed should also be considered. I mean that birds at the same position but with different speeds will lead to different results, won't they?

@SarvagyaVaish
Owner Author

Based on the game dynamics, the bird always gets the same upward velocity irrespective of its velocity at the time of input. So weirdly enough, two birds at the same position with different speeds will end up at the same position when the user tells them to jump.
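
A toy Python model of that behaviour, to make the point concrete. The constants and the update order are hypothetical, not taken from the game's source; the only property that matters is that a jump overwrites the velocity instead of adding to it.

```python
from typing import Tuple

JUMP_VELOCITY = -9.0  # hypothetical value; negative means upward (y grows downward)
GRAVITY = 1.0         # hypothetical per-frame acceleration


def step(y: float, vy: float, jump: bool) -> Tuple[float, float]:
    """Advance one frame and return the new (y, vy)."""
    if jump:
        vy = JUMP_VELOCITY   # overwrite: the previous velocity is discarded
    else:
        vy += GRAVITY        # otherwise gravity accumulates
    return y + vy, vy
```

Under this model, `step(100, 5, True)` and `step(100, -3, True)` give the same result, while the same two birds diverge if neither jumps. That is why position alone is enough to decide the outcome of a jump, though velocity still matters for the "do nothing" action.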

@Eniac-Xie

thank you!


@Eniac-Xie

I tried it myself but found that Q does not converge in a short time. Maybe my Q is too large (160*401*2, although that doesn't seem large). How large is your Q?

@SarvagyaVaish
Owner Author

It takes about 6-8 hours at regular game speed for flappy to learn a good model.

@andreydung

How do you run the code? Is it simply running index.html?

@SarvagyaVaish
Owner Author

Yeah. Just start up a local server (wamp, xampp) and open the index.html

@junzhez

junzhez commented Jun 5, 2014

Just a quick question. Can the vertical distance to pipe bottom be negative?

@SarvagyaVaish
Owner Author

Yes, if the bird is below the pipe :)

@junzhez

junzhez commented Jun 7, 2014

Thanks for your reply. May I ask about the dimensions of your state space? I am trying to reproduce your work with another copy of Flappy Bird. It seems that my state space is way too large.

@SarvagyaVaish
Owner Author

I don't remember exactly, but it was huge! It takes a while to train. Check out http://sarvagyavaish.github.io/FlappyBirdRL/ for more details.

@tropicdome

Nice work, I love the application of RL to something fun like this, kudos :)

I have tried out your implementation with different resolutions, since this can greatly decrease the number of states. Using a resolution of 10 instead of 4 lowered the state space from 12150 states to 1944. Here is my data after running this:

  • 14 points after 7 min
  • 17 points after 9 min
  • 48 points after 10 min
  • 62 points after 14 min
  • 145 points after 19 min
  • 496 points after 25 min
  • 1000+ after 1h 10min
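
The two state counts quoted above are consistent with two discretized dimensions, each divided by the resolution: 12150 × 4² = 1944 × 10² = 194400. A small Python sketch of that scaling follows. Extrapolating to other resolutions is an assumption (it ignores any rounding at the edges of the grid), and `states_at` is a name made up for this example.

```python
def states_at(resolution: int, known_states: int = 12150,
              known_resolution: int = 4) -> float:
    """Scale a known state count to another resolution, assuming the count
    is inversely proportional to resolution squared (two discretized axes)."""
    return known_states * known_resolution ** 2 / resolution ** 2
```

Under that assumption, `states_at(10)` reproduces the 1944 figure, and resolutions of 20 and 30 would give 486 and 216 states respectively.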

One question, does it or should it take the distance to the ground into account? When you get a pipe that is really close to the ground, the bird would sometimes like to go below and then jump, which it obviously can't, but it is not learning from this?

@SarvagyaVaish
Owner Author

Thanks for crunching the numbers! It's cool to see that the state space affects the learning times so drastically.
About the distance from the ground, it's true that the model doesn't learn that it should jump when close to the ground. I didn't want to add another dimension to my state space, and that's primarily why I don't take that into account. But for better results, you could probably add a general (non-learned) rule that says that the bird must jump when close to the ground. Another idea would be to add that third dimension of distance to the ground but only have two states in it: less than xx units from the ground, and more than xx units from the ground. That way you would only be doubling the state space, but the system could learn the rule anyway :)
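
Both ideas can be sketched in a few lines of Python. Everything here is illustrative: `ground_threshold` stands in for the unspecified "xx units", the default of 30 is a made-up value, and the action encoding (1 = tap) is an assumption.

```python
from typing import Tuple


def discretize(dx: float, dy: float, ground_dist: float,
               resolution: int = 4,
               ground_threshold: float = 30.0) -> Tuple[int, int, int]:
    """Map raw distances to a state with a two-valued third dimension.

    The near-ground flag only doubles the state space instead of adding a
    full extra axis.
    """
    near_ground = 1 if ground_dist < ground_threshold else 0
    return (int(dx // resolution), int(dy // resolution), near_ground)


def choose_action(learned_action: int, ground_dist: float,
                  ground_threshold: float = 30.0) -> int:
    """Alternative: a fixed, non-learned rule that always taps (1) near the ground."""
    return 1 if ground_dist < ground_threshold else learned_action
```

The first function lets the learner discover the rule itself; the second hard-codes it and leaves the learned table untouched.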

@SteveRik

SteveRik commented Jan 5, 2015

Very nice blog. Thanks for sharing! Is there any possibility that the vertical distance to pipe bottom be negative? Please advise. Thanks! https://intellipaat.com/

@SarvagyaVaish
Owner Author

Yes. It is possible and the model accounts for that :)


@xoancosmed

Is it open source?

@SarvagyaVaish
Owner Author

Yes.


@AIForex

AIForex commented Nov 29, 2015

I'm working on a similar program that would involve the Forex market.

Actions per bar would be as follows: 1) buy open, exit close; 2) sell open, exit close; 3) do nothing. Soon to be published on www.marketcheck.co.uk

Peter
[email protected]

@Aytros

Aytros commented Jun 11, 2016

This is great! I recently graduated with a degree in Comp. Sci. In my last semester I took Intro to AI, and our final project was to implement this on our own; we were provided with a working Python Flappy Bird. My agent was not very efficient, but it did learn a little, so I did well. Now that I have graduated, I would like to improve my agent for my own sake. Would you be able to look over my algorithm and give some feedback on how I might be able to improve it?

@paulocastroo

paulocastroo commented Apr 6, 2018

6-7 hours is not good at all. I trained a Flappy Bird bot in 3 minutes with a random forest. I'll see if I can find some room for improvement in your Q-learning code.

@SarvagyaVaish
Owner Author

@paulocastroo that's because I was running Flappy Bird in real time using the game engine. If you could speed up the simulation, training would end up being significantly faster.
Curious to learn how you used random forest to train. Let me know :) Thanks!

@tropicdome

tropicdome commented Apr 6, 2018

For classic Q-learning, @SarvagyaVaish's implementation is already quite good. It doesn't have to take 6-7 hours. Besides the real-time aspect that @SarvagyaVaish mentioned, you could/should optimize your state space representation. For example, change the resolution to 20 to reduce the state space significantly (which is reasonable) and it will train in <15 minutes running in real time, or even to 30, with which it trained for me in 2.5 min.

@paulocastroo

@SarvagyaVaish oh sorry, I was not paying attention to the real-time part. I made some changes to the states: I tried to compress them as much as possible, and ended up with just the difference/distance between the height of the bird and the pipe hole, which makes the overall matrix much smaller. Here's a demo: https://planktonfun.github.io/q-learning-js/step-6.html
