Qlearn draft #112

8 changes: 8 additions & 0 deletions js/ai/qlearn/changelog.md
@@ -0,0 +1,8 @@
+# changelog
+
+## 6.0.0
+
+* Start a changelog
+* Change the learn calculation
+* Up to ecma2021
+* rename actionNames into previousActions inside learn
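
The changelog mentions a changed learn calculation and the `previousActions` rename, but the diff does not show the calculation itself. As a rough illustration only, here is a minimal sketch of a standard tabular Q-learning update using the argument names visible at the call sites in this pull request. Everything else is an assumption: the name `learnSketch`, the `learningRate` and `discountFactor` parameters and their defaults, the shape of `previousStateActions` and `stateActions` (plain objects mapping action names to numeric Q-values), and the use of `previousActions` to initialise missing values. It is not the library's actual `learn`.

```javascript
// Hypothetical sketch, NOT the implementation from this pull request.
// Assumes previousStateActions and stateActions are plain objects
// mapping action names to numeric Q-values.
const learnSketch = ({
    previousStateActions,
    stateActions,
    previousAction,
    previousActions,
    reward,
    learningRate = 0.5, // assumed default
    discountFactor = 0.9, // assumed default
}) => {
    // Assumption: previousActions lists the actions that were available
    // in the previous state; unseen actions start with a Q-value of 0.
    previousActions.forEach((name) => {
        previousStateActions[name] ??= 0;
    });

    // Best value reachable from the new state (0 if nothing is known yet).
    const nextValues = Object.values(stateActions);
    const bestNext = nextValues.length > 0 ? Math.max(...nextValues) : 0;

    // Standard Q-learning update: move the old estimate toward
    // reward + discountFactor * bestNext by a fraction learningRate.
    const oldValue = previousStateActions[previousAction] ?? 0;
    const newValue =
        oldValue + learningRate * (reward + discountFactor * bestNext - oldValue);
    previousStateActions[previousAction] = newValue;
    return newValue;
};
```

The sketch mutates `previousStateActions` in place and returns the updated value, which keeps the call site shaped like the ones in the examples; whether the real function mutates or returns a copy is not visible in this diff.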
3 changes: 2 additions & 1 deletion js/ai/qlearn/examples/dodgeShoot/dodgeShoot.js
@@ -30,6 +30,7 @@ const actions = {
 actor[0] = futureX;
 },
 };
+// do not change over time
 const actionNames = Object.keys(actions);

 const updateGame = (action, state, reward) => {
@@ -102,7 +103,7 @@ const start = (options) => {
 previousStateActions,
 stateActions,
 previousAction: actionName,
-actionNames,
+previousActions: actionNames,
 reward: scoreDifference,
 });
 }
38 changes: 19 additions & 19 deletions js/ai/qlearn/examples/dodgeShoot/results2.txt
@@ -1,31 +1,31 @@
 random positive reward 2000 frames score: 123
 qLearn(reduceStateAndActionSeeAll)(learn) positive reward 2000 frames score: 463
 qLearn(reduceStateAndActionSeeAll)(learnWithAverage) positive reward 2000 frames score: 465
-qLearn(reduceStateAndActionSeeAllDistance)(learn) positive reward 2000 frames score: 420
-qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) positive reward 2000 frames score: 214
+qLearn(reduceStateAndActionSeeAllDistance)(learn) positive reward 2000 frames score: 266
+qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) positive reward 2000 frames score: 240
 qLearn(reduceStateAndActionSeeNearestOnly)(learn) positive reward 2000 frames score: 233
-qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) positive reward 2000 frames score: 129
 random negative reward 2000 frames score: -120
-qLearn(reduceStateAndActionSeeAll)(learn) negative reward 2000 frames score: -18
-qLearn(reduceStateAndActionSeeAll)(learnWithAverage) negative reward 2000 frames score: -23
-qLearn(reduceStateAndActionSeeAllDistance)(learn) negative reward 2000 frames score: -83
-qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) negative reward 2000 frames score: -128
-qLearn(reduceStateAndActionSeeNearestOnly)(learn) negative reward 2000 frames score: -49
-qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) negative reward 2000 frames score: -60
+qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) positive reward 2000 frames score: 137
+qLearn(reduceStateAndActionSeeAll)(learn) negative reward 2000 frames score: -14
+qLearn(reduceStateAndActionSeeAll)(learnWithAverage) negative reward 2000 frames score: -18
+qLearn(reduceStateAndActionSeeAllDistance)(learn) negative reward 2000 frames score: -71
+qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) negative reward 2000 frames score: -109
+qLearn(reduceStateAndActionSeeNearestOnly)(learn) negative reward 2000 frames score: -19
+qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) negative reward 2000 frames score: -15
 random positive reward 20000 frames score: 1241
 qLearn(reduceStateAndActionSeeAll)(learn) positive reward 20000 frames score: 4963
 qLearn(reduceStateAndActionSeeAll)(learnWithAverage) positive reward 20000 frames score: 4965
-qLearn(reduceStateAndActionSeeAllDistance)(learn) positive reward 20000 frames score: 4920
-qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) positive reward 20000 frames score: 4168
+qLearn(reduceStateAndActionSeeAllDistance)(learn) positive reward 20000 frames score: 3266
+qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) positive reward 20000 frames score: 4740
 qLearn(reduceStateAndActionSeeNearestOnly)(learn) positive reward 20000 frames score: 2333
-qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) positive reward 20000 frames score: 1438
+qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) positive reward 20000 frames score: 2165
 random negative reward 20000 frames score: -1235
-qLearn(reduceStateAndActionSeeAll)(learn) negative reward 20000 frames score: -180
-qLearn(reduceStateAndActionSeeAll)(learnWithAverage) negative reward 20000 frames score: -255
-qLearn(reduceStateAndActionSeeAllDistance)(learn) negative reward 20000 frames score: -554
-qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) negative reward 20000 frames score: -983
-qLearn(reduceStateAndActionSeeNearestOnly)(learn) negative reward 20000 frames score: -549
-qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) negative reward 20000 frames score: -621
+qLearn(reduceStateAndActionSeeAll)(learn) negative reward 20000 frames score: -28
+qLearn(reduceStateAndActionSeeAll)(learnWithAverage) negative reward 20000 frames score: -28
+qLearn(reduceStateAndActionSeeAllDistance)(learn) negative reward 20000 frames score: -95
+qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) negative reward 20000 frames score: -135
+qLearn(reduceStateAndActionSeeNearestOnly)(learn) negative reward 20000 frames score: -19
+qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) negative reward 20000 frames score: -15
 random positive reward 200000 frames score: 12442
 qLearn(reduceStateAndActionSeeAll)(learn) positive reward 200000 frames score: 49963
 qLearn(reduceStateAndActionSeeAll)(learnWithAverage) positive reward 200000 frames score: 49965
@@ -39,4 +39,4 @@ qLearn(reduceStateAndActionSeeAll)(learnWithAverage) negative reward 200000 fram
 qLearn(reduceStateAndActionSeeAllDistance)(learn) negative reward 200000 frames score: -4823
 qLearn(reduceStateAndActionSeeAllDistance)(learnWithAverage) negative reward 200000 frames score: -9479
 qLearn(reduceStateAndActionSeeNearestOnly)(learn) negative reward 200000 frames score: -5707
-qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) negative reward 200000 frames score: -6299
+qLearn(reduceStateAndActionSeeNearestOnly)(learnWithAverage) negative reward 200000 frames score: -6299
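
To compare runs before and after the change, it can help to turn each result line into a record. The following is a minimal sketch that assumes every line follows the pattern seen in results2.txt (`<agent> <positive|negative> reward <frames> frames score: <score>`). The names `RESULT_LINE` and `parseResultLine` are hypothetical and not part of this pull request.

```javascript
// Hypothetical helper, not part of the pull request.
// Matches lines such as:
//   random positive reward 2000 frames score: 123
//   qLearn(reduceStateAndActionSeeAll)(learn) negative reward 2000 frames score: -14
const RESULT_LINE =
    /^(.+?) (positive|negative) reward (\d+) frames score: (-?\d+)$/;

const parseResultLine = (line) => {
    const match = RESULT_LINE.exec(line.trim());
    if (!match) {
        // Not a result line (for example a blank line).
        return undefined;
    }
    const [, agent, rewardType, frames, score] = match;
    return {
        agent,
        rewardType,
        frames: Number(frames),
        score: Number(score),
    };
};
```

Records parsed this way can then be grouped by `agent`, `rewardType`, and `frames` to line up the old and new scores side by side.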
@@ -93,7 +93,7 @@ const start = ({ display, MAX_FRAMES }) => {
 previousStateActions: stateActions,
 stateActions: stateActionsAfter,
 previousAction: actionName,
-actionNames,
+previousActions: actionNames,
 reward,
 });
 frame += 1;
@@ -105,4 +105,4 @@ const start = ({ display, MAX_FRAMES }) => {
 };

 step();
-};
+};