Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternate fix for #120 bad keyreport issue #148

Merged
merged 8 commits into from
Sep 15, 2018
Merged

Alternate fix for #120 bad keyreport issue #148

merged 8 commits into from
Sep 15, 2018

Conversation

blahlicus
Copy link
Contributor

@blahlicus blahlicus commented Sep 10, 2018

Hi, I want to get #146 moving as soon as possible so I I'd like to see if this version of the fix would be more acceptable to you. The code looks dumbed down because it has several for loops instead of a single loop, but this seems to be the best way IMO to properly separate the tasks involved.

You cited readability concerns over #121 so my solution has no pointers and should be somewhat faster than Steve's other solution even though I can't prove it.

Please allow me to part my post into 3 sections

  • Code structure
  • Speed test (Steve pointer code, original code, my code)
  • Validity test

Code Structure

The original issue is caused by the following bit of code

		for (uint8_t i = 0; i < sizeof(_keyReport.keycodes); i++)
		{
			auto key = _keyReport.keycodes[i];
			
			// Is key already in the list or did we found an empty slot?
			if (s && (key == uint8_t(k) || key == KEY_RESERVED)) {
				_keyReport.keycodes[i] = k;
				return 1; <============== OVER HERE
			}
			
			// Test the key report to see if k is present. Clear it if it exists.
			if (!s && (key == k)) {
				_keyReport.keycodes[i] = KEY_RESERVED;
				return 1;
			}
}

The original code creates a duplicate of k into keycodes if it iterates to an empty slot before reaching a slot with k because it exits once it finds a blank slot or a slot with k

My code iterates through keycodes looking for an existing k first, if k is found, then it exits right away and does nothing as intended. If no k was found, then it iterates through the array until an empty slot is found then inputs k into the empty slot.

My code has several implications in terms of speed, there are 1 or 2 for loops depending on whether you are adding or removing from the array, if you are removing, then it should be as fast as before. If you are adding, then you are at at most O(2n) which is the same as O(n). Also, if you are adding to keycodes and it already exists, then it should also be as fast as the original because the loop would stop at the first instance of k.


On Speed in The Real World

Steve kindly provided a test case in #121 but I can't run it independently because I don't have a PS2 keyboard on hand and my soldering irons are at work not at home. I made my own tests running on a pro micro, below are the two simple tests that I've ran.

I ran the trapezoid test on my own solution, #121, and the original code. I did not run the stairs test with the original code because I fear it might bug it.

Surprisingly, All three versions returned the same results for both tests as follows:

  • 1 key down then up: 2000ms
  • 2 key down then up: 4000ms
  • 3 key down then up: 6000ms
  • 4 key down then up: 8000ms
  • 5 key down then up: 10000ms
  • 6 key down then up: 12000ms

It is possible that my test is flawed but it seems like there are other overheads involved with HID such that it always takes 2ms per scancode to go through, so my solution does not seem to be slower than a solution utilising pointers.


Validity Test

To make sure my code works, I ran this test to make sure key2 in #120 is properly releasing. I tested the original code, Steve's code, and my own. the original code is indeed bugged, and both Steve's and my code solves the problem.

EDIT: spelling

@NicoHood
Copy link
Owner

Hi,
thanks for your PR, I also want to get this fixed.

The release code is quite simpler, the add code more complex. I assume a press() releaseAll() combination is used more often, so it would be good to have the complex code inside the release code. It should be technically possible, right?

Also please remove the markers where your code start and ends, we can see it through the git diff :)

@blahlicus
Copy link
Contributor Author

blahlicus commented Sep 11, 2018

Hey, no, you cannot relegate tasks in the add code to the release code without introducing more bugs.

If I am reading it right, you are suggesting that I move the duplicate check to the release code, but this will not prevent multiple presses of key1 from flooding the keycodes array and hence prevent any other keys from being pressed until key1 was released.

To be clear, this new bug would not be as bad as the current bug at #120, but this still breaks HID specs because you aren't supposed to have multiple scancodes of the same value in your HID report and it affects the rollover ability of the keyboard greatly, which is a real life concern.

@NicoHood
Copy link
Owner

Well you need to reorder the keycodes on removing then. No matter what, your code looks good. You said you tested it, so I will merge it if you say its ready.

Does @SteveBenz want to give it another review? Otherwise I will merge it soon, remind me if I forget.

@blahlicus
Copy link
Contributor Author

As per your suggestion I reordered the keycodes on release instead so that we could keep the original very simple code for adding.

The release part now moves the contents of the last occupied slot to k instead, so that there will never be KEY_RESERVED in between occupied slots, so the add code part could safely iterate through the list and write itself on first instance of KEY_RESERVED or first instance of k.

I ran the staircase speed test on this new iteration and it is faster at the following speeds

  • 1 key down then up: 2000ms
  • 2 key down then up: 3000ms
  • 3 key down then up: 5000ms
  • 4 key down then up: 6000ms
  • 5 key down then up: 8000ms
  • 6 key down then up: 9000ms

I ran the key report test and also a rollover test to make sure this commit is working fine. Everything works fine.

This version is slightly less readable compared to before but it is a lot faster on adds as per your suggestion. I think it is currently still acceptable in terms of readability. If you think it's acceptable, then maybe just let it sit for a day or two and see if Steve would like to give some comments? Otherwise I think this is ready for merge as is.

@NicoHood
Copy link
Owner

The release code looks too complicated. Please have a look at my newer implementation here:
https://github.com/NicoHood/avr/blob/master/lib/USB_KEYBOARD/src/usb_keyboard.c#L224

Maybe you can reuse some of the code. It looks way cleaner to me.

@blahlicus
Copy link
Contributor Author

blahlicus commented Sep 11, 2018

The commits just look bad when you diff it with split screen on. Anyway, I refactored it somewhat by using the same for loop instead of 2 and also reformatted the curly bracket style to the existing code style.

The implementation you've linked to is a lot slower compared to all the other solutions because it reads and writes every filled slot until the last element is reached.

See below example where the user removes key1:

Before removal

  • [1][2][3][4][n][n]

Your implementation: (bolded text are rewritten slots)

  • [2][3][4][n][n][n]: 5 reads 4 writes

My implementation: (bolded text are rewritten slots)

  • [4][2][3][n][n][n]: 5 reads 2 writes

}

// if the next element is null or if this is the last element
if ((i < keycodesSize-1 && _keyReport.keycodes[i+1] == KEY_RESERVED) || i == keycodesSize-1) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the key doesn't appear in the list at all, this will remove the last key pressed from the list. You can fix that in one of two ways - one is that you can add keyIndex!=255 to the condition, the other, which would probably be a little more efficient, is to record the lastIndex rather than lastElement, and then move the code that changes the array down into the "if target key found" block.

Copy link
Contributor Author

@blahlicus blahlicus Sep 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, my previous commit (e2145e8) did not have this bug, it got introduced in the "refactor" commit (b733010).

As per your suggestion lastIndex is now recorded and handled in the target key found block instead in the most recent commit. I rewrote some tests and confirmed this bug is solved in the most recent commit. It also looks more readable so that is nice too.

EDIT: I'm not sure about etiquette regarding resolving conversations so I'll leave it unresolved.

_keyReport.keycodes[i] = k;
return 1;
// get size of keycodes during compile time
const uint8_t keycodesSize = sizeof(_keyReport.keycodes);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, this is a dangerous pattern in C - the correct way to do it is "sizeof(array)/sizeof(*array)". In this case, you can get away with what you got because sizeof(*array) happens to be 1 and is (reasonably) sure to stay that way. No point in fixing it here, as there's lots of this sort of thing in the code. Just something to think about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just followed the original code style because I did not want to make disruptive changes. I do recognise that changing the data type of the subject in question would mess it up. Like you say, HID spec scancodes won't be changing sizes any time soon so this should not matter but it's a good habit to always do it regardless anyway especially considering that this is calculated in compile time anyway so its not like checking sizeof(array)/sizeof(array[0]) is slower than the other way.

@blahlicus
Copy link
Contributor Author

blahlicus commented Sep 12, 2018

The most recent commit solved the bug in this conversation and makes the code more readable, I think this PR as it is should be ready for merge. If there's no other issues concerning the code I'll remind @NicoHood about it after a few days. If there are other concerns then please do let me know.

@NicoHood
Copy link
Owner

Did you have a look at the code that I posted? What about it? Would it also work? It looks way simpler to me.

@blahlicus
Copy link
Contributor Author

@NicoHood Yeah, I replied to you here.

@NicoHood
Copy link
Owner

NicoHood commented Sep 12, 2018

Okay you are right. I am terribly sorry for the long back and forth. I tend to like the first approach more. It is better readable and understandable. Also I dont know if the keycode reordering causes problems on some platform, we never know. Even if it is a little slower, who cares!? If you need speed, you would not use the arduino core.

Edit: How about iterating through the list (on adding), finding the first empty slot but still iterating through the whole list and if the key was found it returns? This should be fast and simple.

for(i=0; i<6;i++){ // iterating from 5 to 0 might be better?
  if(slot is empty){
    emptyslot = i;
   }
   else if(slot == new key){
    return;
   }
}
save key to empty slot. Make sure the report is not full!

@blahlicus
Copy link
Contributor Author

blahlicus commented Sep 12, 2018

Yeah, I agree, the original approach was good enough speed-wise for me which was why I submitted the original PR as is in the first place, so I'm not opposed to going back to that approach. And I agree reordering the slots might mess with compatibility, it might release, then re-press the reordered slots.

I think the approach you've suggested might be slower than my original solution. Both solutions read at least 6 times checking for duplicates but needing to save the empty slot index to a temp variable doubles the amount of writes from once to twice.

If the original solution is readable as is, then I think it's good enough for the merge. I'll defer to your judgement if you think having a single loop is really preferable in spite of the speed penalty.

@NicoHood
Copy link
Owner

I am not sure what is faster. I thought a single loop would be faster since remembering an empty slot could be done in the same loop. But I might be wrong. I'd go for the smaller code size though, and readability.

But I want to get this done, so this fix should do it at least for now. @SteveBenz any objections?

@SteveBenz
Copy link

This code isn't the tightest or fastest, but it's plain to see that it works. Ship it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants