feature: Buffer.lastIndexOf #4846

Closed

dcposch
Contributor

@dcposch dcposch commented Jan 24, 2016

Fixes #4604

work done

  • Added support for Buffer.lastIndexOf to match Buffer.indexOf
  • Can search for a string, another Buffer, or a specific byte value, consistent with Buffer.indexOf
  • For specific byte values, behavior is consistent with Uint8Array.lastIndexOf, which is now shadowed by Buffer.lastIndexOf, so existing code should continue to work
  • Added test cases
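As a rough illustration of the three needle types listed above, here is a minimal sketch run against a Node.js version that ships Buffer.lastIndexOf. The sample string and the variable names are made up for this example and are not taken from the PR's test file.

```javascript
// Searching from the end of a Buffer for a string, another Buffer, or a byte.
const buf = Buffer.from('this buffer is a buffer');

// Last occurrence of a string needle.
const byString = buf.lastIndexOf('buffer');
// Same search with a Buffer needle.
const byBuffer = buf.lastIndexOf(Buffer.from('buffer'));
// Single byte value: 0x62 is 'b'.
const byByte = buf.lastIndexOf(0x62);
// Start at index 16 and search backwards, so only the first "buffer" can match.
const bounded = buf.lastIndexOf('buffer', 16);
// -1 when there is no match.
const missing = buf.lastIndexOf('nope');

console.log(byString, byBuffer, byByte, bounded, missing);
```

The second "buffer" starts at index 17 and the first at index 5, so the bounded search falls back to the earlier occurrence.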

work left to do

  • Optimization. The implementation of reverse search in string_search.cc is naive and just uses a double for loop. Ideally we'd adapt BoyerMooreSearch to support reverse search, so that lastIndexOf is as fast as indexOf

@silverwind added the buffer and semver-minor labels Jan 24, 2016
@Fishrock123
Contributor

cc @trevnorris :)

@mikeal
Contributor

mikeal commented Jan 25, 2016

Oh wow, I've been waiting years for this :)

This should make writing parsers a lot nicer :)

@dcposch
Contributor Author

dcposch commented Jan 27, 2016

@mikeal cool. I think I'm more or less done!
Let me know if it looks reasonable.

performance

I've refactored so that both lastIndexOf and indexOf use the same fast algorithms: Boyer-Moore / Boyer-Moore-Horspool.

To do this with a minimum of messiness and without any code duplication, I expanded the Vector<Char> class that was already in string_search.h so that it can provide a reversed view onto the underlying buffer. Then, the existing algorithms (Linear, Boyer-Moore, Boyer-Moore-Horspool) can be applied unmodified.

This works because lastIndexOf(haystack, needle) can be calculated from indexOf(reverse(haystack), reverse(needle)), with the resulting index mapped back to the original orientation. Reversing the inputs is done via a lightweight view onto the original input buffer: it's not actually copying the buffers or doing anything slow like that.
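To make the reversed-view idea concrete, here is a small JavaScript sketch. It is not the Vector<Char> code from string_search.h; the helper names (reversedView, indexOfView, lastIndexOfViaReverse) are hypothetical, and the forward search is a naive stand-in for whichever algorithm is plugged in.

```javascript
// Lightweight reversed view: index i of the view reads element
// (length - 1 - i) of the underlying bytes. Nothing is copied.
function reversedView(bytes) {
  return {
    length: bytes.length,
    at: (i) => bytes[bytes.length - 1 - i],
  };
}

// Any forward search written against the view interface works unchanged.
// A naive double loop is used here purely for illustration.
function indexOfView(haystack, needle) {
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    let j = 0;
    while (j < needle.length && haystack.at(i + j) === needle.at(j)) j++;
    if (j === needle.length) return i;
  }
  return -1;
}

function lastIndexOfViaReverse(haystack, needle) {
  const r = indexOfView(reversedView(haystack), reversedView(needle));
  // With n = haystack.length and m = needle.length, a match at r in the
  // reversed haystack covers original positions n-r-m .. n-1-r, so the
  // match starts at n - m - r in the original orientation.
  return r === -1 ? -1 : haystack.length - needle.length - r;
}

console.log(lastIndexOfViaReverse(Buffer.from('abcabc'), Buffer.from('bc')));
```

The first match in the reversed haystack is the last match in the original one, which is why only the index needs translating at the end.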

testing

I added some additional test cases, to make sure that all of the search algorithms were being exercised.

(Background: Under the hood, the existing indexOf starts with Linear search, then if that's too slow, switches to Boyer-Moore-Horspool, which requires only a quick precomputation but has O(nm) worst-case complexity for finding a needle of size m in a haystack of size n. Finally, if that turns out to be too slow, it switches to Boyer-Moore, which requires more precomputation but has linear worst-case complexity.)
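For readers unfamiliar with the middle strategy, the following is a textbook Boyer-Moore-Horspool sketch in JavaScript. It is not the implementation in string_search.h, the function name is hypothetical, and it assumes both arguments are byte arrays (Buffer or Uint8Array).

```javascript
// Textbook Boyer-Moore-Horspool over bytes. Precomputation is a single
// 256-entry shift table, which is why it is cheap to set up.
function horspoolIndexOf(haystack, needle) {
  const n = haystack.length;
  const m = needle.length;
  if (m === 0) return 0;

  // shift[b]: how far to slide the needle when the haystack byte aligned
  // with the needle's last position is b. Bytes absent from the needle
  // (ignoring its final byte) allow a full shift of m.
  const shift = new Array(256).fill(m);
  for (let i = 0; i < m - 1; i++) shift[needle[i]] = m - 1 - i;

  let pos = 0;
  while (pos + m <= n) {
    // Compare right to left at the current alignment.
    let j = m - 1;
    while (j >= 0 && haystack[pos + j] === needle[j]) j--;
    if (j < 0) return pos;
    pos += shift[haystack[pos + m - 1]];
  }
  return -1;
}

const text = Buffer.from('here is a simple example');
console.log(horspoolIndexOf(text, Buffer.from('example')));
```

On inputs such as a long run of one byte searched for a needle that differs only in its first byte, every alignment compares almost the whole needle and the shift is 1, which is the O(nm) behavior that motivates the final fallback to Boyer-Moore.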

Surprisingly, the existing test cases never seem to exercise the Boyer-Moore fallback at all!

I added a test case that does go there.

If you want to reproduce the output below, compile with DEBUG_STRING_SEARCH defined and add a failing assertion to the bottom of test-buffer-indexof.js (otherwise the test runner swallows the output).

=== release test-buffer-indexof ===                                            
Path: parallel/test-buffer-indexof
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR

(Above: existing test cases for indexOf. Note it never runs Boyer-Moore.)

(Below: new test cases, added in this PR, for lastIndexOf. Uses the same code to do all the heavy lifting, no code duplication. Tests Boyer-Moore.)

reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE

@dcposch
Contributor Author

dcposch commented Jan 27, 2016

Also, this PR uses memrchr to search for a single byte value in a Buffer, back to front. Like memchr, it's a lot faster than just looping over bytes.

Unfortunately memrchr is a GNU extension, not part of POSIX like memchr.

  1. Does Node have to compile in places where memrchr isn't available?
  2. If so, should I find a polyfill, for example this one?
    https://github.com/c9/node-gnu-tools/blob/master/grep-src/lib/memrchr.c
  3. Alternatively, we can use the magic bits method described here:
    http://cebka.blogspot.com/2015/04/how-fast-is-your-memchr.html
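For reference, the simplest possible fallback is a plain backward loop. The sketch below shows that logic in JavaScript only to stay consistent with the other examples here; a real polyfill would be written in C or C++ and return a pointer, whereas this hypothetical memrchrFallback returns an index or -1. It makes no attempt at the word-at-a-time tricks from the linked article.

```javascript
// Scan the first n bytes of `bytes` from the end and return the index of
// the last byte equal to `value`, or -1 if there is none.
function memrchrFallback(bytes, value, n) {
  const target = value & 0xff; // compare as an unsigned byte, like memrchr
  for (let i = Math.min(n, bytes.length) - 1; i >= 0; i--) {
    if (bytes[i] === target) return i;
  }
  return -1;
}

const sample = Buffer.from('abcabc');
console.log(memrchrFallback(sample, 0x62, sample.length));
```

A loop like this is correct everywhere but gives up the speed advantage mentioned above, so it makes sense only as a last resort on platforms without memrchr.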

@jasnell added the wip label Jan 27, 2016
@feross
Contributor

feross commented Jan 28, 2016

Thanks for tackling this PR, @dcposch!

@@ -847,31 +882,25 @@ void IndexOfString(const FunctionCallbackInfo<Value>& args) {
SPREAD_ARG(args[0], ts_obj);

Local<String> needle = args[1].As<String>();
int64_t offset_i64 = args[2]->IntegerValue();
bool is_forward = args[4]->BooleanValue();
Contributor

Faster to do IsTrue()

@trevnorris
Contributor

Started to review, but have to step away. Will finish this tomorrow.

@dcposch Nice job on the inline comments. Makes the patch easier to follow. Especially for one this size.

@dcposch
Contributor Author

dcposch commented Jan 29, 2016

@trevnorris thx. Fixed all the things you pointed out so far

} else if (byteOffset < -0x80000000) {
byteOffset = -0x80000000;
}
if (typeof byteOffset !== 'number' || isNaN(byteOffset)) {
Contributor

This shouldn't be necessary. To follow string.lastIndexOf() the offset should be coerced to a primitive. So for example 'abcde'.lastIndexOf('c', [1]) === -1. Basically the byteOffset >>= 0 below should be enough.

Contributor Author

I think we need this to match the behavior of Buffer.indexOf

  • buf.indexOf('foo') searches the whole buffer, as does buf.indexOf('foo', null), buf.indexOf('foo', 'foo'), etc
  • buf.indexOf('foo', 0) searches starting from index 0, which also searches the whole buffer
  • buf.lastIndexOf('foo') should def search the whole buffer, but
  • buf.lastIndexOf('foo', 0) does a reverse search starting from index 0, so it only checks for a match at index 0

So at a minimum, we have to special-case undefined

I think it's best if buf.lastIndexOf('foo', null), buf.lastIndexOf('foo', NaN) etc match buf.lastIndexOf('foo') -- in other words, they should search the whole buffer. That means they're NOT equivalent to buf.lastIndexOf('foo', 0)

Contributor

I agree. The operation is as simple as offset = +offset. This will coerce all isNaN() values to NaN, which can then be checked by Number.isNaN(). It will also coerce values like [2] to 2, which is also how String#lastIndexOf() operates.

So any value that returns true for Number.isNaN() after the coercion is set to the default value. Though note this does exclude null. Which is the same way strings work. e.g. 'abc'.lastIndexOf('b', null) === -1.
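The coercion rule described in this comment can be sketched as follows. The helper name and its length parameter are hypothetical and this is not the code that landed; negative offsets and clamping are deliberately left out.

```javascript
// Hypothetical helper illustrating the rule discussed above:
// coerce with unary plus, then treat NaN as "use the default".
function normalizeLastIndexOfOffset(byteOffset, length) {
  const n = +byteOffset; // undefined, {}, 'foo' -> NaN; null, [] -> 0; [2] -> 2
  if (Number.isNaN(n)) return length - 1; // default: search the whole buffer
  return n;
}

console.log(normalizeLastIndexOfOffset(undefined, 6)); // NaN after coercion, so the default
console.log(normalizeLastIndexOfOffset(null, 6));      // null coerces to 0, not the default
console.log('abc'.lastIndexOf('b', null));             // String behaves the same way
```

Because null coerces to 0 rather than NaN, it restricts the search to index 0, which matches the String#lastIndexOf example given in the comment.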

Contributor Author

fixed. i added a test case to ensure 'abc'.lastIndexOf('b', null) === -1

@dcposch
Contributor Author

dcposch commented Feb 1, 2016

@trevnorris notice me senpai

@Fishrock123
Contributor

@dcposch some of us who work on this more do take the weekends off. ;)

#define DEBUG_TRACE(s) printf("%s search %s\n", \
subject.forward() ? "forward" : "reverse", s);
#else
#define DEBUG_TRACE(s) // no-op
Contributor

@bnoordhuis have any comments on this?

Member

I'd leave this out.

Contributor Author

Removed. Note though that this was the only way I noticed some serious missing test coverage: the whole Boyer-Moore algorithm (nontrivial code, rarely exercised) was never reached by the unit tests before. (The unit tests previously only covered Boyer-Moore-Horspool, Linear, and Single-Char. After this PR they cover Boyer-Moore as well.)

@trevnorris
Contributor

@dcposch Left a few more comments. Now that the weekend is over I will be more attentive. :)

@dcposch
Contributor Author

dcposch commented Feb 2, 2016

@trevnorris fixed. Thanks for checking it out!

@trevnorris
Contributor

Excellent work. CI: https://ci.nodejs.org/job/node-test-pull-request/1515/

@dcposch
Contributor Author

dcposch commented Feb 2, 2016

@trevnorris I clicked Authorize, but it says

Access Denied
dcposch is missing the Overall/Read permission

@rvagg
Member

rvagg commented Feb 3, 2016

Sorry @dcposch, we have CI in lockdown until we get our security releases out, you'll have to rely on collaborators to get you info on how the jobs have gone until it's opened back up again next week. / #4857

There are compile errors on OSX:

In file included from ../src/node_buffer.cc:7:
../src/string_search.h:287:11: error: use of undeclared identifier 'memrchr'; did you mean 'memchr'?
    pos = memrchr(subject.start(), pattern_first_char, subj_len - index);
          ^~~~~~~
          memchr
/usr/include/string.h:70:7: note: 'memchr' declared here
void *memchr(const void *, int, size_t);
      ^
../src/node_buffer.cc:1061:11: error: use of undeclared identifier 'memrchr'; did you mean 'memchr'?
    ptr = memrchr(ts_obj_data, needle, offset + 1);
          ^~~~~~~
          memchr
/usr/include/string.h:70:7: note: 'memchr' declared here
void *memchr(const void *, int, size_t);
      ^
In file included from ../src/node_buffer.cc:7:
../src/string_search.h:252:18: error: use of undeclared identifier 'memrchr'
      void_pos = memrchr(subject.start(), search_byte, bytes_to_search);
                 ^
../src/string_search.h:308:10: note: in instantiation of function template specialization 'node::stringsearch::FindFirstCharacter<unsigned short>' requested here
  return FindFirstCharacter(search->pattern_, subject, index);
         ^
../src/string_search.h:107:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::SingleCharSearch' requested here
        strategy_ = &SingleCharSearch;
                     ^
../src/string_search.h:607:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::StringSearch' requested here
  StringSearch<Char> search(pattern);
                     ^
../src/string_search.h:636:36: note: in instantiation of function template specialization 'node::stringsearch::SearchString<unsigned short>' requested here
  size_t pos = node::stringsearch::SearchString(
                                   ^
../src/node_buffer.cc:928:16: note: in instantiation of function template specialization 'node::SearchString<unsigned short>' requested here
      result = SearchString(reinterpret_cast<const uint16_t*>(haystack),
               ^
In file included from ../src/node_buffer.cc:7:
../src/string_search.h:326:9: error: no matching function for call to 'FindFirstCharacter'
    i = FindFirstCharacter(pattern, subject, i);
        ^~~~~~~~~~~~~~~~~~
../src/string_search.h:110:20: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::LinearSearch' requested here
      strategy_ = &LinearSearch;
                   ^
../src/string_search.h:607:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::StringSearch' requested here
  StringSearch<Char> search(pattern);
                     ^
../src/string_search.h:636:36: note: in instantiation of function template specialization 'node::stringsearch::SearchString<unsigned short>' requested here
  size_t pos = node::stringsearch::SearchString(
                                   ^
../src/node_buffer.cc:928:16: note: in instantiation of function template specialization 'node::SearchString<unsigned short>' requested here
      result = SearchString(reinterpret_cast<const uint16_t*>(haystack),
               ^
../src/string_search.h:235:15: note: candidate template ignored: substitution failure [with Char = unsigned short]
inline size_t FindFirstCharacter(Vector<const Char> pattern,
              ^
../src/string_search.h:575:11: error: no matching function for call to 'FindFirstCharacter'
      i = FindFirstCharacter(pattern, subject, i);
          ^~~~~~~~~~~~~~~~~~
../src/string_search.h:113:18: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::InitialSearch' requested here
    strategy_ = &InitialSearch;
                 ^
../src/string_search.h:607:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::StringSearch' requested here
  StringSearch<Char> search(pattern);
                     ^
../src/string_search.h:636:36: note: in instantiation of function template specialization 'node::stringsearch::SearchString<unsigned short>' requested here
  size_t pos = node::stringsearch::SearchString(
                                   ^
../src/node_buffer.cc:928:16: note: in instantiation of function template specialization 'node::SearchString<unsigned short>' requested here
      result = SearchString(reinterpret_cast<const uint16_t*>(haystack),
               ^
../src/string_search.h:235:15: note: candidate template ignored: substitution failure [with Char = unsigned short]
inline size_t FindFirstCharacter(Vector<const Char> pattern,
              ^
5 errors generated.
make[2]: *** [/Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj.target/node/src/node_buffer.o] Error 1
make[2]: *** Waiting for unfinished jobs....
  c++ '-D_DARWIN_USE_64_BIT_INODE=1' '-DNODE_ARCH="x64"' '-DNODE_WANT_INTERNALS=1' '-DV8_DEPRECATION_WARNINGS=1' '-DHAVE_OPENSSL=1' '-DHAVE_DTRACE=1' '-D__POSIX__' '-DNODE_PLATFORM="darwin"' '-DHTTP_PARSER_STRICT=0' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' -I../src -I../tools/msvs/genfiles -I../deps/uv/src/ares -I/Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj/gen -I../deps/v8 -I../deps/cares/include -I../deps/v8/include -I../deps/openssl/openssl/include -I../deps/zlib -I../deps/http_parser -I../deps/uv/include  -Os -gdwarf-2 -mmacosx-version-min=10.5 -arch x86_64 -Wall -Wendif-labels -W -Wno-unused-parameter -std=gnu++0x -fno-rtti -fno-exceptions -fno-threadsafe-statics -fno-strict-aliasing -MMD -MF /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/.deps//Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj.target/node/src/tls_wrap.o.d.raw   -c -o /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj.target/node/src/tls_wrap.o ../src/tls_wrap.cc
In file included from ../src/string_search.cc:1:
../src/string_search.h:287:11: error: use of undeclared identifier 'memrchr'; did you mean 'memchr'?
    pos = memrchr(subject.start(), pattern_first_char, subj_len - index);
          ^~~~~~~
          memchr
/usr/include/string.h:70:7: note: 'memchr' declared here
void *memchr(const void *, int, size_t);
      ^
1 error generated.

Windows

c:\workspace\node-compile-windows\label\win-vs2013\src\string_search.h(287): error C3861: 'memrchr': identifier not found (src\node_buffer.cc) [c:\workspace\node-compile-windows\label\win-vs2013\node.vcxproj]

smartos

In file included from ../src/node_buffer.cc:7:0:
../src/string_search.h: In function 'std::size_t node::stringsearch::FindFirstCharacter(node::stringsearch::Vector<const Char>, node::stringsearch::Vector<const Char>, std::size_t) [with Char = unsigned char; std::size_t = long unsigned int]':
../src/string_search.h:287:72: error: 'memrchr' was not declared in this scope
     pos = memrchr(subject.start(), pattern_first_char, subj_len - index);
                                                                        ^
../src/node_buffer.cc: In function 'void node::Buffer::IndexOfNumber(const v8::FunctionCallbackInfo<v8::Value>&)':
../src/node_buffer.cc:1061:50: error: 'memrchr' was not declared in this scope
     ptr = memrchr(ts_obj_data, needle, offset + 1);
                                                  ^
In file included from ../src/node_buffer.cc:7:0:
../src/string_search.h: In instantiation of 'size_t node::stringsearch::FindFirstCharacter(node::stringsearch::Vector<const Char>, node::stringsearch::Vector<const Char>, size_t) [with Char = short unsigned int; size_t = long unsigned int]':
../src/string_search.h:308:61:   required from 'static size_t node::stringsearch::StringSearch<Char>::SingleCharSearch(node::stringsearch::StringSearch<Char>*, node::stringsearch::Vector<const Char>, size_t) [with Char = short unsigned int; size_t = long unsigned int]'
../src/string_search.h:107:21:   required from 'node::stringsearch::StringSearch<Char>::StringSearch(node::stringsearch::Vector<const Char>) [with Char = short unsigned int]'
../src/string_search.h:607:36:   required from 'size_t node::stringsearch::SearchString(node::stringsearch::Vector<const Char>, node::stringsearch::Vector<const Char>, size_t) [with Char = short unsigned int; size_t = long unsigned int]'
../src/string_search.h:637:49:   required from 'size_t node::SearchString(const Char*, size_t, const Char*, size_t, size_t, bool) [with Char = short unsigned int; size_t = long unsigned int]'
../src/node_buffer.cc:933:39:   required from here
../src/string_search.h:252:71: error: 'memrchr' was not declared in this scope
       void_pos = memrchr(subject.start(), search_byte, bytes_to_search);
                                                                       ^
node.target.mk:157: recipe for target '/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/obj.target/node/src/node_buffer.o' failed
make[2]: *** [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/obj.target/node/src/node_buffer.o] Error 1

the test also failed to run on armv8 but that was a jenkins problem as far as I can tell.

Looks like there's still quite a bit of work to do on cross-platform compat here.

@dcposch
Contributor Author

dcposch commented Feb 3, 2016

@rvagg @trevnorris thanks!

Yeah this goes back to my first question at the top of the PR, about whether node builds can use memrchr. Looks like they can on some systems but not on others.

I added a fallback for those systems. LMK if the CI is happier now!

@jasnell
Member

jasnell commented Apr 8, 2016

Whatever makes it in before I start working on it on Monday ;)
On Apr 8, 2016 4:29 PM, "Trevor Norris" [email protected] wrote:

@jasnell If this lands before Monday, could it make it in, or have all the RC commits been chosen?



@trevnorris
Contributor

Seems there's a minor performance hit with this PR, but I believe it's within an acceptable level. Anyone have any objections? If not then let's land it.

@Fishrock123
Contributor

🚢

@jasnell
Member

jasnell commented Apr 15, 2016

Works for me.

@jasnell
Member

jasnell commented Apr 22, 2016

@trevnorris ... there's still time to get this in. Is it ready to go?
@dcposch ... can you rebase?

@dcposch
Contributor Author

dcposch commented Apr 22, 2016

@dcposch ... can you rebase?

@jasnell yes, will be done v soon

dcposch added 2 commits April 22, 2016 15:45
* Remove unnecessary templating from SearchString

  SearchString used to have separate PatternChar and SubjectChar template type
  arguments, apparently to support things like searching for an 8-bit string
  inside a 16-bit string or vice versa. However, SearchString is only used from
  node_buffer.cc, where PatternChar and SubjectChar are always the same. Since
  this is extra complexity that's unused and untested (simplifying to a single
  Char template argument still compiles and didn't break any unit tests), I
  removed it.

* Use Boyer-Moore[-Horspool] for both indexOf and lastIndexOf

  Add test cases for lastIndexOf. Test the fallback from BMH to
  Boyer-Moore, which looks like it was totally untested before.

* Extra bounds checks in node_buffer.cc

* Extra asserts in string_search.h

* Buffer.lastIndexOf: clean up, enforce consistency w/ String.lastIndexOf

* Polyfill memrchr(3) for non-GNU systems
@dcposch
Contributor Author

dcposch commented Apr 23, 2016

@jasnell fixed

@jasnell
Member

jasnell commented Apr 23, 2016

One final CI before landing: https://ci.nodejs.org/job/node-test-pull-request/2371/

@jasnell jasnell added this to the 6.0.0 milestone Apr 23, 2016
@dcposch
Contributor Author

dcposch commented Apr 23, 2016

@jasnell looks like test-tls-inception failed on FreeBSD and tests passed on the other platforms. I don't know if it's related to this change--looks unlikely. Want to try re-running it?

Failing build: https://ci.nodejs.org/job/node-test-commit-freebsd/2167/
Failing test: https://ci.nodejs.org/job/node-test-commit-freebsd/2167/nodes=freebsd10-64/tapTestReport

@jasnell
Member

jasnell commented Apr 23, 2016

New CI: https://ci.nodejs.org/job/node-test-pull-request/2372/

@dcposch
Contributor Author

dcposch commented Apr 23, 2016

Sweet, everything worked that time including FreeBSD

jasnell pushed a commit that referenced this pull request Apr 25, 2016

PR-URL: #4846
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
@jasnell
Member

jasnell commented Apr 25, 2016

Landed in 6c1e5ad

@jasnell jasnell closed this Apr 25, 2016
joelostrowski pushed a commit to joelostrowski/node that referenced this pull request Apr 25, 2016
@trevnorris
Contributor

@jasnell Was the squash/merge button used? I'm trying to figure out why the Author: field was changed from the listed commits.

@jasnell
Member

jasnell commented Apr 25, 2016

No, I squashed like normal. Didn't notice that the author changed :-/
On Apr 25, 2016 3:48 PM, "Trevor Norris" [email protected] wrote:

@jasnell Was the squash/merge button used? I'm trying to figure out why the Author: field was changed from the listed commits.



@trevnorris
Contributor

Strange. Oh well. Nothing serious.

jasnell pushed a commit that referenced this pull request Apr 26, 2016