Fix regression that closed the warc filereader too early #83
Merged
github.com/nlnwa/whatwg-url changed serialization for empty query values
This reverts commit b48b83b. The commit caused the warc records to be closed too early.
This commit ignores validation errors when loading records because a record may be valuable even though it is not valid.
This commit adds specific error messages such that it is possible to differentiate between different errors.
This commit changes the behaviour when searching for a single URL, with or without a scheme, if the match type is exact or prefix. The search is forced to cover both http and https to return as many results as possible. To differentiate between schemes, use the coreserver API.
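As a rough illustration of searching both schemes, the sketch below expands one input URL into its http and https forms. The helper name `schemeVariants` is hypothetical, and the project most likely operates on index keys rather than raw URL strings, so this shows only the idea and not the actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// schemeVariants is a hypothetical helper: it returns the http and https
// forms of rawURL, whether or not the input already carries a scheme.
func schemeVariants(rawURL string) []string {
	rest := rawURL
	for _, prefix := range []string{"http://", "https://"} {
		if strings.HasPrefix(rest, prefix) {
			rest = strings.TrimPrefix(rest, prefix)
			break
		}
	}
	// Always search both schemes so no captures are missed.
	return []string{"http://" + rest, "https://" + rest}
}

func main() {
	fmt.Println(schemeVariants("example.com/path"))
	fmt.Println(schemeVariants("https://example.com/path"))
}
```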
This commit refactors the ListStorageRef method such that all writes to the result channel are handled in the same code block for better readability.
This commit refactors code to be easier to comprehend. The cdx response creation is wrapped in a lambda function.
This commit uses the known prefix (key) to narrow the search to prefix.
This commit fixes the test to continue with next case if the previous test failed.
This commit substitutes CRLF for LF and simplifies the code by returning LF after every response. See also https://jsonlines.org/ which states that LF is standard.
The old test cases are no longer valid after the ssurt package changed its behavior.
This commit returns an empty response when the search was empty. This enables the handler to return a 404 and be sure nothing has already been written to the response.
This commit enables filtering and limiting responses when using tikv closest api.
This commit improves readability.
This commit ensures a response when limit is set to 0 in the tikv methods. The limit then defaults to TiKV's MaxRawKVScanLimit.
This commit skips adding a record to the write batch if the cdx key is larger than tikv max key size.
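The skip logic can be sketched as follows. The `maxKeySize` constant, the `record` type, and `buildBatch` are all assumptions for illustration: the value 16 is deliberately tiny and is not TiKV's actual limit, which this PR summary does not state.

```go
package main

import "fmt"

// maxKeySize is an assumed limit chosen for the demo; it is NOT TiKV's real value.
const maxKeySize = 16

// record is a hypothetical key/value pair destined for a write batch.
type record struct {
	Key   []byte
	Value []byte
}

// buildBatch keeps only records whose key fits within maxKeySize and
// reports how many records were skipped.
func buildBatch(records []record) (batch []record, skipped int) {
	for _, r := range records {
		if len(r.Key) > maxKeySize {
			skipped++ // an oversized key would be rejected by the store, so skip it
			continue
		}
		batch = append(batch, r)
	}
	return batch, skipped
}

func main() {
	records := []record{
		{Key: []byte("short-key"), Value: []byte("a")},
		{Key: []byte("a-key-that-is-far-too-long-for-the-limit"), Value: []byte("b")},
	}
	batch, skipped := buildBatch(records)
	fmt.Println(len(batch), skipped)
}
```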
This commit refactors the badger search methods such that all writes to the result channel are handled in the same code block for better readability.
This commit enables filtering and limiting responses when using badger closest api.
This commit refactors sorted parallel search by inlining the traversal of sorted items. This reduces the abstraction level to make it easier to follow the logic.
This commit removes a comment that adds no value.
This commit bumps dependencies and, as a consequence, updates the error checking to match changes in the github.com/nlnwa/whatwg-url library.
This commit refines the release tag to match semver only and not any tag starting with the letter 'v'.
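To illustrate the difference between "any tag starting with v" and "semver only", here is a small Go sketch. The workflow's actual pattern is not shown in this PR summary, and the regular expression below is a simplified assumption that covers only `vMAJOR.MINOR.PATCH` without pre-release or build metadata.

```go
package main

import (
	"fmt"
	"regexp"
)

// semverTag is a simplified, assumed pattern: "v" followed by three
// dot-separated numbers without leading zeros. Pre-release and build
// metadata are intentionally left out of this sketch.
var semverTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$`)

// isSemverTag reports whether tag looks like a plain semver release tag.
func isSemverTag(tag string) bool {
	return semverTag.MatchString(tag)
}

func main() {
	for _, tag := range []string{"v1.2.3", "vnext", "v1.2", "v01.2.3"} {
		fmt.Println(tag, isSemverTag(tag))
	}
}
```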
This commit changes the test workflow to trigger on all push events.
This commit updates the build base image to golang 1.21.
This commit changes the error messages in the filestorageloader to provide more information about the error.
This commit returns a special error to signal that the given WARC-Refers-To id could not be found, meaning it is not in the index.
This commit fixes the resolve method in badger implementation to check if key is not found. This means that the function will return an empty storageRef and nil error if key is not found.
This commit adds a check to see if value slice is empty in resolve method.
This commit changes the behaviour to return HTTP status code 404 instead of 500 if the referred record in the revisit record could not be found.
This PR reverts the commit that caused the WARC file reader to be closed prematurely.
In addition, it includes a number of fixes and dependency updates.