Patches for v6.5 release #748
Merged
Conversation
All parameters are required so a missing (null) header value will already prevent startup. Throwing an exception here fails immediately, preventing reporting of any other missing parameters.
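For illustration, a minimal sketch of the idea (names are hypothetical, not the actual configuration code): collect every missing parameter before failing, so startup reports all of them at once rather than only the first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: gather all missing required parameters, then fail once with the full list. */
public class RequiredParametersSketch {
    public static void checkRequired (Map<String, String> config, List<String> requiredKeys) {
        List<String> missing = new ArrayList<>();
        for (String key : requiredKeys) {
            // Record the missing key instead of throwing immediately, so every
            // missing parameter can be reported in a single error message.
            if (config.get(key) == null) missing.add(key);
        }
        if (!missing.isEmpty()) {
            throw new IllegalStateException("Missing required parameters: " + String.join(", ", missing));
        }
    }
}
```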
Use standard HttpUtils.getRequestFiles in OpportunityDatasetController, which causes all uploaded files to be disk files rather than in-memory byte buffers. Uploads were failing due to small Shapefile sidecars not being written to disk. Also added clearer messages to the checkState calls that were failing.
Add a dataGroupId to OpportunityDatasets. We were assigning an (unsaved) DataSource, but not putting the datasets in DataGroups. Also stronger typing on parameters (pass a DataGroup, not a String id).
also generalized one log message for local use
workers do not stick to a single projectId, only to a single bundleId
previously we were using the uppercase enum value names
on newly started backends, we were fetching only the .shp file and not the other associated files
This allows us to remove the old non-array fields in mongo records for older regional analysis results by copying their values into single-element arrays. We will still find older files by fallback.
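As a rough sketch of that migration idea (class and field names are assumptions, not the actual r5 records), the legacy scalar is copied into a single-element array so the scalar field can later be dropped:

```java
/** Hypothetical sketch (not the actual r5 classes or field names). */
class RegionalAnalysisRecord {
    int cutoffMinutes;      // legacy scalar field present on old records
    int[] cutoffsMinutes;   // newer array field used by current code

    /** Copy the legacy scalar into a single-element array so the scalar field can be removed. */
    void migrateCutoffs () {
        if (cutoffsMinutes == null) {
            cutoffsMinutes = new int[] { cutoffMinutes };
        }
    }
}
```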
Also return empty string instead of null in LocalFilesController and document why - when Spark framework sees a null response it thinks the path is not mapped to a controller.
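A hedged sketch of that Spark handler behavior (paths and the cache directory are illustrative, not the actual LocalFilesController):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import static spark.Spark.get;

// Sketch only: serve a local file, returning "" rather than null, because Spark
// interprets a null return value as "this path is not mapped to a controller".
public class LocalFilesRouteSketch {
    static final File LOCAL_CACHE_DIR = new File("cache"); // assumption, not the real config value

    public static void main (String[] args) {
        get("/files/*", (request, response) -> {
            File file = new File(LOCAL_CACHE_DIR, request.splat()[0]);
            if (!file.exists()) {
                response.status(404);
                return ""; // empty string, not null
            }
            try (InputStream in = new FileInputStream(file)) {
                in.transferTo(response.raw().getOutputStream());
            }
            return ""; // body already written to the raw response
        });
    }
}
```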
Regional output array migration
abyrd commented Sep 27, 2021
some of these scenarios are huge and we're considering removing them from the database
These are usually built in bundleController and cached. When they were rebuilt by gtfsCache, the feedId was falling back on the filename (which was a bundle-scoped feedId rather than just the feedId) or even on the feedId declared by the feed itself, not known to be unique.
Add "Access-Control-Max-Age" header set to one day to all preflight requests. This allows browsers to cache the "Options" preflight request instead of sending it for each HTTP request.
Not generic nodes etc. which may lack lat/lon values
Add "Access-Control-Max-Age" header
Assertions elsewhere assume no travel time will exceed the request.maxTripDurationMinutes limit. This appears to be an issue only when cutoff times are lower than access mode-specific travel time limits.
Compare e15bff0 to ab05f16
This was observed in testing when running many single point and regional analyses. Perhaps a regional worker started up and was assigned the same IP address as a single point worker that had recently shut down from inactivity. The resulting exception must be caught so this IP will be unregistered for this workerCategory and a new one started/found.
returning partial regional analysis results was only half-implemented, which was causing us to mishandle requests on incomplete jobs.
Ideally we'd infer from the database record that this analysis does not have any gridded results and immediately return a 404. As a short-term fix, we check assertions only when an older-format file is actually found, and return a 404 to cover a wider range of cases: files that are missing because they are not even supposed to exist, and files that should exist but have gone missing for other unknown reasons.
rather than silently taking a legacy code path
Add method to check for spanning 180 degree meridian. Move methods into GeometryUtils for reuse. Check envelope of uploaded OSM and GTFS data. Check envelope of WebMercatorGridPointSet when created. Check envelope of StreetLayer when it's loaded. Text describing range-checked object is now substituted into messages.
Consistently throw more general exception type. Check that data actually contains points near the antimeridian.
these files should be valid except for the fact that they cross 180 deg
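A minimal sketch of the kind of envelope check described above (not the actual GeometryUtils code; a JTS Envelope in WGS84 degrees is assumed):

```java
import org.locationtech.jts.geom.Envelope;

// Sketch: reject envelopes with coordinates outside WGS84 bounds, and flag
// envelopes so wide that the data likely spans the 180-degree antimeridian,
// substituting a description of the checked object into the message.
public class AntimeridianCheckSketch {
    public static void checkWgsEnvelope (Envelope envelope, String description) {
        if (envelope.getMinX() < -180 || envelope.getMaxX() > 180 ||
            envelope.getMinY() < -90 || envelope.getMaxY() > 90) {
            throw new IllegalArgumentException(description + " contains coordinates outside WGS84 bounds.");
        }
        if (envelope.getWidth() > 180) {
            throw new IllegalArgumentException(description + " appears to span the 180-degree antimeridian.");
        }
    }
}
```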
This reverts commit f399f90. This is a double-revert, re-establishing the original use of getPriority methods instead of priority fields. The hastily applied patch where we shifted to using the field instead of methods was based on a false belief that old MapDBs had this field. In fact the field was a temporary addition during development, and the original state was without the field.
We don't want to save them to the MapDB file anyway, and simply adding error instances to MapDB files causes (de)serializer configuration to be stored in the file. This makes it more fragile: any change to error class fields can break reopening MapDB files even after all instances of those errors have been cleared out. We can't readily stream the errors out to JSON without accumulating in memory, because we also want to summarize them for storage in Mongo. Conceivably we could build up the summary at the same time as streaming them out to JSON.
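A sketch of that conceivable approach (error types simplified to plain strings; not the actual error classes): stream each error out to JSON without accumulating them in memory, while building a small summary for storage in Mongo as we go.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

public class ErrorStreamingSketch {
    /** Write each error to the JSON output as it is seen, counting occurrences for a summary. */
    public static Map<String, Integer> streamErrors (Iterable<String> errors, OutputStream out) throws Exception {
        Map<String, Integer> summary = new HashMap<>();
        try (JsonGenerator generator = new JsonFactory().createGenerator(out)) {
            generator.writeStartArray();
            for (String error : errors) {
                generator.writeString(error);           // streamed out, never held in a list
                summary.merge(error, 1, Integer::sum);  // per-error count kept for the Mongo summary
            }
            generator.writeEndArray();
        }
        return summary;
    }
}
```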
ansoncfit approved these changes Nov 11, 2021
Approving for merge to dev
I am accumulating any patches here leading up to the September 2021 v6.5 release.
In addition to the items described in detail below, this PR contains various small but important fixes.
In-memory upload FileItems
I ran into a problem creating new opportunity datasets. On upload, a checkState that all uploaded items are readable files failed. This is because unlike Bundle and DataSource uploads, OpportunityDataset uploads were not using the method HttpUtils.getRequestFiles() which customizes the FileUploadItemFactory to ensure even tiny files (like Shapefile projection sidecars) are actually put in a disk file instead of memory. In the case of creating opportunity dataset grids from shapefiles, this wasn't hurting anything because the conversion process explicitly writes the shapefile FileItems out to a temporary directory. Rather than remove the checkState (which actually does check required conditions for other code paths handling uploaded files) I just changed all such code paths to use the same HttpUtils utility method.
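As a rough sketch of the pattern getRequestFiles relies on (not the actual HttpUtils code), a zero-byte size threshold on the commons-fileupload factory forces every uploaded part, even a tiny .prj sidecar, onto disk:

```java
import java.io.File;
import java.util.List;
import java.util.Map;
import javax.servlet.http.HttpServletRequest;
import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

public class RequestFilesSketch {
    /** Sketch: parse a multipart upload with a zero threshold so every FileItem is a real disk file. */
    public static Map<String, List<FileItem>> getRequestFiles (HttpServletRequest request) throws Exception {
        File tempDir = new File(System.getProperty("java.io.tmpdir"));
        DiskFileItemFactory factory = new DiskFileItemFactory(0, tempDir); // threshold 0: always spill to disk
        ServletFileUpload upload = new ServletFileUpload(factory);
        return upload.parseParameterMap(request);
    }
}
```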
Long term it would probably be better to remove the zero-threshold factory, allowing small FileItems (including non-file form fields) to remain in memory, then revise and use moveFilesIntoStorage to explicitly write each FileItem (via FileItem.write()) to a temporary location or directly into FileStorage.
Confusion was caused here by the fact that contrary to its documentation, FileItem does not return null when getStoreLocation is called on an in-memory FileItem. It returns the name of a temporary file where it would save the data if the file were to exceed the size threshold.
LODES extract progress and DataGroups
We were extracting LODES data in a background task that shows up on /api/activity, but not reporting any progress, not putting the resulting layers into a DataGroup, and not reporting the WorkProduct. Because we still have the "classic" LODES upload status reports, I considered just bypassing the background task activity reporting, but didn't want to move backward. So with this change we are now reporting both kinds of progress. The new one shows much better incremental progress, so I believe we could now just stop displaying the classic progress updates at the top left of the UI.

Observations: seeing the two side by side, I noticed that the new progress text transitions from green to blue upon completion, while in the classic progress the entire background flips from blue to green when it finishes. The new progress reporting is clearly better, but the "flip to green" effect of the old progress reporting is more satisfying, and I always feel more instantly aware that the task has finished.

This is now reporting a work product of OPPORTUNITY_DATASET with group=true, which the UI does not know how to link to. The suggested approach for the release is to leave both progress mechanisms enabled, but temporarily stop setting the workProduct on the backend so the UI doesn't try to display a broken link.

Note also that OpportunityDatasets have a sourceId field which should probably be dataSourceId to match AggregationArea, but I'm not changing this now because it might require a database migration.