Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backend] Duplicate Datapoints in Data Upload API upload response #31

Closed
terryttsai opened this issue Sep 20, 2022 · 1 comment
Closed
Assignees

Comments

@terryttsai
Copy link
Contributor

Lili noticed that the count of overwrites don't add up correctly, and when I dug into this, I found that sometimes we receive duplicate datapoints in the datapoints array in the Data Upload API response.

One example of this is when uploading the prison_warnings sheet, under Admissions you see 39 overwrites instead of 26 (or some other numbers).

I see that all the aggregate values are given twice.

Screen Shot 2022-09-20 at 4.31.15 PM.png

@lilidworkin
Copy link
Collaborator

Thank you SO much for writing up this super detailed issue, @terryttsai !

helperbot-recidiviz pushed a commit to Recidiviz/pulse-data that referenced this issue Oct 18, 2022
…iviz-data#15759)

## Description of the change

This PR does a bunch of things (sorry, I'm breaking my cardinal rule!):

### 1) Gets rid of BulkUploadSheetNameError and instead just directly
adds invalid sheet names to a list
Test plan: Made sure Invalid Sheet Name error still appears as usual:
<img width="400" alt="Screen Shot 2022-10-02 at 8 27 07 AM"
src="https://user-images.githubusercontent.com/99899591/193453878-ed95c0b7-0f0d-4ff3-bb6a-43b32a0bbc20.png">

### 2) Makes it possible for multiple errors to be thrown on the same
sheet (e.g. before, if you had multiple invalid values, only one error
would be thrown; now all of them will be)
Test plan: Sheet with multiple errors now shows both, with different
time ranges:
<img width="532" alt="Screen Shot 2022-10-02 at 8 27 59 AM"
src="https://user-images.githubusercontent.com/99899591/193453910-dff5baca-866f-4ad7-aec5-daca8e12864c.png">

### 3) Fixes an issue with the infer aggregate value logic, to make sure
values aren't being overwritten incorrectly
Test plan: Uploaded defense_errors followed by defense_warnings twice,
and couldn't replicate bad behavior.

### 4) If total value are already in the DB, we were previously showing
them on the Review page, which was jarring. Now we don't infer them nor
do we show them on the Review page.
Test plan: Total staff values in DB; uploaded a spreadsheet with staff
by type, but not total staff. Total staff in DB is unchanged.

https://user-images.githubusercontent.com/99899591/193454158-b23690f1-9de2-4d04-8655-9e4e881ef8fb.mov

### 5) Fix an issue where duplicate datapoints were being returned in
the response
Test plan: This is covered by a unit test, where there were previously
duplicate values.

### 6) Fix a bug where the UI would say "X values overwritten" when only
Y < X values were actually getting overwritten.
Ensured number of overwritten values is correct now:

https://user-images.githubusercontent.com/99899591/193454283-ea08d4b4-ea8b-4142-a97a-b51d9ddf14e2.mov

## Type of change

> All pull requests must have at least one of the following labels
applied (otherwise the PR will fail):

| Label | Description |
|-----------------------------
|-----------------------------------------------------------------------------------------------------------
|
| Type: Bug | non-breaking change that fixes an issue |
| Type: Feature | non-breaking change that adds functionality |
| Type: Breaking Change | fix or feature that would cause existing
functionality to not work as expected |
| Type: Non-breaking refactor | change addresses some tech debt item or
prepares for a later change, but does not change functionality |
| Type: Configuration Change | adjusts configuration to achieve some end
related to functionality, development, performance, or security |
| Type: Dependency Upgrade | upgrades a project dependency - these
changes are not included in release notes |

## Related issues

Closes Recidiviz/justice-counts#31

## Checklists

### Development

**This box MUST be checked by the submitter prior to merging**:
- [ ] **Double- and triple-checked that there is no Personally
Identifiable Information (PII) being mistakenly added in this pull
request**

These boxes should be checked by the submitter prior to merging:
- [ ] Tests have been written to cover the code changed/added as part of
this pull request

### Code review

These boxes should be checked by reviewers prior to merging:

- [ ] This pull request has a descriptive title and information useful
to a reviewer
- [ ] This pull request has been moved out of a Draft state, has no
"Work In Progress" label, and has assigned reviewers
- [ ] Potential security implications or infrastructural changes have
been considered, if relevant

GitOrigin-RevId: 5d811ff00a68188d6d66c8bc30d3a0dc5fb03ee2
helperbot-recidiviz pushed a commit to Recidiviz/pulse-data that referenced this issue Apr 19, 2023
…iviz-data#15759)

## Description of the change

This PR does a bunch of things (sorry, I'm breaking my cardinal rule!):

### 1) Gets rid of BulkUploadSheetNameError and instead just directly
adds invalid sheet names to a list
Test plan: Made sure Invalid Sheet Name error still appears as usual:
<img width="400" alt="Screen Shot 2022-10-02 at 8 27 07 AM"
src="https://user-images.githubusercontent.com/99899591/193453878-ed95c0b7-0f0d-4ff3-bb6a-43b32a0bbc20.png">

### 2) Makes it possible for multiple errors to be thrown on the same
sheet (e.g. before, if you had multiple invalid values, only one error
would be thrown; now all of them will be)
Test plan: Sheet with multiple errors now shows both, with different
time ranges:
<img width="532" alt="Screen Shot 2022-10-02 at 8 27 59 AM"
src="https://user-images.githubusercontent.com/99899591/193453910-dff5baca-866f-4ad7-aec5-daca8e12864c.png">

### 3) Fixes an issue with the infer aggregate value logic, to make sure
values aren't being overwritten incorrectly
Test plan: Uploaded defense_errors followed by defense_warnings twice,
and couldn't replicate bad behavior.

### 4) If total value are already in the DB, we were previously showing
them on the Review page, which was jarring. Now we don't infer them nor
do we show them on the Review page.
Test plan: Total staff values in DB; uploaded a spreadsheet with staff
by type, but not total staff. Total staff in DB is unchanged.

https://user-images.githubusercontent.com/99899591/193454158-b23690f1-9de2-4d04-8655-9e4e881ef8fb.mov

### 5) Fix an issue where duplicate datapoints were being returned in
the response
Test plan: This is covered by a unit test, where there were previously
duplicate values.

### 6) Fix a bug where the UI would say "X values overwritten" when only
Y < X values were actually getting overwritten.
Ensured number of overwritten values is correct now:

https://user-images.githubusercontent.com/99899591/193454283-ea08d4b4-ea8b-4142-a97a-b51d9ddf14e2.mov

## Type of change

> All pull requests must have at least one of the following labels
applied (otherwise the PR will fail):

| Label | Description |
|-----------------------------
|-----------------------------------------------------------------------------------------------------------
|
| Type: Bug | non-breaking change that fixes an issue |
| Type: Feature | non-breaking change that adds functionality |
| Type: Breaking Change | fix or feature that would cause existing
functionality to not work as expected |
| Type: Non-breaking refactor | change addresses some tech debt item or
prepares for a later change, but does not change functionality |
| Type: Configuration Change | adjusts configuration to achieve some end
related to functionality, development, performance, or security |
| Type: Dependency Upgrade | upgrades a project dependency - these
changes are not included in release notes |

## Related issues

Closes Recidiviz/justice-counts#31

## Checklists

### Development

**This box MUST be checked by the submitter prior to merging**:
- [ ] **Double- and triple-checked that there is no Personally
Identifiable Information (PII) being mistakenly added in this pull
request**

These boxes should be checked by the submitter prior to merging:
- [ ] Tests have been written to cover the code changed/added as part of
this pull request

### Code review

These boxes should be checked by reviewers prior to merging:

- [ ] This pull request has a descriptive title and information useful
to a reviewer
- [ ] This pull request has been moved out of a Draft state, has no
"Work In Progress" label, and has assigned reviewers
- [ ] Potential security implications or infrastructural changes have
been considered, if relevant

GitOrigin-RevId: de8b925f6bcd9d00cb71d7e962f11ef0ffe75552
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants