Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vx followup #6732

Merged
merged 38 commits into from
Jan 17, 2023
Merged

Vx followup #6732

merged 38 commits into from
Jan 17, 2023

Conversation

normanrz
Copy link
Member

@normanrz normanrz commented Jan 10, 2023

  • Optimizes inserts, when many events are sent in a single request, by grouping them into single database queries
  • Optimizes workflow listing by batching database queries
  • Provides consolidated view of tasks that have been resumed, i.e. combining the information about the chunks of multiple runs
  • Removes some cyclic dependencies in the frontend
  • Adds support for Enumerations in SqlInterpolation (stolen from Use new sql interpolator in more DAOs, clean up #6723)

Steps to test:

  • Run a voxelytics workflow with multiple chunks, interrupt the run and resume
  • Once complete, the workflow report should provide accurate information about all completed chunks

(Please delete unneeded items, merge only when none are left open)

@normanrz normanrz self-assigned this Jan 10, 2023
Copy link
Member

@fm3 fm3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scala Backend mostly LGTM. I added a few comments.

I did not dive deeply into the very complex SQL queries. If you think they should be reviewed in detail, maybe we could schedule a call.

app/utils/sql/SqlInterpolation.scala Outdated Show resolved Hide resolved
app/controllers/VoxelyticsController.scala Outdated Show resolved Hide resolved
app/models/voxelytics/VoxelyticsService.scala Outdated Show resolved Hide resolved
app/models/voxelytics/VoxelyticsService.scala Outdated Show resolved Hide resolved
app/models/voxelytics/VoxelyticsDAO.scala Outdated Show resolved Hide resolved
Copy link
Member

@philippotto philippotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm mostly 👍 only left some comments regarding how the progress is calculated and represented. ignore my comments if you think, the current way makes more sense. I didn't use the new reports heavily yet, so I can only comment coming from an outside perspective.

frontend/javascripts/admin/voxelytics/statistics_tab.tsx Outdated Show resolved Hide resolved
frontend/javascripts/admin/voxelytics/statistics_tab.tsx Outdated Show resolved Hide resolved
frontend/javascripts/admin/voxelytics/task_view.tsx Outdated Show resolved Hide resolved
@normanrz
Copy link
Member Author

@daniel-wer what do you think about @philippotto's comments? I think you have spent the most time with the reports so far.

@daniel-wer
Copy link
Member

In my opinion, the main thing that is interesting to me when looking at a report with a currently running task is:

  • how many chunks are already done, have failed, still need to be computed (whether the done tasks were computed in this specific run or before is not really important, but the distinction could be helpful)
  • what is the ETA

In the workflow list, the progress bar is chunked with different colors. Couldn't the progress bar for a running task also show this distinction? Always show the overall number of chunks as the total, then color the skipped ones gray, the finished ones blue or green and the failed ones red.

When looking at the statistics tab, I'd like to see the aggregated stats for all completed chunks.

@normanrz
Copy link
Member Author

normanrz commented Jan 10, 2023

still need to be computed (whether the done tasks were computed in this specific run or before is not really important, but the distinction could be helpful)

should "remaining" include failed and cancelled chunks?

When looking at the statistics tab, I'd like to see the aggregated stats for all completed chunks.

Should that include running and failed chunks?

@daniel-wer
Copy link
Member

still need to be computed (whether the done tasks were computed in this specific run or before is not really important, but the distinction could be helpful)

should "remaining" include failed and cancelled chunks?

I don't know the distinction between failed and canceled chunks, but think it should contain neither of them. Otherwise, they would be counted twice (once as failed/cancelled and once as remaining) which could be confusing, I think.

When looking at the statistics tab, I'd like to see the aggregated stats for all completed chunks.

Should that include running and failed chunks?

Yes, I would say so. Every bit of information is helpful, especially the RAM consumption of the failed chunks. For the run times it's not that important, but in the end every chunk should only be counted once (even if it failed once and then succeeded in a following run).

@normanrz normanrz requested a review from fm3 January 11, 2023 15:34
@normanrz normanrz mentioned this pull request Jan 12, 2023
1 task
@normanrz
Copy link
Member Author

I now solved the performance issue by restructuring the database: voxelytics_{run,task,chunk}StateChangeEvents are gone in favor or state, beginTime, endTime fields are added to voxelytics_{runs,tasks,chunks}. The API with vx is unchanged.
The forward evolution is straightforward, but lossy. Therefore, I don't know how to do the reverse evolution. I'd argue to not implement a full reverse evolution and just keep the old tables until we know everything works. @fm3 wdyt?

@fm3
Copy link
Member

fm3 commented Jan 13, 2023

@normanrz that sounds fair to me. If the app image is rolled back after we notice that something is wrong, the old tables are filled again by the old code. Will that be okay / create an acceptable state? If so, fine by me. Also, I’d ask you to add a follow-up issue to drop the unused tables in another evolution after a grace period

@normanrz
Copy link
Member Author

Ingesting 120k chunks now only took a few seconds (on my Mac) 🎉 💯
Screenshot 2023-01-16 at 17 50 26

@bulldozer-boy bulldozer-boy bot merged commit b6239d8 into master Jan 17, 2023
@bulldozer-boy bulldozer-boy bot deleted the vx-followup branch January 17, 2023 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants