LG-7470 | Very crude start of background job [WIP]#7085

Closed
n1zyy wants to merge 3 commits into main from mattw/LG-7470_barebones

Conversation

@n1zyy
Contributor

@n1zyy n1zyy commented Oct 4, 2022

🎫 Ticket

LG-7470 TK

🛠 Summary of changes

This is currently so barebones as to be useless, but I wanted to get something up.

📜 Testing Plan

Provide a checklist of steps to confirm the changes.

  • Step 1
  • Step 2
  • Step 3

👀 Screenshots

If relevant, include a screenshot or screen capture of the changes.

Before:
After:

🚀 Notes for Deployment

Include any special instructions for deployment.

key = key(timestamp)
redis_pool.with do |client|
  # see client.hscan which refs https://redis.io/commands/scan/
  # but it's... a lil' bit weird.
Contributor Author


Arguably, this change would be the big win, in allowing us to read and write in batches rather than fetching everything into memory. Will need to play around with this.

Contributor Author


I think a good task for tomorrow is to write something quick to bulk-generate events. We're going to want something like that for generating a large sample file for the IRS as well.

My gut feeling is that hgetall won't be a great choice with a huge number of events, since they'll all be loaded into memory at once. I like the idea of being able to hscan and write them to the file as we fetch them.

It is occurring to me tonight that we've talked about doing this "background processing" to fetch all the events, put them in a flat file, and then store that in Redis rather than S3. Is that actually saving us anything over what we have now? All the more reason to generate a ton of events and bang on this.
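To make the batching idea concrete, here is a sketch of the cursor loop that hscan implies (redis-rb also provides hscan_each, which hides the cursor). The stub client below is only an in-memory stand-in so the loop's contract is visible; the key name, batch size, and event-hash layout are assumptions, not taken from this PR.

```ruby
# In-memory stand-in for a Redis client's HSCAN. Real HSCAN cursors are
# opaque server-side tokens, not array offsets; this stub just mimics the
# [next_cursor, pairs] contract, with cursor "0" signalling completion.
class StubRedisClient
  def initialize(hash)
    @pairs = hash.to_a
  end

  def hscan(_key, cursor, count: 10)
    start = cursor.to_i
    batch = @pairs[start, count] || []
    next_cursor = start + count >= @pairs.size ? '0' : (start + count).to_s
    [next_cursor, batch]
  end
end

# Walk the hash in batches, yielding each field/value pair, so events
# stream through in chunks instead of landing in memory all at once.
def each_event_in_batches(client, key, batch_size: 100)
  cursor = '0'
  loop do
    cursor, pairs = client.hscan(key, cursor, count: batch_size)
    pairs.each { |field, value| yield field, value }
    break if cursor == '0'
  end
end
```

With a real client, the same loop would let the job write each batch to the output file before fetching the next one.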

# Get this to run at the early part of the hour

def perform(subject_timestamp)
  puts 'Howdy, partner'
Contributor Author


Obviously this should not be merged; just an easy way to see when this runs.
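On the "run at the early part of the hour" note: a job firing early in hour N would presumably export the previous hour's bucket, so subject_timestamp could be derived by flooring the current time. A sketch; hourly bucket granularity is my assumption, not something stated in the PR:

```ruby
# Given "now", return the start of the previous hour -- the bucket a job
# running early in hour N would export. Assumes hourly buckets keyed by
# the hour's start time.
def previous_hour_bucket(now)
  floor = Time.at((now.to_i / 3600) * 3600).utc
  floor - 3600
end
```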

file.write event
end
file.close
file.path
Contributor Author


Commenting in case someone else ends up picking this up.

Right now, this reads all the events for a given hour out of Redis, writes them to a temp file, and then returns the file path (but nothing is looking at the return value, except me in the console). This is not very useful.

We want to figure out where to store this. We've discussed either S3 or Redis. With multiple servers behind a load balancer, and instances periodically recycled, we can't rely on saving the file locally.

We also want to apply encryption and gzip on this. See the fetch_events rake task, at least for the encryption bit.

The other bit of work on this is, once all that's working, change the endpoint to return that file, wherever it's stored, rather than generating it on the fly.
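For the gzip half of that, a minimal stdlib-only sketch that keeps the write-then-return-path shape of the code above (the encryption layer from the fetch_events rake task is not reproduced here; the directory and filename are made up):

```ruby
require 'tmpdir'
require 'zlib'

# Write events to a gzipped file, one per line, and return its path.
# Compression happens as we write, so pairing this with a batched read
# (hscan) avoids holding the whole event set in memory.
def write_events_gzipped(events, dir: Dir.mktmpdir)
  path = File.join(dir, 'events.gz')
  Zlib::GzipWriter.open(path) do |gz|
    events.each { |event| gz.puts event }
  end
  path
end
```

Whatever ends up consuming the file (S3 upload, Redis blob, or the endpoint itself) would read it back with Zlib::GzipReader.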

@zachmargolis
Contributor

Can/should we close this in favor of #7259?

@Rwolfe-Nava
Contributor

> Can/should we close this in favor of #7259?

I think so. Matt is aware that I took this on.

@n1zyy n1zyy deleted the mattw/LG-7470_barebones branch October 11, 2023 18:54