Unzip Federal Restrictions File #3869
Comments
Just a note for me to grab a sample and update the ACs to be more complete.
## Federal Restrictions - Unzip ZIP files coming from SFTP

- [x] Adjusted the existing regex to match the federal restrictions file in both `.zip` and `.ZIP` format.
- [x] Used the [adm-zip](https://github.com/cthackers/adm-zip) library to extract the `.zip` file.
- [x] Updated `sftp-integration-base` to use encoding `null` when reading the compressed file only, to avoid data corruption.
- [x] For the federal restrictions integration, the first file inside the downloaded archive is processed, assuming there will always be only one file in the `.zip` archive.

## Technical Investigations and performance findings

### APPROACH 1

Based on the documentation and on testing, the Node.js built-in library Zlib can compress and extract gzip (`.gz`) files, but it does not support the same operations on `.zip` files.

Extracting .zip with Zlib Gunzip (not supported)

![image](https://github.com/user-attachments/assets/4ce66725-8ec0-4377-8983-cc424e0e9e19)
![image](https://github.com/user-attachments/assets/12b4d6c9-e17e-4f65-97cf-22843520e191)

Extracting .gz with Zlib Gunzip (works)

![image](https://github.com/user-attachments/assets/3b485ac8-5c36-4457-9034-fc2ed083316b)

### APPROACH 2 - Third party library

https://github.com/cthackers/adm-zip

Tested code (not the final code)

![image](https://github.com/user-attachments/assets/e815c661-2c8d-411a-9813-e4649ddfc47c)

It also provides a non-blocking method to read data (`getDataAsync`), which works as expected.

![image](https://github.com/user-attachments/assets/3c37594a-b719-43d0-a0de-e35f77fd6c14)

Tested the upload with a 139MB file containing around 140,000 records. The library took 666ms to read the file.

![image](https://github.com/user-attachments/assets/049965e3-2d38-4162-9c4f-7569b1384e33)
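The Approach 1 finding and the `null` encoding change can be illustrated with a minimal sketch that uses only Node.js built-ins. It is not the code from this PR: the helper names (`detectArchiveType`, `canGunzip`, `roundTripAsUtf8`) are hypothetical, and the "zip" buffer is only a fake local file header, not a real archive. It shows that Zlib's gunzip rejects data starting with the ZIP signature, and that decoding compressed bytes as UTF-8 text alters them, which is why the compressed file has to be read as a raw buffer.

```typescript
import { gzipSync, gunzipSync } from "zlib";

// Leading "magic" bytes that identify each container format.
const GZIP_MAGIC = Buffer.from([0x1f, 0x8b]);
const ZIP_MAGIC = Buffer.from([0x50, 0x4b, 0x03, 0x04]); // "PK\x03\x04"

type ArchiveType = "gzip" | "zip" | "unknown";

// Hypothetical helper: classify a downloaded buffer by its first bytes.
function detectArchiveType(data: Buffer): ArchiveType {
  if (data.subarray(0, 2).equals(GZIP_MAGIC)) return "gzip";
  if (data.subarray(0, 4).equals(ZIP_MAGIC)) return "zip";
  return "unknown";
}

// Hypothetical helper: true when Zlib's gunzip can decode the buffer.
function canGunzip(data: Buffer): boolean {
  try {
    gunzipSync(data);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical helper: simulates reading binary data with a text encoding
// (UTF-8) and converting it back to bytes.
function roundTripAsUtf8(data: Buffer): Buffer {
  return Buffer.from(data.toString("utf8"), "utf8");
}

// A real gzip payload produced in memory.
const gz = gzipSync(Buffer.from("sample record\n"));
// Only the start of a ZIP local file header (assumption: enough to show
// that gunzip rejects the signature); this is not a complete archive.
const zipHeader = Buffer.concat([ZIP_MAGIC, Buffer.alloc(26)]);

console.log(detectArchiveType(gz), canGunzip(gz));
console.log(detectArchiveType(zipHeader), canGunzip(zipHeader));
console.log("bytes preserved after UTF-8 round trip:", roundTripAsUtf8(gz).equals(gz));
```

The gzip signature contains the byte `0x8b`, which is not valid UTF-8 on its own, so the text round trip replaces it and the buffer no longer matches the original. Actual `.zip` extraction still needs a library such as adm-zip, as described in Approach 2.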
…obs with more resource demand (#4184)

## Issue

While running the federal restrictions file with a bulk volume of data (139MB file), we noticed that the resources required exceed the maximum limits for both CPU and memory.

![image](https://github.com/user-attachments/assets/6fa5174b-d2d2-41e7-9531-296f78f390e7)
![image](https://github.com/user-attachments/assets/ebf10fea-5a8c-40d2-9d4e-32272b592011)

## Solution

To meet the demand of jobs requiring more resources (currently federal restrictions is the only such job), the CPU and memory limits and requests have been bumped up a little.

## Outcome (Federal Restrictions file 139MB)

![image](https://github.com/user-attachments/assets/7d9be43a-f1d9-4b34-8eb8-8472af561555)
While running the federal restrictions file with a bulk volume of data (139MB file), we noticed that the resources required exceed the maximum limits for both CPU and memory. To meet the demand of jobs requiring more resources (currently federal restrictions is the only such job), the CPU and memory limits and requests have been increased a little.
User Story
The federal restrictions file is passed to SIMS as a zip file. We need to unzip the file and then import the .txt file. Currently, the integration does not unzip the file.
Acceptance Criteria
Technical