Setup Workable XML Importer & Amazon S3 Integration #200

Open
5 tasks
Ches-ctrl opened this issue Jun 19, 2024 · 0 comments · May be fixed by #213

Ches-ctrl commented Jun 19, 2024

Instructions

  • Set up the XML importer from Workable's XML feed
  • Current code: Services/Importer/XML/workable.rb (also a background job)
  • Docs: https://help.workable.com/hc/en-us/articles/4420464031767-Utilizing-the-XML-Job-Feed
  • The importer needs to
    • Request the import
      • NB. The Workable XML feed is rate-sensitive: too many requests result in 403 Forbidden errors
    • Stream and store the XML file locally & in Amazon S3
      • Stream the XML file using Faraday (gem already installed)
      • Use a block to parse the XML systematically
      • NB. The feed includes urls of the format: https://apply.workable.com/j/9A5B371BA0 (i.e. without the ats_identifier)
      • Build an S3 uploader so that the XML file is also stored in Amazon S3 (some code already exists in the repo)
      • Chat with Charlie about the best folder structure for storing it
    • Find the redirected URLs
      • i.e. URLs with the ats_identifier of the company e.g. papier - https://apply.workable.com/papier/j/67BE15191B/
      • There is often an interim staging page that must be waited for. Check for the interim response code and, if required, use a different IP address to bypass it
      • This may require a proxy, e.g. SCRAPE_UP, which is included in the repo
    • Save the redirected URLs locally
      • Please check in with Charlie at this point, once you have the list of URLs with ats_identifiers

    • Pass the URL to URL::CreateJobFromUrl
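
For the "Request the import" step, one possible way to cope with the 403 sensitivity is to retry with an increasing delay. This is only a sketch: `with_backoff`, its parameters and the delay values are hypothetical, and it assumes the block returns an object that responds to `status` (as a Faraday response does).

```ruby
# Hypothetical helper: re-runs the block while it returns a 403 response,
# doubling the wait between attempts (2s, 4s, 8s, ... by default).
# `sleeper` is injectable so the helper can be exercised without real waiting.
def with_backoff(max_attempts: 4, base_delay: 2, sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do |attempt|
    response = yield
    return response unless response.status.to_i == 403

    # No point in sleeping after the final attempt
    sleeper.call(base_delay * (2**attempt)) if attempt < max_attempts - 1
  end
  raise "Still receiving 403 Forbidden after #{max_attempts} attempts"
end
```

Usage would look something like `with_backoff { connection.get(feed_url) }`, where `connection` and `feed_url` are placeholders for whatever the importer already uses.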

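For the "Find the redirected URLs" step, a minimal sketch using only the standard library is below. It assumes the short URL answers with a plain HTTP 3xx redirect; the staging page and proxy handling mentioned above are not covered. `resolve_redirect` and `ats_identifier_from` are hypothetical names, not existing code in the repo.

```ruby
require 'net/http'
require 'uri'

# Follows Location headers until a non-redirect response is reached and
# returns the final URL. `fetch` is injectable so a proxy-backed client
# (or a stub) can be swapped in.
def resolve_redirect(url, limit: 5, fetch: ->(uri) { Net::HTTP.get_response(uri) })
  limit.times do
    response = fetch.call(URI.parse(url))
    return url unless response.is_a?(Net::HTTPRedirection)

    # Location may be relative, so resolve it against the current URL
    url = URI.join(url, response['location']).to_s
  end
  raise "Too many redirects for #{url}"
end

# Returns the ats_identifier from a URL such as
# https://apply.workable.com/papier/j/67BE15191B/, or nil for short URLs
# such as https://apply.workable.com/j/9A5B371BA0.
def ats_identifier_from(url)
  segments = URI.parse(url).path.split('/').reject(&:empty?)
  segments.length == 3 && segments[1] == 'j' ? segments[0] : nil
end
```

A URL for which `ats_identifier_from` returns nil after resolution probably got stuck on the staging page and would need the retry or proxy route.
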
Notes

  • Request contributor access on the repo
  • Set up a draft PR in the repo to submit your code

Acceptance Criteria

  • Can use rake xml:workable to import all the jobs from the XML feed
  • Must count the number of URLs in the XML
  • Can save the XML file locally
  • Can save the redirect URLs in a local JSON file (format TBD)
  • Can create jobs for all the URLs
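
As a rough illustration of the counting and JSON criteria, the sketch below streams through the XML with REXML (a stand-in here; the repo may prefer another parser) and writes a short-URL to redirected-URL map. The `<url>` element name and the JSON shape are assumptions, since the feed layout is not described here and the format is still TBD.

```ruby
require 'json'
require 'rexml/document'
require 'rexml/streamlistener'

# Stream listener that collects the text of every <url> element.
# The element name is an assumption about the feed's layout.
class UrlCollector
  include REXML::StreamListener

  attr_reader :urls

  def initialize
    @urls = []
    @in_url = false
    @buffer = +''
  end

  def tag_start(name, _attrs)
    return unless name == 'url'

    @in_url = true
    @buffer = +''
  end

  # Text may arrive as plain text or as CDATA, so both are buffered
  def text(value)
    @buffer << value if @in_url
  end

  def cdata(value)
    @buffer << value if @in_url
  end

  def tag_end(name)
    return unless name == 'url'

    @urls << @buffer.strip
    @in_url = false
  end
end

# Accepts a String or an IO, so a locally saved file can be streamed.
def extract_urls(source)
  collector = UrlCollector.new
  REXML::Document.parse_stream(source, collector)
  collector.urls
end

# Hypothetical JSON shape: { "short url" => "redirected url" }
def write_redirects(path, mapping)
  File.write(path, JSON.pretty_generate(mapping))
end
```

With a saved feed, `File.open(path) { |io| extract_urls(io).length }` would give the URL count.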