start

When thinking about resume and JD (job description) matching, how can one approach the problem without NLP/ML knowledge?

One obvious approach is naive word matching: use a pattern matching algorithm to take each skill listed in the resume and search the JD's required skills for a matching pattern. We repeat this for every skill found in the resume.

For example, for a JD with required skills "aws, python, leadership, machine learning", a perfect match would be a resume with skills "aws, python, leadership, machine learning".

But if the resume skills are "python, team leader, support vector machine, cloud computing", the algorithm fails to give the resume a high score against the JD, even though every skill in the resume is relevant: "support vector machine" is a type of "machine learning", "team leader" and "leadership" mean the same thing, and "cloud computing" includes "aws".
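The failure described above can be reproduced with a few lines of Python. This is a minimal sketch, not the code in `naive_word_matching.py`: the function name `naive_match_score` is hypothetical, and it assumes skills arrive as comma-separated strings and that a "match" means an exact, case-insensitive string match.

```python
def naive_match_score(jd_skills, resume_skills):
    """Return the fraction of JD skills that appear verbatim in the resume.

    Assumption: both inputs are comma-separated skill strings, and a
    match is an exact string match after lowercasing and trimming.
    """
    jd = {s.strip().lower() for s in jd_skills.split(",") if s.strip()}
    resume = {s.strip().lower() for s in resume_skills.split(",") if s.strip()}
    if not jd:
        return 0.0
    return len(jd & resume) / len(jd)


jd = "aws, python, leadership, machine learning"
perfect = "aws, python, leadership, machine learning"
related = "python, team leader, support vector machine, cloud computing"

print(naive_match_score(jd, perfect))  # 1.0
print(naive_match_score(jd, related))  # 0.25
```

Only "python" matches literally in the second resume, so it scores 0.25 even though all four of its skills are relevant to the JD.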

So how do we solve this problem?

Let's understand Word2Vec

Understanding Doc2vec

Tutorials

Gensim, Doc2vec: https://radimrehurek.com/gensim/auto_examples/tutorials/run_doc2vec_lee.html#sphx-glr-auto-examples-tutorials-run-doc2vec-lee-py

dataset

job posting dataset - https://data.world/promptcloud/indeed-usa-job-listing-data
resumes corpus - https://github.com/florex/resume_corpus

ToDos

baseline doc2vec model using cosine distance matching
organize resumes_corpus into one json.gz file with {label: "label fileName.lbl", text: "resume text fileName.txt"}
cluster resumes and job descriptions by category
improve resume matching using clustering and doc2vec model
Organize datasets
- resume dataset
- job description
- use the categories from the jd dataset to be applied to the
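The ToDo about packing resumes_corpus into one json.gz file could be sketched as follows. The function names `pack_resume_corpus` and `read_resume_corpus` are hypothetical. The sketch assumes every resume is a `fileName.txt` with a sibling `fileName.lbl` in one flat directory, and that the output is line-delimited JSON (one object per line); the ToDo does not specify the layout, so treat both as assumptions.

```python
import gzip
import json
from pathlib import Path


def pack_resume_corpus(corpus_dir, out_path):
    """Write one {"label": ..., "text": ...} JSON object per line, gzipped.

    Assumption: corpus_dir is flat and holds fileName.txt / fileName.lbl pairs.
    Returns the number of records written.
    """
    corpus_dir = Path(corpus_dir)
    count = 0
    with gzip.open(out_path, "wt", encoding="utf-8") as out:
        for txt_file in sorted(corpus_dir.glob("*.txt")):
            lbl_file = txt_file.with_suffix(".lbl")
            if not lbl_file.exists():
                continue  # skip resumes that have no label file
            record = {
                "label": lbl_file.read_text(encoding="utf-8", errors="replace").strip(),
                "text": txt_file.read_text(encoding="utf-8", errors="replace").strip(),
            }
            out.write(json.dumps(record) + "\n")
            count += 1
    return count


def read_resume_corpus(path):
    """Load the packed file back into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Writing one record per line lets the file be streamed later without loading the whole corpus into memory.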

Findings

Baseline performance:
- Doc2Vec model trained using a single article from (https://www.analyticsvidhya.com/blog/2023/04/apoorvas-journey-of-challenges-and-growth-as-a-data-scientist/)
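The baseline listed in the ToDos matches documents by cosine distance between their vectors. The ranking step can be sketched in plain Python as below. The vectors are made-up toy values standing in for Doc2Vec output, not results from this project, and `cosine_similarity` and `rank_resumes` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(jd_vector, resume_vectors):
    """Return (name, similarity) pairs sorted from most to least similar to the JD."""
    scored = [(name, cosine_similarity(jd_vector, vec))
              for name, vec in resume_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors, invented for illustration only.
jd_vec = [1.0, 0.0, 1.0]
resumes = {
    "resume_a": [1.0, 0.0, 1.0],
    "resume_b": [0.0, 1.0, 0.0],
    "resume_c": [1.0, 1.0, 0.0],
}
ranking = rank_resumes(jd_vec, resumes)
print([name for name, _ in ranking])  # ['resume_a', 'resume_c', 'resume_b']
```

In a real run the vectors would presumably come from the trained model, for example by inferring a vector for each tokenized document, and cosine distance is simply one minus the similarity computed here.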



alemtgetu/resume_job_desc_matching
