Colexification Across the Globe

Author: Teresa Davison

Description: An investigation into the colexification of concepts within language families or geographical areas, based on semantic field and ontological category, using machine learning methods such as Naive Bayes, Random Forest classification, SVC, and K-means clustering.

Data: The data was sourced from the SQL database underlying the Database of Cross-Linguistic Colexifications (CLICS3).
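
For readers new to the term: two concepts are colexified in a language when a single word form expresses both. The sketch below illustrates the idea on a tiny hand-written table. The records, the `colexified_pairs` helper, and the (language, concept, form) layout are assumptions made for illustration only; they are not taken from the CLICS3 database or from the project notebooks.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records in (language, concept, form) layout.
# Illustrative only; not drawn from CLICS3.
RECORDS = [
    ("Russian", "HAND", "ruka"),
    ("Russian", "ARM", "ruka"),
    ("Russian", "TREE", "derevo"),
    ("German", "HAND", "Hand"),
    ("German", "ARM", "Arm"),
]


def colexified_pairs(records):
    """Map each colexified concept pair to the set of languages showing it."""
    # Group the concepts expressed by each form within each language.
    concepts_by_form = defaultdict(set)
    for language, concept, form in records:
        concepts_by_form[(language, form)].add(concept)

    # Every pair of concepts sharing a form in one language is a colexification.
    languages_by_pair = defaultdict(set)
    for (language, _form), concepts in concepts_by_form.items():
        for pair in combinations(sorted(concepts), 2):
            languages_by_pair[pair].add(language)
    return dict(languages_by_pair)


pairs = colexified_pairs(RECORDS)
print(pairs)
```

In this toy table only Russian uses one form for both HAND and ARM, so that is the only pair reported. Counts of this kind, aggregated per language family or area, are one plausible starting point for the classification and clustering features described above.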

Directory:

  • final_report.md: final synthesis of the project process, results, and analysis.
  • notebooks: folder of Jupyter notebooks showing the progress of the project at certain intervals.
  • progress_report.md: summaries of what was accomplished in each of the notebook files.
  • project_plan.md: original motivation for the project and plan for analysis.
  • LICENSE.md: licensing information for the project.
  • LING1340FinalPresentation.pdf: PDF version of the final presentation slides.
  • data-samples: folder of sampled data for each of the relevant dataframes in CSV and pickle (.pkl) format, plus CSV feature dataframes created for Russian, Tamil, and German during the development phase.
  • figures: folder of the figures, graphs, and images used in the final report.

Guest book: Please follow this link to visit my guestbook!