Common Japanese Morphemes in News

This repository showcases visualizations of, and the code base for analyzing, the most common Japanese morphemes that appear in news articles.

Morphemes are the smallest units of meaning in a language.

Data was collected from the NHK News website (https://www3.nhk.or.jp).

Data collection period: 25 May 2024 - 4 July 2024

Status

Common Japanese Morphemes in News: 🎉 Project Completed 🎉

CodeQL

Scraper Test

Daily News Scraper

Latest Update

Common Japanese Morphemes in News Latest Update: 30 July 2024

Visualizations

Common Japanese Morphemes in News:

Data

Located in the data folder.

Contains Japanese morpheme data collected from the NHK News website.

Total morphemes collected: 1,015,285

Contains the URLs of the news articles from which the morphemes were collected.

Total URLs collected: 896

URLs in this file are stored as paths; append each one to https://www3.nhk.or.jp to open the source article.

For example: https://www3.nhk.or.jp/news/html/20240523/k10014458551000.html
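Rebuilding a full link from a stored path could look like the minimal sketch below. It assumes the stored value is a path such as `/news/html/20240523/k10014458551000.html`; the names `BASE_URL` and `to_full_url` are hypothetical and are not taken from this repository.

```python
from urllib.parse import urljoin

# Base address of the NHK site, as given in this README.
BASE_URL = "https://www3.nhk.or.jp"


def to_full_url(path: str) -> str:
    """Join a stored path onto the NHK base URL (hypothetical helper)."""
    return urljoin(BASE_URL, path)


if __name__ == "__main__":
    # Path taken from the example URL in this README.
    print(to_full_url("/news/html/20240523/k10014458551000.html"))
```

`urljoin` is used instead of plain string concatenation so that a path with or without a leading slash resolves to the same address.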

How to Web-Scrape Japanese News to Extract Japanese Morphemes

Data is scraped from NHK News daily; the scraper run is automated with GitHub Actions.
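Once article text has been scraped, the counting step might be sketched as follows. This is an illustration under stated assumptions, not the repository's actual code: the README does not say which morphological analyzer is used, so the tokenizer is passed in as a parameter, and `whitespace_tokenize` is a hypothetical stand-in that only works on text already split by spaces. In practice a Japanese morphological analyzer would take its place.

```python
from collections import Counter
from typing import Callable, Iterable, List


def count_morphemes(
    texts: Iterable[str],
    tokenize: Callable[[str], List[str]],
) -> Counter:
    """Tally how often each morpheme occurs across all texts.

    `tokenize` is an assumption: any function that splits a string
    into a list of morphemes can be supplied here.
    """
    counts: Counter = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer for pre-segmented sample text."""
    return text.split()


if __name__ == "__main__":
    # Made-up, pre-segmented sample sentences (not project data).
    sample = ["今日 は 晴れ です", "明日 は 雨 です"]
    counts = count_morphemes(sample, whitespace_tokenize)
    for morpheme, frequency in counts.most_common(3):
        print(morpheme, frequency)
```

Passing the tokenizer in as an argument keeps the counting logic independent of whichever analyzer the scraper actually uses.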