Korean input priority for card names and text #26

Open
kevinlul opened this issue Feb 8, 2023 · 23 comments

Comments

@kevinlul
Contributor

kevinlul commented Feb 8, 2023

  1. https://github.com/DawnbrandBots/yaml-yugi-ko/blob/master/overrides.tsv (ultimately, this should not be needed)
  2. Yugipedia, if it contains ruby text
  3. Official database contents otherwise preferred

Highlight discrepancies between Yugipedia and the official database and correct Yugipedia
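
A minimal sketch of the priority order above, in Python. The function name, the three `card ID -> Korean name` dictionaries, the `<ruby>` substring check for detecting ruby text, and the final fallback to Yugipedia when the official database has no entry are all assumptions for illustration, not the project's actual merge code.

```python
from typing import Optional


def pick_korean_name(
    card_id: str,
    overrides: dict[str, str],
    yugipedia: dict[str, str],
    official: dict[str, str],
) -> Optional[str]:
    """Return the Korean name for a card following the priority order above.

    All three mappings are hypothetical: card ID -> Korean name.
    """
    # 1. A manual override always wins.
    if card_id in overrides:
        return overrides[card_id]
    # 2. Yugipedia is preferred only when its entry carries ruby text.
    #    Assumption: ruby text is marked up with an HTML <ruby> element.
    wiki_name = yugipedia.get(card_id)
    if wiki_name is not None and "<ruby>" in wiki_name:
        return wiki_name
    # 3. Otherwise prefer the official database.
    #    Assumption: fall back to Yugipedia if the official entry is missing.
    return official.get(card_id, wiki_name)
```

Any card where step 3 returns an official value that differs from the Yugipedia value is a candidate for the discrepancy list mentioned above.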

kevinlul added a commit that referenced this issue Feb 19, 2023
kevinlul added a commit that referenced this issue Feb 19, 2023
kevinlul added a commit that referenced this issue Feb 19, 2023
@kevinlul
Contributor Author

Need to wrap in LiteralScalarString

@kevinlul
Contributor Author

Download the logs from https://github.com/DawnbrandBots/yaml-yugi/actions/runs/4218732281/jobs/7323445739 and take action on override items as appropriate.

@kevinlul
Contributor Author

kevinlul commented Feb 20, 2023

984 overrides are unnecessary: unnecessary.log (cc: @Ice-Pendragon)

@kevinlul
Contributor Author

680 Yugipedia-official name discrepancies. Many are differences in spacing or dash used, but some are typographic: discrepancy.log

@kevinlul
Contributor Author

4616 card text discrepancies, though these could be due to all sorts of factors like whitespace: text.log
99 for Pendulum text, ditto: pendulum.log
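
To separate cosmetic differences (spacing, dash variants) from real typographic ones in logs like these, something along these lines could help. This is only a sketch: the function name, the labels, and the set of dash characters treated as equivalent are assumptions, not part of the existing workflow.

```python
import re

# Dash-like characters folded to ASCII hyphen-minus (an assumed, non-exhaustive set).
_DASH_RE = re.compile("[\u2010\u2011\u2012\u2013\u2014\u2015\u2212\uff0d]")
# \s on str patterns matches Unicode whitespace, including the ideographic space.
_SPACE_RE = re.compile(r"\s+")


def _strip_spaces(value: str) -> str:
    return _SPACE_RE.sub("", value)


def _fold(value: str) -> str:
    return _DASH_RE.sub("-", _strip_spaces(value))


def classify_discrepancy(wiki: str, official: str) -> str:
    """Label a pair of strings as 'identical', 'spacing', 'dash', or 'other'."""
    if wiki == official:
        return "identical"
    # Whitespace alone explains the difference.
    if _strip_spaces(wiki) == _strip_spaces(official):
        return "spacing"
    # Dash variants (possibly combined with spacing) explain the difference.
    if _fold(wiki) == _fold(official):
        return "dash"
    # Anything left needs a human to look at it.
    return "other"
```

Only the pairs labelled `other` would need manual review.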

@Ice-Pendragon

> Need to wrap in LiteralScalarString

Now it looks great. Thanks!

@kevinlul
Contributor Author

We need to keep refining this, though, or improve the quality on Yugipedia, as errata don't get entered in the Korean database :/
e.g. https://yugipedia.com/wiki/Card_Errata:Evolutionary_Bridge

@kevinlul
Contributor Author

Per 8142872 and https://github.com/DawnbrandBots/yaml-yugi/actions/runs/5568150780/jobs/10170569310#step:6:7, we can add an automated workflow to check for missing Korean (and Japanese!) translations.

@kevinlul
Contributor Author

@Ice-Pendragon

> Per 8142872 and https://github.com/DawnbrandBots/yaml-yugi/actions/runs/5568150780/jobs/10170569310#step:6:7, we can add an automated workflow to check for missing Korean (and Japanese!) translations.

That would be great and helpful. Though, I still don't understand what scheduling issue creation is.

@kevinlul
Contributor Author

I created a workflow that identifies any prerelease cards that are missing placeholder IDs (fake passwords). Similarly, I could create one to identify missing translations. Taking it a step further, instead of just logging the cards that are missing content, the workflows could automatically create an issue for the missing cards and assign it to the appropriate people.
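
Roughly, the check could look like the sketch below. The card shape (`{"name": {"en": ..., "ko": ..., "ja": ...}}`), the function names, and the checklist layout of the issue body are assumptions for illustration; the real workflow may be structured differently.

```python
from typing import Iterable


def find_missing_translations(
    cards: Iterable[dict], languages: tuple[str, ...] = ("ko", "ja")
) -> dict[str, list[str]]:
    """Map each language code to the English names of cards lacking a name in it."""
    missing: dict[str, list[str]] = {lang: [] for lang in languages}
    for card in cards:
        names = card.get("name", {})
        for lang in languages:
            # Treat both an absent key and an empty/None value as missing.
            if not names.get(lang):
                missing[lang].append(names.get("en", "(unknown)"))
    return missing


def build_issue_body(missing: dict[str, list[str]]) -> str:
    """Render the report as a Markdown checklist; empty string if nothing is missing."""
    lines: list[str] = []
    for lang, names in missing.items():
        if not names:
            continue
        lines.append(f"Missing `{lang}` translations ({len(names)}):")
        lines.extend(f"- [ ] {name}" for name in names)
        lines.append("")
    return "\n".join(lines).rstrip()
```

A scheduled run would then only open an issue when `build_issue_body` returns a non-empty string, and assign it to whoever handles that language.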

@kevinlul
Contributor Author

Will be using this issue to track work on the proposal: https://github.com/DawnbrandBots/yaml-yugi-ko#proposal

7ab822a
97441fe

@kevinlul
Contributor Author

kevinlul commented Aug 10, 2023

TODO (me):

  • basic validation workflow in yaml-yugi-ko for expected CSV structure
  • new Korean merge logic in job_rush.py (reading the official CSV is optional here)
  • new Korean merge logic in job_ocgtcg.py
  • new script to generate OCG SQLite file
  • new script to generate Rush SQLite file
  • update yaml-yugi-ko to use alternate method to always scrape the entire official database, and extend to Rush
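
For the first item, a basic structural check could be as small as the sketch below. The column names in `EXPECTED_COLUMNS` are placeholders (the real CSV layout is not given in this thread), and the rule that the first column is a unique, non-empty ID is likewise an assumption.

```python
import csv
import io

# Hypothetical header; substitute the real column names of the CSV being checked.
EXPECTED_COLUMNS = ["id", "name", "text"]


def validate_csv(stream, expected=EXPECTED_COLUMNS) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file passed."""
    problems: list[str] = []
    reader = csv.reader(stream)
    header = next(reader, None)
    if header != list(expected):
        problems.append(f"line 1: expected header {list(expected)}, got {header}")
        return problems
    seen: set[str] = set()
    for row in reader:
        line = reader.line_num
        if len(row) != len(expected):
            problems.append(
                f"line {line}: expected {len(expected)} fields, got {len(row)}"
            )
            continue
        # Assumption: the first column is an ID that must be present and unique.
        if not row[0].strip():
            problems.append(f"line {line}: empty id")
        elif row[0] in seen:
            problems.append(f"line {line}: duplicate id {row[0]}")
        seen.add(row[0])
    return problems
```

A workflow step could run this over each CSV and fail the job when the returned list is non-empty.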

@kevinlul
Contributor Author

@Ice-Pendragon the old overrides.tsv will need to be replaced with the more comprehensive ocg-override.csv based on the work done in the Google Sheets.

@Ice-Pendragon

> @Ice-Pendragon the old overrides.tsv will need to be replaced with the more comprehensive ocg-override.csv based on the work done in the Google Sheets.

Got it. Ruby, Omitted Errata, and what else do you need?

@kevinlul
Contributor Author

You can look at the README of that repo to see how I've designed the files to be used, and if there are any problems with this approach.

kevinlul added a commit that referenced this issue Aug 11, 2023
@kevinlul
Contributor Author

You should be able to use the two Rush CSVs now to provide Korean translations for Bastion.
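
One way to consume them, sketched under assumptions: that the two files are an official CSV and an override CSV sharing the same header, that rows are keyed by an `id` column, and that an empty override cell means "keep the official value". None of these details are confirmed in this thread, so check them against the README first.

```python
import csv
import io


def load_translations(official_csv, override_csv, key: str = "id") -> dict[str, dict]:
    """Overlay override rows on official rows, keyed by the (assumed) ID column."""
    merged: dict[str, dict] = {}
    # Later files win, so the override is read last.
    for stream in (official_csv, override_csv):
        for row in csv.DictReader(stream):
            entry = merged.setdefault(row[key], {})
            # Skip empty cells so a blank override does not erase official text.
            entry.update({column: value for column, value in row.items() if value})
    return merged
```

A consumer can then look up a card with `load_translations(...)[card_id]` and read whichever columns it needs.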

@kevinlul
Contributor Author

> update yaml-yugi-ko to use alternate method to always scrape the entire official database

This is complete but needs some cleanup before being committed this evening. After this, scrapes should be much faster and always update all card text.

kevinlul added a commit to DawnbrandBots/yaml-yugi-ko that referenced this issue Aug 12, 2023
kevinlul added a commit to DawnbrandBots/yaml-yugi-ko that referenced this issue Aug 12, 2023
@kevinlul
Contributor Author

@Ice-Pendragon I see you just did DawnbrandBots/yaml-yugi-ko@3cf7050, but actually, like half an hour before that, I just implemented the full scraper as promised, and it worked: DawnbrandBots/yaml-yugi-ko@581f062

So actually, none of the YAML data files in that repository are needed now which kind of makes it csv-yugi-ko rather than yaml-yugi-ko!

@Ice-Pendragon

> @Ice-Pendragon I see you just did DawnbrandBots/yaml-yugi-ko@3cf7050, but actually, like half an hour before that, I just implemented the full scraper as promised, and it worked: DawnbrandBots/yaml-yugi-ko@581f062
>
> So actually, none of the YAML data files in that repository are needed now which kind of makes it csv-yugi-ko rather than yaml-yugi-ko!

Then should the YAML files be deleted? (After backing them up, if you need them.)

@kevinlul
Contributor Author

I can remove them next week after we verify that the new code works well when a new pack is released. There's no need to explicitly back them up since they'll remain in the Git history of that repository.

@kevinlul
Contributor Author

I missed that the run there was actually triggered by you, and the automatic scheduled run only just happened. 😅

@kevinlul
Contributor Author

kevinlul commented Oct 28, 2023
