Missing and reordered verses in Serval drafts #397

Closed
bhartmoore opened this issue May 30, 2024 · 19 comments

@bhartmoore
Collaborator

We've had multiple reports of missing or re-ordered verses in Serval drafts over the past few days. It's possible this should be two issues, but I'm combining them for now since all the reports have come in since Friday.

These chapter-final verses are missing from 2 projects, but were present in the source:
Hyolmo: Exo 40:38; Esther 10:3; Jonah 4:11
SCP_U: Esther 10:3; Jonah 4:11

Peter reports here that he is investigating missing verses for a PNG project.

The image below shared by James Cuénod shows wrongly-ordered verses in the kyu project. (This issue was also reported in SILNLP drafts for the CTB project this week; possibly unrelated since they were not run through Serval.)
image (4)
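
For anyone triaging similar reports, here is a minimal sketch of one way to check a draft against its source for both symptoms described above (dropped verses and wrongly ordered verses). It assumes both texts are plain USFM using the standard `\c` and `\v` markers; the function names `verse_refs` and `compare` are hypothetical and are not part of Serval, Machine, or silnlp. Verse ranges such as `\v 1-2` are treated as their first verse only.

```python
import re

# Matches USFM chapter (\c N) and verse (\v N) markers.
# For a verse range such as "\v 1-2", only the first number is captured.
MARKER = re.compile(r"\\(c|v)\s+(\d+)")


def verse_refs(usfm: str) -> list[tuple[int, int]]:
    """Return (chapter, verse) pairs in the order they appear in the text."""
    refs = []
    chapter = 0  # stays 0 until the first \c marker is seen
    for kind, num in MARKER.findall(usfm):
        if kind == "c":
            chapter = int(num)
        else:
            refs.append((chapter, int(num)))
    return refs


def compare(source: str, draft: str):
    """Return (missing, out_of_order) verse references for a draft.

    missing: references present in the source but absent from the draft.
    out_of_order: draft references that sort before their predecessor.
    """
    src = verse_refs(source)
    dst = verse_refs(draft)
    dst_set = set(dst)
    missing = [ref for ref in src if ref not in dst_set]
    out_of_order = [cur for prev, cur in zip(dst, dst[1:]) if cur < prev]
    return missing, out_of_order
```

Calling `compare(source_text, draft_text)` on one book at a time yields two lists that can be pasted into a report. This only inspects markers, so it will not notice a verse whose marker is present but whose text is empty.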

@bhartmoore bhartmoore added the bug Something isn't working label May 30, 2024
@johnml1135 johnml1135 assigned Enkidu93 and unassigned johnml1135 Jun 4, 2024
@Enkidu93
Collaborator

Enkidu93 commented Jun 7, 2024

@bhartmoore @pmachapman For those that were drafted through SF/Serval, would it be possible to have the engine and build IDs? I'd like to make sure we have a complete list of those experiencing issues so I can verify that any solution we come up with solves all existing issues. Also, could you send me any projects (and, if possible, the file paths to the experiments themselves) for which silnlp had the same issue? I may be wrong, but I believe silnlp is using alignment code from machine.py, which mirrors the code in Machine (C#) exactly, so it could very well be a single problem across both. Thank you!

@bhartmoore
Collaborator Author

bhartmoore commented Jun 7, 2024

Thanks, @Enkidu93. I found two Hyolmo (scp) builds on the day Mike B reported finding the error. Both projects he reported it for were Hyolmo, but one is their "AI" project, so these most likely represent the two problematic builds. I am not sure how we'd access the drafts to verify missing verses.
'engine_id': '65a23e93ea0c57adad126c41'
'build_id': '664f08e1d04bc69c89cadcab'
and
'engine_id': '6608bfeba20eb39853a7b37a'
'build_id': '664f09b7d04bc69c89cae0a7'

@bhartmoore
Collaborator Author

@Enkidu93 The silnlp draft that was pointed out was GEN c. 49 in this folder: "S:\MT\experiments\FT-Bod\NLLB.1.3B.bod_NTB-bo_CTB\infer\8000\NTB_2024"

The experiment can be found on the same path in "S:\MT\experiments\FT-Bod\NLLB.1.3B.bod_NTB-bo_CTB"

Screenshot of the chapter in Notepad++:
image

@Enkidu93
Collaborator

Enkidu93 commented Jun 7, 2024

Thank you! This is very, very helpful!

@johnml1135
Collaborator

Note that this issue should be resolved by sillsdev/machine#204 in release 1.4.5. It is on QA right now and will hopefully be on production by the end of the week.

@ddaspit
Contributor

ddaspit commented Jun 10, 2024

Have we tested these projects to see if they are fixed?

@Enkidu93
Collaborator

I'm not sure this will have fixed all the issues we're seeing, particularly in silnlp. This didn't involve changes to machine.py (at least not yet), did it? Do those changes just need to be ported over?

@Enkidu93
Collaborator

Just to make sure this is recorded somewhere: There's also a bug where introductory material is not being translated. @bhartmoore, is there an example you could send my way?

@bhartmoore
Collaborator Author

@Enkidu93 I believe this should affect anything translated since the most recent Serval update went live in SF. An example would be the Pentateuch translated from NIV11 to French in project SFDF. Here is the handy new "Admin" view of the most recent build for that project from Scripture Forge, but let me know if you need more/other information.

Diagnostic Information
Build Id: 6657975aec58cd36956de963
Corpora Ids: 6657975aec58cd36956de962
Date Finished: 2024-05-30T03:35:17.134+00:00
Message: Completed
Percent Completed: 1
Revision: 2263
Queue Depth: 0
State: COMPLETED
Step: 20000
Translation Engine Id: 66579759ec58cd36956de95f

@johnml1135
Collaborator

@bhartmoore - the most recent fixes should be in Serval Live as of Wed. night. Can this project be re-run to see if they are still getting the same results?

@bhartmoore
Collaborator Author

@johnml1135 Great! I'm guessing you mean the SF builds and not the silnlp one, correct? I'll need to ask Mike Bacon if he can re-request drafts for this team. His were missing chapter-final verses. We'd also want to check with Peter Chapman or James Cuénod to see if they can re-run the SF projects that saw wrongly-ordered verses.

@ddaspit
Contributor

ddaspit commented Jun 14, 2024

Yes, these changes only affect SF/Serval. We still need to replace the USFM parser in silnlp.

@bhartmoore
Collaborator Author

bhartmoore commented Jun 17, 2024

User Mike Bacon reports that final verses are still missing from Serval-via-SF drafts for the scp_U project generated on 6/16.

  • Peter reports that his comments were about verses missing from the target, so a different issue.
  • I've reached out to James Cuénod to ask whether he can have the team that saw both dropped and re-ordered verses regenerate their drafts. That team had reported that the missing verses were different based on which source text they used.

I will open a separate issue for the Intro and first section heading to be included in drafts.

Image

@bhartmoore
Collaborator Author

Just saw issue #408 opened by Pchapman which sounds very much like what Mike Bacon is facing.

@Enkidu93
Collaborator

@johnml1135 @ddaspit Back to work. Where does this stand? Have some of these issues been addressed elsewhere? Did you, @johnml1135, verify that any were fixed by the previous changes?

@ddaspit
Contributor

ddaspit commented Jul 1, 2024

We have fixed the missing last verse issue in #408. Once we finish #405, SF should be able to fix the rest of this issue.

@Enkidu93
Collaborator

Enkidu93 commented Jul 1, 2024

> We have fixed the missing last verse issue in #408. Once we finish #405, SF should be able to fix the rest of this issue.

Excellent! Thank you.

@Enkidu93
Collaborator

@johnml1135 Should this be closed since the only remaining work is on the SF side?

@johnml1135
Collaborator

Yes, I believe this is fully resolved.
