Add a language field, provide guidance on how to publish in a language other than English #348
Comments
Sharing some guidance I prepared for CoST Thailand: You can publish the value of free-text fields (e.g. In order for your data to be interoperable and compatible with OC4IDS tools and methodologies, you cannot:
The following JSON snippet is valid OC4IDS data:
{
  "id": "1",
  "title": "ตัวอย่างโครงการ",
  "type": "construction"
}
The following JSON snippet is not valid because "การก่อสร้าง" is not a valid code from the ProjectType codelist:
{
  "id": "1",
  "title": "ตัวอย่างโครงการ",
  "type": "การก่อสร้าง"
}
The following JSON snippet is not valid because "ชื่อ" and "พิมพ์" are not valid field names in OC4IDS:
{
  "id": "1",
  "ชื่อ": "ตัวอย่างโครงการ",
  "พิมพ์": "construction"
}
In order to ease access for non-English speakers, you can publish a spreadsheet or CSV file with field and code titles from an OC4IDS translation. Currently, OC4IDS is available in English and Spanish. If you would like to translate the schema to your own language, please contact the OC4IDS Helpdesk.
The following CSV excerpt uses field and code titles from the Spanish translation of OC4IDS:
You can use Flatten Tool to generate a spreadsheet or CSV file with translated field titles. For example, the following command converts the example OC4IDS JSON file to xlsx format using field titles from the Spanish schema:
flatten-tool flatten -s https://standard.open-contracting.org/infrastructure/0.9/es/_downloads/f53c05d8f3cfd5c65a3b33cdf80c5079/project-schema.json -f xlsx --use-titles --root-id=id --root-list-path=projects example.json
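The three JSON snippets earlier in this comment can also be checked mechanically. The following is only a rough sketch, not an OC4IDS tool: `KNOWN_FIELDS`, `PROJECT_TYPE_CODES` and `check_project` are hypothetical names, and both sets contain only the values that appear in the snippets. A real check would load the full schema and the ProjectType codelist instead.

```python
# Hypothetical, minimal stand-ins for the OC4IDS schema and ProjectType codelist.
# Assumption: only the names and codes used in the snippets are listed here.
KNOWN_FIELDS = {"id", "title", "type"}
PROJECT_TYPE_CODES = {"construction"}


def check_project(project):
    """Return a list of human-readable problems found in one project object."""
    problems = []
    # Field names must stay in English, exactly as the schema defines them.
    for name in project:
        if name not in KNOWN_FIELDS:
            problems.append(f"unknown field name: {name}")
    # Codes must come from the codelist; they are not free text.
    if "type" in project and project["type"] not in PROJECT_TYPE_CODES:
        problems.append(f"invalid ProjectType code: {project['type']}")
    return problems


valid = {"id": "1", "title": "ตัวอย่างโครงการ", "type": "construction"}
bad_code = {"id": "1", "title": "ตัวอย่างโครงการ", "type": "การก่อสร้าง"}
bad_names = {"id": "1", "ชื่อ": "ตัวอย่างโครงการ", "พิมพ์": "construction"}

for example in (valid, bad_code, bad_names):
    print(check_project(example))
```

The Thai title passes untouched because free-text values are never compared against anything; only field names and codes are checked.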
CoST Thailand requested a list of which fields can and cannot be translated. I've annotated a copy of the flattened schema with that information. We could consider including this in the documentation and/or adding these columns to the output of OCDS Kit's |
Adding to 0.9.4 milestone since the schema change is simple, although adding the guidance can be worked on as an iterative improvement in the meantime.
Another option is to add a property to the schema, which could be done manually, or using a pre-commit script that applies the logic used in the spreadsheet linked in the previous comment. |
@jpmckinney are you happy with the proposal in this issue and do you have an opinion on how best to indicate which fields can and cannot be translated? |
The proposal in the issue description looks good, and the following comment looks like a good draft of a guidance page. As far as I know, any field that does not have a codelist and has a type of |
There are some exceptions to that rule, which can be included in a pre-commit script:
I think it's easier for readers to check whether a field is translatable by looking it up in a table than by applying a set of rules, so I'll add the script as suggested. |
Sounds good. Looks like it can be determined algorithmically at least. Edit: I'm not sure that |
Good point. The question from CoST Thailand that prompted this issue originally was about publishing in Thai, rather than translating an English publication. I've perhaps muddled the issue by using 'translate' as a synonym for 'publish in your own language' so I'll make sure to be clear about that in the guidance and the script output. |
Aha, right, so any field value that can contain non-English text can also be translated, except for IDs. (I'm just checking whether we're on the same page – the guidance can be more clearly written than that.) |
Yep, that sounds right to me. |
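The rule agreed in the comments above (free-text fields without a codelist can be published in another language, except for IDs) could be applied to a schema roughly as follows. This is a sketch under several assumptions, not the pre-commit script discussed in the thread: `free_text_paths` and `toy_schema` are hypothetical, "free text" is assumed to mean a field whose type includes "string", `$ref` references are not resolved, and the exceptions mentioned in the thread are not reproduced because their list is not shown here.

```python
def free_text_paths(schema, prefix=""):
    """Yield paths of string fields that have no codelist and are not named 'id'."""
    for name, prop in schema.get("properties", {}).items():
        path = f"{prefix}/{name}" if prefix else name
        types = prop.get("type", [])
        if isinstance(types, str):
            types = [types]
        if "properties" in prop:
            # Nested object: descend into its own properties.
            yield from free_text_paths(prop, path)
        elif isinstance(prop.get("items"), dict) and "properties" in prop["items"]:
            # Array of objects: descend into the item schema.
            yield from free_text_paths(prop["items"], path)
        elif "string" in types and "codelist" not in prop and name != "id":
            yield path


# Toy schema for illustration only; it is not the OC4IDS project schema.
toy_schema = {
    "properties": {
        "id": {"type": "string"},
        "title": {"type": ["string", "null"]},
        "type": {"type": ["string", "null"], "codelist": "projectType.csv"},
        "parties": {
            "type": "array",
            "items": {
                "properties": {
                    "id": {"type": "string"},
                    "name": {"type": "string"},
                }
            },
        },
    }
}

print(list(free_text_paths(toy_schema)))  # → ['title', 'parties/name']
```

The output of such a walk could feed the lookup table proposed above, so that readers check a field in a table rather than applying the rule themselves.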
OC4IDS lacks a language field to declare the language of free-text fields and doesn't provide any guidance on language or translation.
Unlike OCDS, in-file translation isn't supported (patternProperties were removed in #52).
Proposal
Add a language field, using the same description and codelist from OCDS.
Add guidance on language and translation to cover:
language, publish free-text fields in your own language, don't translate codes or field names)