-
Notifications
You must be signed in to change notification settings - Fork 0
/
Problem_Format_(ITC_2015_Working_Draft).mw
422 lines (301 loc) · 24.6 KB
/
Problem_Format_(ITC_2015_Working_Draft).mw
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
<s>This is a working draft.</s>
== Overview ==
This document describes the format of a ''Kattis problem package'', used for distributing and sharing problems for algorithmic programming contests as well as educational use.
=== General Requirements ===
The package consists of a single directory containing files as described below, or alternatively, a ZIP compressed archive of the same files using the file extension <tt>.kpp</tt> . The name of the directory or the base name of the archive must consisting solely of lower case letters a-z and digits 0-9.
All file names for files included in the package must match the following regexp
[a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]
I.e., it must be of length at least 2, consist solely of lower or upper case letters a-z, A-Z, digits 0-9, period, dash or underscore, but must not begin or end with period, dash or underscore.
All text files for a problem must be UTF-8 encoded and not have a byte order mark.
All floating point numbers must be given as the external character sequences defined by IEEE 754-2008 and may use up to double precision.
=== Programs ===
There are a number of different kinds of programs that maybe provided in the problem package; submissions, input validators, output validators, graders and generators. All programs are always represented by a single file or directory. In other words, if a program consists of several files, these must be provided in a single directory. The name of the program, for the purpose of referring to it within the package is the base name of the file or the name of the directory. There can't be two programs of the same kind with the same name.
Validators and graders, but not submissions, in the form of a directory may include two POSIX-compliant scripts "build" and "run". Either both or none of these scripts must be included.
If the scripts are present, then:
* the program will be compiled by executing the build script.
* the program will be run by executing the run script.
Programs without build and run scripts are built and run according to what language is used. Language is determined by looking at the file endings. If a single language from the table below can't be determined, building fails. In the case of Python 2 and 3 which share the same file ending, language will be determined by looking at the shebang line which must match the regular expressions in the table below.
For languages where there could be several entry points, the default entry point in the table below will be used.
{| class="wikitable"
! Code !! Language !! Default entry point !! File endings !! Shebang
|-
| c || C || || .c ||
|-
| cpp || C++ || || .cc, .cpp, .cxx, .c++, .C ||
|-
| csharp || C# || || .cs ||
|-
| go || Go || || .go ||
|-
| haskell || Haskell || || .hs ||
|-
| java || Java || Main|| .java ||
|-
| javascript || JavaScript || main.js || .js ||
|-
| objectivec || Objective-C || || .m ||
|-
| pascal || Pascal || || .pas ||
|-
| php || PHP || main.php || .php ||
|-
| prolog || Prolog || || .pl ||
|-
| python2 || Python 2 || main.py || .py || Matches the regex "<tt>^#!.*python2 </tt>", and default if shebang does not match any other language
|-
| python3 || Python 3 || main.py || .py || Matches the regex "<tt>^#!.*python3 </tt>"
|}
=== Problem types ===
There are two types of problems: <em>pass-fail</em> problems and <em>scoring</em> problems. In pass-fail problems, submissions are basically judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc). In scoring problems, a submission that is accepted is additionally given a score, which is a numeric value (and the goal is to either maximize or minimize this value).
== Problem Metadata ==
Metadata about the problem (e.g., source, license, limits) are provided in a UTF-8 encoded YAML file named <tt>problem.yaml</tt> placed in the root directory of the package.
The keys are defined as below. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.
{| class="wikitable"
! Key !! Type !! Default !! Comments
|-
| uuid || String || || UUID identifying the problem.
|-
| name || String or map of strings || || Required. If a string this is the name of the problem in english. If a map the keys are language codes and the values are the name of the problem in that language. It is an error for a language to be missing if there exists a problem statement for that language.
|-
| type || String || pass-fail || One of "pass-fail" and "scoring".
|-
| author || String or sequence of strings || || Who should get author credits. This would typically be the people that came up with the idea, wrote the problem specification and created the test data. This is sometimes omitted when authors choose to instead only give source credit, but both may be specified.
|-
| source || String || || Who should get source credit. This would typically be the name (and year) of the event where the problem was first used or created for.
|-
| source_url || String || || Link to page for source event. Must not be given if source is not.
|-
| license || String || unknown || License under which the problem may be used. Value have to be one of the ones defined below.
|-
| rights_owner || String || Value of author, if present, otherwise value of source. || Owner of the copyright of the problem. If not present, author is owner. If author is not present either, source is owner. Required if license is something other than "unknown" or "public domain". Forbidden if license is "public domain".
|-
| keywords || String or sequence of strings || || Set of keywords.
|-
| limits || Map with keys as defined below || see definition below ||
|-
| libraries || String or sequence of strings || || Set of libraries as defined below.
|-
| validation || String || default || One of "default" or "custom". If "custom", may be followed by some subset of "score" and "interactive", where "score" indicates that the validator produces a score (this is only valid for scoring problems), and "interactive" specifies that the validator is run interactively with a submission. For example, "custom interactive score".
|-
| scoring || Map with keys as defined below || See definition below || Must only be used on scoring problems.
|}
=== license ===
Allowed values for license.
Values other than ''unknown'' or ''public domain'' requires rights_owner to have a value.
{| class="wikitable"
! Value !! Comments !! Link
|-
| unknown || The default value. In practice means that the problem can not be used. ||
|-
| public domain || There are no known copyrights on the problem, anywhere in the world. || http://creativecommons.org/about/pdm
|-
| cc0 || CC0, "no rights reserved" || http://creativecommons.org/about/cc0
|-
| cc by || CC attribution || http://creativecommons.org/licenses/by/3.0/
|-
| cc by-sa || CC attribution, share alike || http://creativecommons.org/licenses/by-sa/3.0/
|-
| educational || May be freely used for educational purposes ||
|-
| permission || Used with permission. The author must be contacted for every additional use. ||
|}
=== limits ===
A map with the following keys:
{| class="wikitable"
! Key !! Comments !! Default || Typical system default
|-
| time_multiplier || optional || 5 ||
|-
| time_safety_margin || optional || 2 ||
|-
| memory || optional, in MiB || system default || 2048
|-
| output || optional, in MiB || system default || 8
|-
| code || optional, in kiB || system default || 128
|-
| compilation_time || optional, in seconds || system default || 60
|-
| compilation_memory || optional, in MiB || system default || 2048
|-
| validation_time || optional, in seconds || system default || 60
|-
| validation_memory || optional, in MiB || system default || 2048
|-
| validation_output || optional, in MiB || system default || 8
|}
For most keys the system default will be used if nothing is specified. This can vary, but you SHOULD assume that it's reasonable. Only specify limits when the problem needs a specific limit, but do specify limits even if the "typical system default" is what is needed.
=== scoring ===
A map with the following keys:
{| class="wikitable"
! Key !! Type !! Default !! Comments
|-
| objective || String || max || One of "min" or "max" specifying whether it is a minimization or a maximization problem.
|-
| range || String || -inf +inf || Two numbers A and B ("inf", "-inf", "+inf" are allowed for plus/minus infinity) specifying the range of possible scores.
|-
|}
=== libraries ===
A set from elements below. A library will be available for the languages listed.
{| class="wikitable"
! Value !! Library !! Languages
|-
| gmp || GMP - The GNU Multiple Precision Arithmetic Library || C, C++
|-
| boost || Boost || C++
|}
== Problem Statements ==
The problem statement of the problem is provided in the directory <tt>problem_statement/</tt>.
This directory must contain one file per language, for at least one language, named <tt>problem.<language>.<filetype></tt>, that contains the problem text itself, including input
and output specifications, but not sample input and output. Language must be given as the shortest ISO 639 code. If needed a hyphen and a ISO 3166-1 alpha-2 code may be appended to ISO 639 code. Optionally, the language code can be left out, the default is then English (<tt>en</tt>). Filetype can be either <tt>tex</tt> for LaTeX files, <tt>md</tt> for Markdown, or <tt>pdf</tt> for PDF.
Please note that many kinds of transformations on the problem statements, such as conversion to HTML or styling to fit in a single document containing many problems will not be possible for PDF problem statements, so using this format should be avoided if at all possible.
Auxiliary files needed by the problem statement files must all be in <tt><short_name>/problem_statement/</tt> , <tt>problem.<language>.<filetype></tt> should reference auxiliary files as if the
working directory is <tt><short_name>/problem_statement/</tt>. Image file formats supported are <tt>.png</tt>, <tt>.jpg</tt>, <tt>.jpeg</tt>, and <tt>.pdf</tt>.
A LaTeX file may include the Problem name using the LaTeX command <tt>\problemname</tt> in case LaTeX formatting of the title is wanted. If it's not included the problem name specified in <tt>problem.yaml</tt> will be used.
The problem statements must only contain the actual problem statement, no sample data.
== Attachments ==
Public, i.e. non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory <tt>attachments/</tt>.
== Test data ==
If input generators are used the files described here might not be available in this directory. This section describes what must be the case after running the generators.
The test data are provided in subdirectories of <tt>data/</tt>. The sample data in <tt>data/sample/</tt> and the secret data in <tt>data/secret/</tt>.
All input and answer files have the filename extension <tt>.in</tt> and <tt>.ans</tt> respectively.
=== Annotations ===
Optionally a hint, a description and an illustration file may be provided.
The hint file is a text file with filename extension<tt>.hint</tt> giving a hint for solving an input file. The hint file is meant to be given as feedback, i.e. to somebody that fails to solve the problem.
The description file is a text file with filename extension <tt>.desc</tt> describing the purpose of an input file. The description file is meant to be privileged information that explains the purpose of the related test file, e.g. what cases it's supposed to test.
The Illustration is an image file with filename extension <tt>.png</tt>, <tt>.jpg</tt>, <tt>.jpeg</tt>, or <tt>.svg</tt>. The illustration is meant to be privileged information illustrating the related test file.
Input, answer, description, hint and image files are matched by the base name.
=== Test Data Groups ===
The test data for the problem can be organized into a tree-like structure. Each node of this tree is represented by a directory and referred to as a test data group. Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subgroups of test data (i.e., subdirectories).
At the top level, the test data is divided into exactly two groups: <tt>sample</tt> and <tt>secret</tt>, but these two groups may be further split into subgroups as desired.
The <em>result</em> of a test data group is computed by applying a <em>grader</em> to all of the sub-results (test cases and subgroups) in the group. See [[#Graders|Graders]] for more details.
Test files and groups will be used in lexicographical order on file base name. If a specific order is desired a numbered prefix such as 00, 01, 02, 03, and so on, can be used.
In each test data group, a file <tt>testdata.yaml</tt> may be placed to specify how the result of the test data group should be computed. If such a file is not provided for a test data group then the settings for the parent group will be used. The format of <tt>testdata.yaml</tt> is as follows:
{| class="wikitable"
! Key !! Type !! Default !! Comments
|-
| on_reject || String || break || One of "break" or "continue". Specifies how judging should proceed when a submission gets a non-Accept judgement on an individual test file or subgroup. If "break", judging proceeds immediately to grading. If "continue", judging continues judging the rest of the test files and subgroups within the group.
|-
| grader || String or map with the keys "name" and "flags" || empty string || If a string this is the name of the grader that will be used for this test data group. If a map then this is the name as well as the flags that will be passed to the grader. The name of the default grader is "default".
|-
| input_validator || String or map with the keys "name" and "flags" || empty string || If a string this is the name of the input validator that will be used for this test data group. If a map then this is the name as well as the flags that will be passed to the input validator.
|-
| output_validator || String or map with the keys "name" and "flags" || empty string || If a string this is the name of the output validator that will be used for this test data group. If a map then this is the name as well as the flags that will be passed to the output validator. The name of the default output validator is "default".
|-
| accept_score || Number || 1 || Score for accepted input files.
|-
| reject_score || Number || 0 || Score for rejected input files.
|}
== Included Code ==
Code that should be included with all submissions are provided in one directory per supported language, called <tt>include/<language>/</tt>.
The files should be copied from a language directory based on the language of the submission, to the submission files before compiling, overwriting files from the submission in the case of name collision. Language must be given as one of the language codes in the language table in the overview section. If any of the included files are supposed to be the main file (i.e. a driver), that file must have the language dependent name as given in the table referred above.
== Example Submissions ==
Correct and incorrect solutions to the problem are provided in subdirectories of <tt>submissions/</tt>. The possible subdirectories are:
{| class="wikitable"
! Value !! Requirement !! Comment
|-
| accepted || Accepted as a correct solution. || At least one is required.
|-
| rejected || Not accepted as a correct solution. Time limit is not applied, so may not be rejected only because of time limit ||
|-
| time_limit_exceeded || Too slow for some test file. May also give wrong answer but not crash for any test file. ||
|}
For submissions in the <tt>accepted</tt> subdirectory the expected score may be specified in a UTF-8 encoded YAML file named <tt>score.yaml</tt>. This is only relevant for scoring problems. The file must contain a map where the keys are the name of a submission and the value is the expected score for that submission. Not all submissions must be mentioned in the file, if a submission is not mentioned any score is accepted (but the submission may not be rejected).
For submissions in the <tt>rejected</tt> subdirectory the expected reason for rejection may be specified in a UTF-8 encoded YAML file named <tt>judgement.yaml</tt>. The file must contain a map where the keys are the name of a submission and the value is the reason for rejection for that submission. The reason for rejection must be one of <tt>WA</tt> or <tt>RTE</tt>. Not all submissions must be mentioned in the file, if a submission is not mentioned any rejection is accepted.
Every file or directory in these directories represents a separate solution. Same requirements as for submissions with regards to filenames. It is mandatory to provide at least one accepted solution.
Submissions must read input data from standard input, and write output to standard output.
== Input Validators ==
Input Validators, for verifying the correctness of the input files, are provided in <tt>input_validators/</tt>. Input validators can be specified as VIVA-files (with file ending <tt>.viva</tt>), Checktestdata-file (with file ending <tt>.ctd</tt>), or as a program.
All input validators provided will be run on every input file. Validation fails if any validator fails.
=== Invocation ===
An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.
All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.
When invoked the input validator will get the input file on stdin.
The validator should be possible to use as follows on the command line:
<s>Needs to be clarified to fit Java and Python and similar</s>
./validator [arguments] < inputfile
=== Output ===
The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.
=== Exit codes ===
The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.
==== Dependencies ====
The validator MUST NOT read any files outside those defined in
the Invocation section. Its result MUST depend only on these
files and the arguments.
== Output Validators ==
Output Validators are used if the problem requires more complicated output validation than what is provided by the default diff variant described below. They are provided in <tt>output_validators/</tt>, and must adhere to the [[Output validator]] specification.
All output validators provided will be run on the output for every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.
=== Default Output Validator Specification ===
The default output validator is essentially a beefed-up diff. In its default mode, it
tokenizes the files to compare and compares them token by token. It supports the
following command-line arguments to control how tokens are compared.
{| class="wikitable"
! Arguments !! Description
|-
| <tt>case_sensitive</tt> || indicates that comparisons should be case-sensitive.
|-
| <tt>space_change_sensitive</tt> || indicates that changes in the amount of whitespace should be rejected (the default is that any sequence of 1 or more
whitespace characters are equivalent).
|-
| <tt>float_relative_tolerance ε</tt> || indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details).
|-
| <tt>float_absolute_tolerance ε</tt> || indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details).
|-
| <tt>float_tolerance ε</tt> || short-hand for applying ε as both relative and absolute tolerance.
|}
When supplying both a relative and an absolute tolerance, the semantics are that a token is accepted if it is within either of the two tolerances.
When a floating-point tolerance has been set, any valid formatting of floating point numbers is accepted for floating point tokens. So for instance if a token in the answer file says <tt>0.0314</tt>, a token of <tt>3.14000000e-2</tt> in the output file would be accepted. If no floating point tolerance has been set, floating point tokens are treated just like any other token and has to match exactly.
== Graders ==
Graders are programs that are given the sub-results of a test data group and aggregate a result for the group. They are provided in <tt>graders/</tt> .
For pass-fail problems, this grader will typically just set the verdict to accepted if all sub-results in the group were accepted and otherwise select the "worst" error in the group (see below for definition of "worst"), though it is possible to write a custom grader which e.g. accepts if at least half the sub-results are accepted. For scoring problems, one common grader behaviour would be to always set the verdict to Accepted, with the score being the sum of scores of the items in the test group.
=== Invocation ===
A grader program must be an application (executable or interpreted) capable of being invoked with a command line call.
When invoked the grader will get the judgement for test files or groups on stdin and is expected to produce an aggregate result on stdout.
The grader should be possible to use as follows on the command line:
./grader [arguments] < judgeresults
=== Input ===
A grader simply takes a list of results on standard input, and produces a single result on standard output. The input file will have the one line per test file containing the result of judging the testfile, using the code from the table below, followed by whitespace, followed by the score. <s>Format to be extended.</s>
{| class="wikitable"
! Code !! Meaning
|-
| AC || Accepted
|-
| WA || Wrong Answer
|-
| RTE || Run-Time Error
|-
| TLE || Time-Limit Exceeded
|}
The score is taken from the <tt>score.txt</tt> files produced by the ouput validator. If no <tt>score.txt</tt> exists the score will be as defined by the grading accept_score and reject_score setting from problem.yaml.
=== Output ===
The grader must output the aggregate result on stdout in the same format as its input. Any other output, including no output, will result in a Judging Error.
For pass-fail problems, or for non-Accepted results on scoring problems, the score provided by the grader will always be ignored.
The grader may output debug information on stderr. This information may be displayed to the user upon invocation of the grader.
=== Default Grader Specification ===
The default grader has two different modes for aggregating the verdict -- ''no_errors'' and ''always_accept'' -- and four different modes for aggregating the score -- ''sum'', ''avg'', ''min'', ''max''. These modes can be set by providing their names as command line arguments (through the "grader_flags" option in [[#Test Data Groups|testdata.yaml]]). If multiple conflicting modes are given, the last one is used. Their meaning are as follows.
{| class="wikitable"
! Argument !! Description
|-
| <tt>no_errors</tt> || verdict is accepted if all subresults are accepted, otherwise it is the first of RTE, TLE, WA that is the subresult of some item in the test case group. This is the default mode for pass-fail problems. Note that in combination with the on_reject:break policy in testdata.yaml, the result will be the first error encountered.
|-
| <tt>always_accept</tt> || verdict is always accepted. This is the default mode for scoring problems.
|-
| <tt>sum</tt> || score is sum of input scores (default).
|-
| <tt>avg</tt> || score is average of input scores.
|-
| <tt>min</tt> || score is minimum of input scores.
|-
| <tt>max</tt> || score is maximum of input scores.
|}
== Generators ==
Input generators are programs that generates input. They are provided in <tt>generators/</tt>.
=== Invocation ===
A generator program must be an application (executable or interpreted) capable of being invoked with a command line call.
The generators will be run with the test data directory (<tt>data/</tt>) as the working directory. The generator may read any existing files in that directory and should create any kind of test data file as defined in the test data section. The generator may not read or write anything outside the test data directory. The generators will be run in lexicographical order on name. If a specific order is desired a numbered prefix such as 00, 01, 02, 03, and so on, can be used.
The generators must be deterministic, i.e. always produce the same input file when give the same arguments.
The generators must be idempotent, i.e. running them multiple times should result in the same contents of the test data directory as running them once.
== See also ==
* [[Output validator]]
* [[Problem format directory structure]]
* [[Problem Format Verification]]