Rbiter (Arbiter) is a wrapper around SymPy's expression evaluator, intended for automatic grading of CMIMC Math Contest submissions.
The intended workflow is:
- Submissions are imported from a Google spreadsheet
- Submissions are evaluated for correctness
- Graded results are uploaded to a different column of the same Google spreadsheet
To run this project, you need:
- Python 3.7
- pip package installer
- A Google Cloud project with the Google Sheets API enabled (refer to this)
  - Add the `https://www.googleapis.com/auth/drive.file` scope to the OAuth consent screen
- Auth credentials for a desktop app on Google Cloud (refer to this)
- A Google account
  - Add this Google account to the Google Cloud project's allowed testers
- Clone this repository into a Python 3.7 conda environment
- Run `pip install -r requirements.txt`
- Move the JSON credentials file generated in step 4 of the prerequisites into the rbiter root directory and rename it to `credentials.json`
- Run `main.py`. A login window will appear; log in to a Google account that has read and write access to the sheet with the submissions
- Get the sheet ID of your Google spreadsheet from its URL and replace the given ID in `main.py` (for example, in `https://docs.google.com/spreadsheets/d/aaaaaaaaaaaaaaa/edit` the ID is `aaaaaaaaaaaaaaa`)
- For each column to grade, call `grade_column()` with the appropriate parameters (documentation below). The submissions will be graded automatically and the results written to the spreadsheet.
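The sheet ID can also be pulled out of the URL programmatically instead of copied by hand. The following is a minimal sketch and not part of Rbiter: `extract_sheet_id` is a hypothetical helper, and the pattern assumes the `/spreadsheets/d/<id>/` URL layout shown in the example.

```python
import re


def extract_sheet_id(url: str) -> str:
    """Return the spreadsheet ID embedded in a Google Sheets URL.

    Hypothetical helper: assumes the ID is the path segment after
    '/spreadsheets/d/' and contains only letters, digits, '-' and '_'.
    """
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError(f"no spreadsheet ID found in {url!r}")
    return match.group(1)


sheet_id = extract_sheet_id(
    "https://docs.google.com/spreadsheets/d/aaaaaaaaaaaaaaa/edit"
)
print(sheet_id)  # aaaaaaaaaaaaaaa
```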
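Example code:

    from googleapiclient.discovery import build  # from the google-api-python-client package

    sheet_id = 'YOUR SHEET ID HERE'

    # DO NOT MODIFY: authenticate and open the Sheets API
    # (get_credentials() and grade_column() are provided by this repository)
    creds = get_credentials()
    service = build('sheets', 'v4', credentials=creds)
    sheet = service.spreadsheets()

    # Grade the answers in A1:A2 against the answer key and write the results to B1:B2
    correct = 'CORRECT LATEX EXPRESSION'
    grade_column(sheet, sheet_id, correct, 'Sheet!A1:A2', 'Sheet!B1:B2')  # ranges use A1 notation
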
This takes all entries on the sheet with ID `sheet_id` from the range `Sheet!A1:A2`, compares them to `correct`, and writes the graded results to the range `Sheet!B1:B2`.
`evaluate_latex()`: evaluates a LaTeX string into a SymPy expression.
Required parameters:
- `exp` (string): LaTeX string of an expression
Optional parameters:
- `mode` (string, default = `symbolic`): mode of SymPy evaluation to use. Available modes:
  - `numeric`: evaluate `exp` to a number
  - `symbolic`: evaluate `exp` to a symbolic expression
- `maxn` (int, default = `100`): number of digits of precision to use during numerical evaluation
Returns:
- SymPy expression containing the evaluated LaTeX
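To illustrate why a high `maxn` can matter in `numeric` mode, the sketch below compares 1/3 with a 16-digit decimal approximation at two precisions. It uses Python's standard `decimal` module purely as a stand-in; Rbiter itself relies on SymPy, and `differ_at_precision` is a hypothetical name introduced for this example.

```python
from decimal import Decimal, localcontext


def differ_at_precision(maxn: int) -> bool:
    """Return True if 1/3 and 0.3333333333333333 differ at `maxn` significant digits.

    Illustration only: `maxn` here sets the decimal context precision,
    which is an analogy for (not a copy of) SymPy's numerical evaluation.
    """
    with localcontext() as ctx:
        ctx.prec = maxn
        exact = Decimal(1) / Decimal(3)            # rounded to maxn digits
        submitted = Decimal("0.3333333333333333")  # 16 threes
        return exact != submitted


# Double-precision floats cannot tell the two values apart...
print(1 / 3 == 0.3333333333333333)  # True
# ...and neither can 16 significant digits, but 100 digits can.
print(differ_at_precision(16))   # False
print(differ_at_precision(100))  # True
```

With only a handful of digits, a truncated decimal submission could be indistinguishable from the exact answer, which is presumably one reason the default precision is set as high as 100.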
Compares a SymPy expression to a LaTeX string.
Required parameters:
- `exp1`: SymPy expression. Note that it is not a LaTeX string, so a LaTeX answer key only needs to be run through `evaluate_latex()` once before being compared to all submissions.
- `exp2` (string): LaTeX string of an expression
Optional parameters:
- `mode` (string, default = `symbolic`): mode of SymPy evaluation to use. Available modes:
  - `numeric`: evaluate each expression to a number
  - `symbolic`: evaluate each expression to a symbolic expression
- `maxn` (int, default = `100`): number of digits of precision to use during numerical evaluation
- `suspicious_threshold` (int, default = `6`): inclusive threshold for the number of consecutive digits needed to mark a submission as suspicious (e.g. with a threshold of 6, the number 123456 is suspicious but 12345 is not). Note that you can suppress suspicion by flooring the output.
- `numeric_threshold` (float, default = `0`): inclusive threshold for grading two numerically evaluated expressions as equivalent (i.e. |a-b| <= `numeric_threshold` means a and b are considered equivalent). This is unused during symbolic evaluation.
Returns:
- (float) number representing the result of the comparison:
  - `0`: not equivalent
  - `0.1`: not equivalent and `exp2` is suspicious
  - `1`: equivalent
  - `1.1`: equivalent and `exp2` is suspicious
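One way to read the return values is that the integer part records whether the expressions matched and the extra `.1` flags suspicion. The sketch below shows how the two thresholds and the result code could fit together. It is an assumption about the behaviour described here, not Rbiter's actual implementation: `is_suspicious`, `within_threshold`, and `result_code` are hypothetical names, and "consecutive digits" is taken to mean an unbroken run of digit characters in the submitted string.

```python
import re

# Result codes as documented: (equivalent, suspicious) -> code
RESULT_CODES = {
    (False, False): 0,
    (False, True): 0.1,
    (True, False): 1,
    (True, True): 1.1,
}


def is_suspicious(submission: str, suspicious_threshold: int = 6) -> bool:
    """Assumption: a run of at least `suspicious_threshold` digits is suspicious."""
    pattern = r"\d{%d,}" % suspicious_threshold
    return re.search(pattern, submission) is not None


def within_threshold(a: float, b: float, numeric_threshold: float = 0) -> bool:
    """Inclusive numeric comparison: |a - b| <= numeric_threshold."""
    return abs(a - b) <= numeric_threshold


def result_code(equivalent: bool, suspicious: bool) -> float:
    """Combine the two flags into one of the documented codes."""
    return RESULT_CODES[(equivalent, suspicious)]


print(is_suspicious("123456"))  # True
print(is_suspicious("12345"))   # False
print(result_code(within_threshold(0.5, 0.75, 0.25), is_suspicious("0.75")))  # 1
print(result_code(True, is_suspicious("123456")))  # 1.1
```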
Compares a LaTeX expression to a list of LaTeX expressions.
Required parameters:
- `exp` (string): LaTeX string of an expression to compare to all expressions in `exp_list`. THIS SHOULD BE THE CORRECT ANSWER!!
- `exp_list` (List[string]): list of LaTeX strings of expressions to compare to `exp`
Optional parameters:
- `mode` (string, default = `symbolic`): mode of SymPy evaluation to use. Available modes:
  - `numeric`: evaluate each expression to a number
  - `symbolic`: evaluate each expression to a symbolic expression
- `exec_limit` (int, default = `1`): limit on execution time per expression evaluation, in seconds
- `maxn` (int, default = `100`): number of digits of precision to use during numerical evaluation
- `suspicious_threshold` (int, default = `6`): inclusive threshold for the number of consecutive digits needed to mark a submission as suspicious (e.g. with a threshold of 6, the number 123456 is suspicious but 12345 is not). Note that you can suppress suspicion by flooring the output.
- `numeric_threshold` (float, default = `0`): inclusive threshold for grading two numerically evaluated expressions as equivalent (i.e. |a-b| <= `numeric_threshold` means a and b are considered equivalent). This is unused during symbolic evaluation.
Returns:
- (List[float]) list of floats representing the results of the comparisons between `exp` and the corresponding expressions in `exp_list`:
  - `0`: not equivalent
  - `0.1`: not equivalent and the expression from `exp_list` is suspicious
  - `1`: equivalent
  - `1.1`: equivalent and the expression from `exp_list` is suspicious
  - `2`: took too long to evaluate
  - `3`: other
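Once a list of result codes is available, it can be useful to tally it and pick out the entries that deserve a manual look. The following is a small sketch under that assumption; `summarize` and `needs_review` are hypothetical helpers that are not part of Rbiter, and the sample `results` list is made up for illustration.

```python
from collections import Counter

# Human-readable labels for the documented result codes
RESULT_LABELS = {
    0: "not equivalent",
    0.1: "not equivalent, suspicious",
    1: "equivalent",
    1.1: "equivalent, suspicious",
    2: "took too long to evaluate",
    3: "other",
}


def summarize(results):
    """Count how many submissions fall under each result label."""
    return Counter(RESULT_LABELS.get(code, "unrecognized code") for code in results)


def needs_review(results):
    """Indices of submissions that are anything other than a clean 0 or 1."""
    return [i for i, code in enumerate(results) if code not in (0, 1)]


results = [1, 0, 1.1, 2, 0.1, 1, 3]  # illustrative data only
print(summarize(results)["equivalent"])  # 2
print(needs_review(results))             # [2, 3, 4, 6]
```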
`grade_column()`: grades a column of LaTeX answers by comparing them to the correct answer, and writes the results to the provided sheet. ENSURE `column` AND `destination` ARE THE SAME SIZE AND DO NOT INCLUDE LABELS.
Required parameters:
- `sheet`: Google spreadsheets object generated in `main.py`
- `sheet_id` (string): the sheet ID
- `correct` (string): LaTeX string of the correct answer
- `column` (string): A1 notation of the column to grade (see https://developers.google.com/sheets/api/guides/concepts#expandable-1)
- `destination` (string): A1 notation of the column to write graded results to (see https://developers.google.com/sheets/api/guides/concepts#expandable-1)
Optional parameters:
- `mode` (string, default = `symbolic`): mode of SymPy evaluation to use. Available modes:
  - `numeric`: evaluate each expression to a number
  - `symbolic`: evaluate each expression to a symbolic expression
- `exec_limit` (int, default = `1`): limit on execution time per expression evaluation, in seconds
- `maxn` (int, default = `100`): number of digits of precision to use during numerical evaluation
- `suspicious_threshold` (int, default = `6`): inclusive threshold for the number of consecutive digits needed to mark a submission as suspicious (e.g. with a threshold of 6, the number 123456 is suspicious but 12345 is not). Note that you can suppress suspicion by flooring the output.
- `numeric_threshold` (float, default = `0`): inclusive threshold for grading two numerically evaluated expressions as equivalent (i.e. |a-b| <= `numeric_threshold` means a and b are considered equivalent). This is unused during symbolic evaluation.
Returns:
- None
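Because the graded column and the destination must be the same size, it may help to check the two A1 ranges before grading. The sketch below is one possible pre-check and is not part of Rbiter: `row_count` and `check_same_size` are hypothetical helpers, and the parser only handles fully bounded ranges with uppercase column letters, such as `Sheet!A1:A2`.

```python
import re

# Matches bounded A1 ranges like "Sheet!A2:A51"; the sheet name is optional.
_A1_RANGE = re.compile(
    r"^(?:(?P<sheet>[^!]+)!)?(?P<c1>[A-Z]+)(?P<r1>\d+):(?P<c2>[A-Z]+)(?P<r2>\d+)$"
)


def row_count(a1_range: str) -> int:
    """Number of rows covered by a bounded A1 range."""
    match = _A1_RANGE.match(a1_range)
    if match is None:
        raise ValueError(f"unsupported A1 range: {a1_range!r}")
    return abs(int(match["r2"]) - int(match["r1"])) + 1


def check_same_size(column: str, destination: str) -> None:
    """Raise ValueError if the two ranges cover a different number of rows."""
    if row_count(column) != row_count(destination):
        raise ValueError(
            f"{column} has {row_count(column)} rows but "
            f"{destination} has {row_count(destination)}"
        )


check_same_size("Sheet!A1:A2", "Sheet!B1:B2")  # passes silently
print(row_count("Sheet!A2:A51"))  # 50
```

Running such a check before each `grade_column()` call would catch a mismatched destination range before anything is written to the spreadsheet.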