geosolver is an end-to-end system that solves high school geometry questions. That is, its input is question text in natural language and diagram in raster graphics, and its output is the answer to the question.
geosolver is divided into four core parts: diagram praser, text parser, joint parser, and solver. Text parser transforms the text in natural language into a logical form. Diagram parser extracts information from the diagram. Joint parser combines the results of the text parser and the diagram parser and outputs the final logical form. Solver accepts the logical form from the joint parser and outputs the answer. This tutorial will first walkthrough each part independently (the modules except the joint parser can be used independently), and in the last section it will show how to connect them for an end-to-end system.
Location: geosolver.database
Required 3rd-party packages: requests
Every question is a tuple of key, text, words, diagram path, and choices (see geosolver.database.states.Question
).
You can either define it yourself, or more easily, you can download it via GeoServer, a web interface for managing and displaying geosolver database. Please refer to the GeoServer repository to see how to set it up.
-
Make sure that
geosolver.settings.GEOSERVER_URL
has a correct URL. Note that by default this is pointing to a local server, because I am hosting a local server myself. -
Import the geoserver interface (an instance of
geosolver.database.geoserver_interface.GeoServerInterface
):from geosolver import geoserver_interface
-
To download all questions tagged 'test' and print the text of each, type:
questions = geoserver_interface.download_questions('test') for id_, question in questions.iteritems(): print(question.text)
Note that the object returned by
geoserver_interface.download_questions
is adict
object. -
To download a single question with specific id (e.g. 1037), type:
questions = geoserver_interface.download_questions(1037) print(questions.values()[0].text)
Note that, regardless of the number of questions returned by
geoserver_interface.download_questions
, the returned object is always adict
object (with possibly single element, like above). -
You can access other properties of the question by
question.words
,question.diagram_path
, andquestion.choices
. See Diagram parser section to learn how to usequestion.diagram_path
, and see Text parser section to learn how to usequestion.words
andquestion.choices
.
Location: geosolver.diagram
Required 3rd-party packages: numpy, OpenCV 3.0.0 or higher (cv2)
Diagram parsing consists of five finer steps: image segment parsing, primitive paring, primitive selecting, core parsing, and graph parsing. We will explain each of them in detail below. You can also refer to geosolver.diagram.run_diagram
module to see full examples corresponding to these.
Image segment parsing is the task of obtaining the diagram segment (and label segments) from the original image. For instance, given an original image
![original image] (https://github.com/seominjoon/geosolver/blob/master/images/original.png)
we obtain the diagram segment
![diagram segment] (https://github.com/seominjoon/geosolver/blob/master/images/diagram.png)
To do so, obtain a Question
object (question
) as shown above and run the following:
from geosolver.diagram.parse_image_segments import parse_image_segments
image_segment_parse = parse_image_segments(open_image(question.diagram_path))
image_segment_parse.diagram_image_segment.display_binarized_segmented_image()
for idx, label_image_segment in image_segment_parse.label_image_segments.iteritems():
label_image_segment.display_segmented_image()
This is equivalent to geosolver.diagram.run_diagram.test_parse_image_segments
.
Primitive parsing is the task of obtaining over-gernerated, noisy primitives from the diagram segment. For instance, given the diagram segment above, we want to obtain
![primitive parse] (https://github.com/seominjoon/geosolver/blob/master/images/primitives.png)
To do so:
from geosolver.diagram.parse_primitives import parse_primitives
primitive_parse = parse_primitives(image_segment_parse)
primitive_parse.display_primitives()
This is equivalent to geosolver.diagram.run_diagram.test_parse_primitives
.
We want to select a subset of the over-generated primitives that accurately represents the diagram:
![selected primitives] (https://github.com/seominjoon/geosolver/blob/master/images/selected.png)
To do so:
from geosolver.diagram.select_primitives import select_primitives
selected = select_primitives(primitive_parse)
selected.display_primitives()
This is equivalent to geosolver.diagram.run_diagram.test_select_primitives
.
CoreParse consists of the critical (intersection) points and the circles in the diagram. It can be thought as the post-processed parse of the selected primitive parse. By parsing the core, we obtain:
![core] (https://github.com/seominjoon/geosolver/blob/master/images/core.png)
in which the blue dots are the intersection points.
To do so:
from geosolver.diagram.parse_core import parse_core
core_parse = parse_core(selected)
core_parse.display_points()
This is equivalent to geosolver.diagram.run_diagram.test_parse_core
.
GraphParse contains high level information such as whether a line exists between two points, etc. To obtain the graph parse, run:
from geosolver.diagram.parse_graph import parse_graph
from geosolver.diagram.get_instances import get_all_instances
graph_parse = parse_graph(diagram_parse)
lines = get_all_instances(graph_parse, 'line')
circles = get_all_instances(graph_parse, 'circle')
arcs = get_all_instances(graph_parse, 'arc')
angles = get_all_instances(graph_parse, 'angle')
print("Displaying lines...")
for key, line in lines.iteritems():
graph_parse.display_instances([line])
print("Displaying circles...")
for key, circle in circles.iteritems():
graph_parse.display_instances([circle])
print("Displaying arcs...")
for key, arc in arcs.iteritems():
graph_parse.display_instances([arc])
print("Displaying angles...")
for key, angle in angles.iteritems():
graph_parse.display_instances([angle])
This is equivalent to geosolver.diagram.run_diagram.test_parse_graph
.
Location: geosolver.solver
Required 3rd-party packages: numpy, scipy
For all the examples, see geosolver.solver.run_solver
. We will go through its first example here.
"In triangle ABC, AB = x, BC = 3, CA = 4. Which of the following is a possible value for x? (A) 5 (B) 8"
To solve this question, we first need to define geosolver.solver.variable_handler.VariableHandler
:
from geosolver.solver.variable_handler import VariableHandler
vh = VariableHandler()
The variable handler allows us to define variables in the (first sentence):
x = vh.number('x')
A = vh.point('A')
B = vh.point('B')
C = vh.point('C')
AB = vh.line(A, B)
BC = vh.line(B, C)
CA = vh.line(C, A)
Use vh.number
to define a numeric variable (with name as argument), vh.point
to define a point (with name as argument), and vh.line
to define a line (with two end points as arguments).
Next, we define the relations (information) mentioned in the question (AB = x, BC = 3, CA = 4):
c = vh.apply('LengthOf', AB)
a = vh.apply('LengthOf', BC)
b = vh.apply('LengthOf', CA)
p1 = vh.apply('Equals', a, 3)
p2 = vh.apply('Equals', b, 4)
p3 = vh.apply('Equals', c, x)
p1
, p2
, and p3
are the three prior relations in the text.
geosolver.solver.numeric_solver.NumericSolver
allows us to reconstruct the geometry according to the given information:
from geosolver.solver.numeric_solver import NumericSolver
ns = NumericSolver(vh, [p1, p2, p3])
NumericSolver
can analyze whether an input relation is simultaneously satisfiable with the prior relations via find_assignment
method. It returns a satisfying assignment if they are simultaneously satisfiable, and it returns None
if not. That is,
q1 = vh.apply('Equals', x, 5)
print(ns.find_assignment(q1))
will print out an assignment, and
q2 = vh.apply('Equals', x, 8)
print(ns.find_assignment(q2))
will print out None
.
Note that you can use q1 = x == 5
instead of vh.apply('Equals', x, 5)
. Other operators such as <, >, +, *, -, /, *** (power) are supported as well.
Geometric relations, such as Perpendicular, Tangent, etc., require vh.apply
to be used. For a complete list of usable relations, refer to geosolver.ontology.ontology_semantics
(To be updated soon!).