-
Notifications
You must be signed in to change notification settings - Fork 0
/
TODO
145 lines (125 loc) · 7.05 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
* Add optimizer for empty right operand to an Insertion
* Should factor out the Insertion
* Add optimizer for empty left operand to a Deletion
* Should return an Empty relation, since deleting anything from an empty
relation always results in an empty Relation
* Add optimizer for empty right operand to a Deletion
* Should factor out the Deletion
* Add an assertion to every #optimize spec that if #optimizable? is true, then
the object returned by #optimize must *not* be equivalent. This will ensure
that the guard clause in #optimizable? only evaluates to true if the
optimization is actually going to return a new object; no point in going
through the work of optimizing if it won't result in any change, and it should
be possible to test for this before carrying out the work.
* Add a shared spec for #optimize and #optimizable?. Note that the current
shared spec for #optimize is for the Relation#optimize integration specs
(it should probably be renamed to avoid confusion)
* Should ensure that when #optimize is being tested #optimizable? should
only return true.
* Should ensure #optimizable? only returns true if #optimize returns an
object that is different (should return equivalent tuples, but have
different internal state, otherwise the #optimize step was totally
unecessary)
* Rename the Util.constant? method as Util.value?
* Pull up Rename rather than pushing it down
* Currently a Rename applies to every tuple in a result-set, and
doing the rename prematurely means that lots of tuples that would
otherwise be filtered out are going to be processed. A better
approach should be to push the rename up, dropping renames that
are projected away. It should not distribute over Binary ops
unless the other side of the binary op has he same aliases.
* For Binary operations, when the left or right is materialized then
change the other relation to use a restriction to filter the records.
* This will help optimize the fetching of the underlying records.
* More optimizations:
* Union
* A Union of relations with the same base, header, and restrictions should
try to combine into a single relation with the restrictions using OR.
* Intersection
* An Intersection of relations with the same base, header, and restrictions
should try to combine into a single relation with the restrictions using
AND.
* Use the Join::RightMaterializedOperand and Join::LeftMaterializedOperand
optimizers to simplify Intersection operations
* Difference
* A Difference of relations with the same base and restrictions should
try to combine into a single relation with the restrictions using NOT.
* Summarization
* Add SortedSummarizePer for factoring out Sorted objects inside a
Summarization
* When there are no aggregate functions, drop the Summarization and
return the summarize_per (?)
* Use the UnchangedHeader optimizer as a base class
* When summarize_per is an Sorted, the Sorted can be dropped.
* Projection
* When it contains a Restriction, if the removed attributes are *not*
used in the predicate, then move the Restriction to contain the
Projection.
* When a Projection contains a Restriction, wrap the Projection
in the Restriction, projecting away any attributes not used in
the restriction. If there are any remaining attributes, then
wrap the operation in a Projection removing those attributes.
* If all attributes are being used in the Restriction do nothing
* When an attribute is renamed, then projected away, the alias should be
removed from the rename.
* When an attribute is added by a summarization, then projected away, the
attribute should be removed from the summarization.
* It does not distribute over Intersection or Difference, but see if
perhaps an exception can be made if there is a functional dependency
between the columns projected away and the one remaining. Then I *think*
it might still work, but more research will be needed.
* When a Projection contains a Join, wrap the Join with a Projection
of all the headers, minus those used in the Join. If there were
any used, then wrap the whole operation in a Projection with
the remaining attributes.
* If all the attributes are used in the Join, do nothing
* Try to use the same approach for Product
* Test if it's possible to fully distribute projections over
joins rather than splitting it up like this.
* Rename
* When wrapping a Summarization or Extension, and renaming the new attribute,
it should change the new attribute name, and remove it from the rename.
* Restriction
* Figure out how to reorganize the Restriction predicates so that all
similar operations are closer together to allow more efficient
optimizations. This would allow optimizations of stuff like this:
"attr1 = ? OR attr2 = ? OR attr1 = ?"
Into:
"attr1 IN(..) OR attr2 = ?"
* When it has an equality on a unique attribute:
* Limit with a limit >= 1 can be factored out
* Offset with an offset > 0 can be transformed into an empty
relation, since at most there can be only one match.
* When wrapping a Product and the predicate only contains
equality/conjunction between attributes from the left and right
operands, then transform into a Join.
* Sorted
* When the operation is an instance of Sorted, and the operand is sorted in
reversed order, change to a Reverse operation. This should occur after the
Sorted optimization that collapses two Sorted objects into one; it would be
expected that the operand is a Limit or Offset, but it probably isn't
necessary to test that.
* Connective
* "attr > ? OR attr > ?" -> "attr > ?", with the least restrictive value
* Do the same for >=, <, <=
* "attr > ? AND attr > ?" -> "attr > ?", with the most restrictive value
* Do the same for >=, <, <=
* "attr > 5 OR attr == 5" -> "attr >= 5"
* "attr < 5 OR attr == 5" -> "attr <= 5"
* "attr" = "string" AND "attr" =~ /string/ -> "attr" = "string"
* If the regexp matches the constant, then it should be
optimized down to a constant match. If it does not match
then it should be optimized to a Contradiction.
* Constant folding, eg:
"attr1 > attr2 AND attr1 = 5" -> "5 > attr2 AND attr1 = 5"
* This will probably only work across Conjunctions.
* "attr > 5 AND attr = 6" -> "attr = 6", because attr must be
equal to 6. this will probably be related to constant folding;
the first expression will become 6 > 5, which evaluates to a
Tautology, then the expression is a Tautology AND attr = 6,
which simplifies down to attr = 6.
* "attr < 5 AND attr = 6" -> "Contradiction", because attr must be equal to
6, and 6 < 5 evaluates to a Contradiction. A Contradiction AND attr = 6
simplifies down to a Contradiction.
* Inclusion/Exclusion
* When the enumerable is a contiguous sequence transform it into a Range