2009-03-18 04:42:48 -05:00
|
|
|
|
|
|
|
Complexity
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
Purpose of this document is to describe in a detailed way the
|
|
|
|
complexity of relational algebra operations. The evaluation will be
|
|
|
|
done on the specific implementation of this program, not on theorical
|
|
|
|
lower limits.
|
|
|
|
|
|
|
|
Latest implementation can be found at:
|
|
|
|
http://galileo.dmi.unict.it/svn/trunk
|
|
|
|
|
|
|
|
Notation
|
|
|
|
Big O notation will be used. Constant values will be ignored.
|
|
|
|
|
|
|
|
Single letters will be used to indicate relations and letters between
|
|
|
|
| will indicate the cardinality (number of tuples) of the relation.
|
2009-03-19 07:56:15 -05:00
|
|
|
|
|
|
|
Then after evaluating the big O notation, an attempt to find more
|
|
|
|
precise results will be done, since it will be important to know
|
|
|
|
with a certain precision the weight of the operation.
|
2009-03-18 04:42:48 -05:00
|
|
|
|
|
|
|
1. UNARY OPERATORS
|
|
|
|
|
2009-03-18 04:58:25 -05:00
|
|
|
Relational defines three unary operations, and they will be studied
|
|
|
|
in this section. It doesn't mean that they should have similar
|
|
|
|
complexity.
|
|
|
|
|
2009-03-18 04:42:48 -05:00
|
|
|
1.1 Selection
|
2009-03-18 04:58:25 -05:00
|
|
|
|
|
|
|
Selection works on a relation and on a python expression. For each
|
|
|
|
tuple of the relation, it will create a dictionary with name:value
|
|
|
|
where name are names of the fields in the relation and value is the
|
|
|
|
value for the specific row.
|
|
|
|
We can consider the inner cycle as constant as its value doesn't
|
|
|
|
depend on the relation itself but only on the kind of the relation
|
|
|
|
(how many field it has).
|
|
|
|
Then comes the evaluation. A python expression in truth could do
|
|
|
|
much more things than just checking if a>b. Anyway, ssuming that
|
|
|
|
nobody would ever write cycles into a selection condition, we have
|
|
|
|
another constant complexity for this operation.
|
|
|
|
Then, the tuple is inserted in a new relation if it satisfies the
|
|
|
|
condition. Since no check on duplicated tuples is performed, this
|
|
|
|
operation is constant too.
|
|
|
|
|
|
|
|
In the end we have O(|n|) as complexity for a selection on the
|
|
|
|
relation n.
|
2009-03-19 07:56:15 -05:00
|
|
|
|
|
|
|
The assumption made of considering constant the number of fields is
|
|
|
|
a bit strong. For example a relation could have hundreds of fields
|
|
|
|
and two tuples.
|
|
|
|
|
|
|
|
So in general, the complexity is something more like O(|n| * f) where
|
|
|
|
f is the number of the fields.
|
2009-03-18 04:58:25 -05:00
|
|
|
|
2009-03-18 04:42:48 -05:00
|
|
|
1.2 Rename
|
2009-03-18 04:58:25 -05:00
|
|
|
|
2009-03-19 07:56:15 -05:00
|
|
|
The rename operation itself is very simple, just modify the list
|
|
|
|
containing the name of the fields.
|
|
|
|
The big issue is to copy the content of the relation into a new
|
|
|
|
relation object, so the new one can be modified.
|
|
|
|
|
|
|
|
So the operation depends on the size of the relation: O(|n| * f).
|
|
|
|
|
|
|
|
1.3 Projection
|
|
|
|
|
2009-04-01 08:37:56 -05:00
|
|
|
The projection operation creates a copy of the original relation
|
|
|
|
using only a subset of its fields. Time for the copy is something
|
|
|
|
like O(|n|*f) where f is the number of fields to copy.
|
|
|
|
But that's not all. Since relations are set, duplicated items are not
|
|
|
|
allowed. So after extracting the wanted elements, it has to check if
|
|
|
|
the new tuple was already added to the new relation. And this brings
|
|
|
|
the complexity to O((|n|*f)²).
|
2009-03-19 07:56:15 -05:00
|
|
|
|