Complexity

Abstract
   Purpose of this document is to describe in a detailed way the
   complexity of relational algebra operations. The evaluation will be
   done on the specific implementation of this program, not on theorical
   lower limits.

   Latest implementation can be found at:
   https://github.com/ltworf/relational

Notation
   Big O notation will be used. Constant values will be ignored.

   Single letters will be used to indicate relations and letters between
   | will indicate the cardinality (number of tuples) of the relation.

   Number of tuples can't be enough. For example a relation with one
   touple and thousands of fields, will not take O(1) in general to be
   evaluated. So we assume that relations will have a reasonable and
   comparable number of fields.

   Then after evaluating the big O notation, an attempt to find more
   precise results will be done, since it will be important to know
   with a certain precision the weight of the operation.

1.  UNARY OPERATORS

   Relational defines three unary operations, and they will be studied
   in this section. It doesn't mean that they should have similar
   complexity.

1.1  Selection

   Selection works on a relation and on a python expression. For each
   tuple of the relation, it will create a dictionary with name:value
   where name are names of the fields in the relation and value is the
   value for the specific row.
   We can consider the inner cycle as constant as its value doesn't
   depend on the relation itself but only on the kind of the relation
   (how many field it has).
   Then comes the evaluation. A python expression in truth could do
   much more things than just checking if a>b. Anyway, ssuming that
   nobody would ever write cycles into a selection condition, we have
   another constant complexity for this operation.
   Then, the tuple is inserted in a new relation if it satisfies the
   condition. Since no check on duplicated tuples is performed, this
   operation is constant too.

   In the end we have O(|n|) as complexity for a selection on the
   relation n.

1.2  Rename

   The rename operation itself is very simple, just modify the list
   containing the name of the fields.
   The big issue is to copy the content of the relation into a new
   relation object, so the new one can be modified.

   So the operation depends on the size of the relation: O(|n|).

1.3  Projection

   The projection operation creates a copy of the original relation
   using only a subset of its fields. Time for the copy is something
   like O(|n|) where f is the number of fields to copy.
   But that's not all. Since relations are set, duplicated items are not
   allowed. So after extracting the wanted elements, it has to check if
   the new tuple was already added to the new relation. And this brings
   the complexity to O(|n|²).

   But the projection can also be used to "rearrange" fields, which
   makes no sense in pure relational algebra, but can be usefull to make
   two relations match (in fact it is used internally to make relations
   match if they have the same fields in different order). In this case
   there is no need to check if the tuple already exists, because it is
   assumed that the relation was correct. This gives a complexity of
   O(|n|) in the best case.

2.  BINARY OPERATORS

   Relational defines nine binary operations, and they will be studied
   in this section. Since we will deal with two relations per operation
   here, we will call them m and n, and f and g will be the number of
   their fields.

2.1  Product

   Product is a very complex operations. It is O(|n|*|m|).
   Obvious.

2.2 Intersection

   Same as product even if it does a different thing. But it has to
   compare every tuple from n with every tuple from m, to see if they
   match, and in this case, put them in the resulting relation.
   So this operation is O(|n|*|m|) as well.

2.3  Difference

   Same as above:

2.4  Union

   This operation first creates a new relation copying all the tuples
   from one of the originating relations, then compares them all with
   tuples from the other relation, and if they aren't in, they will be
   added.
   In fact it is same as above: O(|n|*|m|)

2.5  Thetajoin

   This operation is the combination of a product and a selection. So it
   is O(|n|*|m|) too.

2.6  Outer

    This operation is the union of the outer left and the outer right
    join. Makes it O(|n|*|m|) too.

2.7  Outer_left

   O(|n|*|m|), very depending on the number of the fields, because they
   are compared.

2.8  Outer_right

   Mirror operation of outer_lef

2.9  Join

   Same as above.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx