PARALLEL QUERY EVALUATION
A relational query execution plan is a graph (or tree) of relational algebra operators, and the operators in the graph can be executed in parallel. If one operator consumes the output of a second operator as that output is produced, we have pipelined parallelism.
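Pipelined parallelism can be illustrated with a minimal sketch in which a scan operator feeds a selection operator through a bounded queue, so the consumer starts working before the producer has finished. The names scan, select, and run_pipeline are hypothetical, and Python threads are used only to show the dataflow; this is not how a real query engine is implemented.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the tuple stream


def scan(rows, out):
    """Producer operator: emits tuples one at a time into the queue."""
    for r in rows:
        out.put(r)
    out.put(_DONE)


def select(pred, inp, result):
    """Consumer operator: filters tuples as they arrive from the producer."""
    while True:
        r = inp.get()
        if r is _DONE:
            break
        if pred(r):
            result.append(r)


def run_pipeline(rows, pred):
    q = queue.Queue(maxsize=4)  # small buffer between the two operators
    result = []
    producer = threading.Thread(target=scan, args=(rows, q))
    consumer = threading.Thread(target=select, args=(pred, q, result))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return result


evens = run_pipeline(range(10), lambda x: x % 2 == 0)
```

The bounded queue makes the producer block when the consumer falls behind, which is the back-pressure one would expect between pipelined operators.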
Data partitioning: A large database is partitioned horizontally across several disks, which lets us exploit the I/O bandwidth of the disks by reading and writing them in parallel. This can be done in the following ways:
Round-robin partitioning: If there are n processors, the i-th tuple is assigned to processor i mod n. Round-robin partitioning is suitable for efficiently evaluating queries that access the entire relation. If only a subset of the tuples is required, hash partitioning and range partitioning are better than round-robin partitioning.
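The round-robin rule can be sketched in a few lines. The function name round_robin_partition and the sample rows are assumptions made for illustration, and tuples are numbered from 0.

```python
def round_robin_partition(tuples, n):
    """Assign the i-th tuple (counting from 0) to partition i mod n."""
    partitions = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        partitions[i % n].append(t)
    return partitions


# Hypothetical sample relation: (name, age) tuples.
rows = [("emp%d" % k, 20 + k) for k in range(7)]
parts = round_robin_partition(rows, 3)
sizes = [len(p) for p in parts]  # sizes differ by at most one tuple
```

Because placement ignores tuple contents, a query on a specific value still has to look in every partition, which is why this scheme suits full scans best.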
Hash partitioning: A hash function is applied to (selected fields of) a tuple to determine its processor. Hash partitioning has the additional virtue that it keeps data evenly distributed even if the data grows and shrinks over time.
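A minimal sketch of hash partitioning follows. The helper names partition_of and hash_partition are hypothetical, and zlib.crc32 stands in for whatever hash function a real system would use; it is chosen here only because Python's built-in hash() of strings changes between runs.

```python
import zlib


def partition_of(value, n):
    """Map a partitioning-key value to one of n partitions."""
    return zlib.crc32(repr(value).encode("utf-8")) % n


def hash_partition(tuples, n, key):
    """Place each tuple in the partition chosen by hashing its key field."""
    partitions = [[] for _ in range(n)]
    for t in tuples:
        partitions[partition_of(key(t), n)].append(t)
    return partitions


# Hypothetical sample relation: (name, department) tuples.
rows = [("alice", "sales"), ("bob", "hr"), ("carol", "sales"),
        ("dave", "ops"), ("erin", "hr")]
parts = hash_partition(rows, 3, key=lambda t: t[1])

# An equality selection on the key only needs to probe one partition.
target = partition_of("sales", 3)
sales_rows = [t for t in parts[target] if t[1] == "sales"]
```

All tuples with the same key value land in the same partition, so an equality selection on the partitioning field can be answered by a single processor.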
Range partitioning: Tuples are sorted, and ranges are chosen for the sort key values so that each range contains roughly the same number of tuples; tuples in range i are assigned to processor i. Range partitioning can lead to data skew, that is, partitions holding widely varying numbers of tuples.
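The following sketch shows one way to pick range boundaries and assign tuples to them, and how skew can appear when later inserts cluster in one range. The function names and the sample ages are assumptions made for illustration, and the partitions are numbered from 0.

```python
import bisect


def choose_split_points(keys, n):
    """Pick n-1 split points so each range holds roughly len(keys)/n keys."""
    ordered = sorted(keys)
    return [ordered[(len(ordered) * j) // n] for j in range(1, n)]


def range_partition(tuples, split_points, key):
    """Assign each tuple to the range that contains its key value."""
    partitions = [[] for _ in range(len(split_points) + 1)]
    for t in tuples:
        i = bisect.bisect_right(split_points, key(t))
        partitions[i].append(t)
    return partitions


# Hypothetical sample relation: (id, age) tuples.
ages = [23, 35, 41, 19, 52, 28, 60, 33, 47]
rows = list(enumerate(ages))
split_points = choose_split_points(ages, 3)
parts = range_partition(rows, split_points, key=lambda t: t[1])
sizes = [len(p) for p in parts]  # balanced for the data the ranges were built on

# New tuples that all fall in the lowest range skew the partitions
# if the split points are not recomputed.
new_rows = [(100 + k, a) for k, a in enumerate([21, 22, 24, 25, 26, 27])]
skewed = range_partition(rows + new_rows, split_points, key=lambda t: t[1])
skewed_sizes = [len(p) for p in skewed]
```

With the original data the three ranges are balanced, but after the clustered inserts the first partition holds most of the tuples, which is the kind of skew the text warns about.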
The Disadvantages of Parallel Databases
1. Implementation is highly expensive.
2. Managing a parallel database is difficult and complex.
3. A lot of resources are needed to support and maintain the database.