TY - GEN
T1 - Fragment and replicate algorithms for non-equi-join evaluation on smart disks
AU - Stoumpos, Vassilis
AU - Delis, Alex
PY - 2009
Y1 - 2009
N2 - The predicates in a non-equi-join can be anything but equality relations. Non-equi-join predicates can be as simple as an inequality expression between two join relation fields, or as complex as a user-defined function that carries out arbitrary complex comparisons. The nature of non-equi-join calls for predicate evaluation over all possible combinations of tuples in a two-way join. In this paper, we consider the family of fragment and replicate join algorithms that facilitates non-equijoin evaluation and adapt it in a Smart Disk environment. We use Smart Disk as an umbrella term for a variety of different storage devices featuring an embedded processor that may offload data processing from the main CPU. Our approach partially replicates one of the join relations in order to harness all processing capacity in the system. However, partial replication introduces problems with synchronizing concurrent algorithmic steps, load balancing, and selection among different join evaluation alternatives. We use a processing model to avoid performance pitfalls and autonomously select algorithm parameters. Through experimentation we find our proposed algorithms to utilize all system resources and, thus, yield better performance.
AB - The predicates in a non-equi-join can be anything but equality relations. Non-equi-join predicates can be as simple as an inequality expression between two join relation fields, or as complex as a user-defined function that carries out arbitrary complex comparisons. The nature of non-equi-join calls for predicate evaluation over all possible combinations of tuples in a two-way join. In this paper, we consider the family of fragment and replicate join algorithms that facilitates non-equijoin evaluation and adapt it in a Smart Disk environment. We use Smart Disk as an umbrella term for a variety of different storage devices featuring an embedded processor that may offload data processing from the main CPU. Our approach partially replicates one of the join relations in order to harness all processing capacity in the system. However, partial replication introduces problems with synchronizing concurrent algorithmic steps, load balancing, and selection among different join evaluation alternatives. We use a processing model to avoid performance pitfalls and autonomously select algorithm parameters. Through experimentation we find our proposed algorithms to utilize all system resources and, thus, yield better performance.
KW - Active disks
KW - Array of disks
KW - Database join
KW - Fragment and replicate parallelism
KW - Non-equi-joins
KW - Smart disks
UR - http://www.scopus.com/inward/record.url?scp=70449640494&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449640494&partnerID=8YFLogxK
U2 - 10.1109/ISADS.2009.5207358
DO - 10.1109/ISADS.2009.5207358
M3 - Conference contribution
AN - SCOPUS:70449640494
SN - 9781424443277
T3 - Proceedings - 2009 International Symposium on Autonomous Decentralized Systems, ISADS 2009
SP - 471
EP - 478
BT - Proceedings - 2009 International Symposium on Autonomous Decentralized Systems, ISADS 2009
T2 - 2009 International Symposium on Autonomous Decentralized Systems, ISADS 2009
Y2 - 23 March 2009 through 25 March 2009
ER -