Properties of the unary similarity operator integrated to the relational algebra.
Conventional operators for data comparison are based either on exact matching or on a total order relationship among elements. Neither is appropriate for managing complex data, such as multimedia data (e.g., images, audio and large texts), time series and genetic sequences. In fact, the most meaningful way to compare complex data is by similarity. However, the Relational Algebra, employed in relational Database Management Systems (RDBMS), cannot express similarity criteria. To provide this support, an extension to the Relational Algebra is under development at GBdI-ICMC-USP (Databases and Images Group), aiming to represent similarity queries in algebraic expressions. This technical report defines a formal framework that embodies the unary similarity operators in the Relational Algebra, and precisely defines their algebraic properties when used in query expressions, either alone or combined with the existing exact matching and relational operators. We also show how to take advantage of these properties to optimize queries that include similarity operators, presenting a similarity query optimizer developed for SIREN (the Similarity Retrieval Engine), which uses an existing RDBMS to answer similarity queries.
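As an informal illustration of why such algebraic properties matter for optimization, the following minimal Python sketch models two unary similarity selections commonly discussed in the literature, a range selection and a k-nearest-neighbor selection, and checks whether each can be reordered with an exact-match selection. All function names, the toy relation and the distance function are hypothetical, chosen only for this example; they are not SIREN's interface and do not reproduce the formal definitions of the report.

```python
# Hypothetical toy model: a relation is a list of dicts, and the
# "complex" attribute is a plain number compared by absolute difference.

def select(rows, pred):
    """Conventional (exact-match) selection."""
    return [r for r in rows if pred(r)]

def range_select(rows, attr, center, radius, dist):
    """Similarity range selection: rows within `radius` of `center`."""
    return [r for r in rows if dist(r[attr], center) <= radius]

def knn_select(rows, attr, center, k, dist):
    """Similarity k-nearest-neighbor selection: the k rows closest to `center`."""
    return sorted(rows, key=lambda r: dist(r[attr], center))[:k]

def ids(rows):
    """Set of row identifiers, used to compare results regardless of order."""
    return {r["id"] for r in rows}

rows = [
    {"id": 1, "kind": "a", "x": 0.0},
    {"id": 2, "kind": "b", "x": 1.0},
    {"id": 3, "kind": "a", "x": 2.0},
    {"id": 4, "kind": "a", "x": 5.0},
]

def dist(u, v):
    return abs(u - v)

def is_a(r):
    return r["kind"] == "a"

# Range selection and exact-match selection give the same result in either order.
range_then_filter = ids(select(range_select(rows, "x", 0.0, 2.0, dist), is_a))
filter_then_range = ids(range_select(select(rows, is_a), "x", 0.0, 2.0, dist))

# k-NN selection depends on which rows it sees, so the order changes the result.
knn_then_filter = ids(select(knn_select(rows, "x", 0.0, 2, dist), is_a))
filter_then_knn = ids(knn_select(select(rows, is_a), "x", 0.0, 2, dist))

print(range_then_filter == filter_then_range)  # True
print(knn_then_filter == filter_then_knn)      # False
```

In this toy relation the range selection can be pushed before or after the exact-match filter without changing the answer, whereas the k-NN selection cannot. An optimizer that rewrites query expressions needs to know which reorderings of this kind are safe.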