Current database management systems rely on operators that compare pairs of data elements by identity and by total ordering relationships. However, those comparison operators are not suited to complex data, such as multimedia data (image, audio and large text), time series, genetic sequences, etc. Instead, similarity among data elements is the main criterion for comparing complex data, which leads to corresponding similarity-based operations for querying it. Similarity-based comparison operators are therefore much better suited to such data. There are unary and binary similarity-based operations: the unary ones are employed in selection operations, and the binary ones in join operations. The standard query language for relational databases is SQL, but it does not provide support for similarity queries. At GBdI-ICMC-USP we are developing an extension to SQL that enables similarity-based queries to be expressed in SQL. The present project is part of this development. It aims to represent similarity-based search operations both in SQL and in a command tree that supports query expression rewriting, allowing query execution to be optimized. The proposed activities therefore include: extending SQL to support similarity-based queries (mixing similarity-based and non-similarity-based comparison criteria), representing queries in internal command trees, and defining heuristics and techniques for optimizing queries that mix selection and join operations.
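To make the distinction between unary and binary similarity operations concrete, the following is a minimal Python sketch, not the project's SQL extension. It assumes elements are numeric feature vectors compared by Euclidean distance, and it uses a range query and a k-nearest-neighbor query as examples of similarity selection and a range join as an example of similarity join. All function names (`range_select`, `knn_select`, `range_join`) are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Assumed dissimilarity measure: Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_select(data: List[Vector], center: Vector, radius: float,
                 dist: Distance = euclidean) -> List[Vector]:
    """Unary operation (selection): elements within `radius` of `center`."""
    return [e for e in data if dist(e, center) <= radius]


def knn_select(data: List[Vector], center: Vector, k: int,
               dist: Distance = euclidean) -> List[Vector]:
    """Unary operation (selection): the `k` elements closest to `center`."""
    return sorted(data, key=lambda e: dist(e, center))[:k]


def range_join(left: List[Vector], right: List[Vector], radius: float,
               dist: Distance = euclidean) -> List[Tuple[Vector, Vector]]:
    """Binary operation (join): pairs whose distance is at most `radius`.

    Naive nested-loop evaluation; a real system would use an index.
    """
    return [(l, r) for l in left for r in right if dist(l, r) <= radius]


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
    print(range_select(points, (0.0, 0.0), 1.0))
    print(knn_select(points, (0.0, 0.0), 3))
    print(range_join([(0.0, 0.0)], [(0.0, 1.0), (5.0, 5.0)], 1.0))
```

In this sketch the selection functions take one data set plus a query center, while the join takes two data sets, which mirrors the unary and binary roles described above. The nested-loop join is the simplest possible evaluation strategy and is the kind of plan that query rewriting and optimization would aim to improve.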