International Journal of Intelligent Systems and Applications(IJISA)
ISSN: 2074-904X (Print), ISSN: 2074-9058 (Online)
Published By: MECS Press
IJISA Vol.3, No.4, Jun. 2011
Speed up linear scan in high-dimensions by sorting one-dimensional projections
Full Text (PDF, 173KB), PP.41-48
High-dimensional indexing is a pervasive challenge faced in multimedia retrieval. Existing indexing methods applying linear scan strategy, such as VA-file and its variations, are still efficient when the dimensionality is high. In this paper, we propose a new access idea implemented on linear scan based methods to speed up the nearest-neighbor queries. The idea is to map high-dimensional points into two kinds of one-dimensional values using projection and distance computation. The projection values on the line determined by the first Principal Component are sorted and indexed using a B+-tree, and the distances of each point to a reference point are also embedded into leaf node of the B+-tree. When performing nearest neighbor search, the Partial Distortion Searching and triangular inequality are employed to prune search space. In the new search algorithm, only a small portion of data points need to be linearly accessed by computing the bounded distance on the one-dimensional line, which can reduce the I/O and processor time dramatically. Experiment results on large image databases show that the new access method provides a faster search speed than existing high-dimensional index methods.
Cite This Paper
Jiangtao Cui, Bin Xiao, Gengdai Liu, Lian Jiang,"Speed up linear scan in high-dimensions by sorting one-dimensional projections", International Journal of Intelligent Systems and Applications(IJISA), vol.3, no.4, pp.41-48, 2011. DOI: 10.5815/ijisa.2011.04.06
C. Bohm, S. Berchtold, D. A. Keim, “Searching in high-dimensional spaces-index structures for improving the performance of multimedia databases,” ACM Comput. Surv., vol. 33, no. 3, pp. 322–373, 2001.
R. Weber, H.-J. Schek, S. Blott, “A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces,” in VLDB 1998, pp. 194–205.
K. S. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, “When is ”nearest neighbor” meaningful?” in ICDT 1999, pp. 217–235.
K. V. R. Kanth, D. Agrawal, and A. Singh, “Dimensionality reduction for similarity searching in dynamic databases,” SIGMOD Rec., vol. 27, no. 2, pp. 166-176, 1998.
K. Chakrabarti and S. Mehrotra, “Local dimensionality reduction: A new approach to indexing high dimensional spaces,” in VLDB’00: Proceeding of the 26th International Conference on Very Large Data Bases. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2000, pp. 89-100.
H. T. Shen, X. Zhou, and A. Zhou, “An adaptive and dynamic dimensionality reduction method for high-dimensional indexing,” The VLDB Journal, vol. 16, no. 2, pp. 219-234, 2007.
X. Lian and L. Chen, “A general cost model for dimensionality reduction in high dimensionality spaces,” in ICDE ’07: Proceedings of the 23th International Conference on Data Engineering, 2007, pp. 66-75.
H. Ferhatosmanoglu, E. Tuncel, D. Agrawal, and A. E. Abbadi, “high-dimensional nearest neighbor searching,” Information Systems, vol. 31, no. 6, pp. 512-540, 2006.
J. Cui, S. Zhou, and J. Sun, “Efficient high-dimensional indexing by sorting principal component,” Pattern Recogn. Lett., vol. 28, no. 16, pp. 2412–2418, 2007.
S. Berchtold, C. Bohm, H. V. Jagadish, H.-P. Kreiegel, and J. Sander, “Independent quantization: An index compression technique for high-dimensional data spaces,” in ICDE ’00: Proceedings of the 16th International Conference on Data Engineering. Washington, DC, USA: IEEE Computer Society, 2000, p. 577.
G.-H. Cha and C.-W. Chung, “The gc-tree: a high-dimensional index structure for similairty search in image databases,” IEEE Trans Multimedia, vol. 4, pp. 235-247, 2002.
S. Berchtold, C. Bohm, and H.-P. Kriegal, “The pyramid-technique: towards breaking the curse of dimensionality,” SIGMOD Rec., vol. 27, no. 2, pp. 142-153, 1998.
H. V. Jagadish, B. C. Ooi, K.-L. Tan, C. Yu, and R. Zhang, “iDistance: An adaptive b+-tree based indexing method for nearest neighbor search,” ACM Trans. Database Syst., vol. 30, no. 2, pp. 364-397, 2005.
M. Ortega, Y. Rui, K. Chakrabarti, S. Mehrotra, and T. S. Huang, “Supporting similarity queries in mars,” in MULTIMEDIA 1997, pp. 403–413.
B. S. Manjunath and W. Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 8, pp. 837-842, 1996
H. T. Shen, B. C. Ooi, , Z. Huang, and X. Zhou, “Towards effective indexing for very large video sequence database,” in SIGMOD 2005, pp. 730–741.
K. Chakrabarti and S. Mehrotra, “Local dimensionality reduction: A new approach to indexing high dimensional spaces,” in VLDB 2000, pp. 89–100