Journal of Shandong University (Natural Science), 2020, Vol. 55, Issue 7: 55-66. DOI: 10.6040/j.issn.1671-9352.0.2019.318


Nyström extension based on circulant matrix projection

LIU Jing-shu1, WANG Li1*, LIU Jing-lei2   

  1. School of Big Data, Taiyuan University of Technology, Jinzhong 030600, Shanxi, China;
  2. School of Computer and Control Engineering, Yantai University, Yantai 264005, Shandong, China
  Published: 2020-07-08

Abstract: Unlike random sampling approximation methods, a projection method based on a circulant matrix is designed to approximate the kernel matrix. First, an approximation sketch of the matrix is obtained by random sampling; then a circulant matrix is constructed as a projection matrix that approximately maps data from the input space to the feature space. Finally, the traditional Nyström method is extended by applying singular value decomposition in the feature space. Compared with other typical matrix approximation methods, the improved Nyström method offers low time complexity and high reconstruction accuracy. Empirical studies on four real datasets corroborate the theoretical findings and demonstrate that the proposed random projection method achieves satisfactory reconstruction precision.
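Read as a pipeline, the abstract describes three steps: sample columns of the kernel matrix to form a sketch, apply a circulant random projection to obtain approximate feature-space coordinates, and reconstruct via an SVD-based Nyström step. The following is a minimal NumPy sketch of that pipeline, not the authors' exact algorithm: uniform column sampling, a Gaussian RBF kernel on the projected features, and the helper names circulant_projection and nystrom_approx are all assumptions made for illustration.

import numpy as np

def circulant_projection(X, k, seed=0):
    """Map rows of X to k dimensions with a random circulant matrix (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    r = rng.standard_normal(n)            # first column of the circulant matrix
    d = rng.choice([-1.0, 1.0], size=n)   # random sign flips decorrelate coordinates
    # circ(r) @ x is a circular convolution: ifft(fft(r) * fft(x)); apply to every row at once
    proj = np.fft.ifft(np.fft.fft(r) * np.fft.fft(X * d, axis=1), axis=1).real
    return proj[:, :k] / np.sqrt(k)       # keep the first k coordinates, rescale

def nystrom_approx(K, m, seed=0):
    """Classical Nystrom reconstruction: K ~ C @ pinv(W) @ C.T."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(K.shape[0], size=m, replace=False)  # uniform column sample
    C = K[:, idx]                         # n x m sketch of the kernel matrix
    W = K[np.ix_(idx, idx)]               # m x m intersection block
    U, s, _ = np.linalg.svd(W)            # SVD of the small block
    s_inv = np.where(s > 1e-10, 1.0 / s, 0.0)
    return C @ (U * s_inv) @ U.T @ C.T    # U diag(1/s) U^T is pinv(W) for symmetric PSD W

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 128))
    Z = circulant_projection(X, k=64)     # approximate feature-space mapping
    sq = (Z ** 2).sum(axis=1)             # Gaussian RBF kernel on the projected data
    K = np.exp(-0.5 * np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0))
    K_hat = nystrom_approx(K, m=50)
    err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
    print(f"relative reconstruction error: {err:.4f}")

The appeal of the circulant construction is that multiplying by circ(r) is a circular convolution with r, so each projection costs O(n log n) via the FFT and only the single vector r needs to be stored, versus O(nk) time and memory for a dense Gaussian projection.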

Key words: Nyström approach, circulant matrix, dimensionality reduction, random projection, reconstruction precision

CLC Number: TP18