
Journal of Shandong University (Natural Science) ›› 2020, Vol. 55 ›› Issue (7): 55-66. doi: 10.6040/j.issn.1671-9352.0.2019.318


Nyström extension based on circulant matrix projection

LIU Jing-shu1, WANG Li1*, LIU Jing-lei2

  1. School of Big Data, Taiyuan University of Technology, Jinzhong 030600, Shanxi, China;
  2. School of Computer and Control Engineering, Yantai University, Yantai 264005, Shandong, China
  • Published: 2020-07-08
  • About the authors: LIU Jing-shu (1997– ), female, master's student; her research focuses on matrix factorization in big-data environments. E-mail: liujingshu1997@163.com. *Corresponding author: WANG Li (1971– ), female, PhD, professor; her research focuses on online social media computing and machine learning. E-mail: wangli@tyut.edu.cn
  • Supported by: the National Natural Science Foundation of China (61872260, 61572419) and the Natural Science Foundation of Shanxi Province (201703D421013)

Abstract: Unlike random-sampling matrix approximation methods, this work designs a projection method based on a random circulant matrix to approximate the kernel matrix. First, an initial sketch of the matrix is obtained by random sampling; a circulant embedding matrix is then constructed and used as a projection matrix, embedding the initial sketch of the input data space into a low-dimensional feature subspace. Finally, singular value decomposition is performed in this feature subspace, thereby extending the traditional Nyström method. Compared with other typical matrix approximation methods, the proposed Nyström method offers low time complexity and high reconstruction accuracy. Empirical studies on four real datasets corroborate the theoretical findings and demonstrate that the proposed circulant-projection method achieves satisfactory reconstruction precision and effectively extends the traditional Nyström method.

Key words: Nyström approach, circulant matrix, dimensionality reduction, random projection, reconstruction precision
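
The pipeline described in the abstract, column sampling to obtain an initial sketch, a random circulant projection into a low-dimensional feature subspace, and an SVD in that subspace, can be illustrated with a short NumPy sketch. The code below is a minimal, hypothetical reconstruction for illustration only, not the authors' implementation: the function names (`circulant_embed`, `circulant_nystrom`), the uniform column sampling, the Rademacher sign flips, and the final rank-restricted reconstruction are all assumptions. It relies only on the standard fact that a circulant matrix acts as a circular convolution, so the projection costs O(n log n) per column via the FFT.

```python
import numpy as np

def circulant_embed(X, k, rng):
    """Circulant random projection R^n -> R^k, applied column-wise to X (n x m).
    Uses the identity circ(c) @ v = ifft(fft(c) * fft(v)), so the n x n
    circulant matrix is never formed explicitly. The Rademacher sign flips D
    decorrelate the input, as in standard structured-projection constructions."""
    n = X.shape[0]
    c = rng.standard_normal(n)                  # first column of the circulant matrix
    D = rng.choice([-1.0, 1.0], size=n)         # random sign flips
    fc = np.fft.fft(c)
    Y = np.fft.ifft(fc[:, None] * np.fft.fft(D[:, None] * X, axis=0), axis=0)
    return np.real(Y[:k, :]) / np.sqrt(k)       # keep the first k coordinates

def circulant_nystrom(K, m, k, rng=None):
    """Hypothetical pipeline following the abstract: (1) sample m columns of K
    as an initial sketch, (2) embed the sketch into a k-dimensional feature
    subspace with a circulant projection, (3) SVD there and lift back."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = K.shape[0]
    idx = rng.choice(n, size=m, replace=False)  # step 1: random column sampling
    C = K[:, idx]                               # n x m initial sketch
    S = circulant_embed(C, k, rng)              # step 2: k x m projected sketch
    _, _, Vt = np.linalg.svd(S, full_matrices=False)  # step 3: SVD in the subspace
    Q, _ = np.linalg.qr(C @ Vt.T)               # orthonormal basis for range(C)
    return Q @ (Q.T @ K @ Q) @ Q.T              # low-rank reconstruction of K

# Toy usage: RBF kernel on 500 random 2-D points
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)
K_hat = circulant_nystrom(K, m=80, k=40, rng=rng)
print("relative Frobenius error:", np.linalg.norm(K - K_hat) / np.linalg.norm(K))
```

For reference, the traditional Nyström method that this construction extends approximates K ≈ C W⁺ Cᵀ from the sampled columns C and the m × m intersection block W; the sketch above instead takes the SVD after the circulant projection, which is one plausible reading of the pipeline the abstract describes.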

CLC number: 

  • TP18