Dosovitskiy A, Brox T. Inverting Visual Representations with Convolutional Networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4829-4837.
Mahendran A, Vedaldi A. Understanding deep image representations by inverting them[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 5188-5196.
Yosinski J, Clune J, Nguyen A, et al. Understanding neural networks through deep visualization[J]. arXiv preprint arXiv:1506.06579, 2015.
Ba L J, Caruana R. Do Deep Nets Really Need to be Deep?[C]//Advances in Neural Information Processing Systems. 2014: 2654-2662.
Bastani O, Kim C, Bastani H. Interpreting Blackbox Models via Model Extraction[J]. arXiv preprint arXiv:1705.08504, 2017.
Che Z, Purushotham S, Khemani R, et al. Distilling knowledge from deep networks with applications to healthcare domain[J]. arXiv preprint arXiv:1512.03542, 2015.
Hinton G, Vinyals O, Dean J. Distilling the Knowledge in a Neural Network[J]. arXiv preprint arXiv:1503.02531, 2015.
Harvey N, Liaw C, Mehrabian A. Nearly-tight VC-dimension bounds for piecewise linear neural networks[C]//Conference on Learning Theory. 2017: 1064-1068.
Koiran P, Sontag E D. Neural Networks with Quadratic VC Dimension[J]. Journal of Computer and System Sciences, 1997, 54(1): 190-198.
Sontag E D. VC Dimension of Neural Networks[M]//Neural Networks and Machine Learning. Springer, 1998: 69-95.
Zhao B, Wu X, Feng J, et al. Diversified visual attention networks for fine-grained object classification[J]. arXiv preprint arXiv:1606.08572, 2016.
Xiao T, Xu Y, Yang K, et al. The application of two-level attention models in deep convolutional neural network for fine-grained image classification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 842-850.
Jaderberg M, Simonyan K, Zisserman A. Spatial transformer networks[C]//Advances in Neural Information Processing Systems. 2015: 2017-2025.
Firat O, Cho K, Bengio Y. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism[C]//Proceedings of NAACL-HLT. 2016: 866-875.
Cheng Y, Shen S, He Z, et al. Agreement-based joint training for bidirectional attention-based neural machine translation[C]//Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence. AAAI Press, 2016: 2761-2767.
Meng F, Lu Z, Li H, et al. Interactive Attention for Neural Machine Translation[C]//Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. 2016.
Chen J, Zhang H, He X, et al. Attentive collaborative filtering: Multimedia recommendation with item-and component-level attention[C]//Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 2017: 335-344.
Xiao J, Ye H, He X, et al. Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks[C]//Proceedings of the 26th International Joint Conference on Artificial Intelligence. 2017: 3119-3125.
Liu Q, Zeng Y, Mokhosi R, et al. STAMP: short-term attention/memory priority model for session-based recommendation[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018: 1831-1839.
Chen J, Zhuang F, Hong X, et al. Attention-driven Factor Model for Explainable Personalized Recommendation[C]//The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 2018: 909-912.