Xiao Yan(晏潇)

Research Assistant Professor

DataBase Group
Department of Computer Science & Engineering,
Southern University of Science and Technology,
Nanshan, Shenzhen, China


I currently work in DataBaseGroup@SUSTECH, led by Prof. Bo Tang. I obtained my Ph.D. degree from The Chinese University of Hong Kong (CUHK) under the supervision of Prof. James Cheng (2016~2020). Before joining CUHK, I received my M.Phil. degree from Beijing University of Posts and Telecommunications under the supervision of Prof. Ping Zhang (2012~2015), and my bachelor's degree from the University of Electronic Science and Technology of China (2008~2012).

Research Interest

My research aims to make large-scale data processing and machine learning efficient. Specific areas include large-scale information retrieval (e.g., search and recommendation), probabilistic algorithms (e.g., hashing and sketching), and systems for large-scale data processing (e.g., data mining and machine learning).

Student Recruitment

Our team is looking for Ph.D. students, Master's students, research interns and undergraduate interns. Because the quota is limited, candidates are expected to be self-motivated and hardworking, and to have strong implementation or mathematical skills. Please feel free to send me your CV or schedule a discussion via email.

Publications

  • Convolutional Embedding for Edit Distance, In ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2020. With Xinyan Dai, Kaiwen Zhou, Yuxuan Wang, Han Yang, James Cheng. Corresponding Author. [link].

  • Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search, In AAAI Conference on Artificial Intelligence (AAAI), 2020. With Xinyan Dai, Kelvin K. W. Ng, Jie Liu, James Cheng. Co-first Author, Oral Presentation. [arxiv] [github].

  • Understanding and Improving Proximity Graph based Maximum Inner Product Search, In AAAI Conference on Artificial Intelligence (AAAI), 2020. With Jie Liu, Xinyan Dai, Zhirong Li, James Cheng, Ming-Chang Yang. Co-first Author. [arxiv].

  • Wasserstein Collaborative Filtering for Item Cold-start Recommendation, In ACM Conference on User Modeling, Adaptation and Personalization (UMAP), 2020. With Yitong Meng, Weiwen Liu, Huanhuan Wu, James Cheng. Corresponding Author. [link].

  • PMD: A Novel User Distance for Recommender Systems, In European Conference on Information Retrieval (ECIR), 2020. Yitong Meng, Xinyan Dai, Xiao Yan, James Cheng, Weiwen Liu, Jun Guo, Benben Liao, Guangyong Chen. [arxiv].

  • Tangram: Bridging Immutable and Mutable Abstractions for Distributed Data Analytics, In USENIX Annual Technical Conference (ATC), 2019. With Yuzhen Huang, Guanxian Jiang, Tatiana Jin, James Cheng, An Xu, Zhanhan Liu, Shuo Tu. Co-first Author. [link] [github].

  • Pyramid: A General Framework for Distributed Similarity Search, In IEEE International Conference on Big Data (IEEE BigData), 2019. With Shiyuan Deng, Kelvin K. W. Ng, Chenyu Jiang, James Cheng. Co-first Author. [arxiv].

  • Grasper: A High Performance Distributed System for OLAP on Property Graphs, In ACM Symposium on Cloud Computing (SoCC), 2019. Hongzhi Chen, Changji Li, Juncheng Fang, Chenghuan Huang, James Cheng, Jian Zhang, Yifan Hou, Xiao Yan. [link].

  • Norm-Ranging LSH for Maximum Inner Product Search, In Advances in Neural Information Processing Systems (NeurIPS), 2018. Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng. [arxiv] [github].

  • A general and efficient querying method for learning to hash, In ACM SIGMOD International Conference on Management of Data (SIGMOD), 2018. Jinfeng Li, Xiao Yan, Jian Zhang, An Xu, James Cheng, Jie Liu, Kelvin K. W. Ng, Ti-chung Cheng. [link].

  • G-Miner: an efficient task-oriented graph mining system, In European Conference on Computer Systems (EuroSys), 2018. Hongzhi Chen, Miao Liu, Yunjian Zhao, Xiao Yan, Da Yan, James Cheng. [link].

  • FlexPS: Flexible parallelism control in parameter server architecture, In International Conference on Very Large Data Bases (VLDB), 2018. Yuzhen Huang, Tatiana Jin, Yidi Wu, Zhenkun Cai, Xiao Yan, Fan Yang, Jinfeng Li, Yuying Guo, James Cheng. [link].

  • LoSHa: A general framework for scalable locality sensitive hashing, In ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2017. Jinfeng Li, James Cheng, Fan Yang, Yuzhen Huang, Yunjian Zhao, Xiao Yan, Ruihao Zhao. [link].

Services

  • International Joint Conferences on Artificial Intelligence (IJCAI), 2021
  • AAAI Conference on Artificial Intelligence (AAAI), 2021
  • International Conference on Machine Learning (ICML), 2020
  • Advances in Neural Information Processing Systems (NeurIPS), 2020
  • Very Large Data Base Journal (VLDBJ)
  • IEEE Transactions on Knowledge and Data Engineering (TKDE)
  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
  • Pattern Recognition

Contact Information

  • Office: Room 1016, Nanshan iPark A7
  • Email: yanxiaosunny@gmail.com