Yuxuan Lou

National University of Singapore, Singapore

I am Yuxuan Lou, a Ph.D. student in Computer Science at NUS. My advisor is Prof. Yang You. I received my B.Sc. in Applied Mathematics from Fudan University and my M.Sc. in Statistics from NUS. My research interests lie in Artificial Intelligence, Deep Learning, and High Performance Computing. Specifically, I am currently working on scalable machine learning algorithms and large-scale pretrained models. You can also check my CV here or visit my Google Scholar page for further information.


Education

National University of Singapore, Singapore

Ph.D. in Computer Science
School of Computing
January 2023 - Present

National University of Singapore, Singapore

M.Sc. in Statistics
School of Statistics and Probability
August 2020 - March 2022

Fudan University, Shanghai, China

B.Sc. in Applied Mathematics
School of Data Science
August 2018 - July 2020
School of Mathematical Science
August 2016 - July 2018


Publications

Cross-token Modeling with Conditional Computation

Yuxuan Lou, Fuzhao Xue, Zangwei Zheng, Yang You, 2021.
NeurIPS 2022 In Submission

[arXiv Preprint]

One Student Knows All Experts Know: From Sparse to Dense

Fuzhao Xue, Xiaoxin He, Xiaozhe Ren, Yuxuan Lou, Yang You, 2022.
NeurIPS 2022 In Submission

[arXiv Preprint]

Go Wider Instead of Deeper

Fuzhao Xue, Ziji Shi, Yuxuan Lou, Yong Liu, Yang You, 2021.
AAAI 2022 Accepted

[arXiv Preprint]

Research Experience

Neural Network Model Scaling with Mixture of Experts

NUS HPC-AI Lab, Advised by Prof. Yang You

Revisited and reproduced modern ViT and MLP-like models; designed and built large-scale models with conditional computation based on Mixture of Experts (MoE). Introduced a fully-MLP architecture with conditional computation along two dimensions, extending MoE to the spatial dimension of image representations. Distributed model training across TPU clusters. Conducted detailed ablation studies to investigate the contribution of different model components. Introduced parameter sharing to the ViT-MoE model and proposed an illustration of the specific LayerNorm parameters. Two papers submitted to AAAI 2022.

Key Words: Mixture of Experts, Vision Transformer, MLP-like models, Conditional computation, Large-scale model design, Parameter sharing

March 2021 - Present

Neural Network based Image Compression and Image Query System

DAS LAB, Harvard University, Advised by Prof. Stratos Idreos

Built neural network models for image compression, including an auto-encoder, adaptive arithmetic coding, and adaptive code length regularization. Built models based on Pyramid CNN and Generative Adversarial Networks for different query tasks on compressed image representations. Introduced SPP-net and inverse SPP-net, designed to better capture and summarize multi-scale image information. Performed model testing and hyper-parameter tuning.

Key Words: Image compression, Auto-encoder, Adaptive arithmetic coding, Adaptive code length regularization, Pyramid CNN, GAN, SPP-net

July 2019 - January 2020

Scoring System for Figure Skating Based on LSTM

CV LAB, School of Data Science, Fudan University, Advised by Prof. Yanwei Fu

Reviewed video analysis methods including SVR, CNN, 3D convolution, and LSTM. Built a dataset by downloading figure skating videos (NHK, TEB, COC, 4CC, etc.) and filtered out videos that were not fluent or coherent. Helped propose a neural network architecture with two complementary components: a Self-Attentive LSTM and a Multi-scale Convolutional Skip LSTM. Compared different pooling and regression methods.

Key Words: Figure skating score system, Dataset construction, Self-attentive LSTM, Multi-scale Convolutional Skip LSTM

May 2018 - January 2019

Design of Toolkit (fastNLP) for Natural Language Processing

School of Data Science, Fudan University, Advised by Prof. Xipeng Qiu

Studied the SQuAD dataset and its baselines. Reviewed pre-trained language models and methods including OpenAI GPT, ELMo, etc. Learned and built the BERT model, and analyzed it on tasks such as masked LM and next sentence prediction using SQuAD, GLUE, etc. Participated in designing fastNLP, a modularized and extensible toolkit for Natural Language Processing.

Key Words: fastNLP, BERT, SQuAD, Masked LM, Next sentence prediction

September 2018 - December 2018

Professional Experience


Deep Learning Engineer
Co-developer of Colossal-AI and Colossal-AI Examples. Built an open-domain dialog system with internet knowledge augmentation.

Key Words: Colossal-AI, Open-domain dialog, Internet knowledge augmentation

February 2022 - December 2022

Interactive Entertainment Group, Tencent

Machine Learning Engineer Intern
Performed data mining and cleaning on user comments about specific games. Built machine learning models to classify the emotional level of comments, and deep learning abstractive text summarization models to summarize comment content.

Key Words: Data mining, Emotion classification, Abstractive text summarization

March 2020 - June 2020


Skills

  • Software: MATLAB, LaTeX, MS Office
  • Programming Languages: Python, R, C++, Pascal
  • Deep Learning: TensorFlow, PyTorch, Keras
  • Database: SQL, Spark

Activities & Interests

During my undergraduate years, I was a member of the School of Mathematical Sciences debate team. I really enjoyed presenting evidence, defending and questioning arguments, and developing persuasion techniques. We won the 2017 Fudan Debating Championship.

I also served as a volunteer mathematics teacher for kindergarten children in Yangpu District. I had a lot of fun introducing basic mathematical concepts to the children.

In my free time, I enjoy reading science fiction; The Three-Body Problem and the Foundation series are my favourites.