State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences. I am an open source enthusiast. I love coding, long-distance running, and cycling.
I received my B.E. degree in Computer Science and Technology from Wuhan University in July 2016 and my Ph.D. degree in Computer Architecture from Institute of Computing Technology, CAS in July 2021 under the supervision of Prof. Feng Xiaobing.
My Ph.D. thesis, "On the System Optimizations of DNN Accelerators," covers benchmarking, compiler, and runtime system optimizations for dedicated DNN accelerators.
My research interests span Compiler Techniques, Runtime Systems, Programming Languages, Computer Architecture, Distributed Computing, Parallel Computing, and Machine Learning.
I am interested in everything about the underlying infrastructure, including but not limited to Compilers, Programming Languages, Operating Systems, Runtimes, and Computer Architecture.
2017.06 ~ 2019.12, Design and implementation of BANG, a high-performance programming language for Cambricon neural network chips. For more details about the compiler, please check out this page. Thanks to this project, I gained the ability to build and hack on a large system.
2019.01 ~ 2019.02, Google AI Machine Learning Winter Camp, Peking Site. Automatic App Name Generator - a tool to generate popular app names. For more details, please check out this GitHub repo.
2019.09 ~ 2020.03, Characterizing the end-to-end deployment of DNNs on commercial AI accelerators, e.g., Cambricon MLU100 and Huawei Atlas 300. For more details, please check out this GitHub repo.
2019.06 ~ present, A tiny DSL for DNN accelerators with formal PL specifications. For more details, please check out this GitHub repo.
2020.06 ~ 2021.02, Application-oblivious memory scheduling support for heterogeneous computing systems. The core idea is a runtime system that automatically pinpoints the memory behavior of each device memory block, detects memory access patterns, and generates a memory scheduling plan, thereby reducing the memory pressure on device accelerators. The implementation is based on the Apache top-level project Singa. For more details, please check out this repo. For baselines, see this repo.
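The idea of detecting periodic access patterns and planning swaps can be sketched roughly as follows. This is a minimal illustrative sketch, not the actual Singa-based implementation; all names (`MemoryScheduler`, `record_access`, `plan`) and the simple period-detection heuristic are hypothetical.

```python
from collections import defaultdict

class MemoryScheduler:
    """Hypothetical sketch: record when each device memory block is
    accessed, detect a periodic (per-iteration) pattern, and plan to
    swap out blocks whose next access is far in the future."""

    def __init__(self, swap_threshold=4):
        # block_id -> list of access steps (e.g., kernel launch indices)
        self.accesses = defaultdict(list)
        self.swap_threshold = swap_threshold

    def record_access(self, block_id, step):
        self.accesses[block_id].append(step)

    def period(self, block_id):
        # A periodic pattern has a constant gap between accesses;
        # return that gap, or None if no stable pattern is seen yet.
        steps = self.accesses[block_id]
        if len(steps) < 2:
            return None
        gaps = [b - a for a, b in zip(steps, steps[1:])]
        return gaps[0] if all(g == gaps[0] for g in gaps) else None

    def plan(self, current_step):
        # Swap out blocks whose predicted next access is farther away
        # than the threshold; they can be prefetched back in time.
        actions = []
        for block_id, steps in self.accesses.items():
            p = self.period(block_id)
            if p is None:
                continue
            next_access = steps[-1] + p
            if next_access - current_step > self.swap_threshold:
                actions.append((block_id, "swap_out", next_access))
        return actions

# Example: weights accessed every 10 steps, activations every step.
sched = MemoryScheduler(swap_threshold=4)
for step in (0, 10, 20):
    sched.record_access("weights", step)
sched.record_access("act", 19)
sched.record_access("act", 20)
# At step 21, "weights" is not needed until step 30, so it is a
# candidate for swapping out; "act" is needed immediately and stays.
actions = sched.plan(21)
```

A real runtime would additionally intercept allocations, account for transfer latency when scheduling the swap-in, and fall back to on-demand paging for blocks with irregular patterns.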
2021.01 ~ 2021.02, Xilinx Customized Computing Winter Camp. Design and optimization of a tiny TPU-like DNN accelerator with HLS and LLVM CIRCT. Thanks to Xilinx for providing the PYNQ-Z2 FPGA board.
For any questions or suggestions, feel free to open an issue on GitHub or email me at jsonlee@whu.edu.cn.