Binhua Zuo
Education
The University of Tokyo
M.S. in Information Science and Technology [Jul. 2018]
Pattern Recognition, Algorithm Design, Data Analysis, Robotics and Virtual Reality Systems, etc.
Beijing University of Posts and Telecommunications
B.S. in Electronic Science and Technology [Jul. 2016]
Signals and Systems, Electronic and Circuit Foundations, Digital Signal Processing, etc.
Work experience
Flexiv Robotics
Senior AI Algorithm Engineer (Jul. 2019 - Present)
- Researched and implemented state-of-the-art computer vision algorithms, such as object detection (including 3D and rotated objects), instance segmentation, etc.
- Developed a multi-task learning framework for jointly learning object location, mask, and pose.
- Designed and built the Noema training cloud platform, a complete pipeline from data labeling, manipulation, and augmentation to model training and deployment, supporting up to 9 kinds of vision tasks.
- Developed a highly configurable training framework with pluggable tools for data augmentation, model architecture customization, model compression and optimization, etc.
- Created an inference engine featuring easy-to-design multi-model workflows, with configurable and customizable pre- and post-processing of data.
Baidu Computer Vision Technology Department
Computer Vision Engineer (Nov. 2018 - Jul. 2019)
- Performed face attribute editing in videos, such as changing gender and age (younger or older).
- Implemented AttGAN and STGAN using PaddlePaddle.
- Performed low-quality video detection (e.g., camera-shake detection) based on FlowNet.
Huawei Japan Image Research Center
Computer Vision Researcher - Intern (Apr. - Oct. 2018)
- Conducted research on 3D reconstruction and depth estimation.
- Predicted surface normals and boundaries, then combined them with the raw depth image to produce a complete depth map.
- Ported code from Torch to the Caffe framework and ran it on the Kirin 980's NPU.
DJI Japan
Computer Vision Researcher - Intern (Sep. - Oct. 2017)
- Conducted research on image super-resolution.
- Utilized a Laplacian pyramid structure and two parallel networks to extract image features and reconstruct the image simultaneously.
- Added two residual feedback connections, increasing PSNR by 0.1 dB.
Sato’s Lab, University of Tokyo
Master's Student (Oct. 2016 - Aug. 2018)
- Conducted research on hand recognition and egocentric video summarization.
- Added a prediction module to an RNN to better extract important events from video.
- Used a saliency model to predict gaze location.
Skills
- Proficient in PyTorch, PaddlePaddle, TensorFlow, Python, C++, etc.
- Experienced in
  - computer vision algorithms: object detection (3D objects, oriented objects, etc.), semantic/instance segmentation, etc.
  - model conversion, compression, and optimization.
- Chinese (native), English (fluent), Japanese (fluent, N1), Spanish (basic).