Recent News

I joined Nvidia to work on autonomous vehicles and robotics!
We will be hosting a tutorial on Object-centric Representations in Computer Vision at CVPR 2024. Stay tuned and see you in Seattle!
🚀 Exciting News! 📘 Our latest survey paper is now released, presenting a comprehensive analysis of hallucination phenomena in multimodal large language models (MLLMs), also known as Large Vision-Language Models (LVLMs). (Paper, Resource Repo).
Two papers were accepted to CVPR 2024: Adaptive Slot Attention (Paper, Project Page) and Learning for Transductive Threshold Calibration in Open-World Recognition (Paper).
Introducing our ICLR 2024 work, 🔥Instruct Video-to-Video🔥, an efficient approach for video editing that eliminates the need for per-video-per-model finetuning by constructing a synthetic paired video dataset. (Paper, Code)
Four papers were accepted to ICCV 2023: OC-MOT (Paper, Code), Slot-Naming (Paper), C2F-Seg (Paper, Project Page), EoRaS (Paper).
One paper is accepted to ICLR 2023: Bridging the Gap to Real-World Object-Centric Learning. Paper link and code link.
One paper is accepted to NeurIPS 2022 (Spotlight): Self-supervised Amodal Video Object Segmentation. Paper link and code link.

About Me

I recently returned to the front line of autonomous vehicles and robotics, working as a principal engineer at Nvidia.
I was an Applied Science Manager at Amazon Web Services AI Shanghai Lablet, leading computer vision efforts. I play a lot with objects. In this period, I will be focusing on object-centric learning, vision-language models, graph neural networks and causal representation learning, exploring and exploiting their use in applications like video analysis, 3D vision, autonomous driving and robotics. I also contributed to the Graph Neural Network framework DGL and the Object-centric Learning Framework OCLF.
Before joining Amazon, I was a Staff Machine Learning Scientist on the Tesla Autopilot AI/Vision team, working with Dr. Andrej Karpathy. I was one of the major contributors to the Autopilot vision neural network stack and the task owner of Autopilot (Dynamic and Static) Object Detection from 2017 to 2020. My work has shipped in hundreds of thousands of Tesla cars worldwide during major Autopilot releases, contributing to Autopilot functionalities like Traffic-Aware Cruise Control, Auto Lane Change, Automatic Emergency Braking, Navigation on Autopilot, Smart Summon, etc.
Prior to Tesla, I spent 3.25 years at Microsoft. I was a Software Engineer 2 on the Microsoft Bing Multimedia team (now under Microsoft AI & Research Org) working with Dr. Linjun Yang, where I worked on Image-Text Semantic Embedding, contributing to functionalities like Image Annotation and Image Search in the Bing search engine. During my graduate years, I interned at Microsoft Research Asia, advised by Prof. Zheng Zhang and Dr. Kuiyuan Yang, where I worked on both the training platform and vision applications of deep learning. I was a major contributor to the open-source deep learning training framework Minerva and also contributed to the machine learning library MXNet.
I received an M.S. degree in Computer Science from Wangxuan Institute of Computer Technology, Peking University, advised by Prof. Yuxin Peng, and a B.S. degree in Computer Science from Nankai University.
My passion is applying machine learning to large-scale, life-changing technologies, currently with a focus on computer vision applications.

Tianjun Xiao
