***New website link at: https://sites.google.com/view/gugurus***
I am a Research Scientist at Google Brain, where I mainly work on problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. My recent research focuses on sample-efficient RL methods that could scale to solve difficult continuous control problems in the real world. This work has been covered in a Google Research blog post and in MIT Technology Review.
I completed my PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where I was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During my PhD, I also collaborated closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. I hold a B.A.Sc. in Engineering Science from the University of Toronto, where I did my thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. I also had a great time working with Steve Mann, developing real-time HDR capture for wearable cameras/displays. I interned at Google Brain, hosted by Ilya Sutskever and Vincent Vanhoucke. I also volunteered as a Lab Scientist at Creative Destruction Lab, one of the leading tech-startup incubators in Canada. My PhD was funded by the Cambridge-Tübingen PhD Fellowship, NSERC, and a Google Focused Research Award.
I am a Japan-born Chinese Canadian, and I speak, read, and write in three languages. Having lived in Japan, China, Canada, the US, the UK, and Germany, I go by multiple names: Shane Gu, Shixiang Gu, 顾世翔, 顧世翔(ぐう せいしょう). My Chinese name combines “world” (世) and “flying” (翔), and thus I “flew around the world”.
- George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison. “Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives”. [Arxiv]
- Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. “Data-Efficient Hierarchical Reinforcement Learning”. NIPS 2018. [Arxiv] [Videos]
- George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. “The Mirage of Action-Dependent Baselines in Reinforcement Learning”. ICML 2018. [Arxiv]
- Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. “Temporal Difference Models: Model-Free Deep RL for Model-Based Control”. ICLR 2018. *equal contribution [Paper] [Arxiv]
- Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. “Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning”. ICLR 2018. [Paper] [Arxiv] [Videos]
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. “Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning”. NIPS 2017. [Paper]
- Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. “Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control”. ICML 2017. [Paper]
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. “Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic”. ICLR 2017 [Oral, ~3%]. [Paper]
- Eric Jang, Shixiang Gu, Ben Poole. “Categorical Reparameterization with Gumbel-Softmax”. ICLR 2017. [Paper]
- Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. “Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates”. ICRA 2017. [Paper] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video] *equal contribution
- Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. “Continuous Deep Q-Learning with Model-based Acceleration”. ICML 2016. [Paper] [Arxiv]
- Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. “MuProp: Unbiased Backpropagation for Stochastic Neural Networks”. ICLR 2016. [Paper]
- Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. “Neural Adaptive Sequential Monte Carlo”. NIPS 2015. [Paper] [Arxiv] [Supplementary]
- Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. “Particle Gibbs for Infinite Hidden Markov Models”. NIPS 2015. [Paper] [Arxiv] *equal contribution
- Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. “Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes”, IEEE CCECE 2012, Montreal, 2012 April 29 to May 2. 6 pages, to be indexed in IEEE Xplore. ACM SIGGRAPH 2012, Emerging Technologies Exhibition. [Paper] [BibTex] [Video]
- Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck. “Tuning Recurrent Neural Networks with Reinforcement Learning”. NIPS 2016 Deep Reinforcement Learning Workshop. [Paper] [MIT Technology Review] [Video]
- Shixiang Gu, Luca Rigazio. “Toward Deep Neural Network Architectures Robust to Adversarial Examples”. ICLR 2015 Workshop. [Paper]
- Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Vector Institute, Canada, 2017.
- Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Xiaoou Tang and Xiaogang Wang. CUHK, Hong Kong, China, 2017.
- Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Masashi Sugiyama and Matsuo Yutaka. University of Tokyo, Japan, 2016.
- Timothy Lillicrap, Shixiang Gu. “Deep RL methods in Robotics”. Reinforcement Learning Forum. Google, USA, 2016.
- Shixiang Gu. “Generalized Backprop, Neural Particle Filter, and Guided Q-Learning”. Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
- Shixiang Gu. “Algorithms for Training Deep Stochastic Neural Networks”. Hosted by Noah Goodman. Stanford University, USA, 2015.
- Shixiang Gu, Andrey Malinin. “Long Short-Term Memory Networks”. Machine Learning RCC. Cambridge University, UK, 2015.
My CV is here.
Email: <my-initials>717 at <first-three-char-of-cambridge> dot ac dot uk
Mail: Office BE4-40, Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK