***New website link at: https://sites.google.com/view/gugurus***

I started my PhD in Machine Learning under the Cambridge-Tübingen PhD Fellowship in the fall of 2014. I am co-supervised by Richard E. Turner and Zoubin Ghahramani at the University of Cambridge, and by Bernhard Schölkopf at the Max Planck Institute for Intelligent Systems in Tübingen. I also collaborate closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. I completed my B.A.Sc. in Engineering Science at the University of Toronto, where I did my thesis with Prof. Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. I also had the great fortune and fun of working with Prof. Steve Mann, developing real-time HDR capture for wearable cameras/displays. I previously interned at Google Brain, hosted by Ilya Sutskever and Vincent Vanhoucke. My PhD is funded by NSERC and a Google Focused Research Award. I am a member of Jesus College, Cambridge.

I am a Lab Scientist at Creative Destruction Lab, one of the leading tech-startup incubators in Canada.

Research interests

I work on machine learning for sequential processing, such as reinforcement learning and sequence prediction. I currently focus on learning-driven approaches to robotics, which have been covered by the Google Research Blog and MIT Technology Review. I also work on deep learning, probabilistic models, and generative models.


Preprints

  1. Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. “Data-Efficient Hierarchical Reinforcement Learning”. [Arxiv] [Videos]


Conference papers

  1. George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. “The Mirage of Action-Dependent Baselines in Reinforcement Learning”. ICML 2018. [Arxiv]
  2. Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. “Temporal Difference Models: Model-Free Deep RL for Model-Based Control”. ICLR 2018. *equal contribution [Paper] [Arxiv]
  3. Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. “Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning”. ICLR 2018. [Paper] [Arxiv] [Videos]
  4. Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. “Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning”. NIPS 2017. [Paper]
  5. Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. “Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control”. ICML 2017. [Paper]
  6. Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. “Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic”. ICLR 2017. [Oral, ~3%] [Paper]
  7. Eric Jang, Shixiang Gu, Ben Poole. “Categorical Reparameterization with Gumbel-Softmax”. ICLR 2017. [Paper]
  8. Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. “Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates”. ICRA 2017. [Paper] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video] *equal contribution
  9. Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. “Continuous Deep Q-Learning with Model-based Acceleration”. ICML 2016. [Paper] [Arxiv]
  10. Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. “MuProp: Unbiased Backpropagation for Stochastic Neural Networks”. ICLR 2016. [Paper]
  11. Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. “Neural Adaptive Sequential Monte Carlo”. NIPS 2015. [Paper] [Arxiv] [Supplementary]
  12. Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. “Particle Gibbs for Infinite Hidden Markov Models”. NIPS 2015. [Paper] [Arxiv] *equal contribution
  13. Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. “Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes”, IEEE CCECE 2012, Montreal, 2012 April 29 to May 2. 6 pages, to be indexed in IEEE Xplore. ACM SIGGRAPH 2012, Emerging Technologies Exhibition. [Paper] [BibTex] [Video]


Workshop papers

  1. Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck. “Tuning Recurrent Neural Networks with Reinforcement Learning”. NIPS 2016 Deep Reinforcement Learning Workshop. [Paper] [MIT Technology Review] [Video]
  2. Shixiang Gu, Luca Rigazio. “Toward Deep Neural Network Architectures Robust to Adversarial Examples”. ICLR 2015 Workshop. [Paper]


Invited talks

  1. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Vector Institute, Canada, 2017.
  2. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Xiaoou Tang and Xiaogang Wang. CUHK, Hong Kong, China, 2017.
  3. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Masashi Sugiyama and Yutaka Matsuo. University of Tokyo, Japan, 2016.
  4. Timothy Lillicrap, Shixiang Gu. “Deep RL methods in Robotics”. Reinforcement Learning Forum. Google, USA, 2016.
  5. Shixiang Gu. “Generalized Backprop, Neural Particle Filter, and Guided Q-Learning”. Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
  6. Shixiang Gu. “Algorithms for Training Deep Stochastic Neural Networks”. Hosted by Noah Goodman. Stanford University, USA, 2015.
  7. Shixiang Gu, Andrey Malinin. “Long Short-Term Memory Networks”. Machine Learning RCC. Cambridge University, UK, 2015.


Having lived in Japan, China, Canada, the US, the UK, and Germany, I go by multiple names: Shane Gu, Shixiang Gu, 顾世翔, and 顧世翔 (ぐう せいしょう).

Curriculum vitae

My CV is here.


Email: <my-initials>717 at <first-three-char-of-cambridge> dot ac dot uk
Mail: Office BE4-40, Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK