***New website: https://sites.google.com/view/gugurus***

I am a Research Scientist at Google Brain, where I work mainly on problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. My recent research focuses on sample-efficient RL methods that can scale to solve difficult continuous control problems in the real world; this work has been covered by the Google Research Blog and MIT Technology Review.

I completed my PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where I was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During my PhD, I also collaborated closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. I hold a B.A.Sc. in Engineering Science from the University of Toronto, where I did my thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. I also had a great time working with Steve Mann, developing real-time HDR capture for wearable cameras/displays. I interned at Google Brain, hosted by Ilya Sutskever and Vincent Vanhoucke, and volunteered as a Lab Scientist at Creative Destruction Lab, one of Canada's leading tech-startup incubators. My PhD was funded by the Cambridge-Tübingen PhD Fellowship, NSERC, and a Google Focused Research Award.

I am a Japan-born Chinese Canadian, and I speak, read, and write in three languages. Having lived in Japan, China, Canada, the US, the UK, and Germany, I go by multiple names: Shane Gu, Shixiang Gu, 顾世翔, 顧世翔 (ぐう せいしょう). My Chinese name means “world” (世) and “flying” (翔), so fittingly, I have “flown around the world”.

Preprints
  1. Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. “Data-Efficient Hierarchical Reinforcement Learning”. [Arxiv] [Videos]

Publications
  1. George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. “The Mirage of Action-Dependent Baselines in Reinforcement Learning”. ICML 2018. [Arxiv]
  2. Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. “Temporal Difference Models: Model-Free Deep RL for Model-Based Control”. ICLR 2018. [Paper] [Arxiv] *equal contribution
  3. Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. “Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning”. ICLR 2018. [Paper] [Arxiv] [Videos]
  4. Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. “Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning”. NIPS 2017. [Paper]
  5. Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. “Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-Control”. ICML 2017. [Paper]
  6. Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. “Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic”. ICLR 2017. [Oral, ~3%] [Paper]
  7. Eric Jang, Shixiang Gu, Ben Poole. “Categorical Reparameterization with Gumbel-Softmax”. ICLR 2017. [Paper]
  8. Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. “Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates”. ICRA 2017. [Paper] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video] *equal contribution
  9. Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. “Continuous Deep Q-Learning with Model-based Acceleration”. ICML 2016. [Paper] [Arxiv]
  10. Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. “MuProp: Unbiased Backpropagation for Stochastic Neural Networks”. ICLR 2016. [Paper]
  11. Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. “Neural Adaptive Sequential Monte Carlo”. NIPS 2015. [Paper] [Arxiv] [Supplementary]
  12. Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. “Particle Gibbs for Infinite Hidden Markov Models”. NIPS 2015. [Paper] [Arxiv] *equal contribution
  13. Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. “Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes”. IEEE CCECE 2012, Montreal, April 29 – May 2, 2012. Also presented at the ACM SIGGRAPH 2012 Emerging Technologies Exhibition. [Paper] [BibTex] [Video]

Workshop Papers
  1. Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck. “Tuning Recurrent Neural Networks with Reinforcement Learning”. NIPS 2016 Deep Reinforcement Learning Workshop. [Paper] [MIT Technology Review] [Video] 
  2. Shixiang Gu, Luca Rigazio. “Toward Deep Neural Network Architectures Robust to Adversarial Examples”. ICLR 2015 Workshop. [Paper]

Invited Talks
  1. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Vector Institute, Canada, 2017.
  2. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Xiaoou Tang and Xiaogang Wang. CUHK, Hong Kong, China, 2017.
  3. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Masashi Sugiyama and Yutaka Matsuo. University of Tokyo, Japan, 2016.
  4. Timothy Lillicrap, Shixiang Gu. “Deep RL methods in Robotics”. Reinforcement Learning Forum. Google, USA, 2016.
  5. Shixiang Gu. “Generalized Backprop, Neural Particle Filter, and Guided Q-Learning”. Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
  6. Shixiang Gu. “Algorithms for Training Deep Stochastic Neural Networks”. Hosted by Noah Goodman. Stanford University, USA, 2015.
  7. Shixiang Gu, Andrey Malinin. “Long Short-Term Memory Networks”. Machine Learning RCC. Cambridge University, UK, 2015.

Curriculum vitae

My CV is here.

Contact
Email: <my-initials>717 at <first-three-char-of-cambridge> dot ac dot uk
Mail: Office BE4-40, Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK