***New website link at: https://sites.google.com/view/gugurus***
I started my PhD in Machine Learning under the Cambridge-Tübingen PhD Fellowship in fall 2014. I am co-supervised by Richard E. Turner and Zoubin Ghahramani at the University of Cambridge, and Bernhard Schölkopf at the Max Planck Institute for Intelligent Systems in Tübingen. I also collaborate closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. I completed my B.A.Sc. in Engineering Science at the University of Toronto, where I did my thesis with Prof. Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. I also had the great fortune, and a lot of fun, working with Prof. Steve Mann on real-time HDR capture for wearable cameras/displays. I previously interned at Google Brain, hosted by Ilya Sutskever and Vincent Vanhoucke. My PhD is funded by NSERC and a Google Focused Research Award. I am a member of Jesus College, Cambridge.
I am a Lab Scientist at Creative Destruction Lab, one of the leading tech-startup incubators in Canada.
I study machine learning involving sequential processing, such as reinforcement learning and sequence prediction. I currently focus on learning-driven approaches for robotics, which have been covered in a Google Research blog post and MIT Technology Review. I also work on deep learning, probabilistic models, and generative models.
- Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. “Data-Efficient Hierarchical Reinforcement Learning”. [Arxiv] [Videos]
- George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. “The Mirage of Action-Dependent Baselines in Reinforcement Learning”. ICML 2018. [Arxiv]
- Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. “Temporal Difference Models: Model-Free Deep RL for Model-Based Control”. ICLR 2018. *equal contribution [Paper] [Arxiv]
- Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. “Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning”. ICLR 2018. [Paper] [Arxiv] [Videos]
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. “Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning”. NIPS 2017. [Paper]
- Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. “Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control”. ICML 2017. [Paper]
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. “Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic”. ICLR 2017 [Oral, ~3%]. [Paper]
- Eric Jang, Shixiang Gu, Ben Poole. “Categorical Reparameterization with Gumbel-Softmax”. ICLR 2017. [Paper]
- Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. “Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates”. ICRA 2017. [Paper] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video] *equal contribution
- Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. “Continuous Deep Q-Learning with Model-based Acceleration”. ICML 2016. [Paper] [Arxiv]
- Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. “MuProp: Unbiased Backpropagation for Stochastic Neural Networks”. ICLR 2016. [Paper]
- Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. “Neural Adaptive Sequential Monte Carlo”. NIPS 2015. [Paper] [Arxiv] [Supplementary]
- Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. “Particle Gibbs for Infinite Hidden Markov Models”. NIPS 2015. [Paper] [Arxiv] *equal contribution
- Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. “Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes”, IEEE CCECE 2012, Montreal, 2012 April 29 to May 2. 6 pages, to be indexed in IEEE Xplore. ACM SIGGRAPH 2012, Emerging Technologies Exhibition. [Paper] [BibTex] [Video]
- Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck. “Tuning Recurrent Neural Networks with Reinforcement Learning”. NIPS 2016 Deep Reinforcement Learning Workshop. [Paper] [MIT Technology Review] [Video]
- Shixiang Gu, Luca Rigazio. “Toward Deep Neural Network Architectures Robust to Adversarial Examples”. ICLR 2015 Workshop. [Paper]
- Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Vector Institute, Canada, 2017.
- Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Xiaoou Tang and Xiaogang Wang. CUHK, Hong Kong, China, 2017.
- Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Masashi Sugiyama and Matsuo Yutaka. University of Tokyo, Japan, 2016.
- Timothy Lillicrap, Shixiang Gu. “Deep RL methods in Robotics”. Reinforcement Learning Forum. Google, USA, 2016.
- Shixiang Gu. “Generalized Backprop, Neural Particle Filter, and Guided Q-Learning”. Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
- Shixiang Gu. “Algorithms for Training Deep Stochastic Neural Networks”. Hosted by Noah Goodman. Stanford University, USA, 2015.
- Shixiang Gu, Andrey Malinin. “Long Short-Term Memory Networks”. Machine Learning RCC. Cambridge University, UK, 2015.
Having lived in Japan, China, Canada, the US, the UK, and Germany, I go by multiple names: Shane Gu, Shixiang Gu, 顾世翔, 顧世翔(ぐう せいしょう).
My CV is here.
Email: <my-initials>717 at <first-three-char-of-cambridge> dot ac dot uk
Mail: Office BE4-40, Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK