I am a research scientist at Disney Research Zurich, where I head the Deep Learning and Analytics group. I am particularly interested in deep learning, randomized algorithms, and optimization for large-scale learning.
I completed my Ph.D. in 2012 in the Statistics Section of the Department of Mathematics at Imperial College London, under the supervision of Giovanni Montana.
Previously, I completed an MSc in Informatics, specialising in Machine Learning, Neuroinformatics and Intelligent Robotics, at the University of Edinburgh and a BEng in Computer Systems Engineering at the University of Warwick.
Until August 2015 I was a postdoctoral researcher and lecturer in the Institute of Machine Learning at ETH Zurich.
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
D Balduzzi, M Frean, L Leary, JP Lewis, K Ma, B McWilliams. arXiv.
Neural Taylor Approximation: Convergence and Exploration in Rectifier Networks.
D Balduzzi, B McWilliams, T Butler-Yeoman. arXiv.
Preserving Differential Privacy Between Features in Distributed Estimation.
C Heinze-Deml, B McWilliams, N Meinshausen. arXiv.
Kernel-predicting Convolutional Networks for Denoising Monte Carlo Renderings.
S Bako, T Vogels, B McWilliams, M Meyer, J Novak, A Harvill, T DeRose, P Sen, F Rousselle. SIGGRAPH 2017.
Scalable Adaptive Stochastic Optimization Using Random Projections.
G Krummenacher, B McWilliams, Y Kilcher, J Buhmann, N Meinshausen. NIPS 2016. arXiv.
A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation.
F Perazzi, J Pont-Tuset, B McWilliams, M Gross, L Van Gool, A Sorkine-Hornung. CVPR 2016. Project page.
DUAL-LOCO: Distributing Statistical Estimation Using Random Projections.
C Heinze, B McWilliams, N Meinshausen. AISTATS 2016. arXiv. Software (maintained by Christina).
Variance Reduced Stochastic Gradient Descent with Neighbors.
T Hofmann, A Lucchi, S Lacoste-Julien, B McWilliams. NIPS 28. arXiv.
DUAL-LOCO: Preserving privacy between features in distributed estimation.
C Heinze, B McWilliams, N Meinshausen. NIPS WS on Learning and privacy with incomplete data and weak supervision.
LOCO: Distributing Ridge Regression with Random Projections.
C Heinze, B McWilliams, N Meinshausen, G Krummenacher. arXiv. Software (maintained by Christina).
Learning Representations for Outlier Detection on a Budget.
B Micenková, B McWilliams, I Assent. arXiv.
A Variance Reduced Stochastic Newton Method.
A Lucchi, B McWilliams, T Hofmann. arXiv.
RadaGrad: Random Projections for Adaptive Stochastic Optimization.
G Krummenacher, B McWilliams. 7th NIPS Workshop on Optimization for Machine Learning. pdf.
Fast and Robust Least Squares Estimation in Corrupted Linear Models.
B McWilliams, G Krummenacher, M Lučić, J Buhmann. NIPS 27. arXiv. Software (maintained by Gabriel). Slides and video from the Zuri ML meetup.
Learning Outlier Ensembles: The Best of Both Worlds – Supervised and Unsupervised.
B Micenková, B McWilliams, I Assent. KDD Workshop on Outlier Detection & Description under Data Diversity (ODD²). pdf.
- Subspace clustering of high-dimensional data: a predictive approach.
B McWilliams, G Montana. Data Mining and Knowledge Discovery. 28(3): 736-772. arXiv.
- Correlated random features for fast semi-supervised learning.
B McWilliams, D Balduzzi, J Buhmann. In Advances in Neural Information Processing Systems (NIPS) 26. arXiv. Matlab code. Poster.
- Pruning random features with correlated kitchen sinks (1 page abstract).
B McWilliams, D Balduzzi. SPARS 2013.
2012 and earlier
- Projection based models for high dimensional data. Ph.D. thesis.
- Multi-view predictive partitioning in high dimensions.
B McWilliams, G Montana. Statistical Analysis and Data Mining (2012). 5: 304-321. arXiv.
- Predictive subspace clustering.
B McWilliams, G Montana. In Proceedings of the 10th International Conference on Machine Learning and Applications (2011), 247-252. pdf.
- A PRESS statistic for two-block partial least squares regression.
B McWilliams, G Montana. In Proceedings of the 10th Conference on Computational Intelligence UK (2010), Colchester. pdf.
- Sparse partial least squares for on-line variable selection in multivariate data streams.
B McWilliams, G Montana. Statistical Analysis and Data Mining (2010). 3: 170-193. pdf.
- Predictive modeling with high-dimensional data streams: an on-line variable selection approach.
B McWilliams, G Montana. Signal Processing with Adaptive Sparse Structured Representations (SPARS) 2009.
Fall Semester 2014
I co-lectured Probabilistic Graphical Models for Image Analysis with Dr. Aurelien Lucchi.
Spring Semester 2014
I was head TA for the Computational Intelligence Laboratory course. This course is now taught by Prof. Hofmann and headed by Martin Jaggi and Aurelien Lucchi.
The website for the probabilistic graphical models course that Dr. David Balduzzi and I taught at Uni Basel in summer 2013 is located here.
Disney Research Zurich