Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Publications

Calibrated approximate Bayesian inference

Published in Proceedings of the 36th International Conference on Machine Learning, 2019

We give a general-purpose computational framework for estimating the bias in coverage resulting from making approximations in Bayesian inference. Coverage is the probability that credible sets cover the true parameter values. We show how to estimate the actual coverage an approximation scheme achieves when the ideal observation model and the prior can be simulated, but have been replaced in the Monte Carlo with approximations because they are intractable. The coverage-estimation procedures given in Lee et al. (2018) work well on simple problems, but are biased and do not scale well, as those authors note. For example, the methods of Lee et al. (2018) fail for calibration of an approximate, completely collapsed MCMC algorithm for partition structure in a Dirichlet process for clustering group labels in a hierarchical model. By exploiting the symmetry of the coverage error under permutation of low-level group labels and smoothing with Bayesian Additive Regression Trees, we are able to show that the original approximate inference had poor coverage and should not be trusted.
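As a concrete illustration of the coverage idea, here is a minimal sketch of simulation-based coverage estimation on a toy Gaussian model. Everything in it (the model, the deliberately miscalibrated approximation, and all names) is invented for illustration; the paper's method additionally smooths such Monte Carlo coverage estimates, for example with Bayesian Additive Regression Trees.

```python
import numpy as np

# Simulation-based coverage check: draw theta from the prior, data from
# the model, and count how often the approximate credible interval
# captures the true theta. Toy model: theta ~ N(0,1), y_i ~ N(theta, 1).

rng = np.random.default_rng(0)
n_obs, n_rep, z90 = 20, 5000, 1.645  # z90: 90% two-sided normal quantile

def approx_interval(y):
    # Exact posterior under the N(0,1) prior is N(n*ybar/(n+1), 1/(n+1)).
    # The "approximation" shrinks the posterior sd by 30%, mimicking the
    # variance underestimation typical of many approximate schemes.
    n = len(y)
    mean, sd = n * y.mean() / (n + 1), 0.7 / np.sqrt(n + 1)
    return mean - z90 * sd, mean + z90 * sd

hits = 0
for _ in range(n_rep):
    theta = rng.normal()                    # theta ~ prior
    y = rng.normal(theta, 1.0, size=n_obs)  # y | theta ~ model
    lo, hi = approx_interval(y)
    hits += lo <= theta <= hi

print(f"nominal coverage 0.90, estimated actual coverage {hits / n_rep:.3f}")
# Prints roughly 0.75: the approximate credible sets undercover.
```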

Download here

Distortion estimates for approximate Bayesian inference

Published in Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence, 2020

Current literature on posterior approximation for Bayesian inference offers many alternative methods. Does our chosen approximation scheme work well on the observed data? The best existing generic diagnostic tools treat this kind of question by looking at performance averaged over data space, or otherwise lack diagnostic detail. However, if the approximation is bad for most data but good at the observed data, then we may discard a useful approximation. We give graphical diagnostics for posterior approximation at the observed data. We estimate a “distortion map” that acts on univariate marginals of the approximate posterior to move them closer to the exact posterior, without recourse to the exact posterior.
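To make the idea of a distortion map concrete, the toy sketch below corrects a univariate approximate marginal by composing its CDF with the exact quantile function, d(theta) = F_exact^{-1}(F_approx(theta)). Both marginals are known Gaussians here purely for illustration; the point of the paper is to estimate such a map at the observed data without access to the exact posterior.

```python
import numpy as np
from scipy.stats import norm

# A distortion map pushes draws from the approximate marginal onto the
# exact marginal. Both distributions are hypothetical stand-ins here.

approx = norm(loc=0.3, scale=1.3)  # approximate posterior marginal
exact = norm(loc=0.0, scale=1.0)   # exact posterior marginal

draws = approx.rvs(size=10_000, random_state=2)
corrected = exact.ppf(approx.cdf(draws))  # d = F_exact^{-1} o F_approx

print(f"approximate mean/sd: {draws.mean():+.3f} / {draws.std():.3f}")
print(f"corrected mean/sd:   {corrected.mean():+.3f} / {corrected.std():.3f}")
```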

Download here

Improving bridge estimators via f-GAN

Published in Statistics and Computing, 2022

Bridge sampling is a powerful Monte Carlo method for estimating ratios of normalizing constants. Various methods have been introduced to improve its efficiency. These methods aim to increase the overlap between the two densities by applying appropriate transformations to them without changing their normalizing constants. In this paper, we first give a new estimator of the asymptotic relative mean square error (RMSE) of the optimal Bridge estimator by equivalently estimating an f-divergence between the two densities. We then use this framework to propose the f-GAN-Bridge estimator (f-GB), based on a bijective transformation that maps one density to the other and minimizes the asymptotic RMSE of the optimal Bridge estimator with respect to the densities. This transformation is chosen by minimizing a specific f-divergence between the densities. We show that f-GB is optimal in the sense that, within any given set of candidate transformations, the f-GB estimator can asymptotically achieve an RMSE lower than or equal to that achieved by Bridge estimators based on any other transformed densities. Numerical experiments show that f-GB outperforms existing methods in simulated and real-world examples. In addition, we discuss how Bridge estimators naturally arise from the problem of f-divergence estimation.
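For readers unfamiliar with bridge sampling, the sketch below runs Meng and Wong's classical fixed-point iteration for the optimal Bridge estimator of a ratio of normalizing constants, on a pair of one-dimensional Gaussians whose true ratio is known. It is a baseline illustration only, not the f-GB estimator, which additionally learns a transformation to increase the overlap between the densities.

```python
import numpy as np

# Estimate r = Z1/Z2 from unnormalized densities q1, q2 and samples from
# the corresponding normalized densities p1, p2, via the iteration
#   r <- mean_{x~p2}[q1/(s1*q1 + s2*r*q2)] / mean_{x~p1}[q2/(s1*q1 + s2*r*q2)].

rng = np.random.default_rng(1)

q1 = lambda x: np.exp(-0.5 * x**2)                 # unnormalized N(0,1), Z1 = sqrt(2*pi)
q2 = lambda x: np.exp(-0.5 * (x - 1.0)**2 / 4.0)   # unnormalized N(1,4), Z2 = sqrt(8*pi)

n1 = n2 = 50_000
x1 = rng.normal(0.0, 1.0, n1)  # draws from p1
x2 = rng.normal(1.0, 2.0, n2)  # draws from p2
s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)

r = 1.0  # initial guess, refined by fixed-point iteration
for _ in range(50):
    num = np.mean(q1(x2) / (s1 * q1(x2) + s2 * r * q2(x2)))
    den = np.mean(q2(x1) / (s1 * q1(x1) + s2 * r * q2(x1)))
    r = num / den

print(f"estimated Z1/Z2 = {r:.4f} (true value 0.5)")
```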

Download here

Bayesian inference for identifying tumour-specific cancer dependencies through integration of ex-vivo drug response assays and drug-protein profiling

Published in BMC Bioinformatics, volume 25, 2024

The identification of tumor-specific molecular dependencies is essential for the development of effective cancer therapies. Genetic and chemical perturbations are powerful tools for discovering these dependencies. Even though chemical perturbations can be applied to primary cancer samples at large scale, the interpretation of experiment outcomes is often complicated by the fact that one chemical compound can affect multiple proteins. To overcome this challenge, Batzilla et al. (PLoS Comput Biol 18(8): e1010438, 2022) proposed DepInfeR, a regularized multi-response regression model designed to identify and estimate specific molecular dependencies of individual cancers from their ex-vivo drug sensitivity profiles. Inspired by their work, we propose a Bayesian extension to DepInfeR. Our proposed approach offers several advantages over DepInfeR, including the ability to handle missing values in both protein-drug affinity and drug sensitivity profiles without the need for data pre-processing steps such as imputation. Moreover, our approach uses Gaussian Processes to capture more complex molecular dependency structures, and provides probabilistic statements about whether a protein in the protein-drug affinity profiles is informative for the drug sensitivity profiles. Simulation studies demonstrate that our proposed approach achieves better prediction accuracy and is able to discover unreported dependency structures.
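The underlying regression structure can be pictured in a few lines: drug sensitivity profiles are modelled from drug-protein affinities, and missing entries are handled by dropping them from the fit rather than imputing them. The toy below uses plain masked least squares to illustrate that structure; the paper's model is Bayesian and uses Gaussian Processes, and all names and dimensions here are invented.

```python
import numpy as np

# Toy multi-response regression Y ~ X @ B with missing responses:
# X (drugs x proteins) are affinities, Y (drugs x samples) sensitivities.

rng = np.random.default_rng(4)
n_drugs, n_prot, n_samp = 30, 8, 6
X = rng.normal(size=(n_drugs, n_prot))
B_true = rng.normal(size=(n_prot, n_samp))
Y = X @ B_true + 0.1 * rng.normal(size=(n_drugs, n_samp))
Y[rng.random(Y.shape) < 0.2] = np.nan  # ~20% missing sensitivities

# Fit each sample's coefficients using only its observed drugs,
# i.e. mask the likelihood instead of imputing the missing entries.
B_hat = np.empty((n_prot, n_samp))
for j in range(n_samp):
    obs = ~np.isnan(Y[:, j])
    B_hat[:, j], *_ = np.linalg.lstsq(X[obs], Y[obs, j], rcond=None)

print(f"mean absolute recovery error: {np.abs(B_hat - B_true).mean():.3f}")
```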

Download here

GPerturb: Gaussian process modelling of single-cell perturbation data

Forthcoming in Nature Communications, 2025

Single-cell RNA sequencing and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) screening facilitate the high-throughput study of genetic perturbations at a single-cell level. Characterising combinatorial perturbation effects, such as the subset of genes affected by a specific perturbation, is crucial yet computationally challenging in the analysis of single-cell CRISPR screening datasets due to the sparse and complex structure of unknown biological mechanisms. We propose Gaussian process based sparse perturbation regression (GPerturb) to identify and estimate interpretable gene-level perturbation effects for such data. GPerturb uses an additive structure to disentangle perturbation-induced variation from background noise, and can learn sparse, gene-level perturbation-specific effects from either discrete or continuous responses of perturbed samples. Crucially, GPerturb provides uncertainty estimates for both the presence and magnitude of perturbation effects on individual genes. We validate the efficacy of GPerturb on both simulated and real-world datasets, demonstrating that its prediction and generalisation performance is competitive with existing state-of-the-art methods. Using real-world datasets, we also show that the model reveals interesting gene-perturbation interactions and identifies perturbation effects consistent with known biological mechanisms. Our findings confirm the utility of GPerturb in revealing new insights into the complex dependency structure between gene expressions and perturbations.
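The additive decomposition described in the abstract is easy to picture with simulated data: observed expression is background variation plus sparse, perturbation-specific gene-level effects plus noise. The sketch below generates data with exactly this structure (all names and sizes are invented); GPerturb's task is the inverse problem, inferring which effects are present and how large they are, with uncertainty on both.

```python
import numpy as np

# Generative sketch of the additive structure:
#   expression = background + perturbation-specific sparse effects + noise.

rng = np.random.default_rng(3)
n_cells, n_genes, n_perts = 200, 50, 5

background = rng.normal(0.0, 1.0, (n_cells, n_genes))  # shared variation
pert_label = rng.integers(0, n_perts, n_cells)         # perturbation per cell

# Sparse gene-level effects: each perturbation touches only a few genes.
effects = rng.normal(0.0, 2.0, (n_perts, n_genes))
mask = rng.random((n_perts, n_genes)) < 0.1            # ~10% of genes affected
effects *= mask

noise = rng.normal(0.0, 0.3, (n_cells, n_genes))
expression = background + effects[pert_label] + noise

print(expression.shape, int(mask.sum()), "nonzero perturbation effects")
```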

Download here

Continual learning via probabilistic exchangeable sequence modelling

Preprint on arXiv, 2025

Continual learning (CL) refers to the ability to continuously learn and accumulate new knowledge while retaining useful information from past experiences. Although numerous CL methods have been proposed in recent years, it is not straightforward to deploy them directly to real-world decision-making problems due to their computational cost and lack of uncertainty quantification. To address these issues, we propose CL-BRUNO, a probabilistic, Neural Process-based CL model that performs scalable and tractable Bayesian update and prediction. Our proposed approach uses deep generative models to create a unified probabilistic framework capable of handling different types of CL problems such as task- and class-incremental learning, allowing users to integrate information across different CL scenarios using a single model. Our approach is able to prevent catastrophic forgetting through distributional and functional regularisation without the need to retain any previously seen samples, making it appealing to applications where data privacy or storage capacity is of concern. Experiments show that CL-BRUNO outperforms existing methods on both natural image and biomedical data sets, confirming its effectiveness in real-world applications.

Download here

Talks

Teaching

SB2.1 Foundations of Statistical Inference

Department of Statistics, University of Oxford
