Title:
NBEB-SSP - Nonparametric Bayes and empirical Bayes for species sampling problems: classical questions, new directions and related issues
Principal Investigator:
Stefano Favaro (Università di Torino and Collegio Carlo Alberto)
Funds:
ERC Consolidator Grant
Duration:
2019-2024
Partnership:
University of Turin (Coordinator)
Collegio Carlo Alberto (Beneficiary)
Abstract:
The project deals with species sampling problems, a broad class of nonstandard inferential problems that first appeared in ecology and whose importance has grown considerably in recent years, driven by numerous applications in the biosciences as well as in machine learning and information theory. The project has two main workstreams: i) a nonparametric Bayes and empirical Bayes study of classical species sampling problems, of generalized species sampling problems emerging in the biological and physical sciences, and of related questions in the optimal design of species inventories; ii) the use of recent mathematical tools from the theory of differential privacy to study the fundamental tradeoff between the privacy protection of information, which requires releasing partial or perturbed data, and Bayesian learning in species sampling problems, which requires accurate data for inference.
Publications:
- “Conformal frequency estimation with sketched data”, S. Favaro and M. Sesia, Neural Information Processing Systems, 2022. https://proceedings.neurips.cc/paper_files/paper/2022/hash/2b2011a7d5396faf5899863d896a3c24-Abstract-Conference.html
- “On Johnson’s sufficientness postulates for feature sampling models”, F. Camerlenghi and S. Favaro, Mathematics, 2021, vol. 9. https://www.mdpi.com/2227-7390/9/22/2891
- “A compound Poisson perspective of Ewens Pitman sampling model”, E. Dolera and S. Favaro, Mathematics, 2021, vol. 9. https://www.mdpi.com/2227-7390/9/21/2820
- “Scaled process priors for Bayesian nonparametric estimation of the unseen genetic variation”, F. Camerlenghi, S. Favaro, L. Masoero and T. Broderick, Journal of the American Statistical Association, to appear, https://arxiv.org/abs/2106.15480
- “Near-optimal estimation of the unseen under regularly varying tail populations”, S. Favaro and Z. Naulet, to appear. https://arxiv.org/abs/2104.03251
- “Bayesian nonparametric mixture modeling for temporal dynamics of gender stereotypes”, M. De Iorio, S. Favaro, A. Guglielmi and Y. Lifeng, Annals of Applied Statistics, 2023, vol. 17, pp. 2256-2278, https://projecteuclid.org/journals/annals-of-applied-statistics/volume-17/issue-3/Bayesian-nonparametric-mixture-modeling-for-temporal-dynamics-of-gender-stereotypes/10.1214/22-AOAS1717.short
- “Deep Stable neural networks: large-width asymptotics and convergence rates”, S. Favaro, S. Fortini and S. Peluchetti, Bernoulli, 2023, vol. 29, pp. 2574-2597. https://arxiv.org/abs/2108.02316
- “Learning-augmented count-min sketches via Bayesian nonparametrics”, E. Dolera, S. Favaro and S. Peluchetti, Journal of Machine Learning Research, 2023, vol. 24, pp. 1-60. https://www.jmlr.org/papers/v24/21-0096.html
- “Infinitely wide limits for deep Stable neural networks: sub-linear, linear and super-linear activation functions”, A. Bordino, S. Favaro and S. Fortini, Transactions on Machine Learning Research, 2022, https://openreview.net/pdf?id=A5tIluhDW6
- “Bayesian nonparametric disclosure risk assessment”, S. Favaro, F. Panero and T. Rigon, Electronic Journal of Statistics, 2021, vol. 15, pp. 5626-5651. https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-15/issue-2/Bayesian-nonparametric-disclosure-risk-assessment/10.1214/21-EJS1933.full
- “More for less: predicting and maximizing genetic variant discovery via Bayesian nonparametrics”, T. Broderick, F. Camerlenghi, S. Favaro and L. Masoero, Biometrika, to appear, https://arxiv.org/abs/1912.05516
- “Doubly infinite neural networks: a diffusion process approach”, S. Favaro, S. Peluchetti, Journal of Machine Learning Research, to appear, https://arxiv.org/abs/2007.03253
- “Optimal disclosure risk assessment”, F. Camerlenghi, S. Favaro, Z. Naulet and F. Panero, Annals of Statistics, 2021, vol. 49, pp. 723-744, https://arxiv.org/abs/1902.05354
- “Large-width functional asymptotics for deep Gaussian neural networks”, D. Bracale, S. Favaro, S. Fortini and S. Peluchetti, International Conference on Learning Representations, 2021, https://arxiv.org/abs/2102.10307
- “A Bayesian nonparametric approach to count-min sketch under power-law data streams”, E. Dolera, S. Favaro and S. Peluchetti, International Conference on Artificial Intelligence and Statistics, 2021, http://proceedings.mlr.press/v130/dolera21a.html
- “Consistent and rate optimal estimation of the missing mass”, F. Ayed, M. Battiston, F. Camerlenghi, S. Favaro, Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, to appear, https://eprints.lancs.ac.uk/id/eprint/149042/
- “Perfect sampling for posterior hierarchical Pitman-Yor processes”, S. Favaro, S. Bacallado and L. Trippa, Bayesian Analysis, to appear
- “Consistent estimation of small masses in feature sampling”, F. Ayed, M. Battiston, F. Camerlenghi, S. Favaro, Journal of Machine Learning Research, 2021, vol. 22, pp. 1-28, https://www.jmlr.org/papers/v22/18-534.html
- “Stable behaviour of infinitely wide deep neural networks”, S. Favaro, S. Fortini and S. Peluchetti, International Conference on Artificial Intelligence and Statistics, 2020, https://arxiv.org/abs/2003.00394
- “Infinitely deep neural networks as diffusion processes”, S. Favaro, S. Peluchetti, International Conference on Artificial Intelligence and Statistics, 2020, http://proceedings.mlr.press/v108/peluchetti20a.html
- “Nonparametric Bayesian multi-armed bandits for single cell experiment design”, F. Camerlenghi, B. Dumitrascu, B. Engelhardt, S. Favaro and F. Ferrari, Annals of Applied Statistics, 2020, vol. 14, pp. 2003-2019, https://arxiv.org/abs/1910.05355
- “A Good-Turing estimator for feature allocation models”, F. Ayed, M. Battiston, F. Camerlenghi, S. Favaro, Electronic Journal of Statistics, 2019, vol. 13, pp. 3775-3804, https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-13/issue-2/A-Good-Turing-estimator-for-feature-allocation-models/10.1214/19-EJS1614.full
- “Rates of convergence in de Finetti's representation theorem, and Hausdorff moment problem”, E. Dolera, S. Favaro, Bernoulli, 2020, vol. 26, pp. 1294-1322
- “A Berry-Esseen theorem for Pitman's alpha-diversity”, E. Dolera, S. Favaro, Annals of Applied Probability, 2020, vol. 30, pp. 847-869, https://arxiv.org/abs/1809.09276
- “Approximating predictive probabilities of Gibbs-type priors”, J. Arbel, S. Favaro, Sankhya Series A, 2021, vol. 83, pp. 496-519
- “Bayesian mixed effects models for zero-inflated compositions in microbiome data analysis”, S. Bacallado, S. Favaro, C. Huttenhower, B. Ren and L. Trippa, Annals of Applied Statistics, 2020, vol. 14, pp. 494-517, https://arxiv.org/abs/1711.01241