It is often the case in Statistics that one needs to compute sums of infinite series, especially in marginalising over discrete latent variables. This has become more relevant with the popularization of gradient-based techniques (e.g. Hamiltonian Monte Carlo) in the Bayesian inference context, for which discrete latent variables are hard or impossible to deal with. For many commonly used infinite series, custom algorithms have been developed which exploit specific features of each problem. General techniques, suitable for a large class of problems with limited input from the user, are less established. We employ basic results from the theory of infinite series to investigate general, problem-agnostic algorithms to truncate infinite sums within an arbitrary tolerance $\varepsilon > 0$ and provide robust computational implementations with provable guarantees. We compare three tentative solutions to estimating the infinite sum of interest: (i) a "naive" approach that sums t...
Species distribution models (SDMs) are extremely useful for determining preferences and habitats for different species. Appropriate estimation of species distribution depends on an adequate random sampling scheme, which is not always available. Instead, data is frequently composed of georeferenced locations where the species has been observed, which is commonly referred to as presence-only (PO) data. The statistical modelling of PO type data through Inhomogeneous Poisson Processes (IPP) was proposed by Fithian and Hastie (2013). As has already been noted (Fithian et al., 2015), PO type data presents bias in its sampling pattern, which must be addressed. A natural way to model this bias under IPP is through thinning of the process, which is easily performed using pertinent covariates. A different model for the intensity is proposed using a logistic link function. It maintains the already established flexibility, while adding extra flexibility in the choice of covariates. Therefore, it is po...
Papers by Guido Moreira