Variable selection

Variable selection with Random Forest, based on optimizing the area under the ROC curve (AUC) of the Random Forest.
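As a rough illustration of the criterion being optimized, the sketch below computes the AUC as the fraction of (positive, negative) pairs that a score ranks correctly. It is not the researchers' implementation: the function name `auc` and the example labels and scores are hypothetical, and in this setting the scores would come from the Random Forest's predicted class probabilities.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 0/1 class labels; scores: predicted scores (higher = more positive).
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only
value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A variable selection procedure of this kind would compare such AUC values across candidate subsets of variables and keep the subset with the best value.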

Researchers: María Luz Calle Rosingana (Catalunya – BIO), Víctor Urrea

Bayesian variable selection in linear models

Hypothesis testing, model selection and model averaging are important statistical problems that share an explicit consideration of the uncertainty about which model is the true one. The formal Bayesian tool for solving such problems is the Bayes factor (Kass and Raftery, 1995), which reports the evidence in the data in favor of each of the entertained hypotheses or models and is easily translated into posterior probabilities.
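The translation from Bayes factors to posterior probabilities can be sketched as follows, assuming every Bayes factor is expressed against the same reference model. The function name `posterior_model_probs` and the numbers are hypothetical and are not part of any package; with no prior given, the sketch assumes equal prior probabilities for all models.

```python
def posterior_model_probs(bayes_factors, prior_probs=None):
    """Convert Bayes factors (each model vs. a common reference model)
    into posterior model probabilities.

    P(M_i | y) = BF_i * P(M_i) / sum_j BF_j * P(M_j)
    """
    k = len(bayes_factors)
    if prior_probs is None:
        # Assumption: equal prior probability for every model
        prior_probs = [1.0 / k] * k
    weights = [bf * p for bf, p in zip(bayes_factors, prior_probs)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical Bayes factors of three models against the first (reference) one
probs = posterior_model_probs([1.0, 3.0, 6.0])
```

The reference model has a Bayes factor of 1 against itself, so it enters the normalization like any other model.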

The BayesVarSel package has been specifically designed to calculate Bayes factors in linear models and thereby provide a formal Bayesian answer to testing and variable selection problems. On the theoretical side, the package places its emphasis on the prior distributions (a very delicate issue in this context), and it supports a wide range of them: Jeffreys-Zellner-Siow (Jeffreys, 1961; Zellner and Siow, 1980, 1984), Zellner (1986), Fernandez et al. (2001), Liang et al. (2008) and Bayarri et al. (2012).

Users interact with the package through a friendly interface that syntactically mimics the well-known lm command of R. The resulting objects are easy to explore and give the user valuable information about the structure of the true (data-generating) model, such as the marginal, joint and conditional inclusion probabilities of potential variables, the highest posterior probability model (HPM) and the median probability model (MPM). Additionally, BayesVarSel can handle problems with a large number of potential explanatory variables through parallel and heuristic versions (Garcia-Donato and Martinez-Beneito, 2013) of the main commands.
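To make these summaries concrete, here is a minimal sketch of how marginal inclusion probabilities, the HPM and the MPM follow from a set of posterior model probabilities. It does not reproduce the package's interface: the function `summarize_models`, the variable names and the probabilities are all hypothetical. The MPM is taken here as the model containing every variable whose marginal inclusion probability is at least 0.5.

```python
def summarize_models(models, probs, variables):
    """Summarize a posterior distribution over models.

    models: list of sets, each holding the variables included in a model
    probs: posterior probability of each model (same order as models)
    variables: all candidate variables
    """
    # Marginal inclusion probability: total mass of models containing the variable
    inclusion = {
        v: sum(p for m, p in zip(models, probs) if v in m) for v in variables
    }
    # Highest posterior probability model
    hpm = max(zip(models, probs), key=lambda mp: mp[1])[0]
    # Median probability model: variables with inclusion probability >= 0.5
    mpm = {v for v in variables if inclusion[v] >= 0.5}
    return inclusion, hpm, mpm


# Hypothetical posterior over the four models built from two variables
variables = ["x1", "x2"]
models = [set(), {"x1"}, {"x2"}, {"x1", "x2"}]
probs = [0.05, 0.35, 0.30, 0.30]
inclusion, hpm, mpm = summarize_models(models, probs, variables)
```

In this made-up example the HPM contains only `x1`, while the MPM contains both variables, which shows that the two summaries need not coincide.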

Researchers: (BayesVarSel) Anabel Forte Deltell

Functional data analysis

Functional data analysis in R: descriptive analysis, regression models, classification and variable selection.

Researchers: Manuel Oviedo de la Fuente, Manuel Febrero Bande

Multivariate analysis with application to biomedicine

ORdensity gives the user the list of genes identified as differentially expressed in an easy and comprehensible way. Experiments carried out on an off-the-shelf computer with parallel execution enabled show an improvement in run time, although this implementation may also lead to substantial memory use. Results previously obtained with simulated and real data indicate that the procedure implemented in the package is robust and suitable for identifying differentially expressed genes.

Researchers: Concepción Arenas Sola (Catalunya – BIO), José María Martínez-Otzeta, Itziar Irigoien, Basilio Sierra