Optimal Design of Experiments

Computation of Optimal Designs for the Antoine Equation

Researchers: Carlos de la Calle Arroyo, Jesús López Fidalgo, Licesio J. Rodríguez Aragón

Variable selection using Random Forest, based on optimizing the area under the ROC curve (AUC) of the Random Forest.

Researchers: María Luz Calle Rosingana (Catalunya – BIO), Víctor Urrea

Bayesian Variable Selection in Linear Models

Hypothesis testing, model selection and model averaging are important statistical problems that have in common the explicit consideration of the uncertainty about which is the true model. The formal Bayesian tool to solve such problems is the Bayes factor (Kass and Raftery, 1995) that reports the evidence in the data favoring each of the entertained hypotheses/models and can be easily translated to posterior probabilities.
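As a rough illustration of how Bayes factors translate into posterior probabilities, the sketch below applies the standard identity P(M_i | y) = B_i0 P(M_i) / Σ_j B_j0 P(M_j), where B_i0 is the Bayes factor of model i against a common reference model. The function name, the uniform default prior and the numeric Bayes factors are assumptions made for this example; it is not the package's interface.

```python
def posterior_model_probabilities(bayes_factors, prior_probs=None):
    """Convert Bayes factors (each model vs. a common reference model)
    into posterior model probabilities."""
    n_models = len(bayes_factors)
    if prior_probs is None:
        # Assumption for this sketch: uniform prior over the candidate models.
        prior_probs = [1.0 / n_models] * n_models
    # Unnormalized posterior weight of each model: Bayes factor times prior.
    weights = [bf * p for bf, p in zip(bayes_factors, prior_probs)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical Bayes factors of three candidate models against the reference
# model (the reference itself has Bayes factor 1).
bayes_factors = [1.0, 3.0, 6.0]
probs = posterior_model_probabilities(bayes_factors)
print(probs)
```

With a uniform prior, the posterior probabilities are simply the Bayes factors rescaled to sum to one, so the model with the largest Bayes factor is also the highest posterior probability model.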

This package has been specifically conceived to calculate Bayes factors in linear models and thereby to provide a formal Bayesian answer to testing and variable selection problems. On the theoretical side, the package emphasizes the prior distributions (a very delicate issue in this context), and BayesVarSel allows using a wide range of them: Jeffreys-Zellner-Siow (Jeffreys, 1961; Zellner and Siow, 1980, 1984), Zellner (1986), Fernandez et al. (2001), Liang et al. (2008) and Bayarri et al. (2012).

Interaction with the package is through a friendly interface that syntactically mimics the well-known lm command of R. The resulting objects can be easily explored and give the user valuable information about the structure of the true (data-generating) model, such as marginal, joint and conditional inclusion probabilities of potential variables, the highest posterior probability model (HPM), and the median probability model (MPM). Additionally, BayesVarSel can handle problems with a large number of potential explanatory variables through parallel and heuristic versions (Garcia-Donato and Martinez-Beneito 2013) of the main commands.

Researchers: (BayesVarSel) Anabel Forte Deltell

Modelling and Design of Experiments

Modelling with differential equations, fitting coefficients, convolution, and more, with application to modelling biokinetic systems. It includes the current ICRP biokinetic models. It can be applied in pharmacokinetics, internal dosimetry, bioassay evaluations, nuclear medicine and more. It runs online in a web browser (http://ehe.usal.es/webMathematica/).

Researchers: José Guillermo Sánchez León

Spatial statistics, linear mixed models, pedigree analysis

Statistical methods for forest genetic resources analysts.

Researchers: Facundo Muñoz Viera

This R package implements the conditional survival function described in ‘Nonparametric bivariate estimation for successive survival times’ (C. Serrat and G. Gómez, 2007).

Researchers: Carles Serrat i Piè (Catalunya – BIO), Guadalupe Gómez Melis (Catalunya – BIO), Victoria Moneta

Categorization of continuous variables in prediction models

The R package CatPredi selects optimal cut points in either a logistic regression model or a Cox proportional hazards model. It categorizes the continuous variable into k categories (with k chosen by the user), in either a univariate or a multiple model.

Researchers: Irantzu Barrio Beraza (País Vasco), María Xosé Rodríguez Álvarez (Galicia), Inmaculada Arostegui Madariaga (País Vasco)

CompARE is a free online platform that helps investigators design randomized clinical trials with composite endpoints. The following features are available:

  • Choose the primary endpoint
  • Evaluate the gain in efficiency of using a primary composite endpoint either in the context of time-to-event or binary outcome.
  • Obtain the sample size for a composite endpoint based on the information on its components and the correlation between them.
  • Obtain the sample size of a study using a single endpoint.
  • Get helpful numerical and intuitive graphical results.
Researchers: Guadalupe Gómez Melis (Catalunya – BIO), Marta Bofill Roig (Catalunya – BIO), Jordi Cortés Martínez (Catalunya – BIO), Moisés Gómez Mateu (CompARE)

Clinical trial designs with composite endpoints

R package to calculate the required sample size in randomized clinical trials with composite endpoints. This package also includes functions to calculate the probability of observing the composite endpoint and the expected effect on the composite endpoint, among others.
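To make the idea of "the probability of observing the composite endpoint" concrete, here is a minimal sketch for two binary component endpoints. It assumes the composite event is "at least one component occurs" and that the association between components is given as a Pearson correlation, so that P(E1 and E2) = p1 p2 + ρ sqrt(p1(1 − p1) p2(1 − p2)). The function name and the numeric inputs are hypothetical and do not reproduce the package's functions.

```python
import math


def composite_probability(p1, p2, rho):
    """Probability that at least one of two binary endpoints occurs,
    given their marginal probabilities and Pearson correlation rho."""
    sd_product = math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    joint = p1 * p2 + rho * sd_product  # P(E1 and E2)
    # The joint probability must respect the Frechet bounds implied
    # by the marginals; otherwise rho is not attainable.
    lower = max(0.0, p1 + p2 - 1.0)
    upper = min(p1, p2)
    if not (lower - 1e-12 <= joint <= upper + 1e-12):
        raise ValueError("correlation incompatible with the marginal probabilities")
    return p1 + p2 - joint  # P(E1 or E2)


# Hypothetical component probabilities and correlations.
p_independent = composite_probability(0.10, 0.20, 0.0)
p_correlated = composite_probability(0.10, 0.20, 0.5)
print(p_independent, p_correlated)
```

Positive correlation between the components lowers the probability of the composite event, which is one reason the correlation matters when sizing a trial on a composite endpoint.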

Researchers: Marta Bofill Roig (Catalunya – BIO), Jordi Cortés Martínez (Catalunya – BIO), Guadalupe Gómez Melis (Catalunya – BIO)

Survival analysis

Implements some newly developed methods for estimating the conditional survival function.

Researchers: (condSURV) Marta Sestelo

Analysis of fish population dynamics

In recent years, there has been an increasing research effort to develop methods that improve the reliability of stock assessments in data-limited situations. Consequently, several data-limited assessment methods have been proposed. Surplus production models (SPMs) are among the assessment methodologies recommended for this purpose, since they require only a time series of an index of relative biomass and catch data. SPMs are one of the simplest analytical methods available for providing a full stock assessment: they estimate the changes in biomass as a function of the biomass of the previous year, the surplus production and the catches.
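The biomass update described above can be sketched in a few lines. The example assumes a Schaefer (logistic) surplus production function, B[t+1] = B[t] + r B[t] (1 − B[t]/K) − C[t], which is only one possible form; the function name and all parameter values are hypothetical and are not taken from ASPIC.

```python
def project_biomass(b0, r, k, catches):
    """Project biomass forward one year at a time under an assumed
    Schaefer surplus production model."""
    biomass = [b0]
    for catch in catches:
        current = biomass[-1]
        # Surplus production under logistic growth (assumed form).
        surplus = r * current * (1.0 - current / k)
        # Next year's biomass: previous biomass plus surplus minus catch,
        # truncated at zero so the stock cannot become negative.
        biomass.append(max(current + surplus - catch, 0.0))
    return biomass


# Hypothetical stock: intrinsic growth rate r = 0.4, carrying capacity K = 1000.
# Starting at K/2 and catching r*K/4 = 100 each year keeps biomass constant.
trajectory = project_biomass(b0=500.0, r=0.4, k=1000.0, catches=[100.0, 100.0, 100.0])
print(trajectory)
```

In an actual assessment the parameters r and K are unknown and are estimated by fitting the projected biomass to the relative biomass index; the sketch only shows the forward projection step.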

A well-known and widely used SPM is ASPIC (A Stock-Production Model Incorporating Covariates; see Prager (1992) and (1994)). ASPIC can be fitted through the ASPIC Suite program, a set of executables available at http://www.mhprager.com/aspic.html. A disadvantage of the ASPIC Suite program is that the input file, which contains the input data and the available prior values of the model parameters, must be created manually, and then the executable must be opened to fit the model. Hence, if we are interested in running slightly different ASPIC fits, varying the prior values of the model parameters for example, the previous procedure has to be repeated manually several times.

To solve this problem, we have developed connectASPIC (available at https://github.com/IMPRESSPROJECT/connectASPIC), an R package that fits ASPIC from R by connecting with Version 7 of the ASPIC Suite program. To this end, the package contains three functions: the first creates and fills in an input file (.a7inp) for the ASPIC program; once the input file is available, the second calls the ASPIC executable from R to fit the model based on that file; and finally the third reads the resulting output file into R.

Therefore, using connectASPIC, the SPM can be fitted automatically, and hence studies of how its performance depends on the input information can be carried out easily.

Researchers: Maria Grazia Pennino, Marta Cousido Rocha, Anxo Paz, Santiago Cerviño

Inter-observer agreement

Delta model of nominal agreement between two observers. The model is implemented as an MS-Windows application, an interactive web page (PHP), an R package, and a Shiny application.

Statistical inference on doubly truncated data

Implementation of different algorithms for analyzing randomly truncated data, one-sided and two-sided (i.e. doubly) truncated data. It also computes the kernel density and hazard functions using different bandwidth selectors. Several real data sets are included.

Researchers: Jacobo de Uña Álvarez, Rosa Crujeiras Casais, Carla Moreira

Program for 2×2 tables in the multinomial case

Provides the exact unconditional two-sided significance level for testing the null hypotheses H: p1 – p2 ≤ d1 or H: p1 – p2 ≥ d2; H: d1 ≤ p1 – p2 ≤ d2.

Functional data analysis

Functional data analysis in R: descriptive analysis, regression models, classification, and variable selection.

Researchers: Manuel Oviedo de la Fuente, Manuel Febrero Bande

Program for 2×K tables and the multivariate hypergeometric distribution

Implements Fisher's exact test for 2×c contingency tables. It provides the probability of the observed table and the exact significance level of the test.

Program for 2×K tables and the multivariate hypergeometric distribution

Implements Fisher's exact test for 2×c contingency tables. It provides the probability of the observed table and the exact significance level of the test. The time and memory required are controlled by the desired precision in determining the significance level.
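For readers unfamiliar with the calculation, the following sketch shows one way to obtain the two quantities mentioned above for a small 2×c table. With both margins fixed, the first row follows a multivariate hypergeometric distribution, so the table probability is Π_j C(c_j, a_j) / C(N, n1). The p-value here sums the probabilities of all tables that are no more likely than the observed one, which is a common convention but an assumption of this example; the brute-force enumeration and the function names are also illustrative and are not the program's algorithm.

```python
from itertools import product
from math import comb


def table_probability(row1, col_totals):
    """Multivariate hypergeometric probability of a 2xc table,
    given its first row and the column totals (margins fixed)."""
    n1 = sum(row1)
    n = sum(col_totals)
    numerator = 1
    for a, c in zip(row1, col_totals):
        numerator *= comb(c, a)
    return numerator / comb(n, n1)


def exact_test(row1, row2):
    """Return (probability of observed table, exact p-value)."""
    col_totals = [a + b for a, b in zip(row1, row2)]
    n1 = sum(row1)
    p_obs = table_probability(row1, col_totals)
    p_value = 0.0
    # Enumerate every first row compatible with the fixed margins.
    for candidate in product(*(range(c + 1) for c in col_totals)):
        if sum(candidate) != n1:
            continue
        p = table_probability(candidate, col_totals)
        # Small relative tolerance guards against floating-point ties.
        if p <= p_obs * (1 + 1e-7):
            p_value += p
    return p_obs, p_value


# Hypothetical 2x3 table.
p_obs, p_value = exact_test([1, 2, 3], [3, 2, 1])
print(p_obs, p_value)
```

Full enumeration grows quickly with the table size, which is why practical implementations trade time and memory against the precision of the reported significance level.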

Interval-censored data

Hypothesis tests for right-censored and interval-censored data.

Researchers: Ramon Oller Piqué (Catalunya – BIO), Klaus Langohr (Catalunya – BIO)

A web implementation of an epidemiological surveillance tool for detecting the moment at which the annual influenza epidemic begins. The underlying statistical method is described in Martínez-Beneito et al. (Bayesian Markov switching models for the early detection of influenza epidemics. Statistics in Medicine, 27(22), 4455-4468).

After registering, users can enter and edit their influenza incidence rate data and ask the system for the probability of being in the epidemic phase, provided that historical data from at least 3 seasons are available. The system returns this probability (by e-mail if the user so wishes), together with the probability (when in the epidemic state) that the incidence rate will increase or decrease in the following week, and two plots that complement this information.

This system was developed using freely distributed statistical software (R and WinBUGS), a Java web application server (Tomcat), and a database server (MySQL). The application was created by members of the “Geeitema” group within the “MEVIEPI” project. For any other question or query, contact: fludetweb@geeitema.org.

Estimation of forest mensuration measures in forest environments

Process automation of Terrestrial Laser Scanner (TLS) point cloud data derived from single scans. ‘FORTLS’ enables (i) detection of trees and estimation of diameter at breast height (dbh), (ii) estimation of some stand variables (e.g. density, basal area, mean and dominant height), (iii) computation of metrics related to important forest attributes estimated in Forest Inventories (FIs) at stand level and (iv) optimization of plot design for combining TLS data and field measured data.

Researchers: María José Ginzo Villamayor, Juan Alberto Molina-Valero, Manuel Antonio Novo Pérez, Adela Martínez-Calvo, Juan Gabriel Álvarez-González, Fernando Montes, César Pérez-Cruzado