Software Archive - Biostatnet

AntoineOptimal

Node DoE+Bioinformatics Design of clinical trials and experiments

Optimal Design of Experiments

Computation of optimal designs for the Antoine equation
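For orientation, the Antoine equation relates vapour pressure P to temperature T through three parameters: log10(P) = A - B / (C + T). The sketch below only evaluates that equation; it does not compute optimal designs and is not the AntoineOptimal interface. The function name is hypothetical, and the water coefficients are commonly tabulated values (P in mmHg, T in °C, roughly 1 to 100 °C) that are assumed here and should be checked against a reference table before use.

```python
def antoine_pressure(temp_c, a, b, c):
    """Vapour pressure from the Antoine equation: log10(P) = A - B / (C + T)."""
    return 10.0 ** (a - b / (c + temp_c))


# Assumed, commonly tabulated coefficients for water (P in mmHg, T in deg C).
# They are illustrative only and are not taken from the package.
WATER = {"a": 8.07131, "b": 1730.63, "c": 233.426}

pressure_at_boiling = antoine_pressure(100.0, **WATER)
print(f"Predicted vapour pressure at 100 deg C: {pressure_at_boiling:.1f} mmHg")
```

With these coefficients the prediction at 100 °C should fall close to atmospheric pressure (about 760 mmHg), which is a quick sanity check on the parameter set.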

Researchers: Carlos de la Calle Arroyo, Jesús López Fidalgo, Licesio J. Rodríguez Aragón (DoE+Bioinformatics)

ASURI

The ASURI (Analysis of SUrvival and patients RIsk prediction based on gene signatures) package discovers marker genes that are related to risk prediction capabilities and to a clinical variable of interest.

Researchers: Alberto Berral González (DoE+Bioinformatics), María Sánchez Martín (DoE+Bioinformatics), Natalia Alonso Moreda (DoE+Bioinformatics), José Manuel Sánchez Santos (DoE+Bioinformatics), Javier de Las Rivas Sanz (DoE+Bioinformatics), Santiago Bueno-Fortes, Manuel Martín-Merino

AUCFR

Node Catalunya – BIO Variable selection

Variable selection using Random Forest, based on optimizing the area under the ROC curve (AUC) of the Random Forest.

Researchers: María Luz Calle Rosingana (Catalunya – BIO), Víctor Urrea

BayesVarSel

Node Valencia – VABAR Variable selection

Bayesian variable selection in linear models

Hypothesis testing, model selection and model averaging are important statistical problems that have in common the explicit consideration of the uncertainty about which is the true model. The formal Bayesian tool to solve such problems is the Bayes factor (Kass and Raftery, 1995) that reports the evidence in the data favoring each of the entertained hypotheses/models and can be easily translated to posterior probabilities.
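The translation from Bayes factors to posterior probabilities can be sketched as follows. If B_i is the Bayes factor of model i against a common reference model and P(M_i) is its prior probability, then P(M_i | data) = B_i P(M_i) / sum_j B_j P(M_j). The function name and the example numbers are hypothetical; this is a minimal illustration of the formula, not BayesVarSel code.

```python
def posterior_model_probabilities(bayes_factors, prior_probs=None):
    """Turn Bayes factors (all against the same reference model) into
    posterior model probabilities. A uniform prior is assumed when
    prior_probs is not given."""
    if prior_probs is None:
        prior_probs = [1.0 / len(bayes_factors)] * len(bayes_factors)
    if len(prior_probs) != len(bayes_factors):
        raise ValueError("one prior probability is needed per model")
    weights = [bf * prior for bf, prior in zip(bayes_factors, prior_probs)]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical Bayes factors for three models; the first is the reference (BF = 1).
posterior = posterior_model_probabilities([1.0, 3.0, 6.0])
print(posterior)
```

Under a uniform prior the posterior probabilities are simply the Bayes factors rescaled to sum to one.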

This package has been specifically conceived to calculate Bayes factors in linear models and then to provide a formal Bayesian answer to testing and variable selection problems. On the theoretical side, the emphasis in the package is placed on the prior distributions (a very delicate issue in this context), and BayesVarSel allows using a wide range of them: Jeffreys-Zellner-Siow (Jeffreys, 1961; Zellner and Siow, 1980, 1984), Zellner (1986), Fernandez et al. (2001), Liang et al. (2008) and Bayarri et al. (2012).

The interaction with the package is through a friendly interface that syntactically mimics the well-known lm command of R. The resulting objects can be easily explored, providing the user with valuable information (such as marginal, joint and conditional inclusion probabilities of potential variables; the highest posterior probability model, HPM; the median probability model, MPM) about the structure of the true (data-generating) model. Additionally, BayesVarSel can handle problems with a large number of potential explanatory variables through parallel and heuristic versions (Garcia-Donato and Martinez-Beneito, 2013) of the main commands.
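The summaries mentioned above can be illustrated from a table of posterior model probabilities. The sketch assumes the usual definitions: the marginal inclusion probability of a variable is the total posterior probability of the models containing it, the HPM is the single most probable model, and the MPM contains the variables whose inclusion probability is at least 0.5. Function and variable names are hypothetical, and the numbers are invented for illustration; this is not the package's interface.

```python
def summarize_models(model_probs):
    """model_probs maps each model (a frozenset of variable names)
    to its posterior probability."""
    variables = sorted(set().union(*model_probs))
    # Marginal inclusion probability: add up the models that contain the variable.
    inclusion = {
        var: sum(prob for model, prob in model_probs.items() if var in model)
        for var in variables
    }
    # Highest posterior probability model.
    hpm = max(model_probs, key=model_probs.get)
    # Median probability model: variables with inclusion probability >= 0.5.
    mpm = frozenset(var for var, prob in inclusion.items() if prob >= 0.5)
    return inclusion, hpm, mpm


# Invented posterior probabilities over the four models built from x1 and x2.
model_probs = {
    frozenset(): 0.05,
    frozenset({"x1"}): 0.40,
    frozenset({"x2"}): 0.25,
    frozenset({"x1", "x2"}): 0.30,
}
inclusion, hpm, mpm = summarize_models(model_probs)
print(inclusion, sorted(hpm), sorted(mpm))
```

In this example the HPM contains only x1 while the MPM contains both variables, which shows that the two summaries need not coincide.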

Researchers: (BayesVarSel), Anabel Forte Deltell (Spatial Statistics)

BIOKMOD & BIOKMODWEB

Node DoE+Bioinformatics Modeling of biological, ecological and environmental processes.

Modelling and design of experiments

Modelling with differential equations, fitting coefficients, convolution, and more, with application to modelling biokinetic systems. It includes the current ICRP biokinetic models. It can be applied in pharmacokinetics, internal dosimetry, bioassay evaluation, nuclear medicine and more. It can be run online using a web browser (http://ehe.usal.es/webMathematica/).

Researchers: José Guillermo Sánchez León

breedR

Node Valencia – VABAR Modeling of biological, ecological and environmental processes.

Spatial statistics, linear mixed models, pedigree analysis

Statistical methods for forest genetic resources analysts.

Researchers: Facundo Muñoz Viera

bwsurvival

Node Catalunya – BIO Survival

This R package implements the conditional survival function described in ‘Nonparametric bivariate estimation for successive survival times’ (C. Serrat and G. Gómez, 2007).

Researchers: Carles Serrat i Piè (Catalunya – BIO), Guadalupe Gómez Melis (Catalunya – BIO), Victoria Moneta

CatPredi

Node País Vasco Classification

Categorization of continuous variables in predictive models

The R package CatPredi selects optimal cut points in either a logistic model or a Cox proportional hazards model. It categorizes the continuous variable into k categories (with k chosen by the user), considering a univariate or a multiple model.

Researchers: Irantzu Barrio Beraza (País Vasco), María Xosé Rodríguez Álvarez (Open Science and Software), Inmaculada Arostegui Madariaga (País Vasco)

CompARE

Node Catalunya – BIO Design of clinical trials and experiments

CompARE is a free online platform that helps investigators design randomized clinical trials with composite endpoints. The following features are available:

Choose the primary endpoint
Evaluate the gain in efficiency of using a primary composite endpoint in the context of either time-to-event or binary outcomes.
Obtain the sample size for a composite endpoint based on the information on its components and the correlation between them.
Obtain the sample size of a study using a single endpoint.
Get helpful numerical and intuitive graphical results.

Researchers: Guadalupe Gómez Melis (Catalunya – BIO), Marta Bofill Roig (Catalunya – BIO), Jordi Cortés Martínez (Catalunya – BIO), Moisés Gómez Mateu (CompARE)

CompAREdesign

Node Catalunya – BIO Design of clinical trials and experiments Survival

Clinical trial designs with composite endpoints

R package to calculate the required sample size in randomized clinical trials with composite endpoints. This package also includes functions to calculate the probability of observing the composite endpoint and the expected effect on the composite endpoint, among others.
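As an illustration of how the probability of a composite endpoint follows from its components, consider the binary case with two component events of probabilities p1 and p2 and Pearson correlation rho. The joint probability is p12 = p1 p2 + rho sqrt(p1 (1 - p1) p2 (1 - p2)), so the composite (either event) has probability p1 + p2 - p12. The sketch below is a hypothetical helper under these assumptions, not a function of CompAREdesign, which may parametrize the problem differently.

```python
from math import sqrt


def composite_probability(p1, p2, rho):
    """Probability of observing at least one of two binary component
    events, given their marginal probabilities and Pearson correlation."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    p_both = p1 * p2 + rho * sqrt(p1 * q1 * p2 * q2)
    # The correlation must be attainable for the given marginals.
    lower, upper = max(0.0, p1 + p2 - 1.0), min(p1, p2)
    eps = 1e-12
    if not (lower - eps <= p_both <= upper + eps):
        raise ValueError("correlation is not attainable for these marginals")
    return p1 + p2 - p_both


print(composite_probability(0.1, 0.2, 0.0))  # independent components
print(composite_probability(0.1, 0.2, 0.5))  # positively correlated components
```

Positive correlation lowers the composite probability, since the two events then tend to occur in the same patients; this is one reason the correlation matters for sample size.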

Researchers: Marta Bofill Roig (Catalunya – BIO), Jordi Cortés Martínez (Catalunya – BIO), Guadalupe Gómez Melis (Catalunya – BIO)

condSURV

Node Galicia Survival

Survival analysis

Implements some newly developed methods for the estimation of the conditional survival function.

Researchers: (condSURV), Marta Sestelo

connectASPIC: Fitting ASPIC (A Stock Production Model Incorporating Covariates)

Node Valencia – VABAR Modeling of biological, ecological and environmental processes.

Analysis of the dynamics of fish populations

In recent years, there has been an increasing research effort to develop methods that improve the reliability of stock assessments in data-limited situations. Consequently, several data-limited assessment methods have been proposed, and surplus production models (SPMs) are among the methodologies recommended for this purpose, since they require only a time series of an index of relative biomass and catch data. SPMs are among the simplest analytical methods available for providing a full stock assessment: they estimate the changes in biomass as a function of the biomass of the previous year, the surplus production and the catches.
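The biomass update described above can be sketched in discrete time as B[t+1] = B[t] + g(B[t]) - C[t], where g is the surplus production and C[t] the catch. The sketch assumes a logistic (Schaefer) production function g(B) = r B (1 - B / K) purely for illustration; it is not necessarily the formulation that ASPIC fits, and the function name and parameter values are hypothetical.

```python
def project_biomass(b0, catches, r, k):
    """Project biomass forward one step per catch value using
    B[t+1] = B[t] + r * B[t] * (1 - B[t] / K) - C[t], floored at zero."""
    biomass = [b0]
    for catch in catches:
        current = biomass[-1]
        surplus = r * current * (1.0 - current / k)  # assumed logistic production
        biomass.append(max(current + surplus - catch, 0.0))
    return biomass


# Invented values: initial biomass 500, carrying capacity 1000, growth rate 0.4.
trajectory = project_biomass(b0=500.0, catches=[50.0, 60.0, 70.0], r=0.4, k=1000.0)
print(trajectory)
```

Fitting an SPM then amounts to choosing the parameters (here r, K and the initial biomass) so that the projected biomass matches the observed index of relative biomass.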

A well-known and widely used SPM is ASPIC (A Stock-Production Model Incorporating Covariates; see Prager (1992) and (1994)). ASPIC can be fitted through the ASPIC Suite program, a set of executables available at http://www.mhprager.com/aspic.html. A disadvantage of the ASPIC Suite is that the input file, which contains the input data and the available prior values of the model parameters, must be created manually, and the executable must then be opened to fit the model. Hence, if we are interested in running slightly different ASPIC fits, for example by varying the prior values of the model parameters, this procedure has to be repeated manually several times.

To solve this problem, we have developed connectASPIC (available at https://github.com/IMPRESSPROJECT/connectASPIC), an R package that fits ASPIC from R by connecting with Version 7 of the ASPIC Suite program. To this end, the package contains three functions: the first creates and fills in an input file (.a7inp) for the ASPIC program; once the input file is available, the second calls the ASPIC executable from R to fit ASPIC based on that input file; and the third reads the resulting output file into R.

Therefore, using connectASPIC, the SPM can be fitted automatically, and studies of how its performance depends on the input information can be carried out easily.

Researchers: Maria Grazia Pennino, Marta Cousido Rocha, Anxo Paz, Santiago Cerviño

DELTA

Node Granada Contingency tables

Agreement between two raters in nominal scale

Delta model of nominal agreement between two raters. The model is implemented as an MS-Windows app, an interactive web page (in PHP), an R package and a Shiny app.

DTDA

Node Galicia Epidemiological models

Statistical inference for doubly truncated data

Implementation of different algorithms for analyzing randomly truncated data, one-sided and two-sided (i.e. doubly) truncated data. It also computes the kernel density and hazard functions using different bandwidth selectors. Several real data sets are included.

Researchers: Jacobo de Uña Álvarez (Survival), Rosa Crujeiras Casais, Carla Moreira

EQUIV_ASO

Node Granada Contingency tables

Program for 2×2 tables in the multinomial case

Provides the exact unconditional two-sided significance level for testing the null hypotheses H: p₁ − p₂ ≤ d₁ or H: p₁ − p₂ ≥ d₂; H: d₁ ≤ p₁ − p₂ ≤ d₂

fairgwr

Node Andalucía Occidental Spatial statistics

fairgwr: an R package for fairness-regularized geographically weighted regression

Researchers: Pepa Ramírez Cobo, Ismael Montero

fda.usc

Node Galicia Classification Dimension reduction techniques Predictive models Variable selection

Functional data analysis

Functional data analysis in R: descriptive analysis, regression models, classification and variable selection.

Researchers: Manuel Oviedo de la Fuente, Manuel Febrero Bande

FET2xc

Node Granada Contingency tables

Program for 2×K tables and the multivariate hypergeometric distribution

Implements Fisher's exact test for 2×c contingency tables. Provides the probability of the observed table and the exact significance level of the test.
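The two quantities named above can be sketched for small tables by full enumeration. With the margins fixed, the first row of a 2×c table follows a multivariate hypergeometric distribution, so a table with first-row counts a_j and column totals c_j has probability prod_j C(c_j, a_j) / C(N, n1). The p-value below sums the probabilities of all tables no more probable than the observed one, which is a common convention but an assumption here; FET2xc may order tables differently. The function name is hypothetical.

```python
from math import comb


def fisher_exact_2xc(table):
    """Return (probability of the observed table, exact p-value) for a
    2 x c table given as two lists of counts. Enumerates every table
    with the same margins, so it is only practical for small tables."""
    row1, row2 = table
    cols = [a + b for a, b in zip(row1, row2)]
    n1, n = sum(row1), sum(row1) + sum(row2)
    denom = comb(n, n1)

    def prob(first_row):
        # Multivariate hypergeometric probability of a first row.
        num = 1
        for a, c in zip(first_row, cols):
            num *= comb(c, a)
        return num / denom

    def first_rows(idx, remaining):
        # Generate all first rows compatible with the fixed margins.
        if idx == len(cols) - 1:
            if remaining <= cols[idx]:
                yield (remaining,)
            return
        for a in range(min(cols[idx], remaining) + 1):
            for rest in first_rows(idx + 1, remaining - a):
                yield (a,) + rest

    p_obs = prob(row1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in map(prob, first_rows(0, n1)) if p <= p_obs * (1 + 1e-9))
    return p_obs, p_value


p_obs, p_value = fisher_exact_2xc([[3, 1], [1, 3]])
print(p_obs, p_value)
```

For larger tables the number of candidate tables grows quickly, which is why dedicated programs control running time and memory.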

FET2xc_i

Node Granada Contingency tables

Program for 2×K tables and the multivariate hypergeometric distribution

Implements Fisher's exact test for 2×c contingency tables. Provides the probability of the observed table and the exact significance level of the test. The time and memory required are controlled by the desired level of precision in determining the significance level.

FHtest

Node Catalunya – BIO Survival

Interval-censored data

Hypothesis tests for right-censored and interval-censored data.

Researchers: Ramon Oller Piqué (Catalunya – BIO), Klaus Langohr (Survival)