Computing optimal cutpoints in diagnostic tests.

Researchers: Mónica López Ratón, María Xosé Rodríguez Álvarez (Galicia)

**Multivariate analysis with application to biomedicine**

ORdensity gives the user the list of genes identified as differentially expressed in an easy and comprehensible way. Experiments carried out on an off-the-shelf computer with parallel execution enabled show an improvement in run time. The implementation may also require a substantial amount of memory. Results previously obtained with simulated and real data indicate that the procedure implemented in the package is robust and suitable for identifying differentially expressed genes.

Researchers: Concepción Arenas Sola (Catalunya – BIO), José María Martínez-Otzeta, Itziar Irigoien, Basilio Sierra

p3state.msm provides functions for estimating semi-parametric regression models and for implementing nonparametric estimators of the transition probabilities. The methods can also be used in progressive three-state models, for which estimators of other quantities, such as the bivariate distribution function of the sequentially ordered events, are also given.

Researchers (p3state.msm): Javier Roca Pardiñas

**Program for R×C contingency tables**

Provides the probability distribution of R×C contingency tables with fixed marginal totals and independence between rows and columns.
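
As an illustration of the kind of quantity involved, the sketch below computes the probability of a single R×C table given its row and column totals under independence, using the multivariate hypergeometric formula (product of the factorials of all marginal totals, divided by the factorial of the grand total times the product of the cell factorials). The function name `table_probability` and the example table are hypothetical and are not taken from the program described above.

```python
from fractions import Fraction
from math import factorial, prod


def table_probability(table):
    """Probability of an R x C table conditional on its margins, under independence.

    Hypothetical helper: implements the multivariate hypergeometric formula
    P = (prod r_i! * prod c_j!) / (n! * prod x_ij!).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    numerator = prod(factorial(t) for t in row_totals + col_totals)
    denominator = factorial(n) * prod(factorial(x) for row in table for x in row)
    return Fraction(numerator, denominator)


# Example 2 x 2 table with row totals (3, 7) and column totals (4, 6).
example = [[1, 2], [3, 4]]
p_example = table_probability(example)

# Full distribution over all 2 x 2 tables sharing those margins.
distribution = {
    a: table_probability([[a, 3 - a], [4 - a, 3 + a]]) for a in range(0, 4)
}
```

Exact rational arithmetic (`Fraction`) is used so that the probabilities over all tables with the same margins add up to exactly one.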

**Program for 2×2 tables in the double binomial case**

Provides the exact unconditional two-sided significance level for testing the null hypothesis H: p_{1} - p_{2} ≤ d_{1} or H: p_{1} - p_{2} ≥ d_{2}

This tool displays risk scores for poor outcomes in patients with an acute exacerbation of COPD, based on previously developed and validated predictive models. The application is available for Android and Windows.

Researchers: Inmaculada Arostegui Madariaga (País Vasco), Irantzu Barrio Beraza (País Vasco) (PrEveCOPD), María J. Legarreta, Cristobal Esteban, Susana García-Gutiérrez

R package that provides a variety of tools for analysing questionnaire data on patient-reported outcomes. In particular, mixed-effects models based on the beta-binomial distribution are implemented to handle overdispersed binomial data.
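
To make the notion of overdispersion concrete, the following minimal sketch evaluates a beta-binomial probability mass function and compares its variance with that of a binomial distribution with the same mean. It does not use the R package described above; the function names and the parameter values (`n = 10`, `a = b = 2`) are assumptions chosen only for illustration.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(X = k) when X | p ~ Binomial(n, p) and p ~ Beta(a, b)."""
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))


# Hypothetical parameters: 10 items, success probability drawn from Beta(2, 2).
n, a, b = 10, 2.0, 2.0
pmf = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]

mean = sum(k * p for k, p in enumerate(pmf))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))

# Binomial variance with the same mean success probability a / (a + b).
p_success = a / (a + b)
binomial_variance = n * p_success * (1 - p_success)

print(f"beta-binomial variance: {variance:.3f}, binomial variance: {binomial_variance:.3f}")
```

The beta-binomial variance exceeds the binomial one because the success probability itself varies between subjects, which is the situation the mixed-effects models are meant to handle.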

Researchers: Josu Najera Zuloaga (País Vasco), Dae-Jin Lee (Madrid – SEMIPAR), Inmaculada Arostegui Madariaga (País Vasco)

**Multivariate regression models**

The refreg R package implements a new statistical model for estimating reference regions conditional on a set of covariates. The methodology is based on a multivariate location-scale model that provides probabilistic regions covering a specified percentage of the data, conditional on the covariates.

Researchers (refreg): Javier Roca Pardiñas

**Dealing with uncertainty when analysing the dynamics of exploited fish populations**

Analysing the dynamics of a population has become a fundamental tool in ecology, conservation biology and, in particular, fisheries science, where it is used to assess the status of exploited resources. Uncertainty is an inherent component of fishery systems and makes management decisions difficult. Here we present Rfishpop (available at https://github.com/IMPRESSPROJECT/Rfishpop), an R package for dealing with uncertainty when analysing exploited populations. More precisely, Rfishpop implements a complete Management Strategy Evaluation (MSE) cycle, a simulation approach explicitly designed to identify fishery rebuilding strategies and ongoing harvest strategies that are robust to uncertainty and natural variation (Punt et al., 2016; Kell et al., 2007).

A prototypical MSE incorporates a number of interlinked model structures. The steps of an MSE cycle are:

- Population dynamics and fishing activity (Operating Model, OM): An operating model is typically used to generate “true” ecosystem dynamics including the natural variations in the system.
- Data collection: Data are sampled from the OM to mimic the collection of fishery-dependent data and research surveys (and their inherent variability).
- Data analysis, stock assessment and Harvest Control Rule (HCR): These data are passed to the assessment model. Based on this assessment and the HCR, a management action is determined (e.g., a change in the Total Allowable Catch, TAC).
- Implementation of the HCR: Corresponding fleet effort and catch are then modelled, potentially allowing for error in implementation, and resulting catches are fed back into the operating model, OM. By repeating this cycle the full management process is modelled.
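
The four steps above can be sketched as a closed simulation loop. The code below is a toy illustration in Python and does not use Rfishpop or reproduce its functions: the operating model is a simple Schaefer surplus-production model rather than the age-structured model of the package, the "assessment" takes the survey index at face value, and the HCR is a fixed harvest rate. All names and parameter values (`run_mse`, `r`, `k`, `harvest_rate`, `obs_sd`, `impl_sd`) are assumptions made for the example.

```python
import random


def run_mse(years=30, r=0.4, k=1000.0, b0=500.0, harvest_rate=0.15,
            obs_sd=0.2, impl_sd=0.1, seed=1):
    """Toy closed-loop MSE simulation; every name and value is hypothetical."""
    rng = random.Random(seed)
    biomass = b0  # "true" state, known only to the operating model
    history = []
    for year in range(years):
        # Data collection: survey index observed with lognormal error.
        index = biomass * rng.lognormvariate(0.0, obs_sd)
        # Assessment and HCR: the estimate is the index itself and the
        # TAC is a fixed fraction of the estimated biomass.
        tac = harvest_rate * index
        # Implementation: realised catch deviates from the TAC and
        # cannot exceed 95% of the true biomass.
        catch = min(tac * rng.lognormvariate(0.0, impl_sd), 0.95 * biomass)
        history.append((year, biomass, tac, catch))
        # Operating model: Schaefer dynamics with the catch fed back in.
        growth = r * biomass * (1.0 - biomass / k)
        biomass = max(biomass + growth - catch, 1e-9)
    return history


history = run_mse()
final_year, final_biomass, _, _ = history[-1]
print(f"biomass at the start of year {final_year}: {final_biomass:.1f}")
```

Running the loop many times with different seeds, or with a different `harvest_rate`, is the toy analogue of comparing alternative management procedures under uncertainty.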

It is possible to test the effect of modifying any part of this cycle including changes to the operating model, assumptions about noise, etc. Alternative Management Procedures (MPs) can be compared by running many stochastic simulations, each for several years, to identify the performance of a rule according to different metrics under the likely range of conditions.

In its current state, the package includes tools to simulate the real dynamics of a fishery using a generic age-structured operating model. The OM models a biological system with recruitment, growth, maturity and natural mortality, and a fishery system with fishing intensity and selection. Structural uncertainty can be represented by choosing among different options for each process, and natural stochasticity by adding variability to these processes. Once the exploited population has been generated through the OM, the package provides methods to estimate biological reference points such as Maximum Sustainable Yield (MSY) reference points (Hart and Reynolds, 2002). These points make it possible to identify management targets in terms of fishing intensity, population status and yield. The package also contains statistical methods for sampling data from the OM while simulating sampling error, which is another source of uncertainty in fishery management. These methods provide different data types suited to different assessment methods, from simple data-limited methods to more complex age- or length-structured methods (examples of assessment models can be found in Chapters 6 and 7 of Haddon, 2002).

As mentioned above, the data obtained from the sampling functions are passed to the assessment model. The package does not develop any new assessment model; the idea is to use existing ones. It contains specific functions that convert the data reported by Rfishpop into the format required by the assessment model function. Finally, the package contains functions to implement the management action determined from the assessment and the HCR, projecting the exploited population through the years on the basis of the catches or effort established by that action. Together, these functions make it possible to verify the performance of management strategies or procedures in different settings generated from the OM. The package is also useful for checking the performance of assessment models when some of their assumptions are violated or some parameters are misspecified. This package is an open project: future work will focus on introducing new possibilities at some steps of the MSE cycle and on improving some of the procedures already implemented.

Researchers: Marta Cousido Rocha, Maria Grazia Pennino, Santiago Cerviño

**Program for 2×2 tables in the double binomial case**

Provides the exact unconditional two-sided significance level for testing the null hypothesis H: d_{1} ≤ p_{1} - p_{2} ≤ d_{2}

**Program for 2×2 tables in the double binomial case**

Provides the exact unconditional significance level for the one-sided test of H: p_{1} - p_{2} = d

**Program for 2×2 tables in the multinomial case**

Provides the one-sided unconditional significance level, both exact and asymptotic, for testing the null hypothesis H: p_{1} - p_{2} = d

**Multiple testing**

This package implements seven methods for multiple testing problems: the Benjamini and Hochberg (1995) false discovery rate (FDR) controlling procedure and its modification for dependent tests (Benjamini and Yekutieli, 2001); the Binomial SGoF method proposed in Carvajal Rodríguez et al. (2009) and its conservative and Bayesian versions, Conservative SGoF (de Uña Álvarez, 2011) and Bayesian SGoF (Castro Conde and de Uña Álvarez, 2013 13/06); and the BB-SGoF (Beta-Binomial SGoF, de Uña Álvarez, 2012) and Discrete SGoF (Castro Conde et al., 2015) procedures, which adapt the SGoF method to possibly correlated tests and to discrete tests, respectively. The number of rejections, the FDR and adjusted p-values are computed, among other quantities.
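
As an illustration of the first of these methods, the sketch below implements the Benjamini and Hochberg step-up procedure in plain Python: the p-values are sorted, the largest rank k with p_(k) ≤ k·α/m determines the number of rejections, and adjusted p-values are obtained as a running minimum of m·p_(k)/k taken from the largest p-value downwards. It is not the interface of the package described above; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (hypothetical helper).

    Returns the number of rejections and the adjusted p-values,
    the latter in the same order as the input.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k such that p_(k) <= k * alpha / m.
    n_reject = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            n_reject = rank

    # Adjusted p-values: running minimum of m * p_(k) / k from the top down.
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return n_reject, adjusted


# Made-up p-values, deliberately given in unsorted order.
p_example = [0.041, 0.001, 0.212, 0.039, 0.008, 0.06, 0.042, 0.074, 0.216, 0.205]
n_reject, adjusted = benjamini_hochberg(p_example, alpha=0.05)
print(n_reject, [round(a, 3) for a in adjusted])
```

The hypotheses rejected at level α are those with the `n_reject` smallest p-values.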

Researchers (sgof): Jacobo de Uña Álvarez

**Survival analysis**

Provides flexible hazard ratio curves that allow non-linear relationships between continuous predictors and survival. To better understand the effect of each continuous covariate on the outcome, results are expressed in terms of hazard ratio curves, taking a specific covariate value as reference. Confidence bands for these curves are also derived.

Researchers (smoothHR): Artur Araújo

**Program for 2×2 tables in the double binomial case**

Provides the exact unconditional significance level for comparing two independent binomial proportions under the null hypothesis H: p_{1} = p_{2}
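
A minimal sketch of an unconditional exact test of H: p_{1} = p_{2} is given below. It makes two assumptions that are not taken from the program described above: outcomes are ordered by the absolute pooled z statistic, and the supremum over the common nuisance probability is approximated on an evenly spaced grid. The function names `z_stat` and `unconditional_p_value` are hypothetical.

```python
from math import comb, sqrt


def z_stat(x1, n1, x2, n2):
    """Pooled z statistic for the difference of two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        return 0.0  # no variability: treat as no evidence against H
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (x1 / n1 - x2 / n2) / se


def unconditional_p_value(x1, n1, x2, n2, grid=200):
    """Two-sided unconditional p-value, maximised over a grid of p values."""
    z_obs = abs(z_stat(x1, n1, x2, n2))
    # Outcomes at least as extreme as the observed one (small tolerance for ties).
    extreme = [
        (a, b)
        for a in range(n1 + 1)
        for b in range(n2 + 1)
        if abs(z_stat(a, n1, b, n2)) >= z_obs - 1e-12
    ]
    best = 0.0
    for step in range(1, grid):
        p = step / grid  # candidate value of the common success probability
        total = sum(
            comb(n1, a) * p**a * (1 - p) ** (n1 - a)
            * comb(n2, b) * p**b * (1 - p) ** (n2 - b)
            for a, b in extreme
        )
        best = max(best, total)
    return best


# Example with made-up counts: 7 successes out of 10 versus 2 out of 10.
p_value = unconditional_p_value(7, 10, 2, 10)
print(f"approximate unconditional p-value: {p_value:.4f}")
```

A finer grid gives a closer approximation to the supremum at the cost of more computation; other orderings of the sample space lead to different unconditional tests.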

**Genetic epidemiology**

Online platform for analysing genetic epidemiology studies

Researchers: Joan Valls Marsal (Catalunya – BIO), Raquel Iniesta Benedicto (Catalunya – SEA), Victor Raúl Moreno Aguado (Catalunya – SEA), Xavier Sole, Elisabet Guino

**Teaching**

This is a collection of programs that illustrate topics in statistics teaching, such as the central limit theorem, confidence intervals, bootstrapping and nonparametric statistics.

Researchers: Michael J. Campbell (Statistics at Square One)

**Multi-state models**

Newly developed methods for the estimation of several probabilities in an illness-death model. The package can be used to obtain nonparametric and semiparametric estimates for: transition probabilities, occupation probabilities, cumulative incidence function and the sojourn time distributions. Additionally, it is possible to fit proportional hazards regression models in each transition of the Illness-Death Model. Several auxiliary functions are also provided which can be used for marginal estimation of the survival functions.

Researchers (survidm): Marta Sestelo, Gustavo Soutinho

**Design of phase III trials with long-term survival outcomes based on short-term binary results**

Sample size and effect size calculations for survival endpoints based on a mixture survival-by-response model.

Researchers: Marta Bofill Roig (Catalunya – BIO), Guadalupe Gómez Melis (Catalunya – BIO), Yu Shen

**Survival analysis**

Simulation of complex survival data

Researchers: David Moriña Soler (Catalunya – SEA) (survsim)