# SCS 2017: Longitudinal and Nested Data/References and links


## General references on multilevel and longitudinal data analysis

- John Fox & Sanford Weisberg, *Mixed-Effects Models in R*, an appendix to *An R Companion to Applied Regression*, Second Edition
- Judith D. Singer & John B. Willett (2003) *Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence*, New York: Oxford University Press.
- Tom A. B. Snijders & Roel J. Bosker (2012) *Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling*, second edition.
- José C. Pinheiro & Douglas M. Bates (2000) *Mixed-Effects Models in S and S-PLUS*, Springer
- The original textbook on the nlme package, covering both linear and non-linear (in the coefficients) models with normal random errors and random effects.

- Andrew Gelman and Jennifer Hill (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press
- Mainly on nested models including Bayesian analyses with BUGS

- Douglas Bates's (2011) notes on mixed models and the lme4 package

## SEMs and Mixed Models

- Papers on the relationship between the multilevel/mixed modeling and structural equation modeling (SEM) with latent variables:

## Software for Multilevel Models

- plm package for the Econometric Analysis of Panel Data
- Packages for mixed models

## Issues in mixed models

### Multilevel/Hierarchical/Mixed Model RSquared

- Snijders and Bosker (1994) Modeled Variance in Two-Level Models, *Sociological Methods & Research*, 22, 342
- Note by Andrew Gelman
- Gelman and Pardoe (2006) Bayesian Measures of Explained Variance and Pooling in Multilevel (Hierarchical) Models
- A mixed model paradox

### Random slopes

- Holger Schielzeth and Wolfgang Forstmeier (2009) "Conclusions beyond support: overconfident estimates in mixed models", *Behavioral Ecology*
- Advocates the wider use of random-slopes models. Convergence problems are mentioned but not really addressed.

### Missing data in longitudinal studies

### Multiple imputation for longitudinal studies

- Michael Spratt et al. (2010) Strategies for Multiple Imputation in Longitudinal Studies.
- Schafer, Joe (2005) Missing Data in Longitudinal Studies: A Review
- Excellent survey.

## HMC and Stan

- Getting started with Stan in R
- This is a guide to installing the latest version of Stan. You first need to install R and, preferably, RStudio.

- Stan documentation
- Includes links to the 'Modelling Language Manual', which you should download and keep open in Adobe Acrobat as a constant reference. There are also excellent tutorials, including recent videos.

- Stan tutorial
- A good summary of the structure of the Stan language

- Brief Guide to Stan's warnings
- Alvarez et al. (2014) Bayesian Inference for a Covariance Matrix
- Sorensen and Vasishth (2015) Bayesian linear mixed models using Stan: A tutorial for psychologists, linguists, and cognitive scientists
- Betancourt (2017) A conceptual introduction to Hamiltonian Monte Carlo
- Radford Neal (2011) MCMC using Hamiltonian dynamics
- An excellent but quite mathematical paper

- Bracken (2015) Introduction to Stan
- Gelman and Hennig (2016) Beyond subjective and objective in statistics
- Simpson et al. (2015) Penalising model component complexity: A principled, practical approach to constructing priors
- Gelman et al. (2015) Stan: A probabilistic programming language for Bayesian inference and optimization
- Gelman (2006) Prior distributions for variance parameters in hierarchical models
- In principle priors should be independent of data. But that is not a principle that Gelman seems to support in practice. From his paper:
- We view any noninformative or weakly-informative prior distribution as inherently provisional: after the model has been fit, one should look at the posterior distribution and see if it makes sense. If the posterior distribution does not make sense, this implies that additional prior knowledge is available that has not been included in the model, and that contradicts the assumptions of the prior distribution that has been used. It is then appropriate to go back and alter the prior distribution to be more consistent with this external knowledge.

- Dropping the puck

### Important pointers

- Don't use a prior whose support is narrower than the range specified by the declared limits on the parameter: the sampler can then be initialized in, or wander into, a region with zero prior density.
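A minimal Stan sketch of this point; the parameter name and the uniform prior are illustrative assumptions, not from the text. The declared limits on the parameter should match the support of its prior:

```stan
parameters {
  // the declared limits match the support of the uniform(0, 10) prior;
  // declaring, say, upper=100 here would leave (10, 100] with zero
  // prior density and cause initialization and sampling problems
  real<lower=0, upper=10> sigma;
}
model {
  sigma ~ uniform(0, 10);
}
```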

### Blog entries

- Choosing priors -- under current revision by Andrew Gelman
- Example using lkj Cholesky prior
- LKJ prior
- Dealing with divergent transitions

### Near-zero variances

The likelihood can be maximized at a variance of exactly zero even though the probability that the true variance is zero is nearly 0.

- Bates (2014): when a random-effect variance is estimated as zero, simplify the model by dropping that variance component.
- In a sense, he would lasso them to zero.

- Gelman (2011) Avoiding boundary estimates using a prior distribution as regularization
- This 2011 blog post by Gelman discusses using a prior for regularization, when fitting with 'glmer' or 'lmer', to avoid boundary correlations and boundary variances in the G matrix. He suggests a gamma(2, 1/A) prior, with A large, for the scale parameter, i.e. the between-cluster standard deviation. The expected value of the distribution is then 2A and the density approaches 0 at 0.
- Using Stan, one can use a gamma(2, 1/A) prior for the scale parameters and an lkj_corr_cholesky(a) prior, with a > 1, for the correlation matrix.
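A minimal Stan sketch of these two priors; the dimension (two random effects) and the choices A = 100 and a = 2 are illustrative assumptions:

```stan
parameters {
  vector<lower=0>[2] tau;          // between-cluster standard deviations
  cholesky_factor_corr[2] L_Omega; // Cholesky factor of the correlation matrix
}
model {
  // gamma(2, 1/A) with A = 100 (rate 1/A = 0.01): mean 2A = 200,
  // density -> 0 at 0, which keeps the scale estimates off the boundary
  tau ~ gamma(2, 0.01);
  // lkj_corr_cholesky(a) with a = 2 > 1 pulls correlations away from +/-1
  L_Omega ~ lkj_corr_cholesky(2);
}
```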

- LKJ priors