Spida/hsfull: description

hsfull.csv hs.csv hs1.csv hs2.csv

Math achievement and socio-economic status (SES) in a sample of 160 U.S. schools from the 1982 study “High School and Beyond”.



This is a classic data set from the field of education, used to illustrate multilevel data and models. It is used in the first edition of Bryk and Raudenbush. hsfull is the complete data set with 160 high schools, hs is a random subset of 40 high schools, hs1 is a random subset of 80 schools, and hs2 contains the complement of hs1. The two complementary subsets hs1 and hs2 can be used to illustrate split-sample validation: develop a model on one half of the data and assess its performance on the other.
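Complementary halves of this kind are split by school, not by student, so that every student from a given school lands in the same half. The following is a minimal sketch of that idea in Python using only the standard library. It is not the procedure used to create hs1 and hs2. The function names, the `school` key, and the seed are assumptions made for illustration, and the rows are synthetic.

```python
import random


def split_schools(school_ids, seed=0):
    """Randomly split the unique school ids into two complementary halves."""
    unique_ids = sorted(set(school_ids))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(unique_ids)
    half = len(unique_ids) // 2
    return set(unique_ids[:half]), set(unique_ids[half:])


def split_rows(rows, key="school", seed=0):
    """Assign each student row to the half that contains its school."""
    first, second = split_schools([row[key] for row in rows], seed)
    develop = [row for row in rows if row[key] in first]
    assess = [row for row in rows if row[key] in second]
    return develop, assess


# Synthetic stand-in: 160 hypothetical schools with 3 students each.
rows = [{"school": s, "student": i} for s in range(160) for i in range(3)]
develop, assess = split_rows(rows)
```

A model would then be fitted on `develop` and its predictions evaluated on `assess`. Because whole schools are held out, the assessment reflects performance on schools the model has not seen.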


hsfull is a data frame with 7185 observations (students) from 160 schools on the following 9 variables. hs, hs1 and hs2 are subsets consisting of 40 randomly selected schools, 80 randomly selected schools, and the remaining 80 schools, respectively.


school id


measure of math achievement


socio-economic status of family


a factor with levels Female Male


a factor with levels No Yes


the size of the school


a factor with levels Catholic Public


a measure of the priority given by the school to academic subjects


a measure of the disciplinary climate in the school


Each row consists of the data for one student. hsfull is the complete data set. hs1 and hs2 are complementary split halves of the schools in the data. hs is a selection of 40 schools which seems to be a good number of clusters for presentations in class.
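Since each row is one student, the number of students in each school can be recovered by counting rows per school id. The sketch below does this with Python's standard `csv` module. The column name `school` is an assumption (the column names are not shown on this page), and the sample data is synthetic.

```python
import csv
import io
from collections import Counter


def students_per_school(csv_file, school_column="school"):
    """Count rows (students) for each school id in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[school_column] for row in reader)


# Synthetic stand-in for one of the CSV files: 3 students in 2 schools.
sample = io.StringIO("school,score\nA,10.5\nA,12.0\nB,8.25\n")
counts = students_per_school(sample)
```

Applied to hsfull.csv opened with `open("hsfull.csv", newline="")`, and assuming the column name matches, `len(counts)` should give the number of schools and `sum(counts.values())` the number of students, which can be checked against the 160 schools and 7185 observations stated above.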

Source and Reference

Raudenbush, Stephen and Bryk, Anthony (2002), Hierarchical Linear Models: Applications and Data Analysis Methods, Sage (chapter 4).
