Improving the Measurement of School Climate

Sarah Lindstrom Johnson, Ray E. Reichenberg, Kathan Shukla, Tracy E. Waasdorp, Catherine Bradshaw

Introduction: Research over the past decade has provided clarity in defining school climate, improvements in its measurement, and more sophisticated study designs allowing for tests of causal relationships between climate and academic and behavioral outcomes. Additionally, the federal government has become increasingly focused on school climate, as recently evidenced by its inclusion as an accountability measure in the Every Student Succeeds Act. This suggests that research in the next decade should focus on improving the assessment of school climate as well as on ensuring measurement equivalence across different groups of students. This paper addresses both of these aims using item response theory (IRT), a paradigm commonly used in the design of achievement assessments.

Methods: Students (n=69,513) in 98 schools involved in the Maryland Safe and Supportive Schools Project completed a school climate assessment focused on three pillars of climate, as defined by the U.S. Department of Education: safety, engagement, and environment. Item parameters were estimated for 90 items within each of the three subscales by fitting unidimensional IRT models with the ‘mirt’ package in R. Item parameters (e.g., difficulty, discrimination) and test information, as well as local and global fit statistics, were used to create parsimonious assessments of safety, engagement, and environment. Differential item functioning (DIF) analyses were conducted to identify items exhibiting bias across any of five characteristics: race/ethnicity, gender, academic performance, school level, and maternal education.
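The quantities central to this item-selection approach can be sketched with the standard two-parameter logistic (2PL) model, under which an item's Fisher information is I(θ) = a²P(θ)(1 − P(θ)) and test information is the sum of item informations. The sketch below uses Python rather than R, and the discrimination (a) and difficulty (b) values are illustrative, not the parameters estimated in the study:

```python
import numpy as np

def item_information(theta, a, b):
    """Fisher information for a 2PL item: I(theta) = a^2 * P * (1 - P),
    where P is the probability of endorsing the item at trait level theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

# Illustrative (hypothetical) parameters for a 10-item subscale
a = np.array([1.2, 0.8, 1.5, 1.1, 0.9, 1.3, 1.0, 1.4, 0.7, 1.6])
b = np.array([-1.5, -1.0, -0.5, -0.2, 0.0, 0.3, 0.6, 1.0, 1.4, 2.0])

# Latent trait grid spanning +/- 3 SDs from neutral
theta = np.linspace(-3, 3, 121)

# Test information at each theta is the sum of item informations;
# broadcasting theta (121,1) against a, b (10,) yields a (121, 10) grid
test_info = item_information(theta[:, None], a, b).sum(axis=1)
```

Low-discrimination items contribute little information anywhere on the trait continuum, which is the rationale for flagging them during item reduction.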

Results: A total of 30 items were retained across the three constructs (10 items per construct). Item information revealed low-discrimination items, particularly for the safety construct, as well as redundant items, particularly for the engagement construct. Final test information curves indicated reliability estimates above .70 across the spectrum of student perspectives corresponding to +/- 3 SDs from neutral. Item and test characteristics revealed a greater ability to assess negative perspectives on climate, particularly with regard to safety. Differential item functioning analyses indicated differences in expected scores between groups well below acceptable cut-points (range = -.585 to .239).
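The link between test information curves and the reliability estimates reported above follows the general IRT identity r(θ) = 1 − SE²(θ) = 1 − 1/I(θ), which holds when the latent trait is scaled to unit variance; this is a standard relation, not a detail specific to the study:

```python
def conditional_reliability(info):
    """Conditional reliability at a trait level: r = 1 - 1/I(theta),
    assuming the latent trait variance is fixed at 1 (standard IRT scaling)."""
    return 1.0 - 1.0 / info

# Information needed for reliability of at least .70: I >= 1/(1 - .70) ~= 3.33
min_info = 1.0 / (1.0 - 0.70)
```

Thus a reliability criterion of .70 across +/- 3 SDs translates into a minimum test information of roughly 3.33 over that range of the trait continuum.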

Discussion: To further the inclusion of school climate as an important component of school accountability, it is paramount that climate be measured with precision. IRT allows for the creation of a parsimonious assessment and the evaluation of its functioning across student subgroups, in turn addressing two key barriers to the use of research-informed assessments of school climate in practice.

This abstract was submitted to the 2017 Society for Prevention Research Annual Meeting.