R Tutorial: What is a hierarchical model?

DataCamp
Want to learn more? Take the full course at https://learn.datacamp.com/courses/hi... at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.

---

Hi, I'm Richard Erickson. I'm a data scientist and I'll be teaching you about mixed-effects models.

Data can have several types of structure, including being nested within itself, which makes the data "hierarchical". During this course, you will learn how to analyze this kind of data using the lme4 package. We will also go over how to plot the data and describe the results. First, we will go over the basic parts of a mixed-effects model and see how it may be applied to student test scores. Second, we will learn how to fit a linear mixed-effects regression model and interpret its results. Third, we will learn how to use a generalized linear mixed-effects model. Lastly, we will apply mixed-effects models to time-series data as part of a repeated-measures analysis.

Why do we use a hierarchical model? Sometimes we have data that can be nested within itself, so our observations are not truly independent. For example, we might have a set of student test scores where each student has her or his own test score. But student performance can vary because of classroom-level factors, such as teacher quality, or school-level factors, such as building conditions. Hence, one might ask, "Are students really independent from other students in the same classroom or school?" The answer is: probably not. Additionally, what if each classroom has a different number of students? For example, maybe the 5th grade has 30 students while the 3rd grade only has five. By chance alone, the 3rd-grade class is more likely to end up with an unusually high or low average score, because a mean based on five students is much noisier than a mean based on 30. The law of large numbers only stabilizes an average as the sample grows. By treating classrooms as a "random effect" within the model, we can pool shared information about means across the classrooms within the same school. Lastly, what if we revisit the same group of students year after year? In this scenario, our observations are not independent across years. A repeated-measures analysis, another example of a hierarchical model, allows us to correct for this. We'll revisit this in chapter 4.
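The point about class size can be checked with a small simulation. The sketch below is not from the course: it uses base R only, and the true mean (70) and student-level standard deviation (10) are made-up values chosen for illustration. It draws many classrooms of five and of 30 students from the same distribution and compares how much the classroom averages vary.

```r
# Sketch: how much does a classroom's mean score vary by chance alone?
# All numbers here are hypothetical, not taken from the course data.
set.seed(42)       # fixed seed so the sketch is reproducible

n_reps <- 10000    # number of simulated classrooms of each size
mu     <- 70       # assumed true mean score
sigma  <- 10       # assumed student-level standard deviation

# Draw n scores from the same distribution and return their mean
sim_class_mean <- function(n) mean(rnorm(n, mean = mu, sd = sigma))

means_small <- replicate(n_reps, sim_class_mean(5))   # 3rd grade: 5 students
means_large <- replicate(n_reps, sim_class_mean(30))  # 5th grade: 30 students

# Spread of the classroom means for each class size
sd_small <- sd(means_small)
sd_large <- sd(means_large)

# Theory says the SD of a mean is sigma / sqrt(n)
cat("SD of class means, n = 5: ", round(sd_small, 2),
    " (theory: ", round(sigma / sqrt(5), 2), ")\n", sep = "")
cat("SD of class means, n = 30: ", round(sd_large, 2),
    " (theory: ", round(sigma / sqrt(30), 2), ")\n", sep = "")
```

Both class sizes share the same underlying distribution, so any difference in spread comes from sample size alone. This is the noise that pooling across classrooms helps to dampen.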

In fact, all three of the problems I just described occur fairly often in statistics, and the models that address them go by different names: nested models, like the classrooms nested within schools in our example; hierarchical models, because the data has, not surprisingly, a hierarchy; and multi-level models, because we have two or more levels of interest. Also, in a regression framework, the multi-level covariate can sometimes be called a "random effect" that "pools" information across groups. A model with both a "fixed effect", such as the coefficients in a standard linear regression, and a "random effect" is called a "mixed-effects" model. Last, some models can account for re-sampling the same individuals or groups over time. These are called "repeated-measures" models or "paired tests".
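As a rough sketch of how fixed and random effects combine, a simple random-intercept model for the test-score example could be written as below. The notation is an assumption introduced here for illustration, not taken from the course: $y_{ij}$ is the score of student $i$ in classroom $j$, $x_{ij}$ is a student-level predictor, $\beta_0$ and $\beta_1$ are the fixed effects, $u_j$ is the random effect for classroom $j$, and $\varepsilon_{ij}$ is the residual error.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Because every $u_j$ is drawn from one shared distribution, the estimate for a small classroom is pulled toward the overall mean, which is the "pooling" described above. In lme4's formula syntax, a model of this shape is written along the lines of `score ~ x + (1 | classroom)`, where `score`, `x`, and `classroom` are hypothetical column names.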

In the following exercises, we'll explore a dataset of test scores collected from elementary school children between kindergarten and 1st grade to assess how their math knowledge improved. The data is a subset of results from a national-level exam given across the United States and comes from a book by West and collaborators. The dataset contains variables at several levels: the individual student level, the classroom/teacher level, and the school level. At the end of this chapter, we will fit a multi-level model to our data. But first, let's explore the dataset!

The purpose of the next coding exercise is to show you that linear models do not always produce intuitive results and that it is necessary to add a new technique to your modeling toolbox. As you can see on this slide, the data has two levels: the classroom level and the student level.

Now, let's go look at the school data!
Published on 1398/12/07 (Solar Hijri calendar).