Using hierarchical linear models to test differences in Swedish results from OECD’s PISA 2003: Integrated and subject-specific science education

3(2), 2007

Abstract

The possible effects of different ways of organising the science curriculum in schools participating in PISA 2003 are tested with a two-level hierarchical linear model (HLM). The analysis is based on the science results. Swedish schools are free to choose how they organise the science curriculum: they may teach subject-specifically (Biology, Chemistry and Physics), in an integrated way (Science), or mix the two. In this study, all three ways of organising science classes in compulsory school are present to some degree. None of the ways of organising science education displayed statistically significantly better student results in scientific literacy as measured in PISA 2003. The HLM model used variables of gender, country of birth, home language, preschool attendance and an economic, social and cultural index, as well as the teaching organisation.


Introduction
In Sweden, schools are free to choose different ways of organising the science curriculum at the local level. This gives schools the opportunity to teach science in an integrated or a subject-specific way (Skolverket, 2001a). In the following we call the different ways of organising the curriculum "teaching organisation", which is a less precise, but shorter, term. We also use the terms "integrated" and "thematic" as synonymous concepts. In this explorative study, the results from PISA 2003 are matched with essentially one question asked in a survey of the same schools: whether the schools teach thematically (as the integrated teaching organisation was called in the survey) or subject-specifically. The original question is found in Box 1. The additional survey of the PISA 2003 schools was carried out in the autumn of 2003, after the main collection of PISA 2003 results was completed. Data from the two connected studies are used to discover whether there is a relation between the different ways of organising science teaching and student results in science in the PISA 2003 study in Sweden.
The question asked in this study is: does organising teaching in an integrated or a subject-specific way affect student results in scientific literacy? Since curriculum discussions regarding science teaching in Sweden, as in other countries, include the question of integrated versus subject-specific teaching, it is of interest to investigate whether these two teaching organisations produce different results.
A description of the essential features of the data used in this study is presented first. The next section deals with the methods of statistical analysis. After that, the results and their interpretation follow, after which the model is assessed and diagnosed. A discussion of the results concludes the article.
How PISA measures results

Harlen (2001) discusses the rationale of the PISA study. PISA aims at defining each domain not merely in terms of mastery of the school curriculum, but at testing adolescents on important knowledge and skills needed in adult life. The aims of PISA are well aligned with ideas regarding an integrated curriculum (Aikenhead, 2003; Schwab, 1989; Showalter, 1973), especially the idea that the student is learning for life and needs to be able to learn new things later in life. The framework for PISA is also rather well aligned with the Swedish science curriculum (Skolverket, 2006). Scientific literacy in PISA 2003 is defined as: 'Scientific literacy is the capacity to use scientific knowledge, to identify questions and to draw evidence-based conclusions in order to understand and help make decisions about the natural world and the changes made to it through human activity.' (OECD, 2003)

In this study the overall science results of PISA 2003 have been used. This result is calculated from the raw score (the students' results from the test booklets) of the science questions, iterated by item response analysis and weighted to a mean of 500 points with a standard deviation of 100 points (OECD, 2002, 2005a, 2005b). Examples of items used in PISA can be found in the Swedish reports (Skolverket, 2001b, 2004).

Sample
The OECD's PISA is an international study of student performance in mathematical, reading and scientific literacy. In 2003, mathematical literacy was the main domain and the other domains were minor. Students are sampled randomly in two steps (OECD, 2005b). This sampling procedure results in students nested at the first level within schools at the second level. The sample in PISA 2003 consisted of fifteen-year-old students. This study is restricted to students who took the science part of the PISA 2003 test and were in the ninth grade of compulsory school. Since PISA 2003 collected data from students in the seventh, eighth, ninth and tenth grades, students outside the ninth grade were excluded. The original PISA 2003 sample contained 4624 students, of whom 4420 were in grade nine. Of these, 2359 students have science results from PISA 2003, and our sample contains 1867 students. The remaining 492 students were in schools that did not answer the survey of autumn 2003.

Maria Åström and Karl-Göran Karlsson

Box 1. Original question to the schools of PISA 2003, asked in autumn 2003 (translated from Swedish): "Teaching group refers to the teaching groups, listed by you, that had students who took part in the PISA study during the spring of 2003. In the column Thematic or subject teaching, mark how teaching was mainly organised in the teaching group during the spring of 2003. In the following columns, give the name and e-mail address of the teachers who taught the group. For groups that had a science teacher who worked with thematic teaching/integrated subject teaching, give the name in the column for thematic teacher. For groups that had teachers who taught the separate subjects chemistry/biology/physics, give the names in the columns indicated for these."
The data collected in the survey of autumn 2003 show different organisations of science teaching (integrated, subject-specific and mixed). 132 out of 172 schools answered the survey, corresponding to 77 per cent of the PISA 2003 schools that had students in grade nine. Analysis of the data at the school level reveals that around 20 per cent of the schools organised science teaching in an integrated way and around 20 per cent used a mixed organisation (e.g. subject-specific in some classes and thematic in others, or sometimes subject-specific and sometimes thematic). The rest of the schools used subject-specific teaching. Small schools taught integrated science more often than large schools (Åström, 2004). When the sample was analysed at the student level, the proportion of students with integrated science remained at about 22 per cent, but students with mixed teaching were only 12 per cent. The rest of the students received subject-specific teaching. This means that, at mixed schools, several classes were taught subject-specifically and only a few in an integrated way.
The sample was analysed with a simple comparison of means for the students with different teaching organisations, both at the individual level and at the school level. This comparison is found in Table 1.

There is a slight tendency for the mixed group to have a higher mean than the other groups, but as seen from the t-values this is not significant. A comparison of some variables used by PISA in the analysis of student results showed that there were differences between the groups; for instance, the economic, social and cultural index was higher in the mixed group than in the other groups. There were also differences between the groups of students with regard to ethnicity. It was therefore decided to analyse the data with a more complex model, using variables that are known to make a difference to student results. The variables used in the study are found in Table 3.
Table 3. Variables of the study and their description.
The independent variables used in this study were selected from models used in the PISA 2003 main report (OECD, 2004, p. 439). Those variables concern gender, country of birth, home language, preschool attendance and an economic, social and cultural index at the student level (OECD, 2005a). The science result (the weighted likelihood estimate, WLE) is the dependent variable.
A list of the variables and a short description of them is given in Table 3. As can be seen in Table 3, there are missing data for some of the variables. Cases with missing data were not substituted with dummy indicators or replacement cases in the present analysis. The additional variable of teaching organisation was collected separately and added to the PISA data file. This additional variable is a nominal variable that groups students into three groups: integrated, subject-specific or mixed. Data on the teaching variable were collected at class level, and the variable of teaching organisation can thus be connected to each student.

Analysis method
The sample was analysed with hierarchical linear models (HLM) using SPSS MIXED linear models, since the data were nested in two levels (Tabachnick & Fidell, 2007). The variable of teaching organisation was additionally used in two of the models. The variables are presented in Table 3. The gender variable did not contribute significantly to the results and could have been excluded, but was kept for comparison with PISA results. A maximum likelihood method was used in fitting the three models.
A two-level hierarchical model of the variables used in PISA 2003 was modelled first (called PISA_plain in this study). The variables chosen were gender, country of birth, home language, preschool attendance and an economic, social and cultural index. The coefficients of the variables were fixed at the first level and the means of the schools were allowed to vary.
Secondly, a two-level hierarchical model of the variables used in PISA 2003 together with the variable collected in the survey of autumn 2003 was modelled (called TEACH_simple). This model was built from the same variables plus the teaching organisation at the first level. The variables were fixed at the first level and the means of the schools were allowed to vary. The first-level equation is:

WLE_ij = ß0j + ß1·GENDER_ij + ß2·BORN_ij + ß3·LANG_ij + ß4·PRESCHOOL_ij + ß5·TEACH_ij + ß6·ESCS_ij + e_ij    (1)

where ß0j is the intercept, ß1, ß2, ß3, ß4, ß5 and ß6 are coefficients that increase or decrease the intercept depending on the variable, and e_ij is an error term. The index i stands for student i at school j and runs from 1 to n, where n, a function of j, is the number of students with a science WLE at school j. The index j runs from 1 to 132.
The second-level equation of the second model (TEACH_simple) is:

ß0j = γ00 + u0j    (2)

where ß0j is the random intercept, γ00 is the grand mean over schools and u0j is an error term that captures variation between schools.
Thirdly, a two-level hierarchical model to test the variable TEACH was fitted (called TEACH_complex). The model's first-level equation was the same as equation (1), and a second-level equation was added to model random variation of the variable TEACH between schools. It is:

ß5j = γ50 + u5j    (3)

where ß5j is the random effect of the variable TEACH, γ50 is the fixed effect of the variable TEACH and u5j is an error term that captures variation between schools.
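As an illustration of what a random TEACH slope adds, the sketch below fits a model with a random intercept and a random TEACH coefficient across schools. It uses Python's statsmodels rather than the SPSS MIXED procedure used in this study, on synthetic data; all codings and numbers below are illustrative assumptions, and TEACH is treated as numeric only to keep the sketch small (in the study it is a nominal variable).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_schools, n_per = 40, 14
n = n_schools * n_per
df = pd.DataFrame({
    "school": np.repeat(np.arange(n_schools), n_per),
    "ESCS": rng.normal(0.3, 1.0, n),
    # Teaching organisation, coded 0/1/2 (assumed coding, for illustration)
    "TEACH": rng.integers(0, 3, n),
})
# Synthetic outcome: no true TEACH effect, as in the study's findings
df["WLE"] = 500 + 33.6 * df["ESCS"] + rng.normal(0, 90, n)

# Random school intercept plus a random TEACH slope across schools,
# estimated by maximum likelihood (reml=False)
m = smf.mixedlm("WLE ~ ESCS + TEACH", df,
                groups=df["school"], re_formula="~TEACH").fit(reml=False)
print(m.params["TEACH"])  # the fixed effect of TEACH
```

The `re_formula` argument is what turns the fixed TEACH coefficient into a school-varying one, mirroring the step from the TEACH_simple to the TEACH_complex model.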
The models were tested with a type III test (the sums of squares adjusted for any other effects that do not contain the effect, and orthogonal to any effects that do contain it) to find out whether there were dependencies between the variables in the model. A null model with no variables, but with a random intercept, was calculated to use as a reference in the evaluation of the different models (Tabachnick & Fidell, 2007). Then a full model for each of the different models was calculated, and the variances of the null model and the full model were compared in the analysis (Snijders & Bosker, 1998).
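The null-model-versus-full-model comparison can be sketched in Python with statsmodels, as a stand-in for the SPSS MIXED procedure actually used. The data are synthetic and every coefficient below is an assumption for illustration only; just the structure (fixed first-level coefficients, random school intercepts, maximum likelihood estimation) mirrors the setup described above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_per = 132, 14
n = n_schools * n_per
df = pd.DataFrame({
    "school": np.repeat(np.arange(n_schools), n_per),
    "GENDER": rng.integers(0, 2, n),
    "BORN": rng.integers(0, 2, n),
    "LANG": rng.integers(0, 2, n),
    "PRESCHOOL": rng.integers(0, 2, n),
    "ESCS": rng.normal(0.3, 1.0, n),
})
u0 = rng.normal(0, 15, n_schools)  # random school intercepts
df["WLE"] = (431.2 + 47.2 * df["BORN"] + 32.2 * df["LANG"]
             - 23.9 * (1 - df["PRESCHOOL"]) + 33.6 * df["ESCS"]
             + u0[df["school"]] + rng.normal(0, 90, n))

# Null model: random intercept only, maximum likelihood (reml=False)
null = smf.mixedlm("WLE ~ 1", df, groups=df["school"]).fit(reml=False)
# Full model: fixed first-level coefficients, random school intercepts
full = smf.mixedlm("WLE ~ GENDER + BORN + LANG + PRESCHOOL + ESCS",
                   df, groups=df["school"]).fit(reml=False)
# Deviance (CHI2) difference between the two nested models
dev_diff = 2 * (full.llf - null.llf)
```

The deviance difference `dev_diff` is the quantity compared to tabled CHI2 values in the model-assessment section.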

Evaluation of the different models
Three different measures were used to evaluate the models. An intra-class correlation was calculated to find out whether a two-level hierarchical model is appropriate. CHI2 values were calculated and compared to tabled values for the appropriate degrees of freedom, to check whether the models predict better than chance. The effect sizes (that is, the percentage of explained variance at each level of the analysis, see below) of the three models were calculated and compared. A closer description of the methods used is found in Snijders & Bosker (1998) and Tabachnick & Fidell (2007).

Intra-class correlations
To test for intra-class correlation, ρ can be calculated as ρ = s_bg²/(s_bg² + s_wg²), where s_bg² is the variance between groups (schools) and s_wg² is the variance within groups (schools). Intra-class correlation is a measure of the degree of dependence between individuals. The existence of intra-class correlation represents the effect of all omitted variables and measurement errors, under the assumption that the errors are unrelated (Kreft & de Leeuw, 1998, p. 9). The intra-class correlation is the same for all the models (2,3 per cent), which is quite small. However, even a small amount of intra-class correlation in a large sample makes a hierarchical linear model applicable.
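The ρ calculation is simple enough to write out. The variance components below are illustrative stand-ins chosen to reproduce the reported 2,3 per cent, not the actual fitted values:

```python
# Intra-class correlation: rho = s_bg^2 / (s_bg^2 + s_wg^2)
s_bg2 = 230.0    # between-school variance (assumed, for illustration)
s_wg2 = 9770.0   # within-school variance (assumed, for illustration)
rho = s_bg2 / (s_bg2 + s_wg2)
print(rho)  # 0.023, i.e. 2,3 per cent
```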

Assessing models
A test was performed to assess whether the models make better than random predictions. This was calculated from the CHI2 value of the intercept-only model and the CHI2 value of the full model with all variables fixed. The difference in CHI2 values was compared to the tabled CHI2 value with the appropriate degrees of freedom, at a significance level of 0,05. The PISA_plain model has seven degrees of freedom (tabled CHI2 is 14) and the two other models (TEACH_simple and TEACH_complex) have nine degrees of freedom (tabled CHI2 is 17). The calculated CHI2 values for the models are 201 and 202, so the models predict better than chance.
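The comparison against tabled CHI2 values can be reproduced with scipy; the deviance differences (201 and 202) are taken from the text above, while the critical values come from the chi-square distribution rather than a printed table:

```python
from scipy.stats import chi2

dev_plain = 201.0  # PISA_plain, 7 degrees of freedom
dev_teach = 202.0  # TEACH_simple / TEACH_complex, 9 degrees of freedom

crit7 = chi2.ppf(0.95, df=7)  # about 14.07 (the tabled 14)
crit9 = chi2.ppf(0.95, df=9)  # about 16.92 (the tabled 17)
print(dev_plain > crit7, dev_teach > crit9)  # True True
```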

Effect sizes
The models' effect sizes are calculated as η² = (s1² − s2²)/s1², where s1² is the residual variance of the null model and s2² is the residual variance of the full model. However, the usual measure of explained variance (or effect size) is composed of both a within-group and a between-group component in hierarchical linear modelling (Kreft & de Leeuw, 1998). This complicates the calculation of the explained variance, and two different values are therefore given for each of the models. The model used to calculate the explained variances is taken from Snijders & Bosker (1998, p. 102), who use a combined explained variance that can be compared to the tabled PISA values for the same entities. A typical group size of fourteen students was used in the calculations.
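A sketch of the Snijders & Bosker (1998) calculation with a typical group size of fourteen. The variance components below are invented for illustration; only the formulas follow the book:

```python
def explained_variance(s1_w, s1_b, s2_w, s2_b, n=14):
    """Proportional reduction of prediction error (Snijders & Bosker, 1998).

    s*_w: within-group residual variance, s*_b: between-group variance;
    model 1 is the null model, model 2 the full model; n is the typical
    group size. r1 applies to predicting an individual outcome, r2 to
    predicting a group (school) mean."""
    r1 = 1 - (s2_w + s2_b) / (s1_w + s1_b)
    r2 = 1 - (s2_w / n + s2_b) / (s1_w / n + s1_b)
    return r1, r2

# Illustrative (assumed) variance components, not the fitted values
r1, r2 = explained_variance(9770.0, 230.0, 8650.0, 150.0, n=14)
print(round(r1, 3), round(r2, 3))
```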

Results

The hierarchical linear models were analysed with SPSS MIXED linear models. The following presents a description of all three models (PISA_plain, TEACH_simple and TEACH_complex). The results of the calculation of the first model are explained in detail. The others are presented with shorter descriptions, since they are all calculated in a similar way. The variables in the tables are written with explanatory denotations, to facilitate reading.

The PISA_plain model
A type III test of the fixed effects of PISA_plain was conducted first. In this test, the variables Country of birth and ESCS were significant at p<<0,001 (less than 0,1 per cent). The influence of Language at home and Preschool attendance was significant at p<0,05 (the five per cent level), but the variable GENDER did not contribute significantly to the model. The economic, social and cultural index used in the analysis is centred at the OECD mean, and the Swedish mean is slightly higher. A table of estimated coefficients, standard errors and t-ratios of the variables in the PISA_plain model is given in Table 4.

As may be seen in Table 4, a female student with a zero economic, social and cultural index (corresponding to the OECD mean), not born in Sweden, not speaking Swedish at home and with more than one year of preschool had on average 431,2 PISA points. A student born in Sweden gains 47,2 points on the dependent variable compared to students born in other countries. A student speaking Swedish at home gains 32,2 PISA points compared to a student who speaks some other language at home. A student with no preschool education loses 23,9 points compared to a student with more than one year of preschool. A one-unit increase in the economic, social and cultural index yields an increase of 33,6 PISA points. A typical female student in Sweden, born in Sweden, speaking Swedish at home, with the Swedish mean economic, social and cultural index of 0,3 and one year or more of preschool attendance, is thus predicted to have 520,8 PISA points according to this analysis. A boy in Sweden with the same characteristics is predicted to have on average 518,7 PISA points.
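The predicted score for the typical girl can be checked directly from the coefficients quoted above; the small discrepancy from 520,8 comes from the coefficients being rounded to one decimal in the text:

```python
intercept = 431.2        # girl, ESCS = 0, born abroad, other home language,
                         # more than one year of preschool
born_in_sweden = 47.2    # effect of being born in Sweden
swedish_at_home = 32.2   # effect of speaking Swedish at home
escs = 33.6 * 0.3        # ESCS effect at the Swedish mean of 0,3

predicted = intercept + born_in_sweden + swedish_at_home + escs
print(round(predicted, 1))  # 520.7, versus 520,8 in the text
```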

The TEACH_simple model
The TEACH_simple model contains the variables described in the methods section above. All regression coefficients are fixed at the first level of the model. The intercept varies between schools around a fixed mean. A type III test of fixed effects was performed. It showed that the variables Country of birth and ESCS contribute significantly to variation in the dependent variable (at p<<0,001). The variables Language at home and Preschool attendance contribute significantly at the five per cent level. The variables GENDER and TEACH are not significant. Table 5 shows the estimates of the fixed effects of the variables.

The effect estimates can be interpreted as above. The effect of teaching organisation is not significant, but the table shows that students attending subject-specific teaching have 4,9 fewer PISA points than students with a mixed teaching organisation. The group with integrated teaching has 1,8 fewer PISA points than students with a mixed teaching organisation. Care should be taken in interpreting these results, since the t-test does not show a significant result. According to this analysis, a typical Swedish student (a girl) who was born in Sweden, has Swedish as her home language, has a mean economic, social and cultural index of 0,3, attended preschool for more than one year and attended subject-specific science classes has a predicted score of 519,5 PISA points.

The TEACH_complex model
The TEACH_complex model includes the variables described above. Regression coefficients are all fixed at the first level of the model. This model also contains a random regression coefficient at the second level: the TEACH coefficient is allowed to vary between schools. This tests for the presence of differences in students' results between teaching organisations. A type III test of fixed effects was conducted. The results were essentially the same as in the type III test of the TEACH_simple model. The variables Born in Sweden and Economic, social and cultural index contribute significantly (at the p<<0,001 level), as do the variables Language at home and Attended preschool (at the p<0,05 level), to variation in the dependent variable. The variables GENDER and TEACH are not significant.
Table 6 shows estimates of the fixed effects of the variables in the TEACH_complex model. The estimated effects can be interpreted as above. This table is essentially the same as the table for the TEACH_simple model. Once again the analysis shows that a typical Swedish student (a girl) who was born in Sweden, has Swedish as her home language, has a mean ESCS of 0,3, attended preschool for more than one year and received subject-specific science classes has a predicted score of 518,9 PISA points.

Evaluation of the models
The three models were evaluated for effect size, predictive ability and intra-class correlation as described in the methods section above. The two proportional reductions of error are calculated according to Snijders & Bosker's (1998) model for effect size (or explained variance).

Table 7. Evaluation of the three HLM models in this study.
As can be seen in Table 7, both effect sizes (i.e. the proportional reduction of error when predicting an individual outcome and the proportional reduction of the between-school variance) are nearly the same for the three models, about twelve and thirty-five per cent respectively.

Discussion
According to this study, the science teaching organisation does not show statistically significant differences in scientific literacy results in PISA 2003. There may be several reasons for the lack of differences between the teaching organisations, and this section deals with some of them. We also briefly discuss the treatment of missing values.
The variables in the three models explain about twelve per cent of the variance between students and thirty-five per cent of the variance between schools. These numbers are comparable to the OECD model (OECD, 2004, p. 439). That model deals with mathematics results as the dependent variable and explains thirty-two per cent of the between-school variance and seven per cent of the within-school variance. In the OECD model gender is included, but in the Swedish data there is no significant difference between boys' and girls' science results, so that variable does not contribute to the model. The difference between this study and the PISA 2003 model may be explained by differences in the models used and, of course, by the fact that the PISA model is applied across countries. The PISA 2003 model (p. 439) was modelled as a three-level model, with fixed variables and random intercepts across schools and countries (W. Schoulz¹, personal communication, Sept. 20, 2006). The model in this study is a two-level model with fixed variables and random intercepts across schools.
The additional variance explained by the TEACH_simple model is very small. The three models make better than random predictions of student results according to the CHI2 tests. This is mainly due to the economic, social and cultural index (ESCS) and country of birth (BORN). Other variables in the model contribute less to predictions of student science results.
Some possible explanations of why students' test results in science are unaffected by the science teaching organisation are listed below.
1. The variable is not well enough defined and would have to be defined better in the teacher survey to be accounted for. Compare with the social facts described by Searle (1997).
2. The variable may mean different things to different respondents. An answer from one survey respondent may mean something different to that person than the same answer means to another respondent (Searle, 1997).
3. The variable of science organisation does not really affect students' scientific literacy results, or it is not possible to assess differences in students' scientific literacy results due to science organisation with the present assessment.

As to statements 1 and 2, it should be pointed out that the word used in the question to the PISA 2003 schools about teaching organisation was "theme" and not "integrated teaching". The word 'theme' was used in Swedish curricula between 1980 and 1994, at which time a new national curriculum was implemented. The curriculum from 1994 (revised in 2001) granted a great deal of freedom to individual schools to develop new ways of organising work in schools. The curriculum of 1994 discusses subject integration. Since science teachers have not used the word integration very often, it was considered appropriate to use the word "theme", as this would be better recognised by the survey respondents. The question is found in Box 1, so that readers can judge for themselves how they would answer it.
Regarding statement 3 above, it is worth mentioning that there are difficulties in determining what should be taught under the general curriculum during the last years of compulsory school, since no specifics are listed in the curriculum (Skolverket, 2001a). Sweden is a small country with a fairly homogeneous culture and a tradition of explicitly regulated content and teaching methods (Gustafsson, 1999). Gustafsson writes that teachers in a decentralised curriculum do not experience the freedom intended. Nevertheless, different approaches to the national science curriculum have been developed in recent years, reflecting teachers' various interpretations of the curriculum (Åström, forthcoming). It is thus possible that the variable of teaching organisation relates to different contents, according to the ways teachers interpret the curriculum. The curriculum is the same for all teachers, so they are supposed to accomplish the same thing in terms of student outcomes, but they can organise the science teaching in different ways. Another possibility is that differences are ruled out by other factors that influence students' results, such as motivation and teacher interaction (Lee, Smith, & Croninger, 1995). Influences deriving from differences in teacher behaviour have not been included in this study. The teachers' ways of working in classrooms are usually a main factor to consider when comparing student results. One possible explanation for the lack of difference between forms of teaching organisation could be that individual teachers' personal methods reduce or nullify the diversity that different teaching organisations would otherwise provide.
The goals of the science curriculum in Sweden are well aligned with the framework of PISA 2003 in some of the science subjects, and somewhat more weakly, but still applicably, in others, as found in a study in which TIMSS, PISA and the Swedish curriculum were compared (Skolverket, 2006). PISA focuses on the process knowledge that students have acquired during their studies. Knowledge of science processes is an important part of the Swedish curriculum. An argument for integrated teaching has been that students will get a whole and integrated picture of the topics studied (Fogarty, 1995; Penick, 2003), which would promote knowledge of science processes. It might therefore be expected that students who have studied integrated science would have acquired more knowledge of science processes and would perform better on the PISA assessment than students who have studied subject-specific science. As seen in this study, this does not seem to be the case. The chain of evidence has, however, more than one weak link, since the variable of teaching organisation is fuzzy, as discussed above.

Conclusions
Teachers and schools in Sweden are able to organise science teaching in different ways. It is therefore possible to study the relationship between teaching organisation and student results in scientific literacy by investigating a randomly selected sample of schools in the compulsory education system from PISA 2003. In this study, no differences were found in PISA 2003 science results for students with different teaching organisations. The same result, no difference between teaching organisations, is found at the individual level with a simple comparison of means between groups (Åström, 2005), as described in the introduction. The result applies to the group of students as a whole, as well as when the variables of country of birth, home language, preschool attendance and the economic, social and cultural index are taken into account. The variable of teaching organisation does not contribute significantly to the test results. In conclusion, the teaching organisation of science in Sweden, be it integrated, subject-specific or mixed, has no statistically significant effect on students' results in scientific literacy as measured by PISA 2003, according to this investigation.

Table 1. Mean comparison between the groups and the mean for the whole group, at the individual level and the school level. The mean values for the different groups were compared and tested against the grand mean with a t-test. The tested t-values are given in Table 2.

Table 4. Estimates of fixed effects in a full model of the PISA_plain model. ** = significant at p<<0,001; * = significant at p<0,05.

Table 5. Estimates of fixed effects in a full model of the TEACH_simple model. ** = significant at p<<0,001; * = significant at p<0,05.

Table 6. Estimates of fixed effects in a full model of the TEACH_complex model.

¹ Senior research fellow at the Australian Council for Educational Research.