In the south-west of Germany, the educational system at the secondary level offers a variety of settings for language learning and acquisition. In more and more schools, EFL (English as a Foreign Language) education is being supplemented with CLIL (Content and Language Integrated Learning) programmes, in which subjects such as Geography, History and Biology are taught in English during specific years (cf. MKJS 2004). This raises the question of whether CLIL is indeed as beneficial as it is assumed to be, or whether, as Bruton (2011) has argued, the success of CLIL programmes simply reflects the selectivity involved in many of them. First and foremost, CLIL programmes increase learners' exposure to the English language. However, CLIL materials, being scientifically oriented, also constitute a genre that is virtually absent from EFL settings. As research suggests that the passive is one of the characteristic features of scientific text (cf. Svartvik 1966, Holtz 2011), it has been chosen as a diagnostic criterion to investigate the impact of CLIL programmes on written learner language. The fact that the passive often alternates with a synonymous active structure, which enables learners to avoid it, adds to its value in differentiating between more advanced and less advanced learners. Moreover, it is hoped that insights will be gained into the lexis-grammar interface in language learning, as passive forms of certain verbs are treated as lexical chunks in EFL materials.

To find out whether CLIL materials are indeed similar to scientific text, a corpus of teaching materials (Teaching Materials Corpus, TeaMC, ~1,000,000 words) was compiled. It comprises the following subcorpora: (1) EFL materials Year 7-10; (2) CLIL materials Year 7-10; (3) EFL materials Year 11-12.
In a preliminary analysis, the passive was indeed found to occur almost three times more frequently in subcorpus 2 than in subcorpus 1, and considerably more frequently even than in subcorpus 3, which acts as a reference norm that learners are supposed to aspire to.

To investigate differences in the written interlanguage of learners from EFL and CLIL programmes, the Secondary-Level Corpus of Learner English (SCooLE) was compiled. Data was elicited from Year 11 learners in EFL-only as well as EFL+CLIL settings at various secondary schools across the south-west of Germany. Participants were presented with two sets of essay topics, one of which was formulated in the passive. Learners subsequently typed two short argumentative essays in class. All in all, the SCooLE comprises about 850 essays, amounting to a total of around 250,000 words.

Because the elicited texts proved highly deviant, the corpus had to be preprocessed in order to normalise, in particular, those forms that seriously affect the automatic retrieval of passive constructions. Normalisation was based partly on VARD output (Variant Detector, cf. Rayson & Baron 2011) and partly on manual annotation of typical misspellings. For part-of-speech annotation, various tools were tested for their performance on interlanguage at this level. As a result, the TreeTagger (cf. Schmid 1994) and CLAWS (Garside & Smith 1997) were used concurrently; taken together, they were shown to offer a recall rate (cf. Granger 1997) of 94%. However, a number of erroneous passive constructions, which seem of particular relevance for the purpose of this study, remained irretrievable. Hence, all passives were annotated manually.

To avoid results being influenced by intervening variables that might affect the performance of the two groups of learners (e.g. cognitive capacities, aspects of motivation or the language learning/acquisition history of the individual learner), a questionnaire as well as two psychometric tests were administered. The information obtained in this way was included in the SCooLE as a rich set of metadata on learner variables. A preliminary analysis shows that CLIL learners indeed use the passive more frequently than their non-CLIL counterparts. However, the two groups were also found to differ with respect to cognitive capacities and other variables. It is thus one of the future aims of this study to determine whether the differences found in the interlanguages of the two groups are due to the educational setting or result from the selectivity of CLIL programmes.

This paper describes the procedures involved in the compilation of the SCooLE insofar as they are relevant to the investigation of the passive. Furthermore, the SCooLE is compared with the TeaMC in a quantitative analysis of passive constructions, using measures such as passive ratio (cf. Granger 2013). A preliminary qualitative analysis is carried out in order to describe the challenges involved in investigating the English passive in learners who often do not yet entirely master the lexical, morphological and syntactic processes involved in the use of this structure.
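
The retrieval problem described above can be illustrated with a minimal Python sketch. It is not the procedure used for the SCooLE: the pattern (a form of "be", optionally followed by up to two adverbs or negators, then a past participle), the tag sets, the function names and the toy sentences are all assumptions made for illustration. The sketch shows how a tag-based search finds a well-formed passive, misses an erroneous one whose participle is mis-tagged, and how recall can then be computed against manual annotation.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]   # (word form, part-of-speech tag)
Span = Tuple[int, int]    # (index of "be" form, index of participle)

# Assumed for this sketch: word forms of "be" and tag names in the style
# of the English TreeTagger / CLAWS tagsets. Adjust to the tagger in use.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}
PARTICIPLE_TAGS = {"VVN", "VDN", "VHN"}


def is_skippable(tag: str) -> bool:
    """Treat adverb-like tags (R...) and the negator tag XX as skippable."""
    return tag.startswith("R") or tag == "XX"


def find_passive_candidates(tokens: List[Token], max_gap: int = 2) -> List[Span]:
    """Return (be_index, participle_index) pairs matching the pattern."""
    hits: List[Span] = []
    for i, (word, _tag) in enumerate(tokens):
        if word.lower() not in BE_FORMS:
            continue
        j = i + 1
        # Skip at most max_gap adverbs or negators after the "be" form.
        while j < len(tokens) and j - i - 1 < max_gap and is_skippable(tokens[j][1]):
            j += 1
        if j < len(tokens) and tokens[j][1] in PARTICIPLE_TAGS:
            hits.append((i, j))
    return hits


def recall(retrieved: Set[Span], gold: Set[Span]) -> float:
    """Share of manually annotated passives that were retrieved automatically."""
    if not gold:
        raise ValueError("gold annotation must not be empty")
    return len(retrieved & gold) / len(gold)


# Invented toy data: one well-formed passive and one erroneous passive
# ("was wrote"), where the tagger is assumed to label "wrote" as past tense.
tokens: List[Token] = [
    ("The", "DT"), ("topic", "NN"), ("was", "VBD"), ("not", "RB"),
    ("chosen", "VVN"), (".", "SENT"),
    ("The", "DT"), ("book", "NN"), ("was", "VBD"), ("wrote", "VVD"),
    (".", "SENT"),
]
gold: Set[Span] = {(2, 4), (8, 9)}   # hypothetical manual annotation

retrieved = set(find_passive_candidates(tokens))
score = recall(retrieved, gold)
print(sorted(retrieved), score)
```

In this toy example the erroneous form is annotated by hand but not matched by the pattern, so recall falls below 1. This is the kind of case that motivates manual annotation of all passives.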