Australian Council for Educational Research (ACER)
19 Prospect Hill Rd, Camberwell
Victoria, Australia 3124
Sam Haldane is a senior software engineer with the Australian Council for Educational Research (ACER). He managed the technical aspect of the PISA 2006 CBAS study, overseeing the design and development of the software used, including authoring, viewing, translation and delivery systems. He is currently involved in the software development for the PISA 2009 Electronic Reading Assessment project being developed by ACER in collaboration with colleagues at the Public Research Centre Henri Tudor and the German Institute for International Educational Research (DIPF).
This paper traces the history of systems developed and used in a selection of large-scale computer-based surveys. It addresses the issues raised at each of the development stages and the various solutions proposed and subsequently implemented. Those issues include security of software and data, specifications of the hardware and software used, perceptions of network administrators and test administrators, economics of delivery and data capture, collation and marking, and creating and presenting material in different languages. The methods and delivery system used in PISA 2006 are critiqued together with those trialled for PISA 2009. In the latter case solutions to issues raised in the field trial will be addressed. Finally, the current status of delivery systems for large-scale international and national surveys will be measured against a perceived ideal for achieving the potential promised by computer-based assessment.
Computer-based assessment is becoming increasingly common. As technology improves, the requirements of computer-based assessment delivery systems are expanding and becoming more demanding. The Australian Council for Educational Research (ACER) has been involved in several large-scale computer-based surveys in recent years. Each survey had different objectives and requirements, and as a result a different system was developed and used in the field for each.
PISA Computer Based Assessment of Science (CBAS)
The PISA Computer Based Assessment of Science (CBAS) project in PISA 2006 was the first time a computer-based component was included in the PISA project. The main aim of CBAS was to create and administer a test that assessed students’ science literacy in a computer-based environment. CBAS leveraged the computer-based delivery method to add value that could not be achieved using the traditional paper-based test. To achieve this, rich elements such as video, audio and interactive simulations were to be used, reducing the overall reading load of the test.
Comparability of the test between students and participating countries was a major objective of PISA CBAS. The paper-based PISA test has well defined and strict standards with regard to the translation and presentation of items, to ensure that students in different countries have a very similar experience when taking the test. Rigorous translation and verification procedures, item review, and standards for print quality are just some of the steps taken to ensure this comparability in the paper-based test. This high standard of comparability was to be carried over to CBAS, to ensure that students taking the test in countries and schools with better computer equipment did not have a better experience, and hence an easier test, than students in countries and schools that were less well equipped.
Reliability was another major objective for CBAS. Failure of a test session is quite costly, both in terms of test administrator time and loss of data, so the system developed should be as reliable as possible, built on high-quality, tested components. In the case of a test failure, the system should have data recovery mechanisms to preserve whatever data was collected in the session before the failure.
The CBAS system needed to support fully translatable software and items, due to the international nature of PISA. All elements of the software needed to be translatable, as well as all text elements within the items, including dynamic text contained in items with interactive simulations. Right-to-left scripts such as Hebrew and Arabic also needed to be supported, meaning that the software itself had to support mirroring of the interface: user interface components that appear on the left in the English version should appear on the right in the Arabic or Hebrew versions.
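Java, the language CBAS was ultimately built in, supports this kind of mirroring directly in its user interface toolkit. The sketch below is illustrative only (it is not the actual CBAS code): a locale’s reading direction is resolved and then applied recursively to a component tree, which mirrors the layout.

```java
import java.awt.ComponentOrientation;
import java.util.Locale;
import javax.swing.JPanel;

// Illustrative sketch only: resolving a locale's component orientation
// and applying it to a component tree, as interface mirroring requires.
public class MirrorDemo {
    public static void main(String[] args) {
        // Arabic (and Hebrew) locales resolve to a right-to-left orientation.
        ComponentOrientation arabic =
                ComponentOrientation.getOrientation(new Locale("ar"));
        ComponentOrientation english =
                ComponentOrientation.getOrientation(Locale.ENGLISH);
        System.out.println("Arabic left-to-right? " + arabic.isLeftToRight());
        System.out.println("English left-to-right? " + english.isLeftToRight());

        // applyComponentOrientation mirrors the whole component tree at once.
        JPanel panel = new JPanel();
        panel.applyComponentOrientation(arabic);
        System.out.println("Panel mirrored? "
                + !panel.getComponentOrientation().isLeftToRight());
    }
}
```

Resolving the orientation from the locale, rather than hard-coding it per language, means adding a new right-to-left language requires no code change.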
The main way that CBAS added value to the test was by utilising media such as video, audio and interactive simulations. The system requirements, and the fundamental technology chosen to build the system, reflected this together with the aforementioned objectives.
The fundamental requirements of the system developed for PISA CBAS reflected the main objectives mentioned above. Security was also a concern. As all PISA items are secure, it was a requirement that no item material was left on student or school computers after the test sessions. Students should also not be able to compromise a test session by terminating the CBAS delivery software.
The hardware and software used was required to be affordable at the time of the field trial (which was in 2005), but still able to facilitate the rich content that was a main objective of the project. Where possible, free and open source software should be used to avoid licensing costs.
The system was required to be as easy as possible for test administrators to set up and use. In most countries, PISA test administrators are retired or semi-retired teachers with limited technical knowledge. While the Consortium can recommend that more technically knowledgeable test administrators be used for PISA CBAS, the reality is that the same test administrators used for the paper-based test would be used for CBAS. Therefore every effort was made to design the system to be as user friendly as possible.
With the main objectives and requirements in mind, the CBAS system was implemented using a client-server model. One computer was used by the test administrator to control the test session (the server), and this computer was networked to five other computers that students used to take the test (the clients). The response data from each student was transmitted over a local area network back to the test administrator’s computer. This model was chosen to make the data collection procedures easier, and to allow the test administrator to centrally control the test session from one computer, making session administration easier.
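The data path can be pictured with a minimal sketch, written here in Java since that is what CBAS used. The class name, message format and protocol below are assumptions for illustration, not the actual CBAS protocol: each client opens a TCP connection to the administrator’s machine and writes response records, which the server receives and would persist.

```java
import java.io.*;
import java.net.*;

// Minimal sketch (assumed names and message format) of a CBAS-style
// client-server channel: the administrator's machine accepts a
// connection and logs each student response record sent over the LAN.
public class ResponseChannelDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0); // ephemeral port for the demo

        // The "server" side, running on the test administrator's computer.
        Thread admin = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // A real system would write this to disk for recovery.
                    System.out.println("received: " + line);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        admin.start();

        // The "client" side, running on a student computer.
        try (Socket client = new Socket("localhost", server.getLocalPort());
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), "UTF-8"), true)) {
            out.println("student=S01;item=Q3;response=B");
        }

        admin.join();
        server.close();
    }
}
```

Streaming each response to the server as it is captured, rather than only at session end, is what makes the data recovery requirement above satisfiable: everything collected before a failure is already off the student machine.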
To ensure the highest comparability possible, it was recommended that participating countries use mini-labs of six computers that test administrators set up in advance and carried into the schools. This increased the logistical requirements of the study but minimised the setup time per school, and ensured greater comparability. Several specific laptop models were recommended, all of which had the same technical specifications, such as screen size and resolution, CPU speed and memory capacity.
The Java programming language was chosen due to its widespread support and the abundance of open source libraries available. The Java Media Framework (JMF) library was used to provide support for video and audio. Java has good support for internationalisation and localisation, a major requirement of CBAS given the international nature of PISA. Java also has very good libraries for user interface design and networking. To implement items with interactive simulations, Adobe Flash was chosen. A proprietary third party library was used to integrate Flash content into the Java-based delivery applications.
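Java’s localisation support centres on the `ResourceBundle` API, which selects translated strings by locale with automatic fallback. The keys, strings and class names below are hypothetical, not CBAS’s actual bundles:

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

// Illustrative sketch of Java resource bundles (hypothetical keys and
// strings). The German bundle overrides the base English strings; a
// locale with no bundle of its own falls back to the base bundle.
public class LocalisationDemo {
    public static class Messages extends ListResourceBundle {
        protected Object[][] getContents() {
            return new Object[][] { { "start", "Start test" } };
        }
    }

    public static class Messages_de extends ListResourceBundle {
        protected Object[][] getContents() {
            return new Object[][] { { "start", "Test starten" } };
        }
    }

    public static void main(String[] args) {
        // Pin the default locale so the fallback behaviour is deterministic.
        Locale.setDefault(Locale.ENGLISH);
        ResourceBundle de = ResourceBundle.getBundle("LocalisationDemo$Messages", Locale.GERMAN);
        ResourceBundle fr = ResourceBundle.getBundle("LocalisationDemo$Messages", Locale.FRENCH);
        System.out.println(de.getString("start")); // German translation
        System.out.println(fr.getString("start")); // no French bundle: base strings
    }
}
```

Because every user-visible string is looked up by key, the software interface itself can be translated without touching the delivery code, which is what “fully translatable software” requires.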
Microsoft Windows was chosen as the supported operating system because the majority of new laptops at the time came with a version of Windows pre-installed. The delivery system was implemented as two desktop applications that were installed onto the CBAS laptops. One application was for the test administrator (the server application) and the other was for the students to take the test (the client application). All item content such as video, audio and Flash was installed on the client computers along with the client application to reduce the network bandwidth required.
Initially, wireless networks were recommended in order to ease the technical setup required of the test administrator. This proved problematic in many countries, as interference from other wireless networks, microwave ovens and even airport radar systems caused some session failures in the field trial. For the main study, therefore, wired networks were recommended as a more reliable technology.
A major concern of many participating countries was the large cost of implementing CBAS; only three countries participated in the main survey. The main source of this cost was the purchase or hire of mini-labs of six computers and networking hardware. Logistics were also a major concern, especially in cities like Tokyo, where it was unfeasible for test administrators to travel by car and they were also unable to carry the required equipment on the subway. Total control over the delivery system hardware and software did, however, result in a very low test session failure rate.
Data capture was largely automated and semi-centralised on the test administrator’s computer. Each test administrator was required to copy the data onto a device (e.g. CD or USB stick), which was then sent to the national centre where the national data was consolidated.
Marking of the CBAS items incurred no cost: due to the nature of the items, marking was entirely automated. No open response items, or any other items requiring human marking, were administered in PISA CBAS.
Translation of the CBAS items was quite cost effective, mainly due to their low word count. Custom software was created to ease translation of the items. The translation software had to be installed on translators’ computers, but after this initial installation step translation was relatively easy. An online translation website was developed to facilitate the download and upload of translations, and to support the CBAS translation procedures.
PISA Electronic Reading Assessment (ERA)
The PISA Electronic Reading Assessment (ERA) in PISA 2009 is the second time a computer-based assessment component has been included in a PISA cycle. In 2009 the focus is on electronic texts, as opposed to science in CBAS. A typical item contains a simulated website environment along with a question relating to the website content.
Low implementation cost was the main objective that shaped the implementation of the ERA systems. The high cost of implementing CBAS meant that many countries were unable or unwilling to participate, so in order to attract more participating countries, a solution that uses existing school information technology infrastructure was sought. Using existing school infrastructure compromises comparability, however, as schools around the world have different computer systems, with varying screen sizes and resolutions, CPU speeds and RAM capacities.
Logistics were a major concern for many countries in CBAS. A solution that did not require test administrators to carry laptops around was a main objective of ERA.
The nature of the Electronic Reading Assessment meant that a complex hypertext environment was needed. A high degree of interactivity within the websites was a main objective of the project, with multiple websites existing within one unit, and interactive features such as email and blogs.
The requirements of ERA were quite strict in that little existing infrastructure could be assumed. Because ERA was required to use existing school infrastructure, internet or intranet connectivity and the host operating system version could not be relied upon. Due to differing security policies in schools around the world, it could not be assumed that software could be installed on school computers.
Computers in schools around the world vary greatly in specifications, therefore the ERA system was required to run on the lowest common denominator hardware, assuming a reasonable set of minimum system requirements.
The ERA system was implemented as a bootable CD (with a bootable USB option available in the main survey, commencing early 2009). A USB memory device was used for data collection. The bootable CD contained all of the software required to deliver the ERA test, which meant that no software was required to be installed on the school computers at all.
The bootable CD / USB contains a Linux distribution which has been customised for ERA, with the appropriate software, fonts and Input Method Editors (IMEs). The system uses a standard browser and web server, which run directly from the CD. The TAO platform was used as the base technology to deliver the test, with some custom Flash components used to deliver the simulated hypertext environment.
The system was designed to use the minimum possible amount of hardware resources, so that it could run on as many existing school computers as possible. Care was taken to optimise CPU and memory usage where possible.
Most computers are configured to boot from a CD or USB memory device when one is present. Some computers, however, require a small procedure to enable this functionality: changing the boot sequence in the computer’s Basic Input/Output System (BIOS). This is a somewhat technical procedure, but it is required for the ERA system to run.
The design of the ERA system around existing school information technology infrastructure ensured a low cost of delivery relative to CBAS. However, a relatively high number of session failures occurred, due to some school computers not meeting the minimum hardware requirements of the ERA software, or the school hardware not being compatible with the Linux distribution used.
Data capture was more costly with ERA, as each computer used collected the student response data on a USB memory device. Test administrators were then required to consolidate the collected data by copying the captured data files from each USB device onto a computer (often a laptop carried to each test centre).
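The consolidation step can be sketched as a simple file-copy routine. The paths, file extension and folder layout below are hypothetical, chosen only to illustrate the procedure:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch (hypothetical paths and naming) of consolidating
// captured data files from a mounted USB device into a per-school folder.
public class ConsolidateDemo {
    static void consolidate(Path usbRoot, Path target) throws IOException {
        Files.createDirectories(target);
        // File names are assumed unique per student; a real procedure would
        // also guard against name collisions between different devices.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(usbRoot, "*.dat")) {
            for (Path p : files) {
                Files.copy(p, target.resolve(p.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path usb = Files.createTempDirectory("usb"); // stands in for a mounted device
        Files.write(usb.resolve("student01.dat"), "response-data".getBytes("UTF-8"));
        Path laptop = Files.createTempDirectory("laptop").resolve("school-001");
        consolidate(usb, laptop);
        System.out.println("copied: " + Files.exists(laptop.resolve("student01.dat")));
    }
}
```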
Expert (non-automated) marking was required for some ERA items. Custom online marking software was developed to facilitate distributed marking, which also centralised the marking data collection.
Translation for ERA was quite costly due to the high word count of the items. A custom translation management system was developed to facilitate the download and upload of translations. The XML Localisation Interchange File Format (XLIFF), a standard supported by many open source and commercial translation software packages, was used in order to reduce costs. This enabled translators to use software they were accustomed to, and eliminated the need to develop a custom translation application (as was done for CBAS).
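For illustration, a minimal XLIFF 1.2 document carrying a single translatable string might look like the fragment below (the file name, identifiers and strings are hypothetical). Each `trans-unit` pairs a `source` string with its `target` translation, which is what lets off-the-shelf translation tools operate on the item content:

```
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="fr"
        datatype="html" original="era-item.html">
    <body>
      <trans-unit id="item1.title">
        <source>Search results</source>
        <target>Résultats de recherche</target>
      </trans-unit>
    </body>
  </file>
</xliff>
```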
National Assessment Project – Information and Communication Technology Literacy Assessment
The Information and Communication Technology Literacy project (ICTL) was commissioned by the Ministerial Council on Education, Employment, Training and Youth Affairs (MCEETYA) in Australia. The study was an Australia-wide assessment of ICT literacy among year six and year ten (twelve- and sixteen-year-old) students. It aimed to measure students’ abilities to:
use ICT appropriately to access, manage, integrate and evaluate information;
develop new understandings;
and communicate with others in order to participate effectively in society.
The ICTL items themselves are very rich. Many items contain emulated application environments such as word processors, presentation preparation applications and photo organising applications. The test itself was delivered using remote desktop technology.
The ICTL project utilised three delivery models, depending on the existing infrastructure of each school in the study. Test administrators conducted a phone interview or a site visit to determine the level of existing infrastructure, then used the appropriate delivery model for that school.
Internet delivery was used when the school had an appropriate computer lab with sufficient internet connectivity and bandwidth. In this model, the school computers were used as the clients, with a remote server. This model requires minimal setup by the test administrator, and all data is collected centrally on the remote server, eliminating any data collection procedures required of the test administrator.
A carry-in server model was used at schools that had an appropriate computer lab with a local area network (LAN), but insufficient internet connectivity to deliver the test via the internet. For this model, test administrators travelled with a pre-setup server laptop that they plugged in to the school’s LAN. The existing school computers were then used as clients to access the server over the LAN.
For schools that did not have an appropriate computer lab, carry-in mini-labs of computers were used (much like the CBAS model). The mini-labs consisted of ten computers: one for the test administrator (the server) and nine student computers. The test administrator also carried in network hardware and cables. This model was only used for a handful of schools, usually small, remote schools.
Using this combination of delivery models, a very high success rate was achieved: a 99% school-level response rate after replacements. Test administrator training was made more complex, however, as each test administrator needed to be trained in all three delivery models.
Some technical issues did arise in the ICTL study. The remote desktop client present on most Windows computers by default requires the use of a non-standard port, which is blocked by most school firewalls. The study instead used an Internet Explorer plugin that allows remote desktop access through a standard port, but this requires installation by the test administrator before the test session, and requires Microsoft Windows.
At the moment there is no silver bullet for delivering computer-based assessment. The ideal technology for a study varies greatly depending on its objectives and requirements. Simple items may be delivered best using standard web technology available everywhere, whereas complex, rich items are best delivered using Flash, or even remote desktop technology when application emulation is required.
The ‘toolbox’ of delivery methods used in ICTL has many benefits. The model is able to deliver very rich items, and a high success rate is achieved by tailoring the delivery method to the school’s infrastructure. Having a choice of delivery models, along with the training involved, does raise the cost of the study, however.
In the future, internet delivery should have the highest return on investment. Delivery through the internet has the advantages of ease of deployment and low administration logistics and costs. The obvious disadvantage of internet delivery at the moment when it comes to large-scale international studies is the lack of appropriate infrastructure.
The carry-in server model utilised in the ICTL study is the best trade-off at the current time. A high percentage of schools have a sufficient LAN but insufficient internet bandwidth for internet delivery. The carry-in server model requires no software installation, and has the advantage of total control over the hardware and software on the server (as opposed to ERA, where both server and client must run on unknown hardware).