To begin with, what is a computer adaptive test?
In layman’s terms, a computer adaptive test (also known as a personalised assessment) is a test that selects questions from an item bank in real time in response to the candidate’s performance, giving a more accurate indication of their level of ability on a standard scale.
Read on for the details.
What Is Adaptive Computerized Testing?
Simply put, computerized adaptive testing (CAT) is a computer-based exam that uses an algorithm to adjust the difficulty of the questions for each individual test taker. A computer adaptive test adjusts in real time to the test taker’s skill level and presents questions accordingly. It is also a secure exam design technique that you can use to protect the content of your test from disclosure and to deter cheating. With CAT, tests can be given more quickly, with fewer questions, and with higher levels of security. (A later section explains how computerized adaptive tests operate.) To gain a better understanding of CAT, let’s first examine the beginnings of adaptive testing.
What Is Adaptive Testing?
As its name implies, an adaptive test customizes its questions to each test taker’s proficiency in real time, so each person ultimately receives a unique set of questions. The test adjusts based on the test taker’s performance on earlier questions: when a test taker answers most questions correctly, harder questions are selected and presented, and when the earlier answers were incorrect, easier questions are presented. The test can stop after a manageably small number of questions, which may vary for each individual, and provide a score. That score is determined by the “level” of difficulty the test taker reached rather than by how many questions were answered correctly. Because it has variable starting and stopping points and requires the test taker to answer fewer questions, an adaptive test is much more efficient than a conventional test. One of the earliest known adaptive tests was the Stanford-Binet Intelligence Scale, given at the beginning of the 20th century.
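To make the up-and-down logic concrete, here is a minimal sketch of a one-step “staircase” rule in Python. It is an illustration only, not how any particular exam works: the function name `staircase_test`, the ten difficulty levels, the rule of stopping after three reversals of direction, and scoring by the highest level passed are all assumptions made for this example.

```python
def staircase_test(answers_correctly, start_level=5, n_levels=10,
                   reversals_to_stop=3, max_items=30):
    """Toy adaptive test: one level harder after a correct answer,
    one level easier after an incorrect one.

    answers_correctly(level) -> bool is a hypothetical callback standing in
    for the test taker answering one question at that difficulty level.
    Returns (score, history), where score is the highest level passed.
    """
    level = start_level
    history = []          # (level, correct) for every question asked
    reversals = 0         # how often the answers switched right <-> wrong
    last_correct = None

    while len(history) < max_items and reversals < reversals_to_stop:
        correct = answers_correctly(level)
        history.append((level, correct))
        if last_correct is not None and correct != last_correct:
            reversals += 1
        last_correct = correct
        # Adapt: move up after a correct answer, down after an error,
        # staying inside the available range of levels.
        step = 1 if correct else -1
        level = min(n_levels, max(1, level + step))

    # The score reflects the level reached, not the number of correct answers.
    score = max((lvl for lvl, ok in history if ok), default=0)
    return score, history


if __name__ == "__main__":
    score, history = staircase_test(lambda level: level <= 7)
    print(score, len(history))
```

In this sketch, a test taker who can handle levels up to 7 finishes in six questions, while one who tops out at level 5 finishes in four, which is the variable test length described above.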
The History Of Adaptive Testing
When I first started my testing career almost 40 years ago, Ron Hambleton, a pioneer in our field, taught me about computerized adaptive testing. He taught me the principles of adaptive testing as well as the technical and organizational steps required to create one. Under his guidance, my coworkers and I were able to design a large number of adaptive tests and give them to K–12 students. In 1990, about eight years later, I used the same adaptive testing design for a global information technology certification program, eventually giving out over a million adaptive tests. That was the first time computerized adaptive testing had ever been used on such a large scale globally, as far as I’m aware. Since then, I’ve become a fan and frequently suggest them to Caveon’s clients because of their effectiveness and security attributes.
The Stanford-Binet Intelligence Test
The Stanford-Binet Intelligence Scale, which was first used in about 1916, was one of the earliest adaptive tests in history. I practiced administering it to children as part of a graduate school course. It is regarded as an adaptive test because it doesn’t start and end in the same place for every child. Instead, the questions were ranked in order of difficulty, and younger children were given some of the simpler (though not the easiest) questions first. If the young child failed to answer three of these simple questions in a row correctly, the examiner moved the starting point back to even easier questions. Once the child could answer three questions in a row correctly, the test continued until the child gave three incorrect responses in a row, at which point the examiner ended it. There were several sub-tests, each administered in a similar way, and the child’s performance on each sub-test was taken into account when calculating the test’s score.
Examples Of Adaptive Measurement Outside Of Testing
Such adaptive measurement is fairly typical outside of testing, where it has been around for a lot longer. Take, for example, the sport of high jumping, which began in Scotland in the 19th century. Typically, a high-jump competition starts with the bar a little lower than the ability of the competitors as a group, so some of the more skilled jumpers may skip the first few predetermined heights. This, together with the fact that a person is eliminated for failing to clear a height that others clear, makes a high-jump competition very efficient. Each competitor’s skill is assessed quickly with a small number of jumps, and that number varies from competitor to competitor. The winner is determined by the final bar height cleared: whoever clears the highest bar wins.
It is useful to imagine a high-jump competition run like a conventional test. For example, each high jumper would have to attempt all 28 bars, ranging in height from 3 feet to 10 feet, with each bar raised by 3 inches. The number of successful jumps out of a possible 28 would determine the score. This would actually be a pretty good way to gauge high-jump skill. No jumper, though, would find this kind of competition enjoyable, regardless of ability. There are far too many pointless jumps, which wears one out. Additionally, the jumper would find the lower jumps boring and the higher jumps frustrating. This is precisely the experience that traditional exams in our field impose, and it is the experience that adaptive tests, especially computerized adaptive tests, help to avoid.
The Pros Of Computerized Adaptive Testing
One of the main benefits of CAT is that it is efficient. It uses fewer items than a conventional test to produce a score that is just as useful to the test taker. Several specific advantages result from this efficiency:
- Less time is needed for the test. Examinees test for a shorter period of time, sometimes by as much as 50%.
- Reduced testing costs. Time savings can reduce test administration costs.
- More secure testing. Revealing fewer items to each examinee decreases the average exposure of the items in the pool. This makes item harvesting and cheating with advance knowledge of the items more difficult and less lucrative. With computerized adaptive tests, each test taker receives a distinct form with (ideally) little overlap between forms, making it hard to cheat by copying answers during in-person test administrations. How much overlap there is between tests depends primarily on the size and composition of the CAT’s item pool.
- Lower levels of boredom and fatigue. Test takers do not have to respond to questions that are too easy or too difficult (for them, that is). When most of the questions are moderately difficult, the testing experience is more enjoyable for each test taker.
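The link between pool size and overlap can be illustrated with a small calculation. The sketch below is a baseline only, under the simplifying assumption that each test is a random draw of items from the pool; real CAT item selection is not random, so actual overlap is usually higher. The function names are hypothetical and chosen for this example.

```python
import random


def expected_overlap(test_length, pool_size):
    """Expected number of items two test takers share when each receives
    `test_length` items drawn at random from a pool of `pool_size` items."""
    return test_length * test_length / pool_size


def simulated_overlap(test_length, pool_size, n_pairs=2000, seed=0):
    """Check the formula by drawing many pairs of random forms."""
    rng = random.Random(seed)
    pool = range(pool_size)
    shared = 0
    for _ in range(n_pairs):
        form_a = set(rng.sample(pool, test_length))
        form_b = set(rng.sample(pool, test_length))
        shared += len(form_a & form_b)
    return shared / n_pairs


if __name__ == "__main__":
    for pool_size in (200, 1000):
        print(pool_size,
              expected_overlap(20, pool_size),
              round(simulated_overlap(20, pool_size), 2))
```

Under this assumption, two 20-item tests drawn from a 200-item pool share 2 items on average, and drawing from a 1,000-item pool brings that down to 0.4. On a conventional fixed form, every pair of test takers shares all 20.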
These advantages, combined with others, typically result in a very positive experience when taking a computerized adaptive test. In 1995, I polled 3,000 of Novell’s certification candidates who had taken at least one CAT. 43% indicated they preferred taking computerized adaptive tests, 19% said they had no preference, and only 19% said they preferred a traditional test format. The main barrier to choosing a computerized adaptive test was skepticism that such a brief test could accurately assess their aptitude. However, after having the CAT format explained to them and being given the opportunity to take the tests, the majority of them concluded that these were superior tests. Naturally, after taking CATs for a while, candidates developed confidence in the tests’ ability to distinguish between those who were knowledgeable and experienced and those who were still learning, and they shared that confidence with other candidates.
How Do I Begin Using CAT In My Program?
There are a variety of ways to get started with computerized adaptive testing. Because implementing CAT is typically more challenging than implementing a conventional test, I strongly advise beginning by consulting an experienced testing expert. One of the first and most crucial requirements for implementing CAT is access to a sufficiently large item bank. Automated item generation (AIG) is one way to expand your item pool, and it is frequently the quickest and most economical: some AIG tools can expand an item pool with the click of a button.
How Does Adaptive Computerized Testing Work?
CAT selects questions sequentially, using what it has learned about the examinee from earlier questions, in order to maximize the precision of the exam. From the examinee’s perspective, the difficulty of the exam seems to tailor itself to their level of proficiency. For instance, if the examinee does well on an item of intermediate difficulty, they are presented with a more difficult question; if they do poorly, they are presented with a simpler one. Compared with static multiple-choice tests, which almost everyone has taken and which give a fixed set of items to all test takers, computer-adaptive tests require fewer items to arrive at equally accurate scores. (Of course, nothing in the CAT methodology requires the items to be multiple-choice, but just as most exams use this format, so do most CAT exams.)
The fundamental iterative algorithm used in computer adaptive testing consists of the steps listed below.
1. The pool of available items is searched for the optimal item, based on the current estimate of the examinee’s ability.
2. The chosen item is presented to the examinee, who then answers it correctly or incorrectly.
3. The ability estimate is updated, based upon all prior answers.
4. Steps 1–3 are repeated until a termination criterion is met.
Since there is no information about the examinee prior to the administration of the first item, the algorithm is typically started by choosing a medium- or medium-easy-difficulty item as the first item.
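The loop described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not a production algorithm: it assumes the Rasch (one-parameter) model, estimates ability as a posterior mean on a coarse grid with a standard normal prior, and stops when the posterior standard deviation drops below a target or a maximum test length is reached. None of these choices comes from the text above, and the names `run_cat`, `estimate_ability`, and `answer` are hypothetical.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # candidate ability values, -4.0 to 4.0


def p_correct(theta, b):
    """Rasch model: chance of a correct answer for ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses):
    """Posterior mean and standard deviation of ability on the grid,
    given a list of (difficulty, correct) pairs and a standard normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)          # prior weight
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else 1.0 - p          # likelihood of each answer
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(pool, answer, se_target=0.5, max_items=20):
    """pool: list of item difficulties. answer(b) -> bool is a hypothetical
    callback that presents an item of difficulty b and reports correctness."""
    remaining = list(pool)
    responses = []
    theta, se = 0.0, 1.0   # nothing is known yet, so start in the middle
    while remaining and len(responses) < max_items and se > se_target:
        # Step 1: pick the best remaining item. Under the Rasch model this is
        # the item whose difficulty is closest to the current ability estimate.
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        # Step 2: present it and record the answer.
        responses.append((item, answer(item)))
        # Step 3: update the ability estimate from all answers so far.
        theta, se = estimate_ability(responses)
        # Step 4: the loop condition is the termination criterion.
    return theta, se, responses
```

Calling `run_cat` with a pool of difficulties and a callback that scores each answer returns the final ability estimate, its standard error, and the list of items that were administered. Because the first estimate is 0, the first item chosen is the one of medium difficulty, matching the starting rule described above.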
Because of adaptive administration, different examinees take quite different tests. Even so, their ability scores are comparable (i.e., as if they had received the same test, as is common in tests designed using classical test theory). The psychometric technique that allows fair scores to be computed across different sets of items is item response theory (IRT). IRT is also the preferred methodology for selecting optimal items, which are typically selected on the basis of information rather than difficulty, per se.
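To see why selecting on information is not the same as selecting on difficulty, consider the sketch below. It assumes the two-parameter logistic (2PL) IRT model, which is one common choice and not necessarily the model any given exam uses; the symbols `a` (discrimination) and `b` (difficulty) and the two example items are made up for illustration.

```python
import math


def prob_correct(theta, a, b):
    """2PL model: probability of a correct answer for ability theta,
    given item discrimination a and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information a 2PL item provides about ability theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def most_informative(theta, items):
    """items: list of (name, a, b) tuples. Returns the tuple with the
    highest information at the current ability estimate."""
    return max(items, key=lambda item: item_information(theta, item[1], item[2]))


if __name__ == "__main__":
    theta = 0.0
    items = [
        ("X", 0.6, 0.0),   # difficulty matches theta, but weak discrimination
        ("Y", 1.8, 0.5),   # slightly too hard, but strong discrimination
    ]
    for name, a, b in items:
        print(name, round(item_information(theta, a, b), 3))
    print("selected:", most_informative(theta, items)[0])
```

For an examinee whose current ability estimate is 0, item X is a perfect match on difficulty, yet item Y is selected because its higher discrimination makes it several times more informative at that ability level.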
The Uniform Certified Public Accountant Examination employs a related methodology known as multistage testing (MST) or CAST. Some of CAT’s drawbacks, as listed below, are avoided or lessened by MST.
Why Adaptive Computerized Testing?
CAT is utilized for the NCLEX because it:
- Reduces the number of “easy” items that high-ability candidates receive; “easy” items tell little about a high performing candidate’s ability
- Reduces the number of “difficult” items low-ability candidates receive; candidates tend to guess on items that are too difficult which can skew results
- Reduces item exposure and subsequent security risks
- Improves precision of measurement of the NCLEX candidate’s ability related to nursing and
- Provides a valid and reliable measurement of nursing competence
Reform Of Adaptive Computerized Testing
Although computer-adaptive technology is still a relatively recent innovation, its use in the US appears set to rise significantly over the next few years. The Partnership for Assessment of Readiness for College and Careers (PARCC) and the Smarter Balanced Assessment Consortium, for instance, both plan to use the technology in their significant national assessment initiatives. States and schools are generally adopting computer-adaptive testing based on the following justifications:
- By using assessments that are as accurate or more accurate while taking less time to complete, teachers and students will have more time for instruction and learning.
- The tests adjust each question to the knowledge and skills of the test taker, removing the need for students to struggle with questions that are too challenging or waste time on questions that are too simple.
- The tests can offer teachers more accurate and easily accessible information on the learning requirements of their students, which they can use to modify their lessons and enhance the academic support they offer to their students.
- Because not every test taker sees the same items, test security is improved.
Debate Of Adaptive Computerized Testing
Computer-adaptive tests are still in their infancy, so discussions about their application, validity, advantages, and drawbacks are just starting to take shape. The technology will probably be the subject of increased scrutiny, discussion, and debate as more states prepare to use new computer-adaptive online exams in the upcoming years.
The following are a few examples of arguments that supporters of computer-adaptive testing might make in addition to the potential advantages mentioned above:
- In particular for students at the lower and higher ends of the learning spectrum, the tests can be useful in determining a student’s learning level more precisely than fixed-question tests.
- Adaptive tests give teachers more specific information about students who are exceptionally proficient or exceptionally far behind in their mastery of expected knowledge and skills.
- Because the tests are shorter, less taxing, and better suited to each student’s unique abilities, they might increase student engagement in the testing process and produce more accurate results.
- Large-scale standardized testing may become more efficient and less expensive as computerized scoring of open-ended and essay-style questions improves and may eventually surpass human scoring in accuracy and reliability.
Some typical objections that opponents of computer-adaptive testing might raise include the following:
- The sophisticated scoring software required for computer-adaptive tests’ essay and open-ended question sections is still in the early stages of development. Some systems may not have undergone adequate testing, while others may be prone to bugs and errors that could produce inaccurate results that are detrimental to test takers, so human scoring may still be needed.
- The use of computerized tests may harm students who are less technologically literate and have less access to digital technology, including students from lower-income families and those who live in remote areas with unreliable internet access.
- There are frequently significant logistical difficulties and financial costs associated with the switch from paper-and-pencil exams to computer-adaptive exams, especially for cash-strapped states, districts, and public schools. Whether it’s custom-made or an off-the-shelf item, the sophisticated software needed for the tests can be pricey and even prohibitively expensive.
- It may be prohibitively difficult to set aside the time and computers required for all students to complete a test in schools with a dearth of computers, insufficient computing networks, or both.
- Computer-adaptive testing frequently needs strong technical support because faulty or malfunctioning systems can seriously interfere with class schedules and operations as well as test administration.
- Transitioning to online, computer-adaptive testing may be difficult or impossible for districts and schools that still use paper-based processes because they lack the necessary equipment, staff, and resources.
The Bottom Line
Despite its many advantages, computer adaptive testing (CAT), which is a hot topic in the assessment community, is still not utilized to its full potential. This article has given an overview of CAT, a rundown of some of its advantages, and a look at the technology involved, without a lot of technical jargon.
Last but not least, I want to say thanks for reading.