The assessment industry is trying in earnest to deliver accessible tests that comply with laws and standards such as Section 508 of the Rehabilitation Act in the United States (as updated in 2017) and EN 301 549 in Europe, both of which require conformance with the Web Content Accessibility Guidelines (WCAG) 2.0 Level A and AA success criteria.
The industry has a lot of legacy content, which cost millions of dollars to produce, and its assessments rely on that content. The content is crucial to measuring very specific knowledge or skills. WCAG includes some exceptions within its success criteria, and below is a description of the exceptions that are relevant to online assessments.
The three success criteria we will focus on in this post are:
- 1.1.1 Non-text Content
- 2.1.1 Keyboard
- 2.2.1 Timing Adjustable (essential exception)
The official documentation for these can be found on W3C’s WCAG website, or for a more readable interpretation, I like Luke McGrath’s Wuhcag.com. Note that there is a new version of WCAG, version 2.1, that added some new success criteria. I also cover the new Reflow success criterion so you can start thinking about it before the legal requirements include the WCAG 2.1 Level A and AA additions.
The first success criterion under the Perceivable principle states that content creators should provide text alternatives for non-text content that serve an “equivalent purpose.” The success criterion specifically mentions a test exception (in addition to others), stating, “If non-text content is a test or exercise that would be invalid if presented in text, then text alternatives at least provide descriptive identification of the non-text content.”
This doesn’t exempt test content as a whole from text alternatives. The exception applies only to specific tests where the point of the test is to measure that modality. If the test is about your ability to hear and speak a language, then text alternatives would undermine what you are trying to assess.
Test makers do indeed worry that text-based alternatives can give away the answer. Test takers with access to that alternative would have a distinct advantage. There are cases where the alternative text is the answer, like an assessment where the user is asked to identify a shape or color. There, the description obviously can’t be fully meaningful; it can only identify the image as, say, a shape. I think we can easily understand why the image can’t be fully described, and how doing so would invalidate the test.
There are, however, many other instances of more complex images and graphics, like maps or photographs, where content writers find they can’t write descriptions without giving away the answer. I generally try to probe a little deeper into the purpose of the question, and I usually find that the ability to see is not one of the skills being measured. Rather, the way the graphic has been made has led the writer to describe it in a particular way. In these cases, rethinking and replacing the graphic is the solution, not removing accessibility for many test takers.
Content Needs to be Keyboard Accessible
The ability to access content via keyboard is crucial to providing access to online assessments. It’s the key functional underpinning for assistive technology like screen readers and switch devices. The success criterion doesn’t forbid other input devices like a mouse or touchscreen, but it does say you need to at least provide keyboard input.
The exception is for “underlying function,” meaning that if what you’re asking the user to do is something like handwriting, then keyboard input would undermine the very task you’ve created to assess handwriting.
But this doesn’t give a free pass for test purposes. Even though you’ve made some fancy drag-and-drop items (which are typically more mouse friendly), remember you likely aren’t trying to assess mouse skills. More likely you’re trying to create tasks that involve higher-level cognitive skills, which should still be possible for people who can’t use a mouse.
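To make the drag-and-drop point concrete, here is a minimal sketch of how a reordering item could also be operated from the keyboard. Everything in it is an assumption for illustration: the `ReorderState` shape, the `handleKey` function, and the key bindings (Space or Enter to pick up and drop, arrow keys to move) are hypothetical and not taken from any particular test delivery platform.

```typescript
// Hypothetical model of a reorderable list inside a drag-and-drop item.
interface ReorderState {
  items: string[];    // current order of the options
  focusIndex: number; // which option has keyboard focus
  grabbed: boolean;   // true while the focused option is "picked up"
}

// Pure function: returns the next state for a key press.
// Key names follow KeyboardEvent.key values (" ", "Enter", "ArrowUp", ...).
function handleKey(state: ReorderState, key: string): ReorderState {
  const { items, focusIndex, grabbed } = state;
  if (items.length === 0) return state;
  const last = items.length - 1;

  switch (key) {
    case " ":
    case "Enter":
      // Toggle between picking up and dropping the focused option.
      return { ...state, grabbed: !grabbed };
    case "Escape":
      // Release the option where it is (this sketch does not restore
      // the original position).
      return { ...state, grabbed: false };
    case "ArrowUp":
    case "ArrowDown": {
      const delta = key === "ArrowUp" ? -1 : 1;
      const target = Math.min(last, Math.max(0, focusIndex + delta));
      if (!grabbed || target === focusIndex) {
        // Nothing picked up (or already at an edge): only focus moves.
        return { ...state, focusIndex: target };
      }
      // An option is picked up: swap it with its neighbour.
      const next = [...items];
      [next[focusIndex], next[target]] = [next[target], next[focusIndex]];
      return { items: next, focusIndex: target, grabbed };
    }
    default:
      return state;
  }
}

// Usage: pick up "A", move it down twice, then drop it.
let s: ReorderState = { items: ["A", "B", "C"], focusIndex: 0, grabbed: false };
for (const key of [" ", "ArrowDown", "ArrowDown", " "]) {
  s = handleKey(s, key);
}
console.log(s.items.join(", ")); // B, C, A
```

Keeping the logic free of mouse coordinates means the same state changes can be driven by a mouse, a keyboard, or a switch device, and the task being scored stays the same. A real item would also need to announce these changes to screen readers.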
Ability to Adjust the Timing
Success Criterion 2.2.1 is all about making sure we allow people enough time to do what we’ve asked them to do. Some people take longer, and we should provide a way for them to extend the time to what they need. Taking too long at the ATM? The machine has to be secure, so it has a timer, but it lets you know when it’s about to time out and gives you the option to extend that time.
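The ATM-style behavior described above (warn before timing out, then let the user extend) can be sketched as a small timer model. This is an illustration under stated assumptions, not a reference implementation: the `ExtendableTimer` class and its parameter names are hypothetical, and the defaults of a 20-second warning and ten extensions loosely mirror the minimums in the “Extend” option of 2.2.1.

```typescript
// Hypothetical session timer; all names and defaults are illustrative.
type TimerStatus = "running" | "warning" | "expired";

class ExtendableTimer {
  private deadlineMs: number;
  private extensionsUsed = 0;
  private readonly limitMs: number;       // default time allowed
  private readonly warningMs: number;     // warn this long before expiry
  private readonly maxExtensions: number; // how many times the user may extend

  constructor(startMs: number, limitMs: number, warningMs = 20000, maxExtensions = 10) {
    this.limitMs = limitMs;
    this.warningMs = warningMs;
    this.maxExtensions = maxExtensions;
    this.deadlineMs = startMs + limitMs;
  }

  // Reports whether the user should currently be warned or has timed out.
  status(nowMs: number): TimerStatus {
    if (nowMs >= this.deadlineMs) return "expired";
    if (nowMs >= this.deadlineMs - this.warningMs) return "warning";
    return "running";
  }

  // Adds one more default period to the current deadline.
  // Returns false if time already ran out or no extensions remain.
  extend(nowMs: number): boolean {
    if (this.status(nowMs) === "expired") return false;
    if (this.extensionsUsed >= this.maxExtensions) return false;
    this.extensionsUsed += 1;
    this.deadlineMs += this.limitMs;
    return true;
  }
}

// Usage: a 60-second limit starting at t = 0.
const timer = new ExtendableTimer(0, 60000);
console.log(timer.status(45000)); // warning
console.log(timer.extend(45000)); // true
console.log(timer.status(45000)); // running
```

Passing the current time in as an argument, rather than reading a clock inside the class, keeps the sketch easy to check and leaves the choice of clock to the delivery platform.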
The relevant exception for testing is the “Essential Exception” which states, “The time limit is essential and extending it would invalidate the activity.” I’ve participated in many discussions about how essential a timed test is, and it comes down to why you’ve set the time.
If you set the time because your test center needs to have seats available at particular times of the day, then I’m afraid that has nothing to do with your ability to measure the skills or knowledge of the test taker, and everything to do with your practical business needs. Test centers need to accommodate the possibility that some test takers might take longer than expected.
If, however, the test includes aspects of facility (how quick you are at recalling information or completing a task within a certain time), then this does qualify for the exception. Many test programs do allow specific amounts of extended time for test takers with particular accessibility needs, because the access itself takes additional time. These test takers are just as quick-minded, but getting through the material and responding using their assistive technology simply takes longer.
Sorry, No Exceptions Here
WCAG 2.1 introduced a new success criterion to the Distinguishable guideline about reflow of content. It sets the expectation that content can be presented without requiring the user to scroll in two dimensions. Horizontal scrolling can be particularly cumbersome for low-vision users, and the time it takes to access the information can increase tenfold or more, not to mention the cognitive stress of needing to remember portions of sentences as you move to the next line by scrolling over and down.
Traditionally, test makers have been very careful about the presentation of content. Assessments are measurement tools and controlling the presentation is one way of removing variation. When tests moved from paper to computer, many programs kept this rigid presentation control, even going so far as to exactly match their paper versions.
This new success criterion does not offer an exception for this assessment use case, and I agree with the approach. In the 21st century, many users do most of their reading on screens. They’ve gotten used to responsive design, which reflows text and content. Past research on users working in different delivery modalities has shown us that familiarity is the key factor in readability and performance. Users have also gotten used to setting text preferences that meet their specific needs, or just their comfort.
It’s time for us to accept content reflow in assessment for all users, not just users with specific vision needs. Users should also be able to make changes to the size, spacing, and colors of text-based content.
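As one way to picture user-adjustable text, here is a small sketch that turns a test taker’s preferences into CSS declarations. The `TextPreferences` shape and the `toStyleDeclarations` function are hypothetical names chosen for this example. The sketch assumes that relative units (percentages and `em`) plus a width capped to the container are enough to let text reflow, which real content would still need to be checked against.

```typescript
// Hypothetical preference shape; field names are illustrative only.
interface TextPreferences {
  fontScale: number;       // 1 = default size, 2 = 200%
  lineHeight: number;      // unitless multiple of the font size
  letterSpacingEm: number; // extra space between letters, in em
  foreground: string;      // any CSS color value
  background: string;      // any CSS color value
}

// Builds CSS declarations that use relative units, so enlarged text wraps
// inside its container instead of forcing horizontal scrolling.
function toStyleDeclarations(prefs: TextPreferences): Record<string, string> {
  return {
    "font-size": `${Math.round(prefs.fontScale * 100)}%`,
    "line-height": String(prefs.lineHeight),
    "letter-spacing": `${prefs.letterSpacingEm}em`,
    "color": prefs.foreground,
    "background-color": prefs.background,
    "max-width": "100%",
    "overflow-wrap": "break-word",
  };
}

// Usage: larger text, looser spacing, and a light-on-dark color scheme.
const style = toStyleDeclarations({
  fontScale: 1.5,
  lineHeight: 1.5,
  letterSpacingEm: 0.12,
  foreground: "#ffffff",
  background: "#000000",
});
console.log(style["font-size"]); // 150%
```

The declarations are returned as plain data, so the delivery platform decides where to apply them, for example on the container that holds the item text.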
It’s important to provide access to test takers when it doesn’t affect the measurement of specific skills. Testing programs need to think hard about how their content can be limiting access to some test takers, and what they need to do to improve their tests for the future.
In an upcoming blog post, I’ll discuss a favorite topic of mine, text-based transcripts of media, and how important they can be for all test takers.