BRIDGEWATER STATE COLLEGE

Assessment Guidebook

Chapter 5. Assessment Tools

At this point in the process, your department has a program mission statement (Step 1) and a set of learning outcomes (Step 2). Step 3 is to review, evaluate, and select a tool or tools to assess student achievement in each of the learning outcomes. Selection of a tool involves a tradeoff between the ability to obtain detailed information and the need to keep the process feasible and manageable. For this reason, the chapter gives advantages and disadvantages for each of the various assessment tools. Most assessment experts believe it is important to use multiple assessment tools to overcome the disadvantages of a single tool, albeit with added work and expense.

A. Issues to Consider in Choosing an Assessment Tool

Assessment tools can generally be placed in two categories, direct and indirect. Sometimes a tool from each of these categories is used to get a more holistic view of student learning.

Direct measures of assessment are those in which the products of student work are evaluated in light of the learning outcomes for the program. Evidence from coursework such as projects or specialized tests of knowledge or skill are examples of direct measures. In all cases, direct measures involve the evaluation of demonstrations of student learning.

Indirect measures of assessment are those in which students judge their own ability to achieve the learning outcomes. Indirect measures are not based directly on student academic work but rather on what students perceive about their own learning. Alumni may also be asked the extent to which the program prepared them to achieve learning outcomes. In another example, people in contact with the students, such as employers, may be asked to judge the effectiveness of program graduates. In all cases, the assessment is based on perception rather than direct demonstration.

Direct measures tend to be more time- and labor-intensive than indirect measures, which can often be handled through surveys. Because indirect measures do not require the direct evaluation of student work, larger sample sizes may be possible, which adds to the value of the results.
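
One way to see why a larger sample adds value is to estimate the margin of error of a survey percentage. The sketch below is illustrative only: it assumes a simple random sample, a 95% confidence level (z = 1.96), and hypothetical sample sizes; none of these figures come from this guidebook.

```python
import math

def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample; z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical sample sizes: a small class sample versus a program-wide survey.
for n in (30, 300):
    moe = margin_of_error(0.5, n)
    print(f"n = {n}: about +/- {moe:.1%}")
```

Under these assumptions, a tenfold increase in respondents narrows the margin of error by roughly a factor of three, which is one reason a survey with many responses can complement a direct measure applied to a small sample.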

Validity and reliability are two factors to consider in choosing an assessment tool. Validity is “the degree to which an assessment measures (a) what is intended, as opposed to (b) what is not intended, or (c) what is unsystematic or unstable” (Source: University of Maryland Center for the Study of Assessment Validity and Evaluation). Face validity refers to validity taken at face value; does an assessment tool give an immediate impression that it measures the intended outcome? Content validity refers to how well the content of the tool covers the outcome being assessed; do the items or tasks represent the full range of knowledge and skills in that outcome? Reliability refers to the tool’s consistency in producing results over time and with different samples of students.

B. Direct Methods

1. Capstone course

Capstone courses draw upon and integrate knowledge, concepts, and skills associated with the entire curriculum of a program. Taken normally in the senior year, capstone courses ask students to demonstrate facility in the program’s learning outcomes, in addition to other outcomes associated with the particular course.

Within a capstone course, evidence of student learning may include comprehensive papers, portfolios, group projects, demonstrations, journals, or examinations.

But how does one use this evidence to assess the overall program? The final grade for the course, being a single measure, cannot be broken down into an assessment of student achievement in each of the program’s learning outcomes (although achievement in each of the learning outcomes may contribute to the final grade). One method of assessment in capstone courses is to evaluate student work with an eye toward the multiple dimensions of the program’s outcomes. More than one faculty member can be invited to assist in the assessment of student work, e.g., in a project presentation. The assessment of a major paper or project, or set of papers or projects, may be broken down into sub-assessments of each learning outcome.
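
As a rough illustration of breaking an assessment into sub-assessments, the sketch below summarizes rubric scores by learning outcome rather than by course grade. The outcome names, the 1-4 rubric scale, the two raters, and the target score of 3 are all hypothetical assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rubric scores on a 1-4 scale: (student, outcome, rater, score).
scores = [
    ("s1", "written communication", "rater_a", 3),
    ("s1", "written communication", "rater_b", 4),
    ("s1", "critical analysis", "rater_a", 2),
    ("s1", "critical analysis", "rater_b", 2),
    ("s2", "written communication", "rater_a", 4),
    ("s2", "written communication", "rater_b", 4),
    ("s2", "critical analysis", "rater_a", 3),
    ("s2", "critical analysis", "rater_b", 2),
]

def outcome_summary(rows, target=3):
    """Return {outcome: (mean score, share of students averaging at or above target)}."""
    # Average the raters' scores for each student on each outcome.
    per_student = defaultdict(list)
    for student, outcome, _rater, score in rows:
        per_student[(outcome, student)].append(score)
    # Collect the student averages under each outcome.
    by_outcome = defaultdict(list)
    for (outcome, _student), values in per_student.items():
        by_outcome[outcome].append(mean(values))
    return {
        outcome: (mean(avgs), sum(a >= target for a in avgs) / len(avgs))
        for outcome, avgs in by_outcome.items()
    }

summary = outcome_summary(scores)
for outcome, (avg, share) in summary.items():
    print(f"{outcome}: mean {avg:.2f}, {share:.0%} of students at or above target")
```

A summary of this kind reports on the program's outcomes, not on individual students or instructors, which is the distinction the paragraph above draws between a course grade and program assessment.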

The benefits of a capstone course include the increased ability of students to integrate their learning; close association between the assessment tool and the program’s particular learning outcomes; rapid feedback on the program; and the involvement of a faculty member who contributes to program assessment as part of a normal teaching load.
Disadvantages include disciplinary differences in the appropriateness of a capstone course in the curriculum; the time required to develop and gain approval for the course; staffing requirements; slotting the course into the curricular sequence; and normal variations in course content as the course changes hands from instructor to instructor, which can reduce reliability. Also, this type of assessment does not lend itself easily to comparisons of student work early and late in students’ academic careers (“pre- and post-test” assessment, see item 6 below).

Examples: BSC Philosophy Department, capstone course syllabus

2. Course-embedded assessment

In course-embedded assessment, student work in designated courses is collected and assessed in relation to the program learning outcomes, not just for the course grade. As in the capstone course, the products of student work need to be considered in light of the multiple dimensions of the learning outcomes. Products may include final exams, research reports, projects, papers, and so on. The assessment may be conducted at specific points (e.g., introductory course and upper-level course) in a program.

Benefits include the fact that assessment is conducted as part of the normal workload of students and faculty, although additional work may be needed to incorporate program assessment into the course.
Disadvantages include the potential for a faculty member to feel that her or his work in a particular course is being overseen, even if it is not. Also, rubrics may need to be chosen or developed that are associated with the particular learning outcomes, increasing the preparation time (see Chapter 6, Section B, Rubrics).

Example: Physics at BSC

3. Standardized tests

The Educational Testing Service and other companies offer standardized tests for various types of learning outcomes, such as critical thinking or mathematical problem solving. Scores on tests such as the GRE or the Massachusetts Tests for Educator Licensure (MTEL) may be used as evidence of student learning.

Benefits include the reliability and validity of an assessment instrument that is commercially developed, eliminating the arduous process of developing an instrument in-house; simplicity in administration and evaluation of test results; and the potential for cross-institutional comparisons of results.
Disadvantages include the generic nature of standardized tests and their potential lack of fit with a particular program; a possible lack of motivation by students to take the test or do well on it; and the debatable question of whether a standardized test gives a true measure of student learning. Also, ETS and other services charge substantial fees for these tests, which is an added administrative cost or possibly a cost to the students.

Example: California Critical Thinking Disposition Inventory

4. Locally developed tests

Faculty in a program may decide to develop a test that reflects the program’s mission and learning outcomes. The test is usually graded by multiple evaluators. Locally developed tests are less costly than standardized tests, but they require work by the program’s faculty in development and scoring.

Benefits include the ability to tailor a test to a specific program.
Disadvantages include the challenge of developing a test with proven reliability and validity, the potential need to develop rubrics and train multiple test evaluators in the use of these rubrics, and the need to develop a new test periodically.
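
When multiple evaluators score the same work with a rubric, one common check on their consistency is to compare their scores directly. The sketch below computes simple percent agreement and Cohen's kappa (agreement corrected for chance) for two raters. The scores are hypothetical, and this is only one of several possible consistency checks, not a procedure prescribed by this guidebook.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters gave the same score."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category independently.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical rubric scores (1-4) given by two evaluators to the same eight tests.
rater_a = [3, 3, 2, 4, 1, 2, 3, 4]
rater_b = [3, 2, 2, 4, 1, 2, 3, 3]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"percent agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

Low agreement on a trial set of tests may suggest that the rubric wording or the evaluator training needs another pass before the test is used for program assessment.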

Examples: English Placement Test at California State University, English Placement Test at the University of Wisconsin-Madison

5. Portfolio evaluation

A portfolio is a compilation of student work that, in total, demonstrates a student’s achievement of various learning outcomes. Portfolios can be created for a variety of purposes aside from program assessment, such as fostering reflection by students on their education, providing documentation for a student’s job search, or certifying a student’s competency. Portfolios created over the span of a student’s academic career, compared to those consisting of a student’s work only at the end, provide the basis for a developmental assessment.

Portfolios may combine multiple types of evidence and are not necessarily limited to classroom work. For example, portfolios may contain research papers, presentations, videos, audio recordings, work done through employment, or journal entries discussing co-curricular activities or programs. Once the material is collected, it falls upon an individual or group to establish a system by which to evaluate the contents of the portfolio in terms of a program’s learning outcomes (see Chapter 6, Section B, Rubrics).

In the School of Education and Allied Studies at BSC, portfolios are used to document each student’s competence in teacher preparation. This is a different purpose from that for program assessment. In program assessment, a cross section of students may be sampled to evaluate student learning outcomes, but in teacher preparation, the intent is to validate every student’s competence.

A key question in portfolios arises in the collection of evidence. In teacher preparation, students themselves collect and save the material, and online systems are now available to assist in that process. But for program assessment, the department itself may have to assemble the student portfolios; in this case, issues must be considered about how the students are to be informed of the fact that their work is being assessed for programmatic reasons. Some faculty ask students to sign consent forms to copy work products and to use student work products in accreditation reports. More information on this can be obtained from the BSC Institutional Review Board.

Benefits of portfolios include the ability to document student development over time, and the potential benefit to the students of seeing their own development and of collecting material that may support their career goals. Thus, program assessment becomes an integral part of the learning process.
Disadvantages include the labor-intensive process of evaluating the evidence in student portfolios. Also, there is an expense in storing and organizing the evidence.

Example: Alverno College Diagnostic Digital Portfolio

6. Pre-test/Post-test evaluation

A question that arises in assessment is not only whether students can demonstrate the learning outcomes when they graduate, but also how much of what they can demonstrate was actually gained during their time in the program. This suggests the need to assess the students’ knowledge and skills at the point of entry into the program and, later, at the point of exiting the program. In pre-test/post-test assessment, student work is assessed both early and late in students’ academic careers, from which the growth and development of the students can be deduced.

Several of the previously described tools lend themselves to pre-test/post-test evaluation. Portfolios that collect evidence throughout a student’s academic career can intrinsically be a type of pre- and post-test evaluation. Course-embedded assessment in which student work is collected from introductory and upper-level courses also provides a type of pre- and post-test evaluation, although the level of difficulty in the two courses can be expected to differ considerably. Standardized or locally developed tests can be administered at two times in a student’s career to assess learning. However, if the same test is given both times, students may improve simply because they have seen it before. On the other hand, if different tests are administered at the two times, it can be difficult to ensure that both tests are of the same nature and difficulty, so the reliability of this method becomes a question.
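
When the same test is scored as a percentage at both times, one widely used summary is the normalized gain: the improvement achieved as a fraction of the improvement that was possible. The sketch below is an assumption-laden illustration, not a method prescribed by this guidebook; the scores are hypothetical.

```python
from statistics import mean

def normalized_gain(pre_pct, post_pct):
    """Normalized gain: (post - pre) / (100 - pre), with scores in percent."""
    if pre_pct >= 100:
        raise ValueError("pre-test score leaves no room for improvement")
    return (post_pct - pre_pct) / (100 - pre_pct)

# Hypothetical percent scores for the same four students, before and after.
pre_scores = [30, 45, 25, 50]
post_scores = [55, 60, 50, 70]

# Class-level gain computed from the average pre- and post-test scores.
class_gain = normalized_gain(mean(pre_scores), mean(post_scores))
print(f"average pre {mean(pre_scores):.1f}%, post {mean(post_scores):.1f}%, "
      f"normalized gain {class_gain:.2f}")
```

Because the gain is scaled by the room left for improvement, it allows a rough comparison between groups that start at different levels, although it does not remove the concern about students having seen the test before.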

Benefits include the ability to gain insight into students’ academic development.
Disadvantages include the increased amount of work involved in assessing student work more than once, and the difficulty of designing tests or assessment tools that are truly comparable at different times.

Example: Astronomy Diagnostic Test (e-mail Professor Martina Arndt for a copy)

This is a standardized, multiple choice test to assess content knowledge in astronomy. It is administered by the BSC Physics Department before and after students take the Astronomy course. The test does not count toward student grades. The performance is compared to that of students at other institutions. This diagnostic test was developed for undergraduate, non-science majors taking their first astronomy course by the multi-institutional Collaboration for Astronomy Education Research (CAER).

The test is free of charge, subject to several conditions: it is not to be altered in any way; questions are not to be left out, or removed and placed on course quizzes or exams; and the test items are not to be distributed to students, so that the items do not find their way into "student files." In exchange for using the ADT, the authors respectfully request that faculty submit scores to the national database.

C. Indirect Methods

1. Student self-efficacy

Students have a sense of their own competence. Student self-efficacy involves students rating their perception of their own achievement in particular learning outcomes. Research shows a significant, although imperfect, correlation between actual and perceived competence. What can be problematic are gender and demographic differences in the accuracy of self-efficacy. For example, certain groups of students may rate their quantitative skills at a level below that indicated by standardized tests. Also, unless the answers are anonymous, students are likely to overrate their abilities. The same is true if students perceive that they can be penalized for their answers.

Self-efficacy as an assessment tool is relatively simple. For example, a researcher/assessment expert at Clemson University has designed a survey that asks students to rate both the perceived importance of, and their self-efficacy in, leadership skills, communication skills, interpersonal skills, analytical skills, decision-making skills, technological skills, the global economy, ethics, and business practices (see Example).

Benefits include the inexpensive nature of the tool. A relatively simple survey can be constructed which simply asks students to rate their competence in different areas. Also, pre- and post-test assessment can be conducted to examine changes both in self-efficacy and perceived importance of a topical area. Another benefit is that all learning outcomes can be assessed simultaneously, in one test.
Disadvantages include an imperfect relationship between self-efficacy and actual competence; student self-reporting may not always be congruent with their actual level of achievement.
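
To illustrate how ratings of importance and self-efficacy might be compared, the sketch below averages both ratings for each skill area and ranks the areas by the gap between them. It is a hypothetical example on an assumed 1-5 scale and is not the instrument or analysis used in the Clemson study cited below.

```python
from statistics import mean

# Hypothetical survey responses on a 1-5 scale, one list entry per student.
responses = {
    "communication skills": {"importance": [5, 4, 5], "self_efficacy": [3, 3, 4]},
    "analytical skills": {"importance": [4, 4, 5], "self_efficacy": [4, 4, 4]},
    "ethics": {"importance": [3, 4, 3], "self_efficacy": [4, 4, 5]},
}

def gap_table(data):
    """Return (skill, mean importance, mean self-efficacy, gap), largest gap first."""
    rows = []
    for skill, ratings in data.items():
        importance = mean(ratings["importance"])
        efficacy = mean(ratings["self_efficacy"])
        rows.append((skill, importance, efficacy, importance - efficacy))
    return sorted(rows, key=lambda row: row[3], reverse=True)

table = gap_table(responses)
for skill, importance, efficacy, gap in table:
    print(f"{skill}: importance {importance:.2f}, "
          f"self-efficacy {efficacy:.2f}, gap {gap:+.2f}")
```

Areas where students rate importance well above their own competence may merit a closer look with a direct measure, since self-reports alone are an imperfect guide to actual achievement.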

Example: Charles Duke research article. See abstract below.

Learning Outcomes: Comparing Student Perceptions of Skill Level and Importance
Charles R. Duke, Clemson University
Journal of Marketing Education, Dec. 2002, 24(3): 203-218

"Presents a study that discusses the process used for developing learning outcomes and the illustrative analysis of student perceptions as one step in the process of implementing a learning outcomes approach to curricular design. Illustration of a potential method of applying the analysis of student perception within the evaluation process; Methodology that can improve the faculty's understanding of the students' feeling about the program; Performance evaluation approaches."

2. Student satisfaction surveys

Given that student satisfaction with a program or course is not a learning outcome, satisfaction may or may not relate to outcomes assessment. But satisfaction may correlate with other variables. For this reason, a common component of assessment systems is the student satisfaction survey. Such surveys may consider the extent to which students are satisfied with their interactions with faculty, with their introductory or advanced courses, or with their preparedness coming out of the program (see Student Self-Efficacy). Use of individual course evaluations for program assessment is problematic because the evaluations reflect on individual instructors – a serious pitfall to be avoided in program assessment.

Benefits include the relative simplicity of administering this type of survey. Standardized, commercial surveys are available that provide comparison data from other institutions.
Disadvantages include the difficulty of designing questions appropriately and, again, the potential hazard of linking student satisfaction to achievement of learning outcomes.

Example: Noel-Levitz Student Satisfaction Inventory, sample survey

3. Student attitudinal surveys

If learning outcomes include elements of appreciation or understanding of particular issues of concern, student attitudes can be measured as part of the assessment program. For example, informed appreciation for the arts may be assessed using an attitudinal survey. Another example may be students’ empathy toward disadvantaged groups, which can be measured in an attitudinal survey. A further example would be attitudes toward learning or toward the profession. Both standardized tests and locally designed surveys can be used for this purpose, although the responses are very sensitive to the wording of the questions.

Benefits include the simplicity of administering the survey.
Disadvantages include the challenge of determining student attitudes in a reliable manner.

Examples: Cooperative Institutional Research Program (CIRP) “Freshman” Survey, BSC Psychology Department

4. Exit interviews

Rather than assess students’ attitudes, self-efficacy, or satisfaction through the use of surveys, students may be interviewed directly in individual or focus-group settings. Such interviews allow a more thorough, free-form exploration of the issues through the use of follow-up questions that depend on students’ responses.

Benefits include the depth and richness of information that can be obtained through interviews.
Disadvantages include the time- and labor-intensive nature of conducting such interviews and of analyzing the information obtained so that it can be compared across multiple interviews. Also, student anonymity needs to be protected in this tool, and stray comments about individual faculty must not become part of the assessment data.

5. Alumni surveys

The perspective that students have on their education may change significantly after time away from school. Some learning outcomes lend themselves more naturally to questions posed some time after graduation. For example, an outcome involving preparation for professional practice can best be assessed after the student has graduated and been employed in the job market.

Benefits include the real-world perspective that can be obtained from alumni.
Disadvantages include the difficulty of finding and reaching alumni, the possibly self-selective nature of those who choose to respond, and the relatively narrow scope of learning outcomes that can be assessed in this manner.

Examples: BSC Program Review alumni survey, Career Services survey

6. Employer surveys

Some of the students’ knowledge and skills may be evident to the employers who rely on these characteristics. Thus, some accrediting bodies either require or encourage programs to perform an assessment through the major employers of their students. Such assessments may range from information as basic as hiring data, to site supervisor evaluations, to detailed surveys of the characteristics that the employers perceive in program graduates. Advisory boards, anecdotal information, and placement data may be used in place of formal surveys.

Benefits of this tool include the real-world perspective that employers might be able to provide.
Disadvantages include the potentially limited ability of employers to assess their employees’ characteristics in terms of specific learning outcomes, or the inability of employers to single out graduates of a particular school for assessment. Also, this tool depends on surveying employers with sufficient numbers of graduates. In large corporations, it may even be difficult to find the right person to contact for this information. In addition, former students may object to having their employers surveyed in this way.

Example: Tennessee Tech University engineering employer survey

7. Curriculum analysis

Historically, accrediting bodies have required institutions or programs to document the information that students are receiving and the content that the program delivers in its courses (see Course Mapping in Chapter 4, Learning Outcomes). Documentation can be obtained from the curriculum and syllabi of individual courses.

With the move toward learning-outcomes assessment, programs are required to show that students actually exhibit the skills and qualities that the program wishes to develop. However, a curriculum analysis may still be relevant and is often included in accreditation documents. For example, some accrediting bodies may require the documentation of the number of hours devoted to a particular subject in the curriculum.

Benefits include the relatively straightforward task of analyzing the content of the curriculum, for which only course syllabi may be needed.
Disadvantages include the potential gap between the delivery of material and the documentation of learning for specific outcomes; showing that content was taught does not show that students learned it.

Example: BSC Anthropology program curriculum analysis


Last Modified: October 21, 2004