Summative Assessment

Estimated time to complete: 100 minutes

Module Learning Objectives

By the end of this module, you will be able to…

  • Define summative assessments.
  • Argue for summative assessments as a strategy to support positive learning experiences and inform teaching practices.
  • Argue for rubrics as a mechanism to support and evaluate student learning.
  • Integrate summative assessments into your course design.
  • Align summative assessments with learning objectives and learning experiences.

Summative Assessment Defined

Summative assessments are evaluations conducted at the end of a learning period to determine a student’s overall understanding and development of expertise. As seen in the backward design process, summative assessments provide evidence about the extent to which students achieved the learning objectives (Wiggins, 2005; Wiggins and McTighe, 2011).

Unlike formative assessments, which are ongoing and provide feedback during the learning process, summative assessments are typically administered after a unit has been completed to provide a summary of a student’s learning progress and achievement (Kibble, 2017). Summative assessments are comprehensive and assess a broad range of knowledge, skills, and competencies related to the learning objectives. Often, they are high-stakes, meaning summative assessments carry significant weight in determining students’ grades or academic progress.

Examples of summative assessments include:

  • Comprehensive tests
  • Written artifacts
  • Project products
  • Presentations
  • Performance assessments

Why Summative Assessment?

Summative assessments play a role beyond merely assigning grades to students. They serve as invaluable tools for evaluating student learning and providing feedback to both learners and instructors.

For learners, summative assessments serve as milestones in a course trajectory, marking significant targets for students to strive toward. Summative assessments also hold students accountable for their learning and represent indicators of their increasing skill development and competence in the subject matter. Assessment drives learning; students tend to focus on what will be assessed.

For instructors, summative assessments provide essential data to gauge the extent to which students have met predefined objectives and to identify areas where additional support may be required. For a scientific instructor, collecting and analyzing these data is crucial to refining teaching practice (Ebert-May et al., 2003) and is aligned with calls to teach science the same way it is practiced (West and others, 1991; Undergraduate Science Education, 1997; Bransford et al., 2000; Glaser et al., 2001; Cech, 2003).

Just as a scientist uses data to evaluate their hypotheses, a scientific instructor uses assessment data to evaluate their instructional practice. And given the importance of engaging students in scientific practices to equip them for future careers in science, summative assessments need to extend beyond evaluating content knowledge and assess the competencies expected of aspiring scientists.

A Framework for Summative Assessment

You already have the framework! Recall that when we leverage backward design, we:

  • Identify desired results about what students should know, understand, and be able to do (learning objectives).
  • Articulate what evidence would indicate progress toward and achievement of those results.
  • Plan learning experiences to guide and scaffold students toward that progress and achievement.

These elements of backward design help guide your creation of summative assessments, which will reinforce key course concepts and serve as checkpoints throughout your students' learning process.

Summative Assessment in Practice

Summative assessments are a critical component of course design, and the right choice of assessment depends on the information you are trying to gather. Different summative assessments gather different information about a student’s learning by focusing on various aspects of their knowledge, skills, and abilities.

Each strategy also varies in how broadly or deeply the objectives are assessed, its objectivity, the amount of time and resources it takes to administer and grade, and whether it adds additional performance anxiety to the student.

By using a variety of assessment methods, instructors can gain a more comprehensive understanding of a student’s strengths and areas for improvement.

Summative Assessment Techniques

Reflecting on backward design, your assessments need to be aligned to the learning objectives of the unit. In a STEM course, summative assessments provide a way for students to demonstrate their understanding of the facets of science. The summative assessment techniques you select should directly align with your established learning objectives in the scientific discipline. This entails designing evaluations that not only test content knowledge but also measure practical skills and competencies—such as scientific inquiry, data analysis, problem-solving, and effective communication—that are crucial for success in STEM fields. Authentic assessment is an approach where students demonstrate their knowledge and skills through tasks that reflect how science is done in the workforce (Wiggins, 1990; Schultz et al., 2022).

For example, here are four different approaches to summative assessment:

Traditional Exam-Based Example
  • Weekly quizzes
  • 3 midterm exams
  • Final exam
Writing-Centric Example
  • Weekly short answer quizzes
  • 4 self-reflection essays
  • Final research paper (scaffolded sections due every 2 weeks)
Project-Based Example
  • Weekly quizzes
  • Group project: proposal, presentation, report
  • Individual self-reflection essay
Performance-Based Example
  • 3 lab reports
  • 4 lab practicals
  • Independent project: proposal, presentation, report

Notice that all these courses use more than one kind of summative assessment technique except the Traditional Exam-Based plan, which uses only tests. A well-designed course incorporates a range of assessment opportunities that combine both low- and high-stakes assessments. For example, most of the plans administer weekly quizzes instead of relying on just a few assessments during the whole semester. This way, lower performance on one assessment doesn’t devastate a student’s grade, and students have multiple opportunities to demonstrate their knowledge and abilities.
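The arithmetic behind spreading a grade across many assessments can be sketched in a few lines of Python. This is a minimal illustration, not a recommended grading scheme: the function name, the weights, and the scores are all hypothetical and were chosen only to make the comparison easy to follow.

```python
def course_grade(scores, weights):
    """Return the weighted average of percentage scores.

    Both arguments are hypothetical inputs for illustration;
    the weights are expected to sum to 1.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical plan A: two exams worth 50% each, one of them a poor score.
few = course_grade([50, 90], [0.5, 0.5])

# Hypothetical plan B: ten assessments worth 10% each, with the same poor score.
many = course_grade([50] + [90] * 9, [0.1] * 10)

print(round(few, 1), round(many, 1))
```

Under these assumed numbers, the same single score of 50 lowers the course grade far more when only two assessments count than when ten do.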

In addition, each of these approaches to summative assessment could be designed to authentically reflect disciplinary practice, allowing students to apply their knowledge and skills in meaningful contexts. This approach helps students see the relevance of what they are learning and strengthens their scientific self-efficacy (Estrada et al., 2011). For example, exam questions could present real-world problems that require students to apply knowledge in context (Villarroel et al., 2020), or projects could mirror tasks that scientists perform in professional settings (Villarroel et al., 2018).

Plan for Summative Assessment

Call to mind a course you are teaching, have taught, or are planning to teach.

In the Course Design module, we asked you to imagine the third week/day for your course and some objectives for that moment in time.

Then in Active Learning Experiences, we asked you to design an activity that embodies active learning experiences around one objective.

How could students demonstrate enduring understanding of the scientific content and concepts related to that objective?

In what ways could they demonstrate scientific skills and competencies related to that objective?

Be sure to imagine several, varied ways students could demonstrate the objective.

Design with Professor Pham

Professor Pham is designing an assessment plan for their medical microbiology course. As learning goals for the course, Professor Pham wants students to develop skills in collecting and organizing information, communication, and creative thinking, as well as understand how microorganisms impact human health and society. Professor Pham knows that presentations can be an excellent medium to assess these competencies.

How could Professor Pham design the presentation assignment to align with facets of science?

For example, to target the Discovery facet, the presentation must address potential biases and stereotypes in medical history and reflect on the importance of diversity and inclusion in microbiology research. How could the presentation assignment align with the other facets?

Design with Professor Pham: Now You Try!

Call to mind a course you are teaching, have taught, or are planning to teach.

Select two or more of the following assessment techniques:

  • Comprehensive tests
  • Written artifacts
  • Project products
  • Presentations
  • Performance assessments

Design a summative assessment plan for your classroom that uses the selected techniques. Briefly describe your design.

How would the assessment plan provide students the opportunity to demonstrate that they understand the appropriate facets of science for your course?

How would each summative assessment technique evaluate the skills and knowledge that students practiced during formative assessments?

What adjustments would you need to make so the plan is feasible for you and your instructional team to implement and grade?

Inclusive Summative Assessments

Given the significant influence of summative assessment outcomes on students and their grades, which can directly affect scholarships, university admissions, and career opportunities, it is essential to design summative assessments that are equitable, accurate, and fair in their evaluation of student knowledge.

Fairness in Testing

The Standards for Educational and Psychological Testing provide principles to establish fairness in testing (Association et al., 1985):

  • All test takers have access to materials and opportunities to learn
  • All test takers receive equitable treatment during tests
  • Biases are removed from the assessment and in evaluation of the student’s work

Additionally, instructors can practice these elements to produce high-quality and equitable assessments (Kibble, 2017):

  • Ask instructional staff and colleagues to review assessments for construct underrepresentation (e.g., too few items on a construct, inclusion of trivial items) and construct-irrelevant variance (e.g., items that are too hard or too easy, contain trivial details, or are culturally insensitive).
  • Include enough high-quality items to provide a reliable and accurate picture of students’ knowledge, skills, and abilities.
  • Provide clear instructions and practice materials to students.

The principles and elements outlined above are crucial for ensuring that all students have an equal opportunity to showcase their knowledge and understanding on assessments that accurately measure their abilities. While these principles may seem intuitive, it’s essential to delve deeper into the evidence-based strategies instructors can use to design assessments that promote equity and accuracy. By doing so, educators can create an equitable and inclusive assessment environment that values the diverse strengths and abilities of all students.

Transparent Assessment Design

Summative assessments can be made more inclusive by applying the principles of transparent assignment design. Clearly communicating the purposes, tasks, and criteria of an assignment helps all students, regardless of their background or prior educational experiences, understand what success looks like and how an assignment supports their learning (Winkelmes et al., 2023). This transparency reduces uncertainty and also helps students see the relevance of the assessment beyond the course, supporting student motivation and metacognition.

A transparent assignment design template and other resources are available online from the Transparency in Learning and Teaching project.

Example of an Inclusive Assessment Tool: Rubrics

To be equitable and inclusive, assessments need to be evaluated in a way that is standardized and consistent across student work. Rubrics can achieve this goal by providing a detailed description of the criteria and standards used to assess student performance. Rubrics help instructors assess student work systematically and consistently, providing feedback that is aligned with the established criteria. Without rubrics, grading of written assessments can be subjective and biased, leading to unfair grades and confusion for students.

Rubrics also provide a positive framework for student expectations, outlining clear steps that show progress toward meeting those goals. Conveniently, when shared with students, rubrics support metacognition by giving students the tools they need to self- or peer-assess their performance while developing their ideas and creating early drafts.

Considerations for creating rubrics

A rubric typically consists of a grid or list that outlines the specific criteria for success, along with descriptions of different levels of performance for each criterion (e.g., developing, proficient, and excellent).

Rubrics can be analytic or holistic. Analytic rubrics provide a different score for each criterion, such as overall ideas and conceptual understanding, organization, use of evidence in arguments, grammar and spelling, and format. In contrast, holistic rubrics provide a single score that represents the grader’s overall assessment of the work and how its components hang together (Bean and Melzer, 2021). Analytic rubrics are more common in STEM courses, though holistic rubrics are not unheard of.
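To make the analytic versus holistic distinction concrete, here is a minimal Python sketch. The level names follow the examples given in this module (developing, proficient, excellent), but the point values, the `Criterion` class, and the function names are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical point values for each performance level.
LEVELS = {"developing": 1, "proficient": 2, "excellent": 3}


@dataclass
class Criterion:
    name: str
    level: str  # must be one of the keys in LEVELS


def analytic_scores(criteria):
    """Analytic rubric: return a separate score for each criterion."""
    return {c.name: LEVELS[c.level] for c in criteria}


def holistic_score(overall_level):
    """Holistic rubric: return one score for the work as a whole."""
    return LEVELS[overall_level]


# A hypothetical lab report rated on three criteria.
report = [
    Criterion("conceptual understanding", "excellent"),
    Criterion("use of evidence", "proficient"),
    Criterion("organization", "developing"),
]

scores = analytic_scores(report)
overall = holistic_score("proficient")
print(scores, overall)
```

The analytic result shows the student where the work is strong and where it needs attention, while the holistic result communicates a single overall judgment.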

Rubrics can be generic or task-specific. A generic rubric would apply to multiple assignments, whereas a task-specific rubric would call out specific criteria for one assignment (Bean and Melzer, 2021). For example, a generic rubric could be used to assess all written assignments during a semester, whereas a task-specific rubric would be tailored for each lab report.

Rubrics use a range of descriptors for performance levels (Bean and Melzer, 2021). Terms indicating understanding or achievement of learning objectives might include “exceeds”, “fully”, or “meets criteria”. Terms such as “usually” and “sometimes” specify mid-range competency and may indicate that students need more practice with the concepts or skills, or that they need to improve communication of their understanding. Terms like “never”, “rarely”, or “minimally” convey that the criteria are not being met and more work would need to be done to further understanding or demonstrate proficiency.

And to write a rubric, once again we revisit our backward design principles:

  • Establish clear and specific expectations for student performance based on your learning objectives and assignment type (e.g., written lab report, oral presentation).
  • Use the expectations to fill out the “criteria” column of the rubric, then articulate “rating” levels.
  • Confirm that the rubric is emphasizing knowledge and skills gained in the course, rather than prior knowledge or ability.

Analytic vs. Holistic Rubric

Examine these example rubrics showing two different approaches to assessing STEM student work.

Which example rubric resonates with your experiences as a student? Why?

Call to mind a course you are teaching, have taught, or are planning to teach.

Which example rubric would allow you to communicate clear and specific expectations for student performance in your course? How so?

How else do you communicate clear and specific expectations for student performance in your course?

Analytic vs. Holistic Rubric: Now You Try!

Call to mind a course you are teaching, have taught, or are planning to teach.

Using these example rubrics as a guide, answer the following questions.

Imagine a lab report, research paper, group project, or other large assignment for your course or a course similar to yours. Based on the assignment type and your learning objective(s), what is one criterion on which you would assess that student work?

Describe what student work looks like that fully meets that criterion. How much variety could there be in student work that still meets that criterion? Briefly give a few examples.

Describe what student work looks like that does not meet that criterion. How much variety could there be in student work that still does not meet that criterion? Briefly give a few examples.

What advice would you provide students whose work does not meet the criterion to help them move toward fully meeting it?

And what would you say to students to convey why the criterion matters for the goals of the course and beyond?

Then, repeat the above questions, choosing a second criterion on which you would assess this student work.

Finally, how could you present this information to students in an effective, equitable, learner-centered way? Think about how you would design a rubric around information like this, then briefly describe your design.

Example of an Inclusive Assessment Tool: Exams without Time Limits

Timed tests are common in education, but what does the research say? Studies show that time-limited tests are less valid, reliable, inclusive, and equitable than tests in which there is no time limit or a sufficiently generous limit that allows all students to complete the assessment (Gernsbacher et al., 2020).

Test-taking pace is not a valid measure of student understanding. Research has found that timed tests disproportionately disadvantage students from certain groups, which makes these tests less equitable.

Research shows that most students who do receive additional time on tests do not use all of it (Cahalan-Laitusis et al., 2006; Holmes and Silvestri, 2019; Spenceley and Wheeler, 2016). Gernsbacher and colleagues speculate that these students are actually requesting not to experience the anxiety and pressure of running out of time (Gernsbacher et al., 2020). When time limits are removed from exams, numerous studies have shown that student performance improves across student groups, including students who are learning English, who come from underrepresented backgrounds, who are older than average, and who are female (De Paola and Gioia, 2016; Foos and Boone, 2008; Mullane and McKelvie, 2000). For example, Foos and Boone showed that young adults score higher than older adults under standard timed test conditions, but older adults perform as well as young adults when time limits are removed (Foos and Boone, 2008).

So, consider administering untimed asynchronous tests, such as take-home exams or untimed online exams, and design test questions that aren’t easily searchable by leveraging higher-order skills and the facets of science.

Equitable Summative Assessments Leverage Universal Design for Learning

As we saw in the Course Design module, incorporating Universal Design for Learning (UDL) principles into course materials not only benefits students with disabilities but also enhances learning experiences for all learners. In addition, instructors have a responsibility to comply with disability rights legislation and provide student accommodations to ensure access to educational materials and assessments. Here are suggestions for how you can use UDL principles in the context of summative assessments:

Provide multiple means of representation by offering different formats for presenting information:

  • Visual: images, diagrams, charts, and graphs
  • Auditory: audio recordings, podcasts, and videos
  • Text-based: written text, transcripts, and summaries

Provide multiple means of action and expression to allow students to demonstrate their knowledge and skills through various methods:

  • Writing: papers (research, essay, lab reports) and free-response answers
  • Speaking: oral presentations, debates, and discussions
  • Creating: multimedia projects

Provide multiple means of engagement by designing assessments to include features that motivate and engage students:

  • Authentic tasks: use real-world problems that reflect students’ interests and experiences
  • Cultural relevance: incorporate diverse perspectives, cultures, examples, and scenarios
  • Choice and autonomy: allow students to choose topics, formats, or pace
  • Real-time critique: provide regular feedback and progress monitoring

Equitable Summative Assessments Reduce Stereotype Threat

As we saw in the Inclusive Learning module, stereotype threat is a psychological phenomenon that refers to the feeling of anxiety or self-doubt that individuals experience when they are in a situation where they are at risk of being judged or evaluated based on a negative stereotype about their group (Steele et al., 2002). This can lead to a decrease in performance and motivation, as individuals may feel like they are being held to a lower standard or that they are being judged based on their group membership rather than their individual abilities.

Students deserve the opportunity to engage with an assessment to the best of their ability without being impeded by stereotype threats, biases, or assumptions. To achieve this, design and conduct summative assessments in a way that minimizes the activation of stereotype threat by following these practices:

  • Frame assessments as opportunities to learn and demonstrate progress rather than as high-stakes evaluations. Do not frame exams or quizzes as definitive judgments of a student's intelligence or identity.
  • Provide practice exams, assignments, or low-stakes quizzes to help students become familiar with the testing format and reduce anxiety.
  • Teach stress-reduction techniques, such as deep breathing or mindfulness exercises, before exams.
  • Write inclusive test questions, and include diverse and inclusive examples and scenarios.
  • Ensure that test materials, such as images or scenarios, do not perpetuate negative stereotypes or biases.
  • Solicit demographic information after exams, not before (Kumar, Oct 2010).

Commit to Action: Equitable Summative Assessments

Call to mind a course you are teaching, have taught, or are planning to teach.

What is one concrete action you can take to make the design of your summative assessments more equitable?

Equitable Grading

Traditional grading is the practice of assigning numerical points to one-time assessments and aggregating those points into a single letter grade for the course (Nilson et al., 2023). Often, there are no opportunities to reattempt the assessment, so students are unable to apply feedback (Nilson et al., 2023). Overall, traditional grading penalizes students who have received unequal opportunities and discourages growth and learning (Nilson et al., 2023).

Incorporating more equitable grading practices promotes a more just and inclusive educational environment that supports the success of all students. Below are some equitable grading practices you can implement in your course.

In providing feedback (Hope, 2020):

  • Ensure it is constructive, focusing on strengths and areas for improvement. Specific and actionable feedback can help students understand their performance and make meaningful progress toward developing expertise.
  • Locate it separately from the grade or points
  • Make sure it is timely so that students are not waiting or guessing

Ensure your grading policy is transparent (Garcia et al., 2023):

  • Communicate the criteria and standards used to evaluate performance
  • Communicate how grades are assigned
  • Provide rubrics, standards, or expectations

Do not apply grading curves (Feldman, 2023):

  • Curves are not transparent and make it difficult for students to determine grades
  • They are inconsistent and unfair
  • Curving grades is the result of misalignment in the course design: instructors need to ask themselves why students are failing exams rather than applying a curve (which “fixes” the symptom but not the underlying issue)

Check your own biases and assumptions (Feldman, 2023):

  • Grade without knowing who students are to prevent bias in your grading based on existing beliefs about students.
  • Become aware of common stereotypes and find ways to avoid perpetuating and activating them.
  • Don’t let expectations or assumptions about students affect your grading.
  • Use your grading data to self-check: did you give males higher scores than females on free-response questions? What about students from HECs?
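A grading self-check of this kind can start with a simple comparison of average scores across groups. The Python sketch below is only a starting point under stated assumptions: the group labels, the scores, and the function name are hypothetical, and a difference in means does not by itself demonstrate bias. It is a prompt to look more closely at how the work was graded.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(records):
    """Average free-response scores for each group.

    `records` is a hypothetical list of (group_label, score) pairs.
    """
    groups = defaultdict(list)
    for group, score in records:
        groups[group].append(score)
    return {group: mean(scores) for group, scores in groups.items()}


# Hypothetical scores out of 10 on one free-response question.
records = [
    ("group_a", 8), ("group_a", 6),
    ("group_b", 9), ("group_b", 7),
]

means = mean_score_by_group(records)
gap = max(means.values()) - min(means.values())
print(means, gap)
```

With real course data, small group sizes can make such gaps noisy, so treat the result as a question to investigate rather than a conclusion.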

Provide opportunities for practice, retakes, and redos (Feldman, 2023):

  • Allow students to apply what they learned from the assessment feedback to retest for partial or full credit.
  • Assessment drives learning: give students another opportunity to grow and showcase their knowledge.
  • Provide lots of opportunities for students to practice with the material and get feedback before the high-stakes assessment.

Alternative Grading Methods

In recent years, there has been a growing movement away from traditional grading methods altogether and toward alternative grading methods. Alternative grading shifts the focus from traditional letter grades (A-F) or numerical scores to a more nuanced and detailed assessment of student learning.

Common alternative grading methods include standards-based grading, specifications grading, and ungrading. Briefly:

  • Standards-based grading assesses students’ competency in specific skills or knowledge. Final grades are based on how many standards a student meets, no matter the number of attempts (Nilson et al., 2023).
  • In specifications or specs grading, the instructor creates a list of attributes of a successful submission for assignments. Students then earn a “Satisfactory” or “Not Yet” on their work and have the opportunity to resubmit for full credit (Nilson et al., 2023).
  • Ungrading removes grades in favor of formative and descriptive feedback. It can replace deficit thinking with an approach that values students’ cultural and familial knowledge, recognizing these insights as critical to their education and relationships (Mejia et al., 2018). Instructors meet regularly with students to discuss progress, and students build a portfolio of their work and growth (Nilson et al., 2023).
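As one illustration of these methods, the standards-based idea that a final grade depends on how many standards are met, regardless of the number of attempts, can be sketched in Python. The standards, the attempt records, and the letter-grade cutoffs below are hypothetical assumptions and are not taken from Nilson et al.

```python
def count_standards_met(attempts):
    """Count standards met on at least one attempt.

    `attempts` maps each standard to a list of booleans, one per attempt.
    """
    return sum(1 for results in attempts.values() if any(results))


def letter_grade(n_met, cutoffs):
    """Return the first letter whose minimum number of standards is reached."""
    for letter, minimum in cutoffs:
        if n_met >= minimum:
            return letter
    return "F"


# Hypothetical record for one student: True means the attempt met the standard.
attempts = {
    "standard 1": [False, True],        # met on the second attempt
    "standard 2": [True],               # met on the first attempt
    "standard 3": [False, False],       # not yet met
    "standard 4": [False, False, True], # met on the third attempt
}

# Hypothetical cutoffs: standards required for each letter, highest first.
CUTOFFS = [("A", 4), ("B", 3), ("C", 2), ("D", 1)]

n_met = count_standards_met(attempts)
grade = letter_grade(n_met, CUTOFFS)
print(n_met, grade)
```

Note that the earlier unsuccessful attempts carry no penalty in this sketch: only whether each standard was eventually met affects the grade.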

Instructors who are interested in alternative grading but can’t or aren’t ready to overhaul their course can try partial conversions, such as standards-based testing (i.e., only tests are graded using standards).

Commit to Action: Equitable Grading

Call to mind a course you are teaching, have taught, or are planning to teach.

What is one concrete action you can take to make the grading in your course more equitable?

Reflection: Course Alignment

Call to mind a course you are teaching, have taught, or are planning to teach.

Briefly describe the alignment between your summative assessment plan and your inclusive learning commitments, learning objectives, and syllabus. Where is there room for growth or adjustment?

Summary of Summative Assessment

  • Summative assessments evaluate whether students have achieved the intended learning objectives at the end of a learning period.
  • They provide feedback to both students and instructors.
  • Summative assessments are usually (but not always) tied to grading or other performance metrics.
  • Rubrics provide detailed descriptions of the criteria and standards used to assess student performance, reducing subjective and biased grading.
  • Summative assessments provide data about learning that can inform future changes, an important component of scientific teaching.

Takeaways from Summative Assessment

Identify two key takeaways that resonate most with you after completing this module.