In the largest analysis to date of narrative medical school evaluations, researchers at UC San Francisco and Brown University have found significant differences in how female and underrepresented minority medical students are described.


The researchers used natural language processing to analyze a large bicoastal sample of nearly 90,000 narrative evaluations of third-year clerkships from UCSF and Brown. The data spanned nine years at UCSF (2006 to 2015) and five at Brown (2011 to 2016).

These evaluations are supposed to focus on student behaviors -- or competencies -- that are directly relevant to medicine. But the analysis found that evaluators often used personal descriptors to describe a student's performance, and they used strikingly different words for men and for women.

The study also identified personal descriptors that were applied differently depending on whether students were members of groups that are underrepresented in medicine (URM). But nearly all of these personal descriptors were used more often to describe students who were not in those groups.

The evaluations form the basis of the grades that students get in their core clerkships, which are like medical apprenticeships, and they are frequently quoted in letters of recommendation for residencies. Even small biases can snowball, with lasting effects on students' career prospects.

"There shouldn't be systematic differences based on gender or URM status in a sample this big," said Urmimala Sarkar, MD, MPH, an associate professor of medicine at UCSF and the senior author of the study, published Tuesday, April 16, in the Journal of General Internal Medicine. "Everything should come out in the wash."
