Te-Chuan Chen, Meng-Chih Lin, Yuan-Cheng Chiang, Lynn Monrouxe & Shao-Ju Chien
Introduction: Onsite scoring is common in traditional OSCEs, although there is the potential for an audience effect that facilitates or inhibits performance. We aimed to (1) analyze the reliability between onsite scoring (OS) and remote scoring (RS); and (2) explore the factors that affect scoring at different locations.
Methods: A total of 154 students and 84 raters were enrolled at a single site during 2013–2015. We randomly selected six stations from a 12-station national high-stakes OSCE. We applied generalisability theory for the analysis and investigated raters' perceptions affecting RS.
Results: The internal consistency (Cronbach's α) of the checklists was 0.92. The kappa agreement was 0.623 and the G value was 0.93. The major source of variance was the students themselves, with smaller contributions from location and raters. A three-component analysis comprising Technical Feasibility, Facilitates Wellbeing, and Observational and Attention Deficits explained 73.886% of the total variance in RS scoring.
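For readers unfamiliar with the reliability statistics reported above, the following sketch shows how Cronbach's α (internal consistency across checklist items) and Cohen's kappa (chance-corrected agreement between two raters) are conventionally computed. The toy arrays are illustrative only and are not data from this study.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha: rows = examinees, columns = checklist items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item sample variance
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def cohen_kappa(a, b) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters."""
    a, b = np.asarray(a), np.asarray(b)
    categories = np.union1d(a, b)
    p_observed = np.mean(a == b)
    p_expected = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

# Toy example: 3 examinees x 2 perfectly consistent items -> alpha = 1.0
scores = np.array([[1, 1], [2, 2], [3, 3]], dtype=float)
print(cronbach_alpha(scores))            # 1.0

# Toy example: two raters agreeing on 3 of 4 pass/fail ratings
print(cohen_kappa([1, 1, 1, 0], [1, 1, 0, 0]))   # 0.5
```

The G value from generalisability theory is a broader coefficient that partitions variance across multiple facets (students, stations, raters, locations) simultaneously; its computation requires the full variance-component decomposition and is not shown here.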
Conclusions: Our study demonstrated moderate agreement and good reliability between OS and RS ratings. We validated factors related to facility operation and quality for RS raters. Remote scoring can provide an alternative forum that allows raters to overcome the barriers of distance and space while avoiding the audience effect.