Pediatric POCUS Interpretation - Variability in Learning

By Delia Gold


The Variable Journey in Learning to Interpret Pediatric Point-of-care Ultrasound Images: A Multicenter Prospective Cohort Study

AEM Education and Training, July 2019

Take Home Points

1. There is a significantly variable rate of achievement across learners and applications.

2. Participants required the highest number of cases for the cardiac application, and lowest for skin/soft tissue to reach a specified benchmark.

3. Deliberate practice of pediatric POCUS image cases using an online learning and assessment platform may lead to skill improvement in POCUS image interpretation.


Learning POCUS has become a priority in PEM, but there are barriers to achieving proficiency. Opportunities for bedside learning are limited by the number of POCUS-trained faculty available (a known problem supported by the literature) and by the low percentage of pathology in pediatrics (as opposed to adults). POCUS expertise is by definition multifaceted: unlike radiology-performed ultrasound, a POCUS user must be able to acquire images, interpret images, and integrate that interpretation into medical decision-making. There is evidence that, to facilitate complex learning, one’s education should alternate part-task with whole-task training. E-learning can be used to accomplish the part-task of image acquisition or interpretation in order to complement the resource-intensive face-to-face bedside teaching that addresses all facets of POCUS (i.e., whole-task). These authors developed a POCUS image interpretation and assessment system in order to gain information about how PEM physicians learn POCUS.


  1. To determine PEM physician POCUS interpretation performance metrics

  2. To determine the number of cases and the time within which most participants could achieve specific interpretation performance benchmarks


Pediatric emergency medicine (PEM) physicians and fellows

Participants came from 28/50 US states and 5/10 Canadian provinces.

Recruited on volunteer basis from an email invitation sent to PEM division heads, PEM fellowship program directors, and the P2 network.

Additionally, 5 PEM POCUS experts were recruited (separate from study team). Experts had completed PEM POCUS fellowship and performed at least 1,500 bedside scans.


Multi-center prospective cohort study with a convenience sample of PEM physicians in the US and Canada from September to November 2018.

PEM physicians learned POCUS using a computer-based image repository and learning assessment system that allowed for deliberate practice of image interpretation.

Participants completed at least one application (skin/soft tissue, lung, FAST, cardiac) over a 4-week period.

The application allowed learners to view an image or clip and choose “normal” or “abnormal”. If abnormal, they had to mark the area on the image that was abnormal. Each of the four applications had 100 cases, of which 50% were normal.

Primary outcomes were benchmarks of 80%, 85%, 90%, and 95% accuracy. These were determined based on a survey of 150 P2 network members. The authors also compared the average learner (50%ile) with the least efficient learners (95%ile) among the nonexperts.

Sample size: From previous work, the educationally important difference between initial and final scores was approximately 10%, with a proportion of discordant pairs of 12%. The power analysis gave a minimum sample size of 95 PEM POCUS nonexpert participants per application.


N = 177

  • 65 fellows

  • 107 attendings

  • 5 experts

Not surprisingly, significantly more fellows than attendings had received POCUS training as part of their fellowship experience (82% vs. 30.8%).

At least one application was completed by 128 (74.4%) participants: 88 completed all four applications (68.8% of those who did at least one application/49.1% of total participants), 11 completed three applications (8.6%), seven completed two applications (5.5%), and 22 completed one application (17.2%).
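The completion percentages can be re-derived from the counts given in this post. Here is a minimal sketch in Python, assuming the 172 nonexpert enrollees (65 fellows + 107 attendings) are the denominator for completion and the 128 completers are the denominator for the per-application breakdown. The variable names are illustrative.

```python
# Counts taken from the post; variable names are illustrative.
fellows, attendings = 65, 107
nonexperts = fellows + attendings          # 172 enrolled nonexperts
completers = 128                           # finished at least one application

# Number of completers by how many applications they finished.
by_apps_completed = {4: 88, 3: 11, 2: 7, 1: 22}

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# The four subgroups should account for every completer.
assert sum(by_apps_completed.values()) == completers

completion_rate = pct(completers, nonexperts)
dropout_rate = pct(nonexperts - completers, nonexperts)
breakdown = {k: pct(v, completers) for k, v in by_apps_completed.items()}

print(completion_rate, dropout_rate, breakdown)
```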

Primary Outcome - Median number of cases to performance benchmarks:

For the average learner (50%ile) - ranges across the four applications

80% accuracy - ranged from 0-45

85% accuracy - ranged from 25-97

90% accuracy - ranged from 66-175

95% accuracy - ranged from 141-290

For the least efficient learners (95%ile)

80% accuracy - ranged from 60-288

85% accuracy - ranged from 109-456

90% accuracy - ranged from 160-666

95% accuracy - ranged from 243-1,040
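To make the "average" versus "least efficient" comparison concrete, here is a minimal sketch of how the 50th and 95th percentiles of cases-to-benchmark could be read off a group of learners. The case counts are made up for illustration and are NOT study data. The post does not describe how the authors modeled their learning curves, so this shows only the percentile arithmetic.

```python
import statistics

# Hypothetical cases-to-benchmark for 20 learners (NOT study data).
cases_needed = [30, 35, 40, 42, 45, 48, 50, 55, 58, 60,
                62, 65, 70, 75, 80, 90, 110, 150, 220, 400]

# With n=100, statistics.quantiles returns the 1st..99th percentile cut points.
cuts = statistics.quantiles(cases_needed, n=100, method="inclusive")
p50, p95 = cuts[49], cuts[94]

mean_cases = statistics.mean(cases_needed)

print(f"median learner: {p50:.0f} cases")
print(f"95th percentile learner: {p95:.0f} cases")
print(f"mean: {mean_cases:.1f} cases")
```

In a right-skewed sample like this one, the mean sits above the median and the 95th percentile learner needs several times as many cases as the median learner, which is the pattern the two lists above describe.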

Other Findings

Performance outcomes:

Cohen’s d effect sizes for each application were moderate to large and ranged from 0.6-1.0. Large effect size for soft tissue and lung, moderate for FAST and cardiac.
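For readers unfamiliar with the metric, Cohen's d expresses a difference in means in units of standard deviation. Below is a minimal sketch using the pooled standard deviation form. The post does not say which variant the authors used, and the initial and final accuracy values are invented for illustration. By the common rule of thumb, a d of about 0.5 is read as moderate and 0.8 or more as large.

```python
import statistics

def cohens_d(before, after):
    """Cohen's d with a pooled standard deviation (two groups)."""
    n1, n2 = len(before), len(after)
    var1 = statistics.variance(before)   # sample variance (n - 1 denominator)
    var2 = statistics.variance(after)
    pooled_sd = (((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(after) - statistics.mean(before)) / pooled_sd

# Hypothetical accuracy (%) at the start and end of practice; NOT study data.
initial = [70, 75, 80, 85, 90]
final = [78, 83, 88, 93, 98]

d = cohens_d(initial, final)
print(round(d, 2))
```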

NO differences in fellow vs. attending final accuracy, sensitivity, or specificity scores for soft tissue, lung, or cardiac applications. For FAST however, attendings had higher final accuracy (+6.0% difference; 95% CI = 1.8 to 10.2) and sensitivity (+7.5% difference; 95% CI = 0.6 to 14.4).

There was NO association between PEM POCUS physician learners achieving expert performance and any of the baseline variables (# of scans performed, initial accuracy, years in practice, children’s hospital, POCUS training in fellowship).

PEM POCUS expert mean accuracy scores (95% CI) were soft tissue 96% (92.3%-99.7%), lung 96% (93.9%-98.1%), cardiac 90% (81.8%-98.2%), and FAST 93% (88%-98%). Expert final scores were significantly higher than nonexpert participants' scores, with an accuracy difference of 7.3% (4.4%-10.4%).

The cardiac application had lower accuracy than the others. For the soft tissue application, there were greater learning gains for normal cases (specificity = +18.9%) vs. abnormal cases (sensitivity = +4.9%), with a difference of 14% (95% CI 9.8%-18.2%). No differences in learning gains for abnormal vs. normal in other applications.
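Because half of each 100-case application was normal, accuracy works out to the average of sensitivity (abnormal cases called abnormal) and specificity (normal cases called normal). The sketch below shows that relationship with invented counts, not study data, and ignores the step where learners mark the abnormal area. It also re-derives the soft tissue gain difference from the two figures reported above.

```python
def interpretation_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # abnormal cases correctly called abnormal
    specificity = tn / (tn + fp)   # normal cases correctly called normal
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical learner on a 100-case application (50 abnormal, 50 normal).
sens, spec, acc = interpretation_metrics(tp=45, fn=5, tn=40, fp=10)
print(sens, spec, acc)

# Soft tissue learning gains reported in the post (percentage points).
specificity_gain, sensitivity_gain = 18.9, 4.9
gain_difference = round(specificity_gain - sensitivity_gain, 1)
print(gain_difference)
```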

Time Outcomes

Mean time per case was 31.7 seconds over the first 25 cases, which decreased to 21.1 seconds over the final 25 cases.

The median time (IQR) in minutes it took to complete each 100-case application was as follows:

  • Soft tissue (still) 29.5 (20.9-40.7)

  • Lung (video + still) 52.3 (39.7-72.7)

  • Cardiac (video + still) 52 (39.7-72.7)

  • EFAST (video + still) 63.6 (49.1-76.6)

Missing data: There were no differences in demographics of the 44/172 (25.6%) who dropped out vs. the 128 who completed at least one 100-case application.


  1. Groundbreaking work on learning curves specific to pediatric POCUS applications

  2. Robust study design and statistical analysis

  3. Multi-center, international, prospective


Image interpretations were based on the opinions of 3 POCUS experts, and although interobserver agreement between these opinions was high, they are still subject to error.

Participants were able to select their applications, and about ¼ of enrolled participants did not complete the minimum study intervention, so there could be bias from increased engagement with the selected applications and/or from more POCUS-motivated physicians.

Since this is a part-task education intervention removed from the bedside and uses a higher proportion of pathologic cases than is typically experienced, it is uncertain how this tool will translate to patient-level skill and outcomes (ex. Maybe it is easier to point out pathology than to call something normal). Final accuracies (which were high) do not necessarily translate into real life clinical accuracy.

Some participants needed more than 100 cases to reach the benchmark but since we have the most information about those at the median, there might be spectrum bias.

This was a study of only PEM physician participants and therefore may not be generalizable to other populations.

No testing of knowledge retention.


Case numbers required to reach the performance benchmarks ranged considerably for both the average and least efficient learners. Most learners needed only about 2-3 hours to achieve the highest performance benchmark accuracy for 95% of all applications, except cardiac which required 3-10 hours. As there continue to be issues at organizations regarding POCUS credentialing standards, this study can inform organizational discussions on the variable journey learners take to reach a given performance benchmark. The number of cases required to achieve a performance benchmark was not normally distributed; the distributions had skewed tails, indicating that a minority of learners required considerably MORE cases to attain each benchmark. This has policy implications for education guidelines based on the “average” learner: the study data suggest shifting away from a set number of scans to determine POCUS proficiency in favor of learners achieving a performance-based competency benchmark, which is supported in other literature as well. PEM POCUS is later to the game in North America than adult POCUS; maybe we can learn from their mistakes and not just set the number at 25 and move on…?

Breaking down POCUS education into part-task activities can reduce cognitive load during actual face-to-face bedside POCUS teaching ➞ this may result in more efficient and effective learning. It might be easier to learn POCUS interpretation at the bedside (ex. Sonographic McBurney’s point), but it is also a more challenging way as the learner has to simultaneously acquire and interpret images to some extent. Also, the low volume of pathology in pediatric patients might hinder the learner’s ability to achieve practice-ready standards despite having the time/faculty for whole-task learning at the bedside ➞ you need extra image review anyway for practice in peds, so this is a good way to do it, particularly for applications that are more difficult to learn such as cardiac and NORMAL soft tissue cases.

This study demonstrated a large effect size for soft tissue and lung and a moderate effect size for cardiac and FAST. This may be due to cardiac and FAST being more difficult to interpret (more views, more anatomic structures, more anatomic complexity, more physiologic variation, more variation of pathology when present, low rates of pathology compared to lung and skin) ➞ this can inform educational strategies that may enhance diagnostic performance (ex. Scaffolding, more deliberate review of incorrect cases, repeating more).

NONE of the participant baseline variables that they tested predicted achieving expert level performance. Some explanations:

  1. Number of scans might make a difference, but not at the threshold of greater than or less than 100.

  2. POCUS training during PEM fellowship vs. no training might not have made a difference since many of the study's attending-level participants also engaged in institutional and other POCUS workshops/courses ➞ i.e. a lot were from SickKids, Ottawa, and similar institutions, so it is unclear whether this is generalizable to the average PEM program (at least in the US).

  3. Working at an academic children’s hospital - almost everyone who participated worked at an academic children’s hospital so likely underpowered to examine the impact of practice setting.

  4. Since scores are weighted for the 25 most recent cases and case interpretation difficulty varied over the 100-case experience, INITIAL accuracy scores may NOT have predicted subsequent/final scores.
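One way to picture a score weighted toward the 25 most recent cases is a moving-window accuracy. The sketch below assumes a simple unweighted window over the last 25 responses. The platform's actual scoring formula is not described in this post, and the response sequence is invented.

```python
from collections import deque

def rolling_accuracy(responses, window=25):
    """Yield accuracy over the most recent `window` responses after each case."""
    recent = deque(maxlen=window)      # oldest response drops off automatically
    for correct in responses:
        recent.append(1 if correct else 0)
        yield sum(recent) / len(recent)

# Hypothetical learner: 60% correct on the first 25 cases, 100% on the next 25.
responses = ([True, True, True, False, False] * 5) + [True] * 25
scores = list(rolling_accuracy(responses))

print(scores[24])   # accuracy over cases 1-25
print(scores[49])   # accuracy over cases 26-50 only
```

Under this assumption the score after case 50 carries no trace of the first 25 responses, which fits the idea that initial accuracy need not predict the final score.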


Prospective cohort of 177 Canadian and American PEM fellows and attendings who participated in an online image interpretation learning system. The outcomes focused on the amount of training required to meet accuracy benchmarks across different learners and POCUS applications.

Our score

4 Probes

Expert Reviewer for this Post


Meghan Herbst, MD @EUSmkh

Meghan directs the clinical curriculum for UConn School of Medicine and the ultrasound curriculum for the UConn Emergency Medicine Residency

Reviewer's Comments

Great review of the article. I think viewing cardiac anatomy (as well as the anatomy of FAST views) is much more complicated than viewing the lung and soft tissue. There are more structures to identify, different locations for fluid to be present in the setting of a pericardial effusion, and LV function is complicated to calculate on 2-dimensional imaging and takes viewing experience to have a gestalt sense. It would have been interesting to see the breakdown of accuracy among PEM physicians for normal images/clips vs abnormal images/clips, given their infrequent exposure to pathology.

Cite this post as

Delia Gold. Pediatric POCUS Interpretation - Variability in Learning. Ultrasound G.E.L. Podcast Blog. Published on February 15, 2021. Accessed on March 03, 2021. Available at