Trial Outcomes & Findings for Preventing Medication Dispensing Errors in Pharmacy Practice With Interpretable Machine Intelligence (NCT06245044)
NCT ID: NCT06245044
Last Updated: 2025-11-26
Results Overview
Difference in task time measured by the number of seconds from starting the task to accepting or rejecting a medication image
COMPLETED
NA
68 participants
Throughout the verification task
2025-11-26
Participant Flow
Participant milestones
Pharmacists: Licensed pharmacists were recruited to participate in a mock verification task. In this crossover design trial, each participant received all three study interventions. The order of the arms was randomized.
| Measure | Pharmacists |
|---|---|
| No MI help: STARTED | 68 |
| No MI help: COMPLETED | 50 |
| No MI help: NOT COMPLETED | 18 |
| Scenario 1: STARTED | 50 |
| Scenario 1: COMPLETED | 50 |
| Scenario 1: NOT COMPLETED | 0 |
| Scenario 2: STARTED | 50 |
| Scenario 2: COMPLETED | 50 |
| Scenario 2: NOT COMPLETED | 0 |
Reasons for withdrawal
Pharmacists: Licensed pharmacists were recruited to participate in a mock verification task. In this crossover design trial, each participant received all three study interventions. The order of the arms was randomized.
| Measure | Pharmacists |
|---|---|
| No MI help: Technical issues | 18 |
Baseline Characteristics
Two participants did not indicate their sex on the demographics survey.
Baseline characteristics by cohort
Pharmacists (n=50 Participants): Licensed pharmacists in the United States with medication dispensing experience who are 18 or older.
| Measure | Pharmacists |
|---|---|
| Age, Categorical: <=18 years | 0 Participants (n=50) |
| Age, Categorical: Between 18 and 65 years | 50 Participants (n=50) |
| Age, Categorical: >=65 years | 0 Participants (n=50) |
| Age, Continuous | 35.52 years, Standard Deviation 6.949 (n=50) |
| Sex (Female, Male): Female | 34 Participants (n=48) |
| Sex (Female, Male): Male | 14 Participants (n=48) |
| Ethnicity (NIH/OMB): Hispanic or Latino | 2 Participants (n=50) |
| Ethnicity (NIH/OMB): Not Hispanic or Latino | 46 Participants (n=50) |
| Ethnicity (NIH/OMB): Unknown or Not Reported | 2 Participants (n=50) |
| Race (NIH/OMB): American Indian or Alaska Native | 0 Participants (n=50) |
| Race (NIH/OMB): Asian | 6 Participants (n=50) |
| Race (NIH/OMB): Native Hawaiian or Other Pacific Islander | 0 Participants (n=50) |
| Race (NIH/OMB): Black or African American | 1 Participants (n=50) |
| Race (NIH/OMB): White | 38 Participants (n=50) |
| Race (NIH/OMB): More than one race | 2 Participants (n=50) |
| Race (NIH/OMB): Unknown or Not Reported | 3 Participants (n=50) |
| Region of Enrollment: United States | 50 participants (n=50) |
PRIMARY outcome
Timeframe: Throughout the verification task
Difference in task time measured by the number of seconds from starting the task to accepting or rejecting a medication image
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure | Scenario #1 (n=50 Participants) | Scenario #2 (n=50 Participants) | No MI Help (n=50 Participants) |
|---|---|---|---|
| Reaction Time, millisecond (ms) | 4727 (Standard Deviation 1040) | 4510 (Standard Deviation 1339) | 3668 (Standard Deviation 924) |
PRIMARY outcome
Timeframe: Throughout the verification task
Difference in detection rate measured by the number of medication verification errors across all participants in the Arm/Group.
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure | Scenario #1 (n=50 Participants) | Scenario #2 (n=50 Participants) | No MI Help (n=50 Participants) |
|---|---|---|---|
| Decision Accuracy, Number of errors | 238 | 230 | 291 |
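The error counts above are totals per arm. As a rough illustration only, the sketch below converts them to per-trial error rates. It assumes that all 50 participants completed all 100 trials in each arm (the record states 100 trials per arm and n=50, but does not report the trial-level denominator), so the function name and the denominator are assumptions rather than reported results.

```python
# Hypothetical helper: per-trial error rate from the arm-level error counts.
# ASSUMPTION: 50 participants x 100 trials per arm were all completed and scored.
PARTICIPANTS = 50
TRIALS_PER_ARM = 100

# Error counts as reported in the outcome table.
errors_by_arm = {"Scenario #1": 238, "Scenario #2": 230, "No MI Help": 291}


def error_rate(n_errors, participants=PARTICIPANTS, trials=TRIALS_PER_ARM):
    """Return errors divided by the assumed number of scored trials."""
    return n_errors / (participants * trials)


if __name__ == "__main__":
    for arm, n_errors in errors_by_arm.items():
        print(f"{arm}: {error_rate(n_errors):.2%}")
```

Under that assumption the rates come out at roughly 4.76%, 4.60%, and 5.82% respectively. If some trials were excluded from scoring, the true denominators and rates would differ.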
PRIMARY outcome
Timeframe: After every trial in Scenarios 1 and 2
Population: This Outcome Measure was pre-specified to be only assessed for Scenarios 1 and 2. No data were collected for this Outcome Measure for the "No MI Help" scenario.
Participants will complete 100 mock medication verification trials in each of the study arms (i.e., Scenario 1, Scenario 2, and No Help). After each trial in Scenario 1 and Scenario 2, participants will use a visual analog scale (VAS) to respond to the question: "How much do you trust the AI advice?" The endpoints of the 100-point VAS are 'Not at all' to 'Completely trust'. Participants indicate their level of trust in the MI advice after every trial on a scale from 1-100, with higher scores indicating greater levels of trust. The trust change, as measured by the visual analog scale, will be calculated using the following formula: Trust change (i) = Trust(i) - Trust(i - 1), where i=2, 3, ..., 100. To compute a single, summarized value for the Trust Change variable within a specific scenario, the individual Trust Change scores measured from the trials are averaged. This averaging method provides a comprehensive measure of how trust shifted across the duration of the scenario.
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure | Scenario #1 (n=50 Participants) | Scenario #2 (n=50 Participants) | No MI Help |
|---|---|---|---|
| Trust Change, units on a scale | -0.1715 (Standard Deviation 21.1415) | -0.0552 (Standard Deviation 19.3931) | Not collected |
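The Trust Change measure described above, Trust change (i) = Trust(i) - Trust(i - 1) averaged over the trials of a scenario, can be sketched as follows. This is a minimal illustration for a single participant in a single scenario. The function name and the example ratings are hypothetical, and the record does not say how per-participant values were pooled.

```python
def mean_trust_change(trust_ratings):
    """Average the trial-to-trial differences Trust(i) - Trust(i - 1).

    trust_ratings: VAS trust ratings in trial order (hypothetical input).
    """
    if len(trust_ratings) < 2:
        raise ValueError("at least two ratings are needed to compute a change")
    # One difference per consecutive pair of trials: i = 2, 3, ..., n.
    changes = [curr - prev for prev, curr in zip(trust_ratings, trust_ratings[1:])]
    return sum(changes) / len(changes)


if __name__ == "__main__":
    example = [50, 60, 55, 70]  # made-up ratings, not study data
    print(mean_trust_change(example))
```

Because consecutive differences telescope, this average equals (last rating - first rating) / (number of ratings - 1). The summary therefore reflects only the net shift in trust between the first and last trial, not the fluctuations in between.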
PRIMARY outcome
Timeframe: Post-intervention in Scenarios 1 and 2.
Population: This Outcome Measure was pre-specified to be only assessed for Scenarios 1 and 2. No data were collected for this Outcome Measure for the "No MI Help" scenario.
Trust will be assessed using the Muir & Moray (1996) Trust in Automation scale. Scores range from 0 to 100 with higher scores indicating greater levels of trust.
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure | Scenario #1 (n=50 Participants) | Scenario #2 (n=50 Participants) | No MI Help |
|---|---|---|---|
| Trust, scores on a scale | 53.65 (Standard Deviation 33.12) | 60.37 (Standard Deviation 31.95) | Not collected |
SECONDARY outcome
Timeframe: Throughout the verification task
Population: Complete eye tracking data was not available for analysis for one participant. The MI plot was pre-specified to be only assessed for Scenarios 1 and 2. No data were collected for the Outcome Measure MI plot for the "No MI Help" scenario.
Participants' eye movements were tracked using a browser-based online eye tracking system. The outcome measure is the difference in cognitive effort as measured by fixation count in the defined areas of interest: fill image, reference image, or MI plot. Higher fixation rates indicate repeated interest in a certain area.
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure (Number of fixations) | Scenario #1 (n=49 Participants) | Scenario #2 (n=49 Participants) | No MI Help (n=49 Participants) |
|---|---|---|---|
| Cognitive Effort, Area of Interest: Fill Image | 3 (Interval 2.0 to 6.0) | 3 (Interval 2.0 to 6.0) | 3 (Interval 2.0 to 5.0) |
| Cognitive Effort, Area of Interest: Reference Image | 2 (Interval 1.0 to 3.0) | 2 (Interval 1.0 to 3.0) | 2 (Interval 1.0 to 3.0) |
| Cognitive Effort, Area of Interest: MI plot | 1 (Interval 1.0 to 2.0) | 2 (Interval 1.0 to 2.0) | Not collected |
SECONDARY outcome
Timeframe: Throughout the verification task
Population: Complete eye tracking data was not available for one participant. The MI plot was pre-specified to be only assessed for Scenarios 1 and 2. No data were collected for the Outcome Measure MI plot for the "No MI Help" scenario.
Participants' eye movements were tracked using a browser-based online eye tracking system. The outcome measure is the difference in cognitive effort as measured by the duration of fixations in the defined areas of interest: fill image, reference image, or MI plot. Longer fixation duration indicates a higher cognitive load.
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure (millisecond, ms) | Scenario #1 (n=49 Participants) | Scenario #2 (n=49 Participants) | No MI Help (n=49 Participants) |
|---|---|---|---|
| Cognitive Effort, Area of Interest: Fill Image | 687.5 (Interval 407.75 to 1253.5) | 706.0 (Interval 416.0 to 1250.0) | 619.5 (Interval 367.0 to 1088.75) |
| Cognitive Effort, Area of Interest: Reference Image | 399.0 (Interval 235.0 to 591.0) | 419.0 (Interval 239.0 to 623.5) | 365.0 (Interval 229.0 to 531.009) |
| Cognitive Effort, Area of Interest: MI Plot | 268.0 (Interval 210.75 to 443.0) | 299.0 (Interval 212.0 to 499.0) | Not collected |
SECONDARY outcome
Timeframe: After completing 100 mock verification trials in each arm
Participants will complete 100 mock medication verification trials in each of the 3 arms. The workload of each arm will be measured by the NASA Task Load Index (TLX). The 5 TLX dimensions assessed are: mental demand, effort, temporal demand, performance, and frustration. For each dimension, participants will indicate their response to a single question. For 4 of the dimensions, the endpoints of the Likert scale are 'very low' and 'very high'. The performance dimension is reverse-scored, and the endpoints are 'perfect' and 'failure'. Participants then complete 10 pairwise comparisons of the dimensions by indicating which dimension they consider to be a more important factor (e.g., effort vs frustration). Each category score multiplied by its respective pairwise comparison count is summed and divided by 10 to get an overall weighted workload score. The result is an overall workload score between 1 and 20, with higher scores indicating higher workload.
Outcome measures
| Measure |
Scenario #1
n=50 Participants
MI help will be presented in the form of a pop-up message the participant's decision differs from the MI's determination.
|
Scenario #2
n=50 Participants
MI help will be displayed concurrently with the filled and reference images.
|
No MI Help
n=50 Participants
Participants will complete the medication verification task without any MI help
|
|---|---|---|---|
|
Workload
|
7.5 score on a scale
Standard Deviation 3.8
|
6.9 score on a scale
Standard Deviation 3.4
|
7.4 score on a scale
Standard Deviation 3.7
|
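The weighted workload score described for this outcome (each dimension's rating multiplied by the number of pairwise comparisons it won, summed, then divided by 10) can be sketched as below. This is an illustration under assumptions: ratings are taken to be on a 1 to 20 scale already oriented so that higher means more workload (that is, performance is already reverse-scored), and the function name and example values are hypothetical.

```python
DIMENSIONS = ("mental demand", "effort", "temporal demand", "performance", "frustration")
N_COMPARISONS = 10  # 5 dimensions give 5 * 4 / 2 = 10 unique pairs


def weighted_tlx(ratings, weights):
    """Return sum(rating * weight) / 10 over the five dimensions.

    ratings: dimension -> rating (assumed 1-20, higher = more workload).
    weights: dimension -> number of pairwise comparisons that dimension won.
    """
    if set(ratings) != set(DIMENSIONS) or set(weights) != set(DIMENSIONS):
        raise ValueError("ratings and weights must cover exactly the five dimensions")
    if sum(weights.values()) != N_COMPARISONS:
        raise ValueError("pairwise comparison counts must sum to 10")
    # Each dimension appears in 4 of the 10 pairs, so it can win at most 4.
    if any(not 0 <= w <= len(DIMENSIONS) - 1 for w in weights.values()):
        raise ValueError("each dimension can win between 0 and 4 comparisons")
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / N_COMPARISONS


if __name__ == "__main__":
    # Made-up ratings and comparison counts, not study data.
    ratings = {"mental demand": 10, "effort": 8, "temporal demand": 6,
               "performance": 4, "frustration": 2}
    weights = {"mental demand": 4, "effort": 3, "temporal demand": 2,
               "performance": 1, "frustration": 0}
    print(weighted_tlx(ratings, weights))  # (40 + 24 + 12 + 4 + 0) / 10 = 8.0
```

Because the weights sum to 10, the result is a weighted average of the five ratings and stays within the rating range, consistent with the stated 1 to 20 bounds.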
SECONDARY outcome
Timeframe: After completing 100 mock verification trials in each arm
Participants will complete 100 mock medication verification trials in each of the 3 arms (No MI Help, Scenario 1, and Scenario 2). After completing 100 trials, participants will assess the mock verification interface using the System Usability Scale (SUS). The SUS is comprised of 10 statements that participants indicate their agreement with using a 5-point Likert scale ranging from strongly agree to strongly disagree. Odd-numbered questions have a positive response and even-numbered questions are reverse-scored. Scores are summed and multiplied by 2.5 to get a final SUS score. SUS scores range from 0 to 100 with higher scores indicating greater usability. An average SUS score is considered to be 68. Anything below 50 is "Not Acceptable". Scores between 51-70 are considered "Marginal", those above 71 are considered "Acceptable", and those at 80 or above are indicative of high usability.
Outcome measures
Scenario #1: MI help will be presented in the form of a pop-up message when the participant's decision differs from the MI's determination.
Scenario #2: MI help will be displayed concurrently with the filled and reference images.
No MI Help: Participants will complete the medication verification task without any MI help.
| Measure | Scenario #1 (n=50 Participants) | Scenario #2 (n=50 Participants) | No MI Help (n=50 Participants) |
|---|---|---|---|
| Usability, score on a scale | 70.4 (Standard Deviation 16.4) | 73.0 (Standard Deviation 15.6) | 72.1 (Standard Deviation 16.5) |
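The SUS scoring summarized for this outcome can be sketched with the conventional SUS procedure: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5. The sketch assumes responses are coded 1 (strongly disagree) to 5 (strongly agree). The record does not state the numeric coding used in the study, so that coding and the function name are assumptions.

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten 1-5 responses (item 1 first).

    ASSUMPTION: 1 = strongly disagree, 5 = strongly agree.
    """
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        if item_number % 2 == 1:
            total += response - 1   # odd items: positively worded
        else:
            total += 5 - response   # even items: reverse-scored
    return total * 2.5


if __name__ == "__main__":
    print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # best possible: 100.0
    print(sus_score([3] * 10))                        # all neutral: 50.0
```

Each item contributes 0 to 4 points, so the raw sum ranges from 0 to 40 and the factor of 2.5 rescales it to the 0 to 100 range quoted above.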
Adverse Events
No MI Help
Scenario #1
Scenario #2
Serious adverse events
Adverse event data not reported
Other adverse events
Adverse event data not reported
Additional Information
Results disclosure agreements
- Principal investigator is a sponsor employee
- Publication restrictions are in place