Comprehensive Summary
Kaufman et al. analyzed how effectively ChatGPT-3.5 generates clinically relevant research concepts in the field of non-operative spine medicine. ChatGPT is increasingly used across many fields and specialties in medicine, but its specific application to creating studies and research ideas in spine medicine has yet to be fully explored. To perform the study, Kaufman et al. sent a survey to academic spine physiatrists from six different U.S. institutions; the 13 participants who completed the entire survey were asked to rate ChatGPT-3.5's research ideas. For context, the survey was created using Research Electronic Data Capture (REDCap). ChatGPT-3.5 was asked to design five retrospective, five prospective, and five review article research ideas specifically tailored towards non-operative spine care. The fifteen resulting ideas were presented to the participants anonymously and in no particular order; the participants rated each idea on a scale of 1 to 5 for novelty, feasibility, clinical relevance, and overall interest. The results were analyzed using Kruskal-Wallis tests with post-hoc Dunn's tests with Holm-adjusted p values, along with Fisher's exact test. All of the analyses were performed in RStudio with a significance level of 0.05. The results of the study showed that the median rating for the 15 ideas was 3 out of 5. None of the ideas scored higher than 4, which suggests that respondents viewed them with only a moderate degree of support. 69% of the respondents reported that their perception of the ideas did not change after it was revealed that AI had created them.
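To make the statistical approach described above more concrete, the following is a minimal Python sketch of a Kruskal-Wallis test across three groups, followed by a Holm adjustment of a set of p values. It is an illustration under stated assumptions, not the authors' analysis: the study was run in RStudio, the ratings and raw p values below are hypothetical and not taken from the paper, and the post-hoc Dunn's tests and Fisher's exact test are not implemented here. The p value uses the closed-form chi-square tail for 2 degrees of freedom, so it applies only when exactly three groups are compared.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df) for the Kruskal-Wallis test, with the tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


def chi2_sf_df2(x):
    """Upper-tail chi-square probability, valid only for 2 degrees of freedom."""
    return math.exp(-x / 2)


def holm_adjust(p_values):
    """Return Holm step-down adjusted p values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical 1-to-5 ratings for three study types (NOT data from the paper).
retrospective = [3, 4, 3, 2, 3]
prospective = [4, 4, 3, 5, 4]
review = [2, 3, 3, 2, 3]

h_stat, df = kruskal_wallis([retrospective, prospective, review])
p_value = chi2_sf_df2(h_stat)  # df is 2 because three groups are compared
print(f"H = {h_stat:.3f}, df = {df}, p = {p_value:.4f}")

# Hypothetical raw p values from three pairwise comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A rank-based test such as Kruskal-Wallis suits this kind of data because 1-to-5 ratings are ordinal and heavily tied, and the Holm step-down procedure controls the family-wise error rate across the pairwise comparisons while being less conservative than a plain Bonferroni correction.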
Outcomes and Implications
As noted above, 69% of the respondents reported that their perception of the research ideas did not change after it was revealed that AI had created them. Interestingly, this runs counter to what might be expected of physician researchers, who could be assumed to judge ideas differently once they learn that an AI produced them. It is also important to note that a few respondents mentioned that AI could have real value in early research brainstorming. ChatGPT is increasingly used across medical fields, so it matters whether the model can generate useful research ideas, particularly for brainstorming. While ChatGPT can produce intriguing study ideas with some clinical relevance and feasibility in non-operative spine medicine, its ability to produce highly innovative ideas is still limited. Human knowledge and skill in forming study ideas remain essential for clinical research, but ChatGPT may still be helpful as a supplementary tool in this regard.