Using ChatGPT in academic writing

The incorporation of AI into academic writing, particularly in medical research, is a topic marked by considerably more controversy than the topics discussed in the previous sections. Ever since the release of GPT-3 in 2020, its text-generating ability has ignited debate within academia (67). Professional editing services enhance clarity and minimize grammatical errors in scientific manuscripts, which can improve acceptance rates (68). While acknowledgements often credit editorial assistance, the use of spell-checking software is rarely disclosed. AI-powered writing assistants now integrate advanced LLM capabilities to provide nuanced suggestions on tone and context (45), blurring the line between original and AI-generated content. Generative AI such as ChatGPT extends this further by proposing titles, structuring papers, crafting abstracts, and summarizing research, raising questions about AI’s role in authorship under the International Committee of Medical Journal Editors’ guidelines (69) (Supplementary material).

Notably, while traditional scientific journals remain cautious about AI, NEJM AI stands out for advocating LLM use (70). Even so, these journals refrain from recognizing ChatGPT as a co-author because of accountability concerns over accuracy and ethical integrity (70–72). The academic community also remains wary of ChatGPT’s potential to overshadow faculty contributions (73).

Several veterinary journals have updated their guidelines in response to the emergence of generative AI. Among the top 20 veterinary medicine journals as ranked by Google Scholar (74), 14 give instructions on generative AI usage (Supplementary material). They unanimously advise against listing AI as a co-author and mandate disclosure of AI involvement in the Methods, Acknowledgments, or other designated sections. These requirements typically do not extend to basic grammar and editing tools (Supplementary material). AI could enhance writing efficiency and potentially alleviate disparities in productivity; broader acceptance of AI in academia might therefore benefit less skilful writers and foster a more inclusive scholarly community (40).

The detectability of AI-generated content, and the associated risk of erroneous academic judgments, have become significant concerns. One such misjudgment led an ecologist at Cornell University to face publication rejection after a reviewer falsely deemed her work “obviously ChatGPT” (75). Yet a study revealed that reviewers could identify only 68% of ChatGPT-produced scientific abstracts, while mistakenly tagging 14% of original works as AI-generated (76). In a veterinary study, veterinary neurologists had only a 31–54% success rate in distinguishing AI-crafted abstracts from authentic works (30). To counteract this, a ‘ChatGPT detector’ has been suggested. One ML tool uses distinguishing features such as paragraph complexity, sentence-length variability, punctuation marks, and popular wordings, achieving over 99% effectiveness in identifying AI-authored texts (77). A subsequently refined model can further distinguish human writing from GPT-3.5 and GPT-4 writing in chemistry journals with 99% accuracy (78). While these tools are not publicly accessible, OpenAI is developing a classifier to flag AI-generated text (79), emphasizing the importance of academic integrity and responsible AI use.
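The published detectors (77, 78) are not openly available, but the general approach they describe can be illustrated. The following is a minimal sketch in Python of a stylometric classifier in the same spirit: it converts each paragraph into simple hand-crafted features (sentence-length variability, punctuation rate, marker-word frequency) and fits a logistic regression on labelled human and AI texts. The feature set and marker words are illustrative assumptions, not the published feature set.

    # Hypothetical stylometric detector in the spirit of refs (77, 78);
    # the features and marker words below are illustrative assumptions.
    import re
    import statistics
    from sklearn.linear_model import LogisticRegression

    MARKER_WORDS = {"delve", "moreover", "furthermore", "notably"}  # assumed examples

    def features(text: str) -> list[float]:
        sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
        lengths = [len(s.split()) for s in sentences]
        words = text.lower().split()
        return [
            float(len(sentences)),                           # crude paragraph-complexity proxy
            statistics.pstdev(lengths) if lengths else 0.0,  # sentence-length variability
            text.count(",") / max(len(words), 1),            # punctuation rate
            sum(w.strip(".,;:") in MARKER_WORDS for w in words)
            / max(len(words), 1),                            # "popular wording" rate
        ]

    # texts: list of paragraphs; labels: 1 = AI-generated, 0 = human-written
    def train_detector(texts: list[str], labels: list[int]) -> LogisticRegression:
        model = LogisticRegression(max_iter=1000)
        model.fit([features(t) for t in texts], labels)
        return model

A trained model of this kind then scores new paragraphs via model.predict_proba; its accuracy depends entirely on its training corpus, which is one reason the 99% figures above were reported on specific journal collections.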
ChatGPT’s limitations and ethical issues

Hallucination and inaccuracy

Hallucination, or artificial hallucination, refers to ChatGPT generating implausible but confidently stated responses, which poses a significant problem (80). ChatGPT is known to fabricate references with incoherent PubMed IDs (PMIDs) (81), a problem somewhat mitigated in GPT-4 (18% error rate) compared with GPT-3.5 (55% error rate) (82). The Supplementary material illustrates an example in which GPT-4 failed to provide accurate references, including valid PMIDs, underscoring its limitations for literature searches (a simple programmatic check of model-supplied PMIDs is sketched at the end of this article).

In the medical field, accuracy is paramount, and ChatGPT’s inaccuracy can have serious consequences for patients. A study evaluating GPT-3.5’s performance in medical decision-making across 17 specialities found that the model largely generated accurate information but could be surprisingly wrong in multiple instances (83). Another study highlighted that while GPT-3.5 (Dec 15 version) can effectively simplify radiology reports for patients, it could produce incorrect interpretations, potentially harming patients (84). The deployment of GPT-4 and GPT-4o, with their updated training data, should bring improvement; nevertheless, these inaccuracies underscore the necessity of using ChatGPT cautiously and in conjunction with professional medical advice.

Intellectual property, cybersecurity, and privacy

As an LLM, ChatGPT is trained on undisclosed but purportedly publicly accessible online data and is continually refined through user interactions (85). This raises concerns about copyright infringement and privacy violations, as evidenced by ongoing lawsuits against OpenAI for allegedly using private or public information without permission (86–88). According to the OpenAI website, user-generated content is routinely collected and used to enhance the service and for research purposes (89), implying that any identifiable patient information entered into a conversation could be at risk. Robust cybersecurity measures are therefore necessary to protect patient privacy and ensure compliance with legal standards in medical settings (90). When analyzing clinical data with an AI chatbot, upload only de-identified datasets. Alternatively, consider a local installation of an open-source, free-for-research-use LLM, such as Llama 3 or Gemma (Google), for enhanced security (91–94) (see the local-model sketch at the end of this article).

US FDA regulation

While the FDA has approved 882 AI- and ML-enabled human medical devices, primarily in radiology (76.1%), followed by cardiology (10.2%) and neurology (3.6%) (95), veterinary medicine lacks specific premarket requirements for AI tools. AI- and ML-enabled veterinary products currently range from dictation and note-taking apps (34, 35), management and communication software (36, 37), and radiology services (31–33) to personalized chemotherapy (96), to name a few. These products may or may not have scientific validation (97–104) and may be used by veterinarians without clients’ consent or full understanding.
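Returning to the fabricated references discussed under ‘Hallucination and inaccuracy’: because every genuine PubMed record has a resolvable PMID, model-supplied citations can be screened programmatically. The sketch below uses NCBI’s public E-utilities esummary endpoint; the title-similarity threshold is an illustrative assumption.

    # Screen a ChatGPT-supplied citation against PubMed via NCBI E-utilities.
    # The 0.8 title-similarity threshold is an illustrative assumption.
    import requests
    from difflib import SequenceMatcher

    ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"

    def verify_citation(pmid: str, claimed_title: str) -> bool:
        resp = requests.get(
            ESUMMARY,
            params={"db": "pubmed", "id": pmid, "retmode": "json"},
            timeout=10,
        )
        resp.raise_for_status()
        record = resp.json().get("result", {}).get(pmid)
        if not record or "error" in record:
            return False  # no such PMID: the reference is likely fabricated
        similarity = SequenceMatcher(
            None, claimed_title.lower(), record.get("title", "").lower()
        ).ratio()
        return similarity > 0.8  # PMID exists but may belong to an unrelated paper

    # usage: verify_citation("<PMID from the model>", "<title from the model>")

A False result flags either a non-existent PMID or one attached to a different paper, the two kinds of fabrication described above.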
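For the privacy-conscious workflow suggested under ‘Intellectual property, cybersecurity, and privacy’, an open-weight model run locally keeps clinical text on your own hardware. The sketch below combines a deliberately crude regex de-identification pass with a locally loaded Llama 3 model via the Hugging Face transformers library. The regex patterns are illustrative assumptions and no substitute for a validated de-identification pipeline, and access to the Llama 3 weights requires accepting Meta’s license on Hugging Face.

    # Minimal sketch: crude de-identification, then a locally run open-weight LLM.
    # Regex patterns and model choice are illustrative assumptions.
    import re
    from transformers import pipeline

    def deidentify(text: str) -> str:
        text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)
        text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)
        text = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", text)
        return text

    # Loads the model onto local hardware; no clinical text leaves the machine.
    generator = pipeline(
        "text-generation",
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        device_map="auto",
    )

    note = deidentify("Seen 03/12/2024; owner reachable at 555-123-4567.")
    prompt = f"Summarize this veterinary record for the owner:\n{note}"
    print(generator(prompt, max_new_tokens=200)[0]["generated_text"])

Gemma can be substituted by changing the model identifier; either way, inference runs offline once the weights have been downloaded.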