Raising University Students’ Critical Awareness of the Linguistic Limitations and the Potential Invalid Knowledge of ChatGPT Responses to Academic Writing Prompts

Document Type : Empirical research

Author

Faculty of Arts, University of Prince Edward Island, Canada

Abstract

As ChatGPT becomes widespread globally, university students utilise its responses to develop their academic writing assignments. Meanwhile, studies have shown linguistic limitations and invalid information in ChatGPT responses. This study aims to raise university students’ critical awareness of ChatGPT limitations in academic writing by using a researcher-designed critical review activity. The study follows a quasi-experimental method that instructs students on how to evaluate ChatGPT academic writing responses. Students were required to practise a critical review activity to evaluate and criticise the linguistic appropriateness and knowledge credibility of ChatGPT responses. The research participants included 120 university students enrolled in an Academic Writing course at the University of Prince Edward Island, Cairo campus. The academic writing course was taught for three months; meanwhile, students practised the designed critical review activity to evaluate the linguistic features and credibility of ChatGPT responses. Pre- and post-critical awareness questionnaires were administered to measure the difference in students’ critical awareness of ChatGPT limitations. The findings showed that participants’ critical awareness in the pre-critical awareness questionnaire was poor, whereas in the post-critical awareness questionnaire the critical awareness of most participants was satisfactory. Therefore, the study confirms that integrating critical review activities into the academic writing syllabus is crucial to raising students’ critical awareness of ChatGPT limitations. The study's findings provide a foundation for creating suitable instructional materials to integrate ChatGPT properly in the teaching of Academic Writing courses.



  1. INTRODUCTION

Academic writing helps students navigate critical thinking. It challenges them to respond to, analyse, synthesise and evaluate information acquired from multiple sources. Academic writing, prominent within higher education, serves as one of the fundamental practices for creating knowledge. It is through writing that students can start to clarify their thoughts and express nuanced ideas effectively. McCarthy (2011) argues that academic writing supports lifelong learning, prepares students for professional activity, and satisfies professional membership requirements. It also improves students' ability to communicate and effectively express complicated ideas, equipping them for academic and professional endeavours (Kennelly, 2017). In addition, García Díaz (2019) reported that academic writing develops students’ ability to produce knowledge, interact with disciplinary cultures, and coordinate theoretical sources with empirical facts, all of which improve their communication skills and overall learning process. Moreover, writing for academic purposes improves students’ communicative, intellectual, and reflective abilities, which aid in a deeper comprehension of the subjects (Burbano, 2023). Academic writing is crucial for success in higher education since it affects tenure, promotions, and hiring (Woloshyn et al., 2024). Successful academic writing requires fundamental skills in every discipline, such as critical analysis, synthesis, and material evaluation (Zhang, 2024). Academic writing is thus deeply integrated into higher education since it fosters critical thinking skills, aids in understanding the academic field, and calls for data analysis, synthesis, and evaluation. It also provides students with insights into how to construct arguments on particular subjects, which is essential for academic and professional success.

In higher education, academic writing is also a major assessment method for measuring student learning. Academic writing functions as a form of examination: it helps the teacher check whether the student has grasped the content of the course and can present ideas in a meaningful way (Baker, 2021). Classroom assignments provide spaces for students to interact with the existing literature and participate in conversations that are already happening. Such activity raises the standard of a student's writing by teaching how to reference existing work correctly, and it prepares students for future employment through practice in analytical thinking and effective communication (Smith & Jones, 2020). Based on the foregoing, one can conclude that assessing students' work through academic writing is of great importance in education, as it helps develop skills they will use in their future workplaces.

OpenAI has played a crucial role in advancing natural language processing (NLP) and machine learning (ML) through numerous innovative initiatives and models. These efforts have improved the ability of AI systems to comprehend and produce human language (Pokhrel & Banjade, 2023). Artificial intelligence (AI) technologies that can produce text for various jobs are developing at an exponential rate, and their widespread use is expected to have profound and far-reaching effects across numerous industries. Recently, academic writing and academic environments have been significantly impacted by artificial intelligence (AI). A considerable body of research has examined the implications, advantages, and difficulties of AI in academic writing, which is becoming an ever larger part of higher education. In addition to writing content on any subject and including pertinent citations, these tools can, for instance, summarise, translate, or paraphrase text. Many students and scholars are beginning to adopt AI technologies such as chatbots as writing helpers because of these affordances, which are helpful for academic writing. Chatbots are computer programs that use artificial intelligence (AI) and natural language processing (NLP) to simulate conversations with humans. One such chatbot is ChatGPT, which was initially built on OpenAI's generative pre-trained transformer models (GPT-3.5).

Although ChatGPT has been praised for its ability to generate text, concerns have been raised about its accuracy and precision in generating data, as well as legal issues related to references (Athaluri et al., 2023). ChatGPT's drawbacks for academic writing are numerous and include problems with dependability, correctness, and ethics. Even though ChatGPT has potential advantages, such as increasing productivity and offering language assistance, important issues need to be resolved before integrating it into academic settings. While many researchers, such as Dergaa et al. (2023), noted that ChatGPT and other related NLP technologies could be useful in improving technical writing and research productivity, the current research advocates further and more comprehensive debate about the tools’ possible applications, threats, and scope, emphasising ethical and scholarly standards and keeping research activities guided by human intellect and evaluation.

In addition, Conde et al. (2024) found that the employment of AI technologies raises several possible problems, from ethical questions to the veracity of the material produced, which may contain inaccurate statements. Also, Athaluri et al. (2023) referred to ChatGPT's possible shortcomings in producing trustworthy references for research ideas. The issue of AI hallucinations has the potential to impair judgment and lead to moral and legal dilemmas. The model's high degree of non-determinism erodes reproducibility and trust by generating different results for the same stimuli (Ouyang et al., 2024). According to Costa et al. (2024), ChatGPT's output may be inaccurate, which could result in inaccurate information in academic writing. Biases in ChatGPT's training data may be reflected in its outputs, which raises questions about accountability and fairness. The model's trustworthiness is further limited by its inability to be explained, which makes it difficult to comprehend how it makes decisions (Patil et al., 2024).

Despite being a major development in natural language processing, ChatGPT has several linguistic, credibility and reliability flaws that reduce its usefulness. Its architecture, training data, and innate biases are the sources of these shortcomings, which may make it difficult to produce precise and contextually relevant responses in academic writing. Recently, many universities have modified their policies, including permission for the use of generative artificial intelligence (GenAI) tools. As noted by Ullah et al. (2024), 41 of the top 50 universities have published guidelines allowing the use of AI under certain conditions. Policies differ greatly: some universities have established extensive frameworks to encourage AI literacy, whereas others provide very limited guidance (Atkinson-Toal & Guo, 2024). Nevertheless, the potential for generative artificial intelligence (GenAI) to write scholarly works on par with human ability raises questions about its potential abuse by students in the classroom. Tang & Chaw (2024) illustrate such abuse: because ChatGPT can generate a full text, students might submit AI-generated writing assignments as their own, jeopardising academic integrity, while Mbwambo & Kaaya (2024) point out that ChatGPT's accessibility and usability raise the possibility that students will become overly dependent on it. Such abuse could stem from students’ lack of awareness of ChatGPT's limitations. Therefore, this study exposes university students to critical thinking and evaluation practices while using ChatGPT in academic writing in order to cultivate their critical awareness of its limitations.


  2. LITERATURE REVIEW

2.1 Students’ Use of ChatGPT in Academic Writing

Since the launch of ChatGPT in November 2022, there has been great interest in using generative AI technologies in higher education. In Im’s (2023) study, students expressed satisfaction with ChatGPT and reported growing confidence in their academic writing due to its use in their academic submissions; however, the study confirms the urgent need for organised instruction that promotes critical thinking and cooperative learning settings. This dual strategy could improve students' language proficiency as well as their capacity for meaningful peer interaction and critical interpersonal skills. Although students value ChatGPT's instant support, reliance on it could seriously impede their ability to think critically and solve problems (Juan et al., 2023). Therefore, Guleria et al. (2023) emphasised the significance of cultivating critical thinking among students and researchers to raise awareness of the associated privacy and ethical risks.

Moreover, Ho (2024) investigated students’ behaviour and attitudes while using ChatGPT in their academic writing assignments. Although Ho acknowledged ChatGPT's effectiveness for vocabulary acquisition, grammar checking and paraphrasing, the study's results revealed that students mostly used ChatGPT to get immediate answers to their writing prompts and structure. He therefore recommends teaching students how to use ChatGPT properly and ethically while adopting a critical mindset towards its responses. Yazid & Dzulfikri (2024) also stated that students often struggle to comprehend the output generated by ChatGPT. This lack of understanding can lead to confusion and misinterpretation of the information provided by the AI, which may hinder their linguistic potential and writing process. Bobk & Kryvoshyia (2024) also highlight in their study that students’ overreliance on ChatGPT may reduce their mental capabilities, and they stress the need to establish a framework for using ChatGPT in academic writing.

According to a recent study by Shrivastava (2024), these cutting-edge technologies are altering educational processes and increasing student and teacher productivity. However, there are drawbacks to the quick uptake of these instruments. Shrivastava’s study raises important concerns about the consequences of bias, the possibility of plagiarism, and difficulties with assessment procedures. Shrivastava points out the need for a diversified strategy to address these issues, including the creation of strong policies and instructional frameworks that encourage ethical application and optimise the advantages of generative artificial intelligence in educational contexts. Nepal (2024) highlights the importance of training students and researchers to critically evaluate the outputs generated by ChatGPT and emphasises the need for responsible awareness of ChatGPT's biases to ensure fair and equitable use of AI technologies in academia. Warr et al. (2024) also added that it is critical to develop an understanding and critical awareness of the bias embedded in models such as ChatGPT.

In addition, Li (2024) calls for ethical considerations in using ChatGPT for academic writing and advocates developing students’ responsible use of ChatGPT to preserve academic integrity. While Gultom et al.’s (2024) study illustrates that ChatGPT aids and improves the academic writing process in higher education, it stresses the need to guard against students’ irresponsible use and ChatGPT’s potential limitations. Also, Asad et al. (2024) refer in their study to ChatGPT's potential to enhance personalised learning in writing strategies; however, they point out its negative impact on students’ creativity and critical skills in academic writing. Rensburg & Reedy’s (2024) qualitative study explored the literature on students' use of ChatGPT in higher education, including its applications, challenges and ethical concerns. The study pinpoints the limitations and ethical concerns of ChatGPT use in academic writing, focusing on bias, reliability, accuracy, and accountability. It also concludes that clear and transparent guidelines should be offered to students and integrated into the teaching of academic writing. Although a large body of studies argues against the misuse of ChatGPT in academic writing, a few studies advocate utilising ChatGPT in the academic writing process.

2.2 ChatGPT Credibility & Reliability Limitations

AI Hallucinations & Inaccuracies

“Hallucinations” is a term that refers to false or misleading information created by ChatGPT, which can be problematic in academic contexts where precision is essential. The frequency of hallucinations (inaccurate or fabricated information) in ChatGPT, particularly in academic writing, raises serious questions regarding the credibility and reliability of ChatGPT-generated information. ChatGPT's credibility issues have been thoroughly investigated in different contexts, highlighting notable inaccuracies in the content it generates due to these "hallucinations". George & Stuhlmüller (2023) evaluated hallucinations in ChatGPT's academic summaries using Factored Verification, which proved effective in detecting hallucinations in academic summaries. The findings show an average of 0.62 hallucinations per summary, highlighting the need for caution in using language models for accurate academic writing. Hwang et al. (2024) asserted that even the improved model ChatGPT-4 showed a hallucination rate of 1.13 items per abstract in the studied scholarly work.

ChatGPT’s hallucinations dramatically impact its accuracy, validity and credibility. Mahyoob et al. (2023) found that ChatGPT’s hallucination produces nonfactual inferences, repetition, and illogical reasoning in academic writing. In addition, Alkaissi & McFarlane (2023) identified instances where it generated confident yet inaccurate information in academic writing due to its 'artificial hallucinations', arguing that ChatGPT's ability to generate factually correct scientific writing is questionable. Hallucinations can compromise the integrity of research proposals in academic writing by introducing false information, incorrect predictions, redundant or insufficient information, and unreliable references, misleading users with inaccurate, unreliable information (Costa et al., 2024; Jančařík & Dušek, 2024; Mondal & Sancheti, 2024; Cabuquin et al., 2024; Kolesnikov et al., 2024). Other studies, such as Gupta et al. (2024), Kobak et al. (2024), Royani & Sihombing (2024), and Akhoon et al. (2024), pointed to a lack of domain-specific knowledge, inaccuracies in scientific data and a limited ability to interpret results, all of which dramatically impact the validity of academic writing and research work.

Moreover, research has reported instances where ChatGPT’s hallucinations provided citations to nonexistent sources, which raises questions about its reliability. Buchanan & Shapoval (2024) found that over 30% of the references produced by GPT-3.5 were fake due to AI hallucinations creating nonexistent data, with only a slight improvement in GPT-4, which still exhibited high error rates. Jan (2024) confirmed that ChatGPT has serious trouble correctly citing scholarly material that has been withdrawn; the model's suitability for academic applications is doubtful since it relies on predictive reasoning rather than validated evidence. While several studies have pointed out that ChatGPT could enhance productivity and language mechanics, its integration into academic writing must be approached with caution to eliminate the risks of AI hallucinations fabricating information and references. These limitations have been evident in academic and assessment environments, where accurate, credible, and reliable information is essential.

Language Model Inherent Bias

Large language models like ChatGPT face a threat of bias in educational contexts, potentially reinforcing existing disparities rather than reducing them. Recent studies, such as Koçyiğit & Zhaksylyk (2023), Kleebayoon & Wiwanitkit (2023) and Kolesnikov et al. (2024), indicate that biases in training data reflected in AI models compromise academic justice and honesty. Wen and Younes’s (2023) study found that the ChatGPT model revealed biases such as hate speech rather than racial and gender biases, whereas Motoki et al. (2023) pinpointed biases in political orientation, gender, race, and religion in the ChatGPT model. Fulgu & Capraro’s (2024) research has shown that ChatGPT-4's gender biases often attribute masculine stereotypes to female writers more frequently than the reverse, for example favouring male characters in violent scenarios in moral dilemmas. In addition, Warr et al. (2024) found racial biases in ChatGPT that remain implicit when demographic information is indicated rather than directly stated: in an educational task, the model's responses can differ based on a student's perceived race or socioeconomic status, suggesting a deeper bias reflecting societal prejudices. Georgiou's (2024) research also highlighted ChatGPT's bias in favour of industrialised nations over developing ones, with more favourable opinions of high-HDI nations than low-HDI ones. On the other hand, Nepal (2024) identified confirmation and cultural bias in ChatGPT models, which generate responses that ignore opposing views or minority ideas in favour of dominant or popular narratives, limiting the diversity of viewpoints in academia. Inherent biases in ChatGPT outputs raise concerns about its use in sensitive academic discussions.

Racial and cultural bias in AI models like ChatGPT may have a substantial effect on academic writing, affecting representation, research focus, stereotyping, collaboration, and ethical standards. Hanna et al. (2023) argue that cultural biases in academic literature could reinforce stereotypes and spread false beliefs about certain cultures, undermining the authority of scholarly writing and producing a lack of diversity in academic research. Additionally, this lack of representation can alienate students from other backgrounds, who may feel their viewpoints and experiences are not recognised. For example, language models like ChatGPT favour some narratives or frameworks over others when used in academic contexts (Özer, 2024) and exclude non-Western voices when they produce text that primarily focuses on Western narratives, which perpetuates the power structures already in place in academia (Nationality Bias in Text Generation, 2023; Nepal, 2024). ChatGPT's cultural biases could lead to the silencing of voices that offer insightful contributions to the academic community, homogenising ideas, eroding the depth of scientific discourse, and inhibiting creativity and critical thinking.

2.3 ChatGPT Linguistic Limitations

Formulaic Language & Monotonous Writing Style

One of ChatGPT's identified linguistic limitations is its production of formulaic language, raising concerns about scholars' unique voice and expression. Baziyad et al. (2023) argue that ChatGPT's autoregressive mechanism produces formulaic language lacking genuineness and nuance, which hinders the development of critical thinking and creative skills in students and results in poor emotional depth and engagement in academic writing. Bhatia (2023) indicated that ChatGPT produces “plausible-sounding” answers based on its automated use of vocabulary and formulaic language, raising concerns that AI tools may compromise critical thinking and individual expression, suppressing the unique thought and expression essential for competent writing (Audichy & Saini, 2023). In addition, Alam (2024) opined that ChatGPT has trouble with discourse particles, conversational cues, and formulaic genre structures, impacting subtle flow and clarity and leading to decreased coherence. Moreover, Markey et al. (2024) described documents produced using ChatGPT as dialogically closed, "empty," and "fluffy," missing the complexity and involvement that characterise academic writing. Yazid & Dzulfikri (2024) found that the informative density and coherence of ChatGPT's academic writing responses are diminished when its outputs are compared to writings produced by both published and inexperienced authors.

Alongside ChatGPT's formulaic language, another linguistic limitation tends to make the model's writing style monotonous. Baziyad et al. (2023) concluded that ChatGPT’s machine-patterned language, resulting from its autoregression strategy, contributes to the monotonous writing style detected through their experimental analysis of its generated texts. Silva et al. (2023), however, argued that ChatGPT's monotonous writing style stems from its reliance on extensive training data. Additionally, Basmov et al. (2023) identified monotonicity as one of ChatGPT's tendencies, highlighting the risk of destroying the originality and engagement of scientific writing (Mahyoob et al., 2023). Ningsih & Rohmah (2024) criticised ChatGPT’s repetitive writing style, including its reliance on direct sentences, as one of its monotonous-style flaws (Fiialka et al., 2024). Nepal (2024) concluded that ChatGPT’s monotonous writing style could hinder maintaining diverse and engaging language in academic contexts.

The limitations of ChatGPT in producing formulaic language and monotonous writing styles, highlighted across various studies, could significantly impact quality and creativity in academic contexts. Bhatia (2023) declared that, due to the increasing use of ChatGPT, students in academic contexts have become more prone to synthesising information uncritically, impairing the critical thinking and analytical skills essential for scientific academic writing. Baziyad et al. (2023) confirmed that the spread of AI-generated content in academic contexts makes it challenging to distinguish between unique concepts and machine-generated language, compromising the essential qualities of depth and originality (Dergaa et al., 2023). On the other hand, some researchers argue that, despite its limitations, ChatGPT could still serve as a valuable tool for generating ideas and assisting with editorial tasks if used judiciously. Therefore, becoming aware of the limitations of ChatGPT is crucial for understanding the model's capabilities and shortcomings and for critically evaluating its content.

Decontextualization & Linguistic Ambiguity

Another linguistic limitation that impacts the model's performance is decontextualisation, its inability to comprehend context. As stated by Sinha et al. (2022), specific assessments reveal that ChatGPT frequently displays instability in response to various contextual cues, especially those that closely correspond with grammatical infractions. In addition, Qiu et al. (2023) found that ChatGPT encounters difficulties digesting pragmatic implicatures with human-like flexibility, which might result in miscommunications in complex discussions. Basmov et al. (2023) further found that ChatGPT lacks the level of contextual knowledge required to produce complex discussions, which can lead to a shallow interaction with the content, and Liu et al. (2023) confirmed similar difficulties with pragmatic implicatures. While Markey et al. (2024) pointed out that ChatGPT can produce text rapidly, this lack of contextual understanding hinders the production of nuanced arguments and emotional resonance, which are vital in persuasive writing (Jiang & Hyland, 2024). In addition to decontextualisation, Arif et al. (2023) found that ChatGPT produced ambiguous output in academic writing responses. Ortega-Martín et al. (2023) further argued that ChatGPT’s general weaknesses in handling linguistic ambiguity can negatively affect the coherence and clarity of its generated texts. Due to this ambiguity limitation embedded in the model, Bhatia (2023) noted that ChatGPT’s ambiguous linguistic outputs put the clarity, coherence and reliability of its generated content at risk. When ChatGPT-generated content is compared to texts created by humans, these constraints may result in academic text that is less contextually relevant and less comprehensible.

ChatGPT's decontextualisation and lack-of-clarity limitations affect the precision, unity, and credibility of its generated content. The contextual misunderstanding limitation is particularly pronounced in complex academic disciplines, where the subtleties of language can significantly alter meaning. Elhag et al. (2023) pointed out that ChatGPT outputs stemming from misunderstood input could jeopardise the integrity of academic communication, particularly when researchers rely on ChatGPT to write or evaluate sophisticated arguments without sufficient human supervision. For example, ChatGPT can produce believable medical advice or instructional content; however, its inability to appropriately contextualise information or validate facts could negatively affect users who depend on it to make important decisions (Chowdhury & Haque, 2023). Some studies confirmed that ChatGPT's performance varies dramatically based on the specificity and complexity of the prompts it receives, leading to the risk of numerous misinterpretations (Suárez et al., 2023; Yıldırım & Tellou, 2024). Although some literature declares that AI models like ChatGPT can facilitate certain writing tasks, their rigid, ambiguous output may deter students from developing the meaningful, fluid and adaptive writing style essential for higher-level academic success (Jiang & Hyland, 2024). Datskiv et al. (2024) concluded that ChatGPT's limitations in replicating nuanced academic language and complex argumentation structures can introduce ambiguity, reduce coherence in generated texts, and impair the clarity and depth of academic writing.

In conclusion, ChatGPT offers potential benefits, such as enhancing the writing pace and providing language support; however, its integration into academic writing presents significant limitations that must be addressed. ChatGPT’s multifaceted limitations extend from linguistic shortcomings to issues with the accuracy, reliability, and credibility of its output. Although experts and researchers in the field believe that ChatGPT's limitations raise significant concerns about its impact on students’ critical and analytical thinking, the integrity of students’ academic work, and students’ irresponsible misuse of technology, they have overlooked techniques for mitigating those limitations so as to leverage its use while maintaining academic integrity and data validity. In addition, despite prior observations and statements in research about the need for guidelines to refine students’ utilisation of ChatGPT, it remains unclear what those guidelines are and how they can be integrated into academic writing instruction. Moreover, the presented literature extensively studied students’ behaviour and attitudes while using ChatGPT in their academic writing assessments; however, none of the previous research has examined students’ awareness of ChatGPT limitations as one of the factors behind their potential misuse and irresponsible integration of ChatGPT in their academic writing. Therefore, this study intends to measure the effectiveness of researcher-designed critical review tasks on students’ critical awareness of the limitations of ChatGPT. The research questions are “What are the ChatGPT limitation areas that students need to recognise and consider?”, “What is the design of the critical review activity required to raise students’ critical awareness of ChatGPT limitations?” and “How effectively could the designed critical review activity, integrated into the academic writing course, develop students’ critical awareness of the ChatGPT limitations?”
The research uses the quasi-experimental method of one experimental group. The research participants practice the critical review activity integrated into the academic writing course for a semester. Using the pre- and post-questionnaire for data collection, results are discussed in terms of the development of students’ critical awareness of ChatGPT limitations and students’ recognition of ChatGPT limitations.

  3. RESEARCH METHODOLOGY

3.1 Research Design

Quasi-experimental research was conducted to measure the effectiveness of using a researcher-designed five-step critical review activity while reading ChatGPT responses in raising university students’ critical awareness of ChatGPT limitations in academic writing. For data collection, a quantitative approach was adopted in the study using a pre-post ChatGPT Limitations Critical Awareness Questionnaire to assess university students’ level of awareness and recognition of ChatGPT model limitations in Academic Writing.

The quasi-experimental research design was conducted in three phases: the pre-administration of the ChatGPT Limitations Critical Awareness questionnaire, the integration of critical review activities while using ChatGPT for in-class academic writing tasks for a semester (three months), and the post-administration of the Students’ Critical Awareness of ChatGPT Limitations questionnaire.

3.2 Statistical Analysis

Data were collected, cleaned and organised by the researcher. The data were run through SPSS (Statistical Package for the Social Sciences) software to compute the statistical tests and obtain results. Descriptive and comparative analysis methods were employed to evaluate the data collected on students’ critical awareness of ChatGPT limitations. The descriptive analysis included calculating the mode, mean, median and standard deviation to present the data. The comparative analysis included calculating the t-value and the p-value using a t-test to measure the difference between pre-questionnaire and post-questionnaire scores. The p-value was compared to a set significance level (0.05) to decide whether the results were statistically significant. To measure the effect of the proposed treatment (the researcher-designed critical review activity), the effect size was calculated using the eta-squared formula.
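The paired-samples comparison described above can be sketched in code. This is a minimal illustration only: the pre/post scores below are invented, not the study's data, and the function name is hypothetical. Eta squared is derived from the t-value as η² = t² / (t² + df):

```python
import math
from statistics import mean, stdev

def paired_t_and_eta_squared(pre, post):
    """Paired-samples t-test with an eta-squared effect size.

    Illustrative sketch only; the scores passed in are hypothetical,
    not the questionnaire data reported in this study.
    """
    diffs = [b - a for a, b in zip(pre, post)]  # per-student gain
    n = len(diffs)
    df = n - 1
    # t = mean difference / standard error of the differences
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    # eta^2 = t^2 / (t^2 + df), the effect-size formula named in the text
    eta_squared = t ** 2 / (t ** 2 + df)
    return t, df, eta_squared

# Hypothetical pre- and post-questionnaire scores for five students
pre_scores = [12, 15, 11, 14, 13]
post_scores = [20, 22, 18, 21, 19]
t, df, eta2 = paired_t_and_eta_squared(pre_scores, post_scores)
print(f"t({df}) = {t:.2f}, eta^2 = {eta2:.3f}")
```

In practice the t-value would be converted to a p-value (e.g. via statistical tables or SPSS) and compared with the 0.05 significance level, exactly as the analysis above describes.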

3.3 Research Setting

The research was conducted over four months in the second semester of the 2023/2024 academic year, from February 2024 to May 2024, at the University of Prince Edward Island, Cairo Campus, hosted by the University of Canada in Egypt (UPEI-UCE). The study was limited to international branch-campus universities in Egypt, which educate both Egyptian and international students.

3.4 Research Participants

The participants were selected based on two criteria. First, they had to be enrolled in the Academic Writing subject (Writing Studies) and use ChatGPT during their in-class academic writing tasks. Second, they had to be first- or second-year students in one of three schools at UPEI-UCE: Computer Science, Engineering, or Business. The participants totalled 120 students, enrolled across three mixed-major sections blending first- and second-year students: 75 from Business, 27 from Computer Science, and 18 from Engineering. The questionnaire, in a Microsoft Outlook online form, was sent to all participants via email and Outlook Teams, and yielded responses from all 120 students: 76 first-year and 44 second-year.

3.5 Research Framework

As educators grapple with the implications of integrating tools like ChatGPT into their curricula, a focus on fostering critical thinking becomes paramount; after all, student essays should reflect more nuanced epistemic stances that are essential for persuasive argumentation (Markey et al., 2024). The challenge lies not only in leveraging these technologies effectively but also in ensuring that they enhance—rather than diminish—the development of vital rhetorical skills among learners. As researchers continue to explore these limitations, it becomes increasingly clear that a collaborative approach involving human expertise is essential to mitigate potential biases and enhance the utility of language models like ChatGPT in specialised domains. Therefore, fostering a critical analysis approach towards the use of such technologies is essential, ensuring that human oversight remains central to the research process and that any generated material undergoes rigorous scrutiny before publication (Osama et al., 2023).

Critical analysis of a text involves analysis and evaluation rather than passive absorption of information. It encourages students to analyse text features and composition structure, question assumptions and views, analyse logic, reasoning and arguments, evaluate sources, and recognise biases and fallacies through interaction with the text. Based on this approach, the researcher designed a five-step critical review activity for students to practise while using ChatGPT in their in-class academic writing tasks. The critical review activity steps are 1) annotation, 2) comparative analysis, 3) argument analysis, 4) bias identification, and 5) language analysis.

In the academic semester of 13 teaching weeks, students attended two lectures each week. In the first week of the academic writing course, students were introduced to the course content orientation in the first lecture, and they filled in two online forms in the second lecture: the Pre-ChatGPT Limitations Awareness Questionnaire and the consent form to participate in the research study. During the second week of lectures, students were introduced to the ChatGPT limitations forms and the researcher-designed five-step critical review activity procedures. For the next ten weeks, students attended lectures to write five in-class writing tasks. Those in-class academic writing tasks included a short personal narrative, a reflective essay, an analytical essay of arguments, a problem-solution essay focusing on solutions, and a summary essay of an article.

Students spent four lectures writing each in-class academic writing task. They worked in pairs, which were shuffled for each task to enhance intellectual cooperation and scaffold the development of writing skills. Each in-class writing task was monitored by the instructor (the researcher of this study). Students were instructed to use ChatGPT during their in-class writing process and were encouraged to use it to brainstorm ideas, outline their writing tasks, gain knowledge and information, and find research evidence and sources while maintaining academic integrity. Students performed the five-step critical review activity every time ChatGPT provided a written response to their prompts. Meanwhile, the instructor provided guidance, clarification, help and support to facilitate the activity and ensure its steps were carried out correctly. In the last week of the semester, students filled in the online Post-ChatGPT Limitations Awareness Questionnaire during the lecture, monitored by the instructor to ensure all participants’ responses were received.

  4. RESEARCH INSTRUMENT

4.1 Questionnaire Design

The study utilised a researcher-designed survey questionnaire, which underwent reliability and validity testing. The questionnaire was built around four main forms of ChatGPT limitation in academic writing identified in the study literature: “Formulaic Language and Monotonous Writing Style”, “Decontextualisation & Linguistic Ambiguity”, “Hallucinations & Inaccuracies”, and “Model Inherent Biases”. The online questionnaire form has two parts: the first collects demographic information, and the second is the Students’ Critical Awareness Questionnaire of ChatGPT Limitations in Academic Writing. The second part (the closed questionnaire) includes 16 statements covering the four main ChatGPT limitations above. Each statement begins with “I can recognise …” to assess students’ critical awareness of the four limitations. Responses were categorical: students chose one of three options, “Yes”, “No”, or “Not Sure”, scored as Yes = 2 points, No = 0 points, and Not Sure = 0 points, giving each student a total score out of a maximum of 32 points. See Appendix A.
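The scoring rule above can be sketched as a small function; the response labels and the sample response sheet below are assumptions used only for illustration.

```python
# Scoring scheme from the questionnaire design: Yes = 2 points,
# No = 0 points, Not Sure = 0 points, over 16 statements (max 32).
POINTS = {"Yes": 2, "No": 0, "Not Sure": 0}

def total_score(responses):
    """Total one student's closed-questionnaire score."""
    assert len(responses) == 16, "the closed questionnaire has 16 statements"
    return sum(POINTS[r] for r in responses)

# Hypothetical response sheet (illustration only)
sheet = ["Yes"] * 10 + ["No"] * 3 + ["Not Sure"] * 3
print(total_score(sheet))  # 10 "Yes" answers -> 20 of 32 points
```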

4.2 Questionnaire Reliability

The questionnaire was piloted, followed by tests of reliability and internal consistency using Cronbach’s alpha coefficient. The subscales for the four limitation forms were all above 0.7, and the Cronbach’s alpha value for the total scale was 0.937, indicating relatively high reliability. The reliability was 0.893 for “Formulaic Language & Monotonous Writing Style”, 0.935 for “Decontextualization & Linguistic Ambiguity”, 0.967 for “Hallucinations & Inaccuracies”, and 0.953 for “Model’s Inherent Biases”.
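For reference, Cronbach’s alpha as reported above follows the standard formula α = (k/(k−1)) · (1 − Σ item variances / variance of total scores); a minimal pure-Python sketch with hypothetical item scores:

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.
    `items` is a list of k item-score columns, one score per respondent:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(totals))."""
    k = len(items)
    n = len(items[0])

    def variance(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(variance(c) for c in items) / variance(totals))

# Hypothetical item scores: 4 items x 6 respondents (illustration only)
items = [
    [2, 2, 0, 2, 0, 2],
    [2, 2, 0, 2, 0, 0],
    [2, 0, 0, 2, 0, 2],
    [2, 2, 0, 2, 2, 2],
]
print(round(cronbach_alpha(items), 3))  # -> 0.823
```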

4.3 Data Collection

To ensure that comparable questionnaire responses were collected, both the pre- and post-questionnaire administrations were completed through an online Microsoft Outlook form while monitored in class. At the beginning of the semester, the participants filled in the ChatGPT Limitations Critical Awareness pre-questionnaire. During the semester, while enrolled in the academic writing subject, they practised the five-step critical review activity while using ChatGPT to complete five in-class academic writing tasks. Finally, they filled in the ChatGPT Limitations Critical Awareness post-questionnaire.

 

  5. RESULTS

Students’ critical awareness of ChatGPT limitations was assessed by a researcher-designed pre- and post-questionnaire, based on the literature on ChatGPT limitations in academic writing. The researcher collected, cleaned and organised the data to run a descriptive and comparative analysis and reach conclusions.

The “t” test formula below was employed to measure the difference between the experimental group mean scores on the Pre- and Post-Administrations of the Students’ Critical Awareness Questionnaire of ChatGPT Limitations:

t = Dm / √( ΣDd² / ( n(n − 1) ) )

Where:

“t” = the calculated value of the difference between mean scores,

Dm = the general mean score of the differences between pairs of scores,

Dd = the deviation of each difference from the general mean score, and

n = the number of pairs of scores.

 

See the table below:

Table 5.1: "t" Value of the difference between the experimental group mean scores on the Pre and Post-Administrations of the Students’ Critical Awareness Questionnaire of ChatGPT Limitations

 

| Admin | N | M | Std. Deviation | Mean Differences | Degrees of Freedom | t tabled value | t calculated value | Level of Sig. |
|---|---|---|---|---|---|---|---|---|
| Pre | 120 | 50.14 | 4.571 | 72.571 | 34 | 2.457 | 136.866 | Significant at 0.01 |
| Post | 120 | 122.71 | 3.707 | | | | | |

The above table indicates the following: 

1- The "t" calculated value of the overall Students’ Critical Awareness Questionnaire of ChatGPT Limitations (136.866) was higher than the "t" tabled value (2.457) with 34 degrees of freedom at the (0.01) level of significance. Thus, there was a statistically significant difference between the participants' mean scores on the pre- and post-administrations of the Students’ Critical Awareness Questionnaire of ChatGPT Limitations.

2- The participants’ mean score on the post-administration of the Students’ Critical Awareness Questionnaire of ChatGPT Limitations (122.71) was higher than their mean score on the pre-administration (50.14). Thus, the overall development of the participants’ critical awareness of ChatGPT Limitations can be attributed to the researcher-designed critical review activity practised during the experiment.

Further, the “t” test formula was used to determine the relative extent of the differences between the participants’ mean scores in each of the four ChatGPT limitations, fostered by the implementation of the researcher-designed critical review activity.

See the table below: 

Table 5.2: "t" Value of the difference between Mean Scores of the Participants on the Pre and Post-Administrations in each of the four ChatGPT limitations.

| Limitation | Admin | N | M | Std. Deviation | Mean Differences | Degrees of Freedom | t tabled value | t calculated value | Level of Sig. |
|---|---|---|---|---|---|---|---|---|---|
| Hallucinations & Inaccuracies | Pre | 120 | 9.54 | 1.597 | 12.171 | 34 | 2.457 | 67.258 | Significant at 0.01 |
| | Post | 120 | 21.71 | 1.742 | | | | | |
| Model’s Inherent Biases | Pre | 120 | 12.83 | 1.014 | 12.371 | 34 | 2.457 | 57.984 | Significant at 0.01 |
| | Post | 120 | 25.20 | 1.641 | | | | | |
| Formulaic Language and Monotonous Writing Style | Pre | 120 | 8.57 | 1.037 | 16.514 | 34 | 2.457 | 91.525 | Significant at 0.01 |
| | Post | 120 | 25.09 | 1.522 | | | | | |
| Decontextualization & Linguistic Ambiguity | Pre | 120 | 8.06 | 0.802 | 17.000 | 34 | 2.457 | 53.986 | Significant at 0.01 |
| | Post | 120 | 25.06 | 1.924 | | | | | |
| Overall: Students’ Critical Awareness Questionnaire of ChatGPT Limitations | Pre | 120 | 50.14 | 4.571 | 72.571 | 34 | 2.457 | 136.866 | Significant at 0.01 |
| | Post | 120 | 122.71 | 3.707 | | | | | |

The table above confirms the following: 

a- There is a statistically significant difference between the participants’ mean score on the post-questionnaire administration for the Hallucinations and Inaccuracies Limitation (21.71) and their mean score on the pre-questionnaire administration (9.54) at the (0.01) level.

b- There is a statistically significant difference between the participants’ mean score on the post-questionnaire administration for the Model’s Inherent Biases Limitation (25.20) and their mean score on the pre-questionnaire administration (12.83) at the (0.01) level.

c- There is a statistically significant difference between the participants’ mean score on the post-questionnaire administration for the Formulaic Language and Monotonous Writing Style Limitation (25.09) and their mean score on the pre-questionnaire administration (8.57) at the (0.01) level.

d- There is a statistically significant difference between the participants’ mean score on the post-questionnaire administration for the Decontextualization & Linguistic Ambiguity Limitation (25.06) and their mean score on the pre-questionnaire administration (8.06) at the (0.01) level.

To measure the effect of the proposed treatment (the researcher-designed critical review activity), the effect size was calculated using the Eta square formula, η² = t² / (t² + df), as follows.

See the table below: 

Table 5.3: The Calculated Eta Square (η²) Value and the Effect Size for the Students’ Critical Awareness Questionnaire of ChatGPT Limitations

 

| ChatGPT Limitations | t value | D.F | η² value | Effect size |
|---|---|---|---|---|
| Students’ Critical Awareness Questionnaire of ChatGPT Limitations | 136.866 | 34 | 0.998 | Large |

As indicated in Table (5.3), the calculated η² value for the researcher-designed critical review activity on the experimental group’s overall critical awareness of ChatGPT Limitations was 0.998. Thus, a large effect on the development of the experimental group’s critical awareness of ChatGPT Limitations was evidenced, attributable to the researcher-designed critical review activity used during the experiment.
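The reported effect size can be reproduced from the t value and degrees of freedom in Table 5.3; a quick sketch of the η² = t² / (t² + df) computation:

```python
def eta_squared(t, df):
    """Effect size from a t statistic: eta^2 = t^2 / (t^2 + df).
    Common benchmarks: 0.01 small, 0.06 medium, 0.14 large."""
    return t ** 2 / (t ** 2 + df)

# Reported overall t value and degrees of freedom from Table 5.3
overall = eta_squared(t=136.866, df=34)
print(f"{overall:.3f}")  # -> 0.998, well past the 0.14 "large" benchmark
```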

To measure the effect of the proposed "researcher-designed critical review activity", the effect size of the proposed treatment on students' critical awareness of each of the four ChatGPT limitations was calculated by using the Eta square (η²) formula as follows:

See the table below: 

Table 5.4: The calculated Eta square and the Effect Size of the proposed researcher-designed critical review activity on students' critical awareness of each of the four ChatGPT limitations.

 

| Four ChatGPT Limitations | t value | D.F | η² value | Effect size |
|---|---|---|---|---|
| Hallucinations and Inaccuracies Limitation | 67.258 | 34 | 0.992 | Large |
| Model’s Inherent Biases Limitation | 57.984 | 34 | 0.989 | Large |
| Formulaic Language and Monotonous Writing Style Limitation | 91.525 | 34 | 0.995 | Large |
| Decontextualization and Linguistic Ambiguity Limitation | 53.986 | 34 | 0.988 | Large |

As indicated in Table (5.4), the calculated Eta square values of the proposed treatment on students’ critical awareness of each of the four ChatGPT limitations were 0.992 for the Hallucinations and Inaccuracies Limitation, 0.989 for the Model’s Inherent Biases Limitation, 0.995 for the Formulaic Language and Monotonous Writing Style Limitation, and 0.988 for the Decontextualization and Linguistic Ambiguity Limitation. Thus, a large effect on the development of the students’ critical awareness of the four ChatGPT Limitations was evidenced, due to implementing the proposed treatment during the experiment.

 

 

  6. CONCLUSION

University students' growing dependence on ChatGPT for academic writing requires a thorough examination of its limitations. This study aims to help students develop a critical awareness and deep understanding of the linguistic and factual flaws in ChatGPT responses. Through the use of a critical review activity created by the researcher, the study successfully involved participants in assessing the appropriateness and reliability of information produced by ChatGPT. The necessity of including critical thinking activities in academic writing courses is highlighted by the significant improvement in students' critical awareness that was revealed by analysing the data collected by the students’ critical awareness of ChatGPT Limitations pre and post-questionnaire.

These results support integrating educational materials that develop students’ critical attitude towards AI-generated content, equip them with the critical tools needed to evaluate academic content generated by ChatGPT, and enhance their academic writing potential. The findings also emphasise the importance of fostering active critical engagement over passive dependence, which could develop unintentionally given the impressive, plausible output of AI tools. This research contributes to the ongoing discourse on the role of artificial intelligence in education while preparing students intellectually and critically to adopt mindful perspectives towards the pervasive wave of AI technology.

Putting these ideas into practice will be crucial as educational institutions evolve to prepare a generation of students who are not just skilled writers but also capable information evaluators. Educators can enable students to take an active role in their education by incorporating these analytical tools into the curriculum, allowing students to question, analyse and synthesise material effectively and become proactive, critical users of AI models. By encouraging student cooperation, this transformative approach creates a classroom climate that values technological development while fostering critical thinking. Ultimately, such research may help educators understand how best to incorporate AI technologies into the curriculum, creating an atmosphere that encourages critical use of technology while improving academic results.

Future studies should examine the effects of including critical review exercises for ChatGPT responses in academic subjects other than academic writing courses. Examining how these exercises affect students’ general critical thinking abilities, research capabilities, and academic achievement may yield insightful information. Studies could also look at how well various teaching strategies and resources work to raise students’ critical awareness of AI-generated content. To evaluate the generalisability of the results, it would also be advantageous to include a wide range of participants from different educational institutions and backgrounds. Lastly, to ensure that teaching methods remain applicable and successful in an increasingly digital learning environment, longitudinal research could monitor how students’ attitudes and abilities toward AI technologies like ChatGPT change over time.

REFERENCES
Akhoon, I., Khan, M., Bhat, T. (2024). Comparative Analysis of AI-Generated Research Content: Evaluating ChatGPT and Google Gemini. PREPRINT (Version 1) available at Research Square doi: 10.21203/rs.3.rs-5265799/v1
Al-Sofi, B. (2024). Artificial Intelligence-Powered Tools and Academic Writing: to use or not to use ChatGPT. Saudi Journal of Language Studies, doi: 10.1108/sjls-06-2024-0029
Alam, M. (2024). Distinguishing between Student-Authored and ChatGPT-Generated Texts: A Preliminary Exploration of Human Evaluation Techniques. Iraqi Journal for Computer Science and Mathematics, 5(3), 287-304. doi: 10.52866/ijcsm.2024.05.03.016
Nugroho, A., Putro, N. H. P. S., & Syamsi, K. (2023). The Potentials of ChatGPT for Language Learning: Unpacking its Benefits and Limitations. Register Journal, doi: 10.18326/register.v16i2.224-247
Asad, M., Shahzad, S., Shah, S., Sherwani, F., & Almusharraf, N. (2024). ChatGPT as artificial intelligence-based generative multimedia for English writing pedagogy: challenges and opportunities from an educator’s perspective. Campus-Wide Information Systems, doi: 10.1108/ijilt-02-2024-0021
Athaluri, S. A., Manthena, S. V., Kesapragada, V. S. R. K. M., Yarlagadda, V., Dave, T., & Duddumpudi, R. T. S. (2023). Exploring the Boundaries of Reality: Investigating the Phenomenon of Artificial Intelligence Hallucination in Scientific Writing Through ChatGPT References. Cureus, 15(4), e37432. https://doi.org/10.7759/cureus.37432
Atkinson-Toal, A., & Guo, C. (2024). Generative Artificial Intelligence (AI) Education Policies of UK Universities. Deleted Journal, 2, 70–94. https://doi.org/10.62512/etlhe.20
Audichya, M. K., & Saini, J. R. (2023). ChatGPT for Creative Writing and Natural Language Generation in Poetry and Prose. https://doi.org/10.1109/icacta58201.2023.10392805
Baker, J. (2021). The Importance of Academic Writing in Higher Education. Journal of Educational Research, 45(3), 123-134. https://doi.org/10.26803/ijlter.20.10.1
Barrett, A.J., & Pack, A. (2023). Not quite eye to A.I.: student and teacher perspectives on the use of generative artificial intelligence in the writing process. International Journal of Educational Technology in Higher Education, 20, 1-24.
DOI:10.1186/s41239-023-00427-0
Basmov, V., Goldberg, Y. & Tsarfaty, R. (2023). ChatGPT and Simple Linguistic Inferences: Blind Spots and Blinds. arXiv.org, abs/2305.14785 doi: 10.48550/arXiv.2305.14785
Baziyad, M., Kamel, I. & Rabie, T. (2023). On the Linguistic Limitations of ChatGPT: An Experimental Case Study. International Symposium on Networks, Computers and Communications (ISNCC), Doha, Qatar, 2023, pp. 1-6, doi: 10.1109/ISNCC58260.2023.10323661.
Bhatia, P. (2023). ChatGPT for academic writing: A game changer or a disruptive tool? Journal of Anaesthesiology Clinical Pharmacology. https://doi.org/10.4103/joacp.joacp_84_23
Bobk, O. & Kryvoshyia, D. (2024). Writing a critical academic essay: with or without chatgpt?. Grail of science (European Scientific Platform Publishing), 445-452. doi: 10.36074/grail-of-science.06.09.2024.058
Buchanan, J., Hill, S., & Shapoval, O. (2024). Chatgpt hallucinates non-existent citations: Evidence from economics. The American Economist, 69(1), 80-87. DOI: 10.1177/05694345231218454
Burbano, R. (2023). El papel de la escritura académica en la educación superior: explorando la relación entre escritura y objetivos de desarrollo sostenible. Tierra Infinita (Universidad Politécnica Estatal del Carchi - Impresa), 9(1):215-227. doi: 10.32645/26028131.1253
Cabuquin, J., Acidre, M., Manabat, M., Mary, Aruta, G., Sangutan, J. & Yu, R. (2024). The role of ChatGPT on academic research: perspectives from Filipino students across diverse educational levels. Salud, Ciencia y Tecnología - Serie de Conferencias, 3 doi: 10.56294/sctconf2024.1205
Chen, H. (2024). Utilizing ChatGPT in Systematic Reviews and Meta-Analyses. Airiti Inc, 71(5):21-28. doi: 10.6224/jn.202410_71(5).04
Chowdhury, Md. N.-U.-R., & Haque, A. (2023). ChatGPT: Its Applications and Limitations. https://doi.org/10.1109/conit59222.2023.10205621
Conde, J., Reviriego, P., Salvachúa, J., Martínez, G., Hernández, J.A., & Lombardi, F. (2024). Understanding the Impact of Artificial Intelligence in Academic Writing: Metadata to the Rescue. Computer, 57, 105-109. https://oa.upm.es/77072/1/Accepted_verseion_copyright.pdf
Consuegra‐Fernández, M., Aznar, J. S., Serra, J. G. B., & Molina, J. J. C. (2024). ChatGPT. RIE. https://doi.org/10.6018/rie.565391
Costa, K., Mfolo, L. N., & Ntsobi, M. P. (2024, July 24). Challenges, Benefits and Recommendations for Using Generative Artificial Intelligence in Academic Writing – A case of ChatGPT. Center for Open Science. https://doi.org/10.31222/osf.io/7hr5v
Datskiv, O., Zadorozhna, I., & Shon, O. (2024). Developing Future Teachers’ Academic Writing and Critical Thinking Skills Using ChatGPT. International Journal of Emerging Technologies in Learning (iJET), 19(07), 126-136. doi: 10.3991/ijet.v19i07.49935
Dergaa, I., Chamari, K., Zmijewski, P., & Saad, H. B. (2023). From human writing to artificial intelligence generated text: examining the prospects and potential threats of ChatGPT in academic writing. Biology of Sport. https://doi.org/10.5114/biolsport.2023.125623
Elhag, W., Bouteldja, D. & Bouallegue, S. (2023). A Survey on the Applications, Limitations, and Ethical Considerations of ChatGPT in Various Industries. IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering (MetroXRAINE), Milano, Italy, 2023, pp. 565-569, doi: 10.1109/MetroXRAINE58569.2023.10405608.
Fiialka, S., Kornieva, Z. & Honcharuk, T. (2024). The use of ChatGPT in creative writing assistance. XLinguae, doi: 10.18355/xl.2024.17.01.01
Fleisig, E., Smith, G., Bossi, M., Rustagi, I., Yin, X. & Klein, D. (2024). Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination. arXiv.org doi: 10.48550/arxiv.2406.08818
Fulgu, R., Capraro, V. (2024). Surprising gender biases in GPT.  arXiv:2407.06003v1 doi: 10.48550/arxiv.2407.06003
García Díaz, V. (2019). La ambivalencia de la escritura académica entre los profesores de diseño. Foro de Educación, 17(26), 197-218. doi: http://dx.doi.org/10.14516/fde.603
George, C. & Stuhlmüller, A. (2023). Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers. arXiv.org, abs/2310.10627 doi: 10.48550/arxiv.2310.10627
Georgiou, G. (2024, July 8). ChatGPT exhibits bias towards developed countries over developing ones, as indicated by a sentiment analysis approach. https://doi.org/10.31234/osf.io/49gzc
Guleria, A., Krishan, K., Sharma, V. & Kanchan, T. (2023). ChatGPT: ethical concerns and challenges in academics and research.. Journal of Infection in Developing Countries, 17(9):1292-1299. doi: 10.3855/jidc.18738
Gultom, A., Ashadi, A., Fatnalaila, F., Azizah, S. & Rosyidah, D. (2024). The Use of Chat GPT for Academic Writing in Higher Education. Formosa Journal of Sustainable Research, 3(8):1713-1730. doi: 10.55927/fjsr.v3i8.10162
Gupta, V., Anamika, F., Parikh, K., Patel, M. A., Jain, R. & Jain R. (April 1, 2024). From Advancements to Ethics: Assessing ChatGPT’s Role in Writing Research Paper. Turkish Journal of Internal Medicine 6 2 74–80.  https://doi.org/10.46310/tjim.1398093
Hanna, J. J., Wakene, A. D., Lehmann, C. U., & Medford, R. J. (2023). Assessing Racial and Ethnic Bias in Text Generation for Healthcare-Related Tasks by ChatGPT. medRxiv. https://doi.org/10.1101/2023.08.28.23294730
Ho, P. X. P. (2024). Using ChatGPT in English Language Learning: A Study on I.T. Students’ Attitudes, Habits, and Perceptions. International Journal of TESOL & Education, 4(1), 55-68. https://doi.org/10.54855/ijte.24414
Hwang T, Aggarwal N, Khan PZ, Roberts T, Mahmood A, Griffiths MM, et al. (2024) Can ChatGPT assist authors with abstract writing in medical journals? Evaluating the quality of scientific abstracts generated by ChatGPT and original abstracts. PLoS ONE 19(2): e0297701. https://doi.org/10.1371/journal.pone.0297701
Im, H. (2023). A Study on College Students’ Perspectives and Attitudes Toward the Use of ChatGPT in English Classes. https://doi.org/10.33645/cnc.2023.09.45.09.1335
Jan, R. (2024). Examining the Reliability of ChatGPT. Advances in educational technologies and instructional design book series, 367-392. doi: 10.4018/979-8-3693-0884-4.ch014
Jančařík, A. & Dušek, O. (2024). The Problem of AI Hallucination and How to Solve It. European Conference on e-Learning, doi: 10.34190/ecel.23.1.2584
Jiang, F., & Hyland, K. (2024). Does ChatGPT Argue Like Students? Bundles in Argumentative Essays. Applied Linguistics. https://doi.org/10.1093/applin/amae052
Juan, W., Qing, S., Yunus, M. M., & Rafiq, K. R. M. (2023). Integrating ChatGPT into English Language Teaching and Learning: Strengths and Weaknesses. International Journal of Academic Research in Business & Social Sciences. https://doi.org/10.6007/ijarbss/v13-i12/20251
Kennelly, I. (2017). Writing with Purpose(s): A Reflection on Different Conceptions of Academic Writing in Contemporary Higher Education. Irish Journal of Academic Practice: Vol. 6: Iss. 1, Article 6. doi:10.21427/D79725
Kleebayoon, A., Wiwanitkit, V. (2023). ChatGPT and scientific paper. J Korean Assoc Oral Maxillofac Surg 2023; 49(4): 239~240. doi: https://doi.org/10.5125/jkaoms.2023.49.4.239
Kobak, D., Marquez, R., Horvát, E., & Lause, J. (2024). Delving into ChatGPT usage in academic writing through excess vocabulary. Cornell University.  doi: 10.48550/arxiv.2406.07016
Koçyiğit, B. & Zhaksylyk, A. (2023). Advantages and drawbacks of chatgpt in the context of drafting scholarly articles. Central Asian journal of medical hypotheses and ethics, 4(3):163-167. doi: 10.47316/cajmhe.2023.4.3.04
Kolesnikov, D., Kozlova, A., Aleхandrov, A., Kalmykov, N., Treshkov, P., LeBaron, T. & Medvedev, O. (2024). Applying ChatGPT to Writing Scientific Articles on the Use of Telemedicine: Opportunities and Limitations. Artificial Intelligence in Health, 1(3), 53–63. https://doi.org/10.36922/aih.2592
Li, D. (2024). ChatGPT in Language Writing Education: Reflections and a Research Agenda for a ChatGPT Feedback Engagement Framework. Language teaching research quarterly, 43:121-131. doi: 10.32038/ltrq.2024.43.07
Liu, Z., Yao, Z. ,  Li, F. & Luo, B. (2023). Check Me If You Can: Detecting ChatGPT-Generated Academic Writing using CheckGPT. arXiv:2306.05524 [cs.CL] doi: 10.48550/arxiv.2306.05524
Mahyoob, M., Algaraady, J., & Alblwi, A. (2023, August 1). A Proposed Framework for Human-like Language Processing of ChatGPT in Academic Writing. https://doi.org/10.3991/ijet.v18i14.41725
Markey, B., Brown, D., Laudenbach, M. & Kohler, A. (2024). Dense and Disconnected: Analyzing the Sedimented Style of ChatGPT-Generated Text at Scale. Written Communication, doi: 10.1177/07410883241263528
Mbwambo, N., & Kaaya, P. B. (2024). ChatGPT in Education: Applications, Concerns and Recommendations. Deleted Journal, 2(1), 107–124. https://doi.org/10.56279/jicts.v2i1.87
McCarthy, J. (2011). Reflective Writing, Higher Education and Professional Practice. Journal for Education in the Built Environment,.6(1):29-43. doi: 10.11120/JEBE.2011.06010029
Mondal, I. & Sancheti, A. (2024). How much reliable is ChatGPT's prediction on Information Extraction under Input Perturbations?. arXiv.org, abs/2404.05088 doi: 10.48550/arxiv.2404.05088
Motoki, F., Neto, V. & Rodrigues, V. (2023). More human than human: measuring ChatGPT political bias. Public Choice, 198(1-2):3-23. doi: 10.1007/s11127-023-01097-2
Nationality Bias in Text Generation. (2023). https://doi.org/10.48550/arxiv.2302.02463
Nepal, T. (2024). Exploring the Applications and Challenges of ChatGPT in Research and Academia: A Comprehensive Review. West Science Interdisciplinary Studies, 2(05):1043-1050. doi: 10.58812/wsis.v2i05.729
Ningsih, A. & Rohmah, G. (2024). Communication Styles of Human-Machine Interaction in the ChatGPT. Indonesian journal of EFL and linguistics, 161-178. doi: 10.21462/ijefl.v9i1.767
Ortega-Martín, M., García-Sierra, O., Ardoiz, A., Álvarez, J., Armenteros, J. & Alonso, A. (2023). Linguistic ambiguity analysis in ChatGPT. arXiv.org, abs/2302.06426 doi: 10.48550/arXiv.2302.06426
Osama, M., Afridi, S., & Maaz, M. (2023). ChatGPT: Transcending Language Limitations in Scientific Research Using Artificial Intelligence. https://doi.org/10.29271/jcpsp.2023.10.1198
Ouyang, S., Jie, M., Mark, Z. & Wang, H. M. (2024). An Empirical Study of the Non-determinism of ChatGPT in Code Generation. ACM Transactions on Software Engineering and Methodology, doi: 10.1145/3697010
Özer, M. (2024). Impact of ChatGPT on Scientific Writing. İnsan ve Toplum Dergisi, 14(3), 210-217. doi: 10.12658/e0002
Patil, P. D., Kulkarni, K. & Sharma, P. (2024). Algorithmic Issues, Challenges, and Theoretical Concerns of ChatGPT. Advances in computational intelligence and robotics book series, 56-74. doi: 10.4018/979-8-3693-6824-4.ch003
Pokhrel, S., & Banjade, S. R. (2023). AI Content Generation Technology based on Open AI Language Model. December. https://doi.org/10.36548/jaicn.2023.4.006
Qiu, Z., Duan, X., & Cai, Z. G. (2023, May 12). Pragmatic Implicature Processing in ChatGPT.  PsyArXiv Preprints. https://doi.org/10.31234/osf.io/qtbh9
Rensburg, H., & Reedy, N. (2024). ChatGPT in Higher Education: A Comprehensive Study of the Literature. Friend or Foe? European Conference on e-Learning, 23(1). doi: 10.34190/ecel.23.1.3126
Shrivastava, A. (2024). Transformative landscape of ChatGPT in higher education: Unveiling the academic discourse. Global Knowledge, Memory and Communication. doi: 10.1108/gkmc-02-2024-0068
Silva, T., Haiter-Neto, F., & Oliveira, M. (2023). ChatGPT and scientific writing: A reflection on the ethical boundaries. Imaging Science in Dentistry, 53(2), 175-176. doi: 10.5624/isd.20230085
Sinha, K., Gauthier, J., Mueller, A., Misra, K., Fuentes, K., Levy, R., & Williams, A. (2022). Language model acceptability judgements are not always robust to context. Annual Meeting of the Association for Computational Linguistics. doi: 10.48550/arXiv.2212.08979
Smith, A., & Jones, B. (2020). Academic Integrity and Professional Preparedness: The Role of Writing in Higher Education. International Journal of Academic Integrity, 16(2), 45-60. doi: 10.53623/apga.v3i1.337
Suárez, A., García, V. D.-F., Algar, J., Sánchez, M. G., Pedro, M. L. de, & Freire, Y. (2023). Unveiling the ChatGPT phenomenon: Evaluating the consistency and accuracy of endodontic question answers. International Endodontic Journal, 57(1), 108-113. doi: 10.1111/iej.13985
Tai, A., Meyer, M., Varidel, M., Prodan, A., Vogel, M., Iorfino, F., & Krausz, R. (2024). Exploring the potential and limitations of ChatGPT for academic peer-reviewed writing: Addressing linguistic injustice and ethical concerns. Journal of Academic Language & Learning, 17(1), T16-T30.
Tang, C. M., & Chaw, L. Y. (2024). Student Learning Performance Evaluation: Mitigating the Challenges of Generative AI Chatbot Misuse in Student Assessments. European Conference on E-Learning, 23(1), 357-364. doi: 10.34190/ecel.23.1.2567
Ullah, M., Bin Naeem, S., & Kamel Boulos, M. N. (2024). Assessing the Guidelines on the Use of Generative Artificial Intelligence Tools in Universities: Results of a Survey of the World’s Top 50 Universities. Preprints. doi: 10.20944/preprints202411.1411.v1
Ventura, M., & Filho, A. (2023). ChatGPT: Limitations, challenges and potential applications. Brazilian Journal of Science, 3(1), 65-68. doi: 10.14295/bjs.v3i1.427
Warr, M., Oster, N., & Isaac, R. (2024). Implicit bias in large language models: Experimental proof and implications for education. Journal of Research on Technology in Education, 1-24. doi: 10.1080/15391523.2024.2395295
Wen, Z., & Younes, R. (2023). ChatGPT vs. media bias: A comparative study of GPT-3.5 and fine-tuned language models. Applied and Computational Engineering. doi: 10.54254/2755-2721/21/20231153
Woloshyn, V., Illingworth, S., & Obradović-Ratković, S. (2024). Introduction to Special Issue: Expanding Landscapes of Academic Writing in Academia. Brock Education Journal, 33(1), 3-9. doi: 10.26522/brocked.v33i1.1118
Wu, Y. (2024). Evaluating ChatGPT: Strengths and Limitations in NLP Problem Solving. Highlights in Science, Engineering and Technology, 94, 319-325. doi: 10.54097/z15ne349
Yazid, S., & Dzulfikri, D. (2024). ChatGPT, smart writing assistant chatbot for students: An analysis of its drawbacks. Abjadia, 9(1), 41-56. doi: 10.18860/abj.v9i1.24459
Yıldırım, F. Ç., & Tellou, S. (2024). ChatGPT’s Perception of Context and Speech Acts. Advances in Business Strategy and Competitive Advantage Book Series. doi: 10.4018/979-8-3693-3498-0.ch014
Zhang, Y. (2024). Teaching to Avoid Plagiarism in Academic Writing. In Understanding-Oriented Pedagogy to Strengthen Plagiarism-Free Academic Writing. Springer, Singapore. doi: 10.1007/978-981-99-9844-9_3