CREF: An LLM-Based Conversational Software Repair Framework for Programming Tutors

Boyang Yang, Haoye Tian, Weiguo Pian, Haoran Yu, Haitao Wang, Jacques Klein, Tegawendé F. Bissyandé, Shunfu Jin*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

13 Citations (Scopus)

Abstract

With the proven effectiveness of Large Language Models (LLMs) in code-related tasks, researchers have explored their potential for program repair. However, existing repair benchmarks might have influenced LLM training data, potentially causing data leakage. To evaluate LLMs' realistic repair capabilities, (i) we introduce an extensive, non-crawled benchmark, TutorCode, comprising 1,239 defective C++ programs and associated information such as tutor guidance, solution descriptions, failing test cases, and the corrected code. Our work assesses LLMs' repair performance on TutorCode, measuring repair correctness (TOP-5 and AVG-5) and patch precision (RPSR). (ii) We then provide a comprehensive investigation into which types of extra information can help LLMs improve their repair performance. Among these types, tutor guidance was the most effective. To fully harness LLMs' conversational capabilities and the benefits of augmented information, (iii) we introduce CREF, a novel conversational semi-automatic repair framework assisting human programming tutors. It demonstrates a remarkable AVG-5 improvement of 17.2%-24.6% over the baseline, achieving an impressive AVG-5 of 76.6% when utilizing GPT-4. These results highlight the potential for enhancing LLMs' repair capabilities through tutor interactions and historical conversations. The successful application of CREF in a real-world educational setting demonstrates its effectiveness in reducing tutors' workload and improving students' learning experience, and shows promise for code review and other software engineering tasks.
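The abstract describes a conversational repair loop: an LLM proposes a patch, failing tests and tutor guidance are fed back as further conversation turns, and the process repeats until the tests pass. The following is a minimal sketch of that loop, not the paper's actual implementation; all function names and data shapes here are illustrative, and the LLM call is replaced by a trivial stub that applies the latest tutor-suggested fix.

```python
def run_tests(code, tests):
    """Return the names of failing tests (stub test harness)."""
    return [t["name"] for t in tests if not t["check"](code)]

def llm_propose_patch(conversation):
    """Placeholder for an LLM call: reuse the most recent suggested fix,
    falling back to the student's original code if no hint is available."""
    for turn in reversed(conversation):
        if "fix" in turn:
            return turn["fix"]
    return conversation[0]["code"]

def conversational_repair(buggy_code, tests, tutor_hints, max_rounds=5):
    """Iteratively repair `buggy_code`, feeding test failures and tutor
    guidance back into the conversation each round."""
    conversation = [{"role": "student", "code": buggy_code}]
    code = buggy_code
    for round_no in range(max_rounds):
        failures = run_tests(code, tests)
        if not failures:
            return code, round_no  # all tests pass: repaired
        # Feed failing tests back; attach a tutor hint when one is available.
        turn = {"role": "tests", "failures": failures}
        if tutor_hints:
            hint = tutor_hints.pop(0)
            turn = {"role": "tutor", "guidance": hint["text"], "fix": hint["fix"]}
        conversation.append(turn)
        code = llm_propose_patch(conversation)
    return code, max_rounds
```

With a real LLM in place of the stub, each round's prompt would include the full conversation history, which is the "historical conversations" benefit the abstract refers to.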

Original language: English
Title of host publication: ISSTA 2024 - Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis
Editors: Maria Christakis, Michael Pradel
Publisher: ACM
Pages: 882-894
Number of pages: 13
ISBN (Electronic): 979-8-4007-0612-7
Publication status: Published - 11 Sept 2024
MoE publication type: A4 Conference publication
Event: ACM SIGSOFT International Symposium on Software Testing and Analysis - Vienna, Austria
Duration: 16 Sept 2024 - 20 Sept 2024
Conference number: 33

Conference

Conference: ACM SIGSOFT International Symposium on Software Testing and Analysis
Abbreviated title: ISSTA
Country/Territory: Austria
City: Vienna
Period: 16/09/2024 - 20/09/2024

Keywords

  • Large Language Model
  • Open Source
  • Program Repair
