Researchers From China Propose a Generate-and-Edit Approach that Utilizes Execution Results of the Generated Code from LLMs to Improve the Code Quality within the Competitive Programming Task

Researchers draw inspiration from the process of human programming to help LLMs perform better on competitive programming tasks. The competitive programming task has recently been applied to large language models. It requires comprehending a complex natural language description of a problem with example test cases and accurately implementing solutions that may span hundreds of lines. Solutions are evaluated by executing them on hidden test cases. However, current LLMs’ accuracy and pass rates on this task remain low. For example, on the widely used APPS benchmark for competitive programming, even a model as strong as GPT3 scores only 7% accuracy.

When solving competitive programming problems, programmers typically write an initial program, run it on a few sample test cases, and then revise the code based on the test results. During this step, the programmer can use vital information from the test results to debug the program. The researchers implement this idea with a comparable workflow built around a neural-based editor. They examined code produced by a pre-trained LLM and found that many of the generated programs could be fixed with small adjustments.

They observe that the error message pinpoints the coding fault, allowing the issue to be corrected rapidly. This motivates them to explore editing methods that improve the quality of code produced by LLMs with the help of execution outcomes. In this study, researchers from Peking University propose a novel generate-and-edit approach to improve LLMs at competitive programming tasks. Their method uses the capabilities of LLMs in three phases to emulate the behavior of the human programmers described above (a minimal code sketch of the full pipeline follows the list):

  1. Generation using LLMs. They generate the program from the problem description, using large language models as black-box generators.
  2. Execution. They run the generated code on the sample test case to obtain the execution results. They also design templates that format the execution results as supplementary comments, incorporating additional useful information for the editing step.
  3. Edit. They develop a fault-aware neural code editor that refines the code, taking the generated code and the supplementary comments as input. Their code editor aims to raise the quality and accuracy of LLM-based code production.
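To make the three phases concrete, here is a minimal Python sketch of how such a generate-and-edit pipeline could be wired together. It is a simplified stand-in under assumptions, not the paper’s implementation: `llm_generate` and `editor_edit` are hypothetical callables standing in for the black-box LLM and the fault-aware editor, and the comment template is only a rough approximation of the paper’s execution-result templates.

```python
import subprocess
import sys
import tempfile
from typing import Callable

def run_on_sample_test(code: str, test_input: str, timeout: float = 5.0) -> dict:
    """Phase 2: execute a candidate program on a sample test case and capture the outcome."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], input=test_input,
            capture_output=True, text=True, timeout=timeout,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr}
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "TIMEOUT"}

def build_fault_comment(result: dict, expected: str) -> str:
    """Template the execution result as a supplementary comment for the editor
    (a rough approximation of the paper's templates)."""
    if result["stderr"].strip():
        return "# Execution error: " + result["stderr"].strip().splitlines()[-1]
    if result["stdout"].strip() != expected.strip():
        return (f"# Wrong answer on sample test: expected {expected.strip()!r}, "
                f"got {result['stdout'].strip()!r}")
    return "# Sample test passed."

def generate_and_edit(
    problem: str, sample_in: str, sample_out: str,
    llm_generate: Callable[[str], str],           # phase 1: black-box LLM generator (hypothetical)
    editor_edit: Callable[[str, str, str], str],  # phase 3: fault-aware editor (hypothetical)
) -> str:
    code = llm_generate(problem)                       # generation
    result = run_on_sample_test(code, sample_in)       # execution
    comment = build_fault_comment(result, sample_out)  # templated execution feedback
    return editor_edit(problem, code, comment)         # fault-aware editing
```

Note that the editor sees the problem, the original program, and the execution feedback together, so small faults flagged by an error message can be repaired without regenerating the whole program.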

They conduct in-depth experiments on the public competitive programming benchmarks APPS and HumanEval. To demonstrate generality, they apply their methodology to nine well-known LLMs with parameter counts ranging from 110M to 175B. Their strategy dramatically improves LLM performance: their method raises the average pass@1 on APPS-dev and APPS-test by 89% and 31%, respectively. Even for the largest language model used, GPT3-175B, their small editor model increases pass@1 from 26.6% to 32.4% on APPS-dev. They also demonstrate the transferability of their method on an out-of-distribution benchmark, improving average pass@1 by 48% on HumanEval, a dataset of a different style.
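For context, the pass@1 figures above use the standard pass@k metric for code generation: the expected fraction of problems for which at least one of k sampled programs passes all hidden tests. The widely used unbiased estimator below is standard background from the Codex/HumanEval evaluation methodology (Chen et al., 2021), not code from this paper.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k for one problem, given n sampled
    programs of which c pass all hidden tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k = 1 this reduces to c / n, e.g. 5 correct programs out of 20 samples:
print(pass_at_k(n=20, c=5, k=1))  # 0.25
```

The benchmark-level score is this quantity averaged over all problems, which is why an editor that repairs near-miss programs can move average pass@1 substantially.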

Various methods for post-processing programs produced by LLMs have recently been proposed. These methods sample extensively from the LLM, rerank the sampled programs, and output the final program. Their strategy, in contrast, offers two advantages. First, it keeps the sample budget constant and drastically lowers the computational burden on the LLMs. Second, their editor modifies the programs directly and outperforms these reranking-based techniques, particularly under a constrained sample budget such as pass@1. To the best of their knowledge, they are the first to apply an editing-based post-processing technique to competitive programming.
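For contrast, a generic execution-based reranking baseline looks roughly like the sketch below: draw many samples from the LLM and keep the candidate that fares best on the visible sample tests. This is a hedged illustration of the reranking family (in the spirit of filtering approaches such as AlphaCode’s), with `llm_generate`, `visible_tests`, and `passes_test` as hypothetical helpers; it is not an implementation from the paper.

```python
from typing import Callable, List, Tuple

def rerank_by_visible_tests(
    problem: str,
    llm_generate: Callable[[str], str],                   # hypothetical black-box LLM
    visible_tests: List[Tuple[str, str]],                 # (input, expected output) pairs
    passes_test: Callable[[str, Tuple[str, str]], bool],  # hypothetical test runner
    n_samples: int = 100,
) -> str:
    """Sample many programs, score each by visible-test passes, return the best."""
    candidates = [llm_generate(problem) for _ in range(n_samples)]  # n_samples LLM calls
    return max(candidates, key=lambda code: sum(passes_test(code, t) for t in visible_tests))
```

The cost difference is visible in the sampling loop: reranking spends `n_samples` LLM calls per problem, whereas the generate-and-edit approach uses a single generation plus one call to a much smaller editor model, keeping the sample budget constant.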

The following is a list of their contributions:

• To produce high-quality code for difficult programming tasks, they propose a generate-and-edit method for large language models.

• They develop a fault-aware neural code editor that takes error messages and the generated code as input to improve the code’s accuracy and quality.

• They conduct experiments with two well-known datasets and nine LLMs to show the effectiveness and applicability of their strategy.


Check out the Paper. Don’t forget to join our 21k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com



Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence from the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.


