How I refactor large codebases with ChatGPT
Refactoring a large codebase can be daunting, especially when it spans hundreds of files. ChatGPT excels at tweaking individual files, but can it handle restructuring an entire project? Yes, kind of.
ChatGPT has a finite context window, meaning it can only process a limited amount of text at a time, so you cannot simply paste in the whole project. The workaround is to break the task into two parts: refactoring file contents and reorganizing class structures.
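To get a feel for the context limit before pasting anything, a rough size check helps. The sketch below is only an approximation: it assumes about four characters per token, a common rule of thumb for English text and not an exact tokenizer count. The function names, the `context_limit` value you pass in, and the `reserved_for_reply` default are all hypothetical choices for illustration.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4 chars/token ratio is an assumed rule of thumb."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, context_limit: int, reserved_for_reply: int = 1000) -> bool:
    """Return True if the text plausibly fits while leaving room for the reply.

    context_limit is the model's window size in tokens; look it up for the
    model you are using, since it differs between models.
    """
    return estimate_tokens(text) <= context_limit - reserved_for_reply
```

If the check fails, that is a sign to send a condensed summary of the project (such as a class list) and not the raw source.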
Below I'll focus on reorganizing class structures:
To propose a new structure, ChatGPT needs to understand three key aspects of your project:
1. Current Naming Convention: Use Doxygen to document your existing class structure. Doxygen supports C++, C, Python, PHP, Java, C#, Objective-C, Fortran, VHDL, Splice, IDL, and Lex. Generate the HTML output and paste the class list `<table>` from `html/annotated.html` into ChatGPT.
2. Business Domain: Provide a concise description of your project. A well-written README file is often sufficient for this purpose.
3. Organizational Approach: Specify your preferred structuring approach, such as "Implement Domain-Driven Design (DDD) best practices."
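Pasting the raw `<table>` works, but the HTML markup uses up context. As one possible shortcut, the sketch below reduces the class list to plain text and combines the three inputs into a single prompt. It assumes each row of `annotated.html` has a `td` with class `entry` containing a link with the class name, and a `td` with class `desc` containing the brief description; the markup may differ between Doxygen versions, so check your own output first. `ClassListParser` and `build_prompt` are hypothetical names, not part of any library.

```python
from html.parser import HTMLParser


class ClassListParser(HTMLParser):
    """Collects (name, description) pairs from td.entry / td.desc cells (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished (name, description) pairs
        self._cell = None      # "entry", "desc", or None
        self._in_link = False  # True while inside <a> within an entry cell
        self._name = []
        self._desc = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            classes = (dict(attrs).get("class") or "").split()
            if "entry" in classes:
                self._cell = "entry"
            elif "desc" in classes:
                self._cell = "desc"
        elif tag == "a" and self._cell == "entry":
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "td":
            self._cell = None
        elif tag == "tr":
            name = "".join(self._name).strip()
            if name:
                desc = " ".join("".join(self._desc).split())
                self.rows.append((name, desc))
            self._name, self._desc = [], []

    def handle_data(self, data):
        if self._cell == "entry" and self._in_link:
            self._name.append(data)   # skip icon letters outside the link
        elif self._cell == "desc":
            self._desc.append(data)


def build_prompt(rows, project_description, approach):
    """Combine class list, business domain, and organizational approach."""
    classes = [f"- {name}: {desc}" if desc else f"- {name}" for name, desc in rows]
    parts = [
        "Project description:",
        project_description.strip(),
        "",
        "Current classes:",
        *classes,
        "",
        f"Propose a new folder and namespace structure. Approach: {approach}",
    ]
    return "\n".join(parts)
```

To use it, read `html/annotated.html` into a string, call `feed()` on a `ClassListParser` instance, then pass `parser.rows`, your README text, and your preferred approach to `build_prompt`.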
Expect to refine the structure ChatGPT generates multiple times, incorporating additional insights into your business domain with each iteration.
Once you are satisfied, ask ChatGPT to output the structure in a format you can use, such as an indented file structure for a C# project. You can also ask for a script, like a .bat file, to automate the creation of these folders and files.
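If you would rather create the folders yourself than run a generated script, an indented outline is easy to turn into directories and empty files. The following is a minimal sketch, not the script ChatGPT would produce. It assumes an outline format of my own choosing: two spaces per nesting level, with directory names ending in `/`. The names `parse_tree` and `create_tree` are hypothetical.

```python
from pathlib import Path


def parse_tree(outline: str, indent: int = 2) -> list[str]:
    """Turn an indented outline into relative paths; directories keep a trailing '/'."""
    paths = []
    stack = []  # directory names leading down to the current depth
    for raw in outline.splitlines():
        if not raw.strip():
            continue
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        name = raw.strip()
        is_dir = name.endswith("/")
        name = name.rstrip("/")
        del stack[depth:]  # drop directories deeper than this line
        paths.append("/".join(stack + [name]) + ("/" if is_dir else ""))
        if is_dir:
            stack.append(name)
    return paths


def create_tree(paths: list[str], root: Path) -> None:
    """Create each directory and empty file under root; existing ones are left alone."""
    for rel in paths:
        target = root / rel
        if rel.endswith("/"):
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.touch(exist_ok=True)
```

Review the parsed paths before calling `create_tree`, and point `root` at an empty scratch directory first, since the outline comes from generated text.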
This method has been effective for me, though I'm sure it can be automated further.
Here’s a crypto portfolio tracker I wrote and refactored with ChatGPT using this method: https://github.com/DavidVeksler/Crypto-Portfolio-Tracker