CodeChat is an AI-powered code analysis tool that uses Claude (Anthropic's LLM) to help developers understand codebases, generate documentation, and analyze GitHub issues. It converts code repositories and issues into a structured XML format that is easier for the LLM to process, and provides an interactive CLI for querying the codebase.
./chat/
├── CodeToXML.py # Converts code repositories to XML format
├── IssueToXML.py # Converts GitHub issues to XML format
├── XMLUtil.py # Handles XML processing and compression
├── LLM.py # Main Claude API interface and conversation management
├── LLM_utils.py # Helper functions for LLM interactions
├── constants.py # System configuration and prompts
└── main.py # CLI application entry point
anthropic>=0.7.0
requests
nltk
xml.etree.ElementTree (part of the Python standard library; no installation needed)
Environment variables:
- github_token: GitHub Personal Access Token for repository access
- Clone the repository
- Install required packages:
pip install anthropic requests nltk
- Set up environment variables:
export github_token='your_github_token'
- Run the application:
python main.py
- When prompted, provide either:
  - A local path to a code repository
  - A GitHub repository URL
- Available commands:
  - generate documentation: Automatically generate Markdown documentation
  - investigate issue: Analyze a GitHub issue (requires issue URL)
  - Any custom question about the code
  - exit: Quit the application
- Supports both local and GitHub repositories
- Converts code to structured XML format
- Maintains conversation context for follow-up questions
- Automatically generates comprehensive documentation
- Includes code structure and functionality analysis
- Provides usage instructions and requirements
- Analyzes issue content and comments
- Extracts and processes code snippets
- Provides resolution suggestions
- Compresses code content to fit LLM context windows
- Removes unnecessary whitespace and stop words
- Maintains code structure and meaning
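The compression step described above can be sketched roughly as follows. This is a minimal illustration, not the code in XMLUtil.py: the function name `compress_text` and the small inline stop-word set are assumptions, chosen so the example runs without downloading any NLTK data.

```python
# Hypothetical, deliberately tiny stop-word set so the sketch runs offline.
# A fuller list could come from nltk.corpus.stopwords.words("english").
STOP_WORDS = frozenset({"a", "an", "the", "is", "are", "of", "to", "and", "in", "that"})


def compress_text(text: str, stop_words: frozenset = STOP_WORDS) -> str:
    """Collapse whitespace and drop stop words, keeping the other tokens in order."""
    tokens = text.split()  # split() with no argument also collapses runs of whitespace
    kept = [t for t in tokens if t.lower() not in stop_words]
    return " ".join(kept)


sample = "This   is a helper\n\n that returns   the sum of   two numbers"
compressed = compress_text(sample)
print(compressed)  # This helper returns sum two numbers
```

A sketch like this only suits prose such as comments or issue text. Collapsing whitespace this way would destroy indentation in Python source, so code content would need separate handling to keep its structure.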
The application includes error handling for:
- Invalid repository paths/URLs
- GitHub API rate limiting
- XML parsing errors
- LLM API communication issues
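As an illustration of the rate-limiting case, GitHub's REST API reports the remaining quota in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers. The helper below is a sketch built on those headers. `seconds_until_reset` is a hypothetical name, not a function from this project, and it takes a plain header mapping so it can be tried without network access.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, or 0.0 if quota remains.

    Header names must match exactly when a plain dict is passed in.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))  # Unix epoch seconds
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


# Made-up header values for demonstration
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"}
print(seconds_until_reset(exhausted, now=1700000000.0))  # 60.0
```

With `requests`, the `headers` attribute of a response could be passed in directly, for example after the API rejects a call because the quota is used up.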
<source type="[github_repository|local_directory]" url/path="...">
<file name="...">
[file_content]
</file>
...
</source>
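A document of this shape could be produced with the standard-library `xml.etree.ElementTree` module. The sketch below is an assumption about how that might look, not the code in CodeToXML.py. `build_source_xml` is a hypothetical name, and the choice between the `url` and `path` attributes follows the template above.

```python
import xml.etree.ElementTree as ET


def build_source_xml(source_type: str, location: str, files: dict[str, str]) -> str:
    """Build a <source> element with one <file> child per entry in `files`."""
    # The template allows either a url or a path attribute, depending on the source type.
    attr = "url" if source_type == "github_repository" else "path"
    root = ET.Element("source", {"type": source_type, attr: location})
    for name, content in files.items():
        file_el = ET.SubElement(root, "file", {"name": name})
        file_el.text = content  # ElementTree escapes <, > and & on serialization
    return ET.tostring(root, encoding="unicode")


xml_text = build_source_xml(
    "local_directory",
    "./chat",
    {"example.py": "if a < b and b > 0:\n    print('ok')"},
)
print(xml_text)
```

Assigning file content to `.text` rather than concatenating strings leaves the escaping of special characters to the library, so source code containing `<` or `&` still yields well-formed XML.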
<source type="github_issue" url="...">
<issue_info>
<title>...</title>
<description>...</description>
<comments>
<comment>
<author>...</author>
<content>...</content>
<code_snippet>...</code_snippet>
</comment>
</comments>
</issue_info>
</source>
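To show how the issue layout above might be consumed, the following sketch parses a made-up sample and collects the code snippets attached to comments. The sample data, the placeholder URL, and the function name `extract_snippets` are all illustrative assumptions, not part of the project.

```python
import xml.etree.ElementTree as ET

# Made-up sample that follows the issue layout shown above
SAMPLE = """<source type="github_issue" url="https://example.com/issue">
  <issue_info>
    <title>Crash on empty input</title>
    <description>The parser fails when the file is empty.</description>
    <comments>
      <comment>
        <author>alice</author>
        <content>Reproduced on my machine.</content>
        <code_snippet>parse("")</code_snippet>
      </comment>
    </comments>
  </issue_info>
</source>"""


def extract_snippets(xml_text: str) -> list[tuple[str, str]]:
    """Return (author, code_snippet) pairs for every comment that has a snippet."""
    root = ET.fromstring(xml_text)
    pairs = []
    for comment in root.iter("comment"):
        snippet = comment.findtext("code_snippet")
        if snippet:
            pairs.append((comment.findtext("author", default=""), snippet))
    return pairs


print(extract_snippets(SAMPLE))  # [('alice', 'parse("")')]
```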
- File type restrictions: Only processes certain file extensions (.py, .txt, .md, .html, .json, .yaml)
- GitHub API rate limiting may affect repository access
- Large repositories may require additional processing time
- Context window limitations may affect very large files
- GitHub tokens should be kept secure
- Local file access is restricted to specified file types
- XML processing includes escape character handling
- API keys should be properly managed through environment variables
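The file-type restriction listed above could be enforced with a simple extension allow-list. This is a sketch only: `ALLOWED_EXTENSIONS` mirrors the extensions named in this README, while `iter_supported_files` is a hypothetical helper, not a function from the project.

```python
import tempfile
from pathlib import Path
from typing import Iterator

ALLOWED_EXTENSIONS = {".py", ".txt", ".md", ".html", ".json", ".yaml"}


def iter_supported_files(root: Path) -> Iterator[Path]:
    """Yield files under `root` whose extension is in the allow-list, in sorted order."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in ALLOWED_EXTENSIONS:
            yield path


# Small demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "main.py").write_text("print('hi')\n")
    (base / "logo.png").write_bytes(b"\x89PNG")
    (base / "docs").mkdir()
    (base / "docs" / "notes.md").write_text("# Notes\n")
    found = [p.relative_to(base).as_posix() for p in iter_supported_files(base)]

print(found)  # ['docs/notes.md', 'main.py']
```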