Produces comprehensive summaries and insights about legacy codebases to help understand unfamiliar code. Use when onboarding to a new project, planning refactoring efforts, assessing code for acquisition/migration, or generating documentation for undocumented systems. Analyzes architecture, dependencies, code quality issues, and test coverage. Creates high-level overviews with architecture diagrams, key components, entry points, and actionable insights for understanding and improving legacy code.
Install with Tessl CLI
npx tessl i github:ArabelaTso/Skills-4-SE --skill legacy-code-summarizer
Analyze and summarize legacy codebases to quickly understand their structure, quality, and improvement opportunities.
This skill helps understand legacy code by:
Get an overview of the project structure and size.
Initial Questions:
Commands to Run:
```bash
# Count lines of code (null-delimited so paths with spaces work; one total even for many files)
find . -name "*.py" -print0 | xargs -0 cat | wc -l    # Python
find . -name "*.java" -print0 | xargs -0 cat | wc -l  # Java
# Count files
find . -name "*.py" | wc -l
find . -name "*.java" | wc -l
# Directory structure
tree -L 3 -I '__pycache__|node_modules|target|build'
# Or without tree command
find . -type d -not -path '*/\.*' | head -20
```
Identify Project Type:
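One way to identify the project type is to check the repository root for well-known build and manifest files. The sketch below is a minimal illustration: the `MARKERS` mapping and the `detect_project_types` function are hypothetical names, and the list of marker files is an assumption drawn from the files this document already mentions, not an exhaustive catalogue.

```python
from typing import Dict, Iterable, List

# Marker files commonly found at the root of a project. This mapping is an
# illustrative assumption, not an exhaustive or authoritative list.
MARKERS: Dict[str, str] = {
    "setup.py": "Python (setuptools)",
    "pyproject.toml": "Python (pyproject)",
    "requirements.txt": "Python (pip requirements)",
    "pom.xml": "Java (Maven)",
    "build.gradle": "Java (Gradle)",
    "package.json": "JavaScript/TypeScript (npm)",
}


def detect_project_types(root_files: Iterable[str]) -> List[str]:
    """Return the project types suggested by the given root-level file names."""
    present = set(root_files)
    # Iterate over MARKERS (not the input) so the output order is deterministic.
    return [label for name, label in MARKERS.items() if name in present]
```

A typical call would pass the root directory listing, for example `detect_project_types(os.listdir("."))`. More than one result usually means a mixed-language repository.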
Find where execution starts and main workflows.
Common Entry Points:
Python:
```bash
# Find main entry points (matches either quote style)
grep -rE "if __name__ == ['\"]__main__['\"]" --include="*.py"
# Find Flask/Django apps
grep -r "app = Flask\|application = " --include="*.py"
grep -r "INSTALLED_APPS\|MIDDLEWARE" --include="*.py"
# Find CLI entry points (setup.py, pyproject.toml); a missing file is skipped silently
grep -A 10 "entry_points\|console_scripts\|\[project\.scripts\]" setup.py pyproject.toml 2>/dev/null
```
Java:
```bash
# Find main methods
grep -r "public static void main" --include="*.java"
# Find Spring Boot applications
grep -r "@SpringBootApplication" --include="*.java"
# Find servlets
grep -r "extends HttpServlet\|@WebServlet" --include="*.java"
```
JavaScript/TypeScript:
```bash
# Check package.json for entry points
grep -A 5 '"main"\|"scripts"' package.json
# Find Express apps
grep -r "express()" --include="*.js" --include="*.ts"
# Find React entry points (skipping installed packages)
find . \( -name "index.js" -o -name "index.tsx" -o -name "App.js" \) -not -path "*/node_modules/*"
```
Understand the high-level structure and key modules.
Analyze Directory Structure:
```bash
# List top-level directories
ls -d */ | head -20
# Common patterns to look for:
# - src/ or lib/ (source code)
# - tests/ or test/ (test files)
# - config/ (configuration)
# - docs/ (documentation)
# - scripts/ (utility scripts)
# - models/ or entities/ (data models)
# - views/ or templates/ (UI)
# - controllers/ or handlers/ (business logic)
# - services/ or api/ (external services)
# - utils/ or helpers/ (utilities)
```
Identify Architecture Pattern:
Common patterns in legacy code:
See references/architecture_patterns.md for detailed pattern identification.
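As a first guess before consulting the reference, top-level directory names can hint at the pattern in use. The following is a rough sketch only: `PATTERN_HINTS` and `guess_patterns` are hypothetical names, and the directory-to-pattern rules are assumptions based on the common directory names listed earlier, so a match is a prompt for closer reading rather than a conclusion.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical heuristics: a pattern is suggested only when all of its
# characteristic directory names are present. Real projects vary widely.
PATTERN_HINTS: Dict[str, Set[str]] = {
    "MVC": {"models", "views", "controllers"},
    "Layered (service/repository)": {"services", "repositories"},
    "API-centric": {"routes", "middleware"},
}


def guess_patterns(directories: Iterable[str]) -> List[str]:
    """Return the pattern labels whose required directories all exist."""
    # Normalize "Models/" and "models" to the same key.
    names = {d.strip("/").lower() for d in directories}
    return [label for label, required in PATTERN_HINTS.items() if required <= names]
```

An empty result is common in legacy code and often means the structure has to be inferred from imports rather than from folder names.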
Create Architecture Diagram:
Example Web Application Architecture:
```
┌─────────────────────────────────────────┐
│  Frontend (React)                       │
│  - components/                          │
│  - pages/                               │
│  - hooks/                               │
└───────────────┬─────────────────────────┘
                │ API Calls
                ↓
┌─────────────────────────────────────────┐
│  API Layer (Flask/Express)              │
│  - routes/                              │
│  - middleware/                          │
└───────────────┬─────────────────────────┘
                │
                ↓
┌─────────────────────────────────────────┐
│  Business Logic                         │
│  - services/                            │
│  - controllers/                         │
└───────────────┬─────────────────────────┘
                │
                ↓
┌─────────────────────────────────────────┐
│  Data Layer                             │
│  - models/                              │
│  - repositories/                        │
└───────────────┬─────────────────────────┘
                │
                ↓
┌─────────────────────────────────────────┐
│  Database (PostgreSQL/MongoDB)          │
└─────────────────────────────────────────┘
```
Map module relationships and identify coupling issues.
Find Direct Dependencies:
Python:
```bash
# Find imports in all Python files
grep -rh "^import \|^from " --include="*.py" | sort | uniq
# Analyze requirements
cat requirements.txt
# Or from setup.py
grep -A 20 "install_requires" setup.py
```
Java:
```bash
# Analyze Maven dependencies
grep -A 3 "<dependency>" pom.xml
# Or Gradle
grep -A 3 "implementation\|compile" build.gradle
# Find imports in code
grep -rh "^import " --include="*.java" | sort | uniq | head -50
```
JavaScript:
```bash
# Analyze package.json
grep -A 50 "dependencies" package.json
# Find imports
grep -rh "^import \|require(" --include="*.js" --include="*.ts" | head -50
```
Create Dependency Map:
```
Key Internal Dependencies:

auth module
├─ depends on: user_model, database, config
└─ used by: api_routes, admin_panel

user_model
├─ depends on: database, validators
└─ used by: auth, profile, admin

payment module
├─ depends on: user_model, external_api, logger
└─ used by: checkout, subscription

Circular dependencies detected:
⚠️ module_a → module_b → module_c → module_a
```
See references/dependency_analysis.md for tools and techniques.
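For Python code, a dependency map like the one above can be approximated by parsing imports with the standard `ast` module and then searching the resulting graph for a cycle. This is a sketch under simplifying assumptions: `build_import_graph` and `find_cycle` are hypothetical helpers, modules are identified by flat names (no package-relative resolution), and only one cycle is reported.

```python
import ast
from typing import Dict, List, Optional, Set


def build_import_graph(sources: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each module name to the set of internal modules it imports."""
    graph: Dict[str, Set[str]] = {name: set() for name in sources}
    for name, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                targets = [node.module]
            else:
                continue
            # Keep only imports that point at modules in this codebase.
            graph[name].update(t for t in targets if t in sources and t != name)
    return graph


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of modules, or None if there is none."""
    path: List[str] = []    # modules on the current depth-first path
    done: Set[str] = set()  # modules whose imports are fully explored

    def visit(module: str) -> Optional[List[str]]:
        if module in path:
            # Reached a module already on the path: that closes a cycle.
            return path[path.index(module):] + [module]
        if module in done:
            return None
        path.append(module)
        for dep in sorted(graph[module]):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        done.add(module)
        return None

    for module in sorted(graph):
        cycle = visit(module)
        if cycle:
            return cycle
    return None
```

The `sources` argument maps a module name to its source text, which keeps the functions independent of how files are discovered on disk.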
Detect technical debt, code smells, and improvement opportunities.
Common Quality Issues to Look For:
1. Large Files (God Objects)
```bash
# Find files over 500 lines
find . -name "*.py" -exec wc -l {} \; | awk '$1 > 500' | sort -rn
# Find files over 1000 lines (serious issue)
find . -name "*.java" -exec wc -l {} \; | awk '$1 > 1000' | sort -rn
```
2. Dead Code
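For Python, unused imports can also be approximated with the standard library alone, which helps when installing extra tools is not an option. The sketch below uses a hypothetical `unused_imports` helper; it assumes a name counts as used only if it appears as an identifier in the same file, so re-exports via `__all__` and names referenced only inside strings would be reported as false positives.

```python
import ast
from typing import Dict, List


def unused_imports(source: str) -> List[str]:
    """Return names bound by import statements that are never referenced."""
    tree = ast.parse(source)
    imported: Dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = node.lineno
    # Every identifier read or written anywhere in the file.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    # Wildcard imports bind unknown names, so they are skipped.
    return sorted(name for name in imported if name not in used and name != "*")
```

Results are best treated as candidates for review rather than as a list of safe deletions.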
```bash
# Find unused imports (Python - requires tools)
# Install: pip install autoflake
find . -name "*.py" -exec autoflake --check {} \;
# Find TODO/FIXME comments
grep -rn "TODO\|FIXME\|HACK\|XXX" --include="*.py" --include="*.java"
```
3. Code Duplication
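The idea behind duplicate detection can be illustrated with a small sliding-window comparison. This is a simplified sketch, not how pylint or PMD work internally: `find_duplicate_blocks` is a hypothetical helper that compares runs of consecutive non-blank lines after stripping indentation, and the default window of 4 lines is an arbitrary assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_duplicate_blocks(
    files: Dict[str, str], window: int = 4
) -> Dict[Tuple[str, ...], List[Tuple[str, int]]]:
    """Find runs of `window` consecutive non-blank lines that occur more than once.

    Returns a mapping from the repeated block (a tuple of stripped lines) to the
    (file name, 1-based starting line) of each occurrence.
    """
    seen: Dict[Tuple[str, ...], List[Tuple[str, int]]] = defaultdict(list)
    for name, text in files.items():
        # Drop blank lines and indentation so trivially reformatted copies match.
        lines = [(i, ln.strip()) for i, ln in enumerate(text.splitlines(), 1) if ln.strip()]
        for start in range(len(lines) - window + 1):
            chunk = lines[start:start + window]
            block = tuple(ln for _, ln in chunk)
            seen[block].append((name, chunk[0][0]))
    return {block: places for block, places in seen.items() if len(places) > 1}
```

A larger window reduces noise from short, naturally repeated snippets such as import headers.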
```bash
# Find duplicate code (requires tool)
# Install: pip install pylint
pylint --disable=all --enable=duplicate-code src/
# Or use PMD for Java
# pmd cpd --minimum-tokens 100 --files src/
```
4. Complex Functions
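For Python, function length can be measured directly from the syntax tree instead of by eye. The sketch below assumes Python 3.8 or later (for `end_lineno`); `long_functions` is a hypothetical helper, and the default threshold of 50 lines is an arbitrary assumption to tune per project. Length is only a proxy for complexity, so it complements a cyclomatic-complexity tool rather than replacing one.

```python
import ast
from typing import List, Tuple


def long_functions(source: str, max_lines: int = 50) -> List[Tuple[str, int, int]]:
    """Return (name, start line, length) for functions longer than max_lines."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is populated by ast.parse on Python 3.8 and later.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                results.append((node.name, node.lineno, length))
    # Longest functions first, since they are the likeliest refactoring targets.
    return sorted(results, key=lambda item: item[2], reverse=True)
```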
```bash
# Find long functions (crude check - look for large blocks)
# Python: Look for functions with many lines between def and next def
# Java: Look for methods with many lines between { and }
# Use complexity tools for accurate analysis:
# Python: radon cc src/ -a
# Java: Use PMD or Checkstyle
```
5. Missing Documentation
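For Python, parsing the code is more reliable than text search, because it handles nested functions, methods, and multi-line signatures. A minimal sketch using the standard `ast` module follows; `missing_docstrings` is a hypothetical helper name.

```python
import ast
from typing import List, Tuple


def missing_docstrings(source: str) -> List[Tuple[str, int]]:
    """Return (name, line) for functions and classes that have no docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_docstring returns None when the body does not start with a string.
            if ast.get_docstring(node) is None:
                missing.append((node.name, node.lineno))
    return sorted(missing, key=lambda item: item[1])
```

Dividing the number of results by the total number of definitions gives the kind of percentage used in the summary template (for example, "60% of functions lack docstrings").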
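```bash
# Find functions without docstrings (Python); pylint reports each one precisely
pylint --disable=all --enable=missing-function-docstring src/
# Crude text-only alternative: prints each top-level def and its next line, minus docstring lines
grep -A 1 "^def " --include="*.py" -r . | grep -v '"""' | grep -v "'''"
# Find classes without documentation (Java); crude, only inspects the line directly above
grep -B 1 "^public class\|^class " --include="*.java" -r . | grep -v "/\*\*" | grep -v "//"
```
6. Outdated Patterns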
Look for:
- Python 2 syntax (`print "hello"`, `raw_input()`)

See references/code_quality_checklist.md for comprehensive quality checks.
Identify testing gaps and quality of existing tests.
Find Tests:
```bash
# Python tests
find . -name "test_*.py" -o -name "*_test.py"
ls tests/ test/
# Java tests
find . -name "*Test.java" -o -name "*Tests.java"
ls src/test/
# JavaScript tests
find . -name "*.test.js" -o -name "*.spec.js" -o -name "*.test.ts"
```
Calculate Test Coverage:
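Before running a coverage tool, the Python test naming conventions (`test_*.py`, `*_test.py`) give a rough first signal of which modules have no tests at all. The sketch below is an approximation under stated assumptions: `modules_without_tests` is a hypothetical helper, it matches purely on file names, and it says nothing about how thoroughly a module is exercised.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def modules_without_tests(paths: Iterable[str]) -> List[str]:
    """Return source files with no test_<name>.py or <name>_test.py counterpart."""
    sources, tested = [], set()
    for path in paths:
        p = PurePosixPath(path)
        if p.suffix != ".py":
            continue
        stem = p.stem
        if stem.startswith("test_"):
            tested.add(stem[len("test_"):])
        elif stem.endswith("_test"):
            tested.add(stem[: -len("_test")])
        elif stem not in ("__init__", "conftest"):
            # Package markers and pytest configuration are not modules under test.
            sources.append(path)
    return sorted(s for s in sources if PurePosixPath(s).stem not in tested)
```

The input can be the output of `find . -name "*.py"` split into lines; real coverage numbers still have to come from the tools listed next.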
Python:
```bash
# Install coverage tool
pip install pytest-cov
# Run tests with coverage
pytest --cov=src --cov-report=term-missing
# Generate HTML report
pytest --cov=src --cov-report=html
open htmlcov/index.html
```
Java:
```bash
# Maven with JaCoCo
mvn clean test jacoco:report
# View report
open target/site/jacoco/index.html
```
JavaScript:
```bash
# Jest with coverage
npm test -- --coverage
# View report
open coverage/lcov-report/index.html
```
Assess Test Quality:
Quality Checklist:
- [ ] Unit tests exist for core business logic
- [ ] Integration tests cover key workflows
- [ ] Tests are readable and maintainable
- [ ] Tests run quickly (< 10 seconds for unit tests)
- [ ] Mocking is used appropriately
- [ ] Edge cases are tested
- [ ] Tests don't depend on external services (or are mocked)
- [ ] Coverage > 70% for critical modules

Create actionable documentation for the team.
Summary Template:
# Legacy Codebase Summary: [Project Name]
## Executive Summary
[2-3 sentence overview of what the codebase does]
**Key Metrics:**
- Lines of Code: [X]
- Number of Files: [Y]
- Primary Language: [Language]
- Test Coverage: [Z%]
- Last Major Update: [Date]
## Architecture Overview
### High-Level Structure
[Include architecture diagram from Step 3]
### Key Components
1. **[Component Name]** (`path/to/component/`)
- **Purpose:** [What it does]
- **Entry Point:** [Main file/class]
- **Dependencies:** [Key dependencies]
- **Lines of Code:** [X]
2. **[Component Name]** (`path/to/component/`)
- **Purpose:** [What it does]
- **Entry Point:** [Main file/class]
- **Dependencies:** [Key dependencies]
- **Lines of Code:** [X]
[Repeat for 5-10 key components]
### Technology Stack
**Core Technologies:**
- [Language] [Version]
- [Framework] [Version]
- [Database] [Version]
**Key Dependencies:**
- [Library 1] - [Purpose]
- [Library 2] - [Purpose]
- [Library 3] - [Purpose]
## Entry Points and Workflows
### Main Entry Points
1. **[Entry Point Name]** - `path/to/file.py:function()`
- **Purpose:** [What it does]
- **Triggered by:** [User action, cron, API call, etc.]
2. **[Entry Point Name]** - `path/to/file.java:main()`
- **Purpose:** [What it does]
- **Triggered by:** [How it's invoked]
### Critical Workflows
**Workflow 1: [Name]** (e.g., User Registration)
[Step-by-step flow]
**Workflow 2: [Name]** (e.g., Payment Processing)
[Step-by-step flow]
## Dependency Analysis
### External Dependencies
**Total Dependencies:** [X]
**Outdated Dependencies (require updates):**
- [Library Name] [Current Version] → [Latest Version]
- [Library Name] [Current Version] → [Latest Version]
**Deprecated Dependencies (require replacement):**
- [Library Name] - Deprecated since [Date]
- **Suggested Replacement:** [New Library]
### Internal Dependencies
**Highly Coupled Modules (>5 dependencies):**
- `module_a` - depends on [X] modules
- `module_b` - depends on [Y] modules
**Circular Dependencies:**
- ⚠️ `auth` → `user` → `auth`
- ⚠️ `order` → `payment` → `order`
## Code Quality Assessment
### Metrics Summary
- **Average File Size:** [X] lines
- **Largest File:** `path/to/file.py` ([X] lines) ⚠️
- **TODO/FIXME Comments:** [X] occurrences
- **Code Duplication:** [Low/Medium/High]
### Quality Issues
**Critical Issues (Fix Immediately):**
1. **Security Vulnerability:** SQL injection in `path/to/file.py:45`
2. **Large File:** `god_class.java` (2,500 lines) - violates SRP
3. **Circular Dependency:** [Details]
**High Priority (Address Soon):**
1. **No Error Handling:** Missing try/catch in payment module
2. **Hardcoded Credentials:** Found in `config/settings.py`
3. **Deprecated API:** Using old authentication library
**Medium Priority (Technical Debt):**
1. **Code Duplication:** Copy-pasted validation logic in 5 files
2. **Missing Documentation:** 60% of functions lack docstrings
3. **Long Methods:** 15 methods exceed 100 lines
**Low Priority (Improvements):**
1. **Outdated Naming:** Inconsistent variable names
2. **Missing Type Hints:** (Python) or generics (Java)
3. **Verbose Code:** Could be simplified with modern patterns
### Code Smells Detected
- **God Objects:** [List large classes/modules]
- **Feature Envy:** [Methods accessing other objects' data frequently]
- **Dead Code:** [Unused functions/classes]
- **Magic Numbers:** [Hardcoded values without constants]
## Test Coverage Analysis
### Coverage Summary
- **Overall Coverage:** [X%]
- **Critical Modules Coverage:**
- auth module: [Y%]
- payment module: [Z%]
- user management: [W%]
### Testing Gaps
**Untested Critical Code:**
1. `payment/processor.py` - 0% coverage ⚠️
2. `auth/security.py` - 30% coverage
3. `api/routes.py` - 45% coverage
**Missing Test Types:**
- [ ] No integration tests for payment flow
- [ ] No end-to-end tests for user journey
- [ ] No performance/load tests
### Test Quality Issues
- **Slow Tests:** 20 tests take >5 seconds each
- **Flaky Tests:** `test_async_operation` fails intermittently
- **Coupled Tests:** Tests depend on database state
## Recommendations
### Immediate Actions (This Sprint)
1. **Fix Security Issues**
- Patch SQL injection vulnerability in `auth/login.py`
- Remove hardcoded credentials, use environment variables
2. **Add Critical Tests**
- Write integration tests for payment processor
- Add unit tests for authentication logic
3. **Break Circular Dependencies**
- Refactor `auth` ↔ `user` circular dependency
- Extract shared code to new `common` module
### Short-Term Improvements (This Quarter)
1. **Reduce Technical Debt**
- Refactor `god_class.java` into 3-4 focused classes
- Eliminate code duplication in validation logic
- Update deprecated dependencies
2. **Improve Documentation**
- Add docstrings to all public functions
- Create architecture diagram
- Document deployment process
3. **Enhance Test Coverage**
- Achieve 70% coverage for core modules
- Add integration tests for critical workflows
- Set up CI/CD with automated testing
### Long-Term Improvements (This Year)
1. **Architectural Refactoring**
- Extract microservices for payment and notification
- Implement proper layering (separate business logic from data access)
- Introduce dependency injection for better testability
2. **Modernization**
- Upgrade to [Language] [Latest Version]
- Adopt modern patterns (async/await, type hints, etc.)
- Migrate from [Old Framework] to [New Framework]
3. **Quality Infrastructure**
- Set up automated code quality checks (linting, complexity analysis)
- Implement pre-commit hooks
- Add performance monitoring
## Quick Reference
### Key Files to Understand First
1. `path/to/main.py` - Application entry point
2. `path/to/config.py` - Configuration
3. `path/to/models/user.py` - Core data model
4. `path/to/api/routes.py` - API endpoints
5. `path/to/services/auth_service.py` - Authentication logic
### Common Commands
```bash
# Start application
[command]
# Run tests
[command]
# Build for production
[command]
# Deploy
[command]
```

## Summary Output Examples
### Example 1: Small Python Flask App
```markdown
# Legacy Codebase Summary: Internal Dashboard
## Executive Summary
Internal dashboard for monitoring application metrics, built with Flask.
Provides real-time data visualization and alerting for operations team.
**Key Metrics:**
- Lines of Code: 3,500
- Number of Files: 42
- Primary Language: Python 3.7
- Test Coverage: 45%
- Last Major Update: 18 months ago
## Architecture Overview
Simple Flask application with SQLAlchemy ORM and PostgreSQL database.

┌─────────────────┐
│  Flask Routes   │
│  (app/routes/)  │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│    Services     │
│ (app/services/) │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│     Models      │
│  (app/models/)  │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│  PostgreSQL DB  │
└─────────────────┘
### Key Components
1. **Metrics Dashboard** (`app/routes/dashboard.py`)
- Purpose: Display real-time metrics
- Entry Point: `dashboard_view()`
- Dependencies: metrics_service, chart_generator
- Lines of Code: 250
2. **Data Collection** (`app/services/collector.py`)
- Purpose: Fetch metrics from external APIs
- Entry Point: `collect_metrics()` (cron job)
- Dependencies: requests, database models
- Lines of Code: 180
3. **Alert System** (`app/services/alerts.py`)
- Purpose: Send notifications when thresholds exceeded
- Entry Point: `check_alerts()` (background task)
- Dependencies: email_service, metrics_service
- Lines of Code: 150
## Recommendations
### Immediate Actions
1. Update Flask to latest version (security patches)
2. Add tests for alert system (currently 0% coverage)
3. Fix hardcoded database credentials
### Short-Term
1. Increase test coverage to 70%
2. Add API documentation
3. Refactor large dashboard route (300+ lines)
```

### Example 2: Large Java E-Commerce Platform

```markdown
# Legacy Codebase Summary: E-Commerce Platform
## Executive Summary
Full-featured e-commerce platform handling product catalog, orders, payments,
and customer management. Serves 100K+ daily active users.
**Key Metrics:**
- Lines of Code: 185,000
- Number of Files: 1,240
- Primary Language: Java 8
- Test Coverage: 62%
- Last Major Update: 6 months ago
## Architecture Overview
Layered Spring Boot application with microservice patterns emerging.
[Detailed architecture diagram showing layers]
### Critical Issues Identified
**High Priority:**
1. **Memory Leak:** Order processing service shows increasing heap usage
2. **N+1 Query Problem:** Product listing generates 500+ DB queries
3. **No Monitoring:** Missing APM tools for production
**Modernization Opportunities:**
1. Migrate to Java 17 (LTS)
2. Extract payment service as microservice
3. Implement caching layer (Redis)
## Recommendations
[Detailed phased approach to refactoring]
```

- references/architecture_patterns.md - Common architectural patterns in legacy systems and how to identify them
- references/dependency_analysis.md - Tools and techniques for analyzing module dependencies and coupling
- references/code_quality_checklist.md - Comprehensive checklist for assessing code quality and technical debt

| Task | Command/Approach |
|---|---|
| Count LOC | `find . -name "*.py" -print0 \| xargs -0 cat \| wc -l` |
| Find entry points | `grep -r "if __name__ == '__main__'"` |
| Analyze imports | `grep -rh -e "^import " -e "^from " \| sort \| uniq` |
| Find large files | `find . -name "*.py" -exec wc -l {} \; \| sort -rn` |
| Test coverage | `pytest --cov=src --cov-report=term` |
| Find TODOs | `grep -rn -e "TODO" -e "FIXME"` |