Split the 630-line README.md into focused, well-organized documentation: - README.md: Concise overview with quick start and links - docs/INSTALLATION.md: Installation instructions and setup - docs/CONFIGURATION.md: Configuration options and custom categories - docs/USAGE.md: Command-line options and usage examples - docs/HOW_IT_WORKS.md: Architecture and internal processes - docs/TROUBLESHOOTING.md: Common issues and solutions - docs/DEVELOPMENT.md: Project structure and development guide - docs/CONTRIBUTING.md: Contribution guidelines and standards Benefits: - Main README is now clean and welcoming (~150 lines vs 630) - Each doc has a clear, focused purpose - Better navigation with cross-linking between docs - Follows GitHub best practices with docs/ directory - Easier to maintain and update specific sections
503 lines
12 KiB
Markdown
503 lines
12 KiB
Markdown
# Development Guide
|
|
|
|
This guide is for developers who want to understand NoEntropy's codebase, contribute features, or extend functionality.
|
|
|
|
## Project Structure
|
|
|
|
NoEntropy follows a clean modular architecture for better maintainability and testability:
|
|
|
|
```
|
|
noentropy/
|
|
├── .github/
|
|
│ └── workflows/
|
|
│ └── rust.yml # CI/CD workflow
|
|
├── docs/ # Documentation files
|
|
│ ├── INSTALLATION.md
|
|
│ ├── CONFIGURATION.md
|
|
│ ├── USAGE.md
|
|
│ ├── HOW_IT_WORKS.md
|
|
│ ├── TROUBLESHOOTING.md
|
|
│ ├── DEVELOPMENT.md # This file
|
|
│ └── CONTRIBUTING.md
|
|
├── src/
|
|
│ ├── cli/
|
|
│ │ ├── mod.rs # CLI module exports
|
|
│ │ ├── args.rs # Command-line argument definitions
|
|
│ │ └── orchestrator.rs # Organization & undo orchestration
|
|
│ ├── files/
|
|
│ │ ├── mod.rs # File module exports
|
|
│ │ ├── batch.rs # File batch processing
|
|
│ │ ├── detector.rs # File type detection
|
|
│ │ ├── mover.rs # File moving operations
|
|
│ │ └── undo.rs # Undo file operations
|
|
│ ├── gemini/
|
|
│ │ ├── mod.rs # Gemini API module exports
|
|
│ │ ├── client.rs # Gemini API client
|
|
│ │ ├── errors.rs # Gemini error handling
|
|
│ │ ├── prompt.rs # AI prompt construction
|
|
│ │ └── types.rs # Gemini API types
|
|
│ ├── models/
|
|
│ │ ├── mod.rs # Data models exports
|
|
│ │ ├── metadata.rs # File metadata structures
|
|
│ │ ├── move_record.rs # File move tracking
|
|
│ │ └── organization.rs # Organization plan structures
|
|
│ ├── settings/
|
|
│ │ ├── mod.rs # Settings module exports
|
|
│ │ ├── config.rs # Configuration management
|
|
│ │ ├── prompt.rs # Interactive configuration prompts
|
|
│ │ └── tests.rs # Settings tests
|
|
│ ├── storage/
|
|
│ │ ├── mod.rs # Storage module exports
|
|
│ │ ├── cache.rs # Caching system
|
|
│ │ └── undo_log.rs # Undo log management
|
|
│ ├── main.rs # Application entry point
|
|
│ └── lib.rs # Library exports
|
|
├── Cargo.toml # Dependencies and project metadata
|
|
├── Cargo.lock # Dependency lock file
|
|
├── config.example.toml # Configuration template
|
|
└── README.md # Project overview
|
|
```
|
|
|
|
## Module Overview
|
|
|
|
### cli/
|
|
**Purpose**: Command-line interface and orchestration logic
|
|
|
|
- **args.rs**: Defines CLI arguments using `clap` crate
|
|
- Command-line flags and options
|
|
- Argument validation
|
|
- Help text generation
|
|
|
|
- **orchestrator.rs**: Main orchestration logic
|
|
- Coordinates file scanning, categorization, and moving
|
|
- Handles undo operations
|
|
- Manages user interaction and confirmation
|
|
|
|
### files/
|
|
**Purpose**: File detection, batching, moving, and undo operations
|
|
|
|
- **batch.rs**: Batch file processing
|
|
- Groups files for processing
|
|
- Manages concurrent operations
|
|
- Handles batch-level errors
|
|
|
|
- **detector.rs**: File type detection
|
|
- Identifies text files that can be inspected
|
|
- Determines file categories by extension
|
|
- Supports 30+ text file formats
|
|
|
|
- **mover.rs**: File moving operations
|
|
- Moves files to destination folders
|
|
- Creates directories as needed
|
|
- Handles file operation errors
|
|
|
|
- **undo.rs**: Undo functionality
|
|
- Reverses file moves
|
|
- Handles conflicts and edge cases
|
|
- Cleans up empty directories
|
|
|
|
### gemini/
|
|
**Purpose**: Google Gemini API integration
|
|
|
|
- **client.rs**: API client implementation
|
|
- Makes HTTP requests to Gemini API
|
|
- Handles authentication
|
|
- Manages response parsing
|
|
|
|
- **errors.rs**: Error types and handling
|
|
- Custom error types for API failures
|
|
- Error conversion and propagation
|
|
- User-friendly error messages
|
|
|
|
- **prompt.rs**: Prompt engineering
|
|
- Constructs prompts for categorization
|
|
- Handles custom vs default categories
|
|
- Formats file lists for API
|
|
|
|
- **types.rs**: API request/response types
|
|
- Serialization/deserialization structures
|
|
- Type-safe API communication
|
|
- JSON parsing
|
|
|
|
### models/
|
|
**Purpose**: Core data structures
|
|
|
|
- **metadata.rs**: File metadata
|
|
- File size, modification time
|
|
- Used for cache invalidation
|
|
- Lightweight file tracking
|
|
|
|
- **move_record.rs**: File move tracking
|
|
- Records source and destination paths
|
|
- Tracks move status (completed/undone/failed)
|
|
- Used by undo system
|
|
|
|
- **organization.rs**: Organization plan
|
|
- Represents categorization decisions
|
|
- Maps files to destination paths
|
|
- Used for preview and execution
|
|
|
|
### settings/
|
|
**Purpose**: Configuration management
|
|
|
|
- **config.rs**: Configuration loading/saving
|
|
- Reads from TOML file
|
|
- Validates configuration
|
|
- Provides default values
|
|
|
|
- **prompt.rs**: Interactive setup
|
|
- Prompts user for missing config
|
|
- Validates user input
|
|
- Saves configuration
|
|
|
|
- **tests.rs**: Configuration tests
|
|
- Unit tests for config parsing
|
|
- Validation tests
|
|
- Edge case handling
|
|
|
|
### storage/
|
|
**Purpose**: Persistent data layer
|
|
|
|
- **cache.rs**: Caching system
|
|
- Stores API responses
|
|
- Implements expiry logic (7 days)
|
|
- LRU eviction when limit reached
|
|
|
|
- **undo_log.rs**: Undo log management
|
|
- Persists move records
|
|
- Implements retention policy (30 days)
|
|
- Handles log corruption
|
|
|
|
## Building and Testing
|
|
|
|
### Build in Debug Mode
|
|
|
|
For development with debug symbols:
|
|
```bash
|
|
cargo build
|
|
```
|
|
|
|
Binary location: `./target/debug/noentropy`
|
|
|
|
### Build in Release Mode
|
|
|
|
Optimized build for production:
|
|
```bash
|
|
cargo build --release
|
|
```
|
|
|
|
Binary location: `./target/release/noentropy`
|
|
|
|
### Run Tests
|
|
|
|
Run all unit tests:
|
|
```bash
|
|
cargo test
|
|
```
|
|
|
|
Run tests with output:
|
|
```bash
|
|
cargo test -- --nocapture
|
|
```
|
|
|
|
Run specific test:
|
|
```bash
|
|
cargo test test_name
|
|
```
|
|
|
|
### Run with Debug Output
|
|
|
|
Enable verbose logging:
|
|
```bash
|
|
RUST_LOG=debug cargo run
|
|
```
|
|
|
|
### Run Clippy (Linting)
|
|
|
|
Check for common mistakes and code quality:
|
|
```bash
|
|
cargo clippy
|
|
```
|
|
|
|
Fix auto-fixable issues:
|
|
```bash
|
|
cargo clippy --fix
|
|
```
|
|
|
|
### Check Code
|
|
|
|
Fast check without building:
|
|
```bash
|
|
cargo check
|
|
```
|
|
|
|
### Format Code
|
|
|
|
Format code according to Rust standards:
|
|
```bash
|
|
cargo fmt
|
|
```
|
|
|
|
Check formatting without changing files:
|
|
```bash
|
|
cargo fmt -- --check
|
|
```
|
|
|
|
## Key Dependencies
|
|
|
|
NoEntropy uses these main dependencies (see `Cargo.toml` for full list):
|
|
|
|
- **clap**: Command-line argument parsing
|
|
- **tokio**: Async runtime for concurrent operations
|
|
- **reqwest**: HTTP client for API requests
|
|
- **serde**: Serialization/deserialization
|
|
- **serde_json**: JSON parsing
|
|
- **toml**: TOML configuration parsing
|
|
- **directories**: Cross-platform directory paths
|
|
|
|
## Development Workflow
|
|
|
|
### 1. Set Up Development Environment
|
|
|
|
```bash
|
|
# Clone repository
|
|
git clone https://github.com/glitchySid/noentropy.git
|
|
cd noentropy
|
|
|
|
# Build project
|
|
cargo build
|
|
|
|
# Run tests
|
|
cargo test
|
|
|
|
# Check code quality
|
|
cargo clippy
|
|
```
|
|
|
|
### 2. Create Feature Branch
|
|
|
|
```bash
|
|
git checkout -b feature/your-feature-name
|
|
```
|
|
|
|
### 3. Make Changes
|
|
|
|
- Write code following Rust conventions
|
|
- Add tests for new functionality
|
|
- Update documentation as needed
|
|
- Run `cargo fmt` before committing
|
|
|
|
### 4. Test Changes
|
|
|
|
```bash
|
|
# Run tests
|
|
cargo test
|
|
|
|
# Test manually with your changes
|
|
cargo run -- --dry-run
|
|
|
|
# Check for issues
|
|
cargo clippy
|
|
```
|
|
|
|
### 5. Commit and Push
|
|
|
|
```bash
|
|
git add .
|
|
git commit -m "Add feature: description"
|
|
git push origin feature/your-feature-name
|
|
```
|
|
|
|
### 6. Create Pull Request
|
|
|
|
- Go to GitHub and create a pull request
|
|
- Describe your changes clearly
|
|
- Reference any related issues
|
|
- Wait for review and feedback
|
|
|
|
## Code Style Guidelines
|
|
|
|
### Rust Conventions
|
|
|
|
- Follow standard Rust naming conventions
|
|
- Use `snake_case` for functions and variables
|
|
- Use `PascalCase` for types and traits
|
|
- Use `SCREAMING_SNAKE_CASE` for constants
|
|
|
|
### Documentation
|
|
|
|
- Add doc comments (`///`) for public APIs
|
|
- Explain complex logic with inline comments
|
|
- Update relevant documentation files
|
|
|
|
### Error Handling
|
|
|
|
- Use `Result` types for operations that can fail
|
|
- Provide meaningful error messages
|
|
- Avoid panicking in library code
|
|
|
|
### Testing
|
|
|
|
- Write unit tests for new functionality
|
|
- Test error cases and edge conditions
|
|
- Use descriptive test names
|
|
|
|
## Common Development Tasks
|
|
|
|
### Adding a New CLI Flag
|
|
|
|
1. Edit `src/cli/args.rs`
|
|
2. Add new field to `Args` struct
|
|
3. Add clap attribute for the flag
|
|
4. Update `orchestrator.rs` to use the flag
|
|
|
|
### Adding a New File Type
|
|
|
|
1. Edit `src/files/detector.rs`
|
|
2. Add extension to relevant detection function
|
|
3. Add tests for the new file type
|
|
|
|
### Modifying AI Prompts
|
|
|
|
1. Edit `src/gemini/prompt.rs`
|
|
2. Update prompt construction logic
|
|
3. Test with various file types
|
|
4. Consider token limits
|
|
|
|
### Adding New Configuration Options
|
|
|
|
1. Edit `src/settings/config.rs`
|
|
2. Add field to `Config` struct
|
|
3. Update serialization/deserialization
|
|
4. Add validation if needed
|
|
5. Update `config.example.toml`
|
|
6. Update documentation
|
|
|
|
## Debugging Tips
|
|
|
|
### Enable Detailed Logging
|
|
|
|
```bash
|
|
RUST_LOG=debug cargo run
|
|
```
|
|
|
|
### Use Rust Backtrace
|
|
|
|
```bash
|
|
RUST_BACKTRACE=1 cargo run
|
|
```
|
|
|
|
Full backtrace:
|
|
```bash
|
|
RUST_BACKTRACE=full cargo run
|
|
```
|
|
|
|
### Debug Specific Modules
|
|
|
|
```bash
|
|
RUST_LOG=noentropy::gemini=debug cargo run
|
|
```
|
|
|
|
### Print Debugging
|
|
|
|
Use `dbg!()` macro for quick debugging:
|
|
```rust
|
|
dbg!(&some_variable);
|
|
```
|
|
|
|
## Performance Profiling
|
|
|
|
### Basic Timing
|
|
|
|
```bash
|
|
time cargo run --release
|
|
```
|
|
|
|
### CPU Profiling (Linux)
|
|
|
|
```bash
|
|
cargo install flamegraph
|
|
sudo cargo flamegraph
|
|
```
|
|
|
|
### Memory Profiling
|
|
|
|
```bash
|
|
cargo install cargo-instruments
|
|
cargo instruments -t Allocations
|
|
```
|
|
|
|
## Continuous Integration
|
|
|
|
NoEntropy uses GitHub Actions for CI/CD (see `.github/workflows/rust.yml`):
|
|
|
|
- Runs on push and pull requests
|
|
- Tests on Linux, macOS, and Windows
|
|
- Checks formatting, linting, and tests
|
|
- Builds release binaries
|
|
|
|
## Release Process
|
|
|
|
1. Update version in `Cargo.toml`
|
|
2. Update `CHANGELOG.md` (if exists)
|
|
3. Create git tag: `git tag v1.0.0`
|
|
4. Push tag: `git push origin v1.0.0`
|
|
5. GitHub Actions builds and publishes release
|
|
|
|
## Architecture Decisions
|
|
|
|
### Why Rust?
|
|
|
|
- Performance: Fast file operations and concurrent processing
|
|
- Safety: Memory safety without garbage collection
|
|
- Ecosystem: Great libraries for CLI, HTTP, and serialization
|
|
|
|
### Why Tokio?
|
|
|
|
- Async/await support for concurrent API calls
|
|
- Efficient handling of I/O operations
|
|
- Industry-standard async runtime
|
|
|
|
### Why Gemini API?
|
|
|
|
- Powerful language understanding
|
|
- Good at semantic file categorization
|
|
- Reasonable rate limits and pricing
|
|
|
|
### Why Caching?
|
|
|
|
- Reduces API costs
|
|
- Improves performance for repeated runs
|
|
- Minimizes redundant analysis
|
|
|
|
## Future Enhancements
|
|
|
|
Ideas for future development:
|
|
|
|
- [ ] Custom AI model support (OpenAI, Claude, etc.)
|
|
- [ ] GUI version (desktop application)
|
|
- [ ] Watch mode (automatic organization on file creation)
|
|
- [ ] Plugin system for custom organization rules
|
|
- [ ] Dry-run improvements with more detailed analysis
|
|
- [ ] Better conflict resolution strategies
|
|
- [ ] Multi-folder support
|
|
- [ ] Rule-based organization (in addition to AI)
|
|
|
|
## Getting Help
|
|
|
|
- Read existing documentation
|
|
- Check GitHub issues for similar questions
|
|
- Ask in GitHub Discussions
|
|
- Contact maintainers
|
|
|
|
## Resources
|
|
|
|
- [Rust Book](https://doc.rust-lang.org/book/)
|
|
- [Tokio Tutorial](https://tokio.rs/tokio/tutorial)
|
|
- [Clap Documentation](https://docs.rs/clap/)
|
|
- [Serde Guide](https://serde.rs/)
|
|
|
|
---
|
|
|
|
[Back to Main README](../README.md) | [Contributing Guide](CONTRIBUTING.md)
|