feat: Add comprehensive improvements - CLI, error handling, and docs

- Add CLI argument parsing with clap (dry-run, max-concurrent options)
- Replace .env configuration with interactive prompts and TOML config
- Add BaseDirs-based configuration storage in ~/.config/noentropy/
- Improve Gemini API client with configurable model and timeout
- Add concurrent processing with semaphore for rate limiting
- Improve error handling with retry logic and exponential backoff
- Add comprehensive README with installation and usage instructions
- Add config.example.toml template for users
- Update main.rs with better UX and colored output
- Add lib.rs exports for config module
- Refactor error response parsing for cleaner code
- Update API endpoint to use configurable model parameter
- Add proper error type handling in gemini_errors.rs
2025-12-29 00:11:27 +05:30
parent bbf88fc4fc
commit 3cdcd33439
6 changed files with 530 additions and 92 deletions

README.md Normal file

@@ -0,0 +1,368 @@
# NoEntropy 🗂️
> AI-powered file organizer that intelligently sorts your messy Downloads folder using the Google Gemini API
![Rust](https://img.shields.io/badge/rust-2024-orange)
![License](https://img.shields.io/badge/license-MIT-blue)
![Platform](https://img.shields.io/badge/platform-Linux%20%7C%20macOS%20%7C%20Windows-lightgrey)
## About
NoEntropy is a smart command-line tool that organizes your cluttered Downloads folder automatically. It uses Google's Gemini AI to analyze files, understand their content, and categorize them into organized folder structures. Say goodbye to manually sorting through hundreds of downloads!
### Use Cases
- 📂 Organize a messy Downloads folder
- 🤖 Auto-categorize downloaded files by type and content
- 🔍 Smart sub-folder creation based on file content
- 🚀 Batch file organization without manual effort
- 💾 Reduce clutter and improve file system organization
## Features
- **🧠 AI-Powered Categorization** - Uses Google Gemini API for intelligent file sorting
- **📁 Automatic Sub-Folders** - Creates relevant sub-folders based on file content analysis
- **💨 Smart Caching** - Minimizes API calls with metadata-based caching (7-day expiry)
- **⚡ Concurrent Processing** - Parallel file inspection with configurable limits
- **👀 Dry-Run Mode** - Preview changes without moving any files
- **🔄 Retry Logic** - Exponential backoff for resilient API handling
- **📝 Text File Support** - Inspects 30+ text formats for better categorization
- **✅ Interactive Confirmation** - Review organization plan before execution
- **🎯 Configurable** - Adjust concurrency limits and model settings
## Prerequisites
- **Rust 1.88 or later** (2024 edition; the code uses `let` chains)
- **Google Gemini API Key** - Get one at [https://ai.google.dev/](https://ai.google.dev/)
- A folder full of unorganized files to clean up!
## Installation
1. **Clone the repository**
```bash
git clone https://github.com/yourusername/noentropy.git
cd noentropy
```
2. **Build the application**
```bash
cargo build --release
```
3. **Run the application**
On first run, NoEntropy will guide you through interactive setup:
```bash
./target/release/noentropy
```
Or manually create config file at `~/.config/noentropy/config.toml`:
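```bash
# Create the config directory first; cp fails if it does not exist yet
mkdir -p ~/.config/noentropy
cp config.example.toml ~/.config/noentropy/config.toml
nano ~/.config/noentropy/config.toml
```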
## Configuration
NoEntropy stores configuration in `~/.config/noentropy/config.toml` following XDG Base Directory specifications.
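As a rough illustration of the XDG lookup mentioned above, the sketch below resolves the config path from `XDG_CONFIG_HOME` and `HOME`. The `config_path` helper is hypothetical and is not NoEntropy's actual code; the real implementation may resolve the directory differently, especially on macOS and Windows.

```rust
use std::path::{Path, PathBuf};

/// Resolve the config file location from explicit inputs (hypothetical helper).
/// `xdg_config_home` mirrors $XDG_CONFIG_HOME; `home` mirrors $HOME.
fn config_path(xdg_config_home: Option<&str>, home: &str) -> PathBuf {
    // The XDG spec falls back to $HOME/.config when the variable is unset or
    // empty, and treats relative paths as invalid.
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() && Path::new(dir).is_absolute() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".config"),
    };
    base.join("noentropy").join("config.toml")
}

fn main() {
    let xdg = std::env::var("XDG_CONFIG_HOME").ok();
    let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
    println!("{}", config_path(xdg.as_deref(), &home).display());
}
```

Taking the environment values as parameters keeps the lookup a pure function, which makes the fallback rules easy to check in isolation.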
### Configuration File Format
```toml
api_key = "your_api_key_here"
download_folder = "/home/user/Downloads"
```
| Setting | Description | Example |
|---------|-------------|---------|
| `api_key` | Your Google Gemini API key | `AIzaSy...` |
| `download_folder` | Path to folder to organize | `/home/user/Downloads` |
### Getting a Gemini API Key
1. Visit [Google AI Studio](https://ai.google.dev/)
2. Sign in with your Google account
3. Create a new API key
4. Copy the key to your configuration file
### Interactive Setup
NoEntropy provides an interactive setup on first run:
- **Missing API key?** → You'll be prompted to enter it
- **Missing download folder?** → You'll be prompted to specify it (with default suggestion)
- **Both missing?** → You'll be guided through complete setup
Configuration is automatically saved to `~/.config/noentropy/config.toml` after interactive setup.
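A prompt with a default suggestion, like the one described for the download folder, could look roughly like this. This is a minimal sketch, not the project's implementation: `prompt_with_default` and its signature are assumptions made for illustration.

```rust
use std::io::{self, BufRead, Write};

/// Ask for a value, falling back to `default` when the user just presses Enter.
/// Generic over reader and writer so it can be exercised without a terminal.
fn prompt_with_default<R: BufRead, W: Write>(
    input: &mut R,
    output: &mut W,
    label: &str,
    default: &str,
) -> io::Result<String> {
    write!(output, "{label} [{default}]: ")?;
    output.flush()?;
    let mut line = String::new();
    input.read_line(&mut line)?;
    let answer = line.trim();
    Ok(if answer.is_empty() {
        default.to_string()
    } else {
        answer.to_string()
    })
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let folder = prompt_with_default(
        &mut stdin.lock(),
        &mut io::stdout(),
        "Download folder",
        "/home/user/Downloads",
    )?;
    println!("Using {folder}");
    Ok(())
}
```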
## Usage
### Basic Usage
Organize your Downloads folder with default settings:
```bash
cargo run --release
```
### Dry-Run Mode
Preview what would happen without moving any files:
```bash
cargo run --release -- --dry-run
```
### Custom Concurrency
Adjust the number of concurrent API calls (default: 5):
```bash
cargo run --release -- --max-concurrent 10
```
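To show what a concurrency limit does in practice, here is a std-only sketch in which a counting semaphore caps the number of simulated requests in flight. It uses threads and a hand-rolled `Semaphore` purely so that it compiles without extra crates; the type, `run_bounded`, and the 10 ms sleep are illustrative assumptions, not NoEntropy's code.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal counting semaphore built on Mutex + Condvar (illustrative only).
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Return a permit and wake one waiting thread.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Run `jobs` simulated requests with at most `max_concurrent` in flight and
/// return the highest number observed running at once.
fn run_bounded(jobs: usize, max_concurrent: usize) -> usize {
    let semaphore = Arc::new(Semaphore::new(max_concurrent));
    let in_flight = Arc::new(Mutex::new((0usize, 0usize))); // (current, peak)
    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let semaphore = Arc::clone(&semaphore);
            let in_flight = Arc::clone(&in_flight);
            thread::spawn(move || {
                semaphore.acquire();
                {
                    let mut state = in_flight.lock().unwrap();
                    state.0 += 1;
                    state.1 = state.1.max(state.0);
                }
                thread::sleep(Duration::from_millis(10)); // stand-in for an API call
                in_flight.lock().unwrap().0 -= 1;
                semaphore.release();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let peak = in_flight.lock().unwrap().1;
    peak
}

fn main() {
    println!("peak concurrency: {}", run_bounded(20, 5));
}
```

Lowering the limit trades throughput for fewer simultaneous requests, which is the lever to pull when the API starts returning rate-limit errors.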
### Combined Options
Use multiple options together:
```bash
cargo run --release -- --dry-run --max-concurrent 3
```
### Command-Line Options
| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--dry-run` | `-d` | `false` | Preview changes without moving files |
| `--max-concurrent` | `-m` | `5` | Maximum concurrent API requests |
| `--help` | `-h` | - | Show help message |
| `--version` | `-V` | - | Show version |
## How It Works
NoEntropy follows a five-step process to organize your files:
```
┌───────────────────────────┐
│ 1. Scan Files             │ → Read all files in the configured download_folder
└─────────────┬─────────────┘
              ▼
┌───────────────────────────┐
│ 2. Initial Categorization │ → Ask Gemini to categorize by filename
└─────────────┬─────────────┘
              ▼
┌───────────────────────────┐
│ 3. Deep Inspection        │ → Read text files for sub-categories
│    (Concurrent)           │   • Reads file content
│                           │   • Asks AI for sub-folder
└─────────────┬─────────────┘
              ▼
┌───────────────────────────┐
│ 4. Preview & Confirm      │ → Show organization plan
│                           │   • Ask user approval
└─────────────┬─────────────┘
              ▼
┌───────────────────────────┐
│ 5. Execute Moves          │ → Move files to organized folders
└───────────────────────────┘
```
### Example Terminal Output
```bash
$ cargo run --release
Found 6 files. Asking Gemini to organize...
Gemini Plan received! Performing deep inspection...
Reading content of notes.txt...
Reading content of config.yaml...
Reading content of script.py...
Deep inspection complete! Moving Files.....
--- EXECUTION PLAN ---
Plan: image1.png -> Images/
Plan: document.pdf -> Documents/
Plan: setup.exe -> Installers/
Plan: notes.txt -> Documents/Notes/
Plan: config.yaml -> Code/Config/
Plan: script.py -> Code/Scripts/
Do you want to apply these changes? [y/N]: y
--- MOVING FILES ---
Moved: image1.png -> Images/
Moved: document.pdf -> Documents/
Moved: setup.exe -> Installers/
Moved: notes.txt -> Documents/Notes/
Moved: config.yaml -> Code/Config/
Moved: script.py -> Code/Scripts/
Organization Complete!
Files moved: 6, Errors: 0
Done!
```
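The `Category/Sub-folder/` targets in the output above can be thought of as a simple path join. The sketch below is one plausible way to build them; the `destination` helper and its rule that an empty sub-category maps straight into the category folder are assumptions, not behavior confirmed by the source.

```rust
use std::path::{Path, PathBuf};

/// Build the target path for a file (hypothetical helper).
/// Assumption: an empty sub-category places the file directly in the category folder.
fn destination(base: &Path, category: &str, sub_category: &str, filename: &str) -> PathBuf {
    let mut dir = base.join(category);
    let sub = sub_category.trim();
    if !sub.is_empty() {
        dir.push(sub);
    }
    dir.join(filename)
}

fn main() {
    let base = Path::new("/home/user/Downloads");
    let plan = [
        ("notes.txt", "Documents", "Notes"),
        ("image1.png", "Images", ""),
    ];
    for (file, category, sub) in plan {
        println!("Plan: {} -> {}", file, destination(base, category, sub, file).display());
    }
}
```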
## Supported Categories
NoEntropy organizes files into these categories:
| Category | Description |
|----------|-------------|
| **Images** | PNG, JPG, GIF, SVG, etc. |
| **Documents** | PDF, DOC, DOCX, TXT, MD, etc. |
| **Installers** | EXE, DMG, APP, PKG, etc. |
| **Music** | MP3, WAV, FLAC, M4A, etc. |
| **Archives** | ZIP, TAR, RAR, 7Z, etc. |
| **Code** | Source code and configuration files |
| **Misc** | Everything else |
## Supported Text Formats
NoEntropy can read and analyze the content of 30+ text file formats:
```
Source Code: rs, py, js, ts, jsx, tsx, java, go, c, cpp, h, hpp, rb, php, swift, kt, scala, lua, r, m
Web/Config: html, css, json, xml, yaml, yml, toml, ini, cfg, conf
Text/Scripts: txt, md, sql, sh, bat, ps1, log
```
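Detecting a text file by extension and reading a bounded sample of it might be sketched as follows. These are simplified stand-ins written for this README, not the project's `files` module: the extension list is a subset of the one above, and the lossy UTF-8 handling is an assumption.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

/// A subset of the extensions listed above, for illustration.
const TEXT_EXTENSIONS: &[&str] = &[
    "rs", "py", "js", "json", "yaml", "yml", "toml", "txt", "md", "sh", "log",
];

/// Case-insensitive extension check; files without an extension are not text.
fn is_text_file(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| TEXT_EXTENSIONS.contains(&ext.to_ascii_lowercase().as_str()))
        .unwrap_or(false)
}

/// Read at most `max_bytes` from the start of the file.
/// Invalid UTF-8 is replaced rather than treated as an error.
fn read_file_sample(path: &Path, max_bytes: u64) -> Option<String> {
    let mut buf = Vec::new();
    File::open(path)
        .ok()?
        .take(max_bytes)
        .read_to_end(&mut buf)
        .ok()?;
    Some(String::from_utf8_lossy(&buf).into_owned())
}

fn main() {
    let path = std::env::temp_dir().join("noentropy_sample_demo.txt");
    std::fs::write(&path, "TODO: buy milk\nTODO: file taxes\n").unwrap();
    if is_text_file(&path) {
        println!("{:?}", read_file_sample(&path, 14));
    }
    let _ = std::fs::remove_file(&path);
}
```

Capping the sample size keeps prompts small and avoids loading large logs or data files into memory.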
## Caching
NoEntropy includes an intelligent caching system to minimize API calls:
- **Location**: `.noentropy_cache.json` in the current working directory (the project root when launched with `cargo run` from there)
- **Expiry**: 7 days (old entries auto-removed)
- **Change Detection**: Uses file metadata (size + modification time) instead of full content hashing
- **Max Entries**: 1000 entries (oldest evicted when limit reached)
### How Caching Works
1. **First Run**: Files are analyzed and categorized via Gemini API
2. **Response Cached**: Organization plan saved with file metadata
3. **Subsequent Runs**:
   - Checks if files changed (size/modification time)
   - If unchanged, uses cached categorization
   - If changed, re-analyzes via API
4. **Auto-Cleanup**: Removes cache entries older than 7 days
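The freshness check behind these steps can be sketched as a comparison of stored metadata against the file's current metadata, plus an age limit. The `CacheEntry` layout and `is_fresh` function below are hypothetical; the actual cache format is not shown in this README.

```rust
use std::time::{Duration, SystemTime};

/// Metadata stored alongside a cached categorization (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct CacheEntry {
    size: u64,
    modified: SystemTime,
    cached_at: SystemTime,
    category: String,
}

/// Seven days, matching the expiry described above.
const MAX_AGE: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// An entry is reusable if the file's size and modification time are unchanged
/// and the entry is younger than `MAX_AGE`.
fn is_fresh(entry: &CacheEntry, size: u64, modified: SystemTime, now: SystemTime) -> bool {
    // If the clock moved backwards, `duration_since` fails and the entry is
    // treated as stale.
    let age_ok = now
        .duration_since(entry.cached_at)
        .map(|age| age < MAX_AGE)
        .unwrap_or(false);
    age_ok && entry.size == size && entry.modified == modified
}

fn main() {
    let t0 = SystemTime::UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    let entry = CacheEntry {
        size: 2048,
        modified: t0,
        cached_at: t0,
        category: "Documents".to_string(),
    };
    let next_day = t0 + Duration::from_secs(24 * 60 * 60);
    println!("unchanged file: {}", is_fresh(&entry, 2048, t0, next_day));
    println!("edited file:    {}", is_fresh(&entry, 4096, next_day, next_day));
}
```

Comparing size and modification time is much cheaper than hashing file contents, at the cost of missing edits that preserve both.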
## Troubleshooting
### "API key not configured"
**Solution**: NoEntropy will prompt you for your API key on first run. Alternatively, manually create `~/.config/noentropy/config.toml`:
```toml
api_key = "your_actual_api_key"
download_folder = "/home/user/Downloads"
```
### "Download folder not configured"
**Solution**: NoEntropy will prompt you for the folder path on first run. Alternatively, manually add it to your config:
```toml
download_folder = "/path/to/your/Downloads"
```
### "API rate limit exceeded"
**Solution**:
- Wait a few minutes before trying again
- Reduce `--max-concurrent` to limit API calls
- Use caching to reduce redundant requests
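For context on how retries with exponential backoff behave, here is a small sketch of the general technique: each failed attempt doubles the wait before the next one. The `retry_with_backoff` function, its parameters, and the starting values are assumptions chosen for illustration; NoEntropy's own retry loop may differ, for example by honoring a retry delay supplied by the API.

```rust
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay after each failure.
/// `sleep` is injected so the schedule can be observed without actually waiting.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    mut delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay *= 2;
            }
        }
    }
}

fn main() {
    let result: Result<&str, &str> = retry_with_backoff(
        3,
        Duration::from_secs(2),
        |attempt| if attempt < 3 { Err("rate limited") } else { Ok("plan") },
        |d| println!("retrying in {}s", d.as_secs()), // a real caller would sleep here
    );
    println!("{result:?}");
}
```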
### "Network error"
**Solution**:
- Check your internet connection
- Verify Gemini API service is operational
- Ensure firewall allows outbound HTTPS requests
### "Failed to move file"
**Solution**:
- Check file permissions
- Ensure destination folder is writable
- Verify source files still exist
### "Cache corrupted"
**Solution**: Delete `.noentropy_cache.json` and run again. A new cache will be created.
## Development
### Build in Debug Mode
```bash
cargo build
```
### Build in Release Mode
```bash
cargo build --release
```
### Run Tests
```bash
cargo test
```
### Run Clippy (Linting)
```bash
cargo clippy
```
### Check Code
```bash
cargo check
```
## Project Structure
```
noentropy/
├── src/
│ ├── main.rs # Entry point and CLI handling
│ ├── lib.rs # Library exports
│ ├── config.rs # Configuration management
│ ├── gemini.rs # Gemini API client
│ ├── gemini_errors.rs # Error handling
│ ├── cache.rs # Caching system
│ └── files.rs # File operations
├── Cargo.toml # Dependencies
├── config.example.toml # Configuration template
└── README.md # This file
```
## Future Enhancements
Based on community feedback, we're planning:
- [ ] **Custom Categories** - Define custom categories in `config.toml`
- [ ] **Recursive Mode** - Organize files in subdirectories with `--recursive` flag
- [ ] **Undo Functionality** - Revert file organization changes
- [ ] **Custom Models** - Support for other AI providers
- [ ] **GUI Version** - Desktop application for non-CLI users
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Built with [Rust](https://www.rust-lang.org/)
- Powered by [Google Gemini API](https://ai.google.dev/)
- Inspired by the endless struggle to keep Downloads folders organized
## Show Your Support
⭐ Star this repository if you find it useful!
---
Made with ❤️ by the NoEntropy team

config.example.toml Normal file

@@ -0,0 +1,9 @@
# NoEntropy Configuration File
# Location: ~/.config/noentropy/config.toml
# Your Google Gemini API Key
# Get one at: https://ai.google.dev/
api_key = "your_api_key_here"
# Path to folder to organize (e.g., ~/Downloads)
download_folder = "/path/to/your/downloads"


@@ -42,14 +42,28 @@ pub struct GeminiClient {
     api_key: String,
     client: Client,
     base_url: String,
+    model: String,
+    timeout: Duration,
 }
 impl GeminiClient {
     pub fn new(api_key: String) -> Self {
+        Self::with_model(api_key, "gemini-3-flash-preview".to_string())
+    }
+    pub fn with_model(api_key: String, model: String) -> Self {
         Self {
             api_key,
-            client: Client::new(),
-            base_url: "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent".to_string(),
+            client: Client::builder()
+                .timeout(Duration::from_secs(30))
+                .build()
+                .unwrap_or_default(),
+            base_url: format!(
+                "https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent",
+                model
+            ),
+            model,
+            timeout: Duration::from_secs(30),
         }
     }
@@ -71,10 +85,10 @@ impl GeminiClient {
         let url = format!("{}?key={}", self.base_url, self.api_key);
         // Check cache first if available
-        if let (Some(cache_ref), Some(base_path)) = (cache.as_ref(), base_path) {
-            if let Some(cached_response) = cache_ref.get_cached_response(&filenames, base_path) {
-                return Ok(cached_response);
-            }
+        if let (Some(cache_ref), Some(base_path)) = (cache.as_ref(), base_path)
+            && let Some(cached_response) = cache_ref.get_cached_response(&filenames, base_path)
+        {
+            return Ok(cached_response);
         }
         // 1. Construct the Prompt
@@ -101,14 +115,19 @@ impl GeminiClient {
         // 4. Parse
         if res.status().is_success() {
-            let gemini_response: GeminiResponse = res.json().await.map_err(GeminiError::NetworkError)?;
+            let gemini_response: GeminiResponse =
+                res.json().await.map_err(GeminiError::NetworkError)?;
             // Extract raw JSON string from Gemini using proper structs
-            let raw_text = &gemini_response.candidates
-                .get(0)
-                .ok_or_else(|| GeminiError::InvalidResponse("No candidates in response".to_string()))?
-                .content.parts
-                .get(0)
+            let raw_text = &gemini_response
+                .candidates
+                .first()
+                .ok_or_else(|| {
+                    GeminiError::InvalidResponse("No candidates in response".to_string())
+                })?
+                .content
+                .parts
+                .first()
                 .ok_or_else(|| GeminiError::InvalidResponse("No parts in content".to_string()))?
                 .text;
@@ -147,6 +166,7 @@ impl GeminiClient {
     ) -> Result<reqwest::Response, GeminiError> {
         let mut attempts = 0;
         let max_attempts = 3;
+        let mut base_delay = Duration::from_secs(2);
         loop {
             attempts += 1;
@@ -160,21 +180,32 @@ impl GeminiClient {
                     let error = GeminiError::from_response(response).await;
                     if error.is_retryable() && attempts < max_attempts {
-                        if let Some(delay) = error.retry_delay() {
-                            println!("API Error: {}. Retrying in {} seconds (attempt {}/{})",
-                                error, delay.as_secs(), attempts, max_attempts);
-                            tokio::time::sleep(delay).await;
-                            continue;
-                        }
+                        let delay = error.retry_delay().unwrap_or(base_delay);
+                        println!(
+                            "API Error: {}. Retrying in {} seconds (attempt {}/{})",
+                            error,
+                            delay.as_secs(),
+                            attempts,
+                            max_attempts
+                        );
+                        tokio::time::sleep(delay).await;
+                        base_delay *= 2;
+                        continue;
                     }
                     return Err(error);
                 }
                 Err(e) => {
                     if attempts < max_attempts {
-                        println!("Network error: {}. Retrying in {} seconds (attempt {}/{})",
-                            e, 5, attempts, max_attempts);
-                        tokio::time::sleep(Duration::from_secs(5)).await;
+                        println!(
+                            "Network error: {}. Retrying in {} seconds (attempt {}/{})",
+                            e,
+                            base_delay.as_secs(),
+                            attempts,
+                            max_attempts
+                        );
+                        tokio::time::sleep(base_delay).await;
+                        base_delay *= 2;
                         continue;
                     }
                     return Err(GeminiError::NetworkError(e));
@@ -202,27 +233,45 @@ impl GeminiClient {
             }]
         });
-        let res = self.client.post(&url).json(&request_body).send().await;
+        let res = match self.client.post(&url).json(&request_body).send().await {
+            Ok(res) => res,
+            Err(e) => {
+                eprintln!(
+                    "Warning: Failed to get sub-category for {}: {}",
+                    filename, e
+                );
+                return "General".to_string();
+            }
+        };
-        if let Ok(res) = res {
-            if res.status().is_success() {
-                let gemini_response: GeminiResponse = res.json().await.unwrap_or_default();
-                let sub_category = gemini_response.candidates
-                    .get(0)
-                    .and_then(|c| c.content.parts.get(0))
-                    .map(|p| p.text.trim())
-                    .unwrap_or("General")
-                    .to_string();
-                if sub_category.is_empty() {
-                    "General".to_string()
-                } else {
-                    sub_category
+        if res.status().is_success() {
+            let gemini_response: GeminiResponse = match res.json().await {
+                Ok(r) => r,
+                Err(e) => {
+                    eprintln!("Warning: Failed to parse response for {}: {}", filename, e);
+                    return "General".to_string();
                 }
-            } else {
+            };
+            let sub_category = gemini_response
+                .candidates
+                .first()
+                .and_then(|c| c.content.parts.first())
+                .map(|p| p.text.trim())
+                .unwrap_or("General")
+                .to_string();
+            if sub_category.is_empty() {
                 "General".to_string()
+            } else {
+                sub_category
             }
         } else {
+            eprintln!(
+                "Warning: API returned error for {}: {}",
+                filename,
+                res.status()
+            );
             "General".to_string()
         }
     }


@@ -74,7 +74,6 @@ impl GeminiError {
     pub async fn from_response(response: Response) -> Self {
         let status = response.status();
-        // Try to parse error response body
         let error_text = match response.text().await {
             Ok(text) => text,
             Err(e) => {
@@ -82,12 +81,10 @@ impl GeminiError {
             }
         };
-        // Try to parse structured error response
         if let Ok(gemini_error) = serde_json::from_str::<GeminiErrorResponse>(&error_text) {
             return Self::from_gemini_error(gemini_error.error, status.as_u16());
         }
-        // Fallback to HTTP status code based errors
         Self::from_status_code(status, &error_text)
     }
@@ -96,13 +93,11 @@ impl GeminiError {
         match error_detail.status.as_str() {
             "RESOURCE_EXHAUSTED" => {
-                if let Some(retry_info) = details.iter().find(|d| d.retry_delay.is_some()) {
-                    if let Some(retry_delay) = &retry_info.retry_delay {
-                        if let Ok(seconds) = retry_delay.parse::<u32>() {
+                if let Some(retry_info) = details.iter().find(|d| d.retry_delay.is_some())
+                    && let Some(retry_delay) = &retry_info.retry_delay
+                    && let Ok(seconds) = retry_delay.parse::<u32>() {
                             return GeminiError::RateLimitExceeded { retry_after: seconds };
                         }
-                    }
-                }
                 if let Some(quota_info) = details.iter().find(|d| d.quota_limit.is_some()) {
                     let limit = quota_info.quota_limit.as_deref().unwrap_or("unknown");
@@ -177,7 +172,7 @@ impl GeminiError {
             500 => GeminiError::InternalError {
                 details: error_text.to_string()
             },
-            502 | 503 | 504 => GeminiError::ServiceUnavailable {
+            502..=504 => GeminiError::ServiceUnavailable {
                 reason: error_text.to_string()
             },
             _ => GeminiError::ApiError {
@@ -189,14 +184,14 @@ impl GeminiError {
     /// Check if this error is retryable
     pub fn is_retryable(&self) -> bool {
-        match self {
-            GeminiError::RateLimitExceeded { .. } => true,
-            GeminiError::ServiceUnavailable { .. } => true,
-            GeminiError::Timeout { .. } => true,
-            GeminiError::NetworkError(_) => true,
-            GeminiError::InternalError { .. } => true,
-            _ => false,
-        }
+        matches!(
+            self,
+            GeminiError::RateLimitExceeded { .. }
+                | GeminiError::ServiceUnavailable { .. }
+                | GeminiError::Timeout { .. }
+                | GeminiError::NetworkError(_)
+                | GeminiError::InternalError { .. }
+        )
     }
     /// Get retry delay for retryable errors
@@ -217,10 +212,9 @@ impl GeminiError {
     fn extract_model_name(message: &str) -> String {
         // Try to extract model name from error message
         // Example: "Model 'gemini-1.5-flash' not found"
-        if let Some(start) = message.find('\'') {
-            if let Some(end) = message[start + 1..].find('\'') {
+        if let Some(start) = message.find('\'')
+            && let Some(end) = message[start + 1..].find('\'') {
                 return message[start + 1..start + 1 + end].to_string();
             }
-        }
         "unknown".to_string()
     }


@@ -1,4 +1,5 @@
 pub mod cache;
+pub mod config;
 pub mod files;
 pub mod gemini;
 pub mod gemini_errors;


@@ -1,36 +1,46 @@
+use clap::Parser;
 use colored::*;
 use futures::future::join_all;
 use noentropy::cache::Cache;
+use noentropy::config;
 use noentropy::files::{FileBatch, OrganizationPlan, execute_move};
 use noentropy::gemini::GeminiClient;
 use noentropy::gemini_errors::GeminiError;
-use std::path::{Path, PathBuf};
+use std::path::Path;
 use std::sync::Arc;
+#[derive(Parser, Debug)]
+#[command(author, version, about, long_about = None)]
+struct Args {
+    #[arg(short, long, help = "Preview changes without moving files")]
+    dry_run: bool,
+    #[arg(
+        short,
+        long,
+        default_value_t = 5,
+        help = "Maximum concurrent API requests"
+    )]
+    max_concurrent: usize,
+}
 #[tokio::main]
 async fn main() -> Result<(), Box<dyn std::error::Error>> {
-    dotenv::dotenv().ok();
-    let api_key = std::env::var("GEMINI_API_KEY")
-        .map_err(|_| "GEMINI_API_KEY environment variable not set. Please set it in your .env file.")?;
-    let download_path_var = std::env::var("DOWNLOAD_FOLDER")
-        .map_err(|_| "DOWNLOAD_FOLDER environment variable not set. Please set it in your .env file.")?;
+    let args = Args::parse();
+    let api_key = config::get_or_prompt_api_key()?;
+    let download_path = config::get_or_prompt_download_folder()?;
-    // 1. Setup
-    let download_path: PathBuf = PathBuf::from(download_path_var.to_string());
     let client: GeminiClient = GeminiClient::new(api_key);
-    // Initialize cache
     let cache_path = Path::new(".noentropy_cache.json");
     let mut cache = Cache::load_or_create(cache_path);
-    // Clean up old cache entries (older than 7 days)
     cache.cleanup_old_entries(7 * 24 * 60 * 60);
-    // 2. Get Files
     let batch = FileBatch::from_path(download_path.clone());
     if batch.filenames.is_empty() {
-        println!("No files found to organize!");
+        println!("{}", "No files found to organize!".yellow());
         return Ok(());
     }
@@ -39,7 +49,6 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
         batch.count()
     );
-    // 3. Call Gemini for Initial Categorization
     let mut plan: OrganizationPlan = match client
         .organize_files_with_cache(batch.filenames, Some(&mut cache), Some(&download_path))
         .await
@@ -51,22 +60,26 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
         }
     };
-    println!("Gemini Plan received! Performing deep inspection...");
+    println!("{}", "Gemini Plan received! Performing deep inspection...".green());
-    // 4. Deep Inspection - Process files concurrently
     let client = Arc::new(client);
+    let semaphore = Arc::new(tokio::sync::Semaphore::new(args.max_concurrent));
-    let tasks: Vec<_> = plan.files.iter_mut()
+    let tasks: Vec<_> = plan
+        .files
+        .iter_mut()
         .zip(batch.paths.iter())
         .map(|(file_category, path)| {
             let client = Arc::clone(&client);
             let filename = file_category.filename.clone();
             let category = file_category.category.clone();
             let path = path.clone();
+            let semaphore = Arc::clone(&semaphore);
             async move {
                 if noentropy::files::is_text_file(&path) {
-                    if let Some(content) = noentropy::files::read_file_sample(&path, 2000) {
+                    let _permit = semaphore.acquire().await.unwrap();
+                    if let Some(content) = noentropy::files::read_file_sample(&path, 5000) {
                         println!("Reading content of {}...", filename.green());
                         client.get_ai_sub_category(&filename, &category, &content).await
                     } else {
@@ -79,22 +92,26 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
         })
         .collect();
-    // Wait for all concurrent tasks to complete
     let sub_categories = join_all(tasks).await;
-    // Apply the results back to the plan
     for (file_category, sub_category) in plan.files.iter_mut().zip(sub_categories) {
         file_category.sub_category = sub_category;
     }
-    println!("Deep inspection complete! Moving Files.....");
-    // 5. Execute
-    execute_move(&download_path, plan);
-    println!("Done!");
+    println!("{}", "Deep inspection complete! Moving Files.....".green());
+    if args.dry_run {
+        println!(
+            "{} Dry run mode - skipping file moves.",
+            "INFO:".cyan()
+        );
+    } else {
+        execute_move(&download_path, plan);
+    }
+    println!("{}", "Done!".green().bold());
-    // Save cache before exiting
     if let Err(e) = cache.save(cache_path) {
-        println!("Warning: Failed to save cache: {}", e);
+        eprintln!("Warning: Failed to save cache: {}", e);
     }
     Ok(())