docs: add GitM design spec

Design document for Gitea repository sync tool with Go backend,
Vue 3 frontend, SQLite storage, and single-binary deployment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
panw
2026-03-31 15:29:49 +08:00
commit 654c8a2c85


@@ -0,0 +1,271 @@
# GitM - Gitea Repository Sync Tool Design
## Overview
GitM is a cross-platform tool that synchronizes all repositories from multiple Gitea servers to local storage. It runs as a single-binary web service with an embedded Vue 3 management UI, supporting both Windows and Linux.
## Tech Stack
- **Backend**: Go + Gin
- **Frontend**: Vue 3 + Element Plus + Pinia + Vite
- **Database**: SQLite (mattn/go-sqlite3)
- **Authentication**: JWT with simple password
- **Deployment**: Single binary with embedded frontend (Go embed)
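
The single-binary deployment can be sketched as a handler that serves the built frontend from an `fs.FS` and falls back to `index.html` for client-side routes. This is a minimal sketch, not the project's actual code: in a real build `dist` would presumably come from an `embed.FS` filled by a `//go:embed` directive on `web/dist` (narrowed with `fs.Sub`), while here a hypothetical in-memory `demoFS` stands in so the example runs without a frontend build. The names `resolveAsset`, `spaHandler`, and `demoFS` are illustrative.

```go
package main

import (
	"fmt"
	"io/fs"
	"net/http"
	"path"
	"strings"
	"testing/fstest"
)

// resolveAsset maps a request path to a file in the built frontend.
// Paths that match no file fall back to index.html so that client-side
// routes (Vue Router history mode) still load the app.
func resolveAsset(dist fs.FS, urlPath string) string {
	name := strings.TrimPrefix(path.Clean("/"+urlPath), "/")
	if name == "" {
		return "index.html"
	}
	if info, err := fs.Stat(dist, name); err == nil && !info.IsDir() {
		return name
	}
	return "index.html"
}

// spaHandler serves the UI from dist (http.ServeFileFS needs Go 1.22+).
func spaHandler(dist fs.FS) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeFileFS(w, r, dist, resolveAsset(dist, r.URL.Path))
	})
}

// demoFS is a hypothetical stand-in for the embedded web/dist tree.
func demoFS() fs.FS {
	return fstest.MapFS{
		"index.html":    {Data: []byte("<div id=\"app\"></div>")},
		"assets/app.js": {Data: []byte("console.log('gitm')")},
	}
}

func main() {
	dist := demoFS()
	_ = spaHandler(dist) // would be mounted on the router for non-API paths
	for _, p := range []string{"/", "/assets/app.js", "/servers/3"} {
		fmt.Printf("%-16s -> %s\n", p, resolveAsset(dist, p))
	}
}
```

Taking an `fs.FS` rather than an `embed.FS` keeps the handler testable and lets a development build point at the `web/dist` directory on disk.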
## Architecture
```
┌─────────────────────────────────────────┐
│          gitm (single binary)           │
│                                         │
│  ┌───────────────┐  ┌───────────────┐   │
│  │    Web UI     │  │  API Server   │   │
│  │   (Vue3+EP)   │  │     (Gin)     │   │
│  │   embedded    │  │               │   │
│  └───────┬───────┘  └───────┬───────┘   │
│          │                  │           │
│  ┌───────┴──────────────────┴───────┐   │
│  │         Business Logic           │   │
│  │  - Server management (CRUD)      │   │
│  │  - Repo discovery & sync         │   │
│  │  - Scheduled task runner         │   │
│  │  - Sync logging                  │   │
│  └────────────────┬─────────────────┘   │
│                   │                     │
│  ┌────────────────┴─────────────────┐   │
│  │         SQLite Database          │   │
│  └──────────────────────────────────┘   │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │          Local Storage           │   │
│  │  repos/                          │   │
│  │  ├── server1/                    │   │
│  │  │   ├── owner1/                 │   │
│  │  │   │   └── repo1.git/          │   │
│  │  │   └── owner2/                 │   │
│  │  └── server2/                    │   │
│  └──────────────────────────────────┘   │
└─────────────────────────────────────────┘
```
## Data Model
### gitea_servers
| Field | Type | Description |
|-------|------|-------------|
| id | INTEGER PK | Auto-increment ID |
| name | TEXT NOT NULL | Display name / alias |
| url | TEXT NOT NULL | Gitea server URL (e.g. `https://git.example.com`) |
| token | TEXT NOT NULL | API access token |
| sync_interval | INTEGER DEFAULT 0 | Sync interval in minutes (0 = manual only) |
| last_sync_at | DATETIME | Last successful sync time |
| status | TEXT DEFAULT 'active' | active / disabled |
| created_at | DATETIME | Creation time |
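
To make the `sync_interval` and `status` semantics concrete, the row can be mirrored as a Go struct with a helper that decides whether a scheduled sync is due. This is a sketch under assumptions: the `Server` type, its `Due` method, and the `demoStart` timestamp are illustrative names, and treating a never-synced active server as immediately due is one possible policy rather than something the spec states.

```go
package main

import (
	"fmt"
	"time"
)

// Server mirrors one row of gitea_servers.
type Server struct {
	ID           int64
	Name         string
	URL          string
	Token        string
	SyncInterval int        // minutes; 0 = manual only
	LastSyncAt   *time.Time // nil until the first successful sync
	Status       string     // "active" or "disabled"
	CreatedAt    time.Time
}

// Due reports whether a scheduled sync should start at time now.
// Assumption: an active server that has never synced is due right away.
func (s Server) Due(now time.Time) bool {
	if s.Status != "active" || s.SyncInterval <= 0 {
		return false
	}
	if s.LastSyncAt == nil {
		return true
	}
	next := s.LastSyncAt.Add(time.Duration(s.SyncInterval) * time.Minute)
	return !now.Before(next)
}

// demoStart is an arbitrary fixed timestamp used only by the demo.
var demoStart = time.Date(2026, 3, 31, 15, 0, 0, 0, time.UTC)

func main() {
	last := demoStart
	s := Server{Name: "server1", Status: "active", SyncInterval: 30, LastSyncAt: &last}
	fmt.Println(s.Due(demoStart.Add(10 * time.Minute))) // false: interval not elapsed
	fmt.Println(s.Due(demoStart.Add(30 * time.Minute))) // true: interval elapsed
}
```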
### repos
| Field | Type | Description |
|-------|------|-------------|
| id | INTEGER PK | Auto-increment ID |
| server_id | INTEGER FK | References gitea_servers.id |
| name | TEXT | Repository name |
| full_name | TEXT | Full path (owner/repo) |
| clone_url | TEXT | Clone URL |
| local_path | TEXT | Local storage path |
| size | INTEGER | Repository size in bytes |
| last_sync_at | DATETIME | Last sync time |
| sync_status | TEXT | syncing / success / failed / pending |
| created_at | DATETIME | Discovery time |
### sync_logs
| Field | Type | Description |
|-------|------|-------------|
| id | INTEGER PK | Auto-increment ID |
| server_id | INTEGER FK | References gitea_servers.id |
| repo_id | INTEGER FK | References repos.id (nullable for full sync) |
| status | TEXT | success / failed / in_progress |
| message | TEXT | Log message or error details |
| started_at | DATETIME | Start time |
| finished_at | DATETIME | End time |
### settings
| Field | Type | Description |
|-------|------|-------------|
| key | TEXT PK | Setting key |
| value | TEXT | Setting value |

Predefined settings: `admin_password`, `listen_addr` (default `:9000`), `repos_dir`, `max_concurrent` (default `3`).
## Gitea API Integration
### Authentication
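All API requests authenticate with the `Authorization: token <token>` header. Gitea also accepts a `token=<token>` query parameter, but the header is preferred because query strings tend to be recorded in access logs and by proxies.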
### Core API Endpoints Used
1. **Validate token**: `GET /api/v1/user` - verify token and get user info
2. **Search repos**: `GET /api/v1/repos/search?limit=50&page=1` - paginated list of all accessible repos
3. **Admin repos** (if admin token): `GET /api/v1/admin/repos` - all repos on server
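
Repo discovery through the search endpoint could look roughly like the following. It is a sketch using only the standard library: `Repo`, `searchURL`, `decodePage`, and `ListRepos` are hypothetical names, only three fields of Gitea's repository object are decoded, and the loop assumes that an empty `data` array marks the last page.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strconv"
	"strings"
)

// Repo holds the subset of Gitea's repository object this sketch needs.
type Repo struct {
	Name     string `json:"name"`
	FullName string `json:"full_name"`
	CloneURL string `json:"clone_url"`
}

// searchURL builds the /repos/search URL for one page.
func searchURL(base string, page, limit int) string {
	q := url.Values{}
	q.Set("page", strconv.Itoa(page))
	q.Set("limit", strconv.Itoa(limit))
	return strings.TrimRight(base, "/") + "/api/v1/repos/search?" + q.Encode()
}

// decodePage unpacks the {"ok": ..., "data": [...]} envelope of /repos/search.
func decodePage(body []byte) ([]Repo, error) {
	var env struct {
		OK   bool   `json:"ok"`
		Data []Repo `json:"data"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, err
	}
	return env.Data, nil
}

// ListRepos requests page after page until the server returns an empty one.
func ListRepos(ctx context.Context, c *http.Client, base, token string) ([]Repo, error) {
	var all []Repo
	for page := 1; ; page++ {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, searchURL(base, page, 50), nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("Authorization", "token "+token)
		resp, err := c.Do(req)
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("gitea: unexpected status %s", resp.Status)
		}
		repos, err := decodePage(body)
		if err != nil {
			return nil, err
		}
		if len(repos) == 0 {
			return all, nil
		}
		all = append(all, repos...)
	}
}

func main() {
	fmt.Println(searchURL("https://git.example.com", 1, 50))
	repos, err := decodePage([]byte(`{"ok":true,"data":[{"name":"repo1","full_name":"owner1/repo1"}]}`))
	fmt.Println(repos, err)
}
```

Stopping on an empty page rather than on a short page avoids ending early when the server caps `limit` below the requested value.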
### Sync Flow
```
For each active server (per schedule or manual trigger):
  1. Validate token → GET /api/v1/user
  2. Discover repos → GET /api/v1/repos/search (paginate through all pages)
  3. Compare with database:
     - New repos → git clone --mirror <url> to repos/<server>/<owner>/<repo>.git
     - Existing repos → git fetch --all --prune (incremental update)
     - Repos removed from Gitea → mark as deleted in DB (optional cleanup)
  4. Record sync result to sync_logs
  5. Update repos table (last_sync_at, sync_status)
```
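
Step 3 of the flow above can be sketched as two small helpers: one that derives the local mirror path from a repo's full name, and one that picks between `git clone --mirror` and `git fetch --all --prune`. The function names (`localPath`, `gitArgs`, `syncRepo`, `safeSegment`) are hypothetical, and using "the directory already exists" as the signal for an existing mirror is an assumption; the spec compares against the database instead.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// safeSegment rejects names that could escape the repos directory.
func safeSegment(s string) bool {
	return s != "" && s != "." && s != ".." && !strings.ContainsAny(s, `/\`)
}

// localPath returns <reposDir>/<server>/<owner>/<repo>.git for a
// full name such as "owner1/repo1".
func localPath(reposDir, server, fullName string) (string, error) {
	owner, repo, ok := strings.Cut(fullName, "/")
	if !ok || !safeSegment(owner) || !safeSegment(repo) {
		return "", fmt.Errorf("invalid full name %q", fullName)
	}
	return filepath.Join(reposDir, server, owner, repo+".git"), nil
}

// gitArgs picks the git invocation: fetch for an existing mirror,
// clone --mirror for a new one.
func gitArgs(cloneURL, dir string, exists bool) []string {
	if exists {
		return []string{"-C", dir, "fetch", "--all", "--prune"}
	}
	return []string{"clone", "--mirror", cloneURL, dir}
}

// syncRepo runs the chosen command. It is not called from main so the
// sketch stays offline.
func syncRepo(ctx context.Context, cloneURL, dir string) error {
	_, statErr := os.Stat(dir)
	args := gitArgs(cloneURL, dir, statErr == nil)
	if out, err := exec.CommandContext(ctx, "git", args...).CombinedOutput(); err != nil {
		return fmt.Errorf("git failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	dir, err := localPath("repos", "server1", "owner1/repo1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dir)
	fmt.Println(gitArgs("https://git.example.com/owner1/repo1.git", dir, false))
	fmt.Println(gitArgs("https://git.example.com/owner1/repo1.git", dir, true))
}
```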
### Concurrency Control
- Worker pool with configurable max concurrency (default: 3)
- Servers are synced one at a time; within a server, repos are synced concurrently up to the limit
- Long-running sync operations return a `task_id` that the client can poll for status
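
The bounded concurrency described above can be sketched with a buffered channel used as a semaphore. This is one common way to build such a pool, not necessarily the one GitM uses; `runLimited` is a hypothetical name, and each task stands in for cloning or fetching one repo.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// runLimited runs every task while keeping at most limit of them in
// flight, and returns one error slot per task (nil on success).
func runLimited(limit int, tasks []func() error) []error {
	if limit < 1 {
		limit = 1
	}
	errs := make([]error, len(tasks))
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already running
		go func(i int, task func() error) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when the task ends
			errs[i] = task()
		}(i, task)
	}
	wg.Wait()
	return errs
}

func main() {
	tasks := []func() error{
		func() error { return nil },
		func() error { return errors.New("clone failed") },
		func() error { return nil },
		func() error { return nil },
	}
	for i, err := range runLimited(3, tasks) {
		fmt.Printf("repo %d: %v\n", i, err)
	}
}
```

Collecting one error per task, instead of aborting on the first failure, matches the "log and continue with next repo" policy in the Error Handling section.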
## API Routes
### Authentication
```
POST /api/login # Password auth, returns JWT token
```
### Settings
```
GET /api/settings # Get all settings (password masked)
PUT /api/settings # Update settings
```
### Server Management
```
GET /api/servers # List all servers
POST /api/servers # Add server
PUT /api/servers/:id # Update server
DELETE /api/servers/:id # Delete server
POST /api/servers/:id/test # Test connection (validate token)
```
### Repository Management
```
GET /api/servers/:id/repos # List repos for a server
POST /api/servers/:id/discover # Trigger repo discovery from Gitea API
```
### Sync Operations
```
POST /api/servers/:id/sync # Trigger sync for one server
POST /api/sync/all # Trigger sync for all servers
GET /api/servers/:id/sync/status # Get current sync status
```
### Logs & Stats
```
GET /api/sync/logs # Sync logs (paginated, filterable)
GET /api/sync/stats # Dashboard stats
```
All routes except `/api/login` require `Authorization: Bearer <token>` header.
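
The Bearer check can be sketched as a small middleware. The spec uses Gin, but this example sticks to `net/http` so it runs with the standard library alone. The `verify` callback is a hypothetical stand-in for real JWT validation, and `demoAPI` and `statusFor` exist only to exercise the middleware.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer <token>" value.
func bearerToken(header string) (string, bool) {
	scheme, token, ok := strings.Cut(header, " ")
	token = strings.TrimSpace(token)
	if !ok || !strings.EqualFold(scheme, "Bearer") || token == "" {
		return "", false
	}
	return token, true
}

// requireAuth rejects requests without a valid token, except /api/login.
// verify is a placeholder for JWT signature and expiry checks.
func requireAuth(verify func(token string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/login" {
			token, ok := bearerToken(r.Header.Get("Authorization"))
			if !ok || !verify(token) {
				http.Error(w, `{"error":"unauthorized"}`, http.StatusUnauthorized)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// demoAPI wraps a trivial handler; only "demo-token" is accepted.
func demoAPI() http.Handler {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	verify := func(token string) bool { return token == "demo-token" }
	return requireAuth(verify, ok)
}

// statusFor sends one GET through h and returns the response status.
func statusFor(h http.Handler, path, auth string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoAPI()
	fmt.Println(statusFor(h, "/api/servers", ""))                  // 401
	fmt.Println(statusFor(h, "/api/servers", "Bearer demo-token")) // 200
	fmt.Println(statusFor(h, "/api/login", ""))                    // 200
}
```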
## Web UI Pages
1. **Dashboard** - Overview: server count, total repos, disk usage, recent sync status
2. **Server Management** - Add/edit/delete Gitea servers, test connection, trigger sync
3. **Repository List** - Filter by server, show name, size, sync status, last sync time
4. **Sync Logs** - Sync history, filter by status (success/failed)
5. **Settings** - Change password, listen address, storage path, concurrency config
## Project Structure
```
gitm/
├── main.go # Entry point: parse flags, start server
├── go.mod
├── go.sum
├── internal/
│ ├── config/ # Configuration management
│ ├── database/ # SQLite init & migration
│ ├── models/ # Data model definitions
│ ├── gitea/ # Gitea API client
│ ├── sync/ # Sync engine (clone/fetch/scheduler)
│ ├── middleware/ # JWT auth middleware
│ └── handler/ # HTTP handlers (API route handlers)
├── web/ # Vue 3 frontend source
│ ├── src/
│ │ ├── views/ # Page components
│ │ ├── components/ # Shared components
│ │ ├── api/ # API call wrappers
│ │ └── stores/ # Pinia state management
│ ├── dist/ # Build output (embed target)
│ └── vite.config.ts
├── Makefile # Build & package commands
└── docs/ # Documentation
```
## Build & Run
### Build
```bash
# Build frontend
cd web && npm install && npm run build
# Build binary
go build -o gitm .
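# Cross-compile (mattn/go-sqlite3 is a cgo package, so cgo must stay enabled
# and CC must point at a C cross-compiler for the target; mingw-w64 assumed here)
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -o gitm-linux .
CGO_ENABLED=1 GOOS=windows GOARCH=amd64 CC=x86_64-w64-mingw32-gcc go build -o gitm.exe .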
```
### Run
```bash
# First run (init database, set password)
./gitm --init
# Normal run (default port 9000)
./gitm
# Custom port
./gitm --addr :9090
# Custom data directory
./gitm --data /path/to/data
```
### Runtime Flags
| Flag | Default | Description |
|------|---------|-------------|
| `--addr` | `:9000` | Listen address |
| `--data` | `./data` | Data directory (SQLite DB + repos) |
| `--init` | false | Initialize config and set password |
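
The three flags map directly onto the standard `flag` package, which accepts both `-addr` and `--addr` spellings. A minimal sketch, assuming a hypothetical `Options` struct and `parseFlags` helper:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Options holds the parsed runtime flags.
type Options struct {
	Addr    string
	DataDir string
	Init    bool
}

// parseFlags parses args (without the program name) using the defaults
// from the Runtime Flags table.
func parseFlags(args []string) (Options, error) {
	var o Options
	fs := flag.NewFlagSet("gitm", flag.ContinueOnError)
	fs.StringVar(&o.Addr, "addr", ":9000", "listen address")
	fs.StringVar(&o.DataDir, "data", "./data", "data directory (SQLite DB + repos)")
	fs.BoolVar(&o.Init, "init", false, "initialize config and set password")
	err := fs.Parse(args)
	return o, err
}

func main() {
	opts, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```

Using a dedicated `FlagSet` instead of the global one keeps parsing testable with an arbitrary argument slice.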
## Error Handling
- Gitea API errors: log and continue with next repo, mark failed in sync_logs
- Git clone/fetch failures: retry once, then mark as failed
- Network timeouts: 30s for API calls, no timeout for git operations (large repos)
- Disk space: check available space before sync, warn if below 1GB
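
The "retry once, then mark as failed" rule for git operations can be sketched as a small wrapper. `retryOnce` and `errTransient` are hypothetical names; the sketch retries immediately, whereas a real implementation might wait briefly between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient is a made-up error used only by the demo.
var errTransient = errors.New("transient failure")

// retryOnce runs op and, if it fails, runs it exactly one more time.
// It returns nil on success, or both errors joined if both attempts fail.
func retryOnce(op func() error) error {
	first := op()
	if first == nil {
		return nil
	}
	second := op()
	if second == nil {
		return nil
	}
	return errors.Join(first, second)
}

func main() {
	calls := 0
	err := retryOnce(func() error {
		calls++
		if calls == 1 {
			return errTransient // first attempt fails, second succeeds
		}
		return nil
	})
	fmt.Println(calls, err) // 2 <nil>
}
```

When both attempts fail, the joined error can be written to `sync_logs.message` so that neither failure reason is lost.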
## Security
- Password hashed with bcrypt before storage
- JWT tokens with configurable expiry (default 24h)
- Gitea tokens encrypted at rest using AES-256 with a key derived from the admin password
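- API not reachable from other hosts by default: the listen address should bind to localhost (for example `127.0.0.1:9000`) and remain configurable. A bare `:9000`, as listed under Runtime Flags, listens on all interfaces.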
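
Token encryption at rest could be sketched as follows. Several parts are assumptions: the spec says only "AES-256", so the GCM mode, the `base64(nonce || ciphertext)` storage format, and the function names are illustrative. `deriveKey` is a deliberately simple placeholder (a single SHA-256 pass); a real build should use a slow, salted password KDF such as scrypt or Argon2.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
)

// deriveKey is a PLACEHOLDER that yields 32 bytes for AES-256.
// One SHA-256 pass is not a proper password KDF.
func deriveKey(password string, salt []byte) []byte {
	sum := sha256.Sum256(append([]byte(password), salt...))
	return sum[:]
}

// newGCM builds an AES-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptToken returns base64(nonce || ciphertext) for storage in the DB.
func encryptToken(key []byte, plaintext string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// decryptToken reverses encryptToken and fails if the key is wrong
// or the stored value was modified.
func decryptToken(key []byte, encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, ct := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	pt, err := gcm.Open(nil, nonce, ct, nil)
	if err != nil {
		return "", err
	}
	return string(pt), nil
}

func main() {
	key := deriveKey("admin-password", []byte("per-install-salt"))
	enc, err := encryptToken(key, "gitea-api-token")
	if err != nil {
		fmt.Println("encrypt:", err)
		return
	}
	dec, err := decryptToken(key, enc)
	fmt.Println(dec, err)
}
```

One consequence worth weighing: because bcrypt hashes are one-way, the key can only be derived while the plaintext password is available, so scheduled syncs after a restart would need the key from some other source, and a password change implies re-encrypting every stored token.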