# Feature Specification: Fiber Middleware Integration with Configuration Management
**Feature Branch**: `001-fiber-middleware-integration`
**Created**: 2025-11-10
**Status**: Draft
**Input**: User description: "I need you to integrate the following into Fiber and into our system. There is no need for you to run go get; I will run go mod tidy myself.
The various ways to use Fiber are documented at https://docs.gofiber.io/
I also need you to create a unified response structure
Structure:
{
\"code\": 0000,
\"data\": {}/[],
\"msg\": \"\"
}
Viper configuration (must support hot reload)
Zap and Lumberjack.v2
The github.com/gofiber/fiber/v2/middleware/logger middleware
The github.com/gofiber/fiber/v2/middleware/recover middleware
The github.com/gofiber/fiber/v2/middleware/requestid middleware
The github.com/gofiber/fiber/v2/middleware/keyauth middleware (it should check Redis to verify that the token exists, taking the field named token from the request header for the comparison)
The github.com/gofiber/fiber/v2/middleware/limiter middleware (implement it first, then comment out all of its code once done and explain how to use and modify it)"
## Clarifications
### Session 2025-11-10
- Q: What specific types of logs should the Zap + Lumberjack integration handle? → A: Both application logs and HTTP access logs, with configurable separation into different files (app.log, access.log) to enable independent retention policies and analysis workflows.
- Q: When Redis is unavailable during token validation (FR-016), what should the authentication behavior be? → A: Fail closed: All authentication requests fail immediately when Redis is unavailable (return HTTP 503)
- Q: What data structure and content should be stored in Redis for authentication tokens? → A: Token as key only (simple existence check): Store tokens as Redis keys with user ID as value, using Redis TTL for expiration
- Q: What identifier should the rate limiter use to track and enforce request limits? → A: Per-IP address: Rate limit based on client IP address with configurable requests per time window (e.g., 100 req/min per IP)
- Q: What format should be used for generating unique request IDs in the requestid middleware? → A: UUID v4 (random): Standard UUID format for maximum compatibility with distributed tracing systems and log aggregation tools
## User Scenarios & Testing
### User Story 1 - Configuration Hot Reload (Priority: P1)
When system administrators or DevOps engineers modify application configuration files (such as server ports, database connections, log levels), the system should automatically detect and apply these changes without requiring a service restart, ensuring zero-downtime configuration updates.
**Why this priority**: Configuration management is foundational for all other features. Without proper configuration loading and hot reload capability, the system cannot support runtime adjustments, which is critical for production environments.
**Independent Test**: Can be fully tested by modifying a configuration value in the config file and verifying the system picks up the new value within seconds without restart, delivering immediate configuration flexibility.
**Acceptance Scenarios**:
1. **Given** the system is running with initial configuration, **When** an administrator updates the log level in the config file, **Then** the system detects the change within 5 seconds and applies the new log level to all subsequent log entries
2. **Given** the system is running, **When** configuration file contains invalid syntax, **Then** the system logs a warning and continues using the previous valid configuration
3. **Given** configuration hot reload is enabled, **When** multiple configuration parameters are changed simultaneously, **Then** all changes are applied atomically without partial updates
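
A minimal sketch of how hot reload could be wired with Viper's file watcher; the `configs/` path, the `APP_ENV`-based file naming (aligned with FR-022/SC-010), and the function name are illustrative assumptions, not requirements from this spec:

```go
package config

import (
	"log"
	"os"

	"github.com/fsnotify/fsnotify"
	"github.com/spf13/viper"
)

// InitConfig loads an environment-specific config file and watches it for changes.
// The APP_ENV-based file selection is an assumption for illustration.
func InitConfig() error {
	env := os.Getenv("APP_ENV")
	if env == "" {
		env = "development"
	}
	viper.SetConfigName("config." + env) // e.g. config.development.yaml
	viper.SetConfigType("yaml")
	viper.AddConfigPath("./configs")

	if err := viper.ReadInConfig(); err != nil {
		return err
	}

	// Hot reload: fsnotify-based watcher; changes apply without a restart.
	viper.WatchConfig()
	viper.OnConfigChange(func(e fsnotify.Event) {
		// Re-validate here; keep the previous valid config if the new file is invalid.
		log.Printf("config file changed: %s", e.Name)
	})
	return nil
}
```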
---
### User Story 2 - Structured Logging and Log Rotation (Priority: P1)
When the system processes requests and business operations, all events, errors, and debugging information should be recorded in structured JSON format with automatic log file rotation based on size and time, ensuring comprehensive audit trails without disk space exhaustion. The system maintains separate log files for application logs (app.log) and HTTP access logs (access.log) with independent retention policies.
**Why this priority**: Logging is essential for debugging, monitoring, and compliance. Structured logs enable efficient querying and analysis, while automatic rotation prevents operational issues. Separating application and access logs allows for different retention policies and analysis workflows.
**Independent Test**: Can be fully tested by generating various log events and verifying they appear in structured JSON format in the appropriate files, and that log files rotate when size/time thresholds are reached, delivering production-ready logging capability.
**Acceptance Scenarios**:
1. **Given** the system is processing requests, **When** any application operation occurs, **Then** logs are written to app.log in JSON format containing timestamp, level, message, request ID, and contextual data
2. **Given** the system is processing HTTP requests, **When** requests complete, **Then** access logs are written to access.log with request method, path, status, duration, and request ID
3. **Given** a log file reaches the configured size limit, **When** new log entries are generated, **Then** the current log file is archived and a new log file is created
4. **Given** log retention is configured for 30 days for application logs and 90 days for access logs, **When** log files exceed the retention period, **Then** older log files are automatically removed according to their respective policies
5. **Given** multiple log levels are configured (debug, info, warn, error), **When** logging at different levels, **Then** only messages at or above the configured level are written
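
A minimal sketch of the split app/access logger setup using Zap JSON cores backed by Lumberjack rotation; the size, backup count, and level values are placeholders that would come from configuration:

```go
package logging

import (
	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
	"gopkg.in/natefinch/lumberjack.v2"
)

// newRotatingCore builds a JSON core that writes to a size/age-rotated file.
func newRotatingCore(filename string, maxAgeDays int, level zapcore.Level) zapcore.Core {
	ws := zapcore.AddSync(&lumberjack.Logger{
		Filename:   filename,
		MaxSize:    100, // MB before rotation (illustrative value)
		MaxBackups: 10,
		MaxAge:     maxAgeDays, // days to retain rotated files
		Compress:   true,
	})
	enc := zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig())
	return zapcore.NewCore(enc, ws, level)
}

// NewLoggers returns separate application and HTTP access loggers so the two
// streams can carry independent retention policies (30 vs 90 days per the spec).
func NewLoggers() (app *zap.Logger, access *zap.Logger) {
	app = zap.New(newRotatingCore("logs/app.log", 30, zap.InfoLevel), zap.AddCaller())
	access = zap.New(newRotatingCore("logs/access.log", 90, zap.InfoLevel))
	return app, access
}
```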
---
### User Story 3 - Unified API Response Format (Priority: P1)
When API consumers (frontend applications, mobile apps, third-party integrations) make requests to any endpoint, they should receive responses in a consistent JSON structure containing status code, data payload, and message, regardless of success or failure, enabling predictable error handling and data parsing.
**Why this priority**: Consistent response format is critical for API consumers to reliably parse responses. Without this, every endpoint integration becomes custom work, increasing development time and bug potential.
**Independent Test**: Can be fully tested by calling any endpoint (successful or failed) and verifying the response structure matches the defined format with appropriate code, data, and message fields, delivering immediate API consistency.
**Acceptance Scenarios**:
1. **Given** a valid API request, **When** the request succeeds, **Then** the response contains `{"code": 0, "data": {...}, "msg": "success"}`
2. **Given** an invalid API request, **When** validation fails, **Then** the response contains `{"code": [error_code], "data": null, "msg": "[error description]"}`
3. **Given** any API endpoint, **When** processing completes, **Then** the response structure always includes code, data, and msg fields
4. **Given** list/array data is returned, **When** the response is generated, **Then** the data field contains an array instead of an object
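
A minimal sketch of what the `pkg/response/` helpers could look like for FR-007; the `Body`, `Success`, and `Fail` names are illustrative, not mandated by the spec:

```go
package response

import "github.com/gofiber/fiber/v2"

// Body is the unified envelope: {"code": ..., "data": ..., "msg": ...}.
type Body struct {
	Code int         `json:"code"`
	Data interface{} `json:"data"`
	Msg  string      `json:"msg"`
}

// Success wraps a successful payload; data may be an object, a slice, or nil.
func Success(c *fiber.Ctx, data interface{}) error {
	return c.Status(fiber.StatusOK).JSON(Body{Code: 0, Data: data, Msg: "success"})
}

// Fail returns a business error code inside the unified envelope with the given HTTP status.
func Fail(c *fiber.Ctx, httpStatus, code int, msg string) error {
	return c.Status(httpStatus).JSON(Body{Code: code, Data: nil, Msg: msg})
}
```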
---
### User Story 4 - Request Logging and Tracing (Priority: P2)
When HTTP requests arrive at the system, each request should be assigned a unique identifier and all request details (method, path, duration, status) should be logged, enabling request tracking across distributed components and performance analysis.
**Why this priority**: Request logging provides visibility into system usage patterns and performance. The unique request ID enables correlation of logs across services for troubleshooting.
**Independent Test**: Can be fully tested by making multiple concurrent requests and verifying each has a unique request ID in logs and response headers, and that request metrics are captured, delivering complete request observability.
**Acceptance Scenarios**:
1. **Given** an HTTP request arrives, **When** it enters the system, **Then** a unique request ID in UUID v4 format (e.g., "550e8400-e29b-41d4-a716-446655440000") is generated and added to the request context
2. **Given** a request is being processed, **When** any logging occurs during that request, **Then** the request ID is automatically included in log entries
3. **Given** a request completes, **When** the response is sent, **Then** the request ID is included in response headers (X-Request-ID) and a summary log entry records method, path, status, and duration
4. **Given** multiple concurrent requests, **When** processed simultaneously, **Then** each request maintains its own unique UUID v4 request ID without collision
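
A minimal sketch of the requestid middleware configured for UUID v4 IDs, using `github.com/google/uuid` as one assumed way to generate them:

```go
package middleware

import (
	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/requestid"
	"github.com/google/uuid"
)

// RequestID attaches a UUID v4 request ID to every request, exposes it in the
// X-Request-ID response header, and stores it in the context for log correlation.
func RequestID() fiber.Handler {
	return requestid.New(requestid.Config{
		Header:     "X-Request-ID",
		Generator:  uuid.NewString, // UUID v4 per FR-008a
		ContextKey: "requestid",
	})
}
```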
---
### User Story 5 - Automatic Error Recovery (Priority: P2)
When unexpected errors or panics occur during request processing, the system should automatically recover from the failure, log detailed error information, return an appropriate error response to the client, and continue serving subsequent requests without crashing.
**Why this priority**: Error recovery prevents cascading failures and ensures service availability. A single panic should not bring down the entire application.
**Independent Test**: Can be fully tested by triggering a controlled panic in a handler and verifying the system returns an error response, logs the panic details, and continues processing subsequent requests normally, delivering fault tolerance.
**Acceptance Scenarios**:
1. **Given** a request handler panics, **When** the panic occurs, **Then** the middleware recovers, logs the panic stack trace, and returns HTTP 500 with error details
2. **Given** a panic is recovered, **When** subsequent requests arrive, **Then** they are processed normally without any impact from the previous panic
3. **Given** a panic includes error details, **When** logged, **Then** the log entry contains the panic message, stack trace, request ID, and request details
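
A minimal sketch of the recover middleware with stack-trace logging; the unified HTTP 500 body itself would come from the application-level error handler, which is not shown here:

```go
package middleware

import (
	"runtime/debug"

	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/recover"
	"go.uber.org/zap"
)

// Recover converts panics into logged errors so one failing handler cannot
// crash the process; the panic value, stack trace, and request ID go to app.log.
func Recover(logger *zap.Logger) fiber.Handler {
	return recover.New(recover.Config{
		EnableStackTrace: true,
		StackTraceHandler: func(c *fiber.Ctx, e interface{}) {
			logger.Error("panic recovered",
				zap.Any("panic", e),
				zap.ByteString("stack", debug.Stack()),
				zap.String("request_id", c.GetRespHeader("X-Request-ID")),
				zap.String("path", c.Path()),
			)
		},
		// After recovery the middleware returns an error, which the app's
		// ErrorHandler would translate into the unified 500 response (FR-014).
	})
}
```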
---
### User Story 6 - Token-Based Authentication (Priority: P2)
When external clients make API requests, they must provide a valid authentication token in the request header, which the system validates against stored tokens in Redis cache, ensuring only authorized requests can access protected resources.
**Why this priority**: Authentication is essential for security but depends on the foundational components (config, logging, response format) being in place first.
**Independent Test**: Can be fully tested by making requests with valid/invalid/missing tokens and verifying that valid tokens grant access while invalid ones are rejected with appropriate error codes, delivering access control capability.
**Acceptance Scenarios**:
1. **Given** a request to a protected endpoint, **When** the "token" header is missing, **Then** the system returns HTTP 401 with `{"code": 1001, "data": null, "msg": "Missing authentication token"}`
2. **Given** a request with a token, **When** the token exists as a key in Redis, **Then** the system retrieves the user ID from the value and allows the request to proceed with user context
3. **Given** a request with a token, **When** the token does not exist in Redis (either never created or TTL expired), **Then** the system returns HTTP 401 with `{"code": 1002, "data": null, "msg": "Invalid or expired token"}`
4. **Given** Redis is unavailable, **When** token validation is attempted, **Then** the system immediately fails closed, logs the Redis connection error, and returns HTTP 503 with `{"code": 1004, "data": null, "msg": "Authentication service unavailable"}` without attempting fallback mechanisms
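
A minimal sketch of the keyauth flow described above, assuming the go-redis client (`github.com/redis/go-redis/v9`); the error-code mapping follows the scenarios above, and exact wiring may vary by Fiber v2 version:

```go
package middleware

import (
	"errors"

	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/keyauth"
	"github.com/redis/go-redis/v9"
)

var errInvalidToken = errors.New("invalid or expired token")

// TokenAuth validates the "token" header against Redis: the token is the key,
// the user ID is the value (FR-016a). Redis outages fail closed with HTTP 503.
func TokenAuth(rdb *redis.Client) fiber.Handler {
	return keyauth.New(keyauth.Config{
		// Note: depending on the Fiber v2 version, AuthScheme may default to
		// "Bearer"; it should be cleared so the raw header value is the token.
		KeyLookup: "header:token",
		Validator: func(c *fiber.Ctx, token string) (bool, error) {
			userID, err := rdb.Get(c.UserContext(), token).Result()
			if errors.Is(err, redis.Nil) {
				return false, errInvalidToken // key absent or TTL expired
			}
			if err != nil {
				return false, err // Redis unreachable: fail closed
			}
			c.Locals("user_id", userID) // user context for downstream handlers
			return true, nil
		},
		ErrorHandler: func(c *fiber.Ctx, err error) error {
			switch {
			case errors.Is(err, keyauth.ErrMissingOrMalformedAPIKey):
				return c.Status(fiber.StatusUnauthorized).
					JSON(fiber.Map{"code": 1001, "data": nil, "msg": "Missing authentication token"})
			case errors.Is(err, errInvalidToken):
				return c.Status(fiber.StatusUnauthorized).
					JSON(fiber.Map{"code": 1002, "data": nil, "msg": "Invalid or expired token"})
			default:
				return c.Status(fiber.StatusServiceUnavailable).
					JSON(fiber.Map{"code": 1004, "data": nil, "msg": "Authentication service unavailable"})
			}
		},
	})
}
```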
---
### User Story 7 - Rate Limiting Configuration (Priority: P3)
The system should provide configurable IP-based rate limiting capabilities that can restrict the number of requests from a specific client IP address within a time window, with the functionality initially implemented but disabled by default, allowing future activation based on specific endpoint requirements.
**Why this priority**: Rate limiting is important for production but not critical for initial deployment. It can be activated later when traffic patterns are better understood.
**Independent Test**: Can be fully tested by enabling the limiter configuration, making repeated requests from the same IP exceeding the limit, and verifying that excess requests are rejected with rate limit error messages, delivering DoS protection capability when needed.
**Acceptance Scenarios**:
1. **Given** rate limiting is configured and enabled for an endpoint with 100 requests per minute per IP, **When** a client IP exceeds the request limit within the time window, **Then** subsequent requests from that IP return HTTP 429 with `{"code": 1003, "data": null, "msg": "Too many requests"}`
2. **Given** the rate limit time window expires, **When** new requests arrive from the same client IP, **Then** the request counter resets and requests are allowed again
3. **Given** rate limiting is disabled (default), **When** any number of requests arrive, **Then** all requests are processed without rate limit checks
4. **Given** rate limiting is enabled, **When** requests arrive from different IP addresses, **Then** each IP address has its own independent request counter and limit
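
A minimal sketch of the per-IP limiter; per FR-019 the registration call (e.g. `app.Use(RateLimiter(...))`, name assumed) would remain commented out in the router setup until the feature is enabled:

```go
package middleware

import (
	"time"

	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/limiter"
)

// RateLimiter enforces a per-IP request budget (FR-018/FR-018a) and returns
// the unified 429 body when the limit is exceeded (FR-018b).
func RateLimiter(max int, window time.Duration) fiber.Handler {
	return limiter.New(limiter.Config{
		Max:        max,    // e.g. 100
		Expiration: window, // e.g. time.Minute
		KeyGenerator: func(c *fiber.Ctx) string {
			return c.IP() // independent counter per client IP
		},
		LimitReached: func(c *fiber.Ctx) error {
			return c.Status(fiber.StatusTooManyRequests).
				JSON(fiber.Map{"code": 1003, "data": nil, "msg": "Too many requests"})
		},
	})
}
```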
---
### Edge Cases
- What happens when the configuration file is deleted while the system is running? (System should log error and continue with current configuration)
- What happens when Redis connection is lost during token validation? (System immediately fails closed, returns HTTP 503 with code 1004, logs connection failure, and does not attempt any fallback authentication)
- What happens when log directory is not writable? (System should fail to start with clear error message)
- What happens when a request ID collision occurs? (With UUID v4, collision probability is negligible: ~1 in 2^122; no special handling needed)
- What happens when configuration hot reload occurs during active request processing? (Configuration changes should not affect in-flight requests)
- What happens when log rotation occurs while writing a log entry? (Log rotation should be atomic and not lose log entries)
- What happens when invalid configuration values are provided (e.g., negative numbers for limits)? (System should validate config on load and reject invalid values with clear error messages)
## Requirements
### Functional Requirements
- **FR-001**: System MUST load configuration from files using Viper configuration library
- **FR-002**: System MUST support hot reload of configuration files, detecting changes within 5 seconds and applying them without service restart
- **FR-003**: System MUST validate configuration values on load and reject invalid configurations with descriptive error messages
- **FR-004**: System MUST write all logs in structured JSON format using Zap logging library
- **FR-004a**: System MUST separate application logs (app.log) and HTTP access logs (access.log) into different files with independent configuration
- **FR-005**: System MUST rotate log files automatically using Lumberjack.v2 based on configurable size and age parameters for both application and access logs
- **FR-006**: System MUST retain log files according to configured retention policy and automatically remove expired logs, with separate retention settings for application and access logs
- **FR-007**: All API responses MUST follow the unified format: `{"code": [number], "data": [object/array/null], "msg": [string]}`
- **FR-008**: System MUST assign a unique request ID to every incoming HTTP request using requestid middleware
- **FR-008a**: Request IDs MUST be generated using UUID v4 format for maximum compatibility with distributed tracing systems and log aggregation tools
- **FR-009**: System MUST include the request ID in all log entries associated with that request
- **FR-010**: System MUST include the request ID in HTTP response headers for client-side tracing
- **FR-011**: System MUST log all HTTP requests with method, path, status code, duration, and request ID using logger middleware
- **FR-012**: System MUST automatically recover from panics during request processing using recover middleware
- **FR-013**: When a panic is recovered, system MUST log the full stack trace and error details
- **FR-014**: When a panic is recovered, system MUST return HTTP 500 with unified error response format
- **FR-015**: System MUST validate authentication tokens from the "token" request header using keyauth middleware
- **FR-016**: System MUST check token validity by verifying existence in Redis cache using token string as key
- **FR-016a**: System MUST store tokens in Redis as simple key-value pairs with token as key and user ID as value, using Redis TTL for expiration management
- **FR-016b**: When Redis is unavailable during token validation, system MUST fail closed and return HTTP 503 immediately without fallback or caching mechanisms
- **FR-017**: System MUST return HTTP 401 with appropriate error code and message when token is missing or invalid
- **FR-018**: System MUST provide configurable IP-based rate limiting capability using limiter middleware
- **FR-018a**: Rate limiting MUST track request counts per client IP address with configurable limits (requests per time window)
- **FR-018b**: When rate limit is exceeded, system MUST return HTTP 429 with code 1003 and appropriate error message
- **FR-019**: Rate limiting implementation MUST be provided but disabled by default in initial deployment
- **FR-020**: System MUST include documentation on how to configure and enable rate limiting per endpoint with example configurations
- **FR-021**: System MUST use consistent error codes across all error scenarios with bilingual (Chinese/English) support
- **FR-022**: Configuration MUST support different environments (development, staging, production) with separate config files
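
A minimal sketch of how `pkg/errors/` could carry the bilingual codes already referenced in this spec (0, 1001-1004) per FR-021; the struct shape and the Chinese wording are illustrative assumptions:

```go
package errors

// Code pairs a business error code with bilingual descriptions (FR-021).
// Only the codes referenced elsewhere in this spec are listed; others are TBD.
type Code struct {
	Code  int
	MsgZH string
	MsgEN string
}

var (
	Success         = Code{0, "成功", "success"}
	MissingToken    = Code{1001, "缺少认证令牌", "Missing authentication token"}
	InvalidToken    = Code{1002, "令牌无效或已过期", "Invalid or expired token"}
	TooManyRequests = Code{1003, "请求过于频繁", "Too many requests"}
	AuthServiceDown = Code{1004, "认证服务不可用", "Authentication service unavailable"}
)
```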
### Technical Requirements (Constitution-Driven)
**Tech Stack Compliance**:
- [x] All HTTP operations use Fiber framework (no `net/http` shortcuts)
- [x] All async tasks use Asynq (if applicable)
- [x] All logging uses Zap + Lumberjack.v2
- [x] All configuration uses Viper
**Architecture Requirements**:
- [x] Implementation follows Handler → Service → Store → Model layers (applies to auth token validation)
- [x] Dependencies injected via Service/Store structs
- [x] Unified error codes defined in `pkg/errors/`
- [x] Unified API responses via `pkg/response/`
- [x] All constants defined in `pkg/constants/` (no magic numbers/strings)
- [x] All Redis keys managed via `pkg/constants/` key generation functions
**API Design Requirements**:
- [x] All APIs follow RESTful principles
- [x] All responses use unified JSON format with code/message/data/timestamp
- [x] All error messages include error codes and bilingual descriptions
- [x] All time fields use ISO 8601 format (RFC3339)
**Performance Requirements**:
- [x] API response time (P95) < 200ms
- [x] Database queries < 50ms (if applicable)
- [x] Non-realtime operations delegated to async tasks (if applicable)
**Testing Requirements**:
- [x] Unit tests for all Service layer business logic
- [x] Integration tests for all API endpoints
- [x] Tests are independent and use mocks/testcontainers
- [x] Target coverage: 70%+ overall, 90%+ for core business logic
### Key Entities
- **Configuration**: Represents application configuration settings including server parameters, database connections, Redis settings, logging configuration (with separate settings for app.log and access.log including independent rotation and retention policies), and middleware settings. Supports hot reload capability to apply changes without restart.
- **AuthToken**: Represents an authentication token stored in Redis cache as a simple key-value pair. The token string is used as the Redis key, and the user ID is stored as the value. Token expiration is managed via Redis TTL mechanism. This structure enables O(1) existence checks for authentication validation.
- **Request Context**: Represents the execution context of an HTTP request, containing unique request ID (UUID v4 format), authentication information (user ID from token validation), request start time, and other metadata used for logging and tracing.
- **Log Entry**: Represents a structured log record containing timestamp, severity level, message, request ID, user context, and additional contextual fields, written in JSON format.
- **Rate Limit State**: Represents the current request count and time window for a specific client IP address, used to enforce per-IP rate limiting policies. Tracks remaining quota and window reset time for each unique IP.
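
A minimal sketch of a Go struct that the Configuration entity could unmarshal into via Viper; the field names, nesting, and tags are assumptions for illustration only:

```go
package config

import "github.com/spf13/viper"

// FileLog holds per-file rotation and retention settings (FR-005, FR-006).
type FileLog struct {
	Filename   string `mapstructure:"filename"`
	MaxSizeMB  int    `mapstructure:"max_size_mb"`
	MaxAgeDays int    `mapstructure:"max_age_days"`
	Level      string `mapstructure:"level"`
}

// Config mirrors the Configuration entity; field names are illustrative.
type Config struct {
	Server struct {
		Port int `mapstructure:"port"`
	} `mapstructure:"server"`
	Redis struct {
		Addr     string `mapstructure:"addr"`
		Password string `mapstructure:"password"`
		DB       int    `mapstructure:"db"`
	} `mapstructure:"redis"`
	Log struct {
		App    FileLog `mapstructure:"app"`    // app.log settings
		Access FileLog `mapstructure:"access"` // access.log settings
	} `mapstructure:"log"`
	Limiter struct {
		Enabled     bool `mapstructure:"enabled"` // disabled by default (FR-019)
		Max         int  `mapstructure:"max"`
		WindowSecs  int  `mapstructure:"window_seconds"`
	} `mapstructure:"limiter"`
}

// Load unmarshals the currently loaded Viper state into Config.
func Load() (*Config, error) {
	var c Config
	if err := viper.Unmarshal(&c); err != nil {
		return nil, err
	}
	return &c, nil
}
```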
## Success Criteria
### Measurable Outcomes
- **SC-001**: System administrators can modify any configuration value and see it applied within 5 seconds without service restart
- **SC-002**: All API responses follow the unified `{code, data, msg}` structure with 100% consistency across all endpoints
- **SC-003**: Every HTTP request generates a unique UUID v4 request ID that appears in the X-Request-ID response header and all associated log entries
- **SC-004**: System continues processing new requests within 100ms after recovering from a panic, with zero downtime
- **SC-005**: Log files automatically rotate when reaching configured size limits (e.g., 100MB) without manual intervention
- **SC-006**: Invalid authentication tokens are rejected within 50ms with clear error messages, preventing unauthorized access
- **SC-007**: All logs are written in valid JSON format that can be parsed by standard log aggregation tools without errors
- **SC-008**: 100% of HTTP requests are logged with method, path, status, duration, and request ID for complete audit trail
- **SC-009**: Rate limiting (when enabled) successfully blocks requests exceeding configured limits within the time window with appropriate error responses
- **SC-010**: System successfully loads configuration from different environment-specific files (dev, staging, prod) based on environment variable