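huang fb83c9a706 feat: implement unified error handling system (003-error-handling)

- Add unified error code definitions and management (pkg/errors/codes.go)
- Add global error handler and middleware (pkg/errors/handler.go, internal/middleware/error_handler.go)
- Add error context management (pkg/errors/context.go)
- Enhance the panic recovery middleware (internal/middleware/recover.go)
- Add complete unit and integration tests
- Add feature documentation (docs/003-error-handling/)
- Add feature specification (specs/003-error-handling/)
- Update CLAUDE.md and README.md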

2025-11-15 12:17:44 +08:00


Implementation Plan: Fiber Error Handling Integration

Branch: 003-error-handling | Date: 2025-11-14 | Spec: spec.md
Input: Feature specification from /specs/003-error-handling/spec.md

Summary

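Implement a unified Fiber error handling mechanism comprising a global ErrorHandler, panic recovery, error classification, and safe error responses. The core goal is to catch every error and panic, return a JSON response in one unified format, hide sensitive information, and record the full error context in the logs.

Technical approach: two layers of protection (Fiber ErrorHandler plus defer/recover), HTTP status codes mapped from error code ranges, the Request ID passed via a header, and a silent-failure policy for logging.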

Technical Context

Language/Version: Go 1.25.4
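Primary Dependencies: Fiber v2 (HTTP framework), Zap (logging), sonic (JSON), standard library errors
Storage: N/A (no persisted data; runtime error handling only)
Testing: Go standard testing framework + httptest
Target Platform: Linux server (Docker container)
Project Type: single (backend API service)
Performance Goals: error handling latency < 1ms (P95), with no significant increase in request processing time
Constraints:

Error responses must not expose sensitive information (database errors, file paths, stack traces)
Logging failures must not block the response
The ErrorHandler itself must guard against infinite panic loops
The response content must not be modified once the response has been sent

Scale/Scope:

Affects all API endpoints (users, orders, tasks, etc.)
10+ error code definitions
3-5 new or modified files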

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

Tech Stack Adherence:

Feature uses Fiber + GORM + Viper + Zap + Lumberjack.v2 + Validator + sonic JSON + Asynq + PostgreSQL
No native calls bypass framework (no database/sql, net/http, encoding/json direct use)
All HTTP operations use Fiber framework
All database operations use GORM (N/A - this feature performs no database operations)
All async tasks use Asynq (N/A - this feature has no async tasks)
Uses Go official toolchain: go fmt, go vet, golangci-lint
Uses Go Modules for dependency management

Code Quality Standards:

Follows Handler → Service → Store → Model architecture (this feature lives mainly in the pkg/ packages)
Handler layer only handles HTTP, no business logic
Service layer contains business logic with cross-module support (N/A - this feature is infrastructure)
Store layer manages all data access with transaction support (N/A - no data access)
Uses dependency injection via struct fields (not constructor patterns)
Unified error codes in pkg/errors/ ✅ core of this feature
Unified API responses via pkg/response/ ✅ core of this feature
All constants defined in pkg/constants/
All Redis keys managed via key generation functions (N/A - no Redis operations)
No hardcoded magic numbers or strings (3+ occurrences must be constants) ✅ error codes and messages are all constants
Defined constants are used instead of hardcoding duplicate values ✅ error messages are managed through a mapping table
Code comments prefer Chinese for readability ✅ all comments are in Chinese
Log messages use Chinese ✅ all log messages are in Chinese
Error messages support Chinese ✅ error messages are Chinese-first
All exported functions/types have Go-style doc comments
Code formatted with gofmt
Follows Effective Go and Go Code Review Comments

Documentation Standards (Constitution Principle VII):

Feature summary docs placed in docs/{feature-id}/ mirroring specs/{feature-id}/
Summary doc filenames use Chinese (功能总结.md, 使用指南.md, etc.)
Summary doc content uses Chinese
README.md updated with brief Chinese summary (2-3 sentences)
Documentation is concise for first-time contributors

Go Idiomatic Design:

Package structure is flat (max 2-3 levels), organized by feature ✅ pkg/errors/
Interfaces are small (1-3 methods), defined at use site ✅ fiber.ErrorHandler
No Java-style patterns: no I-prefix, no Impl-suffix, no getters/setters
Error handling is explicit (return errors, no panic/recover abuse) ✅ core of this feature
Uses composition over inheritance
Uses goroutines and channels (not thread pools) (N/A - this feature has no concurrency)
Uses context.Context for cancellation and timeouts (N/A - error handling does not need context)
Naming follows Go conventions: short receivers, consistent abbreviations
No Hungarian notation or type prefixes
Simple constructors (New/NewXxx), no Builder pattern unless necessary

Testing Standards:

Unit tests for all core business logic (Service layer)
Integration tests for all API endpoints ✅ error handling integration tests
Tests use Go standard testing framework
Test files named *_test.go in same directory
Test functions use Test prefix, benchmarks use Benchmark prefix
Table-driven tests for multiple test cases ✅ tests cover multiple error scenarios
Test helpers marked with t.Helper()
Tests are independent (no external service dependencies)
Target coverage: 70%+ overall, 90%+ for core business ✅ 90%+ for the core error handling logic

User Experience Consistency:

All APIs use unified JSON response format ✅ core of this feature
Error responses include clear error codes and bilingual messages ✅ Chinese messages
RESTful design principles followed
Unified pagination parameters (N/A - this feature has no pagination)
Time fields use ISO 8601 format (RFC3339) ✅ timestamp field
Currency amounts use integers (N/A - this feature has no currency values)

Performance Requirements:

API response time (P95) < 200ms, (P99) < 500ms ✅ error handling < 1ms
Batch operations use bulk queries/inserts (N/A - this feature has no batch operations)
All database queries have appropriate indexes (N/A - no database operations)
List queries implement pagination (N/A - no list queries)
Non-realtime operations use async tasks (N/A - error handling must be synchronous)
Database and Redis connection pools properly configured (N/A)
Uses goroutines/channels for concurrency (N/A - error handling runs synchronously)
Uses context.Context for timeout control (N/A)
Uses sync.Pool for frequently allocated objects (optional optimization - ErrorContext)

Access Logging Standards (Constitution Principle VIII):

ALL HTTP requests logged to access.log without exception ✅ already implemented
Request parameters (query + body) logged (limited to 50KB) ✅ already implemented
Response parameters (body) logged (limited to 50KB) ✅ already implemented
Logging happens via centralized Logger middleware ✅ already implemented
No middleware bypasses access logging ✅ ErrorHandler does not bypass logging
Body truncation indicates "... (truncated)" when over 50KB limit ✅ already implemented
Access log includes all required fields ✅ already implemented

Project Structure

Documentation (this feature)

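Design documents (specs/ directory): planning and design done before development

specs/003-error-handling/
├── plan.md              # This file (/speckit.plan command output)
├── research.md          # Phase 0 output - technical research and decisions
├── data-model.md        # Phase 1 output - error handling data structures
├── quickstart.md        # Phase 1 output - quick start guide
├── contracts/           # Phase 1 output - API contracts
│   └── error-responses.yaml  # error response specification (OpenAPI)
└── tasks.md             # Phase 2 output - task breakdown (NOT created by /speckit.plan)

Summary documents (docs/ directory): post-development summaries and usage guides (per Constitution Principle VII)

docs/003-error-handling/
├── 功能总结.md          # feature overview, core implementation, technical highlights
├── 使用指南.md          # how to use the error handling mechanism
└── 架构说明.md          # error handling architecture design (optional)

README.md update: add a short description once the feature is complete

## Core Features
- **Unified error handling**: global ErrorHandler + panic recovery, unified error response format, and safe hiding of sensitive information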

Source Code (repository root)

pkg/
├── errors/
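│   ├── errors.go        # exists - AppError needs extending
│   ├── codes.go         # new - error code enum and message mapping
│   ├── handler.go       # new - Fiber ErrorHandler implementation
│   └── context.go       # new - error context extraction
├── response/
│   └── response.go      # exists - no changes needed
├── constants/
│   └── constants.go     # exists - may need a Request ID constant
└── logger/
    └── logger.go        # exists - no changes needed

internal/middleware/
└── recover.go           # exists - may need minor adjustments

cmd/api/
└── main.go              # to be modified - configure the Fiber ErrorHandler

tests/integration/
└── error_handler_test.go  # new - error handling integration tests

Structure Decision: single-project structure. Error handling lives in pkg/errors/ as an infrastructure package available to all modules, and works together with the existing pkg/response/ and pkg/logger/ packages.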

Complexity Tracking

Fill ONLY if Constitution Check has violations that must be justified

No violations. All design decisions comply with the project constitution.

Phase 0: Research (Complete ✅)

Output: research.md

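Technical research is complete and resolved the following key questions:

Fiber ErrorHandler mechanism and middleware integration
ErrorHandler self-protection (defer/recover)
Identifying and hiding sensitive information
Error handling after the response has been sent
Logging system integration and the silent-failure policy
Error classification and HTTP status code mapping
How the Request ID is passed

Key decisions:

Use two layers of protection: Fiber ErrorHandler plus defer/recover
All 5xx errors return a generic message; the original error is only written to the log
Logging follows a silent-failure policy and never blocks the response
HTTP status codes are mapped from error code ranges (1000-1999, 2000-2999)
The Request ID is passed only in a header, not in the response body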
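
The range-based mapping in the decisions above can be sketched as follows. This is a minimal illustration, not the project's codes.go: the plan names the two ranges but does not say which status each one yields, so treating 1000-1999 as client errors (400) and 2000-2999 as server errors (500) is an assumption, as is the function name HTTPStatusForCode.

```go
package main

// HTTPStatusForCode maps an application error code to a default HTTP status
// based on the range the code falls in.
//
// Assumption: 1000-1999 are client errors and 2000-2999 are server errors.
// The plan only lists the ranges, not their meaning.
func HTTPStatusForCode(code int) int {
	switch {
	case code >= 1000 && code <= 1999:
		return 400 // client-side problem
	case code >= 2000 && code <= 2999:
		return 500 // server-side problem
	default:
		// Unknown codes are treated as server errors so nothing leaks
		// through as a misleading success or client status.
		return 500
	}
}
```

A range only supplies a default. The other statuses listed in the contract (401, 404, 429, and so on) would need per-code entries on top of this.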

Phase 1: Design & Contracts (Complete ✅)

Prerequisites: research.md complete ✅

Data Model

Output: data-model.md

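Defines the core data structures for error handling:

AppError: application error type holding the error code, message, HTTP status code, and error chain
ErrorResponse: unified JSON error response format
ErrorContext: request context at the time of the error (used for logging)
ErrorCode: error code enum and message mapping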
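
As a rough sketch of what an AppError with an error chain might look like, the snippet below carries the four pieces named above. The field names, the Wrap constructor, and the AsAppError helper are hypothetical; the real definitions live in data-model.md and pkg/errors/errors.go.

```go
package main

import (
	"errors"
	"fmt"
)

// AppError is a hypothetical shape for the application error type:
// error code, message, HTTP status, and the wrapped cause.
type AppError struct {
	Code       int
	Message    string
	HTTPStatus int
	Err        error // underlying cause; may be nil
}

// Error implements the error interface and includes the cause when present.
func (e *AppError) Error() string {
	if e.Err != nil {
		return fmt.Sprintf("[%d] %s: %v", e.Code, e.Message, e.Err)
	}
	return fmt.Sprintf("[%d] %s", e.Code, e.Message)
}

// Unwrap exposes the cause so errors.Is and errors.As can walk the chain.
func (e *AppError) Unwrap() error { return e.Err }

// Wrap builds an AppError around an optional cause.
func Wrap(code, status int, message string, cause error) *AppError {
	return &AppError{Code: code, Message: message, HTTPStatus: status, Err: cause}
}

// AsAppError finds the first AppError in err's chain, if any.
func AsAppError(err error) (*AppError, bool) {
	var appErr *AppError
	ok := errors.As(err, &appErr)
	return appErr, ok
}
```

An error handler can call AsAppError on whatever it receives and fall back to a generic 500 when no AppError is found in the chain.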

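Key entities:

No persisted entities (runtime objects only)
The data flow of the error handling process is defined
Performance constraints: ErrorContext creation < 0.1ms, total latency < 1ms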

API Contracts

Output: contracts/error-responses.yaml

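Defined in OpenAPI 3.0 format:

Unified ErrorResponse schema
Common error responses (400, 401, 403, 404, 409, 429, 500, 503, 504)
Complete list of error codes (1001-1009, 2001-2006)
HTTP status code mapping rules
Security rules and the error handling flow
Concrete examples (success, client error, server error, rate limiting)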
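
To make the unified response concrete, here is one possible shape for ErrorResponse. The JSON keys (code, message, timestamp) are assumptions inferred from this plan and may differ from the actual schema in error-responses.yaml. The standard library encoder is used only to keep the sketch self-contained; the project itself specifies sonic.

```go
package main

import (
	"encoding/json"
	"time"
)

// ErrorResponse is a hypothetical unified error body.
// Key names are assumptions, not taken from the OpenAPI contract.
type ErrorResponse struct {
	Code      int    `json:"code"`
	Message   string `json:"message"`
	Timestamp string `json:"timestamp"` // RFC3339, per the plan
}

// NewErrorResponse stamps the response with the given time in RFC3339 form.
func NewErrorResponse(code int, message string, now time.Time) ErrorResponse {
	return ErrorResponse{
		Code:      code,
		Message:   message,
		Timestamp: now.Format(time.RFC3339),
	}
}

// Encode serializes the response to a JSON string.
func Encode(resp ErrorResponse) (string, error) {
	b, err := json.Marshal(resp)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Consistent with the Request ID decision in Phase 0, the body carries no request identifier; that travels in a header.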

Quick Start Guide

Output: quickstart.md

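Gives developers:

A 5-minute quick start guide
A table of common error codes
3 ways to return an error from a Handler
Client-side error handling examples (TypeScript, Python)
Advanced usage: custom messages, error chains, panic recovery
Debugging tips: tracing by Request ID
Common error scenarios and best practices
Test examples and an FAQ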

Agent Context Update

Output: CLAUDE.md updated ✅

The Claude context file has been updated with the technology stack information related to error handling.

Phase 2: Implementation Planning

This phase is handled by /speckit.tasks command, NOT by /speckit.plan.

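The /speckit.plan command stops here. Next steps:

Run /speckit.tasks to generate the detailed task breakdown (tasks.md)
Run /speckit.implement to carry out the implementation

The expected tasks.md will contain:

Task 1: Extend pkg/errors/errors.go (add the HTTPStatus field and methods)
Task 2: Create pkg/errors/codes.go (error code enum and message mapping)
Task 3: Create pkg/errors/handler.go (Fiber ErrorHandler implementation)
Task 4: Create pkg/errors/context.go (error context extraction)
Task 5: Update cmd/api/main.go (configure the ErrorHandler)
Task 6: Adjust internal/middleware/recover.go (if needed)
Task 7: Create the integration test tests/integration/error_handler_test.go
Task 8: Update the documentation in docs/003-error-handling/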

Implementation Notes

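Key dependencies

Error code definitions first: pkg/errors/codes.go must be finished first because the other components depend on the error code constants
AppError extension: extend the existing pkg/errors/errors.go while keeping backward compatibility
ErrorHandler integration: configure the Fiber ErrorHandler in cmd/api/main.go
Test-driven: write the integration tests first to verify each error scenario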

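Risks and mitigations

Risk 1: a panic inside the ErrorHandler itself crashes the service

Mitigation: protect the ErrorHandler with defer/recover and return an empty response on failure
The conditions that trigger the protection are explicit:
- Scope: defer/recover protects only the execution of the ErrorHandler function itself
- Panics caught: any panic raised inside the ErrorHandler (including a logging system crash, a JSON serialization failure, a response write error, etc.)
- Panics not caught: panics in the Fiber middleware chain are handled by the Recover middleware and fall outside this protection
- Failure response: when the ErrorHandler itself panics, return HTTP status 500 with an empty body (Content-Length: 0)
- Logging: the panic that triggered the protection is logged (if the logging system is available) without blocking the response
Example scenarios:
1. Zap logging system crashes → caught by defer/recover → HTTP 500 with empty body
2. sonic JSON serialization fails → caught by defer/recover → HTTP 500 with empty body
3. Writing the response via c.Status().JSON() fails → caught by defer/recover → HTTP 500 with empty body
4. Panic in business logic → caught by the Recover middleware → passed to the ErrorHandler → handled normally by the ErrorHandler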
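
The self-protection described for Risk 1 can be modeled without the framework. In the sketch below, response and protect are hypothetical stand-ins: the real handler has Fiber's signature func(*fiber.Ctx, error) error and writes through the Fiber context, but the defer/recover pattern is the same.

```go
package main

// response is a stand-in for whatever the framework would send.
type response struct {
	Status int
	Body   string
}

// protect runs handle and, if it panics, discards anything it had
// prepared and falls back to HTTP 500 with an empty body.
func protect(handle func(r *response)) (res response) {
	defer func() {
		if rec := recover(); rec != nil {
			// A real handler would also try to log rec here,
			// itself guarded so that logging cannot panic again.
			res = response{Status: 500, Body: ""}
		}
	}()
	handle(&res)
	return res
}
```

Because the fallback is assigned to the named result inside the deferred function, a partially built response from the failed handler never reaches the client.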

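Risk 2: a logging system failure blocks the response

Mitigation: wrap log calls in defer/recover so they fail silently

Risk 3: modifying the response after it has been sent corrupts it

Mitigation: check the response state; if it has already been sent, only write a log entry

Risk 4: sensitive information leaks

Mitigation: all 5xx errors return a generic message; the original error is only written to the log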
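
The Risk 4 rule reduces to a small decision on the status code. The function name and the generic message text below are assumptions for illustration; the plan does not give the exact wording.

```go
package main

// genericServerMessage is what clients see for every 5xx response.
// Assumption: the actual wording is defined elsewhere in the project.
const genericServerMessage = "internal server error"

// PublicMessage returns the message that is safe to send to the client.
// Server errors (5xx) are replaced by a generic message; the original
// detail is expected to go to the log only.
func PublicMessage(status int, detail string) string {
	if status >= 500 {
		return genericServerMessage
	}
	return detail
}
```

Client errors keep their specific message because it tells the caller what to fix, while server errors may contain database or file path details.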

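Performance optimizations

Preallocated error objects: common errors (ErrMissingToken, etc.) use predefined objects
Avoid string concatenation: wrap errors with fmt.Errorf and %w
Asynchronous logging: already supported by Zap; no extra configuration needed
ErrorContext pooling (optional): use sync.Pool if performance testing shows significant allocation overhead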
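
If profiling did justify the optional pooling, it might look roughly like this. The ErrorContext fields and the acquire/release helper names are hypothetical; only the use of sync.Pool comes from the plan.

```go
package main

import "sync"

// ErrorContext holds request details for logging.
// Assumption: the real struct in pkg/errors/context.go has other fields.
type ErrorContext struct {
	RequestID string
	Method    string
	Path      string
}

// errorContextPool recycles ErrorContext values across requests.
var errorContextPool = sync.Pool{
	New: func() any { return new(ErrorContext) },
}

// acquireErrorContext takes a context from the pool and fills it in.
func acquireErrorContext(requestID, method, path string) *ErrorContext {
	ec := errorContextPool.Get().(*ErrorContext)
	ec.RequestID, ec.Method, ec.Path = requestID, method, path
	return ec
}

// releaseErrorContext clears the context and returns it to the pool.
// The caller must not use ec after this call.
func releaseErrorContext(ec *ErrorContext) {
	*ec = ErrorContext{} // reset so no request data lingers in the pool
	errorContextPool.Put(ec)
}
```

Resetting before Put matters here because a stale Request ID in a reused context would mislabel a later log entry.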

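Testing strategy

Unit tests:

pkg/errors/codes.go: error code mapping functions
pkg/errors/context.go: ErrorContext extraction logic
pkg/errors/handler.go: ErrorHandler core logic

Integration tests:

Parameter validation failure → 400 error
Authentication failure → 401 error
Resource not found → 404 error
Database error → 500 error (sensitive information hidden)
Panic recovery → 500 error (stack trace written to the log)
Rate limit triggered → 429 error
Error handling after the response has been sent

Performance tests:

Error handling latency benchmark
Error handling under concurrency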

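Deployment notes

Backward compatibility: existing error handling code keeps working and migrates to the new mechanism gradually
Log rotation: make sure the log files have a correct rotation policy configured
Monitoring: configure alert rules that monitor the 5xx error rate
Documentation: update the API documentation to describe the new error response format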

Constitution Re-Check (Post-Design)

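✅ All design decisions comply with the project constitution:

Tech Stack Adherence: uses Fiber, Zap, sonic
Code Quality: clear layering, unified error codes and responses
Go Idiomatic Design: simple structs, explicit error handling, no Java-style patterns
Testing Standards: unit tests + integration tests, table-driven tests
Performance: error handling latency < 1ms
Security: sensitive information hidden, log access control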

Plan Completion: ✅ Phase 0 research and Phase 1 design are complete
Branch: 003-error-handling
Next Step: run /speckit.tasks to generate the task breakdown, then /speckit.implement to carry out the implementation

Generated Artifacts:

✅ research.md - technical research and decisions
✅ data-model.md - error handling data structures
✅ contracts/error-responses.yaml - error response specification (OpenAPI)
✅ quickstart.md - quick start guide
✅ CLAUDE.md - agent context updated
