CharacterSaveData.Save() silently returned on failure (nil decompressed
data, compression error, DB error) while the caller unconditionally
logged "Saved character data successfully". This made diagnosing save
failures difficult (ref #163).
Save() now returns an error, and all six call sites check it. The
success log in saveAllCharacterData only fires when the save actually
persisted.
Wire ExcludeOpcodes config into RecordingConn so configured opcodes
(e.g. ping, nop, position) are filtered at record time. Add padded
metadata with in-place PatchMetadata to populate CharID/UserID after
login. Implement --mode replay using protbot's encrypted connection
with timing-aware packet sending, auto-ping response, concurrent
S→C collection, and byte-level payload diff reporting.
Add a recording and replay foundation for the MHF network protocol.
A RecordingConn decorator wraps network.Conn to transparently capture
all decrypted packets to binary .mhfr files, with zero handler changes
and zero overhead when disabled.
- network/pcap: binary capture format (writer, reader, filters)
- RecordingConn: thread-safe Conn decorator with direction tracking
- CaptureOptions in config (disabled by default)
- Capture wired into all three server types (sign, entrance, channel)
- cmd/replay: CLI tool with dump, json, stats, and compare modes
- 19 new tests, all passing with -race
Add SJISToUTF8Lossy() that wraps SJISToUTF8() and logs decode errors at
slog.Debug level. Replace all 31 call sites across 17 files that previously
discarded the error with `_, _ =`. This makes garbled text from malformed
SJIS client data debuggable without adding noise at default log levels.
fmt.Sprintf inside zap logger calls defeats structured logging,
making log aggregation and filtering harder. All 6 sites now use
proper zap fields (zap.Uint32, zap.Uint8, zap.String).
LoopDelay had no viper.SetDefault, so omitting it from config.json
caused a zero-value (0 ms) busy-loop in the recv loop. Default is
now 50 ms, matching config.example.json.
main.go always sets both Channels and Registry together, making the
Channels fallback paths dead code. This removes:
- Server.Channels field from the Server struct
- 3 if/else fallback blocks in handlers_session.go (replaced with
Registry.FindChannelForStage, SearchSessions, SearchStages)
- 1 if/else fallback block in handlers_guild_ops.go (replaced with
Registry.NotifyMailToCharID)
- 3 method fallbacks in sys_channel_server.go (WorldcastMHF,
FindSessionByCharID, DisconnectUser now delegate directly)
Updates anti-patterns.md #6 to "accepted design" — Session struct is
appropriate for this game server's handler pattern, and cross-channel
coupling is now fully routed through the ChannelRegistry interface.
The global stagesLock sync.RWMutex protected map[string]*Stage, causing
all stage operations to contend on a single lock even for unrelated
stages. Any stage creation or deletion blocked all reads server-wide.
Replace with a typed StageMap wrapper around sync.Map which provides
lock-free reads and allows concurrent writes to disjoint keys. Per-stage
sync.RWMutex remains unchanged for protecting individual stage state.
StageMap exposes Get, GetOrCreate, StoreIfAbsent, Store, Delete, and
Range methods. Updated ~50 call sites across 6 production files and
9 test files.
The userBinary and minidata maps with their locks were spread across
Server as raw fields with manual lock management. Cross-channel session
searches also required acquiring nested locks (server lock + binary
lock). Encapsulating in dedicated types eliminates the nested locking
and reduces Server's field count by 4.
The config package used `package _config` with a leading underscore,
which is unconventional in Go. Rename to `package config` (matching the
directory name) and use `cfg` as the standard import alias across all
93 importing files.
Eliminate 18 direct s.server.db calls from handlers_items.go,
handlers_distitem.go, and handlers_session.go by moving queries into
dedicated repository types.
New repositories:
- StampRepository (7 methods, stamps table)
- DistributionRepository (4 methods, distribution/distribution_items)
- SessionRepository (4 methods, sign_sessions/servers)
Also adds ClearTreasureHunt and InsertKillLog to GuildRepository,
which already owns those tables for read operations.
The stages map was protected by two incompatible locks: the embedded
Server.Mutex and Server.stagesLock (RWMutex). Since these are separate
mutexes they don't exclude each other, and many handlers accessed the
map with no lock at all.
Route all stages map access through stagesLock: read-only lookups use
RLock, writes (create/delete) use Lock. Per-stage field mutations
continue to use each stage's own RWMutex. Restructure
handleMsgSysUnlockStage to avoid holding stagesLock nested inside a
stage RLock, preventing potential deadlock with destructEmptyStages.
Centralizes all 31 direct users-table SQL queries from 11 handler
files into a single UserRepository, following the same pattern as
CharacterRepository and GuildRepository. The only excluded query is
the sign_sessions JOIN in handleMsgSysLogin which spans multiple
tables.
Add 18 new typed methods to CharacterRepository (ReadTime, SaveTime,
SaveInt, SaveBool, SaveString, ReadBool, ReadString, LoadColumnWithDefault,
SetDeleted, UpdateDailyCafe, ResetDailyQuests, ReadEtcPoints, ResetCafeTime,
UpdateGuildPostChecked, ReadGuildPostChecked, SaveMercenary, UpdateGCPAndPact,
FindByRastaID) and migrate ~56 inline SQL queries across 13 handler files.
Pure refactor — zero behavior change. Each handler produces identical SQL
with identical parameters. Cross-table JOINs and bulk CharacterSaveData
operations are intentionally left out of scope.
Centralizes all characters table SQL behind a CharacterRepository struct
in repo_character.go. The 4 existing helpers (loadCharacterData,
saveCharacterData, readCharacterInt, adjustCharacterInt) now delegate to
the repository, keeping identical signatures so all ~70 callsites remain
unchanged. Direct queries in handlers_session.go, sys_channel_server.go
(DisconnectUser), and handlers_mail.go are also migrated.
Pure refactor with zero behavior change — first step toward eliminating
the ~130 scattered character queries identified in anti-patterns #9.
Add readCharacterInt/adjustCharacterInt helpers for single-column
integer operations on the characters table. Eliminates fmt.Sprintf
SQL construction in handlers_misc.go and replaces inline queries
across cafe, kouryou, and mercenary handlers.
Second round of protocol constant extraction: adds constants_time.go
(secsPerDay, secsPerWeek), constants_raviente.go (register IDs,
semaphore constants), and named constants across 14 handler files
replacing raw hex/numeric literals. Updates anti-patterns doc to
mark #4 (magic numbers) as substantially fixed.
Silently ignored DB errors in handlers could cause data loss (frontier
point transactions completing without DB writes), reward duplication
(stamp exchange granting items on failed UPDATE), and crashes (tower
mission page=0 causing index-out-of-bounds). House access state
defaulting to 0 on DB failure also bypassed all access controls.
HIGH risk fixes:
- frontier point buy/sell now fails with ACK on DB error
- stamp exchange/stampcard abort on failed UPDATE
- guild meal INSERT returns fail ACK instead of orphaned ID 0
- mercenary/airou creation aborts on failed sequence nextval
MEDIUM risk fixes:
- tower mission page clamped to >= 1 preventing array underflow
- tower RP donation returns early on failed guild state read
- house state defaults to 2 (password-protected) on DB failure
- playtime read failure logged instead of silently resetting RP
Also cache userID on Session at login time, eliminating ~25 redundant
subqueries of the form WHERE u.id=(SELECT c.user_id FROM characters
c WHERE c.id=$1) across shop, gacha, command, and distitem handlers.
Extract numeric literals into named constants across quest handling,
save data parsing, rengoku skill layout, diva event timing, guild info,
achievement trophies, RP accrual rates, and semaphore IDs. Adds
constants_quest.go for quest-related constants shared across functions.
Pure rename/extract with zero behavior change.
Handlers that log errors and return without sending a MsgSysAck leave
the client waiting indefinitely. Add doAckSimpleFail/doAckBufFail to
14 error paths across 4 files, matching the pattern already used in
~70 other error paths across the codebase.
Affected handlers:
- handleMsgMhfGetCafeDuration (1 path)
- handleMsgMhfSavedata (1 path)
- handleMsgMhfArrangeGuildMember (3 paths)
- handleMsgMhfEnumerateGuildMember (5 paths)
- handleMsgSysLogin (4 paths)
- handleMsgSysIssueLogkey (1 path)
Replace the mutable global `_config.ErupeConfig` with dependency
injection across 79 files. Config is now threaded through existing
paths: `ClientContext.RealClientMode` for packet encoding, `s.server.
erupeConfig` for channel handlers, and explicit parameters for utility
functions. This removes hidden coupling, enables test parallelism
without global save/restore, and prevents low-level packages from
reaching up to the config layer.
Key changes:
- Enrich ClientContext with RealClientMode for packet files
- Add mode parameter to CryptConn, mhfitem, mhfcourse functions
- Convert handlers_commands init() to lazy sync.Once initialization
- Delete global var, init(), and helper functions from config.go
- Update all tests to pass config explicitly
The channel server had several concurrency issues found by the race
detector during isolation testing:
- acceptClients could send on a closed acceptConns channel during
shutdown, causing a panic. Replace close(acceptConns) with a done
channel and select-based shutdown signaling in both acceptClients
and manageSessions.
- invalidateSessions read isShuttingDown and iterated sessions without
holding the lock. Rewrite with ticker + done channel select and
snapshot sessions under lock before processing timeouts.
- sendLoop/recvLoop accessed global _config.ErupeConfig.LoopDelay
which races with tests modifying the global. Use the per-server
erupeConfig instead.
- logoutPlayer panicked on DB errors and crashed on nil DB (no-db
test scenarios). Guard with nil check and log errors instead.
- Shutdown was not idempotent, double-calling caused double-close
panic on done channel.
Add 5 channel isolation tests verifying independent shutdown,
listener failure, session panic recovery, cross-channel registry
after shutdown, and stage isolation.
Enable multiple Erupe instances to share a single PostgreSQL database
without destroying each other's state, fix existing data races in
cross-channel access, and lay groundwork for future distributed
channel server deployments.
Phase 1 — DB safety:
- Scope DELETE FROM servers/sign_sessions to this instance's server IDs
- Fix ci++ bug where failed channel start shifted subsequent IDs
Phase 2 — Fix data races in cross-channel access:
- Lock sessions map in FindSessionByCharID and DisconnectUser
- Lock stagesLock in handleMsgSysLockGlobalSema
- Snapshot sessions/stages under lock in TransitMessage types 1-4
- Lock channel when finding mail notification targets
Phase 3 — ChannelRegistry interface:
- Define ChannelRegistry interface with 7 cross-channel operations
- Implement LocalChannelRegistry with proper locking
- Add SessionSnapshot/StageSnapshot immutable copy types
- Delegate WorldcastMHF, FindSessionByCharID, DisconnectUser to Registry
- Migrate LockGlobalSema and guild mail handlers to use Registry
- Add comprehensive tests including concurrent access
Phase 4 — Per-channel enable/disable:
- Add Enabled *bool to EntranceChannelInfo (nil defaults to true)
- Skip disabled channels in startup loop, preserving ID stability
- Add IsEnabled() helper with backward-compatible default
- Update config.example.json with Enabled field