The crash was in handleMsgMhfGetGuildManageRight where a variable
shadowing bug (guild, err := instead of guild, err =) left the
outer guild pointer nil after the inner GetByID lookup succeeded,
panicking at guild.MemberCount. This is confirmed by the user's
stack trace pointing to handlers_guild.go:234.
Also fix 6 other nil-dereference risks across guild handlers:
- handleMsgMhfArrangeGuildMember: guild nil after GetByID
- handleMsgMhfEnumerateGuildMember: alliance nil when AllianceID > 0
- handleMsgMhfUpdateGuildIcon: guild and characterInfo nil
- handleMsgMhfOperateGuild: guild nil, characterGuildInfo nil
- handleAvoidLeadershipUpdate: characterGuildData nil
Improve panic recovery to log opcode and full stack trace so
future panics can be diagnosed from console screenshots.
Wire ExcludeOpcodes config into RecordingConn so configured opcodes
(e.g. ping, nop, position) are filtered at record time. Add padded
metadata with in-place PatchMetadata to populate CharID/UserID after
login. Implement --mode replay using protbot's encrypted connection
with timing-aware packet sending, auto-ping response, concurrent
S→C collection, and byte-level payload diff reporting.
Add a recording and replay foundation for the MHF network protocol.
A RecordingConn decorator wraps network.Conn to transparently capture
all decrypted packets to binary .mhfr files, with zero handler changes
and zero overhead when disabled.
- network/pcap: binary capture format (writer, reader, filters)
- RecordingConn: thread-safe Conn decorator with direction tracking
- CaptureOptions in config (disabled by default)
- Capture wired into all three server types (sign, entrance, channel)
- cmd/replay: CLI tool with dump, json, stats, and compare modes
- 19 new tests, all passing with -race
The ignored() function is called on every packet when debug logging
is enabled. It was rebuilding a slice and map from scratch each time.
Move to a package-level map initialized once at startup.
Centralizes all 31 direct users-table SQL queries from 11 handler
files into a single UserRepository, following the same pattern as
CharacterRepository and GuildRepository. The only excluded query is
the sign_sessions JOIN in handleMsgSysLogin which spans multiple
tables.
Silently ignored DB errors in handlers could cause data loss (frontier
point transactions completing without DB writes), reward duplication
(stamp exchange granting items on failed UPDATE), and crashes (tower
mission page=0 causing index-out-of-bounds). House access state
defaulting to 0 on DB failure also bypassed all access controls.
HIGH risk fixes:
- frontier point buy/sell now fails with ACK on DB error
- stamp exchange/stampcard abort on failed UPDATE
- guild meal INSERT returns fail ACK instead of orphaned ID 0
- mercenary/airou creation aborts on failed sequence nextval
MEDIUM risk fixes:
- tower mission page clamped to >= 1 preventing array underflow
- tower RP donation returns early on failed guild state read
- house state defaults to 2 (password-protected) on DB failure
- playtime read failure logged instead of silently resetting RP
Also cache userID on Session at login time, eliminating ~25 redundant
subqueries of the form WHERE u.id=(SELECT c.user_id FROM characters
c WHERE c.id=$1) across shop, gacha, command, and distitem handlers.
ByteFrame previously panicked on out-of-bounds reads, which crashed
the server when parsing malformed client packets. Now sets a sticky
error (checked via Err()) and returns zero values, matching the
encoding/binary scanner pattern. The session recv loop checks Err()
after parsing to reject malformed packets gracefully.
Also replaces remaining panic("Not implemented") stubs in network
packet Build/Parse methods with proper error returns.
The handler table was a package-level global populated by init(), making
registration implicit and untestable. Move it to buildHandlerTable()
which returns the map, store it as a Server struct field initialized in
NewServer(), and add a missing-handler guard in handlePacketGroup to log
a warning instead of panicking on unknown opcodes.
Replace the mutable global `_config.ErupeConfig` with dependency
injection across 79 files. Config is now threaded through existing
paths: `ClientContext.RealClientMode` for packet encoding, `s.server.
erupeConfig` for channel handlers, and explicit parameters for utility
functions. This removes hidden coupling, enables test parallelism
without global save/restore, and prevents low-level packages from
reaching up to the config layer.
Key changes:
- Enrich ClientContext with RealClientMode for packet files
- Add mode parameter to CryptConn, mhfitem, mhfcourse functions
- Convert handlers_commands init() to lazy sync.Once initialization
- Delete global var, init(), and helper functions from config.go
- Update all tests to pass config explicitly
The channel server had several concurrency issues found by the race
detector during isolation testing:
- acceptClients could send on a closed acceptConns channel during
shutdown, causing a panic. Replace close(acceptConns) with a done
channel and select-based shutdown signaling in both acceptClients
and manageSessions.
- invalidateSessions read isShuttingDown and iterated sessions without
holding the lock. Rewrite with ticker + done channel select and
snapshot sessions under lock before processing timeouts.
- sendLoop/recvLoop accessed global _config.ErupeConfig.LoopDelay
which races with tests modifying the global. Use the per-server
erupeConfig instead.
- logoutPlayer panicked on DB errors and crashed on nil DB (no-db
test scenarios). Guard with nil check and log errors instead.
- Shutdown was not idempotent, double-calling caused double-close
panic on done channel.
Add 5 channel isolation tests verifying independent shutdown,
listener failure, session panic recovery, cross-channel registry
after shutdown, and stage isolation.
Add package-level documentation (doc.go) to all 22 first-party
packages and godoc comments to ~150 previously undocumented
exported symbols across common/, network/, and server/.
Fix unchecked error returns on bf.Seek(), db.Exec(), QueryRow().Scan(),
pkt.Build(), logger.Sync(), and binary.Write() calls. The linter now
passes with 0 errors, build compiles, and all tests pass with -race.
Re-enable the golangci-lint job in CI (disabled Oct 2025), update to
Go 1.25 and golangci-lint-action v7. Fix errcheck, gosimple S1009,
staticcheck SA4031 and SA2001 errors across 54 files. Remaining ~39
lint errors will be addressed in follow-up commits.
Move time utilities (TimeAdjusted, TimeMidnight, TimeWeekStart, TimeWeekNext,
TimeGameAbsolute) from channelserver into common/gametime to break the
inappropriate dependency where signserver, entranceserver, and api imported
the 38K-line channelserver package just for time functions.
Replace all fmt.Printf debug logging in sys_session.go and handlers_object.go
with structured zap logging for consistent observability.
Resolve conflict in handlers_stage.go: keep lock-free packet
building pattern (copy session list, release lock, then build)
over upstream's in-lock QueueSendMHF approach.
Fix test compilation: remove objectIDs field references after
upstream removed it from Server struct.
Resync vendor directory with updated go.mod dependencies.