Handlers that log errors and return without sending a MsgSysAck leave
the client waiting indefinitely. Add doAckSimpleFail/doAckBufFail to
14 error paths across 4 files, matching the pattern already used in
~70 other error paths across the codebase.
Affected handlers:
- handleMsgMhfGetCafeDuration (1 path)
- handleMsgMhfSavedata (1 path)
- handleMsgMhfArrangeGuildMember (3 paths)
- handleMsgMhfEnumerateGuildMember (5 paths)
- handleMsgSysLogin (4 paths)
- handleMsgSysIssueLogkey (1 path)
Replace all fmt.Printf/Println and log.Printf/Fatal with structured
zap.Logger calls to eliminate inconsistent logging (anti-pattern #12).
- network/crypt_conn: inject logger via NewCryptConn, replace 6 fmt calls
- signserver/session: use existing s.logger for debug packet dumps
- entranceserver: use s.logger for inbound/outbound debug logging
- api/utils: accept logger param in verifyPath, replace fmt.Println
- api/endpoints: use s.logger for screenshot path diagnostics
- config: replace log.Fatal with error return in getOutboundIP4
- deltacomp: replace log.Printf with zap.L() global logger
The handler table was a package-level global populated by init(), making
registration implicit and untestable. Move it to buildHandlerTable()
which returns the map, store it as a Server struct field initialized in
NewServer(), and add a missing-handler guard in handlePacketGroup to log
a warning instead of panicking on unknown opcodes.
Covers 13 identified anti-patterns across error handling, architecture,
concurrency, and code organization with severity ratings and
recommended refactoring priorities.
Replace vague "Alpelo object system backport" references in CHANGELOG
and AUTHORS with accurate descriptions of the per-session object ID
allocation rework. Remove stale test comment referencing the deleted
NextObjectID method.
Replace the mutable global `_config.ErupeConfig` with dependency
injection across 79 files. Config is now threaded through existing
paths: `ClientContext.RealClientMode` for packet encoding, `s.server.
erupeConfig` for channel handlers, and explicit parameters for utility
functions. This removes hidden coupling, enables test parallelism
without global save/restore, and prevents low-level packages from
reaching up to the config layer.
Key changes:
- Enrich ClientContext with RealClientMode for packet files
- Add mode parameter to CryptConn, mhfitem, mhfcourse functions
- Convert handlers_commands init() to lazy sync.Once initialization
- Delete global var, init(), and helper functions from config.go
- Update all tests to pass config explicitly
Summary
- Each channel server now runs as a fully independent instance with its own listener, goroutines, and
state — one channel crashing or shutting down no longer affects the others
- Introduces a ChannelRegistry interface for cross-channel operations (find session, disconnect user,
worldcast) replacing direct iteration over a shared []*Server slice
- Adds cmd/protbot, a headless MHF protocol bot that exercises the full sign → entrance → channel
flow for automated testing
- Fixes several data races and panics found by the race detector during isolation testing
Changes
Channel server isolation (server/channelserver/)
- ChannelRegistry interface + LocalChannelRegistry implementation for cross-channel lookups
- done channel for clean goroutine shutdown signaling, idempotent Shutdown()
- Race-free acceptClients/manageSessions using select on done instead of closing acceptConns
- invalidateSessions rewritten with proper locking (snapshot under lock, process outside)
- logoutPlayer guards nil DB and logs errors instead of panicking
- Session loops use per-server erupeConfig instead of global _config.ErupeConfig
- Per-channel Enabled flag in config for selectively disabling channels
Protocol bot (cmd/protbot/)
- Standalone Blowfish connection package (no dependency on server config)
- Sign, entrance, and channel protocol implementations
- 5 scenario actions: login, lobby, session, chat, quests
- 19 unit tests covering packet building, parsing, and connection handling
Bug fixes
- Nil decompSave panic on disconnect before character data loads
- Docker Postgres 18 volume mount path (/var/lib/postgresql/ not /data/)
Test plan
- go test -race ./... passes (27 packages, 0 races)
- 5 channel isolation tests verify: independent shutdown, listener failure recovery, session panic
containment, cross-channel registry after shutdown, stage isolation
- Protbot live-tested against Docker stack (all 5 actions)
- Existing config.json files work unchanged (Enabled defaults to false but config.example.json sets
it explicitly)
The channel server had several concurrency issues found by the race
detector during isolation testing:
- acceptClients could send on a closed acceptConns channel during
shutdown, causing a panic. Replace close(acceptConns) with a done
channel and select-based shutdown signaling in both acceptClients
and manageSessions.
- invalidateSessions read isShuttingDown and iterated sessions without
holding the lock. Rewrite with ticker + done channel select and
snapshot sessions under lock before processing timeouts.
- sendLoop/recvLoop accessed global _config.ErupeConfig.LoopDelay
which races with tests modifying the global. Use the per-server
erupeConfig instead.
- logoutPlayer panicked on DB errors and crashed on nil DB (no-db
test scenarios). Guard with nil check and log errors instead.
- Shutdown was not idempotent, double-calling caused double-close
panic on done channel.
Add 5 channel isolation tests verifying independent shutdown,
listener failure, session panic recovery, cross-channel registry
after shutdown, and stage isolation.
The protbot sent "DSGN:\x00" as the sign request type, but the server
strips the last 3 characters as a version suffix. Send "DSGN:041"
(ZZ client mode 41) to match the real client format.
The entrance channel entry parser read 14 bytes for remaining fields
but the server writes 18 bytes (9 uint16, not 7), causing a panic
when parsing the server list.
The channel server panicked on disconnect when a session had no
decompressed save data (e.g. protbot or early client disconnect).
Guard Save() against nil decompSave.
Also fix docker-compose volume mount for Postgres 18 which changed
its data directory layout.
Copy MHBridge into the Erupe module as cmd/protbot/ so it can be
built, tested, and maintained alongside the server. The bot
implements the full sign → entrance → channel login flow and
supports lobby entry, chat, session setup, and quest enumeration.
The conn/ package keeps its own Blowfish crypto primitives to avoid
importing erupe-ce/config (which requires a config file at init).
Enable multiple Erupe instances to share a single PostgreSQL database
without destroying each other's state, fix existing data races in
cross-channel access, and lay groundwork for future distributed
channel server deployments.
Phase 1 — DB safety:
- Scope DELETE FROM servers/sign_sessions to this instance's server IDs
- Fix ci++ bug where failed channel start shifted subsequent IDs
Phase 2 — Fix data races in cross-channel access:
- Lock sessions map in FindSessionByCharID and DisconnectUser
- Lock stagesLock in handleMsgSysLockGlobalSema
- Snapshot sessions/stages under lock in TransitMessage types 1-4
- Lock channel when finding mail notification targets
Phase 3 — ChannelRegistry interface:
- Define ChannelRegistry interface with 7 cross-channel operations
- Implement LocalChannelRegistry with proper locking
- Add SessionSnapshot/StageSnapshot immutable copy types
- Delegate WorldcastMHF, FindSessionByCharID, DisconnectUser to Registry
- Migrate LockGlobalSema and guild mail handlers to use Registry
- Add comprehensive tests including concurrent access
Phase 4 — Per-channel enable/disable:
- Add Enabled *bool to EntranceChannelInfo (nil defaults to true)
- Skip disabled channels in startup loop, preserving ID stability
- Add IsEnabled() helper with backward-compatible default
- Update config.example.json with Enabled field
Wii U decompilation of all 6 callers of snj_stage_create confirms
the field distinguishes new stage creation (1) from entering an
existing stage (2): lobby/myhouse/quest pass 1, guild room and
move operations pass 2.
A malicious or buggy client could send arbitrarily large payloads
that get written directly to PostgreSQL, wasting disk and memory.
Each save handler now rejects payloads exceeding a generous upper
bound derived from the known data format sizes.
Covers all remaining items from #158: partner, hunternavi,
savemercenary, scenariodata, platedata, platebox, platemyset,
rengokudata, mezfes, savefavoritequest, house_furniture, mission.
Closes#158
Several handlers used packet fields as array indices or SQL column
names without bounds checking, allowing crafted packets to panic the
server or produce malformed SQL.
Panic fixes (high severity):
- handlers_mail: bounds check AccIndex against mailList length
- handlers_misc: validate ArmourID >= 10000 and MogType <= 4
- handlers_mercenary: check RawDataPayload length before slicing
- handlers_house: check RawDataPayload length in SaveDecoMyset
- handlers_register: guard empty RawDataPayload in OperateRegister
SQL column name fixes (medium severity):
- handlers_misc: early return on unknown PointType
- handlers_items: reject unknown StampType in weekly stamp handlers
- handlers_achievement: cap AchievementID at 32
- handlers_goocoo: skip goocoo.Index > 4
- handlers_house: cap BoxIndex for warehouse operations
- handlers_tower: fix MissionIndex=0 bypassing normalization guard
UserBinary type1-5 and EnhancedMinidata are transient session state
resent by the client on every login. Persisting them to the DB on
every set was unnecessary I/O. Both are now served exclusively from
server-scoped in-memory maps (userBinaryParts, minidataParts).
Includes a schema migration to drop the now-unused type2/type3
columns from user_binary and minidata column from characters.
Ref #158
- Reject BinaryType outside 1-5 in SetUserBinary to prevent
dynamic column name with unchecked client input
- Check rengoku payload length before DB write and fixed-offset
reads to prevent panic on short payloads
- Require MercData >= 4 bytes before ReadUint32 to prevent panic
Ref: Mezeporta/Erupe#158
GetStageBinary and WaitStageBinary silently dropped the ACK when
the requested stage did not exist, leaving the client waiting
indefinitely. Additionally, BinaryType1 == 4 and unknown binary
types returned a completely empty response (zero bytes), which
earlier clients cannot parse as a counted structure.
Return a 4-byte zero response (empty entry count) in all fallback
paths so the client always receives a valid ACK it can parse.
Verified via Wii U decompilation of putUpdate_house: the field is set
to 0 when no password is provided, and 1 when a password string is
present. The previous comment "Always 0x01" was inaccurate.
Add package-level documentation (doc.go) to all 22 first-party
packages and godoc comments to ~150 previously undocumented
exported symbols across common/, network/, and server/.
Ghidra decompilation of hf_gp_main in the Wii U binary revealed that
these three fields are reward eligibility thresholds checked against
the player's Hunter Rank, max Skill Rank, and G Rank respectively.
The extra reward fields (Unk5, Unk6, Unk7) in the InfoFesta response
were gated at >= G1, but G1 clients do not expect these 5 extra bytes
per reward entry. This caused the entire packet after the rewards
section to be misaligned, corrupting MaximumFP, leaderboards, and
bonus rates — which broke the festa UI including trial voting.
Wii U disassembly of import_festa_info (0x02C470EC, 1068 bytes)
confirms G3-Z2 reads these fields. G1 binary analysis shows only
8 festa packets (vs 12 in ZZ), and the intermediate/personal prize
systems were not added until G5.2/G7 respectively.
Move SavePointer type/constants, CharacterSaveData struct, getPointers,
Compress, Decompress, and save data serialization methods out of
handlers_character.go into a dedicated model file.
Split three large files into focused modules:
- handlers_guild.go: extract types/ORM into guild_model.go
- handlers_cast_binary.go: extract command parser into handlers_commands.go
- handlers.go: move seibattle types/handlers into handlers_seibattle.go
Separate the two distinct systems into focused files:
- handlers_shop.go: item shops, exchange shops, frontier point trading
- handlers_gacha.go: normal/stepup/box/free gacha, coin management
The client's Sky Corridor area transition handler saves rengoku data
before the load response is parsed into the character data area,
producing saves with zeroed skill fields but preserved point totals.
Detect this pattern server-side and merge existing skill data from the
database. Also reject sentinel-zero saves when valid data already exists.
- Guard nil listener/acceptConns in Server.Shutdown() to prevent panic
in test servers that don't bind a network listener
- Remove redundant userBinaryPartsLock in TestHandleMsgMhfLoaddata that
caused a deadlock with handleMsgMhfLoaddata's own lock acquisition
- Increase test save blob size from 200 to 150000 bytes to accommodate
ZZ save pointer offsets (up to 146728)
- Initialize MHFEquipment.Sigils[].Effects slices in test data to
prevent index-out-of-range panic in SerializeWarehouseEquipment
- Insert warehouse row before updating it (UPDATE on 0 rows is not an
error, so the INSERT fallback never triggered)
- Use COALESCE for nullable kouryou_point column in kill counter test
- Fix duplicate-add test expectation (CSV helper correctly deduplicates)
Anchor the token regex to ^[A-Za-z0-9]+$ so partial matches on
traversal strings like "../../etc/passwd" are rejected. Refactor
the handler to use early returns so execution stops immediately
on validation failure instead of falling through to os.Create
with tainted input.
Use a `deleted` boolean column (matching characters and mail tables)
instead of permanently removing guild_posts rows. This makes excess
post purging and manual deletion reversible.
Add explicit error discards (_ =) for Close() calls on network
connections, SQL rows, and file handles across 28 files. Also add
.golangci.yml with standard linter defaults to match CI configuration.
golangci-lint v2 removed the --out-format CLI flag, causing the lint
job to fail. The golangci-lint-action v7 already uses problem matchers
to surface issues natively in GitHub Actions.
138 bare db.Exec calls across 22 handler files silently dropped write
errors. Each is now wrapped with error check and zap logging.
4 QueryRow sites that legitimately return sql.ErrNoRows during normal
operation (new player mezfes, festa rankings, empty guild item box)
now filter it out to reduce log noise.
Purge oldest guild posts beyond the limit (100 messages, 4 news) after
each new post is created. Replace misleading alliance application TODO
with a note that the feature is not yet implemented.
Fix unchecked error returns on bf.Seek(), db.Exec(), QueryRow().Scan(),
pkt.Build(), logger.Sync(), and binary.Write() calls. The linter now
passes with 0 errors, build compiles, and all tests pass with -race.
Re-enable the golangci-lint job in CI (disabled Oct 2025), update to
Go 1.25 and golangci-lint-action v7. Fix errcheck, gosimple S1009,
staticcheck SA4031 and SA2001 errors across 54 files. Remaining ~39
lint errors will be addressed in follow-up commits.
Move time utilities (TimeAdjusted, TimeMidnight, TimeWeekStart, TimeWeekNext,
TimeGameAbsolute) from channelserver into common/gametime to break the
inappropriate dependency where signserver, entranceserver, and api imported
the 38K-line channelserver package just for time functions.
Replace all fmt.Printf debug logging in sys_session.go and handlers_object.go
with structured zap logging for consistent observability.
Add error checking and logging for ~25 database call sites that were
silently dropping errors, preventing resource leaks (unclosed rows),
nil pointer panics, and silent data corruption in festa transactions.
- Remove deprecated version field from docker-compose.yml
- Pin Postgres to 18-alpine (matches existing db-data)
- Remove undocumented web (Apache) service
- Fix config/bin volume mounts to use docker/ directory
- Gitignore docker/savedata, docker/bin, docker/config.json
- Rewrite docker/README.md: fix typos, use docker compose V2
commands, match actual compose file behavior
- Link docker/README.md from main README Docker section