- Add lint to build job's needs so lint failures prevent artifacts
- Combine race detector + coverage into a single test run
- Pin golangci-lint to v2.1.6 to avoid surprise breakage
- Rename workflow from go-improved.yml to go.yml
Silently ignored DB errors in handlers could cause data loss (frontier
point transactions completing without DB writes), reward duplication
(stamp exchange granting items on failed UPDATE), and crashes (tower
mission page=0 causing index-out-of-bounds). House access state
defaulting to 0 on DB failure also bypassed all access controls.
HIGH risk fixes:
- frontier point buy/sell now fails with ACK on DB error
- stamp exchange/stampcard abort on failed UPDATE
- guild meal INSERT returns fail ACK instead of orphaned ID 0
- mercenary/airou creation aborts on failed sequence nextval
MEDIUM risk fixes:
- tower mission page clamped to >= 1 preventing array underflow
- tower RP donation returns early on failed guild state read
- house state defaults to 2 (password-protected) on DB failure
- playtime read failure logged instead of silently resetting RP
Also cache userID on Session at login time, eliminating ~25 redundant
subqueries of the form WHERE u.id=(SELECT c.user_id FROM characters
c WHERE c.id=$1) across shop, gacha, command, and distitem handlers.
Binary I/O (#5): all 12 remaining encoding/binary calls are
legitimate (zero-alloc spot-reads, random-access into game blobs).
Copy-paste handlers (#8): loadCharacterData/saveCharacterData helpers
now cover standard blob patterns.
Also upgrades saveCharacterData to send doAckSimpleFail on oversize
payloads and DB errors, and migrates handleMsgMhfSaveScenarioData
to the improved helper.
handleMsgSysOperateRegister returned without sending an ACK when the
payload was empty, causing the client to softlock waiting for a response.
Send doAckBufSucceed with nil data on the early-return path to match
the success-path ACK type.
Also update tests to expect error returns instead of panics from
unimplemented Build/Parse stubs (matching prior panic→error refactor),
and mark resolved anti-patterns in docs.
ByteFrame previously panicked on out-of-bounds reads, which crashed
the server when parsing malformed client packets. Now sets a sticky
error (checked via Err()) and returns zero values, matching the
encoding/binary scanner pattern. The session recv loop checks Err()
after parsing to reject malformed packets gracefully.
Also replaces remaining panic("Not implemented") stubs in network
packet Build/Parse methods with proper error returns.
Extract numeric literals into named constants across quest handling,
save data parsing, rengoku skill layout, diva event timing, guild info,
achievement trophies, RP accrual rates, and semaphore IDs. Adds
constants_quest.go for quest-related constants shared across functions.
Pure rename/extract with zero behavior change.
Handlers that log errors and return without sending a MsgSysAck leave
the client waiting indefinitely. Add doAckSimpleFail/doAckBufFail to
14 error paths across 4 files, matching the pattern already used in
~70 other error paths across the codebase.
Affected handlers:
- handleMsgMhfGetCafeDuration (1 path)
- handleMsgMhfSavedata (1 path)
- handleMsgMhfArrangeGuildMember (3 paths)
- handleMsgMhfEnumerateGuildMember (5 paths)
- handleMsgSysLogin (4 paths)
- handleMsgSysIssueLogkey (1 path)
Replace all fmt.Printf/Println and log.Printf/Fatal with structured
zap.Logger calls to eliminate inconsistent logging (anti-pattern #12).
- network/crypt_conn: inject logger via NewCryptConn, replace 6 fmt calls
- signserver/session: use existing s.logger for debug packet dumps
- entranceserver: use s.logger for inbound/outbound debug logging
- api/utils: accept logger param in verifyPath, replace fmt.Println
- api/endpoints: use s.logger for screenshot path diagnostics
- config: replace log.Fatal with error return in getOutboundIP4
- deltacomp: replace log.Printf with zap.L() global logger
The handler table was a package-level global populated by init(), making
registration implicit and untestable. Move it to buildHandlerTable()
which returns the map, store it as a Server struct field initialized in
NewServer(), and add a missing-handler guard in handlePacketGroup to log
a warning instead of panicking on unknown opcodes.
Covers 13 identified anti-patterns across error handling, architecture,
concurrency, and code organization with severity ratings and
recommended refactoring priorities.
Replace vague "Alpelo object system backport" references in CHANGELOG
and AUTHORS with accurate descriptions of the per-session object ID
allocation rework. Remove stale test comment referencing the deleted
NextObjectID method.
Replace the mutable global `_config.ErupeConfig` with dependency
injection across 79 files. Config is now threaded through existing
paths: `ClientContext.RealClientMode` for packet encoding, `s.server.
erupeConfig` for channel handlers, and explicit parameters for utility
functions. This removes hidden coupling, enables test parallelism
without global save/restore, and prevents low-level packages from
reaching up to the config layer.
Key changes:
- Enrich ClientContext with RealClientMode for packet files
- Add mode parameter to CryptConn, mhfitem, mhfcourse functions
- Convert handlers_commands init() to lazy sync.Once initialization
- Delete global var, init(), and helper functions from config.go
- Update all tests to pass config explicitly
Summary
- Each channel server now runs as a fully independent instance with its own listener, goroutines, and
state — one channel crashing or shutting down no longer affects the others
- Introduces a ChannelRegistry interface for cross-channel operations (find session, disconnect user,
worldcast) replacing direct iteration over a shared []*Server slice
- Adds cmd/protbot, a headless MHF protocol bot that exercises the full sign → entrance → channel
flow for automated testing
- Fixes several data races and panics found by the race detector during isolation testing
Changes
Channel server isolation (server/channelserver/)
- ChannelRegistry interface + LocalChannelRegistry implementation for cross-channel lookups
- done channel for clean goroutine shutdown signaling, idempotent Shutdown()
- Race-free acceptClients/manageSessions using select on done instead of closing acceptConns
- invalidateSessions rewritten with proper locking (snapshot under lock, process outside)
- logoutPlayer guards nil DB and logs errors instead of panicking
- Session loops use per-server erupeConfig instead of global _config.ErupeConfig
- Per-channel Enabled flag in config for selectively disabling channels
Protocol bot (cmd/protbot/)
- Standalone Blowfish connection package (no dependency on server config)
- Sign, entrance, and channel protocol implementations
- 5 scenario actions: login, lobby, session, chat, quests
- 19 unit tests covering packet building, parsing, and connection handling
Bug fixes
- Nil decompSave panic on disconnect before character data loads
- Docker Postgres 18 volume mount path (/var/lib/postgresql/ not /data/)
Test plan
- go test -race ./... passes (27 packages, 0 races)
- 5 channel isolation tests verify: independent shutdown, listener failure recovery, session panic
containment, cross-channel registry after shutdown, stage isolation
- Protbot live-tested against Docker stack (all 5 actions)
- Existing config.json files work unchanged (Enabled defaults to false but config.example.json sets
it explicitly)
The channel server had several concurrency issues found by the race
detector during isolation testing:
- acceptClients could send on a closed acceptConns channel during
shutdown, causing a panic. Replace close(acceptConns) with a done
channel and select-based shutdown signaling in both acceptClients
and manageSessions.
- invalidateSessions read isShuttingDown and iterated sessions without
holding the lock. Rewrite with ticker + done channel select and
snapshot sessions under lock before processing timeouts.
- sendLoop/recvLoop accessed global _config.ErupeConfig.LoopDelay
which races with tests modifying the global. Use the per-server
erupeConfig instead.
- logoutPlayer panicked on DB errors and crashed on nil DB (no-db
test scenarios). Guard with nil check and log errors instead.
- Shutdown was not idempotent, double-calling caused double-close
panic on done channel.
Add 5 channel isolation tests verifying independent shutdown,
listener failure, session panic recovery, cross-channel registry
after shutdown, and stage isolation.
The protbot sent "DSGN:\x00" as the sign request type, but the server
strips the last 3 characters as a version suffix. Send "DSGN:041"
(ZZ client mode 41) to match the real client format.
The entrance channel entry parser read 14 bytes for remaining fields
but the server writes 18 bytes (9 uint16, not 7), causing a panic
when parsing the server list.
The channel server panicked on disconnect when a session had no
decompressed save data (e.g. protbot or early client disconnect).
Guard Save() against nil decompSave.
Also fix docker-compose volume mount for Postgres 18 which changed
its data directory layout.
Copy MHBridge into the Erupe module as cmd/protbot/ so it can be
built, tested, and maintained alongside the server. The bot
implements the full sign → entrance → channel login flow and
supports lobby entry, chat, session setup, and quest enumeration.
The conn/ package keeps its own Blowfish crypto primitives to avoid
importing erupe-ce/config (which requires a config file at init).
Enable multiple Erupe instances to share a single PostgreSQL database
without destroying each other's state, fix existing data races in
cross-channel access, and lay groundwork for future distributed
channel server deployments.
Phase 1 — DB safety:
- Scope DELETE FROM servers/sign_sessions to this instance's server IDs
- Fix ci++ bug where failed channel start shifted subsequent IDs
Phase 2 — Fix data races in cross-channel access:
- Lock sessions map in FindSessionByCharID and DisconnectUser
- Lock stagesLock in handleMsgSysLockGlobalSema
- Snapshot sessions/stages under lock in TransitMessage types 1-4
- Lock channel when finding mail notification targets
Phase 3 — ChannelRegistry interface:
- Define ChannelRegistry interface with 7 cross-channel operations
- Implement LocalChannelRegistry with proper locking
- Add SessionSnapshot/StageSnapshot immutable copy types
- Delegate WorldcastMHF, FindSessionByCharID, DisconnectUser to Registry
- Migrate LockGlobalSema and guild mail handlers to use Registry
- Add comprehensive tests including concurrent access
Phase 4 — Per-channel enable/disable:
- Add Enabled *bool to EntranceChannelInfo (nil defaults to true)
- Skip disabled channels in startup loop, preserving ID stability
- Add IsEnabled() helper with backward-compatible default
- Update config.example.json with Enabled field
Wii U decompilation of all 6 callers of snj_stage_create confirms
the field distinguishes new stage creation (1) from entering an
existing stage (2): lobby/myhouse/quest pass 1, guild room and
move operations pass 2.
A malicious or buggy client could send arbitrarily large payloads
that get written directly to PostgreSQL, wasting disk and memory.
Each save handler now rejects payloads exceeding a generous upper
bound derived from the known data format sizes.
Covers all remaining items from #158: partner, hunternavi,
savemercenary, scenariodata, platedata, platebox, platemyset,
rengokudata, mezfes, savefavoritequest, house_furniture, mission.
Closes#158
Several handlers used packet fields as array indices or SQL column
names without bounds checking, allowing crafted packets to panic the
server or produce malformed SQL.
Panic fixes (high severity):
- handlers_mail: bounds check AccIndex against mailList length
- handlers_misc: validate ArmourID >= 10000 and MogType <= 4
- handlers_mercenary: check RawDataPayload length before slicing
- handlers_house: check RawDataPayload length in SaveDecoMyset
- handlers_register: guard empty RawDataPayload in OperateRegister
SQL column name fixes (medium severity):
- handlers_misc: early return on unknown PointType
- handlers_items: reject unknown StampType in weekly stamp handlers
- handlers_achievement: cap AchievementID at 32
- handlers_goocoo: skip goocoo.Index > 4
- handlers_house: cap BoxIndex for warehouse operations
- handlers_tower: fix MissionIndex=0 bypassing normalization guard
UserBinary type1-5 and EnhancedMinidata are transient session state
resent by the client on every login. Persisting them to the DB on
every set was unnecessary I/O. Both are now served exclusively from
server-scoped in-memory maps (userBinaryParts, minidataParts).
Includes a schema migration to drop the now-unused type2/type3
columns from user_binary and minidata column from characters.
Ref #158
- Reject BinaryType outside 1-5 in SetUserBinary to prevent
dynamic column name with unchecked client input
- Check rengoku payload length before DB write and fixed-offset
reads to prevent panic on short payloads
- Require MercData >= 4 bytes before ReadUint32 to prevent panic
Ref: Mezeporta/Erupe#158
GetStageBinary and WaitStageBinary silently dropped the ACK when
the requested stage did not exist, leaving the client waiting
indefinitely. Additionally, BinaryType1 == 4 and unknown binary
types returned a completely empty response (zero bytes), which
earlier clients cannot parse as a counted structure.
Return a 4-byte zero response (empty entry count) in all fallback
paths so the client always receives a valid ACK it can parse.
Verified via Wii U decompilation of putUpdate_house: the field is set
to 0 when no password is provided, and 1 when a password string is
present. The previous comment "Always 0x01" was inaccurate.
Add package-level documentation (doc.go) to all 22 first-party
packages and godoc comments to ~150 previously undocumented
exported symbols across common/, network/, and server/.
Ghidra decompilation of hf_gp_main in the Wii U binary revealed that
these three fields are reward eligibility thresholds checked against
the player's Hunter Rank, max Skill Rank, and G Rank respectively.
The extra reward fields (Unk5, Unk6, Unk7) in the InfoFesta response
were gated at >= G1, but G1 clients do not expect these 5 extra bytes
per reward entry. This caused the entire packet after the rewards
section to be misaligned, corrupting MaximumFP, leaderboards, and
bonus rates — which broke the festa UI including trial voting.
Wii U disassembly of import_festa_info (0x02C470EC, 1068 bytes)
confirms G3-Z2 reads these fields. G1 binary analysis shows only
8 festa packets (vs 12 in ZZ), and the intermediate/personal prize
systems were not added until G5.2/G7 respectively.
Move SavePointer type/constants, CharacterSaveData struct, getPointers,
Compress, Decompress, and save data serialization methods out of
handlers_character.go into a dedicated model file.
Split three large files into focused modules:
- handlers_guild.go: extract types/ORM into guild_model.go
- handlers_cast_binary.go: extract command parser into handlers_commands.go
- handlers.go: move seibattle types/handlers into handlers_seibattle.go
Separate the two distinct systems into focused files:
- handlers_shop.go: item shops, exchange shops, frontier point trading
- handlers_gacha.go: normal/stepup/box/free gacha, coin management
The client's Sky Corridor area transition handler saves rengoku data
before the load response is parsed into the character data area,
producing saves with zeroed skill fields but preserved point totals.
Detect this pattern server-side and merge existing skill data from the
database. Also reject sentinel-zero saves when valid data already exists.
- Guard nil listener/acceptConns in Server.Shutdown() to prevent panic
in test servers that don't bind a network listener
- Remove redundant userBinaryPartsLock in TestHandleMsgMhfLoaddata that
caused a deadlock with handleMsgMhfLoaddata's own lock acquisition
- Increase test save blob size from 200 to 150000 bytes to accommodate
ZZ save pointer offsets (up to 146728)
- Initialize MHFEquipment.Sigils[].Effects slices in test data to
prevent index-out-of-range panic in SerializeWarehouseEquipment
- Insert warehouse row before updating it (UPDATE on 0 rows is not an
error, so the INSERT fallback never triggered)
- Use COALESCE for nullable kouryou_point column in kill counter test
- Fix duplicate-add test expectation (CSV helper correctly deduplicates)
Anchor the token regex to ^[A-Za-z0-9]+$ so partial matches on
traversal strings like "../../etc/passwd" are rejected. Refactor
the handler to use early returns so execution stops immediately
on validation failure instead of falling through to os.Create
with tainted input.
Use a `deleted` boolean column (matching characters and mail tables)
instead of permanently removing guild_posts rows. This makes excess
post purging and manual deletion reversible.
Add explicit error discards (_ =) for Close() calls on network
connections, SQL rows, and file handles across 28 files. Also add
.golangci.yml with standard linter defaults to match CI configuration.
golangci-lint v2 removed the --out-format CLI flag, causing the lint
job to fail. The golangci-lint-action v7 already uses problem matchers
to surface issues natively in GitHub Actions.
138 bare db.Exec calls across 22 handler files silently dropped write
errors. Each is now wrapped with error check and zap logging.
4 QueryRow sites that legitimately return sql.ErrNoRows during normal
operation (new player mezfes, festa rankings, empty guild item box)
now filter it out to reduce log noise.