Detecting bots in an MMORPG: server-side weight accumulation

Bots aren't caught with a binary rule. Here's how I built a weight-accumulation detection service on Dofus Touch, and why this approach avoids false positives.

On Dofus Touch I built a server-side bot detection service. When you tackle this problem, the temptation is to code hard rules: “if the player clicks exactly every 800 ms for two hours, it’s a bot.” That works for five minutes, until the bot adds a bit of jitter and becomes invisible again. And during those five minutes, you’re also banning real players who happen to have a regular pattern.

The approach that holds is weight accumulation. Here’s the idea.

Principle

Every player action contributes to one or more weighted counters. No signal taken in isolation triggers a ban. But when several signals converge over time, the global score crosses a threshold and the account is flagged.

It’s the low-tech equivalent of feature engineering for a scoring model: you encode domain knowledge as weights, observe the distribution, adjust.

Which signals

In an MMORPG, useful signals are never “a single metric.” They’re a bundle:

  • Click timing regularity (variance too low = suspect)
  • Movement patterns (paths always identical to the pixel)
  • Inventory sequences (drag-and-drop with perfect timing)
  • Repeated combat cycles (same spell sequence, same target, on loop)
  • 24/7 connection without human-style disconnects (real players go to the bathroom)
  • Optimal farming routes (a human almost always takes a detour)

Each signal has a weight. Some are red flags that weigh heavily (perfect millisecond timing across 1000 actions); others are yellow flags that weigh little on their own (no break longer than 30 minutes over 12 hours).
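As an illustration of one such signal, here is a minimal sketch of a click-timing detector. It measures regularity with the coefficient of variation (standard deviation divided by mean) of the intervals between clicks. The class name, the thresholds (0.02 and 0.10), the weights (5.0 and 1.0) and the minimum sample size are all assumptions picked for the example, not values from the actual service.

```java
import java.util.List;

/** Hypothetical detector: scores how regular a player's click intervals are. */
final class ClickTimingDetector {
    // Illustrative thresholds and weights; real ones would come from observed data.
    private static final double RED_CV = 0.02;     // near-perfect regularity
    private static final double YELLOW_CV = 0.10;  // suspiciously regular
    private static final double RED_WEIGHT = 5.0;
    private static final double YELLOW_WEIGHT = 1.0;
    private static final int MIN_SAMPLES = 20;     // too few intervals prove nothing

    /** Coefficient of variation (stddev / mean) of the click intervals. */
    static double coefficientOfVariation(List<Long> intervalsMs) {
        double mean = intervalsMs.stream().mapToLong(Long::longValue).average().orElse(0.0);
        if (mean == 0.0) {
            return 0.0; // degenerate case: empty list or all intervals are zero
        }
        double variance = intervalsMs.stream()
                .mapToDouble(i -> (i - mean) * (i - mean))
                .average()
                .orElse(0.0);
        return Math.sqrt(variance) / mean;
    }

    /** Returns the weight to add to the player's score; 0 means nothing suspicious. */
    static double evaluate(List<Long> intervalsMs) {
        if (intervalsMs.size() < MIN_SAMPLES) {
            return 0.0;
        }
        double cv = coefficientOfVariation(intervalsMs);
        if (cv < RED_CV) {
            return RED_WEIGHT;
        }
        if (cv < YELLOW_CV) {
            return YELLOW_WEIGHT;
        }
        return 0.0;
    }
}
```

With these example values, a bot clicking exactly every 800 ms gets the red weight, and a bot alternating between 780 ms and 820 ms still picks up the yellow one: jitter lowers the weight without hiding the signal.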

Architecture

On the server side, the implementation is deliberately simple:

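// Java sketch: PlayerId, PlayerEvent, ScoreState and Detector are application types; flagForReview is omitted
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BotScoreAccumulator {
    // Placeholder value: tune it against the observed score distribution
    private static final double FLAG_THRESHOLD = 100.0;

    private final Map<PlayerId, ScoreState> scores = new ConcurrentHashMap<>();
    private final List<Detector> detectors;

    public BotScoreAccumulator(List<Detector> detectors) {
        this.detectors = List.copyOf(detectors);
    }

    public void onEvent(PlayerEvent event) {
        // One ScoreState per player, created on the first event (its constructor takes the PlayerId)
        ScoreState state = scores.computeIfAbsent(event.playerId(), ScoreState::new);
        for (Detector detector : detectors) {
            // Each detector is assumed to return 0 for a normal event, a positive weight otherwise
            double weight = detector.evaluate(event, state.history());
            state.add(detector.id(), weight);
        }
        if (state.totalScore() > FLAG_THRESHOLD) {
            flagForReview(event.playerId(), state.snapshot());
        }
    }
}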

A few structural choices:

  • Scoring is in memory, persisted periodically. We need to evaluate in real time without a DB round-trip per event.
  • Time decay: weights erode over time. Suspect behaviour from three days ago weighs less than the same behaviour from three minutes ago. Without decay, scores accumulate forever and everyone ends up suspect.
  • Snapshot on flag: when the threshold is hit, the full state is saved (which detectors fired, with what weights, on what time window) so the decision can be audited later.
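The time-decay point can be sketched as an exponentially decaying score. This is one possible formulation, assuming a half-life model; the class name `DecayingScore` and the one-day half-life used in the note below are hypothetical, and a real service may well decay weights differently (per detector, or with a sliding window).

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical score that halves for every `halfLife` of elapsed time. */
final class DecayingScore {
    private final Duration halfLife;
    private double value = 0.0;
    private Instant lastUpdate;

    DecayingScore(Duration halfLife, Instant start) {
        if (halfLife.isZero() || halfLife.isNegative()) {
            throw new IllegalArgumentException("halfLife must be positive");
        }
        this.halfLife = halfLife;
        this.lastUpdate = start;
    }

    /** Decays the stored value up to `now`, then adds the new weight. */
    void add(double weight, Instant now) {
        value = valueAt(now) + weight;
        if (now.isAfter(lastUpdate)) {
            lastUpdate = now;
        }
    }

    /** Score as seen at `now`: value * 0.5^(elapsed / halfLife). */
    double valueAt(Instant now) {
        double elapsedMs = Duration.between(lastUpdate, now).toMillis();
        if (elapsedMs <= 0) {
            return value; // out-of-order event or clock skew: no decay applied
        }
        return value * Math.pow(0.5, elapsedMs / halfLife.toMillis());
    }
}
```

With a one-day half-life, a weight of 8 added now counts for 4 after one day and for 1 after three days. Only the last value and its timestamp are stored, so decay costs nothing between events.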

Why not “real” ML

I considered a pure ML approach (random forest on extracted features, or a sequence model). For this specific context, I chose not to. Here’s why.

  • Auditability: when a player contests a ban, I can show “here are the detectors that fired, with their exact weights.” With a black-box classifier, that’s much harder.
  • Fast iteration: adding a new signal means adding a detector, a few lines of code. With an ML model, you need to retrain, validate, redeploy.
  • Cost: zero GPU infra, zero MLOps. A standard Java service.
  • Cat-and-mouse game: bots evolve. Heuristic detectors adjust within days; an ML model demands a full cycle.

ML would have been the right call if I’d needed to classify complex behaviours zero-shot on new bot types. For farming-bot detection on a well-known MMORPG, weighted heuristics iterate faster and are more operationally defensible.
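To make the auditability argument concrete, here is a minimal sketch of what a flag snapshot could look like and how it turns into an explanation a GM can read. The record `FlagSnapshot`, its fields and the detector names used in the note below are hypothetical; the source only says that the snapshot stores which detectors fired, with what weights, over what time window.

```java
import java.time.Instant;
import java.util.Map;

/** Hypothetical immutable snapshot taken when an account is flagged. */
record FlagSnapshot(String playerId, Instant windowStart, Instant windowEnd,
                    Map<String, Double> weightByDetector) {

    /** Sum of all detector weights in the snapshot. */
    double totalScore() {
        return weightByDetector.values().stream().mapToDouble(Double::doubleValue).sum();
    }

    /** Human-readable explanation, one line per detector that contributed. */
    String explain() {
        StringBuilder sb = new StringBuilder();
        sb.append("Player ").append(playerId)
          .append(" flagged, window ").append(windowStart)
          .append(" to ").append(windowEnd).append('\n');
        weightByDetector.forEach((detector, weight) -> {
            if (weight > 0) {
                sb.append("  ").append(detector).append(": ").append(weight).append('\n');
            }
        });
        sb.append("  total: ").append(totalScore()).append('\n');
        return sb.toString();
    }
}
```

A snapshot built with `Map.of("click-timing", 5.0, "no-breaks", 1.5)` yields one line per detector plus a total of 6.5, which is the kind of answer you can hand over when a ban is contested.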

False positives: the real judge

The real test of an anti-bot system isn’t “how many bots did it ban.” It’s “how many real players did it ban by mistake.” A single wrongful ban on a popular player and the community manager loses their weekend.

Two safeguards:

  1. A “review” stage between flag and ban. Hitting the flag threshold triggers an investigation (by a GM or a third-party service), not an automatic ban. The auto-ban threshold is much higher.
  2. Feedback loop: every manual ban or ban reversal feeds back into the analysis to adjust weights.
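The first safeguard can be sketched as a two-threshold policy. This is a minimal illustration: the class `EscalationPolicy`, the enum `Action` and the threshold values used in the note below are assumptions, not the service's actual names or numbers.

```java
/** Hypothetical two-threshold policy: review first, auto-ban only far above that. */
final class EscalationPolicy {
    enum Action { NONE, REVIEW, AUTO_BAN }

    private final double reviewThreshold;
    private final double autoBanThreshold;

    EscalationPolicy(double reviewThreshold, double autoBanThreshold) {
        // The auto-ban bar must sit strictly above the review bar.
        if (autoBanThreshold <= reviewThreshold) {
            throw new IllegalArgumentException("auto-ban threshold must exceed review threshold");
        }
        this.reviewThreshold = reviewThreshold;
        this.autoBanThreshold = autoBanThreshold;
    }

    /** Maps an accumulated score to the action to take. */
    Action decide(double score) {
        if (score > autoBanThreshold) {
            return Action.AUTO_BAN;
        }
        if (score > reviewThreshold) {
            return Action.REVIEW;
        }
        return Action.NONE;
    }
}
```

With `new EscalationPolicy(100, 400)`, a score of 150 only opens a review, and it takes a score above 400 to ban without a human in the loop.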

What I take away

  • Weighted heuristics beat binary rules for this kind of problem.
  • Time-decayed accumulation keeps scores from drifting upward until everyone looks suspect.
  • Auditability is a feature, not a nice-to-have. When a player contests, you have to be able to explain.
  • ML would come afterwards, on the cases the heuristics don’t catch.

On Dofus Touch, this system reduced the number of non-human accounts. Not through a great algorithmic revolution: just disciplined domain modelling and rigour about false positives.