Esports Have A Focus Problem

Lemme start at the beginning: Stop making esports for gaming fans.

I mean that.

Yes, I know all the arguments, and yes I know it feels like the right decision, but it’s the kind of “easy” decision that has catastrophic consequences. It feels right in the moment, and may even lead to a quick spark, but leads to an unrecoverable and inevitable death spiral.

Why? Fuck am I glad you asked. Let’s get down and dirty. But first: a flashback to the old days of sports.

Big Sports know how to convey complex interactions simply.

Media is often a discovery process in terms of identifying what makes the thing unique and worth paying attention to. TV was, for a long time, filming radio. Literally just an additional visual component to the exact same radio program.

And even after it moved beyond that, it took another 50 years for TV to realize something very simple and kick off the golden age of TV: lean into the inherent strengths of the medium.

TV didn't start competing with film as an artistic medium until someone realized "hey, we can spend 100 hours with these characters, following their journeys and growth, and that's a differentiating characteristic that we can take advantage of." Seems obvious, right? But 30 years ago the mantra was that film was flashy, but TV was the cheap and repeatable revenue stream that kept the doors open. Leaning into the strengths of the medium is what let TV reach its golden era (today).

Sports started the same way. Sports are hard to convey on a flat screen. Much of the action takes place on a mostly flat plane, but it still took a long time to discover the visual and media language required to convey the totality of the experience. We needed time to realize that the medium benefitted from isometric viewing angles. One of the big breakthroughs was for commentary to understand what needed to be conveyed to the audience: for a long time, sports commentary was just the radio commentary (full descriptive conveyance of everything going on) broadcast over a video image - echoing early TV mirroring radio programs.

Sixty years later, the NFL has realized that the value of their commentary wasn’t radio-esque description, but more in-depth analysis. And that analysis needed to be aimed not at regular viewers, but newcomers. Regular viewers knew the game, but newcomers needed to be told - on a play by play basis - what was going on. And why. And what they should look for next.

The goal was to turn awareness into fandom (and merch sales, tickets, etc.) but the path led by necessity through understanding. So essentially, sports devised an onboarding path to get someone from awareness to interest, to understanding, to passion.

So you got information between plays about intent, what the players were doing and trying to do, replays with big red circles on the screen to make sure you knew where to look, and more. All in order to make you a better and more educated participant of the game.

NFL commentary is a tutorial level for watching football. For many of the same reasons.

It made the TV commentating experience additive and unique. No longer was the audio entirely redundant to the video. Other sports quickly did the same, and in the 80s and 90s it became the standard. Former players were hired to give their “expert opinions” and lend more credence to the analysis, and broadcasters got into hiring battles over which former Super Bowl winning quarterback they could shovel Super Bowl winning money towards in exchange for a couple soundbites on Sunday.

UFC came along and added one more key element: they opened every single broadcast by *explaining the sport to newcomers every time.* Whereas football’s rules are often obtuse and need explaining in the moment, and Ted Lasso was able to make a running joke out of the convoluted If->Then statements that are soccer’s offside rules, UFC created something so simple it became a meme: here are the rules of the ring. Here is 99% of everything you need to know in four bullet points. It made it all seem so easy to understand. Then once you were watching, the commentary was almost entirely devoted to discussing what the fighters were trying to accomplish and why. Suddenly you could track the success or failure state of a hundred small interactions through that intention -> success/failure -> repercussions tree, in real time, for every fight.

With help, granted. But that help meant you learned the sport by watching.

Focus, people.

Half of understanding a sport is just knowing what to watch.

Many traditional sports are helped in this audience understanding pathway by the reality that there is a singular central point of focus for the game.

  • In football, you can just follow the ball. All meaningful events to the casual viewer happen around the ball. (Yes, some things happen off of the ball, but they all relate back to the position of the ball.)

  • The same is true of baseball, hockey, soccer, etc.

  • With MMA and Boxing, the point where the fighters come together is the central point of focus, and allows a fundamentally simple experience for casual viewers.

  • With basketball, all 10 players on the court revolve around that ball - get it in the hole and you get points. This team is ball-holing better than the other, so they’re winning.

Even in sports with separated but simultaneous competition (Golf, gymnastics, etc.) there are focal points for the audience:

  • Golf has a lot of players each playing a single player game. But still: there’s the ball. It goes in the hole. Do that better than anyone else, and you win. Over here is the “who’s balling into holes better than anyone else” board.

  • Automotive racing has a hundred small competitions, but in the end the focus is who crosses the finish line first.

  • In gymnastics everyone is competing in isolation, but with a singular point of focus to compare: each gymnast’s routine is given a score. At a glance, you can know who’s winning and generally can discern why, even if you don’t know the names of every technique. (Very few of us have competed in any Olympic sports, and yet we all seem to be experts come Olympic season…)

The other aspect of knowing what to watch is the surface area in which this takes place. In most sports, you can show the relevant space (a field or a pitch or a hole or a court) either in its entirety, or the entire relevant portion of that space. This becomes important to visually communicate success and failure on an individual level easily, because that singular point of focus in a known space has distinct success or failure states: in the hole or out; in the water or on the green, standing on feet or fallen on butt.

The nuance of strategy comes through the commentary, but a singular core focal point of the game means viewers can understand the game more easily.

Esports are bad sports.

So mainstream sports are communicating complex actions and interactions simply, and giving viewers an easily understood central focus point for the players and viewers alike. Then why do virtually all esports do both of these fundamental things so fundamentally badly?

Esports almost never allow an audience to learn the game from watching, and rarely present a single focal point for the casual viewer to understand the core stakes. What they show casual viewers is a visual cacophony of unintelligible symbols, where viewers don’t - and can’t - understand what the players are doing or why, and whether or not they’re being successful at their goals.

Having been designed with the assumption that esports audiences come from audiences of the game, the presentation also assumes a base level of understanding of the game, the mechanics, and the goals. So with this assumption in place, the presentation leans into conveying complexity instead of simplicity. That single choice - to make the audience for the sport a subset of the audience for the game - precludes any casual fan’s attention, because the barriers to understanding begin with “know how the game is played.”

It's made even tougher by the fact that the viewing angles and viewpoints of the game itself often lean into these same assumptions. We can take this all the way back to game development decisions, which often produce levels that require a certain amount of experience in order to memorize and master the nuances of a map. But this also makes it impossible to show a singular angle that captures almost all relevant action, and therefore makes it easy to miss or misunderstand the larger metanarrative and gameplay.

In soccer, and basketball, and baseball, and football, and tennis, etc. there is an isometric-esque "all seeing" angle that lets you understand where virtually all of the players are, what the players are doing, what they *wanted to do*, and why they either succeeded or failed in their attempts - in Overwatch there’s not. Overwatch has indoor and outdoor corridors, abundant verticality (sometimes with multiple floors running above and below each other), multiple attack vectors, etc. All of which make it a compelling game to play, because it adds complexity via map understanding and strategy - but that complexity makes it impossible to see all the action at once, or to understand the stakes of any given exchange.

Or, if we’re inserting ourselves into the mind of a new viewer, it’s impossible to know why - in a game with six players to a side - we’re cutting the camera to someone on top of a building staring down an alley, or two people in a corridor…somewhere on the map…to accomplish…something.

The visual language of the game itself is not conducive to a casual viewer’s understanding of the game.

Gamers love problems, and this is a problem.

As gamers, we often mistake “understanding something complex” for “skill.” We do this because we’re taught that it’s the same thing by the games we play. Games often add layers of complexity through overlapping game mechanics that force the player to get better or more knowledgeable at obscure or obtuse systems interactions in order to progress in overall skill or get closer to a win state. The additional complexity of mastering more mechanics is satisfying to us, so as games layer on additional gameplay mechanics, it creates a player mastery feedback loop.

If understanding how skill tree overlap or weapon choice can provide an advantage in certain scenarios allows us to eke out some percentage of more damage, we feel like we solved a puzzle and are being rewarded. If we know when to launch an Ult for maximum carnage/advantage/defense and do so successfully, we feel like we are a more skilled player than a player who misses that opportunity. The more complex or unknown or opaque the mechanic we unleash, the more it feels like we’ve discovered and taken advantage of a secret loophole in the rules.

The unintended consequence of this game development decision is that when taken out of gameplay and into a passive viewership environment, those additional mechanics need to be conveyed to the audience. But as we mentioned above, even if those mechanics are ever conveyed visually (and oftentimes they are not - discernment of effectiveness is something done behind the scenes and the players only see the results in-action if they are already aware of those effects) they require additional layers of understanding which also need to be communicated to the audience. And those additional gameplay complexities from additional (potentially hidden) mechanics remove the ability to communicate that gameplay simply.

The easy (and maybe unfair) comparison is that casual viewers don’t understand the mechanisms of an Overwatch or LoL ult, or that bunnyhopping allows for faster movement and conservation of momentum in CS:GO, but they can still see if that guy threw a ball into the hole or not, and whether he got points for it.

In short: by making games deeper with additional mechanics, you’ve increased the skill ceiling but also raised the floor of casual understanding (which is a term I just made up) and created a barrier to entry (which is a term I did not just make up.) Since these mechanics cannot be conveyed through visual interface alone, the audience is - once again - unable to learn the game simply by watching.

For want of a gamer audience, games have sacrificed casual viewing audiences, and the result is that their audiences will never move beyond those who already play the game. There is no way to learn the game by watching, so it's frustrating to newcomers and ultimately limiting to the potential audience. And inevitably, as the shininess of the game fades but the familiarity requirements remain, the available pool for interested audiences shrinks further.

You’ve built a death spiral, by design. Because esports are built for a gamer audience, and not a casual audience.

Fine. So now what.

Good news: this is a solvable problem, it just takes people willing to experiment and solve it. And look, some games are actually pretty good at parts of this.

Rocket League is compelling viewing because everything revolves around a big ball and whether or not it goes into a goal. The field is a familiar layout, and you can see much of it at any given moment.

StarCraft allows viewers to see the field and therefore understand the stakes, even if they don’t understand the gameplay:

“Honey, what’s going on?”

“He’s trying to get his troops to take over the other guy’s base.”

“Oh, he’s got a lot more guys and they’re getting close. He must be winning.”

“Actually, yeah.”

Most fighting games are pretty good too - hit the other player and their life goes down. Additional layers of gameplay complication (feints, cancels, super moves, etc.) mean it’s sometimes hard to know why things happened or didn’t happen, but more often than not you can follow it without knowing everything. See the continued popularity of EVO (and their very smart decision to focus on a single genre) for evidence of tournament popularity beyond core players.

A lot of this is because the player’s visual perspective of the game is either similar or identical to the viewer’s perspective of the game. Which eliminates that data and information gap from a visual perspective, despite there being some remaining data gaps from a core competency perspective.

By seeing the entirety of the game field, and reducing that information gap, viewers have a much better likelihood of understanding a player’s intention -> success/failure -> repercussions, and therefore learning over time.

Ze End!

Summary: esports fail at capturing an audience outside of that game because they don’t consider casual viewers in the construction of the game itself. If you want a single paragraph takeaway, it’s this:

There isn’t a singular point of focus for the audience to pay attention to, you can’t see the entire playing field at once to understand what the teams are doing, levels aren’t designed with broadcast cameras in mind, viewers can’t see intention and success/failure in that intention, and rule sets are overcomplicated and impossible to communicate visually. So casual viewers cannot learn the game by watching the game be played, they get confused, and they leave. Thus, esports find themselves limited to a perpetually shrinking set of existing fans of the game, and new fans must be educated about the game before they can become interested in casual viewing, instead of the other way around.

But someone will realize this.

And they’ll make a game that’s easy to visually understand but hard to master.

With a rule set which can be laid out ahead of time.

Which considers viewers as equal to players.

And which provides a way for commentators to convey more nuanced information about tactics, strategies, intentions, failures, and victories.

They’ll create something which assumes zero audience knowledge ahead of time, but easily perceivable skill in the moments of action.

And they’ll have some of the most valuable media in the current marketplace: A new sport which appeals to all demos, worldwide, in a brandable and fan-centric manner, with an easy onramp to enjoyment alongside deep and nuanced gameplay mechanics.

And that’s how they’ll build multi-billion dollar esports team franchises.
