This article is intended to be applicable for all TCGs and other games that let you make some sort of deck or character choice. For games like Smash, replace the word "deck" with "character" and all of the below applies equally.
With the recent, subtle surge of theory-crafting taking place in the Yugioh community (which is wonderful), I think we need to talk about what the word "matchup" means in certain contexts. I've seen people get into arguments that could perhaps have been easily avoided if they had simply understood which definition of "matchup" each of them was using. There is a pretty general consensus definition, but the problem is that this definition is way too general:
Deck A's matchup against deck B is said to be: The probability that deck A beats deck B in a match. Example: Deck A beats deck B 60% of the time. Deck A has a 60% matchup against deck B, and deck B has a 40% matchup against deck A.
The problem here is that when we say "deck A beats deck B" it is not clear what exactly the context is. Typically we come to these conclusions through some sort of observation, and the nature of these observations can alter the definition and implications of the word matchup significantly.
Deck A's theoretical matchup against deck B is said to be: The probability that deck A beats deck B in a match when both decks are piloted by highly competent players who are equally skilled and make no blunders. Example: I take a group of the best players in the world and have them test deck A against deck B over a large sample. They choose the statistically correct plays every time and make no obvious mistakes. After testing a sufficiently large sample of matches this way, deck A wins 60% of the time. Deck A's theoretical matchup against deck B is 60%. Deck B's theoretical matchup against deck A is 40%.
Unsurprisingly enough, theoretical matchups really do exist only in theory. The goal of this definition is to control for other factors such as player skill, fatigue, etc. The only factor influencing the outcome of the match is how well the cards in deck A play against the cards in deck B. The goal of this definition is often to determine the best deck from a game theory standpoint. If your goal of testing is to approximate the theoretical matchup, you should always let your opponent take back any mistakes. If a deck has a theoretical matchup that is 50% or greater against all other possible decks, then it is said to be game theory optimal (GTO). Sometimes the term "unexploitable" is used instead. However, in reality, even if you can prove that such a GTO deck exists and can find it, it does not necessarily mean that it is the correct decision for you to take to a tournament. These types of discrepancies will be a huge focus of this article.
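As a minimal sketch of the GTO condition, the check below scans a matchup table for decks that are 50% or better against every other listed deck. The deck names and percentages are made up for illustration, and the check only covers the decks in the table, not "all other possible decks."

```python
from typing import Dict, List

# Hypothetical theoretical matchup table: matchups[a][b] is the probability
# that deck a beats deck b. All numbers are invented for illustration.
matchups: Dict[str, Dict[str, float]] = {
    "A": {"A": 0.50, "B": 0.60, "C": 0.55},
    "B": {"A": 0.40, "B": 0.50, "C": 0.70},
    "C": {"A": 0.45, "B": 0.30, "C": 0.50},
}


def gto_decks(table: Dict[str, Dict[str, float]]) -> List[str]:
    """Return every deck that is at least 50% against all other listed decks."""
    return [
        deck
        for deck, row in table.items()
        if all(p >= 0.5 for opponent, p in row.items() if opponent != deck)
    ]


print(gto_decks(matchups))  # prints ['A']
```

In this made-up table only deck A qualifies. Deck B crushes deck C, but it loses to deck A, so it can be exploited.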
Deck A's metagame matchup against deck B is said to be: The fraction of the time that deck A beats deck B over a large sample of matches taken from tournament data. Example: At a tournament with thousands of players, deck A is observed to beat deck B 60 times out of 100 total matches played. Deck A's metagame matchup against deck B is 60%.
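The calculation in the example can be sketched as a simple tally over match records. The record format here (one `(winner_deck, loser_deck)` pair per match) is an assumption for illustration; real tournament data will need its own parsing.

```python
from collections import Counter
from typing import Dict, List, Tuple


def metagame_matchups(results: List[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Map (deck_a, deck_b) to the fraction of their matches that deck_a won.

    results holds one (winner_deck, loser_deck) pair per match.
    Mirror matches are skipped because they say nothing about the matchup.
    """
    wins: Counter = Counter()
    for winner, loser in results:
        if winner != loser:
            wins[(winner, loser)] += 1

    rates: Dict[Tuple[str, str], float] = {}
    for a, b in list(wins):
        total = wins[(a, b)] + wins[(b, a)]
        rates[(a, b)] = wins[(a, b)] / total
        rates[(b, a)] = wins[(b, a)] / total
    return rates


# The example from the text: deck A beats deck B 60 times out of 100 matches.
records = [("A", "B")] * 60 + [("B", "A")] * 40
rates = metagame_matchups(records)
print(rates[("A", "B")], rates[("B", "A")])  # prints 0.6 0.4
```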
Now why does this distinction matter? It matters because there is no guarantee that theoretical matchups and metagame matchups will be the same, and in rare cases they will be quite drastically different. One of the best examples that I can actually think of is Gishki from the time period of fall 2012 to early spring 2013. From a theoretical standpoint, I think that the deck was close to GTO (if not actually GTO). It was simply very difficult to construct a deck that could beat Gishki more than 50% of the time if both players played correctly. The issue of course is that it was almost impossibly difficult to play Gishki correctly, leading to its pilots losing matches that could have been won and the deck performing quite poorly on the metagame level. Shortly after bringing the deck to the spotlight, I went to YCS Miami in January 2013 to vend. Many people came up to me and told me that they were playing Gishki. None of them day2'd, which is not that shocking.
Something common that can cause huge discrepancies is what I call skill bias (more generally in statistics, this is really just sample bias): The idea that whether or not a player is skilled can affect their deck selection. It may be the case that deck A has an 80% theoretical matchup against deck B, but if all of the worst players at the tournament play deck A, and all of the best players at the tournament play deck B, deck A's metagame matchup against deck B could reverse to 20%. It depends of course on how large the skill gap is and how large of a role skill plays (as opposed to deck selection) in determining the match result.
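To make the reversal concrete, here is a minimal sketch that assumes a log-odds (Elo-style) model in which deck advantage and pilot skill simply add together. The model and the `skill_edge` parameter are illustrative assumptions, not something measured from real data.

```python
import math


def logit(p: float) -> float:
    """Convert a probability into log-odds."""
    return math.log(p / (1 - p))


def win_probability(theoretical: float, skill_edge: float) -> float:
    """Deck A's chance to win under the assumed additive log-odds model.

    theoretical: deck A's theoretical matchup against deck B (0 to 1).
    skill_edge:  skill gap in log-odds units; positive favors deck A's pilot,
                 negative favors deck B's pilot. Hypothetical parameter.
    """
    return 1 / (1 + math.exp(-(logit(theoretical) + skill_edge)))


equal_skill = win_probability(0.8, 0.0)
# A skill gap twice the size of the deck advantage, in deck B's favor.
outclassed = win_probability(0.8, -2 * logit(0.8))
print(round(equal_skill, 2), round(outclassed, 2))  # prints 0.8 0.2
```

Under this assumption the 80% matchup flips to 20% only when the skill gap is worth twice as much as the deck advantage. A smaller gap pulls the result toward 50% without reversing it.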
I've observed skill bias many times over my TCG career, first when I played UB Faeries back in Lorwyn/Shards standard MTG. UB Faeries were widely considered to be one of the best decks from a theory standpoint, and the deck also happened to fit my playstyle perfectly, so of course I played it throughout most of the format. However, it had a miserable theoretical matchup versus Mono Red (probably around 25%). Basically, the only matches that you could possibly win were ones in which the opponent had very unfortunate mulliganing. Your best cards against every other deck, Thoughtseize and Bitterblossom, were absolutely terrible against Mono Red, and there was very little that you could do to improve the matchup postboard. However, at my FNM, I noticed that the actual match results were a lot closer to 50/50. The reason for this is that UB Faeries happened to be played by the most skilled players at our FNM, and Mono Red happened to be played by the least skilled players at our FNM. The Faerie players had a better understanding of which in-game resources mattered when, which hands to mulligan vs keep, and which cards mattered in the matchup most. This allowed them to greatly outperform their theoretical matchup in this setting.
One of the biggest factors in determining skill bias has to do with deck cost. While good players are not necessarily drawn towards expensive decks, bad players are drawn towards cheaper decks. In general, I'd say that good players are far more price inelastic than bad players. Some of the best players may be sponsored and not even have to pay for any cards at all, making their preferences perfectly inelastic. I would not say this is a causal relationship. Playing a cheap deck certainly doesn't make you bad, and there are definitely some bad players who play expensive decks. But bad players often find it too hard to justify the greater investment, whether it is due to lack of confidence in their abilities, viewing themselves as casuals, or simply wanting to prove that they can win on a budget.
Price however is not the only factor in skill bias. Deck style may have a strong influence as well. In MTG, I have noticed that tempo and aggro-control decks (like Faeries, Delver, Merfolk) are disproportionately favored by stronger players, and low curve aggro decks (like white weenie and mono red) are disproportionately favored by weaker players. It changes from time to time and also can vary based on deck specifics. I noticed that very simple combo decks like Demise OTK were almost always played by weaker players, while very complex combo decks like Elves were played by stronger players. Good players often play the game for a mental challenge, and thus may play complex but otherwise suboptimal decks (I've even fallen into this trap myself on more than one occasion). Bad players often seek to play decks that will allow them to cheese out a win by playing a lot of 1-drops and quickly overwhelming the opponent or by assembling an easy combo where you just show your opponent 3 cards and it's game over.
Social circles may also influence deck choice to a large extent, although in some games this is more present than others. For example, I find this factor to be far more present in Yugioh, where good players would never want to be caught playing a "helmet deck" and bad players would never want to be caught playing a "meta deck," than in Magic. This form of social segregation means that when present, skill bias can be very difficult to eliminate.
Understanding how skill bias can affect deck selection and hence metagame matchups can help you spot decks that are underrated. If a deck is only played by bad players, and it gets average tournament results, that means that if the deck were instead played by good players, it would get above average tournament results. It might actually be completely broken. It is not necessarily the case that a deck that has bad tournament data is also bad from a theoretical matchup standpoint, but about 9 times out of 10 it is. However, the one time that you discover the hidden potential in something under the radar makes up for every time that you tested a deck with bad results and concluded that it was in fact bad. Test the decks that everyone else in your circle of good players sees no point in testing.
The metagame matchups are important for actually trying to predict tournament brackets. For example, in my Average Prize Model, if we are trying to simulate which decks will in reality win the most money at some upcoming tournament, then what we need to enter is what we believe are the metagame matchups, not the theoretical matchups. We don't care if deck A should beat deck B in theory if deck B has crushed deck A at all of the recent tournaments. In other words, past tournament data is the best predictor of future tournament data. A problem that professional-level players can succumb to is figuring out the theoretical matchups and then assuming that their tournament experience will match that. Just because you figured out that deck A is the best deck in theory does not mean that the top of the brackets will actually be filled by deck A. I have made this mistake myself. In fall 2011, after doing much testing to approximate the theoretical matchups, it was blatantly obvious to me that Dark World was close to GTO. The meta was Plants, Agents, Dark World (and later Rabbit), and a well-constructed Dark World deck was theoretically 50% or better against all of the rest of the decks in the meta. This led me to predict some absurd outcome like Dark World filling half of the top cut slots, as we'd expect if the competitive playerbase were rational and willing to play any deck. However, the good players at the time were disproportionately drawn to Plants due to Billy Brake, enjoyable mirrors, and social factors, and were mostly unwilling to consider changing decks. The bad players at the time were disproportionately drawn to Dark World, because it could be easily constructed via the purchase of structure decks. Of course Dark World got poor results on the tournament circuit and Plants did quite well.
The unwillingness to consider skill bias also led me to make some suboptimal deckbuilding decisions, such as playing Dark World Lightning over Dark World Dealings, believing that mirror matches would be very common later in the brackets, where Dark World Lightning was the far superior card. Against Plants however, Dark World Dealings was a fair bit better. Regardless, I did enjoy a very high win rate with Dark World at the time, but my lack of knowledge of skill bias caused me to make some errors in my bracket predictions.
Last but not least, there's the matchup that really matters, and that is your personal matchup. It may be the case that deck A beats deck B 60% of the time in theory, and deck A also beats deck B 60% in the metagame, but that doesn't mean that you can't win 80% of the time with deck A anyways. You are neither a statistic nor a computer. As a pilot, you may be far more or far less skilled than others who play your deck, which can influence the results that you get quite a bit. This is also why the optimal choice for you at some event could be completely different from the optimal choice for someone else at the same event.
Your personal matchup with deck A against deck B is said to be: The probability that you, using deck A, will beat an average caliber pilot of deck B. Example: As a player of deck A, I have kept track of my match results, and notice that in tournaments, I have beaten deck B 60% of the time. My personal matchup is 60%.
In general, your personal matchups should more closely match the metagame matchups than the theoretical matchups. This is because skill bias is still coming into play. For example, UB Faeries' theoretical matchup against Mono Red was about 25%. The metagame matchup was about 50%. Given that I was an active participant in the metagame that was being studied, my personal matchup was also about 50%.
Here are some basic facts about your personal matchups, as a pilot of deck A, matched up against deck B:
- Assuming that you are exactly as good at playing against deck B as the average deck A pilot in your meta, your personal matchup against deck B will be exactly the metagame matchup.
- If you are better at playing against deck B than the average deck A pilot in your meta, your personal matchup against deck B will be higher than the metagame matchup.
- If you are worse at playing against deck B than the average deck A pilot in your meta, your personal matchup against deck B will be lower than the metagame matchup.
- Neither a theoretical nor a metagame mirror matchup can ever be different from 50%, but your personal mirror matchup can be. It is solely dependent upon how good you are at playing mirrors compared to your opponent.
So this is why a pilot of deck A could say something like "My personal matchup against deck B is 90%" and a pilot of deck B could say, "My personal matchup against deck A is 90%" with neither of them lying.
The concept of your personal matchup together with sunk costs is why metagames tend to be less dynamic than we might otherwise expect. In general, the more time that you invest into learning and playing a deck, the higher your personal matchups with that particular deck are. Someone might be well aware that their deck is the 3rd best choice from a metagame standpoint but still conclude that the correct choice is to play that deck at the tournament this weekend anyways, feeling that their level of play with any other deck would not be high enough to get the results that they want. Players who value the long-term would probably just play the "better deck" anyways, feeling that although their personal matchups will suffer for this particular tournament, the tournament experience playing with the better deck will give their future tournaments a higher expected value. Keep in mind however that it is quite rare for average pilots to win tournaments. The best pilot of the 2nd best deck almost always has a better chance of winning the tournament than an average pilot with the best deck. The long-term goal however should be to be the best pilot with the best deck. In order to do this, a very complete understanding of the fundamentals of the game that you play and good time management are essential.
There is no practical use for this that I can find, but you could also look at player specific matchups: The probability that a specific player with a specific deck beats another specific player with a specific deck. In all likelihood, you will never have a large enough sample to determine player specific matchups with any kind of statistical certainty, which is why I say that it has no practical use. Of course, if you took two individuals who were close to perfect, their player specific matchup should be pretty close to the theoretical matchup. If the player specific matchup is different from the theoretical matchup, it means that the underperforming player is playing less than perfectly (given that it is impossible for someone to be playing better than perfectly).
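To get a feel for why the sample is never large enough, here is a rough sketch of the margin of error on an observed win rate. It uses the standard normal approximation for a proportion, which is itself crude at very small samples, so treat the numbers as ballpark figures.

```python
import math


def margin_of_error(win_rate: float, matches: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed win rate.

    Uses the normal approximation z * sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(win_rate * (1 - win_rate) / matches)


# How uncertain is an observed 60% win rate at different sample sizes?
for n in (10, 100, 1000):
    print(n, round(margin_of_error(0.6, n), 3))
```

An observed 60% over 10 matches carries a margin of roughly 30 percentage points either way, which does not even rule out a losing matchup. At 100 matches the margin is still close to 10 points, and it takes about 1000 matches to get it near 3 points. Two specific players will almost never play that many matches with the same two decks.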
A final note on the Average Prize Model: If you want to accurately predict brackets but also accurately predict the expected value of entering a tournament, the correct inputs for the program are all of the metagame data (metagame matchups, metagame percentages) but also to add an additional deck type for you specifically which uses your personal matchups (with only one player in the tournament being "you"). Using this, you can for example compare your expected value as an average pilot of deck A vs a top-notch pilot of deck B by running the program multiple times and comparing your average winnings in all cases. This is more accurate than the old way of doing things, which was to essentially assume that your skill with any particular deck would be no better or worse than anyone else's.
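The sketch below is not the Average Prize Model itself. It is a much simpler stand-in that only computes your expected single-match win rate against the field, to show how personal matchups and metagame percentages combine when comparing two deck choices. The field shares and matchup numbers are invented for illustration.

```python
from typing import Dict


def expected_win_rate(field_shares: Dict[str, float],
                      my_matchups: Dict[str, float]) -> float:
    """Weight each personal matchup by how much of the field plays that deck."""
    return sum(share * my_matchups[deck] for deck, share in field_shares.items())


# Hypothetical metagame percentages (must cover the decks you expect to face).
field = {"A": 0.5, "B": 0.3, "C": 0.2}

# Hypothetical personal matchups for the two options being compared.
as_average_pilot_of_a = {"A": 0.50, "B": 0.60, "C": 0.55}
as_top_pilot_of_b = {"A": 0.55, "B": 0.65, "C": 0.75}

print(round(expected_win_rate(field, as_average_pilot_of_a), 2))  # prints 0.54
print(round(expected_win_rate(field, as_top_pilot_of_b), 2))      # prints 0.62
```

With these made-up numbers the top-notch pilot of deck B comes out ahead of the average pilot of deck A. A full bracket simulation with prize payouts would be needed to turn this into an actual expected value.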
- Playing a deck with bad theoretical matchups is usually a bad idea. Playing a deck with good theoretical matchups is usually a good idea. Use testing among good players while allowing takebacks to approximate the theoretical matchups.
- Use tournament data (if available) to approximate the metagame matchups. Looking at the metagame matchups should not determine what you play. Rather, it is a good tool to use for predicting how the brackets will break down and thus the EV of your deck choice.
- Use personal matchups to determine which deck has the highest EV for you at a tournament tomorrow. But don't get trapped into a cycle of dedicating more and more time into improving your personal matchups with the same deck and feeling that you only have one viable choice.