EDHREC at Home: Part 1 - Introduction


Another Day, Another Project

Hi! My name is Michael Celani, and since I'm writing this article series for an audience that's ostensibly wider than my usual demographic, allow me to introduce myself.

I'm someone who loves solving problems in creative ways; it's my raison d'être, if you'll pardon my pretentiousness. I'm a bit of a night owl; if something grabs my attention, I'm working on it until three in the morning. Coding is an obvious pick for a guy like me, since you can solve (or cause) pretty much any problem with computers, and you don't have to make loud noises with a power saw to do it.

I also enjoy card games, particularly Magic: The Gathering. I've made a bit of a name for myself with my unconventional Commander decks, which generally work through some unexpected synergy, quirk of the rules, or black magic. I love writing up articles about them where I explain how they work while pretending to be a carnival barker or something. No, that's not a joke: that one's real, and now I'm considering the prospect of a potential future employer finding that.

As someone who builds these decks constantly, I figured that I'd benefit from a tool that automatically displays interesting, synergistic cards for the list I'm working on. In other words, I want a recommendation algorithm for Magic decks. Anyone in the scene will immediately shout, "Hey! Isn't that just EDHREC?" to which I'll respond, "Christ, read the title of the article, I know."


But Why Not Use Actual EDHREC?

As a quick overview, EDHREC is an indispensable tool for Magic deck building. It's a website that aggregates player-created decklists from a multitude of different sources, like Moxfield and Archidekt, and provides handy statistics pages for each potential commander (which is the defined leader of a constructed deck). You can see which commanders are popular, which cards their decks tend to run, and even what types of strategies they're associated with.

It also has a feature where you can provide your own decklist to the website, and have it recommend cards for you based on the data it's collected. This feature is notable for being both exactly what I want and also laughably useless — or at the very least, useless for my purposes.

Its big flaw is that it doesn't seem to take into account what your particular list is trying to do. Instead, it appears to diff your deck against the set of most common cards for a given commander, and then remind you that you forgot to include something that eighty percent of other builders did. Well-known staples like Swords to Plowshares abound in its list of recommendations, and that just didn't cut it for me, because I often take commanders off the beaten path. I'm not searching for surface-level don't-forget-mes; I'm searching for hidden gems and deep cuts.
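To make that complaint concrete, here's a minimal sketch of the kind of "diff against the staples" logic I suspect is happening. To be clear, this is my guess at the behavior, not EDHREC's actual code: the function name `missing_staples`, the 80% threshold, and the sample decklists are all made up for illustration.

```python
from collections import Counter


def missing_staples(my_deck, other_decks, threshold=0.8):
    """Return cards that at least `threshold` of other decks run but mine doesn't."""
    if not other_decks:
        return []
    # Count how many decks each card shows up in (once per deck).
    counts = Counter(card for deck in other_decks for card in set(deck))
    total = len(other_decks)
    mine = set(my_deck)
    return sorted(
        card
        for card, n in counts.items()
        if n / total >= threshold and card not in mine
    )


# Invented sample data: five decks for the same hypothetical commander.
other_decks = [
    ["Sol Ring", "Swords to Plowshares", "Smothering Tithe"],
    ["Sol Ring", "Swords to Plowshares", "Esper Sentinel"],
    ["Sol Ring", "Swords to Plowshares", "Smothering Tithe"],
    ["Sol Ring", "Swords to Plowshares"],
    ["Sol Ring", "Path to Exile"],
]
my_deck = ["Sol Ring", "Esper Sentinel"]

print(missing_staples(my_deck, other_decks))  # ['Swords to Plowshares']
```

Notice that nothing in there looks at what my deck is actually trying to do. It only knows what everybody else is running, which is exactly why it keeps handing me staples.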

The solution, of course, is to get good at finding unique cards myself. But that sounds like I'd have to improve and cultivate a new, intrinsic skill through practice, research, and critical thinking, which, if ChatGPT's enduring popularity has shown anything, is clearly not the way of the future. Since getting good is out of the question, I settled on the next best thing: building a wholesale copy of EDHREC with a recommendation algorithm I actually like. Don't tell me you wouldn't make the same choice.


A Plan of Attack

There's a lot of prep work I'll have to do before I work on the recommendation portion, though. I have to actually get all the data I'll be using to train my algorithm, and that is a project all its own. Here's my master plan for making my own EDHREC at Home:

First, I'll need to figure out how to represent a commander deck in code. A commander deck is obviously just a list of cards, but that just kicks the can down to figuring out how to represent a card. Not only is there a ton of superfluous information on a Magic card that I don't really need to concern myself with, but there are also loads of idiosyncrasies and edge cases I'll have to account for, like double-sided cards and tokens. Picking the right data points to focus on can make or break my project in the long run.
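As a first stab at the shape of the problem, here's a rough sketch of what a card and a deck might look like. Nothing here is settled: the `Card` and `Deck` classes and every field on them (`faces`, `is_token`, and so on) are placeholder assumptions, and the real model will almost certainly look different once the edge cases get their say.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Card:
    """One card, trimmed down to the handful of fields this sketch assumes matter."""
    name: str                    # canonical name; the front face for double-sided cards
    faces: tuple[str, ...] = ()  # every face name, left empty for single-faced cards
    is_token: bool = False       # tokens get flagged so they can be filtered out


@dataclass
class Deck:
    """A commander plus the list of cards that come along for the ride."""
    commander: Card
    cards: list[Card] = field(default_factory=list)

    def card_names(self) -> list[str]:
        """Names of the real cards in the list, skipping any tokens."""
        return [card.name for card in self.cards if not card.is_token]


deck = Deck(
    commander=Card("Atraxa, Praetors' Voice"),
    cards=[
        Card("Sol Ring"),
        Card("Delver of Secrets", faces=("Delver of Secrets", "Insectile Aberration")),
        Card("Treasure", is_token=True),
    ],
)
print(deck.card_names())  # ['Sol Ring', 'Delver of Secrets']
```

Making `Card` frozen means two copies of the same card compare equal and can live in sets or serve as dictionary keys, which seems handy for counting cards across decks later.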

Next, I'll have to find a way to store all these decklists so I can run my analysis on them later. I want my data store to be performant and easy to reason about; that way, it's not only simpler to develop and debug, but also potentially reusable in the event I want to slot it into another project. Being able to reason about my data can also be a major source of inspiration in the future, as querying and investigating it manually can help me come up with new and interesting ideas for features.
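To show what "easy to reason about" could mean in practice, here's a small sketch using SQLite through Python's built-in `sqlite3` module. This is purely an assumption for illustration: I haven't committed to any particular database, and the table names, columns, and the `save_deck` and `card_counts` helpers are all hypothetical.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE decks (
        id INTEGER PRIMARY KEY,
        source TEXT NOT NULL,
        commander TEXT NOT NULL
    );
    CREATE TABLE deck_cards (
        deck_id INTEGER NOT NULL REFERENCES decks(id),
        card_name TEXT NOT NULL
    );
""")


def save_deck(conn, source, commander, card_names):
    """Insert one deck and its cards, returning the new deck's id."""
    cur = conn.execute(
        "INSERT INTO decks (source, commander) VALUES (?, ?)",
        (source, commander),
    )
    deck_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO deck_cards (deck_id, card_name) VALUES (?, ?)",
        [(deck_id, name) for name in card_names],
    )
    conn.commit()
    return deck_id


def card_counts(conn, commander):
    """How many decks for this commander run each card, most popular first."""
    rows = conn.execute(
        """
        SELECT card_name, COUNT(DISTINCT deck_id) AS n
        FROM deck_cards
        JOIN decks ON decks.id = deck_cards.deck_id
        WHERE decks.commander = ?
        GROUP BY card_name
        ORDER BY n DESC, card_name
        """,
        (commander,),
    )
    return rows.fetchall()


save_deck(conn, "example-site", "Atraxa, Praetors' Voice", ["Sol Ring", "Doubling Season"])
save_deck(conn, "example-site", "Atraxa, Praetors' Voice", ["Sol Ring", "Swords to Plowshares"])
print(card_counts(conn, "Atraxa, Praetors' Voice"))
```

The appeal of something like this is that the "investigate it manually" part comes for free: any question I can phrase as a query is a question I can answer without writing new code.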

Once I've got the model and data storage down, it's time to actually crawl deck-building websites. That means reverse-engineering each website's public API and using it to download all the decks I can. I'll have to convert the disparate responses I receive from each unique API into a normalized form that my software can understand, and then save them to my data store for later processing. This'll be a slow and gruelling process, but it's gotta be done.
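The normalization step is the part that's easiest to picture in code, so here's a minimal sketch of it. Fair warning: both payload shapes below are invented and don't reflect what any real deck-building site returns, and the names `site-a`, `site-b`, and `normalize` are stand-ins I made up. The idea is just one small adapter per source, all producing the same shape.

```python
def normalize_site_a(payload: dict) -> dict:
    """Hypothetical site A: cards arrive as a list of {"name", "qty"} objects."""
    return {
        "source": "site-a",
        "commander": payload["commander"],
        "cards": {entry["name"]: entry["qty"] for entry in payload["cards"]},
    }


def normalize_site_b(payload: dict) -> dict:
    """Hypothetical site B: everything is nested under "deck", keyed by card name."""
    deck = payload["deck"]
    return {
        "source": "site-b",
        "commander": deck["leader"]["cardName"],
        "cards": {name: info["count"] for name, info in deck["mainboard"].items()},
    }


# One adapter per source; adding a new site means adding one function here.
NORMALIZERS = {
    "site-a": normalize_site_a,
    "site-b": normalize_site_b,
}


def normalize(source: str, payload: dict) -> dict:
    """Convert a raw response into the common {source, commander, cards} shape."""
    if source not in NORMALIZERS:
        raise ValueError(f"no normalizer registered for {source!r}")
    return NORMALIZERS[source](payload)


# Two made-up responses describing the same deck in different shapes.
site_a_payload = {
    "commander": "Atraxa, Praetors' Voice",
    "cards": [{"name": "Sol Ring", "qty": 1}, {"name": "Forest", "qty": 4}],
}
site_b_payload = {
    "deck": {
        "leader": {"cardName": "Atraxa, Praetors' Voice"},
        "mainboard": {"Sol Ring": {"count": 1}, "Forest": {"count": 4}},
    }
}

a = normalize("site-a", site_a_payload)
b = normalize("site-b", site_b_payload)
print(a["cards"] == b["cards"])  # True
```

Once everything downstream only ever sees the normalized shape, the rest of the software doesn't have to care which website a deck came from.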

Then, I've got to actually train the recommendation algorithm. This'll be tricky for me, since I've got no prior experience in any sort of machine-learning algorithm, but learning is a part of the process.

Finally, I've gotta write a web front-end that displays the recommendations in a pleasant way, because querying a recommendation algorithm via the command line sounds like an absolute nightmare.


'Til Next Time

With our project outlined, all that's left to do is to get to work. But this introductory article is already long enough, so join me next time, when we get into the nitty-gritty of modelling our data.

Next: Part 2 - Data Modeling