{"id":366,"date":"2009-03-14T16:52:40","date_gmt":"2009-03-14T14:52:40","guid":{"rendered":"http:\/\/isabout.wordpress.com\/?p=366"},"modified":"2020-01-02T20:25:31","modified_gmt":"2020-01-02T20:25:31","slug":"overview-of-the-diplomacy-scoring-conundrum","status":"publish","type":"post","link":"https:\/\/www.arkenstonepublishing.net\/isabout\/2009\/03\/14\/overview-of-the-diplomacy-scoring-conundrum\/","title":{"rendered":"Overview of the Diplomacy scoring conundrum"},"content":{"rendered":"<p><a title=\"Boardgamegeek\" href=\"http:\/\/boardgamegeek.com\/boardgame\/483\">Diplomacy<\/a> is one of the most played and researched of modern designer boardgames. Regardless, many interesting theoretical issues remain. One I&#8217;ve been occupying myself with is scoring games &#8211; or more generally, evaluating player performance. I have some vague notion that this&#8217;ll be useful when we have tournaments here in Finland, but mostly I just find this issue an interesting theoretical problem. It&#8217;s so challenging, in fact, that I don&#8217;t have any ready-made answers &#8211; I can formulate the question, but I don&#8217;t have a perfect response.<!--more--><\/p>\n<h3>Formulating the question<\/h3>\n<p>The general, vague form of my question is this: when you&#8217;ve played a game of Diplomacy, who won and who lost, and can anything more be said of the player performance? Furthermore, if we have a number of players participating in a tournament, how much and what sort of play do we need to find out the best player of the bunch &#8211; and ideally, the ranking of the rest as well?<\/p>\n<p>To be more specific, we have a number of interconnected questions here:<\/p>\n<ol>\n<li>The rules of Calhamer Diplomacy state that the game can end in either one player winning and everybody else losing, or with a consensual tie amongst the still surviving players at any point of the game. Is this ideal? Can anything more be said of the player performance? 
Is a victory a more desirable outcome than a tie? Is a scarcer tie (more losers, fewer tied players) more desirable than a wider one? Is getting killed late more desirable than getting killed early?<\/li>\n<li>Diplomacy is rarely played to the end due to the length of the game. If we have to end the game early, can anything be said of the player performance? Given that we want to have tournament environments and we have to be able to compete with unfinished games, how do we move from the perfect Calhamer arrangement into a compromise solution that allows us to score player performance without actually resolving the game?<\/li>\n<li>Given answers to questions 1 &amp; 2, assuming that we are playing a tournament with several rounds of Diplomacy play, how do we combine the results into an overall one? Do we give numerical scores to individual games and then manipulate those numbers? How many games are needed to find a substantial winner out of N players? How many games are needed to rank all players in some meaningful manner?<\/li>\n<p>In addition to these basic questions I am myself concerned with the following <strong>generalized issues:<\/strong><\/p>\n<li>Given that we are able to run tournaments where players play a preset number of games of Diplomacy (question number 3 sorted out, essentially), can we generalize this into an environment where each player plays a variable number of games? If one player participates in one game while another plays two, how do we weigh these performances against each other?<\/li>\n<li>Given that we have figured out question 3, can we generalize this into a system wherein the tournament consists of a number of Diplomacy scenarios (in the sense of <a title=\"My earlier blog post\" href=\"http:\/\/isabout.wordpress.com\/2008\/12\/12\/analytical-boundaries-of-diplomacy-scenario-design\/\">this post<\/a>) instead of sequential plays of the Calhamer scenario? 
What if these scenarios include different numbers of centres or players or other variables?<\/li>\n<li>Finally: given answers to questions 4 &amp; 5, what would a universal scoring system look like? Such a system would need to take as input the performance results of N players playing a variable number of games in different groupings. The system wouldn&#8217;t need to always provide perfect results, as the input might be incomplete, but we would have to be able to know how to achieve such results with further input.<\/li>\n<\/ol>\n<p>Despite my language here I don&#8217;t want to suggest that the answers to these questions necessarily flow top-down &#8211; in fact, it seems probable to me that the last question, if it can be answered meaningfully at all, would need to be considered with each choice on the lower steps. Perhaps these questions are best considered as each internalizing the question before it as a special case &#8211; the special cases can be answered in different ways, but as we require more general solutions the choices valid for lower steps of the pyramid break down.<\/p>\n<p>Now, some thinking on these matters has certainly been done before. Questions 1-3 especially are very practical concerns for Diplomacy tournaments every week all over the globe. As far as I know nobody else is concerned with questions 4-6, but luckily I don&#8217;t need to account for my Diplomacy time to anybody else. Before going into my own musings on those generalized questions and playing variants in a tournament, let&#8217;s look into what has been said of the first three questions:<\/p>\n<h3>1: What should one try to achieve in Diplomacy?<\/h3>\n<p>My inspiration for writing this post was that I recently reread <a title=\"Calhamer at the Diplomacy Archive\" href=\"http:\/\/www.diplom.org\/~diparch\/resources\/calhamer\/objectives.htm\">Objectives Other Than Winning<\/a>, a 1974 piece by Allan Calhamer himself. 
Calhamer writes about a then-recent practice among a certain subset of Diplomacy players: the tendency to satisfy themselves with a &#8220;second place&#8221; achievement in the game on the basis of centre count. I find myself in full agreement with Calhamer on this topic; his argumentation is very cogent: Diplomacy, when played to the finish, only recognizes a winner or a draw. There is no &#8220;second place&#8221;.<\/p>\n<p>However, I also fully understand why crediting other goals has become commonplace among the players of the game: Diplomacy is a hard game, long and exhausting. It is psychologically easy to judge a player&#8217;s performance in the game in different ways even when the rules have no criteria for it. After playing a whole night it might be somewhat dissatisfying that only one player gets official recognition for his play &#8211; the game designer can scream until he&#8217;s blue in the face, but that does not take away the second place finisher&#8217;s satisfaction in at least having beaten the other five players at the table.<\/p>\n<p>Personally, though, I condemn recognizing any results apart from a win or a draw in a Diplomacy game played to the finish &#8211; allowing a player to intentionally play for second place (based on centre count or whatever) without strict censure breaks the game, as it becomes trivial for the leading player to promise the second position to his most dangerous enemy. He will even keep such a promise, as it detracts not at all from his own success.<\/p>\n<p>If players desire to differentiate between performances a bit more, then I suggest looking at elimination dates; the structure of the game is such that I find no problem in claiming that a player who got eliminated earlier played a &#8220;worse&#8221; game than one who survived longer in the game. 
Surviving on the board should be a prime concern for all players anyway, as you can&#8217;t win after you&#8217;ve been eliminated.<\/p>\n<p>Another viewpoint is that Diplomacy is, despite its wargame stylings, a semi-cooperative game. Each draw result (and these are actually rather common in well-played games) is a cooperative victory for the players participating in it. Players even have a chance to &#8220;improve&#8221; the victory by eliminating non-crucial participants from the draw to sharpen it &#8211; this might not matter for a single game in isolation, but when comparing results between several games (to which I&#8217;ll come soon), it&#8217;s clear that a smaller draw is a stronger result.<\/p>\n<h3>2: How to evaluate an unfinished game?<\/h3>\n<p>OK, so I don&#8217;t see much ambiguity in question one, Calhamer&#8217;s right all the way. However, #2 is a much, much more complex beast. When time constraints force us to cut a game of Diplomacy short, as happens most of the time in real life, what can be said of the performances of the players?<\/p>\n<p>I should address the most important thing immediately, and that is centre-count: practically all modern tournament scoring systems count centres on the board to find out which player did better and which did worse after a set number of turns of play. After having judged several tournaments under these sorts of systems I find this scoring playable but ultimately unsatisfactory. There are two issues with it:<\/p>\n<ol>\n<li>Centre count does not reflect the goals of the game perfectly. 
Thus it influences the way the game is played.<\/li>\n<li>An arbitrary cut-off for the game influences the play in major ways at the end &#8211; so much so that I&#8217;m tempted to consider tournament Diplomacy with its last year mad centre grab a variant rules-set for the game, and not necessarily one that benefits it.<\/li>\n<\/ol>\n<p>The problem here is that if we want to be absolutely faithful to the logic of the game, then an unfinished game provides us with no data on player performance &#8211; after all, if the game had continued, any player not eliminated could have gone on to win it. From this viewpoint the only way to use unfinished games as data points is to only count eliminations of players and play however many games it takes to rank the players on the basis of who gets eliminated. Not only would this be prohibitively slow, but it would also strongly favour Powers positioned to avoid early fall &#8211; and, of course, it&#8217;s clear that the game&#8217;s purpose is hardly served if the only concern of each player is survival, not dominance.<\/p>\n<p>Centre count is deservedly the dominant form of scoring unfinished games, considering that centres are something like 80% of the success in high-level games. It might be that an approximate solution is inherently the only possibility in this matter. Almost the only other solution that even comes to mind for me is to use judges to score the game &#8211; a judge could take a glance at the board and tell the players who won, who became second, etc. all based on relative strategic positions. Almost always this&#8217;d produce the same results as centre count, but philosophically it&#8217;s quite different.<\/p>\n<p>The fundamental problem with centre count calculation is that the purpose of skill in Diplomacy is to balance tactical gains with diplomatic losses. Thus a typical mid-game position has a dominant Power being resisted by weaker Powers around it. 
When the game is frozen and centre-count executed, the materially dominant Power benefits from having its strengths considered, while players with a more laid-back and careful play suffer; this might come as a surprise for players who only play time-limited games, but in full games of Diplomacy a grab of material must by necessity be balanced by the diplomatic considerations.<\/p>\n<p>From the viewpoint of diplomatic balancing, one could well consider compensating limited games with some sort of inbuilt subgame that kicks in near the end of the game and allows the players to force the game to reflect their overriding diplomatic concerns around the time the game is frozen and scores tallied. A sort of peace conference or congress, I&#8217;d imagine it &#8211; perhaps the players could form voting blocs to deduct points from their enemies, to get simplistic about it, votes based on centre count&#8230; or players might get the opportunity to form a progressive series of &#8220;unbreakable&#8221; alliances during the last couple of years in the game to reflect the organic concerns they have when the game finally ends. Something like that might be worthwhile to explore, although it might also be too complex to justify itself in a pure Diplomacy tournament.<\/p>\n<p>However individual unfinished games are scored, it seems that such a scoring could only gain validation by staying within the spirit and goals of the full game.<\/p>\n<h3>3: Building tournament systems<\/h3>\n<p>Question #2 is really the stumbling block in these matters, but I want to look into one extra possibility: almost all Diplomacy scoring systems provide us with numerical scores for each game played, and for good reason &#8211; it&#8217;s much easier to compare games, combine the results of several games and thus produce more data to figure out player rankings when the results are numerical. 
However, what if they weren&#8217;t?<\/p>\n<p>In principle we could play elimination tournaments: instead of scoring two or three games and seeing who played best overall, have each game drop players from the tournament until you only have enough for a final table, then play that and see who gets dropped and who doesn&#8217;t. An elimination tournament, when played with complete or nearly complete games, could be extremely Calhamer-faithful; one could decide that all players participating in a draw continue in the tournament, for instance, while any elimination drops a player. The final round could give us a bunch of winners if it ended in a draw: anybody still standing at that point would be an equal victor of the slaughter that the tournament metaphorically became. The weakness of this set-up is, of course, that elimination is a slow business that leaves the eliminated players with less interest in the tournament and play after their defeat.<\/p>\n<p>In principle I am very much in favour of having a top table in tournaments &#8211; this is basic procedure in most of Europe, but I understand Americans just compute scores to find out an overall winner for the tournament. I find it more satisfying to put the players who compete for the top positions up against each other, though. Another mathematically pretty favourable practice around here is a 2\/3-tournament format wherein there are three initial rounds, after which the best two results of each player from those rounds determine which players get to the top table, which is played as a fourth round.<\/p>\n<p>Ultimately, though, the tournament system is not nearly as tricky an issue as the scoring system. 
One depends on the other.<\/p>\n<h2>An effort at universal scoring<\/h2>\n<p>Now, getting back to my own concern, scoring arbitrary length variant scenario tournaments: most existing Diplomacy scoring systems depend on centre count, which makes them largely useless from the viewpoint of variant scenarios: different numbers of centres and potentially different dynamics in gaining and losing them make it difficult to compare results. Different numbers of players open up the question of challenge: it is more difficult to win an 8-player game than it is a 5-player game, but how much so?<\/p>\n<p>As a rough sketch, here are some basic ideas for scoring universal Diplomacy:<\/p>\n<ul>\n<li>A game performance is more definitive and valuable as ranking information when it is more complete (that is, played closer to the end), played with more players, eliminated more players and survived longer in the game.<\/li>\n<li>When counting victory points in our universal tournament, we can determine that the definite solo victory of a Diplomacy scenario for N players is simply worth N points. This is intuitively obvious and simple, a victory is always more definite when it is achieved against more players, assuming that all of those players have equal opportunity for victory. In a full-length game of high-level Diplomacy this is pretty much always the case due to the self-balancing nature of the game, so we don&#8217;t need to know anything more than the number of players that participated in a game.<\/li>\n<li>More intricately, we can determine that a K-way draw (considering a solo victory as a 1-way draw) allows all draw members the same number of victory points. I have two simple notions here: we could give each player N\/K points, thus splitting the solo victory from above into equal parts for each participant of the draw. Or we could determine that the value of the draw is equal to N-K for each player. 
This would lessen the difference between a draw and a solo victory considerably; with the first method a 2-way draw in Calhamer Diplomacy would be worth 3.5 points vs. 7 points for a solo, while in the latter method the numbers would be 5 vs. 6. Both methods rank sharper draws as more valuable than weaker ones, which they should. To choose between those two principal ideas, let&#8217;s compare some games. Which is more valuable, winning an 8-player game or a 3-way draw in a 10-player game? The former system gives us 8 vs. 3.33, the latter 7 vs. 7. I am inclined to lean toward the latter interpretation of value in some ways: in both games the winner(s) managed to eliminate 7 other players without being themselves among those eliminated. On the other hand, a 3-way draw avoids the end-game crunch of solo victory, which makes it less prestigious and definitive. So I think I&#8217;m siding with the split solution.<\/li>\n<li>The actual difficult part comes with how to handle incomplete games. These are always less definite than complete ones, so it stands to reason that they should be worth fewer points. How many fewer? Because it&#8217;s just about impossible to develop a general function for estimating the state of finish in an ongoing Diplomacy game, I&#8217;m going with a moral measure: the game approaches the state of being finished as the players put more work into playing it. In other words: the more years the players play, the more authoritative the results are, even if the players do not manage to resolve the game. In practice this could come into play as a percentage multiplier for the score totals: a game played for x years might get f(x) as the multiplier, where f is some function that approaches 1 asymptotically from below. 
Thus when we have two otherwise identical board positions, but one group has tried longer to find a resolution to the situation, the longer-suffering group is entitled to a larger share of points.<\/li>\n<li>The other half of the incompleteness conundrum is the issue of how to split the available points among the players. As I describe above, I don&#8217;t like centre-counting too much&#8230; my inclination would be to at least try a solution wherein the current leading player (by centre-count) tries to form a majority coalition (by centre count again) which gets points like in a draw (less the non-finish penalty from the previous step), and should he fail, the next player in line could try, and so on &#8211; if nobody manages to form a majority coalition, then everybody loses and nobody scores from the game. I&#8217;d be surprised if this were an ideal solution, considering the opportunities for metagaming, but it might beat pure centre-counting in some circumstances &#8211; I especially like how your centre count doesn&#8217;t directly turn into points but instead just makes you a more likely candidate for a member of a majority coalition. As points are divided evenly between coalition members, the negotiators have a motivation to keep the coalition as small as possible, which means taking only significant players &#8211; but there is enough freedom to leave out somebody who annoyed you in the game or refuse to get into coalition with a bigger partner who&#8217;d need you to get that majority. This last bit is the one I&#8217;m most suspicious of, as this last choice in the game isn&#8217;t constrained by diplomacy in the way other choices are; that&#8217;s a recipe for metagaming. 
Perhaps I&#8217;d need to have the players make their coalition choices a couple of turns before game end, or something like that.<\/li>\n<\/ul>\n<p>This way we have a pretty complete scoring system for Diplomacy, and it&#8217;s a system that doesn&#8217;t care about the number of players or centres or even whether the individual games are short or full-length. Now I&#8217;ll just need to get some volunteers to playtest a variant tournament; could be fun when you wouldn&#8217;t necessarily know at all what sort of map and how many players you&#8217;d face in a given game.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Diplomacy is one of the most played and researched of modern designer boardgames. Regardless, many interesting theoretical issues remain. One I&#8217;ve been occupying myself with is scoring games &#8211; or more generally, evaluating player performance. I have some vague notion that this&#8217;ll be useful when we have tournaments here in Finland, but mostly I just [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"r
epeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[16,5,32],"tags":[],"class_list":["post-366","post","type-post","status-publish","format-standard","hentry","category-diplomacy","category-gaming-analytics","category-volume1"],"featured_image_src":null,"author_info":{"display_name":"Eero Tuovinen","author_link":"https:\/\/www.arkenstonepublishing.net\/isabout\/author\/eerotuovinen\/"},"post_mailing_queue_ids":[],"_links":{"self":[{"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/posts\/366","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/comments?post=366"}],"version-history":[{"count":0,"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/posts\/366\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/media?parent=366"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/categories?post=366"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.arkenstonepublishing.net\/isabout\/wp-json\/wp\/v2\/tags?post=366"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}