PAXsims

Conflict simulation, peacebuilding, and development

Barzashka: Do academic standards for research excellence apply to professional wargaming?

The following item has been written for PAXsims by Ivanka Barzashka (Managing Director of the Wargaming Network at the School of Security Studies at King’s College London), based on her recent presentation at the Connections US 2021 professional wargaming conference. This article expresses her personal views and does not necessarily reflect an institutional position. 


Wargaming has long been practiced as a professional enterprise but is only emerging as an academic discipline. Civilian universities play an important role in bringing established standards of academic excellence to the theory and practice of wargaming for both research and education. 

Ensuring excellence in analytical wargames is especially important as governments increasingly look to wargames to innovate and inform decisions. The pool of analytical wargame providers is rapidly expanding beyond the well-established expert circles. 

To achieve and demonstrate research excellence, analytical wargames need to follow established research integrity and ethics principles. Researchers, institutions and funders can then ensure these principles are implemented in practice, and action is taken when behaviours fall short. 

Research Integrity & Ethics in Analytical Wargaming 

What are these fundamental principles and how do they apply to wargames used for research? I answered this question in a recent presentation at the US Connections Professional Wargaming Conference from an academic perspective informed by King’s College London policies. I also highlighted some challenges facing scholars who wargame. You can listen to the talk here and view my slides below. 

The key takeaway: while wargaming scholarship is progressing, there is still a way to go. To properly meet research integrity standards, we need more fundamental research on wargaming, more educational opportunities in wargaming theory, methods and practice, and appropriate publication outlets. It is impossible to follow “disciplinary standard and norms” when scholars do not know or agree what these are. It is difficult to demonstrate rigour in “using appropriate research methods” when scientifically-sound analytical wargaming methods are only beginning to emerge in the open literature and are being applied for the first time for scholarly inquiry. Academics who strive for “transparency and open communication” still have issues publishing wargame findings in reputable journals.  

Expectations for Scholars vs Professional Wargamers 

But to what extent do research integrity and ethics requirements for analytical wargaming differ for academics versus professional wargamers? To advance this discussion, I offer three propositions.  

Most, but Not All, Academic Research Integrity Principles Apply to Professional Wargaming 

First, while the general principles for research excellence are fairly standard, a major difference between academic and professional wargaming is the expectation for transparency and open communication. For example, analysts who use wargames to support research for government clients are not expected to make their methods and findings available to others. In contrast, scholars are required to publish research and are promoted on the number of publications. 

Responding to Research Misconduct and Questionable Research Practices 

Second, the extent to which research integrity principles are applied in practice differs significantly among institutions and sponsors. This includes taking appropriate measures when there is evidence of research misconduct or questionable research practices.  

Research misconduct, which includes falsification, fabrication, plagiarism and misrepresentation, is a potentially fireable offence at a university. But could professional wargamers lose their jobs over poor game design or inadequate analysis of gameplay data?  

Looking at King’s definitions for categories of misconduct, many common wargaming practices would raise red flags in academe. Here are some examples: 

Falsification includes “inappropriate manipulation and/or selection of a research process.” According to this definition, creative injects by a control team that affect or determine outcomes of player decisions, but do not clearly link to research objectives and protocols, would raise questions.   

Misrepresentation includes “suppressing relevant results or data, or knowingly, recklessly or by gross negligence representing flawed interpretation of data.” Cherry-picking insights from a plenary discussion, while ignoring gameplay data, would get wargame analysts in trouble in this category. 

Another issue is plagiarism. Not acknowledging other people’s “ideas, intellectual property or work (written or otherwise)” in wargame design would be especially problematic in an academic setting but is common practice in the gaming community. 

Major Research Ethics Risks 

My third proposition concerns research ethics. The ethical issues that arise from the application of a particular analytical wargaming method that collects data from human subjects are mostly the same – regardless of whether the principal investigator works for a university or a government agency. However, the likelihood and consequence of ethical risks materialising will differ significantly in different settings. 

Scholars applying for research ethics review of an analytical wargaming process are most worried about preserving anonymity of research participants and ensuring the confidentiality of personal data. This risk arises because wargames are conducted in group settings and require support from large research teams (e.g. rapporteurs and facilitators). 

However, scholars can effectively manage these risks by carefully applying best practices, such as minimisation of directly or indirectly identifiable personal data, pseudonymisation, access limitation, data separation and retention policies. These risks can be further reduced by careful recruitment and training of game staff. (At King’s, we spend 6 months selecting and training our wargame rapporteurs.) 
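To make some of these practices concrete, here is a minimal sketch of how data minimisation, pseudonymisation and data separation might look when handling gameplay records. It is an illustration under stated assumptions, not a description of any King’s procedure: the record fields (`name`, `email`, `turn`, `decision`), the function names and the `P-` pseudonym format are all hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymise(name: str, key: bytes) -> str:
    """Return a stable pseudonym for a participant using a keyed hash (HMAC-SHA256)."""
    normalised = name.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalised, hashlib.sha256).hexdigest()
    return "P-" + digest[:10]  # hypothetical pseudonym format


def separate_records(records, key):
    """Split raw records into analysis data (no direct identifiers) and a linking table."""
    gameplay, linking = [], {}
    for record in records:
        pid = pseudonymise(record["name"], key)
        # Data separation: the linking table is meant to be stored apart from
        # the gameplay data, with access limited to those who need it.
        linking[pid] = record["name"]
        # Data minimisation: keep only the fields needed for analysis.
        gameplay.append(
            {"participant": pid, "turn": record["turn"], "decision": record["decision"]}
        )
    return gameplay, linking


# Illustrative data only; the names and decisions are invented for this sketch.
key = secrets.token_bytes(32)  # secret key, kept with the linking table
raw = [
    {"name": "Alice Example", "email": "alice@example.org", "turn": 1, "decision": "escalate"},
    {"name": "Bob Example", "email": "bob@example.org", "turn": 1, "decision": "negotiate"},
    {"name": "Alice Example", "email": "alice@example.org", "turn": 2, "decision": "hold"},
]
gameplay, linking = separate_records(raw, key)
```

In this sketch, analysts would work only with `gameplay`, while `linking` and `key` would be held separately and destroyed in line with the retention policy. Pseudonymised data can still be indirectly identifying in a small group setting, so a step like this complements, rather than replaces, the other measures.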

For professional wargamers, the major ethical risk is the conflict of interest between them and their sponsor. Stephen Downes-Martin describes the issue well in this article. Research ethics problems deepen when lines of responsibility and accountability are not clearly defined, and when the research process is not (or cannot be) made transparent. Mitigating these risks requires clear communication between a wargame provider and their sponsor, but doing so might not be in the self-interest of the parties involved. 

Other ethical risks will be just as big, regardless of setting. For example, risks of harm to individuals could result from using wargames to investigate topics that could trigger stress or violence. If a principal investigator uses deception, including not fully informing participants of the purpose of the wargame, this also raises ethics concerns. (Thanks to Rex Brynen for highlighting these points.) 

Are Scholars Better Positioned to Ensure Research Excellence in Wargaming?

Ensuring and demonstrating research excellence in wargaming requires understanding, applying, and enforcing integrity and ethics principles. These principles are well established, but expectations differ in academic and professional wargaming settings. Professional wargamers face greater ethical risks than scholars who wargame, and these risks cannot be easily mitigated.   

Overall, scholars at universities are better positioned to ensure research excellence in wargaming than their professional wargaming colleagues. This does not mean professional wargamers are less interested in honesty, rigour, transparency, or ethics. Quite the contrary: the wargaming community of practice is conscious of these risks and limitations, and the topic of this year’s US Connections conference is testimony to this fact. But there are powerful institutional incentives that influence research integrity and ethics in practice, which cannot be wished away. 

If professional wargamers want to effectively demonstrate research excellence in wargaming, they should consider taking a sabbatical at a university. 

Ivanka Barzashka

3 responses to “Barzashka: Do academic standards for research excellence apply to professional wargaming?”

  1. Ivanka Barzashka 12/07/2021 at 1:15 pm

    @Tim Smith Thank you for your thoughtful response. To emphasize a point where we appear to diverge:

    An analytical wargame, that is, a wargame that is used for research purposes, involves primary data collection from human subjects. Analytical gaming activities are subject to research ethics clearance from an appropriate body.

    Universities have their own research ethics committees (called Institutional Review Boards in the United States). Government agencies do too. For example, the UK Ministry of Defence has MODREC, which reviews gaming activities across the MOD: https://www.gov.uk/government/groups/ministry-of-defence-research-ethics-committees

    Educational games are not subject to such formal requirements but, as my colleague Dr David Banks argues, should be subject to increased ethical scrutiny. See his talk and further discussion here: https://www.youtube.com/watch?v=8hDbk7uW-CE

  2. Lou Coatney 03/07/2021 at 4:42 pm

    While I think there should always be evaluation as far as relevance to reality and efficiency/productivity of the game (system)s, I think “free” wargaming not chained by academic restrictions (or distorted by reigning academic ideology) should continue to be the way to go.

    Academic degrees, protocols, and hierarchy do not guarantee relevant creativity, and there is justified suspicion they can handicap that.

  3. Tim Smith 30/06/2021 at 7:19 am

    Ivanka Barzashka, Managing Director of the King’s College Wargaming Network, offers a proposal that touches upon both academic and ethical standards, two important issue areas that must be distinguished clearly. Academic standards uphold principles of scientific validity — soundness of evidence and data combined with logical coherence in inference and theory. The second set of ethical standards provides protections for individual subjects of typically psychological or medical research, many of whom are students or private citizens. In the U.S. this effort is associated with the Belmont Report on human-subjects research and its enforcement by university ‘Institutional Review Boards’; other nations have similar policies and committees.

    Defense and other applications of modelling, simulation, and analysis (with wargaming as an important, distinct form of M&S) do require sound scientific standards and the ethical conduct of evidence-based research. They do not, however, treat the human participants as the subjects of the research, any more than any other form of scientific/academic research treats the researchers and support staff as the subjects of their own research. Wargaming involves humans in a more visible way than other forms of defense decision-support, but the participants are not the subjects.

    Barzashka is right, on the other hand, to emphasize scientific standards. The effectiveness of defense planning and decision is a function of the knowledge gained through the supporting wargaming, M&S, and analysis, which in turn is a function of the extent to which these uphold methodological and broader scientific standards. These processes do need a renewed infusion of scientific rigor and methodological modernization, of the kind undertaken during World War II and thereafter. These standards have eroded even while extensive methodological modernization has occurred outside the organizations tasked with defense decision support. These tend, in the U.S. at least, to retain increasingly obsolete methodology, methods and processes employing, for instance, stovepiped, tools-centric techniques in a complex era that demands and offers cross-stovepipe, multimethod integration principles and practices; discrete-event simulation models in an age that demands and offers system-dynamic and agent-based models; frequency-based statistics in an age that demands and offers Bayesian learning, and unvalidated ‘free Kriegsspiel’ wargame models (if we can call them that) that lack rigor, in an age that demands and offers extremely low-cost, rigorous, validated models for ready defense use.

    Some conceptual clarification might help us scope out the optimal way ahead, starting with the multiple meanings of the noun ‘game’ and the specificity of the prefix ‘war’. First the prefix: a wargame, per se, is a game on warfare at the strategic, operational or tactical levels (conventional, unconventional; air, land, sea; historical or current, etc. — many applications). And ‘game’ refers, rather ambiguously, either to the rules, algorithms, mechanics and data values (basically, the causal architecture) of a game design or, alternatively, to events in which the game is being played. The former is an ‘object’, if you will, the latter a process employing the object.

    A wargame ‘object’ is a model, in manual/paper or computerized/digital form, that depicts the warfare phenomena of interest, a model itself being a reduction of theory to manipulable and therefore rationally/empirically testable form (as ‘well-formed formulae’). By contrast, the wargame process or event involves the use of that model for certain purposes: e.g., analysis, education/training, entertainment. Whichever the use, ‘playing’ or otherwise implementing/running the model over time is simulation: participants ‘role-play’ (simulate) real-world decision-makers fulfilling specified command roles.

    ‘Analytic’ (decision-support) wargaming plays a unique role in defense research, serving as an integrated analytico-synthetic method that adds substantial value in the ‘problem structuring’ phase of a sound multimethod research paradigm/programme, in which ‘analysis’ (e.g., operational research), game theory, and computational M&S all can and should play vital, distinct and interdependent roles. Wargaming, done properly (that is, using low-cost, validated models) facilitates the isolation and manipulation of causal variables, the discernment of pattern in complex phenomena, the formulation of hypotheses, and initial sensitivity testing to weight the variables and prioritize the hypotheses for subsequent in-depth testing using computational and quantitative tools and techniques (and note that OR and M&S are no less ‘human-directed’ than is live gameplay; it’s just hidden behind the curtain: every command decision and behavior enacted by a roleplayer in live and virtual simulation is coded by a software engineer in constructive, computational models; the main difference being that the former infuses domain expertise in the behavioral representations). Following Peter Perla, defense wargamers refer to this spiral learning process as the ‘cycle of research’ (Argyris & Schön speak similarly of ‘double-loop learning’ in organizational development).

    Thus ‘analytic’ wargaming is concerned not so much with the behavior of individual participants as with cause-and-effect in the external, objective domain of interest. Wargaming engages domain experts in the initial modelling of the warfare problem. Expert domain theoreticians design the model and extend its range of application. Testing for internal coherence (‘verification’, in M&S speak) is performed internally, then expert domain practitioners participate to subject the model to tests of external validity (‘validation’) and then participate in wargame events that employ the model to explore the decision space in the domain being represented, with a view toward substantive discovery.

    The model creates situations that drive decision-making, especially if implemented (‘played’ in simulation) by competent domain experts (rational actors), whether civilian or military. That is, consistent with the range of variability/uncertainty inherent in the situation being modelled, a valid model will bound the range of ‘player’ choice in all the various decision points and their tactical details. Over the course of a limited and generally manageable series of simulation executions (with the aforementioned low-cost, validated model), different sets of competent role-players will trend toward a consistent pattern of statistically normative outcomes (‘expected values’). This is particularly the case in stable environments such as, for instance, land warfare between the late Napoleonic era and World War I or again in late World War II, naval surface warfare between heavily armored warships, and long-term attritional campaigns such as conventional strategic bombing and naval guerres de course.

    The model, therefore, is the predominant governing factor, through its algorithmic relations among variables and its quantification schema (magnitudes of the variables’ values). The domain expertise/competence of the human participants will vary, of course, but that is not what analytico-synthetic wargaming, conducted as a phase in decision-support of defense planning or policy, examines. Brilliant or incompetent participant decision-making might be noted but is not the source of research findings.

    In fact, the entire value of wargaming in the paradigm/programme/cycle lies in the model and the warfare outcomes it demonstrates in given scenarios (initial conditions). What are the rough-order-of-magnitude (‘ROM’) sensitivities between causes and effects (independent and dependent variables); where do spikes, inflections and cascades occur and diminishing marginal utilities follow? What variation in force quantities and combat qualities count most and least? How do these relate to ‘ROM’ costs in the national defense program or to logistical consumption rates in the forward area in a future war?

    There would appear to be no role for Belmont/IRB intrusion in this realm, where paid defense professionals design and implement wargames as part of their jobs and no after-action reports critique their performance or behavior.

    Things are a bit different in educational/training wargaming, a dimension of defense preparedness more akin to the academic educational (vice research) mission. But here students are graded on their individual performance and team contributions through wargame lab projects and associated in-class drills just as they are on papers and objective tests. It is just another form of graded coursework. Thus there is no role for Belmont/IRB intrusion here either (Rex Brynen commented in the recent Connections USA conference that Canadian policy, as one exemplar, exempts in-class materials and work from such oversight).

    Were a researcher to propose research into student performance in coursework (of any nature), or into human performance in any activity, then the human subjects would properly deserve IRB oversight. But that would seem the main and perhaps the only kind of research in which such protections are warranted or appropriate.

    Finally, with regard to a third, very prominent domain of wargaming, Stephen Downes-Martin and Robert Rubel have written persuasively on the epistemological, evidential and ethical pitfalls encountered in the large-scale ‘Title X’ wargame events hosted by US DoD/service colleges/universities. We might be right to train our gaze here. These events, however, are somewhat unique. They are neither rigorously scientific nor rigorously pedagogical. They do not directly, formally inform analysis for procurement or policy planning nor do they directly teach students warfare principles/practices. But they do shape perceptions and assessments in defense leadership ranks and must therefore be as epistemologically sound as practicable. It might be time to subject them to a searching methodological evaluation with a view toward a fresh infusion of scientific rigor.

    These large-scale, high-level wargame events do not employ validated simulation models. They do not entirely replicate the wargaming performed at the US Naval War College (and the German Kriegsakademie) prior to the Second World War nor that in the Royal Navy’s Western Approaches Tactical Unit (WATU) during the war. It’s not clear they fully replicate the practices the US Navy undertook in its vigorous wargaming and fleet exercise program in the 1970s-80s. In all of these, the warfare models were more or less rigorous, as validated by empirical data and real-world testing as conditions allowed, and implementation, and hence testing, was repeated on numerous occasions for validation and assessment purposes. Yet today’s simulation-gaming environment, manual and computational, permits substantially higher levels of scientific validity than anything achieved in previous eras of pre-war wargaming, warranting a thorough review of practices in high-level defense wargaming.

    But at that, it’s still conducted by paid defense professionals functioning not as private personalities but rather as anonymous rational actors. What it requires is philosophical governance, not bureaucratic and legal. And this is true of wargaming across the board, not only in professional institutions, but perhaps even, with the aforementioned exception of research into the behavior of living human subjects, in academic research and pedagogy as well. Given concerns raised in academe concerning threats to academic freedom posed by widespread zeal and rigidity in IRB rulings (see for instance the following statement by the American Association of University Professors: Research on Human Subjects: Academic Freedom and the Institutional Review Board, https://www.aaup.org/report/research-human-subjects-academic-freedom-and-institutional-review-board), other research institutions might be advised to cast a wary eye toward contemporary university practices. Where the stakes of research are high, today’s university ethics policies and practices might serve more as a warning than as a beacon.
