Research Article

Tool-Driven Revolutions in Archaeological Science

Authors:

Sophie C. Schmidt,

Institute of Archaeology, University of Cologne, DE

Ben Marwick

Department of Anthropology, University of Washington, Seattle, US

Abstract

There is an argument in philosophy of science that revolutions in science are either idea-driven or tool-driven. We explore this debate in light of recent efforts by many scientific disciplines to embrace methods to improve the reproducibility of their research. One of the most profound changes driven by this concern for reproducibility and transparency is from analysing data using tools dependent on point-and-clicking with a mouse in closed source software, to tools based on writing scripts in open source programming languages and making them openly available. We present bibliometric evidence for this change in ecology and in archaeology to test if the adoption of these new tools is revolutionary or transformational. We identify a positive citation effect for papers that use the open source programming language R. We discuss how computational approaches to improving reproducibility and transparency in archaeology are mediated and transformed by the use of R code.
How to Cite: Schmidt, S.C. and Marwick, B., 2020. Tool-Driven Revolutions in Archaeological Science. Journal of Computer Applications in Archaeology, 3(1), pp.18–32. DOI: http://doi.org/10.5334/jcaa.29
  Published on 28 Jan 2020
 Accepted on 13 Dec 2019            Submitted on 01 Jan 2019

1. Introduction

In this paper we investigate recent developments which are facilitating computational archaeology’s return to fundamental principles of the scientific method. We ask whether this might be the beginning of a broader change in the field. Revolutions or paradigm-shifts in archaeology have been proposed to occur following major theoretical statements (Clark 1993; Härke 2002), but there is an argument among philosophers of science whether revolutions in science are idea-driven or tool-driven (Dyson 2000; Galison 1997). This argument motivated us to explore how the disciplines of ecology and archaeology have started to embrace tools that improve the reproducibility of research. We focus on one set of tools for this change, based on writing data analysis scripts in free and open source programming languages, exemplified by R, and the practice of sharing of these scripts. We focus on R because our prior observations indicate it is by far the most common scripting language used by ecologists and archaeologists. Bibliometric evidence shows a strong increase in the use of R among ecologists and the beginning of a similar development in archaeology. We evaluate how approaches to improving reproducibility and transparency in archaeology are mediated and transformed by digital approaches and propose these might reflect a tool-driven change in archaeology. Acknowledging that this is not a simple process, we offer an R-based tool to ease the task of creating a compendium which enables other researchers to reproduce the published results.

2. Revolutions in Science: Idea-driven or Tool-driven?

Philosophers of science disagree on whether revolutions in science are idea-driven or tool-driven. One of the most widely-known models of idea-driven change in science is Kuhn’s (1962) effort to describe the history of science by proposing paradigm shifts resulting from revolutionary change in communities of researchers. In brief, this model proposes that the history of science consists of long periods of tradition-bound ‘normal science’ punctuated by short episodes of ‘revolution’. Normal science was described by Kuhn as stagnant, routine day-to-day research focused on ‘puzzle solving’. What counts as suitable puzzles, and acceptable solutions to these puzzles, is governed by the norms and procedures of the prevailing paradigm. Revolutions in science occur as punctuations in the equilibria of normal science, when anomalous observations culminate in changes replacing one paradigm with another. In this conversion or speciation process the community adopts a new way of doing science. There has been much debate about how to draw a distinction between so-called normal and revolutionary science (Casadevall & Fang 2016; Toulmin 1970; Watkins 1970). A notable criticism is that Kuhn’s normal science is unscientific, as it describes a situation where critical science had contracted into defensive metaphysics resulting from the domination of a ruling dogma (Popper 1970). This is at odds with a view of science as a continuous state of evolution, with researchers simultaneously employing many styles of thinking and doing science, and continuously confronting theory with evidence and modifying their ideas based on the outcome. In this view, based on the post-World War II era of science, radical discontinuities are rare, and major developments instead emerge from the division and recombination of already mature fields (Andersen 2013). Given the rarity of radical discontinuities, some biologists have argued that change in biology is better characterised by new ideas (e.g. Mendelian heredity, Darwinian evolution, and molecular genetics) replacing not a former paradigm, but a conceptual vacuum (Wilkins 1996).

In archaeology we have seen claims for paradigm shifts in several areas. The most wide-ranging application is Clark’s (1993) comparison of North American archaeology to European archaeology during the 1970s–1990s. His analysis describes a multi-paradigm situation where many research communities (i.e. North American and European) simultaneously operate in different paradigms (which Kuhn noted as a sign of a developing or immature discipline). Clark’s account is not a complete Kuhnian analysis because it lacks a revolutionary change event where one paradigm is replaced by another, although he hints that processual archaeology may be considered a radical discontinuity relative to the culture-historical approaches that preceded it (Clark 1993: 206).

A more thorough treatment of the claim that the appearance of processual archaeology was a paradigm shift is provided by Meltzer (1979). Meltzer reviews literature arguing for a revolutionary change in archaeology during the 1960s and 1970s, and considers these claims in light of classic Kuhnian revolutionary events such as the replacement of the Ptolemaic system by the Copernican system, and Newtonian dynamics by the new Einsteinian dynamics. He finds that the changes occurring in Anglophone archaeology after the 1960s were incremental, mostly of method, rather than a widespread replacement of one ontological structure by another one, incommensurable or incompatible with the former. Trigger (2006: 538) came to a similar conclusion at the end of his broad survey of the history of archaeology, that changes throughout archaeology have been mostly additive with only partial replacement.

We also see mentions of paradigm shifts in archaeology in reference to the shift from processual to post-processual archaeology (Härke 2002; Koerner 2018), as well as more localised and specific shifts in archaeological practice and thinking. For example, Snodgrass (2002) claims that among Classical Archaeologists, an increase in research on previously neglected periods of antiquity, such as the Greek Early Iron Age, constitutes a paradigm shift. McAnany & Rowe (2015) propose that the appearance of community-based participatory models of research among some communities of archaeologists is a paradigm shift, though they concede that it is transformational rather than revolutionary. Harris (2012) similarly argues that community crowdsourced geographic knowledge (or volunteered geographic information) could be a paradigm shift for archaeological communities. Archaeologists using LIDAR (light detection and ranging) for remote geospatial imaging of cultural landscapes have claimed the application of this technology to archaeology is a paradigm shift (Chase et al. 2012; Howey et al. 2016). Fuller (2010) has used the term to describe a shift in thinking from the emergence of agriculture as a ‘Neolithic Revolution’ to a protracted and entangled process, happening several times independently. Most of these claims for Kuhnian paradigm shifts differ in meaning, which highlights an important complication with using Kuhn’s concept of a paradigm-shift, namely the multiple definitions of a paradigm in his writings.

Masterman (1970) has documented 21 different uses of ‘paradigm’ by Kuhn, which she organises into three groups. First is the metaphysical notion of a set of beliefs, and this is the sense dominating most of the commentary on Kuhn’s model. Second is the sociological sense of scientific habits, the universally recognized scientific achievements that are the foundations of day-to-day normal science. These include specific attention-grabbing successes that give a touchstone for the coordination of future research. Third is the concrete sense of an actual textbook, instrumentation or toolkit in wide use. This recognition of the multivalent nature of paradigms is important because it broadens the locus for scientific revolution to include sociological practices of science, such as might be produced by changes in norms of publication and peer review, and to include the physical infrastructure – tools – of doing science, such as the technology for collecting and analysing data, and for communication and collaboration among researchers.

We see a compelling exploration of this third concrete sense of a paradigm in the work of Galison (1997), who has offered a tool-driven view of change in research communities. Galison analysed the role of tools in twentieth century particle physics, starting with hand-crafted cloud chambers and bubble chambers and ending with digital counters, particle accelerators, and computers. He identifies a profound change in physics when analog devices producing pictures were superseded by digital devices producing numerical data. At the heart of this change is a shift from an intuitive approach, stimulated by visual and pictorial model-building, to an approach based on logic, calculation, and demonstration. The changes are not purely at the points of data collection and analysis, but extend to the social and economic organisation of science, with new categories of physicists emerging, people who are not entirely experimenters and not entirely theorists. Dyson (2000, 2012) has further explored this tool-driven approach, tracing the origin of the Galilean revolution to the appearance of the telescope in astronomy, and the origin of the Crick-Watson revolution to the use of X-ray diffraction.

We propose that Galison’s emphasis on tool-driven change, as a complement to Kuhnian concept-driven change, has potential to enhance archaeology’s social and scientific relevance and contributions. The centrality of material culture in Galison’s view raises the question about whether there is scope for archaeologists to make productive contributions to understanding change in the history of science through an archaeological analysis of scientific instrumentation (cf. Schiffer 2013). Furthermore, Galison’s analysis of the image/logic contrast in the history of physics invites a similar analysis of the history of archaeology. Can we identify archaeological traditions focused on data collection and analysis using images and image-making devices (cf. Molyneaux 2013), in contrast to digital devices (Marwick 2019)? One candidate for this might be the shift from building relative chronologies based on seriation using typologies of visually distinctive artefacts to absolute chronologies based on radiometric dating using computer-controlled instruments. Three revolutions in archaeology have been attributed to radiocarbon dating and associated technologies (Bayliss 2009). Chase et al. (2012) have claimed the introduction of radiocarbon dating was a paradigm shift in archaeology, but we are not aware of a Galisonian analysis of this change. A third issue is how we can use a tool-driven approach not just as a framework for understanding change in science, but as a method to predict or generate change in the practice of archaeology?

3. Bibliometric Analysis of a Tool-driven Change

In this section we explore how a change observed in ecology might be relevant for understanding or even directing the future of archaeology. Biological disciplines have a long tradition of influencing archaeological thought, starting with Oscar Montelius, whose typological method was inspired by Darwin’s theory of evolution (Montelius 1899: 237). This tradition continues: for example, many spatial statistics currently used by archaeologists are derived from ecology (Keron 2015: 7).

3.1. Looking back on a tool-driven change in Ecology

Touchon & McCoy (2016) evaluated changes in statistical methods used by ecologists as a potential area of tool-driven change. They searched nearly 20,000 ecology articles published between 1990 and 2013. They found a rise in complex and computationally intensive statistical techniques such as mixed effects models and Bayesian statistics, and a decline in reliance on approaches such as ANOVA or t-tests. Crucially, they found that ecologists have shifted away from software tools such as SAS and SPSS to the open source program R.

Touchon & McCoy (2016) identify four factors relating to technological change that might explain the changes they observed in the use of statistics in ecology. First, they note that automated data loggers, GPS trackers, remote sensing, and crowd sourcing have greatly increased the rate at which ecologists collect data. Second, increases in desktop and cluster computing power have made complex analytical processes faster and more convenient to compute. Third, the development of free, open source and easily extensible software for data analysis and visualisation, such as R, allows new methods to spread quickly via online fora and social media. Fourth, several books and papers have strongly influenced the way many ecologists think about data analysis, such as Burnham and Anderson’s (2003) book on model selection and inference.

We focus here on the third factor, free and open source software, because it is the most generic factor and so the most relevant beyond ecology. Archaeology and ecology each have a high diversity of data collection methods, and types of data analyses they conduct. An important similarity for the two disciplines is working with the field-collected data on a computer to prepare it for publication. R is a widely used free and open source data analysis tool in many research communities (Baker 2017; Thieme 2018; Tippmann 2015). Our observations suggest it is the dominant programming language in archaeology and ecology, so it is a good proxy for the adoption of open source scientific programming languages in these disciplines. To investigate changes in the use of R in archaeology and ecology, we obtained reference lists from a sample of scholarly articles in the Web of Science database and examined patterns in the citation of the R program over time.

4. Methods

We used the ‘Cited Reference Search’ function provided by the Web of Science online scientific citation indexing service to find journal articles citing R. Although R has been available since the late 1990s (Thieme 2018), a recommended format for citing the software did not appear until 2004, with the author given as “R Development Core Team”. This recommended format for citing the software changed slightly in 2012 when the author was updated to “R Core Team”. We searched the Web of Science using ‘“R DEV COR TEAM” OR “R CORE” OR “R CORE TEAM” OR “R DEVELOPMENT CORE TEAM”’ in the CITED AUTHOR field of the Web of Science database. We sorted the results by frequency of citations, and selected the first 1000 items (the maximum allowed by the Web of Science service). These 1000 items represent variations on the recommended format for citing R. We found citations of R in the reference lists of 42,659 English-language articles indexed by the Web of Science in the research area of ‘Environmental Sciences Ecology’. We then downloaded the bibliographic data and reference lists for each of these articles.
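The matching step in this search can be illustrated with a minimal Python sketch. It is not the authors' workflow, which relied on the Web of Science interface and R; the function names `normalise` and `cites_r` are hypothetical, and the set of variants contains only the four search terms quoted above rather than the full list of 1000 variants.

```python
import re

# The four cited-author search terms quoted in the text. This is an
# illustrative subset, not the full list of variants found in Web of Science.
R_AUTHOR_VARIANTS = {
    "R DEV COR TEAM",
    "R CORE",
    "R CORE TEAM",
    "R DEVELOPMENT CORE TEAM",
}


def normalise(author: str) -> str:
    """Upper-case the string, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^A-Za-z ]+", " ", author)
    return " ".join(cleaned.upper().split())


def cites_r(cited_authors: list[str]) -> bool:
    """Return True if any cited author matches one of the R author variants."""
    return any(normalise(a) in R_AUTHOR_VARIANTS for a in cited_authors)
```

Exact matching after normalisation is a deliberately conservative choice: it avoids false positives from unrelated authors whose names merely contain "Core" or "R", at the cost of missing unusual spellings.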

4.1. Reproducibility and open source materials

To enable re-use of our materials and improve reproducibility and transparency according to the principles outlined in Marwick (2017), we include the entire R code used for all the analysis and visualizations contained in this paper at http://doi.org/10.17605/OSF.IO/RHVN5. Also in this version-controlled research compendium are the raw data for all the results reported here. All of the figures, tables and statistical test results presented here can be independently reproduced with the code and data in this repository. In our online materials our code is released under the MIT licence, our data as CC-0, and our figures as CC-BY, to enable maximum re-use (for more details, see Marwick 2017).

5. Results

Figure 1 shows the percentage of articles citing R in each of several of the top ecology journals (as defined by how often their articles are cited). We restrict the start of the observation period to 2008 for convenience so we have a ten year study period. The plot shows that the percentage of articles citing R has increased from less than ten percent in all journals in the late 2000s, to more than 30% in Ecosphere, Ecology and Evolution, and Molecular Ecology after 2012. We might not call this a Kuhnian paradigm shift, but it does show a substantial change in the tools of the discipline, supporting claims for a Galisonian tool-driven change in ecology similar to the changes described by Touchon & McCoy (2016).

Figure 1 

Percentage of articles per year citing R in top Ecology journals (5,800 articles out of 42,659). Data from Web of Science for 2008–2018.
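The quantity plotted in Figure 1 is a simple ratio per journal and year. The following Python sketch shows one way it could be computed. It assumes each article has already been reduced to a hypothetical `(journal, year, cites_r)` record; the authors' own R code is in their research compendium and may be organised differently.

```python
from collections import defaultdict


def percent_citing_r(records):
    """Compute the percentage of articles citing R per (journal, year).

    records: iterable of (journal, year, cites_r) tuples, where cites_r is a bool.
    Returns a dict mapping (journal, year) to a percentage between 0 and 100.
    """
    totals = defaultdict(int)  # all articles per journal-year
    citing = defaultdict(int)  # articles citing R per journal-year
    for journal, year, cites_r in records:
        key = (journal, year)
        totals[key] += 1
        if cites_r:
            citing[key] += 1
    return {key: 100.0 * citing[key] / totals[key] for key in totals}
```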

5.1. Looking forward to a tool-driven change in Archaeology

To compare with ecology we conducted a similar bibliometric analysis for archaeology journals indexed by the Web of Science service. We used the ‘Cited Reference Search’ to find articles citing R in the same way as above, and then refined the results to keep only those articles published during 2008–2018 that are included in the Web of Science category ‘Archaeology’. This resulted in 42,991 articles, of which 154 cite R. Figure 2 shows the temporal trend of citations of R in archaeological articles. There are three interesting details revealed by this figure.

Figure 2 

Proportion of Archaeology articles per year citing R (a total of 154 out of 42,991 articles in our sample for 2008–2018). Labels to the right show journals in our sample with more than five articles that cite R. Sub-plot shows articles published in the Journal of Archaeological Science during 2008–2017. Data from Web of Science.

The first detail we see here is that overall, the proportions and the absolute number of articles citing R in archaeology are much smaller than what we see in the ecology journals in Figure 1. Only the Journal of Archaeological Science has more than 50 articles in our sample. We conclude from this observation that archaeologists have yet to adopt programming for data analysis and visualisation in the same way ecologists have.

The second detail is that most archaeology journals do not show any strong increase in the percentage of articles citing R over time. The ecology journals shown in Figure 1 show strong upward trends of increasing proportions of articles citing R over time, but we do not see any unambiguous trends in the archaeology articles when considered together. Indeed, in several of the other archaeological journals in our sample, the first citations of R only occurred in the last two years. The sample sizes for the archaeology journals are too small to confidently infer any trend over time, with the exception of the Journal of Archaeological Science with 61 articles. Figure 2 shows a statistical test of change over time for articles citing R in the Journal of Archaeological Science. With a moderate r-squared value and low p-value for the linear model, we conclude there is some evidence for a non-random increase in citations of R over time in this journal. This may show an increasing use of R among archaeologists, especially those working at the intersection of archaeology and the natural sciences. This may hint at the start of a trend like what we see in the ecology journals of a widespread adoption of R.
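A linear model of this kind can be sketched as an ordinary least-squares fit of the yearly proportion against the year. The Python function below is illustrative only: `linear_trend` is a hypothetical name, and the sketch returns the slope, intercept and r-squared value but not the p-value, which would additionally require the t distribution with n − 2 degrees of freedom.

```python
def linear_trend(years, proportions):
    """Fit proportions = intercept + slope * year by ordinary least squares.

    Returns (slope, intercept, r_squared). Assumes at least two distinct
    years and some variation in the proportions.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(proportions) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in proportions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, proportions))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

With only about ten yearly observations, a fit like this is sensitive to individual years, which is consistent with the cautious wording of the conclusion above.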

Following on from this observation about the increasing popularity of R in the Journal of Archaeological Science, the third detail we see in these results is the distinctive types of journals that have articles citing R. There is a focus on journals publishing scientific archaeology, especially those that focus on specialised empirical analysis. In addition to the general archaeology journals, we also see Lithic Technology, Anthropozoologica, and International Journal of Osteology in the journal names, indicating that we find R used by specialists in stone artefacts and faunal remains. For more fine-grained insights into the topics that archaeologists are using R to study, we conducted a statistical analysis of the words in the titles of all the articles in our sample.

5.2. What are R-using archaeologists writing about?

We computed a comparison of word frequencies in journal article titles in the Web of Science data to get a better understanding of what topics archaeologists are writing about when they cite R. First we separated the archaeology articles into two groups, those that cite R, and those that do not. Second, we filtered to keep only words that occur in titles in both groups, and removed very common and uninformative words (e.g. ‘the’, ‘archaeological’, ‘study’, etc.). This resulted in 41,645 words in 43,044 articles. Third, for each word found in all the titles of the journal articles in each group, we computed its proportion of the total number of words in all the titles in each group. Figure 3 shows the results of this analysis. Words near the red line are used with about equal frequencies by papers citing R and by papers not citing R. Words far away from the red line are used much more by one group of articles compared to the other.
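The three steps above can be sketched in Python as follows. This is a simplified illustration, not the authors' R code: the stop-word list is a small hypothetical stand-in, tokenisation is a plain whitespace split, and proportions are computed over all remaining words in each group before restricting to shared words, which is one possible reading of the procedure described.

```python
from collections import Counter

# Illustrative stop-word list; the list actually used is not given in the text.
STOP_WORDS = {"the", "a", "of", "in", "and", "archaeological", "study"}


def word_proportions(titles):
    """Return each word's share of all non-stop-words in a group of titles."""
    words = [
        word
        for title in titles
        for word in title.lower().split()
        if word not in STOP_WORDS
    ]
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def compare_groups(titles_citing_r, titles_not_citing_r):
    """Map each word found in both groups to (share in R group, share in other group)."""
    p_r = word_proportions(titles_citing_r)
    p_other = word_proportions(titles_not_citing_r)
    shared = p_r.keys() & p_other.keys()
    return {word: (p_r[word], p_other[word]) for word in shared}
```

Plotting the first value of each pair against the second gives a scatter in which words used equally often by both groups fall on the diagonal.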

Figure 3 

Comparing the frequency of words used in titles of archaeology articles (41,645 words in 43,044 articles). Words located above the red line are found more frequently in articles that cite R, compared to words located below the red line.

Figure 3 supports our observation that archaeologists using R tend to be doing archaeology that involves the natural and physical sciences. Articles that cite R more often have titles that include terms such as ‘lithic’, ‘experimental’, and ‘isotope’. On the other hand, articles that do not cite R more often discuss the Roman and Bronze Age periods, and less often reflect a technical specialisation. If there is one technical specialisation that is characteristic of articles that do not cite R, it seems to be in ceramics.

5.3. From using code in research to sharing code with publications

Touchon & McCoy (2016) identified the use of scripting languages as being a factor for easily sharing methods and code, thereby increasing the rate of development of statistical methods. One notable practice we observed during our bibliometric analysis is that among authors of articles citing R, a subset of these authors publicly share the code and data that they used to generate the results presented in their article. We identified 85 articles that include R script and data files, either in the supplementary materials or in a trustworthy data repository such as Zenodo, Open Science Framework, Figshare, etc. ‘Trustworthy’ data repositories are those hosted by a non-commercial organisation, offering persistent identifiers, and having transparent commitments to long term data management and backup strategies (see Marwick & Birch (2018) for more details). Our continuously updated list of articles in this sample is available online here: https://github.com/benmarwick/ctv-archaeology. Figure 4 shows that although this number is small, it is increasing, hinting at the emergence of a new approach to how archaeologists share their research. The author guidelines of this journal support this trend: “In the interests of open scholarship and the reproducibility of results, JCAA strongly encourages all authors to deposit material relating to their publication in a preservation repository” (Editors n.d.).

Figure 4 

Articles in archaeology journals using R for reproducible research, and making code files openly available to accompany the published article (n = 85).

These archaeology articles that include publicly available R code and data are part of a shift in scientific communication that is also underway in other fields. For example, the journal Nature Neuroscience requires authors ‘to make the code that supports the generation of key figures in their manuscript available for review’ (Editors 2017). Although we are not aware of attempts to quantify this change in the same way we have for archaeology, we have observed a number of recent publications that describe and recommend code sharing in statistics (Baumer et al. 2014), genome biology (Markowetz 2015), computational biology (Sandve et al. 2013), hydrology (Slater et al. 2019), biostatistics (Peng 2009), computer science (Mitchell et al. 2012; Peng 2011), applied mathematics (LeVeque et al. 2012), speech science (Abari 2012), political science (Dafoe 2014; King 1995), and the social sciences generally (Miguel et al. 2014). As part of this growing interest in using code for research we also see manifestos aimed at researchers doing any kind of quantitative work (Barnes 2010; Ince, Hatton & Graham-Cumming 2012; Nosek et al. 2015) and articles recommending using R in undergraduate education (Bray, Çetinkaya-Rundel & Stangl 2014; Eglen 2009). Much of this literature is concerned with identifying and solving problems of irreproducibility in research. These actions come from a renewed interest in basic principles of the scientific method: that once-off results should not be trusted, instead they should be reproducible by other members of the research community (Stark 2018). A further principle is that an article should sufficiently describe the presented research results such that a colleague can fully understand the results and how they were obtained. Many of these authors trace problems of irreproducibility to the increased complexity of computer-based analyses, combined with the limited space to describe them in a journal article, and mouse-driven computer programs where the researcher’s analytical decisions are not recorded during the research process, and thus cannot be shared with other researchers.

In our view, merely using R (or similar open source programming languages) to conduct research would not constitute a tool-driven revolution by itself, but publicly sharing the code used for research, as a solution to problems of irreproducibility, is more likely to lead to revolutionary change. We expect that openly shared code will speed the widespread adoption of reproducibility as a core tenet of the scientific process, since it frees researchers from the black box that most mouse-driven programs are, and enables researchers to not only rerun the shared analysis, but to gain access to all parameter settings, empowering them to change these and so properly evaluate, extend and reuse the published results. As open source languages such as R (and Python) are free to use, and many trustworthy repositories are also free (such as Zenodo, Open Science Framework, Figshare, etc.), then even researchers with limited resources anywhere in the world can contribute equally to the research community by using and sharing code. Using and sharing open source code is thus an important action for reducing inequality in the archaeological research community. It enables participation by researchers who are unable to buy software licenses.

5.4. Citation effects of sharing code

A pragmatic issue for understanding widespread changes in scientific practice is to identify incentives for members of a research community to change. If code sharing has concrete career advantages, for example, if it is known to result in increased citations, then we might expect it to be more likely to be part of a tool-driven revolution. Vandewalle (2012) analysed citation rates for 645 articles published in IEEE Transactions on Image Processing during 2004–2006. He found non-random differences between the median number of citations for papers that have code available and papers that do not. In considering causal relationships, Vandewalle raises the possibility of self-selection bias, where authors include code more often with their best papers. He also notes that for some papers, the code was made available by researchers other than the authors, after the publication has become popular.

We conducted a similar analysis for archaeology articles using the Web of Science Core Collection ‘Times Cited’ (‘TC’) count (Figure 5). This Times Cited count displays the total number of times a published paper was cited by other papers within the Web of Science Core Collection. Unlike Google Scholar, this does not include citations on blogs, pre-prints and other places outside of journal articles, so the citation counts from Web of Science are lower than those on Google Scholar. We see that archaeological articles that cite R are consistently more highly cited themselves than articles that do not cite R.

Figure 5 

Median citation rates per year for archaeology articles 2010–2017 that cite R (n = 216) and articles that do not cite R (n = 42,828). On average, articles citing R have higher numbers of citations (m = 10.1) than articles that do not (m = 6.5), t(158) = 3.38, p = 0.00092.
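A two-sample comparison of this kind can be sketched with Welch's unequal-variance t statistic. The Python sketch below is an assumption about the general form of the test: the caption does not state which variant was used, `welch_t` is a hypothetical name, and the p-value is omitted because it requires the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test with unequal variances.

    Each sample needs at least two values and non-zero combined variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df
```

Citation counts are typically right-skewed, so a rank-based test or a comparison of medians would be a reasonable complement to a t-test on means.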

The mechanisms underlying the citation effects of using R remain to be elucidated. The articles in our data are mostly not open access (only 20% are open access), so we cannot attribute the citation effect to open access effects, such as easier availability, and additional time in press of the articles (Kurtz et al. 2005). In our data all the article authors are also the code authors, so the popularity effect noted by Vandewalle does not apply. One possibility is that authors preferentially tend to promote their articles containing code, because they want a greater return for the greater effort they have put into including the code with the article. This may be equivalent to the self-selection bias postulated by Vandewalle. Another possibility is that papers with code are perceived by readers to be more robust because of their transparency, and thus a more reliable and credible item to cite in support of knowledge claims. Another possible cause is that when a paper is accompanied by code (for example on a trustworthy repository) it is more visible than a paper without code, and appears higher in search result rankings. All other factors equal, higher visibility leads to more citations because the researcher will stop searching when they have found the first paper that serves their purpose. Further work is necessary to understand the mechanisms of this citation effect. Additional insights might come from comparisons of papers that use another open source language such as Python, or a closed source language like MATLAB.

6. Discussion

Our hypothesis is that a tool-driven change is underway in the biological sciences, and is emerging in archaeology. The tool which drives this change is the use of scripting languages in computational work in these disciplines. We have shown that the use of R is on the rise among archaeologists, especially those using natural science methods, and has concrete advantages for authors that publish papers that cite R. The advantage is that papers using and citing R tend to be referenced more often in other papers, giving the authors more visibility and influence (cf. Huggett in this volume for critical discussion of open scholarship).

The use of R or similar scripting languages enables a major shift in the technical and social dynamics of the research community, as it offers the possibility to communicate one’s workflow in a highly detailed and efficient way, and easily share this workflow with others. The sharing of code has been identified in other disciplines as an amplifier of change in methods (Touchon & McCoy 2016). It also enables researchers not just to reproduce the work of others, but also to evaluate it thoroughly. Thus, the sharing of code may be at the root of an upcoming tool-driven change for archaeology, consistent with the patterns described by Galison (1997). This change is transformational rather than revolutionary, enabling archaeologists to more faithfully adhere to a fundamental tenet of the scientific process which demands the reproducibility of results. We do not claim archaeologists did not want to produce reproducible research before the advance of scientific scripting languages. Indeed, there is a longstanding tradition in archaeology of publishing catalogues, lists and maps showing the data used for analyses. Empirical reproducibility has always been prominent in archaeological research as we can see from the value placed on re-examining museum collections, re-excavating old sites, and reusing previously published data (Strupler & Wilkinson 2017).

Nonetheless, with the introduction of computers there has been a gradual and unintended drift away from reproducibility (Bailey, Borwein & Stodden 2016; Brinckman et al. 2019), and so from the fundamental tenets of scientific practice. Many archaeological tasks have shifted from the descriptive and qualitative aspects to quantitative and computational analyses. This shift also mandates a change in documentation of analyses to enable other researchers to reproduce published results. Modern archaeology has often been lacking in this aspect, as much research is undertaken using point-and-click software, which obscures and complicates the reproducibility of analyses. By using and sharing code we can undo this drift, and return to a robust scientific practice for archaeology. As we show, scripting languages are used most frequently in the part of archaeology that is in close contact with the natural sciences. These sciences also show a trend of increased emphasis on reproducibility, with journals changing their data and code availability policies to require code and data to be peer reviewed and available alongside articles (Stodden, Guo & Ma 2013). In archaeology too, funding agencies are increasingly showing interest in open science methods and transparency (Strupler & Wilkinson 2017).

As Galison (1997) describes a shift in the practices of physicists, we believe archaeological practice is similarly changing. Huvila & Huggett (2018) have explored the effect of digital tools on archaeological practices to understand the boundaries of what counts as archaeological, what is merely related to archaeology and what should be excluded from archaeology. Reflecting on Huvila & Huggett (2018), the use of code is unquestionably archaeological, since it is one of many tools that archaeologists are using for converting raw data into summaries, visualisations, and insights. These concerns about how digital tools affect what we consider is and is not archaeology intersect with the work of Gieryn (1999) on boundary construction. In his cultural cartography of science, Gieryn examines boundaries between science generally and varieties of less authoritative non-science. He argues that the boundaries are drawn and redrawn continuously in flexible and sometimes ambiguous ways. Gieryn claims that the cultural authority of science flows from boundary work in professional and political settings where scientists construct a public image for science as a source of credible knowledge, validate their work, and marginalize claims made by their competitors in the scientific community. Boundary work involves attributions of select characteristics to science in order to distinguish it from non-scientific competitors. The increasing voluntary use and sharing of code by researchers, and demands for it by readers and peer reviewers, is a form of active boundary work that is shifting the boundary between scientific and non- or less-scientific archaeology. We, as members of this group, argue that the use of code as a research tool, and the sharing of code as a public research product, are increasingly attributes that are being used to distinguish science from less- and un-scientific work. The reason for this is that using code, and making it available for others to inspect, enhances the credibility of knowledge claims and the visibility of those claims.

A related issue here is the technicity of tools, in our case programming languages. Cassirer (2004) and Simondon (2011) define technicity as a term to describe the technology, i.e. tools and machines respectively, that interact with humans and with nature (Hoel & Tuin 2013: 188). Technology is said to own an instrumental kind of logos (Cassirer, see Hoel & Tuin 2013: 194) and, in analogy to language, it exercises a certain measure of agency in mediating between humans, nature and technology. The use of a scripting language might exemplify this, as it is not just a technological tool, but a proper “language” as well. Scripts of code are texts that translate data into visualisations and statements about the world, transversing and revitalizing conceptions of the relationship between people and nature, and between people as members of a research community.

In contrast to the concept of technological determinism, which puts technology as a force against the human experience, technicity enfolds technology into ‘the human’ without letting it lose its ‘foreignness’ (Hoel & Tuin 2013: 190). Changes in tools will influence how human researchers interact with their objects of interest and other people in their research community. There are three ways in which this can be anticipated for archaeology. First, tools that make the research process more transparent and open might be expected to make the research community more open and inclusive to diverse participants and ideas. Second, using code to ‘stand for’ or represent the research process will likely increase our awareness of how archaeology itself ‘stands for’ the human experience in the past, and how it is a representation of the past (cf. Pearson 1998). Third, the code has to be successfully executed to generate a useful result, just as claims about the past have to be successfully engaged with archaeological evidence to count as plausible.

We do not claim that a Kuhnian paradigm-change is happening in ecology or archaeology: data- and method-sharing behaviours have been around in archaeology before now (Lodwick 2019) and do not present newly incommensurate ways of thinking about the subject, as a typical reading of Kuhn’s paradigm change concept would require. Indeed, social science disciplines such as archaeology may not even be easily susceptible to classical Kuhnian paradigm shifts. This is because high-power tests (i.e. statistical tests with high probabilities of rejecting false null hypotheses) capable of distinguishing between superior and inferior paradigms are rare in the social sciences, such as economics (Ioannidis, Stanley & Doucouliagos 2017), compared to the physical sciences. Modelling and case studies described by Akerlof and Michaillat (2018) indicate that among disciplines lacking evidence to routinely strongly discriminate between theories, the chances of getting trapped in an inferior paradigm are high. In this situation, true theories may not be adopted at all and the field is at risk of being captured by false paradigms.

Although a discipline-wide paradigm shift might be off the table for archaeology, there is a need for methods to share research in a way that is reproducible for others, and archaeologists are actively exploring tools to make this possible. Our results show that using scripting languages, together with data repositories to share the scripts, is a solution that enjoys increasing popularity in the archaeological research community (cf. Marwick & Birch 2018). It has also expanded the ‘trading zone’ (Galison 1997) where archaeologists collaborate with specialists in other fields to develop a shared language to get things done. In using scripting languages, archaeologists are working at a trading zone with computer scientists and computational researchers in other fields to coordinate how tools from one domain can be useful in another. Other examples of archaeologists in trading zones include the use of GIS, network methods, agent-based models, isotope analysis, remote sensing, and DNA analysis to answer archaeological questions. Archaeologists’ embrace of these methods may also count as tool-driven revolutions. However, what makes the use of code unique as a tool-driven revolution is that coding is transcendental: it is a universal method with which we can do all, or which can be central to any, of those previously-mentioned approaches. That is why we believe archaeologists should adopt scripted workflows regardless of their usual toolkit and methodological interests.

6.1. Making it easier

We acknowledge that learning to use code, if one has not received formal instruction, is not an easy task. A certain familiarity with the computational tools is required to be able to use them not just on a technological basis, but creatively and in full knowledge of their restrictions and ambiguities (Chrysanthi, Murietta-Flores & Papadopoulos 2012). The awareness and the appropriate handling of shortcomings and contexts of the data used in an analysis is also a time- and thought-consuming effort. This effort is required at the beginning of each archaeological research project, and may be exacerbated by using multiple data sets from varying sources (cf. Huggett 2015).

As with data, preparing code to make it publicly available takes time to ensure that it is fit for others to read and use. Our casual observations of the ways in which archaeologists are using and sharing code show a high degree of variability in code style and organisation, indicating that currently most archaeologists are independently solving problems of how to write and share code through trial and error. We can use concepts from cultural evolution to understand this situation: learning by trial and error is known as guided variation in a cultural evolutionary framework. The dominance of guided variation suggests that using code is not yet widespread enough among archaeologists to propagate due to frequency-dependent biases. Frequency-dependent biases occur when people copy the most abundant variant, in this case the most commonly used tool for analysing data, in the population (cf. Boyd & Richerson 1988). It also suggests that the professional benefits (i.e. the citation effects we describe above) are not yet widely enough known for code use to propagate due to content biases. Content biases result when some aspect of a variant’s content, such as positive citation effects, makes it more likely to be adopted. Finally, there may not yet be enough highly visible, prestigious researchers using code for this behaviour to propagate due to model-based biases (Boyd & Richerson 1988; Henrich & McElreath 2007; Rendell et al. 2011). This bias is based on imitation of highly prestigious, skilled or successful individuals. Considering the high variability of code using cultural evolutionary concepts, we conclude that there is unrealised potential to improve the efficiency of using and sharing code in archaeology, firstly by converging on some widely agreed-upon conventions that will save researchers’ time, and secondly by investing effort in communicating the professional benefits of using code.

Our contribution to improving the efficiency of using and sharing code is the R package rrtools (‘reproducible research tools’, https://github.com/benmarwick/rrtools). This package is the result of our analysis of existing practices among archaeologists and other researchers, and our study of best practices and current conventions in scientific computing, described in more detail in Marwick, Boettiger, & Mullen (2018). The goal of rrtools is to make it easier for archaeologists and other researchers to use R for research and publication. This package aims to simplify many of the steps needed to write reproducible research papers, and to guide users to best practices with minimal effort. The rrtools package contains functions that create a file structure according to fundamental principles of organising files for research. For example, we keep the data separate from the methods. This means keeping the code in a separate directory from the data. We also keep the raw data (i.e. field-collected and instrument output) separate from data that is created or derived during the analysis. These two basic principles make it easy to stay organised during a complicated project, and make it easier for a reader to navigate their way through the compendium of files that are shared with the publication. At the heart of the project template provided by rrtools is an R Markdown document. This is where the report or journal manuscript is written. R Markdown is a document format that allows the author to combine plain text, R code, and automation of all the usual details of scholarly writing such as citations, captions, and cross-references to tables and figures (Xie, Allaire & Grolemund 2018). The unique concept of R Markdown is that the R code contained in the document will generate the statistical figures and tables when the document is executed (or rendered) into output such as PDF, MS Word or an HTML file.

The rrtools package contains several functions that enable a researcher to quickly set up a compendium suitable for writing reproducible research in R. Each of these functions reflects a best practice that has been previously articulated in other fields (e.g. Wilson et al. 2017). We will focus here on the first five, as these are most relevant to most researchers:

  • use_compendium: This function creates a new directory for the research project, and creates an R package within this directory. We use the R package structure as the standard for our directory because it is widely recognised by R users (Wickham 2015). This means that people know where to look for code, where to look for data, and where to look for dependencies and other information.
  • use_mit_license: This function adds a copy of the MIT software license to the compendium. Other licenses are also possible, such as GPL, but the MIT license is preferred because it is widely used for research software and has two important qualities. First, it tells the reader that the author is happy for their code to be reused by others, both in academic and commercial contexts. And second, it tells the reader that the author does not take responsibility for any problems that the reader might have when they use the code. This is important for setting the expectations about the relationship between the author and the reader, regarding the use of code.
  • use_github: This will initiate the use of the Git version control system. This is the current state-of-the-art for version control of any kind of plain text file, including text, code, data, and images (Jones 2013). This will also create a repository on the GitHub website. Using version control is important because it allows the author to time-travel back to earlier states of their analysis, if they take a dead-end path (Ram 2013). It also makes collaboration smooth because many people can work on one set of files without losing track of the most recent version. Plus it serves as a remote back-up of our work in case of an emergency. The GitHub repository can be kept private if desired, until the work is published.
  • use_readme_rmd: This will create a simple document that is usually the first thing a reader sees when they browse the code files. This document is important because it helps to describe to the reader what to expect in the compendium, and it gives details of how to cite the compendium and engage with it if they want to make changes (e.g. if a reader finds a mistake that needs to be fixed).
  • use_analysis: This function creates a set of folders according to the best practices for organising a typical research project (Noble 2009; Wilson et al. 2017). We recognise that this file structure won’t be perfect for every project. But our observations show that it is in wide use, and will suit most projects well. It will at least make the user think about how to organise their project logically, and make it easier for others to navigate.
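
As a minimal sketch of the file-organisation principles described above, the following base R code builds a comparable directory layout in a temporary folder and keeps raw data apart from derived data. It is an illustration only and not the rrtools implementation: the function name make_compendium_sketch, the directory names, and the example file lithics.csv with its values are assumptions made for this example. In practice, the rrtools functions listed above would be called in the order given from within an R session.

```r
# Illustrative sketch in base R; this is NOT how rrtools is implemented.
# All directory and file names below are assumptions made for this example.
make_compendium_sketch <- function(root) {
  dirs <- c(
    "R",                                           # reusable functions (methods)
    file.path("analysis", "paper"),                # the R Markdown manuscript
    file.path("analysis", "figures"),              # figures generated by code
    file.path("analysis", "data", "raw_data"),     # field and instrument output
    file.path("analysis", "data", "derived_data")  # data created during analysis
  )
  for (d in file.path(root, dirs)) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(dirs)
}

# Build the layout in a temporary location so nothing on disk is overwritten.
root <- file.path(tempdir(), "mycompendium")
created <- make_compendium_sketch(root)

# Raw data is stored once and never edited by hand (hypothetical values).
raw_file <- file.path(root, "analysis", "data", "raw_data", "lithics.csv")
write.csv(
  data.frame(id = 1:3, length_mm = c(41.2, 38.5, 52.0)),
  raw_file,
  row.names = FALSE
)

# Derived data is produced by code and written to a separate directory,
# so the raw file stays untouched and the derivation is documented.
raw <- read.csv(raw_file)
derived <- data.frame(n = nrow(raw), mean_length_mm = mean(raw$length_mm))
derived_file <- file.path(root, "analysis", "data", "derived_data", "summary.csv")
write.csv(derived, derived_file, row.names = FALSE)

# List the files that now make up the sketch compendium.
list.files(root, recursive = TRUE)
```

Because every derived file is written by a script, deleting the derived_data directory and re-running the code should regenerate it from the raw data, which is the property a reader relies on when reproducing published results.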

7. Conclusion

Starting with the discussion of Kuhn’s influential work on paradigm shifts in science (Kuhn 1962), we took inspiration from Galison (1997) to propose another explanation of how change occurs in research communities, emphasising the transformative role of tools. The change we focus on is the use of scripts to document and communicate data analysis. This change will increase the computational reproducibility of archaeology, helping the discipline adhere more closely to the core values of science. The tool with the highest potential to achieve this currently is the R programming language. However, we realise that this will change in time, and in the future newer technologies may replace R. The key point here is that this transformative tool is any open source scripting language.

We have presented a bibliometric analysis that reveals substantial change in the use of the R programming language in ecology. Citation patterns show that ecological and evolutionary sciences have strongly embraced R in their publications. Archaeologists have more recently taken up R, but to a much lesser degree, so far, than what we see in the other fields, and most commonly in more scientific aspects of archaeology. We identified an advantage of citing R for researchers as it leads to being referenced more often. We also identified a small, but increasing, set of archaeology papers that use R to make their work fully transparent and reproducible by others. In reflecting on these results, we have noted the technicity of coding in how it mediates between humans, nature and technology. We have proposed that the increasing adoption of coding among archaeologists is a form of active boundary work, shifting the boundary between scientific and unscientific archaeology, and is developing a trading zone between archaeologists and programming specialists such as computer scientists and software engineers as they search for useful tools. Many useful tools have already emerged from this trading zone, including our R package, rrtools, designed to make it easy for researchers to make their research reproducible. We hope this will help to realise some of the potential to make archaeological research more open and reproducible, both within the research community and to the public, as well as speeding the transfer of new results and methods throughout the research community without barriers due to access to resources. Tools emerging from the trading zone are important because they support practices of archaeology that are more faithful to the core values of science.

We have outlined a positive future for archaeology where research publications are accompanied by code and data, enabling a ‘critical self-consciousness’ at a community-wide scale, similar to that imagined by Clarke (1973) as part of the discipline’s ‘loss of innocence’. A characteristic of this stage of disciplinary maturity is ‘a closer understanding of its internal structure’ (Clarke 1973: 7), and we can think of no more efficient and intelligible way to communicate the internals of archaeological analyses to other members of the discipline than by encoding the assumptions, logic, and calculations in scripts of an open source programming language. If we accept that that is necessary for the future of the discipline, we must confront two implications. First is the shortage of incentives to motivate the use and sharing of code. Our observation is that many researchers will not change unless required to by gate-keepers at prestigious journals and funding sources. In other research communities we see people in these gate-keeping roles (e.g. in over 5,000 journals and professional organizations) effecting change by adopting the Transparency and Openness Promotion Guidelines that establish community standards of code and data availability that help to align scientific ideals with disciplinary practices (Nosek et al. 2015). Simple forms of signalling, such as badges on papers to indicate that the authors have made code and data available, may also help to shift norms and raise awareness of these desirable behaviours (Kidwell et al. 2016).

However, cultural change on a community-wide scale, such as a shift to the wide-spread use of code among archaeologists, is most likely to happen via the slow enculturation of professionals as they internalize norms during their formative years, and not via contemporaneous exposure to external cultural influences (Vaisey & Lizardo 2016). This means it is unlikely that an established professional archaeologist reading this paper, or ones like it, will be motivated to use code if they were not trained to work with code at the start of their career. The implication here is that the most effective way to stimulate this change is during the first few years of professional training. This means updating our professional training curricula by shifting from a model of creating T-shaped researchers (where the vertical bar on the T represents the depth of specialist skills and expertise in archaeology, and the horizontal bar is the breadth of skills and knowledge of related and intersecting fields) to gamma-shaped researchers (Fiore-Gartland 2017). A gamma-shaped researcher has expert-level depth in archaeology, and is proficient in other domains that provide skills such as analysing data with a programming language. A gamma-shaped researcher with their MA or PhD in archaeology may also be conversant enough in the language and culture of computer science to have conversations and collaborate in the trading zones described above, but does not need to be an expert in computer science.

Acknowledgements

The rrtools package was developed by the 2017 Summer School on Reproducible Research in Landscape Archaeology group at the Freie Universität Berlin (17–21 July). Thanks to Sahir Bhatnagar, Ricarda Braun, Wojciech Francuzik, Sebastian Funk, Charles Gray, Matthias Grenié, Martin Hinz, Patrick Kennedy, Daniel Knitter, Anna Krystalli, Nils Muller-Scheessel, Clemens Schmid, Adam H. Sparks, and Joseph de la Torre for their contributions to rrtools. This was funded and jointly organized by Excellence Cluster 264 Topoi, the Collaborative Research Center 1266, and ISAAKiel. A version of this paper was previously presented at the 2018 Computer Applications and Quantitative Methods in Archaeology (CAA) international conference at Tubingen, Germany. Early drafts were improved at the Cologne writing workshop in September 2018, which was supported by the COST Action Arkwork: Archaeological Practices and Knowledge Work in the Digital Environment.

This article is based upon work from COST Action ARKWORK, supported by COST (European Cooperation in Science and Technology). www.cost.eu.

Funded by the Horizon 2020 Framework Programme of the European Union

Competing Interests

The authors have no competing interests to declare.

References

  1. Abari, K. 2012. Reproducible research in speech sciences. International Journal of Computer Science Issues, 9(6): 43–52. http://www.ijcsi.org/papers/IJCSI-9-6-2-43-52.pdf. 

  2. Akerlof, GA and Michaillat, P. 2018. Persistence of false paradigms in low-power sciences. Proceedings of the National Academy of Sciences, 115(52): 13228–13233. DOI: https://doi.org/10.1073/pnas.1816454115 

  3. Andersen, H. 2013. The second essential tension: On tradition and innovation in interdisciplinary research. Topoi, 32(1): 3–8. DOI: https://doi.org/10.1007/s11245-012-9133-z 

  4. Bailey, DH, Borwein, JM and Stodden, V. 2016. Facilitating reproducibility in scientific computing: Principles and practice. In: Atmanspacher, H and Maasen, S (eds.), Reproducibility: Principles, Problems, Practices, 205–231. Hoboken, NJ: Wiley Online Library. DOI: https://doi.org/10.1002/9781118865064.ch9 

  5. Baker, M. 2017. Scientific computing: Code alert. Nature, 541(7638): 563–565. DOI: https://doi.org/10.1038/nj7638-563a 

  6. Barnes, N. 2010. Publish your computer code: It is good enough. Nature News, 467(7317): 753–753. DOI: https://doi.org/10.1038/467753a 

  7. Baumer, B, Cetinkaya-Rundel, M, Bray, A, Loi, L and Horton, NJ. 2014. R markdown: Integrating a reproducible analysis tool into introductory statistics. Technology Innovations in Statistics Education, 8(1): 1–22. https://escholarship.org/uc/item/90b2f5xh. 

  8. Bayliss, A. 2009. Rolling out revolution: Using radiocarbon dating in archaeology. Radiocarbon, 51(1): 123–147. DOI: https://doi.org/10.1017/S0033822200033750 

  9. Boyd, R and Richerson, PJ. 1988. Culture and the evolutionary process. Chicago, IL: University of Chicago Press. 

  10. Bray, A, Çetinkaya-Rundel, M and Stangl, D. 2014. Taking a chance in the classroom: Five concrete reasons your students should be learning to analyze data in the reproducible paradigm. Chance, 27(3): 53–56. DOI: https://doi.org/10.1080/09332480.2014.965635 

  11. Brinckman, A, Chard, K, Gaffney, N, Hategan, M, Jones, MB, Kowalik, K, Kulasekaran, S, Ludäscher, B, Mecum, BD, Nabrzyski, J, Stodden, V, Taylor, IJ, Turk, MJ and Turner, K. 2019. Computing environments for reproducibility: Capturing the “Whole Tale”. Future Generation Computer Systems, 94: 854–867. DOI: https://doi.org/10.1016/j.future.2017.12.029 

  12. Burnham, KP and Anderson, DR. 2003. Model selection and multimodel inference: A practical information-theoretic approach. New York: Springer Science & Business Media. 

  13. Casadevall, A and Fang, FC. 2016. Revolutionary science. mBio, 7(2). DOI: https://doi.org/10.1128/mBio.00158-16 

  14. Cassirer, E. 2004. Form und technik. In: Recki, B (ed.), Gesammelte werke. Hamburger ausgabe, band 17: Aufsätze und kleine schriften (1927–1931), 139–183. Hamburg: Felix Meiner Verlag. 

  15. Chase, AF, Chase, DZ, Fisher, CT, Leisz, SJ and Weishampel, JF. 2012. Geospatial revolution and remote sensing LiDAR in Mesoamerican archaeology. Proceedings of the National Academy of Sciences, 109(32): 12916–12921. DOI: https://doi.org/10.1073/pnas.1205198109 

  16. Chrysanthi, A, Murietta-Flores, P and Papadopoulos, C. 2012. Archaeological computing: Towards prosthesis or amputation? In: Chrysanthi, A, Murietta-Flores, P and Papadopoulos, C (eds.), Thinking beyond the tool. Archaeological computing and the interpretive process, 7–12. BAR Internat. Ser., Oxford: Archeopress. 

  17. Clark, GA. 1993. Paradigms in science and archaeology. Journal of Archaeological Research, 1(3): 203–234. DOI: https://doi.org/10.1007/BF01326535 

  18. Clarke, D. 1973. Archaeology: The loss of innocence. Antiquity, 47(185): 6–18. DOI: https://doi.org/10.1017/S0003598X0003461X 

  19. Dafoe, A. 2014. Science deserves better: The imperative to share complete replication files. PS: Political Science & Politics, 47(1): 60–66. DOI: https://doi.org/10.1017/S104909651300173X 

  20. Dyson, F. 2000. The sun, the genome, and the internet. New York, NY: Oxford University Press. 

  21. Dyson, FJ. 2012. Is science mostly driven by ideas or by tools? Science, 338(6113): 1426–1427. DOI: https://doi.org/10.1126/science.1232773 

  22. Editors. 2017. Extending transparency to code. Nature Neuroscience, 20(6): 761. DOI: https://doi.org/10.1038/nn.4579 

  23. Editors. n.d. Author guidelines Journal of Computer Applications in Archaeology. Available at https://journal.caa-international.org/about/submissions/ [Last accessed 6 Dec 2019]. 

  24. Eglen, SJ. 2009. A quick guide to teaching R programming to computational biology students. PLoS Comput Biol, 5(8): e1000482. DOI: https://doi.org/10.1371/journal.pcbi.1000482 [Last accessed 18 May 2015]. 

  25. Fiore-Gartland, B. 2017. Hacked Ethnographic Fieldnotes. AstroHackWeek Blog. 1 October 2014. Available at http://astrohackweek.github.io/blog/ethnographic-notes.html [Last accessed 6 Dec 2019]. 

  26. Fuller, DQ. 2010. An emerging paradigm shift in the origins of agriculture. General Anthropology, 17(2): 1–12. DOI: https://doi.org/10.1111/j.1939-3466.2010.00010.x 

  27. Galison, P. 1997. Image and logic: A material culture of microphysics. Chicago, IL: University of Chicago Press. DOI: https://doi.org/10.1063/1.882027 

  28. Gieryn, TF. 1999. Cultural boundaries of science: Credibility on the line. Chicago, IL: University of Chicago Press. 

  29. Härke, H. 2002. Interdisciplinarity and the archaeological study of death. Mortality, 7(3): 340–341. DOI: https://doi.org/10.1080/1357627021000025487 

  30. Harris, TM. 2012. Interfacing archaeology and the world of citizen sensors: Exploring the impact of neogeography and volunteered geographic information on an authenticated archaeology. World Archaeology, 44(4): 580–591. DOI: https://doi.org/10.1080/00438243.2012.736273 

  31. Henrich, J and McElreath, R. 2007. Dual-inheritance theory: The evolution of human cultural capacities and cultural evolution. In: Barrett, L and Dunbar, R (eds.), Oxford handbook of evolutionary psychology, 555–570. Oxford: Oxford University Press. DOI: https://doi.org/10.1093/oxfordhb/9780198568308.013.0038 

  32. Hoel, AS and van der Tuin, I. 2013. The ontological force of technicity: Reading Cassirer and Simondon diffractively. Philos. Technol., 26(2): 187–202. DOI: https://doi.org/10.1007/s13347-012-0092-5 [Last accessed November 19, 2018]. 

  33. Howey, MC, Sullivan, FB, Tallant, J, Kopple, RV and Palace, MW. 2016. Detecting precontact anthropogenic microtopographic features in a forested landscape with lidar: A case study from the upper great lakes region, ad 1000–1600. PloS One, 11(9): e0162062. DOI: https://doi.org/10.1371/journal.pone.0162062 

  34. Huggett, J. 2015. Digital haystacks: Open data and the transformation of archaeological knowledge. In: Wilson, AT and Edwards, B (eds.), Open source archaeology: Ethics and practice, 6–29. Warsaw, Poland: De Gruyter Open. DOI: https://doi.org/10.1515/9783110440171 

  35. Huvila, I and Huggett, J. 2018. Archaeological practices, knowledge work and digitalisation. Journal of Computer Applications in Archaeology, 1(1): 88–100. DOI: https://doi.org/10.5334/jcaa.6 

  36. Ince, DC, Hatton, L and Graham-Cumming, J. 2012. The case for open computer programs. Nature, 482(7386): 485–488. DOI: https://doi.org/10.1038/nature10836 

  37. Ioannidis, JPA, Stanley, TD and Doucouliagos, H. 2017. The power of bias in economics research. The Economic Journal, 127(605): F236–F265. DOI: https://doi.org/10.1111/ecoj.12461 

  38. Jones, ZM. 2013. Git/github, transparency, and legitimacy in quantitative research. The Political Methodologist, 21(1): 6–7. https://thepoliticalmethodologist.com/2013/11/18/gitgithub-transparency-and-legitimacy-in-quantitative-research/. 

  39. Keron, JR. 2015. The use of point pattern analysis in archaeology: Some methods and applications. PhD thesis. University of Western Ontario. 

  40. Kidwell, MC, Lazarević, LB, Baranski, E, Hardwicke, TE, Piechowski, S, Falkenberg, LS and Nosek, BA. 2016. Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLOS Biology, 14(5): e1002456. DOI: https://doi.org/10.1371/journal.pbio.1002456 

  41. King, G. 1995. Replication, replication. PS: Political Science & Politics, 28(03): 444–452. DOI: https://doi.org/10.2307/420301 

  42. Koerner, S. 2018. Scientific revolutions. In: López Varela, SL (ed.), The encyclopedia of archaeological sciences, 1–5. New York: Wiley Online Library. DOI: https://doi.org/10.1002/9781119188230.saseas0525 

  43. Kuhn, T. 1962. The structure of scientific revolutions. Chicago, IL: University of Chicago Press. 

  44. Kurtz, MJ, Eichhorn, G, Accomazzi, A, Grant, C, Demleitner, M, Henneken, E and Murray, SS. 2005. The effect of use and access on citations. Information Processing & Management, 41(6): 1395–1402. DOI: https://doi.org/10.1016/j.ipm.2005.03.010 

  45. LeVeque, RJ, Mitchell, IM and Stodden, V. 2012. Reproducible research for scientific computing: Tools and strategies for changing the culture. Computing in Science & Engineering, 14(4): 13–17. DOI: https://doi.org/10.1109/MCSE.2012.38 

  46. Lodwick, L. 2019. Sowing the seeds of future research: Data sharing, citation and reuse in archaeobotany. Open Quaternary, 5(1): 7. DOI: https://doi.org/10.5334/oq.62 

  47. Markowetz, F. 2015. Five selfish reasons to work reproducibly. Genome Biology, 16. DOI: https://doi.org/10.1186/s13059-015-0850-7 

  48. Marwick, B. 2017. Computational reproducibility in archaeological research: Basic principles and a case study of their implementation. Journal of Archaeological Method and Theory, 24(2): 424–450. DOI: https://doi.org/10.1007/s10816-015-9272-9 

  49. Marwick, B. 2019. Galisonian logic devices and data availability: Revitalising Upper Palaeolithic cultural taxonomies. Antiquity, 93(371): 1365–1367. DOI: https://doi.org/10.15184/aqy.2019.131

  50. Marwick, B and Birch, SEP. 2018. A standard for the scholarly citation of archaeological data as an incentive to data sharing. Advances in Archaeological Practice, 6(2): 125–143. DOI: https://doi.org/10.1017/aap.2018.3

  51. Marwick, B, Boettiger, C and Mullen, L. 2018. Packaging data analytical work reproducibly using R (and friends). The American Statistician, 72(1): 80–88. DOI: https://doi.org/10.1080/00031305.2017.1375986 

  52. Masterman, M. 1970. The nature of a paradigm. In: Musgrave, A and Lakatos, I (eds.), Criticism and the growth of knowledge: Volume 4: Proceedings of the international colloquium in the philosophy of science, London, 1965, 59–89. Cambridge: Cambridge University Press. 

  53. McAnany, PA and Rowe, SM. 2015. Re-visiting the field: Collaborative archaeology as paradigm shift. Journal of Field Archaeology, 40(5): 499–507. DOI: https://doi.org/10.1179/2042458215Y.0000000007 

  54. Meltzer, DJ. 1979. Paradigms and the nature of change in American archaeology. American Antiquity, 44(4): 644–657. DOI: https://doi.org/10.2307/279104 

  55. Miguel, E, Camerer, C, Casey, K, Cohen, J, Esterling, KM, Gerber, A, Glennerster, R, Green, DP, Humphreys, M, Imbens, G, Laitin, D, Madon, T, Nelson, L, Nosek, BA, Petersen, M, Sedlmayr, R, Simmons, JP, Simonsohn, U and Van der Laan, M. 2014. Promoting transparency in social science research. Science, 343(6166): 30–31. DOI: https://doi.org/10.1126/science.1245317

  56. Mitchell, IM, LeVeque, RJ and Stodden, V. 2012. Reproducible research for scientific computing: Tools and strategies for changing the culture. Computing in Science and Engineering, 14(4): 13–17. DOI: https://doi.org/10.1109/MCSE.2012.38 

  57. Molyneaux, BL. 2013. The cultural life of images: Visual representation in archaeology. New York: Routledge. DOI: https://doi.org/10.4324/9781315888460 

  58. Montelius, O. 1899. Typologien eller utvecklingsläran tillämpad på det menskliga arbetet. Svenska Fornminnesföreningens Tidskrift, 10: 237–268. 

  59. Noble, WS. 2009. A quick guide to organizing computational biology projects. PLoS Computational Biology, 5(7): e1000424. DOI: https://doi.org/10.1371/journal.pcbi.1000424 

  60. Nosek, BA, Alter, G, Banks, GC, Borsboom, D, Bowman, SD, Breckler, S, Buck, S, Chambers, CD, Chin, G, Christensen, G, Contestabile, M, Dafoe, A, Eich, E, Freese, J, Glennerster, R, Goroff, D, Green, DP, Hesse, B, Humphreys, M, Ishiyama, J, Karlan, D, Kraut, A, Lupia, A, Mabry, P, Madon, T, Malhotra, N, Mayo-Wilson, E, McNutt, M, Miguel, E, Levy Paluck, E, Simonsohn, U, Soderberg, C, Spellman, BA, Turitto, J, VandenBos, G, Vazire, S, Wagenmakers, EJ, Wilson, R and Yarkoni, T. 2015. Promoting an open research culture. Science, 348(6242): 1422–1425. DOI: https://doi.org/10.1126/science.aab2374 

  61. Pearson, MP. 1998. The beginning of wisdom. Antiquity, 72(277): 680–686. DOI: https://doi.org/10.1017/S0003598X0008710X 

  62. Peng, RD. 2009. Reproducible research and biostatistics. Biostatistics, 10(3): 405–408. DOI: https://doi.org/10.1093/biostatistics/kxp014 

  63. Peng, RD. 2011. Reproducible research in computational science. Science, 334(6060): 1226. DOI: https://doi.org/10.1126/science.1213847

  64. Popper, K. 1970. Normal science and its dangers. In: Musgrave, A and Lakatos, I (eds.), Criticism and the growth of knowledge: Volume 4: Proceedings of the international colloquium in the philosophy of science, London, 1965, 51–58. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139171434.007 

  65. Ram, K. 2013. Git can facilitate greater reproducibility and increased transparency in science. Source Code for Biology and Medicine, 8(1): 7. DOI: https://doi.org/10.1186/1751-0473-8-7 

  66. Rendell, L, Fogarty, L, Hoppitt, WJE, Morgan, TJH, Webster, MM and Laland, KN. 2011. Cognitive culture: Theoretical and empirical insights into social learning strategies. Trends in Cognitive Sciences, 15(2): 68–76. DOI: https://doi.org/10.1016/j.tics.2010.12.002 

  67. Sandve, GK, Nekrutenko, A, Taylor, J and Hovig, E. 2013. Ten simple rules for reproducible computational research. PLoS Computational Biology, 9(10): e1003285. DOI: https://doi.org/10.1371/journal.pcbi.1003285

  68. Schiffer, MB. 2013. The archaeology of science: Studying the creation of useful knowledge. New York: Springer International Publishing. 

  69. Simondon, G. 2011. On the mode of existence of technical objects (trans: Mellamphy, N.). Deleuze Studies, 5(3): 407–424. DOI: https://doi.org/10.3366/dls.2011.0029 

  70. Slater, LJ, Thirel, G, Harrigan, S, Delaigue, O, Hurley, A, Khouakhi, A and Smith, K. 2019. Using R in hydrology: A review of recent developments and future directions. Hydrology and Earth System Sciences, 23(7): 2939–2963. DOI: https://doi.org/10.5194/hess-23-2939-2019 

  71. Snodgrass, A. 2002. A paradigm shift in classical archaeology? Cambridge Archaeological Journal, 12(2): 179–194. DOI: https://doi.org/10.1017/S0959774302000094 

  72. Stark, PB. 2018. Before reproducibility must come preproducibility. Nature, 557: 613. DOI: https://doi.org/10.1038/d41586-018-05256-0 

  73. Stodden, V, Guo, P and Ma, Z. 2013. Toward reproducible computational research: An empirical analysis of data and code policy adoption by journals. PLoS One, 8(6): e67111. DOI: https://doi.org/10.1371/journal.pone.0067111

  74. Strupler, N and Wilkinson, TC. 2017. Reproducibility in the field: Transparency, version control and collaboration on the Project Panormos Survey. Open Archaeology, 3(1): 279–304. DOI: https://doi.org/10.1515/opar-2017-0019

  75. Thieme, N. 2018. R generation. Significance, 15(4): 14–19. DOI: https://doi.org/10.1111/j.1740-9713.2018.01169.x 

  76. Tippmann, S. 2015. Programming tools: Adventures with R. Nature News, 517(7532): 109. DOI: https://doi.org/10.1038/517109a 

  77. Touchon, JC and McCoy, MW. 2016. The mismatch between current statistical practice and doctoral training in ecology. Ecosphere, 7(8): e01394. DOI: https://doi.org/10.1002/ecs2.1394 

  78. Toulmin, S. 1970. Does the distinction between normal and revolutionary science hold water? In: Musgrave, A and Lakatos, I (eds.), Criticism and the growth of knowledge: Volume 4: Proceedings of the international colloquium in the philosophy of science, London, 1965, 39–48. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139171434.005

  79. Trigger, BG. 2006. A history of archaeological thought, 2nd ed. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511813016 

  80. Vaisey, S and Lizardo, O. 2016. Cultural fragmentation or acquired dispositions? A new approach to accounting for patterns of cultural change. Socius, 2. DOI: https://doi.org/10.1177/2378023116669726 

  81. Vandewalle, P. 2012. Code sharing is associated with research impact in image processing. Computing in Science and Engineering, 14(4): 42–47. DOI: https://doi.org/10.1109/MCSE.2012.63 

  82. Watkins, JWN. 1970. Against ‘normal science’. In: Musgrave, A and Lakatos, I (eds.), Criticism and the growth of knowledge: Volume 4: Proceedings of the international colloquium in the philosophy of science, London, 1965, 25–38. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139171434.004

  83. Wickham, H. 2015. R packages: Organize, test, document, and share your code. New York: O’Reilly Media, Inc. 

  84. Wilkins, AS. 1996. Are there ‘Kuhnian’ revolutions in biology? BioEssays, 18(9): 695–696. DOI: https://doi.org/10.1002/bies.950180902 

  85. Wilson, G, Bryan, J, Cranston, K, Kitzes, J, Nederbragt, L and Teal, TK. 2017. Good enough practices in scientific computing. PLOS Computational Biology, 13(6): e1005510. DOI: https://doi.org/10.1371/journal.pcbi.1005510 

  86. Xie, Y, Allaire, J and Grolemund, G. 2018. R Markdown: The definitive guide. New York: CRC Press. DOI: https://doi.org/10.1201/9781138359444