Digital social science research has had an important impact on the types of methodological approaches to the internet and digital social phenomena, practices and communities. Whilst this paper does not seek to include empirical data, it aims to elaborate further on these debates in digital social research, that is, research on ‘life in digital society’ (Lindgren 2017: 230), using insights from my own research methods. This paper will firstly consider some methodological pitfalls that could sabotage our digital social archaeology research. It will then discuss the importance of understanding the framework and sources of our data. It will outline the two main methodological approaches I have used in my own empirical research to date – ‘thick’ social media data collection and analysis, and digital ethnography. It will discuss some of the many ethical considerations that must be assessed and implemented when undertaking this type of work. I will argue for a methodological pragmatism when undertaking social research in the fields of archaeology and heritage, although this pragmatism can be applied to any field of social study in the digital world.
Over the past two decades or more, an overwhelming amount of data about society and what it is to be a human actor in a digital world has become available to the social researcher. If we understand archaeology and heritage as a series of social practices, we can perhaps envision data that can be extracted from contemporary social media discussions on archaeological topics as a field site for the examination of heritage-focused social and economic power structures, of political expediency, and the source of symbolic resources for nationhood and identity. As Richardson & Lindgren have argued previously (2017: 140) ‘it is here at the intersection of archaeology and sociology where we are most likely to radically advance our understanding of the relationship between digital technologies and archaeological knowledge from a uniquely social perspective’. The birth of ‘the “Big Data” paradigm’ (Zelenkauskaite & Bucy 2016) encouraged scholarship using large-scale data-driven methods with which to make sense of society. Access to the Application Programming Interfaces (APIs) of the large social media platforms fostered the development of innovative tools and methods to collect and analyse these data (Bruns 2019: 1545). The affordances of this access to platform APIs, and access to current and historical social media data from a range of social media platforms, changed the direction of epistemological and ethical approaches to digital research. Together, these had an important impact on the types of methodological approaches to the internet and digital social phenomena, practices and communities. We were forced, by the velocity of technological and social change, to ‘reframe key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and the categorization of reality’ (boyd & Crawford 2012: 665). 
Archaeologists and heritage professionals who wished to locate the presence of the past within contemporary social and political discourse could benefit from the multidisciplinary work of colleagues in the digital humanities and within the social sciences. Using data derived from social media, it is possible to observe and understand the human social activity that takes place around the subject of archaeological sites, monuments, knowledge, timelines, narratives, communities and professional discourses, in both individual lives and collective experiences.
In a Global North atomised by fresh claims of Fake News, science scepticism, and the reanimation of race science (Saini 2019), there has never been a more important moment at which to try to understand what meaning society ascribes to archaeology and archaeologists, and to actively use our professional, expert knowledge and voice in public discourse (Niklasson & Hølleland 2018; Zuanni 2017). However, we first have to understand how these public discourses are created, how they emerge in the digital world, the circumstances under which archaeology is discussed as a socially relevant subject, and how these discussions are mediated and socially enacted in digital society – what Zuanni describes as ‘new opportunities to observe how users engage with archaeology in their everyday lives, beyond professional initiatives’ (2017: 1). The use of computational methods to gather data in order to understand archaeology and cultural heritage as specifically social fields, beyond the professional influence of archaeologists and archaeological activities, is still a developing field, and is, at present, almost wholly Anglophone. Although similar methods are attracting interest in the wider field of the digital humanities, of which archaeology is undoubtedly a part, only a ‘select few researchers are in a position to truly reap the benefits of big social data analysis’ (Zelenkauskaite & Bucy 2016). A handful of scholars working in the field of heritage and archaeology have made forays into the use of these computational techniques, and a number of empirical papers have been published (for example: Altaweel 2019; Bonacchi, Altaweel & Krzyzanska 2018; Cunliffe & Curini 2018; Ginzarly, Roders & Teller 2019; Greenland et al. 2019; Huffer & Graham 2017; Huffer & Graham 2018; Oteros-Rozas et al. 2018; Zuanni 2017).
Despite this growing interest in digital methods from within the heritage field, gathering digital social data is not without obstacles. In the aftermath of the Cambridge Analytica scandal of 2018 (Constine & Hatmaker 2018), the largest corporate social media platforms tightened up access to their APIs, and shut down many, if not most, straightforward avenues for simple data collection for the average social/digital archaeological researcher. As Bruns (2019: 1546) notes, ‘the relationship between researchers and platforms was never entirely harmonious’. Without open APIs, ‘web interfaces have to be scraped to access the data… which is labour-intensive and drastically limits the amount of information that can be collected and processed’ (Walker, Mercea & Bastos 2019: 1536). As a reaction to these relatively recent changes to the landscape of social media research, this is a paper focused on theory and method. It does not present ‘new’ empirical archaeological knowledge but brings together ‘old’ methodological discussions in a new way for an archaeological audience. It will not discuss data assemblages, nor will it provide case studies in depth. It will, however, shine some light on some of the considerations of the digital social researcher that may have been overlooked, or undervalued in social/digital archaeological research to date. This paper will firstly discuss the importance of understanding the framework and sources of our data, especially in the light of recent changes to access to APIs, the impact of these changes and the issues of restricted access to social media data for digital research. Secondly, it will consider some of the methodological pitfalls and ethical considerations that are essential in digital social/archaeological research. 
It will then try to address some of these concerns by outlining the two main methodological approaches I have used in my own empirical research to date – ‘thick’ social media data collection and analysis, and digital ethnography. This paper will then argue that the future of social/digital research in heritage and archaeology calls for the development of a methodological pragmatism, although this pragmatism can be applied to any field of social study in the digital world. Finally, this paper will argue for a bricolage methodology, based on an application of Grounded Theory, with a focus on contextualising and examining user behaviour when exploring contemporary reception and understanding of archaeological material, themes and discoveries.
The use of social media as a data source presents its own methodological pitfalls, not least the fact that some forms of social media data are effectively inaccessible to the average researcher. The year 2018 marked the beginning of the end of easily accessible social media data – at least, for the poorly resourced scholar, or the researcher without a relationship with a proprietary platform or funding from government or another large organisation (Halavais 2019: 1568). Three of the best known and most widely researched social media platforms are Facebook, Instagram and Twitter, and thanks to the prioritisation of the needs of advertisers and a variety of privacy scandals, access to these for research has been restricted. After investigation in 2018 by the US Federal Trade Commission, access to the suite of Facebook APIs is now heavily limited, privacy protections have been significantly increased, and Facebook now run their own market research app (Facebook Newsroom 2019). Access to data for independent research on Facebook is now managed by an academic-non-profit partnership called “Social Science One”, whose ‘incentive-compatible approach enables academics to analyse and use the increasingly rich troves of information amassed by companies to address societal issues, while protecting their respective interests and ensuring the highest standards of privacy and data security’ (Social Science One 2019). Instagram is owned by Facebook, Inc., and also shut down access to its public API in 2018. Its Platform Policy requires all developers to provide a use policy, respond to individual requests for information deletion, and obtain platform consent for most forms of data collection that a social researcher might require (Instagram 2019). Twitter has offered premium paid access to its data hose for a number of years (Richardson 2014a), usually at a price that places it far beyond the financial capabilities of most academic research grants.
The loss of licensed access to Twitter data through Texifter, the academic-focused analytics and commercial data provider, in September 2018 significantly impacted my own research and left a gaping hole in easy scholarly access to the platform (DiscoverText 2019). Currently, Twitter can be accessed through the Developer API, although potential users need to register an application and provide detailed information about the form and scope of the research they wish to undertake. When applying for my own developer access, I had to answer a series of emails which asked for very detailed information about the future use of the data gathered from the platform, including my plans to publish, and the potential use of images and screenshots taken from the site at any point in the future. The developer policies also restrict the use of data on several subjects, such as health, political beliefs or religious affiliation, all of which might be highly relevant to the social researcher (Twitter Help Center 2019).
Although there are many good reasons for increased privacy and user protection across these platforms, these restrictions have vastly limited the representativeness of any data obtained through these platforms. The issue of capturing metadata and metrics, or the relationship between entities, may also restrict the usefulness of data, or change the focus of enquiry. Reliance on digital data collected from the major social media platforms also risks research design dependent on the technical architecture and constraints of the platform in question, rather than activity and behaviour of the social actors and issues being researched (Lewis 2015; Marres 2017). Bruns (2019: 1556) argues that there are four options left in what he terms this ‘hostile environment’ for scholarly research:
All of these are possibilities for those of us in the field of social/digital archaeology and heritage, although they would require skills in web scraping, serious ethical somersaults, and injections of cash. These issues also represent further barriers for independent researchers, those outside of large research-intensive universities, and academics working outside the Anglophone Global North (Puschmann 2019: 1588). Of course, all social media platforms and apps will hold technical, social and cultural challenges for the researcher, which will differ depending on the ability to access platform data, the type of social activity or phenomena under consideration, and the apparatus of social interaction provided. Web scraping also limits the area of research to web-based content and makes access to information from mobile apps very complicated. Using data collection methods that require less technical understanding, or data that are easier to access, might lead to a ‘scrape-analyse-conclude’ process in heritage research which is epistemologically troubling and technologically determinist. Easier-to-reach sources such as Wikipedia or Reddit, or even user comments, blogs and fora, remain relatively unexplored by heritage scholars. As Marres argues (2017: 114), we perhaps need to reduce the role of technological considerations in the first iterations of our research methodology designs, and become device or platform aware, rather than device or platform driven.
There is little advice available to the archaeologist and heritage practitioner from within the sector on how to approach digital social research questions from a methodological perspective, what challenges might arise in terms of data collection, and how to carry out this type of research in an ethical manner. The digital environment allows for far more complex forms of data to be collected and processed. New data types emerge alongside new devices, platforms and forms of connectivity, requiring a pragmatic approach to the types of data the researcher plans to collect, which we will discuss further below. It behoves the archaeological researcher to explore the wider world of digital research. Some of these forms of data, such as intentional data in the form of online surveys and interviews, existed prior to the ubiquity of 21st century social media, but some are specific to these new landscapes. Traditional social research in archaeology has most often relied on the production of intentional data, such as that derived from surveys and interviews (e.g. Kowalczyk 2016; Apaydin 2017). Purdam and Elliot (2015: 26) have usefully categorised the types of data produced in digital environments, and their categories are worth reflecting on from a methodological perspective. Purdam and Elliot’s (2015: 28–29) typology consists of:
From this typology, we can identify that archaeologically relevant data are most likely to be found as intentional data, participative or self-published data, social media and found data. There are many developing methods for accessing, extracting and analysing data sets from these data sources which are relevant to archaeological topics. Computational social science offers a challenge to our understanding of public opinions and sentiments in the variety of tools and programmes that can be used to access them. But these data, and the analysis of the public-facing social and cultural lives represented, are ‘not only an important topic for social enquiry, but they also require active methodological engagement’ (Marres 2017: 188). Many of the social/digital archaeology and heritage papers published in the past few years and outlined above have used ‘big data’ approaches to social data. There are too many arguments in the wider field of digital research that have been made around the many definitions and impacts of the availability (or not) of large amounts of social data to discuss here with any complexity, but it is useful to briefly outline what such approaches might involve for a researcher interested in heritage perspectives. According to boyd and Crawford (2012: 663) ‘big data’ can be defined as:
‘…a cultural, technological, and scholarly phenomenon that rests on the interplay of:
- Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets.
- Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims.
- Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy.’
The recent challenges faced by members of the academic community who wish to access platform data are a reminder that digital scholars first need to engage with increasingly complex technological skills and with the tools used to collect data, such as APIs and web scraping. In practical terms, this means that social/digital archaeologists need either to have previous experience of digital data capture and coding skills, or the time and money to acquire these skills, or access to human and computing resources that can provide this technical research support. My own experience of attempting to capture large datasets from social media via the use of web scraping with Python entailed a week-long training course, the purchase of numerous ‘teach yourself’ books, and intense frustration on my journey. These experiences can be avoided when planned work is funded, and appropriate software support can be strategically placed in order to speed up the data collection process.
Data capture means data cleaning, which is theory-laden in itself. The analysis of the data captured begins first with processing and cleaning, which almost always ‘relies on some form of “data reduction” and indeed, prior categorization’ (Marres 2017: 20). Again, this is an ethical issue to consider when dealing with data drawn from the digital presence of indigenous communities and others who experience colonialism and oppression in their daily lives. These factors may affect the range of research questions asked in the first place, and the types of analysis that may take place. It is essential to understand how value-laden, gender-discriminatory, racist and biased many of the algorithms that supply our platform-derived data are, although this is beyond the scope of the discussion in this paper (e.g. Noble 2019; O’Neil 2016; Selena & Kenney 2019). The tools for analysing these ‘big data’ assemblages often also rely on algorithms to locate patterns and model topics, ignoring the epistemological issues that sit alongside the use of these types of data (Huggett 2018: 100). There are also infrastructure-related issues attached to the use of large datasets, often as a result of the challenges of dealing with the amounts of data produced, and the need for data storage capacity, digital curation of ‘ephemeral’ data, and preparation for data reuse (Huggett 2018: 101). If we want to understand social activity around archaeological subjects, it is vital to understand where and how bias occurs, whether in our research design, in the technical architecture, or in the unrepresentative nature of our subject platform populations. These in turn affect the format and location of data collection, and further extended discussion of these issues amongst the social/digital heritage community is urgently required.
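The point that cleaning is itself a form of prior categorisation can be made concrete with a minimal Python sketch. It is offered only as an illustration under stated assumptions: the posts are invented for this example rather than drawn from any platform or study, and the three rules and the function name `clean` are hypothetical choices of the kind a researcher might make, not a recommended pipeline.

```python
from collections import Counter

# Hypothetical, invented posts; not drawn from any real platform or dataset.
POSTS = [
    {"user": "a", "lang": "en", "text": "Stonehenge tunnel plans are a disgrace"},
    {"user": "b", "lang": "en", "text": "RT @a: Stonehenge tunnel plans are a disgrace"},
    {"user": "c", "lang": "cy", "text": "Mae'r cynllun twnnel yn warthus"},
    {"user": "d", "lang": "en", "text": "wow"},
    {"user": "e", "lang": "en", "text": "Visited the stones at solstice, felt like home"},
]

# Each rule encodes a researcher's prior decision about what counts as 'noise'.
# These three rules are illustrative assumptions only.
RULES = {
    "drop_retweets": lambda p: p["text"].startswith("RT @"),
    "drop_non_english": lambda p: p["lang"] != "en",
    "drop_short_posts": lambda p: len(p["text"].split()) < 3,
}


def clean(posts, rules):
    """Return the posts that survive cleaning, plus a tally of removals per rule."""
    kept, removed = [], Counter()
    for post in posts:
        # Record the first rule (in insertion order) that excludes this post.
        hit = next((name for name, rule in rules.items() if rule(post)), None)
        if hit is None:
            kept.append(post)
        else:
            removed[hit] += 1
    return kept, removed


kept, removed = clean(POSTS, RULES)
print(f"kept {len(kept)} of {len(POSTS)} posts")
for name, count in removed.items():
    print(f"{name}: removed {count}")
```

In this toy corpus, routine cleaning silently removes the only non-English voice, an amplified opinion and a terse reaction. Keeping a tally of what each rule excludes is one modest way of making such decisions visible and open to reflection.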
Understanding that these data are not produced simply as the result of data science applied to heritage relevant material is important. Large data sets created for social research are made up of both human and machine data and metadata. These are best understood as emergent data assemblages (Lupton 2016: 2) which only emerge through human/technology encounters and re-encounters in these social/political/economic/technological overlaps. There is some ambiguity in the framing of why we might use these large datasets in social/digital archaeology research – are we researching society, communication platforms, technologies, or a hybrid? Digital data is never collected without being “pre-cooked” and it is always collected, analysed and understood through social, economic and political actions and contexts – in the case of archaeological data, these most often focus on research undertaken in the Global North, with and by Anglophone communities.
As Lindgren (2017: 238) has argued, ‘…research about digital society demands continuous critical reflection’. Since the field of digital social research has no single accepted methodological tradition, and the speed of the development of digital technologies, alongside social interactions, creates new data types and contexts, there is an urgency around the subject of ethics in the practice of digital research. The emergence of big data research paradigms in digital heritage calls for further critical and reflexive thinking around what ethical issues complex data environments might generate – and most importantly, we need to encourage more discourse on the ethical issues encountered by researchers in published research papers. There has been a notable neglect of ethical issues in social/digital archaeologies published to date, with the exception of Richardson (2018) and Huffer, Wood & Graham (2019). Whilst it is beyond the scope of this paper to fully examine the ethical issues attached to research in these forms of digital public archaeology, it is important for researchers to be familiar with the ethical issues raised in the archaeological field, especially in the light of the argument for methodological pragmatism laid out in this paper.
For the methodological pragmatist, the advice from the Association of Internet Researchers (Buchanan & Markham 2012: 5) offers perhaps the most appropriate guiding principle to consider when undertaking the types of methods and areas of research outlined in this paper:
‘Ethical decision-making is a deliberative process, and researchers should consult as many people and resources as possible in this process, including fellow researchers, people participating in or familiar with contexts/sites being studied, research review boards, ethics guidelines, published scholarship (within one’s discipline but also in other disciplines), and, where applicable, legal precedent.’
Sharing information on the complications (or not) of the many ethics approval procedures and requirements set by our institutional review boards would help researchers in the field of digital archaeology and heritage understand what they need to do, and would show how these questions have been answered before. The nature of digital social research lends it an ethical slipperiness which means that a perfect understanding of ethical methodologies is almost impossible to achieve, unless individual research questions are approached iteratively and regularly. Responsible research in digital heritage should start with an understanding of the variety of ethical guidance documents available to digital researchers. Organisations such as the Association of Internet Researchers Ethics Working Committee (Buchanan & Markham 2012), the British Sociological Association (2019), and the British Psychological Society (2017), amongst others, publish comprehensive guidance material for digital research. These documents outline the complexity involved in research design and procedures for data collection and storage, all of which are highly contextualised. All members of the international Computer Applications and Quantitative Methods in Archaeology (CAA) must abide by its Ethics Policy, but this policy does not yet directly endorse ethical research beyond the requirement of members to ‘provide education on good practices in digital and computational archaeology’ (CAA 2018). Still, many digital archaeology and heritage researchers either appear to have neglected to address ethical considerations explicitly in their publications, or state that they rely on their institution’s or funder’s research ethics governance committees for support when dealing with digital data.
It is worth reiterating that many organisations beyond the higher education sector may not have ethics codes or obligations, and even those researchers in institutions with a research governance process may find their oversight committee has members who may not work in digital research fields and may not understand the complexities involved as a result. It is also important to be aware that digital researchers, and their ethical oversight committees, may not be fully cognisant of the variety of issues intrinsic to digital projects that encounter, for example, indigenous knowledge, unrepatriated museum objects, or other forms of digital colonialism (e.g. Kamash 2017). Further public discussion of these issues amongst social/digital archaeologists and the communities they serve would be of immense benefit for socially just digital heritage and future research.
The potential for archaeologically focused research using assembled data to understand opinions and interactions, dominant voices, audiences, public engagement, reception studies and opinion tracking is vast, but has been hobbled by the recent changes to access to APIs for the major players in social media. From a pragmatic methodological standpoint, it is not sufficient to study the entanglements of the online world of archaeology and heritage from a distance and at scale alone. As discussed above, the digital is tightly bound with material, political and social considerations about being human, and ‘…humans have always been a product of technogenesis’ (Duggan 2017: 4). Using big data approaches means we are often unable to see the nuanced and subjective meanings attached to human interactions, and so we need to seek to understand the complexity of social actions which extend beyond the online, or what Geertz termed ‘webs of significance’ (1973: 5). Getting to grips with the semiotics of digital interaction and the meanings ascribed to these ‘webs of significance’, whether that is in order to understand social questions about Brexit and archaeology, themes of national heritage, the antiquities trade or platform-specific archaeological interactions, also requires us to map contexts, and compile and interpret our data close up and in detail (Lindgren 2017: 261). As Rogers notes, the purposes of social media are ‘success theatre and projection, productive networking and consumer futurism’ (Rogers 2018: 454): that is, a tool focused on self-projection, whether consciously or not. It is only by seeking to describe and decode the interactions observed online that are embedded in everyday lives that we can hope to understand the meaning, use and value applied to the archaeological and historical information in the multitude of social practices we encounter on the internet.
My own PhD research (Richardson 2014a) and subsequent postdoctoral work have developed some of the strategies and methods from the discipline of online ethnography. Geertz (1973: 14) has described ethnography as ‘thick description’ – a methodological approach that specifies contextual information, complex details, structures and meanings. Digital ethnography is also, as Whitehead (2005) argues, neither qualitative nor quantitative. These methods have been harnessed with varying degrees of success and publication during the last decade of my academic life. For example, I have used digital ethnographic approaches in order to gather data on the crowdsourced archaeology project, the Day of Archaeology (Richardson 2014b). This work included an examination of the comments and interactions on the Day of Archaeology website, Facebook page, and Twitter feed. I also used participant observation to support qualitative work on Twitter use amongst archaeologists, during the period 2011–17. I worked on an unpublished AHRC-funded project ‘Data, Diversity and Inequality in the Creative Industries’ in 2018 which used a mixed methods approach to understanding sentiment, emoji, and exclusion on the Twitch platform during Esports events, and most recently have used a hybrid of participant observation and textual analysis to explore a dataset drawn from Twitter and ‘below-the-line’ comments on newspaper articles about Stonehenge.
Whilst these are all interesting and nuanced areas of digital methods research which will hopefully all see the light of day in publication soon, they usefully highlight the propensity of a digital ethnographic framework to contain methodological pitfalls and possible misunderstandings, especially as digital ethnography is an evolving practice and a paradigm not without some controversy. There are numerous disagreements over digital research methodologies across the fields of computing, media and communication studies, sociology, anthropology, and beyond (e.g. Marres 2017). Navigating this landscape requires a robust engagement with new fields of literature, an appreciation of old rumblings and current discussions within these fields, and a depth of understanding of the wider issues of the technical architecture of the internet, how platforms work, and how users interact within those platforms. The field of social media research in its broadest sense is highly varied, undertaken differently in different research fields, and has emerging traditions for enquiry and analysis in each of these fields. Archaeology has a long history of what has been termed the ‘ill-digested browsings of the literature of the sociology of science’ (Murray 2014: 83). Methodological pragmatism may also reinforce similar cherry-picking of internet-related literature, which can be avoided by careful consideration of the implications of interdisciplinarity and unfamiliarity with other fields and traditions. Hence, the foundations for this type of research are not immediately available for the average archaeological scholar and require significant investment in new skills and extensive exploration in new fields of literature.
There are two main approaches to ethnographic work undertaken with online environments. There have been a number of publications on the subject of online ethnography from the late 1990s onwards, and this is an innovative and iterative framework for research. Here, we will consider the use of the term ‘digital ethnographies’ to refer to the study of digital environments variously termed ‘cyber ethnography’ (Ward 1999), ‘virtual ethnography’ (Hine 2000), ‘ethnography of virtual worlds’ (Boellstorff et al. 2012), or ‘netnography’ (Kozinets 1998). In most cases, these seek to use the internet and digital spaces as fields of research to understand user behaviour, interactions, and cultural practices. There is no single accepted overarching research method, although generally these methods and approaches favour covert or overt participant observation and interviews. Over the past two decades or more this sort of research has taken place in a wide variety of virtual environments, gaming sites, message boards, online fora, blog comments, dating apps, on proprietary social media platforms and mobile applications, and may also involve some form of data collection offline. This field list is growing and not exhaustive (e.g. Hine 2015; Hjorth et al. 2017). The process of digital ethnographic research may or may not begin with a formal introduction by the researcher into the field of the research subject community or platform. Data can be collected in the form of images, text, film, screenshots of comments and interactions, interviews, and reflective and observational field notes. As new platforms, communities and technologies develop, the data formats will also develop (Kozinets 2015; Hjorth et al. 2017).
The literature on the methodological background to digital ethnographies has been dominated in recent years by the work of Christine Hine, a sociologist at the University of Surrey. Hine’s discussion (2015) of ethnography with the internet moves beyond the concept of online/offline and digital versus ‘real’ world practices and actions, and better reflects the embedded role the internet has in societies and actions in the Global North. With these distinctions in mind, methodological frameworks for ethnography with the internet are important to outline here and are key considerations for the researcher seeking to examine the contemporary social landscapes of archaeology and heritage. Hine’s work argues for an E3 Framework for the internet (2015: 32), which provides context when thinking about the various research questions and methodological challenges and strategies involved in digital public archaeology in all its manifestations. These three Es refer to the internet as embedded, embodied and everyday: the internet is embedded in, and is blended through, our everyday realities, material and physical, as much as the digital. As Lindgren (2017: 272) argues, the methodological bricolage ‘must move beyond any divisions between “qualitative” and “quantitative”’, and grasping the indistinguishableness of what is digital and what is not is key to the pragmatic digital researcher’s toolkit.
The internet is not ‘some sort of meta-entity’ (Lindgren 2017: 267) where online activities and use of the internet are a) separate parts of people’s actions or identities or b) used similarly everywhere, at all times, for the same things. Researching social practices and interactions around subjects such as archaeology will also entail multi-sited ethnographies, and gathering data with an online/offline division is an easy methodological pitfall to avoid. It is important here to also draw a distinction between the forms of digital ethnography outlined above and what Duggan (2017: 6) terms ‘non media-centric approaches to digital ethnography’: ethnographic approaches to digital media practices and cultures which are not necessarily taking place in a binary online/offline context. These move our conceptualisation of what is a digital field of study, along with associated social practices for archaeologists (beyond the screen, for example), to a broader understanding of the complexity of the digital world akin to that suggested by Bratton’s ‘The Stack’ (2016). That is, the digital encompasses everything from the geological and energy considerations of computational power, to internet apparatus, code, web design, software, and the hardware you hold in your hand, alongside the social practices that emerge within and through all of these complex layers of material and interaction.
Tufekci (2014) has raised a series of important methodological issues which must be addressed, and notes that ‘(the) meaning of social media imprints, context of human communications, and nature of socio-cultural interactions are multi-faceted and complex’ (2014: 513). First, the lack of attention to the structural biases of the main platforms that are researched by digital social scholars – and social/digital archaeologists – means that the affordances of each platform for the user might be overlooked or underestimated in relationship to the data collected (Tufekci 2014: 506). There is also the issue of sample bias: differing platforms contain social groups and individuals with different digital literacies and different social norms, and these may not be represented or sampled without careful attention to the various factors that draw users to these spaces in the first place.
Secondly, when it comes to research methods which rely on large scale searches for keywords or hashtags alone, such as those used by Huffer & Graham (2018) for example, data selection takes place through a format which is unstable. Hashtags may only be used by specific populations and participants, may be used inconsistently, or need to be well known to be used; hence ‘hashtag dataset analyses need to be accompanied by a thorough discussion of the culture surrounding the specific hashtag, and analysed with careful consideration of selection and sampling biases’ (Tufekci 2014: 508). For example, hashtag users on Twitter and Instagram self-select, and the performance and the use of hashtags have different meanings within each platform, within social networks and between individuals. A relevant hashtag may not be applied in every relevant tweet or Instagram post, and therefore data will be missing if search activity concentrates on hashtag searches or visualisations. Hashtags may be used for a variety of social reasons, including to demonstrate community identity and empathy. However, hashtags are also used for less positive, disruptive activities such as trolling, hate-posting, or to hijack a trending topic. People may take to Twitter to subtweet, to screen capture and quote tweet, or to indicate feeling without using text, instead using an emoji, series of emoji, an image or a GIF. Similar subversive or non-textual activity takes place across all social media platforms. It is essential to bear in mind the ephemerality and moments of spontaneous, unrepeated ‘collective enthusiasm’ (Caliandro 2018: 567) represented by social interactions online.
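The risk of missing data under hashtag-only selection can be illustrated with a minimal Python sketch. The posts, the hashtag and the keyword list below are all invented for illustration, and the broader keyword match is only one possible alternative selection strategy; it is not a method taken from any of the studies cited above, and it carries selection biases of its own.

```python
import re

# Hypothetical toy corpus: invented posts, not real social media data.
POSTS = [
    "Great day on site today #archaeology",
    "New finds from the Roman villa excavation",
    "Why is nobody talking about this hoard?",
    "#archaeology is colonial nonsense, change my mind",
    "Visited the museum, the Bronze Age hoard is stunning",
]

HASHTAG = "#archaeology"                            # assumed search hashtag
KEYWORDS = ["excavation", "hoard", "archaeology"]  # assumed researcher-chosen terms


def select_by_hashtag(posts, hashtag):
    """Return the posts that contain the exact hashtag (case-insensitive)."""
    pattern = re.compile(re.escape(hashtag) + r"\b", re.IGNORECASE)
    return [p for p in posts if pattern.search(p)]


def select_by_keywords(posts, keywords):
    """Return the posts that contain any keyword as a whole word."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return [p for p in posts if pattern.search(p)]


def missed_by_hashtag(posts, hashtag, keywords):
    """Posts found by the keyword search that a hashtag-only search would miss."""
    tagged = set(select_by_hashtag(posts, hashtag))
    return [p for p in select_by_keywords(posts, keywords) if p not in tagged]


if __name__ == "__main__":
    for post in missed_by_hashtag(POSTS, HASHTAG, KEYWORDS):
        print("missed:", post)
```

Even in this toy corpus the hashtag search returns one enthusiastic and one hostile post, which says nothing about why either user chose the hashtag; the counts only become meaningful alongside the qualitative reading of hashtag culture that Tufekci calls for.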
This is especially important to consider when making claims that digital heritage and the production of heritage online takes place ‘naturally’ (Bonacchi & Krzyzanska 2019: 4). The majority of social interactions on social media are not collective, persistent or thick, and may be made by singletons, disconnected from any discussion. The types of interactions may in fact be as shallow as a like, a favourite or the posting of a reaction emoji, the meaning of which cannot be interpreted with any reliability (Andrejevic 2013: 15). The use of retweets, likes, favourites, shares, open links etc. as a metric to measure engagement or ‘success’ in some form does not contextualise this information in terms of the total audience for these materials, and the number of users who may have seen posts, images, memes and so on and done nothing at all remains ambiguous and opaque. This may be a result of the closed nature of some social media platforms in terms of the availability of this type of data, but an estimate of population is possible with Twitter for example, which provides data on engagements with Tweets, the number of interactions or the number of times images have been viewed. For example, my research on Twitch demonstrated that emoji were often posted at speed, during fast on-screen game play, and could be interpreted as channels for excitement, humour, keeping a hand in the conversation, or even as a form of trolling, rather than statements of serious intent or meaning.
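The point about audience size can be made concrete with a short Python sketch. It assumes, hypothetically, that a platform exposes an impressions (view) count alongside interaction counts; the `PostMetrics` structure, its field names and the figures are my own invention for illustration and do not correspond to any specific platform API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostMetrics:
    """Hypothetical per-post metrics; field names are illustrative only."""
    post_id: str
    interactions: int            # e.g. likes + shares + replies, as counted by the platform
    impressions: Optional[int]   # None when the platform does not expose view counts


def engagement_rate(metrics: PostMetrics) -> Optional[float]:
    """Interactions as a share of the known audience.

    Returns None when the audience size is unknown or zero, so that a raw
    interaction count is never silently treated as a measure of 'success'.
    """
    if metrics.impressions is None or metrics.impressions == 0:
        return None
    return metrics.interactions / metrics.impressions


# Invented figures: identical raw counts, very different audiences.
small_audience = PostMetrics("post-a", interactions=50, impressions=500)
large_audience = PostMetrics("post-b", interactions=50, impressions=50_000)
closed_platform = PostMetrics("post-c", interactions=50, impressions=None)

if __name__ == "__main__":
    for m in (small_audience, large_audience, closed_platform):
        print(m.post_id, engagement_rate(m))
```

Returning `None` rather than a number for the closed platform is a deliberate design choice in this sketch: it keeps the opacity of the audience visible in the analysis instead of hiding it behind a raw count.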
Digital social research is, as Marres (2017: 140) argues, ‘partial to particular locations, populations, topics and forms of expression’. This raises a number of methodological problems and increases the complexity of the disciplinary framings in the broad heritage field. Studying the social in digital archaeology and heritage in contexts found in the Global North requires us to adjust our empirical framings. As infrastructure, platforms and social practices change rapidly, the sophistication of organisational output and the number of participatory online projects will both increase, while the digital saturation of much of society across the world continues. The internet, and the social apps, tools and platforms found there, provide a research context with an ‘essential changeability’ that demands a continual, conscious shift of focus and method (Jones 1998: xi). Even though Jones’ work was written 20 years ago, understanding this ‘essential changeability’ is still crucial. Digital social research requires an interdisciplinary approach and will inevitably require some form of engagement with coding and data retrieval. As Tsatsou (2018: 1253) notes, these areas of digital literacy are not necessarily ones in which researchers with a non-digital humanities or social science background have been sufficiently trained. Tsatsou also identifies that the complexity of methodological issues in digital research and lack of high-level technical skills for the humanities researcher could pose digital literacy challenges relating to ethics (2018: 1245). This work highlights the need for social/digital archaeological researchers to contextualise and reframe their research questions as well as their research data, with the complex needs and realities of research subjects who are simultaneously online and offline.
Funding for training is required, as is acknowledgement by colleagues in archaeology and heritage studies that fieldwork in digital spaces is a valid and valuable form of contemporary archaeology as much as it is a form of social research.
Lindgren’s call (2017: 234) for methodological pragmatism is a powerful one. By focusing on the issue or issues to be researched and thinking deeply and critically about the types of understanding that are sought, the research methodologies can emerge as a ‘patchwork of solutions’ as they arise. As Lindgren notes, ‘…one is by necessity forced to actively and critically navigate a landscape of old and new methods in order to seek out ways of engaging with data that suit one’s particular project’ (2017: 235). Rather than choosing a research method at the beginning of the research study, and sticking to one data collection method, or one single platform, it is possible to create something that might be considered to be a form of methodological bricolage. This builds on the developments within methods literature, and the work of methodologists Denzin and Lincoln (2005). Bricolage itself derives from structuralism and post-structuralism, especially the work of Lévi-Strauss (1966) and Derrida (1967), describing the way in which people use what is needed to get tasks done, to make do, to improvise, and to innovate. This approach also opens up space for detailed consideration of ethical issues, to reflect on the potential paths for data collection that do the least harm to the communities and individuals under consideration.
When it comes to data analysis, the methodological pragmatism and sense of bricolage outlined by Lindgren (2017) also work very well alongside Grounded Theory, which is the theoretical positioning I have taken throughout my research career. Grounded Theory is a qualitative research and analysis method first suggested by Glaser and Strauss in 1967 which facilitates ‘the discovery of theory from data’ (Glaser & Strauss 1967: 1) through observation of and iterative reflection on social actions, attitudes and participation (Charmaz 1995; Charmaz 2006). Grounded Theory is especially useful when ‘the study of social interactions or experiences aims to explain a process, not to test or verify an existing theory’ (Lingard, Albert & Levinson 2008: 459) and offers the methodological pragmatist the flexibility to observe and explain how collective and individual actors operate.
The process of Grounded Theory is very much a bricolage, in the sense that the researcher improvises concepts and relationships that emerge from the raw collected data, which are then iteratively organised and reorganised into themes. These eventually emerge as a series of concepts which can be drawn from the data itself, both during and after the ‘data collection and analysis phases of the research’ (Charmaz 1995: 28). This method starts with a basic research question, and a non-linear pathway through the data collection process, building on the complexity of the research questions as the researcher moves through the data environment. By undertaking data analysis as an iterative process, the researcher can focus on the development of the research questions, change platforms and fields as they follow the topics and social actors or actions, and gather data from more than one platform or source. The results of each of these stages guide the next stage of data collection, the methods next used, the subsequent platforms used, and the increasing refinement of the research questions as a result (Pickard 2013: 182). Of course, with reflexivity baked into these methodological bricolages, the researcher is required to centre ethical practice (as discussed above), and ongoing critical reflection can only aid this process.
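For researchers who keep their coding in a script or spreadsheet rather than dedicated qualitative analysis software, the iterative organising and reorganising of codes into themes might be sketched as follows. This is a minimal Python illustration under my own assumptions: the excerpts, code labels and theme names are invented, and the data structure is a bookkeeping aid only, not a formal implementation of Grounded Theory.

```python
def group_into_themes(coded_excerpts, code_to_theme):
    """Group excerpts under themes according to the current code-to-theme mapping.

    coded_excerpts: list of (excerpt, [codes]) pairs produced by open coding.
    code_to_theme:  dict mapping a code to a theme; codes not yet mapped are
                    collected under 'unassigned' for review in the next pass.
    """
    themes = {}
    for excerpt, codes in coded_excerpts:
        for code in codes:
            theme = code_to_theme.get(code, "unassigned")
            bucket = themes.setdefault(theme, [])
            if excerpt not in bucket:  # an excerpt appears once per theme
                bucket.append(excerpt)
    return themes


# Invented field-note excerpts with hypothetical open codes.
CODED = [
    ("emoji posted during fast game play", ["excitement", "keeping a hand in"]),
    ("sarcastic reply to a museum post", ["humour", "trolling"]),
    ("hashtag used to signal belonging", ["community identity"]),
]

# First pass: only some codes have been organised into themes.
first_pass = {"excitement": "affect", "humour": "affect", "trolling": "disruption"}

# Second pass: the mapping is revised after reflecting on the 'unassigned' excerpts.
second_pass = {**first_pass,
               "keeping a hand in": "presence",
               "community identity": "belonging"}

if __name__ == "__main__":
    print(group_into_themes(CODED, first_pass))
    print(group_into_themes(CODED, second_pass))
```

The second pass does not replace the researcher’s judgement; it simply records how the themes shifted between iterations, which can itself become part of the reflexive field notes.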
Building on the work of Richardson and Lindgren (2017) this paper has further emphasised that digital archaeology and heritage researchers need to develop further the interdisciplinary nature of digital social research, and work with and learn from specialist colleagues in informatics, sociology and internet studies. These fields are explicitly familiar with digital theory and methodologies and are where the many hyperboles of digital research have been addressed. A careful understanding of the culture of the platforms under examination, and the types of data sought is absolutely key to robust research. As Tufekci states, social research is robust when the researcher is able to ‘start from the principle of understanding user behaviour first and should follow the user rather than following the hashtag’ (Tufekci 2014: 509). We do need to obtain a careful understanding of the technical structures and affordances, the ‘platform vernaculars’ (Gibbs et al. 2015: 257ff), the visual languages, and the many accompanying pitfalls contained in the fields in which we choose to work: researchers should be cognisant of the ranges of digital behaviour that may take place across platforms, as well as within them.
Rather than a detailed discussion of empirical data, I anticipate that the clear methodological and theoretical focus of this paper will invite wider interest in the potential of these approaches for researchers who wish to apply a deeper understanding of digital methods to a practical exploration of the role of archaeology in digital society. This paper aims to enhance the recent empirical work of archaeological scholars (e.g. Bonacchi, Altaweel & Krzyzanska 2018; Bonacchi & Krzyzanska 2019; Huffer & Graham 2017; 2018), which has not included explicit discussions of the methodological challenges of this type of research. This position paper should, I hope, encourage further robust, critical discussion amongst heritage and archaeological researchers working with contemporary digital data. I hope that it will further our efforts to share our unique experiences as scholars in this small field, provide encouragement for new researchers to join us, and support further interest in developments in the wider digital methods field, especially that of digital sociology. The Grounded Theory approach to these data environments proposed here could support nuanced and iterative data collection, and the pragmatism of using multiple approaches to data, big and small, will mean that ‘the biases and shortcomings of each method can be used to balance each other to arrive at richer answers’ (Tufekci 2014: 514). There is a broader role for academics to advocate collectively with political and regulatory bodies for access to platform data. I would also support Bruns’ call to social media researchers to conclude their papers with the call to action that ‘social media platforms provide transparent data access to critical, independent, public-interest research’ (Bruns 2019: 18). However, we must also be aware that our data sources are also used for nefarious purposes and hate speech, and platforms that improve privacy and data security are not necessarily undertaking anti-research actions.
It is essential that this type of archaeological fieldwork supports the development of complex and nuanced ethical digital research projects in the future, as well as the development of flexible yet robust methodologies with which to locate and analyse the rich evidence for human experiences and opinions about the past that can be found in the digital world today. Social/digital archaeology and heritage does not need to try to frame itself as a form of data science through a focus on hard data science research techniques and methodologies in order to undertake robust social research. Robust research will only arise when we can also follow the activities of the social media user, who is a rounded dynamic social being who deserves to be treated ethically, as well as being treated as a form and source of data.
Thanks to Eleftheria Paliou, Jeremy Huggett and Costas Papadopoulos for organising the writing workshop at the University of Cologne in September 2018, where this paper first took shape, and for reading versions of this paper. Thanks also to the anonymous reviewers for their help, to Simon Lindgren for the original direction for a previous version of this paper, and to Shawn Graham and Andrew Reinhard for their invaluable comments on the text.
In memory of Theresa O’Mahoney, founder of the Enabled Archaeology Foundation.
This article is based upon work from COST Action ARKWORK, supported by COST (European Cooperation in Science and Technology). www.cost.eu.
Funded by the Horizon 2020 Framework Programme of the European Union.
The author has no competing interests to declare.
Altaweel, M. 2019. The Market for Heritage: Evidence From eBay Using Natural Language Processing. Social Science Computer Review. DOI: https://doi.org/10.1177/0894439319871015
Andrejevic, M. 2013. Infoglut. How Too Much Information Is Changing the Way We Think and Know. New York: Routledge. DOI: https://doi.org/10.4324/9780203075319
Apaydin, V. 2017. Heritage values and communities: Examining heritage perceptions and public engagements. Journal of Eastern Mediterranean Archaeology & Heritage Studies, 5(3–4): 349–364. DOI: https://doi.org/10.5325/jeasmedarcherstu.5.3-4.0349
Boellstorff, T. 2013. Making big data, in theory. First Monday, 18(10). DOI: https://doi.org/10.5210/fm.v18i10.4869
Boellstorff, T, Nardi, B, Pearce, C and Taylor, TL. 2012. Ethnography and Virtual Worlds: A Handbook of Methods. Princeton, NJ: Princeton University Press. DOI: https://doi.org/10.2307/j.cttq9s20
Bonacchi, C, Altaweel, M and Krzyzanska, M. 2018. The Heritage of Brexit: Roles of the past in the construction of Political Identities through Social Media. Journal of Social Archaeology, 18(2): 174–192. DOI: https://doi.org/10.1177/1469605318759713
Bonacchi, C and Krzyzanska, M. 2019. Digital heritage research re-theorised: Ontologies and epistemologies in a world of big data. International Journal of Heritage Studies, 25(12): 1235–1247. DOI: https://doi.org/10.1080/13527258.2019.1578989
boyd, d and Crawford, K. 2012. CRITICAL QUESTIONS FOR BIG DATA. Information, Communication & Society, 15(5): 662–679. DOI: https://doi.org/10.1080/1369118X.2012.678878
Bratton, B. 2016. The Stack: On Software and Sovereignty. Cambridge MA: MIT Press. DOI: https://doi.org/10.7551/mitpress/9780262029575.001.0001
British Psychological Society. 2017. Ethics Guidelines for Internet-Mediated Research. Available at: https://www.bps.org.uk/news-and-policy/ethics-guidelines-internet-mediated-research-2017. [Last accessed 2 October 2019].
British Sociological Association. 2019. Ethics Guidelines and Collated Resources for Digital Research. Available at: https://www.britsoc.co.uk/media/24309/bsa_statement_of_ethical_practice_annexe.pdf. [Last accessed 2 October 2019].
Bruns, A. 2019. After the ‘APIcalypse’: social media platforms and their fight against critical scholarly research. Information, Communication & Society, 22(11): 1544–1566. DOI: https://doi.org/10.1080/1369118X.2019.1637447
Buchanan, E and Markham, A. 2012. Ethical Decision-Making and Internet Research. Association of Internet Researchers. Available at: https://aoir.org/reports/ethics2.pdf. [Last accessed 2 October 2019].
CAA. 2018. Ethics Policy. Available at: http://caa-international.org/about/ethics-policy/. [Last accessed 2 October 2019].
Caliandro, A. 2018. Digital methods for ethnography: analytical concepts for ethnographers exploring social media environments. Journal of Contemporary Ethnography, 47(5): 551–578. DOI: https://doi.org/10.1177/0891241617702960
Constine, J and Hatmaker, T. 2018. Facebook admits Cambridge Analytica hijacked data on up to 87m users. TechCrunch. Available at: https://techcrunch.com/2018/04/04/cambridge-analytica-87-million/. [Last accessed 2 October 2019].
Cunliffe, E and Curini, L. 2018. ISIS and heritage destruction: A sentiment analysis. Antiquity, 92(364): 1094–1111. DOI: https://doi.org/10.15184/aqy.2018.134
DiscoverText. 2019. About. Available at: https://discovertext.com/about/. [Last accessed 2 October 2019].
Duggan, M. 2017. Questioning ‘digital ethnography’ in an era of ubiquitous computing. Geography Compass, 11(5): e12313. DOI: https://doi.org/10.1111/gec3.12313
Facebook Newsroom. 2019. Cleaning Up Data Access for Partners. Available at: https://newsroom.fb.com/news/2019/07/cleaning-up-data-access/. [Last accessed 2 October 2019].
Gibbs, M, Meese, J, Arnold, M, Nansen, B and Carter, M. 2015. #Funeral and Instagram: death, social media, and platform vernacular. Information, Communication & Society, 18(3): 255–268. DOI: https://doi.org/10.1080/1369118X.2014.987152
Ginzarly, M, Roders, AP and Teller, J. 2019. Mapping historic urban landscape values through social media. Journal of Cultural Heritage, 36: 1–11. DOI: https://doi.org/10.1016/j.culher.2018.10.002
Glaser, B and Strauss, A. 1967. The Discovery of Grounded Theory: Strategies for Qualitative Research. Mill Valley, CA: Sociology Press. DOI: https://doi.org/10.1097/00006199-196807000-00014
Greenland, F, Marrone, J, Topçuoğlu, O and Vorderstrasse, T. 2019. A Site-Level Market Model of the Antiquities Trade. International Journal of Cultural Property, 26(1): 21–47. DOI: https://doi.org/10.1017/S0940739119000018
Halavais, A. 2019. Overcoming terms of service: a proposal for ethical distributed research. Information, Communication & Society, 22(11): 1567–1581. DOI: https://doi.org/10.1080/1369118X.2019.1627386
Hine, C. 2000. Virtual Ethnography. London: Sage. DOI: https://doi.org/10.4135/9780857020277
Hjorth, L, Horst, H, Galloway, A and Bell, G. (eds.) 2017. The Routledge Companion to Digital Ethnography. New York: Routledge. DOI: https://doi.org/10.4324/9781315673974
Huffer, D and Graham, S. 2017. The Insta-Dead: the rhetoric of the human remains trade on Instagram. Internet Archaeology, 45. DOI: https://doi.org/10.11141/ia.45.5
Huffer, D and Graham, S. 2018. Fleshing Out the Bones: Studying the Human Remains Trade with Tensorflow and Inception. Journal of Computer Applications in Archaeology, 1(1): 55–63. DOI: https://doi.org/10.5334/jcaa.8
Huffer, D, Wood, C and Graham, S. 2019. What the Machine Saw: some questions on the ethics of computer vision and machine learning to investigate human remains trafficking. Internet Archaeology, 52. DOI: https://doi.org/10.11141/ia.52.5
Huggett, J. 2018. Reuse remix recycle: repurposing archaeological digital data. Advances in Archaeological Practice, 6(2): 93–104. DOI: https://doi.org/10.1017/aap.2018.1
Instagram. 2019. Platform Policy. Available at: https://www.instagram.com/about/legal/terms/api/. [Last accessed 2 October 2019].
Kamash, Z. 2017. ‘Postcard to Palmyra’: bringing the public into debates over post-conflict reconstruction in the Middle East. World Archaeology, 49(5): 608–622. DOI: https://doi.org/10.1080/00438243.2017.1406399
Kowalczyk, S. 2016. Excavating the “Who” and “Why” of Participation in a Public Archaeology Project. Advances in Archaeological Practice, 4(4): 454–464. DOI: https://doi.org/10.7183/2326-3768.4.4.454
Kozinets, RV. 1998. On Netnography: Initial reflections on consumer research investigations of cyberculture. In: Alba, JW and Hutchinson, JW (eds.), NA – Advances in Consumer Research, 25: 366–371. Provo, UT: Association for Consumer Research.
Kozinets, RV. 2015. Netnography. In: Ang, PH and Mansell, R (eds.), The International Encyclopedia of Digital Communication and Society. DOI: https://doi.org/10.1002/9781118767771.wbiedcs067
Lewis, K. 2015. Three fallacies of digital footprints. Big Data & Society, 2(2): 1–4. DOI: https://doi.org/10.1177/2053951715602496
Lingard, L, Albert, M and Levinson, W. 2008. Grounded theory, mixed methods, and action research. BMJ, 337: a567. DOI: https://doi.org/10.1136/bmj.39602.690162.47
Lupton, D. 2016. Digital companion species and eating data: Implications for theorising digital data–human assemblages. Big Data & Society, 3(1): 1–5. DOI: https://doi.org/10.1177/2053951715619947
Natural History Museum. 2018. The First Brit: Secrets of the 10,000-year-old man. Available at: http://www.nhm.ac.uk/press-office/press-releases/the-first-brit--secrets-of-the-10-000-year-old-man.html. [Last accessed 2 October 2019].
Niklasson, E and Hølleland, H. 2018. The Scandinavian far-right and the new politicisation of heritage. Journal of Social Archaeology, 18(2): 121–148. DOI: https://doi.org/10.1177/1469605318757340
Noble, SU. 2019. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: NYU Press. DOI: https://doi.org/10.2307/j.ctt1pwt9w5
Oteros-Rozas, E, Martín-López, B, Fagerholm, N, Bieling, C and Plieninger, T. 2018. Using social media photos to explore the relation between cultural ecosystem services and landscape features across five European sites. Ecological Indicators, 94(2): 74–86. DOI: https://doi.org/10.1016/j.ecolind.2017.02.009
Puschmann, C. 2019. An end to the wild west of social media research: a response to Axel Bruns. Information, Communication & Society, 22(11): 1582–1589. DOI: https://doi.org/10.1080/1369118X.2019.1646300
Richardson, L-J. 2018. Ethical challenges in digital public archaeology. Journal of Computer Applications in Archaeology, 1(1): 64–73. DOI: https://doi.org/10.5334/jcaa.13
Richardson, L-J and Lindgren, S. 2017. Online Tribes and Digital Authority: What Can Social Theory Bring to Digital Archaeology? Open Archaeology, 3(1): 139–148. DOI: https://doi.org/10.1515/opar-2017-0008
Rogers, R. 2018. Digital Methods for Cross-platform Analysis. In: Burgess, J, Marwick, A and Poell, T (eds.), The SAGE Handbook of Social Media, 91–110. London: Sage. DOI: https://doi.org/10.4135/9781473984066.n6
Selena, S and Kenney, M. 2019. Algorithms, Platforms, and Ethnic Bias: A Diagnostic Model. Communications of the Association of Computing Machinery, 62(11): 37–39. DOI: https://doi.org/10.1145/3318157
Social Science One. 2019. Home. Available at: https://socialscience.one. [Last accessed 2 October 2019].
Tsatsou, P. 2018. Literacy and training in digital research: Researchers views in five social science and humanities disciplines. New Media and Society, 20(3): 1240–1259. DOI: https://doi.org/10.1177/1461444816688274
Tufekci, Z. 2014. Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls. Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media, 505–514. https://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8062.
Twitter Help Center. 2019. About Twitter APIs. Available at: https://help.twitter.com/en/rules-and-policies/twitter-api. [Last accessed 2 October 2019].
Walker, S, Mercea, D and Bastos, M. 2019. The disinformation landscape and the lockdown of social platforms. Information, Communication & Society, 22(11): 1531–1543. DOI: https://doi.org/10.1080/1369118X.2019.1648536
Ward, K. 1999. Cyber-ethnography and the emergence of the virtually new community. Journal of Information Technology, 14(1): 95–105. DOI: https://doi.org/10.1080/026839699344773
Whitehead, TL. 2005. Basic classical ethnographic research methods. Cultural Ecology of Health and Change, 1: 1–29. Available at: http://www.dphu.org/uploads/attachements/books/books_5014_0.pdf. [Last accessed 2 October 2019].
Zelenkauskaite, A and Bucy, EP. 2016. A scholarly divide: Social media, big data, and unattainable scholarship. First Monday, 21(5). DOI: https://doi.org/10.5210/fm.v21i5.6358
Zuanni, C. 2017. Unintended Collaborations: interpreting archaeology on social media. Internet Archaeology, 46. DOI: https://doi.org/10.11141/ia.46.2