How to reference this paper:
Dourish, P., E. Gómez Cruz, H. Horst, D. Lupton, S. Pink, J. Postill, S. Sumartojo, D. Verhoeven (2016) DATA ETHNOGRAPHIES (4): Data Stories. Available online at https://dataethnographies.com/paper-iv-data-stories/
Data Ethnographies 4: Data Stories
Paul Dourish, Edgar Gomez Cruz, Heather Horst, Deborah Lupton, Sarah Pink, John Postill, Shanti Sumartojo, Deb Verhoeven
Following on from earlier discussions in the Data Ethnographies series, we turned during a workshop in June to the relationship between data and story-telling. In the contemporary imagination, data – particularly, of course, quantitative data and sensor data – have come to occupy a central role in our conceptions of scientific practice, corporate value-creation, and governance. Throughout these workshops, we have sought to examine this rhetoric carefully from the perspective of qualitative social science. In thinking about data and story-telling, our interest is sparked by three questions. The first is, to what extent do data speak for themselves? That is, to what extent can data be made operative outside of narrative frames that make sense of, explain, account for, and historicize them? The second is, how do data allow us to speak? That is, what are the processes by which data are pressed into use and used to anchor or develop broader cultural understandings of the people, processes, and phenomena that underlie them? Third, what silences characterize data narratives? How do the processes of identification, selection, and curation that characterize large-scale data efforts leave some things unsaid, and with what consequences?
Our intuition is that ethnographic praxis has something in particular of value to offer here. Ethnographers work with data; immersive ethnographic practice is enormously data-rich, and indeed, as Strathern (2003) comments, the practice of ethnography is a way of generating more data than the ethnographer herself may be aware of at the time. But the data with which ethnography works is deeply contextualized and is narrated in the sense that what we must do is to assemble data so as to reveal stories and the patterns of those stories. Indeed, some ethnographers would argue against the use of the term ‘data’ to refer to ethnographic research materials, specifically in order to differentiate ethnographic knowing from data as understood in other contexts (Pink 2013, 2015). Ethnographers narrate their settings and sites, and so perhaps naturally when we come to think about ethnography and data, in the context of contemporary interests in big data and quantitative analytics, we are attentive to the ways in which data might be narrated or might need to be narrated, or indeed the ways in which the narratives within which data make sense might disappear as data are disembedded from them.
These issues are pressing because the questions of data, voice, and perspective clearly shape the ways in which data can be effectively put to use in settings from education to civic engagement, and from corporate governance to transnational economic management. Indeed, we took inspiration in part from ethnographic studies of central bankers, who, with massive volumes of data available to them, nonetheless found it necessary to work in terms of stories about data in order to make their case, enroll others in their point of view, explain the consequences of macroeconomic shifts, and determine the relative significance of different data-driven analyses (Holmes 2014). In other words, it is not simply a question of the ways in which we might tell stories with, around, or through data, but rather about recognizing the ways in which we already do and the impacts and possibilities that such narration might open up.
An Anchoring Example
Our workshop discussions were anchored by a case from Dourish’s prior research, an investigation of the use of GPS technology to monitor the movements of male paroled sex offenders in California (Troshynski et al. 2008, Shklovski et al. 2009, Shklovski et al. 2015). The focus of this case in our discussion was the interplay between gathering data (ostensibly objective and precise formulations of people’s locations) and practices of story-telling: with the data, through the data, and about the data.
Story-telling with the data includes the way in which parolees might make use of the availability of location data in order to defend against accusations of being in the wrong place at the wrong time, or being suspected of illegal action. As a counter to police hassles, they might point to the data record as an alibi that supports their own position in a contest of different narratives. Story-telling through the data captures the idea that parole officers are concerned not so much with the moment-by-moment location of their cases but rather with patterns of action that might signal recidivist tendencies, inappropriate associations, or other problematic trajectories that might require intervention. Accordingly, the parole officers find themselves needing to narrate the data: that is, to tell stories about it in order to make sense of it. Story-telling about the data is also needed in order to explain gaps, absences, or unexpected signals. These stories speak to the processes of data production, accounting for the vagaries of software filtering, of hardware failures, of battery charges, satellite access, network coverage, and device characteristics.
What this example begins to reveal is the way in which sensor-derived data nonetheless needs to be accounted for, both in terms of its production and in terms of its consequences, within social settings. In seeing or framing data as a trace of an event or an action, we inherently invoke narrative elements: actors, motives, expectations, actions, types, histories, proclivities, habits, intents, and so on. It is these elements that help us make sense of data as it moves around in the world: as it moves from technical settings into social or organizational ones, for example, or as it moves between different institutions. In these settings, data are narrated differently, and made to operate within different interpretive frames. By the same token, within each of these settings, conventional (if evolving) sets of tropes are invoked in order to make sense of different data streams; we might expect to see different data streams generated by domestic appliances and embedded devices linked by an “Internet of Things” in different Australian homes, but we nonetheless expect some common narratives of domestic life to appear – narratives of rhythm and routine, of intimacy and care, of chores, celebrations, and sleep.
The Cultural Grounding of Data Narratives
A critical point to emphasize in both the anchoring case and others, then, is that the stories that arise in these settings do not rely solely on the data. Elements of these narratives are pre-figured and possibly anticipated. So it is always with stories; stories operate in terms that we recognize and that are culturally available to us, and so classical narrative forms – of enlightenment or of fall, of struggle and of transcendence, of emergence and of transformation – are both broadly culturally available and emerge in conversation with the settings and moments of narrative production. Similarly, in the anchoring case of the parolees, we noted the ways in which the stories that the data supported were stories that were already available to parole officers about types of offenders, likely patterns of behavior, trajectories of action, traditional forms of danger, sorts of places where people might be, the kinds of risks that they might encounter there, and so on. Each pass through the data tells a story that’s new, but the stories are populated and furnished with familiar elements. Similarly, cultural groundings of narrative establish not only conventions of presence, but also conventions of absence – those elements and aspects of the account that are traditionally left out, neglected, or placed to one side. A narrative about movement, for example, is simultaneously and pointedly not a narrative about age, about race, or about gender. These absences themselves speak, if we remember to attend to them.
What this further brings to our attention are the histories and geographies of data and those of data narratives. As accounts of phenomena in the world, particular forms of data – such as the reports of latitude and longitude that are the foundation of GPS tracking – are embedded within regimes of measurement and management, with their own histories and spatialities. Data formats and data representations co-evolve with programs of data use and with anticipated needs. Data have their own histories, then, and their own geographies too, since these regimes of measurement and management are unevenly distributed in the world and are often used to produce logics of spatial experience (the regularization of space through latitude and longitude sits uncomfortably, for example, with indigenous Australian accounts of space that are grounded in relational experience, radiating centers of power, and contemporary encounters with ancestral events). At the same time, data narratives, the stories that we can tell about, through, and with data, have their own histories and geographies. They reflect understandings and experiences that have grown up differentially in different parts of the world, or that differently reflect the experiences grounded in gender, sexuality, ethnicity, and economic status, to name a few. For example, drawing on the work of Fiore-Gartland and Neff (2015), Deborah Lupton drew our attention to the example of ‘data valences’ whereby patients’ and medics’ narratives about data tend to be different. Each group gives a different value to the same data: the data narratives that result, therefore, are different and used for different purposes. Further, these narratives have different ways of moving around in the world, through their proximity to or currency within groups with differing access to media and channels of distribution, from the microwave antenna to the epic poem (cf. Gitelman 2013).
It is important to emphasise that these two patterns of histories and geographies, those of data on the one hand and data narratives on the other, are not themselves the same.
What this suggests is that the process of holding data, with their histories and geographies, together with data narratives, with their own embeddings, is always both provisional and fraught; a temporary alignment that is always destined to be torn apart as both data and narrative evolve. The stories that data can tell are always stories here-and-now, stories that reflect specific perspectives that may look quite different in the morning. Data narratives have their own politics that shape how they are configured and employed. They may also be internally conflicted and contradictory. Some parts of a data narrative will resonate or “make sense” while other parts may not.
Understanding how both data and narratives are embedded within their own histories might alert us too to other questions of data’s temporalities that shape the stories that data tell. Two concern us here: data and narrative dynamics and data-driven futures.
In the realm of big data analytics, dynamism is a key component. Data are not simply fixed in place; they are being generated continually and they constantly shift and develop. So central is this idea to the data rhetoric that “velocity” is one of the four “V”s by which data science researchers characterize the conditions of “big data” analysis (the other three being volume, variety, and veracity). Velocity here speaks to the rate at which data streams are generated and must be processed – the idea that behind each data item is another and yet another, coming quickly. This implies that the rates of data processing must be matched to the rates of data generation, but it implies too that each item is meaningful primarily as an element of a sequence, fast-paced and dynamically evolving.
In this, then, data narratives help to “fix” – or perhaps “get a fix on” – data temporally. That is, the accounts that data narratives offer are ones that make sense of data within an evolving context, and so stabilize it in the sense that they situate it within a landscape of recognizable objects. This is not to suggest that the data become immutable or unchanging, but rather that they are rendered stable and accountable within the terms that a narrative offers; data may evolve and change but they do so within a stable frame. Here, we echo themes that have arisen in previous workshops that have addressed themselves to questions of uncertainty and to those of data futures: data narratives in this context help to stabilize data by shifting the temporal scale and giving data meaning even in advance of the inevitable arrival of new, unknown and unknowable signals.
The last issue of data and temporalities that we address here concerns the way that data are conjured as a means to provide insight into future events, as a matter of both projection and prediction – the idea of data-driven futures.
Of course, an account of data narratives might question the idea that we are data-driven at all, no matter what the dominant rhetoric. Are we driven by the data, or by the stories that the data let us tell? Are we oriented towards data, or are we oriented towards the narrative logics from which those data spring, and through which they come to have relevance? What does an orientation towards data open up and what does it obscure? In the context of an examination of data narratives, the role of data as a “driver” becomes a matter of considerable potential dispute.
Even if “data-driven” accounts are themselves as much products of narrative processes and logics as they are of data itself, our turn towards topics of “data-driven futures” speaks to the imagined trajectories of data and to the drives for projection and prediction. Data in this account are understood to signal or organize themselves into patterns that project into the future as well as the past; data anticipate. However, these anticipations point backwards as much as they do forwards when they are couched or narrated in terms of pre-figured objects and categories and when they are reinforced through their hold upon new kinds of data-centred investigations.
Despite claims that data has supplanted theory (e.g. Anderson 2008), the need for narrative remains unquestionable; data make sense only to the extent that we have frames for making sense of them, and the difference between a productive data analysis and a random-number generator is a narrative account of the meaningfulness of their outputs. Significant research opportunities arise to examine the ways in which narrative elements are mobilized in relation to data and how these might move or operate both independently of the data and in ways that the data cannot. This is not to suggest that data can somehow be independent of narrative, but rather that specific data and specific narratives are not necessarily bound together.
The particular significance of the narrative perspective is both how it animates a series of culturally-available tropes – actors, motives, encounters, and so on – and also how it lends a temporal arc to data and the objects that the data is read to represent. These speak importantly to the cultural embeddings of data narratives, and perhaps to questions of “decolonizing” data (cf. Smith 2002) or at least recognizing the role that these embeddings play in the creation of meaning and the mobilization of action around data that might otherwise seem to speak for itself.
Ethnographic approaches to data offer us an ideal way in which to investigate the narratives that are constituted around data, how these narratives are experienced by diverse groups, how they might come up against or contest each other, and what they mean in the context of relations of power and intimacy.
At the same time, a focus on data narratives itself creates a powerful call towards ethnographies of futures. It suggests that if we seek to understand ethnographically where futures are constituted in the present, one of our objects of study needs to be precisely the question of how narratives are engaged in predictive data analytics, as well as in everyday uses of personal data. Understanding how these narratives situate data in relations of power, time and space is fundamental to the task of militating for a responsible and ethical approach to data. This implies a need for a research agenda that includes ethnographies of how predictive big data analytics are forged, interpreted and used (that is, engaged with and for the making of narratives) across different organisations, and of how personal digital data is implicated in the ways that people envisage or narrate their own futures.
References
Anderson, C. (2008) The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired, June 23. http://www.wired.com/2008/06/pb-theory/
Fiore-Gartland, B. and Neff, G. (2015) Communication, Mediation, and the Expectations of Data: Data Valences across Health and Wellness Communities. International Journal of Communication, 9, 1466-1484.
Gitelman, L. (ed.) (2013) “Raw Data” Is an Oxymoron. Cambridge, MA: MIT Press.
Holmes, D. (2014) Economy of Words: Communicative Imperatives in Central Banks. Chicago, IL: University of Chicago Press.
Pink, S. (2013) Doing Visual Ethnography. 3rd edition. London: Sage.
Pink, S. (2015) Doing Sensory Ethnography. 2nd edition. London: Sage.
Shklovski, I., Vertesi, J., Troshynski, E., and Dourish, P. (2009) The Commodification of Location: Dynamics of Power in Location-Based Systems. Proc. Intl. Conf. Ubiquitous Computing Ubicomp 2009 (Orlando, FL), 11-20.
Shklovski, I., Troshynski, E., and Dourish, P. (2015) Mobile Technologies and Spatiotemporal Configurations of Institutional Practice. Journal of the Association for Information Science and Technology, 66(10), 2098-2115.
Smith, L.T. (2002) Decolonizing Methodologies: Research and Indigenous Peoples. Zed Books.
Strathern, M. (2003) Commons and Borderlands: Working Papers on Interdisciplinarity, Accountability and the Flow of Knowledge. Sean Kingston.
Troshynski, E., Lee, C., and Dourish, P. (2008) Accountabilities of Presence: Reframing Location-Based Systems. Proc. ACM Conf. Human Factors in Computing Systems CHI 2008 (Florence, Italy), 487-496.