Paper V: Broken Data

How to reference this paper:

Pink, S., Ruckenstein, M., Willim, R., Ardèvol, E., Berg, M., Duque, M., Fors, V., Lanzeni, D., Lapenta, F. and Lupton, D. (2016) DATA ETHNOGRAPHIES (5): Broken Data. Available online at


Sarah Pink, Minna Ruckenstein, Robert Willim, Elisenda Ardevol, Martin Berg, Melisa Duque, Vaike Fors, Debora Lanzeni, Francesco Lapenta, Deborah Lupton


Figure 1: a preview of our final discussion board.

In this position paper we discuss how and when it might be useful to apply the concept of Broken Data.

We began our Broken Data workshop with the following provocation, which was intended to disrupt certain understandings and treatments of data by considering theories that put notions of breakage at their centre:

In a world where predictive big data analytics and data driven policy and design are increasingly prevalent, the concept of broken data seeks to interrogate and disrupt the possibilities associated with these trends. Concepts of breakage, damage and repair, and recent literatures about ‘broken world’ type theories, offer us an alternative starting point: what are the implications of putting these concepts at the centre of our understanding of digital data and its futures? By whom and where does data explicitly and more invisibly manifest itself as broken, incomplete and damaged? How is it repaired?

What might an agenda for broken data research look like? And why might we need one?

This position paper works through some of the disruptions, issues, challenges and opportunities implied by such an approach, as well as suggesting where this approach is challenging or difficult to apply, and where there might be alternative renderings of some of the principles it advocates. It ends with a question that has resonated throughout our series of workshops to date: ‘how to break the logic of Big Data’.

The world as broken

In recent literatures, themes and practices of decay, repair, and displacement have been put at the centre of an analytical narrative that seeks to develop, as Fernando Domínguez Rubio writes, ‘an approach that takes seriously the seemingly banal fact that things are constantly falling out of place’ (2016: 60). This refers to a world where Caitlin Desilvey proposes we should ‘accept that the artefact is not a discrete entity but a material form bound into continual cycles of articulation and disarticulation’ (2006: 333), and that ‘decay reveals itself not (only) as erasure but as a process that can be generative of a different kind of knowledge’ (2006: 323). Both these existing works emerge from a material culture studies tradition that focuses on material artefacts/things; Domínguez Rubio discusses art restoration and Desilvey has focused on material repair of everyday objects. A similar approach has also been advanced by Steven Jackson. He asks ‘what happens when we take erosion, breakdown, and decay, rather than novelty, growth, and progress, as our starting points in thinking through the nature, use, and effects of information technology and new media’ (2014: 174). Through this focus on the materiality of technology, Jackson likewise seeks to advance a theory of the world as being ongoingly broken, but in a way that is ‘generative and productive’ and ‘always being recuperated and reconstituted through repair’ (2014: 175).

As shown in recent research undertaken by participants in our data ethnographies series, a focus on the digital materiality (Pink, Ardevol and Lanzeni 2016) of everyday artefacts and experience, including that of software (Dourish 2016) and data (Pink, Sumartojo, Lupton and Heyes Labond forthcoming), offers a particular way to consider how the digital and material can no longer be thought of as separate elements of our worlds, but are better considered part of a continually emergent configuration of things (Pink, Ardevol and Lanzeni 2016). This makes it equally possible to apply a ‘broken world’ approach to our understanding of digital data, as part of our everyday digital materiality, which is what we explore in this paper.

This is a way of thinking that, as Domínguez Rubio argues, ‘opens up an entirely different approach, one that takes temporality, fragility and change as the starting points of our enquiry’ (2016: 60). In particular, it offers a radically different approach to data from that suggested by scholars such as Rob Kitchin, who seeks to define big data through other types of characteristics. We discussed this in position paper 1:

For example, in a series of articles Rob Kitchin identified Big Data as ‘huge in volume’; ‘high in velocity, being created in or near real-time’; ‘diverse in variety’; ‘exhaustive in scope’; ‘fine-grained in resolution and uniquely indexical in identification’; ‘relational in nature’; and ‘flexible, holding the traits of extensionality … and scaleability’ (Kitchin 2014). His most recent analysis calls for an interrogation of the ontology of Big Data, suggesting that the most important of the previously identified characteristics are velocity and exhaustivity, and emphasising that it is important that we acknowledge the seemingly obvious but previously ignored point that there are ‘multiple forms of Big Data’ (Kitchin and McArdle 2016). We would add, in the spirit of this definition, that all types of Big Data are drawn from small data that are continually, or at least ongoingly, produced, but perhaps in ways that are not consistent across even the same technologies or in the trajectories of individuals (and that ethnography is particularly well equipped for researching such sites of production), and that might be contingent on the ways in which living in a world of data is experienced. That is, a closer emphasis is needed on the contingencies and circumstances under which data are constituted and experienced, and these considerations also need to be accounted for in how data are defined.

Through this emphasis on its materiality, we seek to explore further how the ‘thingness’ of digital data, as leaky (Pink, Sumartojo, Lupton and Heyes Labond forthcoming) and open, as discussed in position paper 2, can underpin understandings of what people do with and how they experience data.

There are several other metaphors that we might use to understand how digital data are configured, and how they may make sense or not to those who seek to use them. One metaphor is that of lively data. This incorporates the idea that digital data are about life itself when they are generated about humans and other living creatures; that they have their own social lives as part of a constantly changing digital data economy; that they have effects on people’s everyday lives, potentially changing the ways in which people view and conduct themselves or the decisions that others make about them; and finally, that they have become part of livelihoods, generating income in the digital data economic market for actors like the internet empires and data harvesting companies, and also contributing to governmental, managerial and research endeavours (Lupton 2016a, 2016b, 2016c). If digital data are conceptualised as lively, thinking through what happens when they fail to work or be useful can involve seeing these data as dying, dead, decaying, ageing, dirty, contaminated, worn out or sick. We might begin to think about the lifespans of data, or even conceptualise them as companion species or new forms of life (Lupton 2016c). Alternatively, employing another set of organic/nature metaphors, we could talk about digital data as liquid entities (as in the common references to ‘data deluges’, ‘data tsunamis’ and ‘data flows’) but also as subject to being blocked, stuck, leaking or frozen when these flows are halted or diverted in some way (Lupton 2015).

Here we look at these approaches specifically in relation to the broken world theories that we have introduced above. We explore what broken world theories can contribute to such understandings, as well as examining where a greater emphasis on the concepts of lively, frozen or leaky data (Pink et al forthcoming) is better suited to guide an analysis. Following on from this, we are interested in reflecting on the implications of this for how we might critically redefine and understand the analytic, determinate and predictive potential and capacities of Big Data. This perspective invites us to enter the world from a particular researcher perspective that is not always ethnographic as such, but which lends itself to ethnographic approaches because it focuses on the practical activity that people ongoingly engage in, in relation to things, the improvisatory processes they become involved in, and the contingencies that this involves. This paper establishes an initial position on these points, to be developed further ethnographically in an article.

Our workshop discussion drew on three different examples of ways in which the data generated by digital devices and online interactions might get broken and be repaired, presented by Sarah Pink (self-tracking data), Minna Ruckenstein (data analytics and cleaning) and Robert Willim (digital creative practice data). In the next section we outline these briefly (they will be elaborated further in published work) to provide context for our position, and to indicate empirically how and where we have found it useful to apply a concept of ‘broken data’, and where we have found it less applicable.

When data get broken:

Example 1: Broken Tracks (Sarah Pink)

Sarah Pink presented an idea of broken data through the ethnographic examples of self-tracking activities and broken wearables. Here there were two ways in which data could be seen as broken. First, in the sense of their relationship to the wearable itself, which, when it breaks or breaks down in its relationship to software, stops collecting/producing data. While we have no statistical data on this, it seems at least anecdotally that it is not infrequent for the technologies to break in this way, and indeed, as noted in an earlier position paper, many users are short term. Figure 2 shows how one of the participants in Pink’s research was wearing a broken self-tracking wearable on his wrist at the time of their meeting. This led to an interesting discussion about how this had happened, why he was still wearing it, and how the wearable would be replaced. Here it is not only the technology that breaks: the technological, personal and data narratives that are inextricably entangled with it are also partially disrupted. While in one sense the personal sensory experience of wearing the technology continued, since it was still worn, another implication of these ruptures or breaks in narrative is that the data themselves can be seen as broken: they become cut off, incomplete, or have gaps and breaks. This means the data are not a coherent representation of what has happened, but instead exist as a series of fragments. Or, putting this another way, what it was meant to be, or was assumed it would be, has been shattered. Although of course all data begin as fragments, and must be manipulated to make sense, the difference here is that the processes and technologies that participate in this process of making sense become broken.

For instance, in Pink’s own autoethnography of using self-tracking technologies, as well as for the participant whose broken wearable is shown in figure 2, the relationship between our self-tracking devices and software broke down. This happened for different reasons – because the wearables themselves were broken, and also because a smartphone update meant that the app and wearable no longer worked together. Why is this an occasion to refer to the data as broken? Because when this relationship decays the data are no longer workable: the connections between the technologies are broken and the data themselves cannot be assembled (see the concepts that are used below to discuss what happens when data are broken or repaired). The only way to repair the data, or the data flow, is to repair the connection between the hardware and the software, independently of which the data do not exist.


Figure 2: the broken wearable. Copyright Sarah Pink.

The way these fragments are constituted does not follow a logic of completeness but rather a logic of contingency and improvisation, which is related to processes of damage, decay and sometimes (but not always) repair or restoration (e.g. when the wearable is replaced with a new one, or the software is updated). Such data are not ‘continuous’; rather, they are fragile and vulnerable to processes of decay and breakage and to practices of repair and mending. This example calls for further investigation of what happens if we put these concepts of decay, breakage, repair and mending at the centre of our attention to digital data, and engage them to understand how data emerge as part of the environments in which we live.

Indeed, the examples that emerge from the self-tracking research suggest that we have much to learn about personal data, and about what they can possibly mean, when we decentre them from a narrative of digital data as all-knowing and instead understand them through a narrative of data as something that can be broken, damaged and repaired, and moreover as something that is made sense of by people in their everyday worlds precisely through the practices that centre on these activities with data and technologies.

Example 2: Repairing and freezing data (Minna Ruckenstein)

Minna Ruckenstein contemplated the notion of broken data, and connected practices of breakage and repair, through her involvement in the ‘Citizen Mindscapes’ initiative, an interdisciplinary open data project that contextualizes and explores a Finnish-language social media data set (‘Suomi24’, or Finland24 in English), consisting of tens of millions of messages and covering social media over a time span of 15 years. The data in question are originally proprietary data owned by Aller, and their use in research therefore constitutes secondary use of data.

The guiding question of the presentation was: how does an open data initiative, such as Citizen Mindscapes, contribute to, or question, broken data thinking? The analysis of large data sets calls for new kinds of research methods and reflexivity from researchers, particularly if the aim is to disrupt existing big data logic. In this context, breakage means that the data initiative departs from dominant modes of social media data mining, which mainly concentrate on profiling users and personalizing services. The data initiative is not interested in user categorizations; rather, the goal is to identify patterns and shaping principles that stimulate and condition emotional and topical waves in social media discussion.

Ruckenstein described the production of big data as socially situated. From the research perspective, company-owned social media data require repair and cleaning work. The data might appear as broken, discontinuous, or interrupted, and the gaps, errors and anomalies in the data need to be clarified (see Figure 3).


Figure 3: Identified gaps and anomalies in the Suomi24 data.

For some researchers, knowledge of all possible breakages in the data is of utmost importance. In the field of statistics, for instance, research findings cannot be published without intimate knowledge of the inconsistencies of the data. In an attempt to tame the data and to make them workable, the Citizen Mindscapes project is also setting up a database to keep the data still: the data are transformed into ‘frozen’, or ‘immutable’, data.

Ruckenstein suggested that the brokenness of data is a question of perspective: what appears as broken to some researchers might be irrelevant to others, or a research opportunity. The anomalies of the data tell of ruptures in the technological infrastructure, or of automated spam messaging. The challenge is to develop research frameworks that explore social media in the context of everyday rhythms and conversational landscapes, and treat breakages of the data as characteristic of the patterning of social media. The brokenness of data reminds us of the continuously flowing, situated and possibly discontinuous nature of data, suggesting that we need to actively question data production and the diverse ways in which data are, or could be, adapted for different ends by practitioners.

The presentation ended with a series of questions highlighting the need to develop methods of analysis for continuously updating, flowing and breaking data. In a research project such as Citizen Mindscapes, social media data mining uses secondary data in ways not intended by the producers and custodians of the data. By combining ‘frozen data’ with other data sources, the data can regain their liveliness. Who keeps the data lively, and how, is a question that offers insights into data practices, paving the way for further exploration of everyday data relations (Ruckenstein 2016). Similarly important is the question of who breaks the data. One option is to highlight breakages of data in different contexts and compare them. Users might break the data through non-use and misuse of technologies, and those breakages can reveal affective and political orientations to data. ‘Refusing to share data is becoming a political act’, as Moore and Robinson (2015, p. 14) suggest. The politics of breaking and repairing data lead to questions of data infrastructures and knowledge dissemination frameworks. Data should not be seen apart from the sociomaterial structures that support data work.

Example 3: Noise and Data (Robert Willim)

Robert Willim started his presentation by discussing how data can be understood when they are packaged into, for example, a sound file. In that case data should be understood in relation to the ways they are intended to be processed or used. The distorted, often unappreciated parts of a sound file consist of data, and so do other parts of the file that we could experience as, for example, a voice uttering a sequence of words. What we hear as a distorted glitch becomes categorised as noise in relation to other parts of the file that we define as a meaningful and expected signal, or as appreciated content. It is only in the light of intended uses that some data might be considered broken, split, fractured, malfunctioning or noisy (Demers 2010, Krapp 2011, Schwarz 2011).

The removal or filtering out of noise is an important part of data management. Noise abatement can be seen as a process of maintenance or repair. However, like all sorting and arrangement this maintenance, this filtering, has its own politics, aesthetics and peculiarities (Willim 2013, 2014). If we assign the filtering to algorithms, someone or something has to decide how these algorithms sort out noise or smooth a data set or a data stream.

Willim went on to discuss broken data in relation to one of his artworks. Close to Nature was a video work that built on ideas of an iterative noise-inducing process, making it an extension of earlier works like Alvin Lucier’s I Am Sitting in a Room (see Kahn 2009) or Nam June Paik’s ‘live feed’ works. Close to Nature probed ideas about how visual experiences of Nature are evoked and engendered through uses of technology.

Broken data IV.png

Figure 4: The image shows stills from the nine iterations of the artwork Close to Nature (2011) that give perspective on what might be seen as data and the fluid relationship between signal and noise.

The work was based on a short initial video clip that was re-recorded a number of times. Each step, each iteration, created noise, distortion and glitches that made the image start breaking and cracking, forming patterns and evoking colours not visible in the first clip. The procedure was iterated eight times, bringing the image closer while it also became more abstract. Each time the procedure was repeated, the clip was transmuted between different formats or conformations: between digital code, data, electric currents, algorithms and the analogue world of light and matter outside the electronic circuits. Willim ended by asking how we frame data in this context. When is it even meaningful to define data as broken or breakable? A provisional answer might be that it depends on the specific context and the aim of the processing of what has been categorized and framed as data.


Each example brought breakage, decay, damage, repair and mending into the discussion in different ways. Collectively they show that there is much in the empirical/ethnographic evidence to ‘break the logic of big data’, as well as suggesting that the theoretical concepts through which we can achieve this are also coherent with our understandings of other phenomena. As Vaike Fors put it, the theoretical stances of broken data help us to question what digital data are for; they also help us to ask what we might do with data, and what new roles and values we can give them, or re-assemble them for. This way of thinking opens up new routes to re-think the technologies and digital services that help us produce, spread, and learn from our personal data. Instead of thinking of data and the technologies that produced them simply as things that organise and monitor our lives in new ways, ‘broken world’ theory helps us to think beyond the novelty of such innovations; that is, to consider how they become part of how habits are sustained and restored in combination with these new components. Pink’s and Fors’ ethnographic work on people’s use of self-tracking devices shows how data become inhabited and embodied through the use of digital services that make it possible to re-live the data-producing moments and environments (Pink & Fors, forthcoming). This includes how data are part of both breaking and maintaining habits and routines. As Chun (2016) suggests, our media matter most when they seem not to matter at all in terms of novelty and capacities – when they have moved from ‘new’ to habitual. This perspective adds another layer to the analytic, determinate and predictive potential and capacities of Big Data.

Melisa Duque drew on her experience of doing design research in Op Shops (opportunity shops – charity or second-hand shops) in Australia to suggest similarities with the processes that Ruckenstein described for data analytics, enabling us to think of data as being like other things in the world. For instance, when data have to be cleaned, when there are parts missing or odd days, when there is a diversity in volume, or when data need to be repaired, there are resemblances to processes where second-hand goods have to be somehow re-valued (as Melisa Duque put it), or re-assembled (as Deborah Lupton put it) for a new consumer/user. For example, in the op-shop context the revaluing of digital devices represents a challenging area that constantly pushes the charity volunteer workers to adjust to the rapid rhythms of technological innovation. On the one hand these things are often donated incomplete (missing cables, batteries, parts or accessories); on the other hand, even if complete, their outdated model places these technologies in less desirable commercial positions, unless they are old enough to be considered collectable. This means too much time and work is required for their processing, for little or no economic exchange in return, placing these items at a low priority and leading them to quickly fall ‘out-of-place’ again.

Here the broken value of these technologies evidences the problematic consequences of the planned obsolescence that is designed into these things in order to fuel agendas for economic growth. This embedded brokenness represents a latent opportunity for second-hand markets to adjust to the processing and revaluing of these materials. However, it also evidences the responsibility of producers to make designs that accommodate end-of-life transitions for such things. In this case, brokenness highlights not only a material dysfunctionality but, more broadly, a brokenness of cultural values. However, in the op-shop scenario revaluing often fails. This raises questions for our understanding of data: what happens when the revaluing of data fails? Do we even know when it fails, or understand what failure is or can mean in this context? This leaves a question open to explore.

Data value and re-value are terms and concepts that deserve further discussion and definition. The parallel with the Op Shop introduces some new questions about how, if we see data, like everything else, as possibly damaged, repaired, old and new, then what would this have to do with the value of data? Since the question of understanding digital data and their value for whom and in what circumstances is a key contemporary concern for individuals, companies and government in their quest to determine how to make a data society work, then this is a fundamental question.

Data have long been referred to as ‘dirty’ as opposed to ‘clean’. In these existing discussions ‘dirty’ means incomplete, inaccurate or otherwise not useful for analytic purposes, and requiring work to ‘clean’ or render useful. Dirty data are rarely, if ever, viewed as valuable or useful in this discourse. However, in the context of the op-shop, dirtiness is not always something that would de-value an item. When data are dirty or old, does this add to or change their value? It does not necessarily need to reduce it – in the op-shop, for instance, dirtiness could potentially add value, as Duque pointed out. What would be the implications of this for Big Data analysis? The data that Ruckenstein spoke about needed to be cleaned – but was their dirtiness inseparable from their value, in that had the data not been dirty they would not have existed, and therefore their potential value could not have been extracted in the cleaning process? In this sense, clean and dirty are not separate concepts but part of a shared process in which they are mutually interdependent. Notions of data as raw and cooked or rotted (Boellstorff and Maurer 2015) raise similar questions, and suggest that other researchers are likewise unsatisfied with the possibilities for discussing data that are derived from data science and data logics, and have subsequently sought concepts from everyday life that bring data into the world with other things, rather than seeing data as a special case that should have a measurement-based logic of its own. That is to emphasise that data are not insulated or isolated from the same processes and challenges that are faced by the other ‘things’ with which they co-exist in the world: like everything else, data can go rotten, decay, get broken, be cleaned, repaired and revalued, and they can expire.

In this scenario, then, value can be added to something once it has been broken and repaired, or when it has decayed. As Martin Berg pointed out, things can become more valuable as they age, given a certain cultural context. Taking Kintsugi – the Japanese art of repairing broken pottery with noble metal in order to increase the value of a broken item – as an example, we might think of the breaking and repair of data as becoming part of their history and altering their characteristics, rather than as a loss of quality and usefulness. In fact, by embracing the flawed dimensions of broken data and using the imperfections as a resource, it becomes possible to challenge general views of data as precise and as resulting from automated processes, and instead to think of data as having a life of their own, with an ever-present risk of decaying or being chipped when encountering other data objects and streams. Antiques are also a good example: as Pink noted, things that are aged, repaired or modified also become revalued in the antiques trade and market. How might this be usefully applied to the potential values and revalues of data, and/or how (returning to the point made by Fors) does this redefine what data are for? Such questions have not yet been approached in social science and humanities research about data. This opens up a site for future data ethnographies, as Lupton suggested.

These approaches begin to treat data not as something that is meant to stand for something else already in the world, but as something that can be reconstituted and revalued as part of different practices and towards different aspirations and that can have not just different values but different modes of value.


Our workshop was composed of a multidisciplinary group. Although we all originate in the social sciences and humanities, we cross anthropology, ethnology, sociology, media studies and pedagogy, each of which brings with it a particular relationship to ethnography and other research approaches, theory and intervention. Different approaches to digital data might be more suited to generating critical understandings from different disciplinary perspectives, and in our group we remained open to considering how an understanding of data as ‘broken’ might intersect with other theoretical ways of situating data in the world and understanding their potential for the generation of meaning. This is significant in two ways. First, it asks us to reflect on how theoretical turns and strands are emerging across the social sciences and humanities more broadly, how they connect across disciplines and how they are, in their different manifestations, leading to forms of re-thinking the world, which might not be exactly the same but which drive towards similar principles. Second, it invites us to work out what particular role a view of data as broken might have in the development of a critical agenda about digital data, and in what specific ways this agenda might have impact or lead to intervention in the world.

Above we have discussed the relationship between seeing data as broken and seeing it as open, lively, fluid, leaky or malleable. The idea of data as broken certainly connects with all these existing arguments. Moreover, we are not here suggesting that this idea ought to be the primary way of redefining digital data, but rather that in certain circumstances it might offer us a particular route through which to connect to or critically engage with arguments and discourses in society and policy.

Francesco Lapenta argued that there are significant political consequences of taking the counter-position that ‘data should be always considered broken’, as ‘incomplete’, ‘constantly increasing but to a different degree always a relative approximation’, and the ideal, utopian ‘full representation’ a scientifically impossible endeavour. This, Lapenta suggested, enables a particular way of putting politics at the centre of this discussion/agenda, because data can only be ‘observed through a filter’, an ‘algorithm’, and that filter is never neutral – hence the concept of ‘politics hidden behind the illusion of achieving perfect unbroken data’. For Lapenta, the important emerging question here concerns where we as scholars from different fields stand, and how we delineate our difference from other groups in our approach to the collection and use of data. He proposed that ‘data ethics’, as developed by Hasselbalch and Tranberg (2016), is the first step towards the recognition of data analytics as inherently ‘political’, and of ‘data politics’ as the inherent effect of the use of any specific ‘algorithms’, which always embody an interpretation and reveal ‘agency’ and ‘selection’ over the impossibility of ‘full accountability and full representation’. This focus invites the question of how we might critically engage as academics with the politics and power relations of data, and with the political agendas that big data are being used to shape, inform and endorse. It also raises the question of where we engage: for instance, Lapenta suggested the focus should be on the tools that are used to analyse data. What might be the role of an approach to data that regards them as inevitably broken (or liquid, leaky, open or lively) in the shaping of critical and incisive interventions in the relationships that are emerging between data and politics? And how might it shift these relationships in beneficial ways?

An Agenda/Critical Intervention: a position

If, as outlined above, we see data as broken, or through other lenses that enable a critical perspective capable of ‘breaking the logic of big data’, where and how might these interventions be made? How can we participate in and/or facilitate a move towards responsible data futures, and what is the role of theory and ethnography in this? Our discussions raised a series of positions and proposals, which constitute questions as much as solutions. However, there is a need to go further than just studying big data ethnographically so that we can arrive at critical understandings. The bigger question relates to how we can act in this context to bring about change, or to guide these processes as they emerge:

Figure: Broken data VI.png

1. How can we make interventions in the design of software to ensure privacy and forms of choice, and to empower users? In doing so we would need to acknowledge that data are always broken (or dirty, rotted, blocked, frozen, leaky, dead and so on), but that the choices that are made when they are fixed, repaired, enlivened, re-invigorated or reassembled should be transparent; that the power relations through which these choices are articulated should be visible, evident and, where necessary, re-balanced to favour consumers/citizens; or that they should involve co-creative processes where power is balanced.

2. We should explore how algorithms already work to repair and reassemble data, in order to make them work and to make existing apps and wearables meaningful to users. How would we identify and influence this process? What are the ‘good’ algorithms that repair, repurpose or otherwise assemble data into forms that enable their participation in responsible futures for citizens, consumers and organisations?

3. The future roles of citizens/consumers have much to do with their data and how they are used. This might be interpreted to mean: how, in the future, are our data curated and cared for? This leads us to ask what happens if these data begin to decay, where they are repaired, or how they might be reassembled. If we put the notions of breakage, decay and repair at the centre of this problem, can they serve as metaphors that help us to consider how we might intervene in this arena?

4. If we use the logic of existing dominant definitions of big data, we will be encapsulated in their logic. Understanding data as being in processes of breakage, damage and repair offers us one way to stop crediting the “data driven” phenomenon with efficiency and accuracy. Can this understanding effectively be used to contest the notion of “data driven” as it is currently being applied to policy, design and business decisions?

5. The broken data approach can demonstrate that much current scholarship, even critical big data research, exaggerates data movements: often digital data do not move, or they move only with considerable effort and repair. Ethnographers can highlight this lack of data movement, and thereby offer a corrective, or an intervention, to the current way of discussing the power of data to do things.

6. Theory has a role to play in the making of new strategies for thinking about and intervening in big data analytics. The position papers collectively show this, as we have used different theoretical perspectives to interrogate what data can mean in the world, and how. A broken world perspective is one possibility.

References
Boellstorff, T. and B. Maurer (2015) ‘Introduction’ in T. Boellstorff and B. Maurer (eds) Data: Now Bigger and Better. Chicago: University of Chicago Press.

Chun, W. H. K. (2016) Updating to Remain the Same: Habitual New Media. Cambridge, MA: MIT Press.

Demers, J. (2010). Listening Through The Noise. The Aesthetics of Experimental Electronic Music. Oxford: Oxford University Press.

DeSilvey, C. (2006). Observed Decay: Telling Stories with Mutable Things. Journal of Material Culture, 11(3), 318–338. doi:10.1177/1359183506068808

Domínguez Rubio, F. (2016) ‘On the discrepancy between objects and things: An ecological approach’ Journal of Material Culture. 21(1): 59–86

Hasselbalch, G. and Tranberg, P. (2016) Data Ethics: The New Competitive Advantage. Denmark: Publishare.

Jackson, S.J. (2014) ‘Rethinking repair’ in T. Gillespie, P. Boczkowski, and K. Foot, eds. Media Technologies: Essays on Communication, Materiality and Society. MIT Press: Cambridge MA

Kahn, D. (2009) ‘Alvin Lucier: I Am Sitting in a Room, Immersed and Propagated’. OASE Journal for Architecture, 78: 24–37.

Kitchin, R. (2014) ‘Big data, new epistemologies and paradigm shifts’. Big Data & Society, 1: 1–12.

Kitchin, R. and McArdle, G. (2016) ‘What makes big data, big data? Exploring the ontological characteristics of 26 datasets’. Big Data & Society, 3: 1–10.

Krapp, P. (2011). Noise Channels: Glitch and Error in Digital Culture. Minneapolis: University of Minnesota Press.

Lupton, D. (2015) Digital Sociology. London: Routledge.

Lupton, D. (2016a) The Quantified Self: A Sociology of Self-Tracking. Cambridge: Polity.

Lupton, D. (2016b) Personal data practices in the age of lively data. In J. Daniels, K. Gregory and T. McMillan Cottom (eds) Digital Sociologies. Bristol: Policy Press, 335-350.

Lupton, D. (2016c) Digital companion species and eating data: Implications for theorising digital data–human assemblages. Big Data & Society (1). Accessed 7 January 2016. Available from

Moore, P. and Robinson, A. (2015) ‘The quantified self: What counts in the neoliberal workplace’. New Media & Society, online first.

Pink, S., E. Ardevol and D. Lanzeni (2016) ‘Digital materiality: configuring a field of anthropology/design?’ in S. Pink, E. Ardevol and D. Lanzeni (eds) Digital Materialities: anthropology and design. Oxford: Bloomsbury.

Pink, S., S. Sumartojo, D. Lupton and C. Heyes le Bond (forthcoming) ‘Mundane Data: the routines, contingencies and accomplishments of digital living’ Big Data and Society

Ruckenstein, M. (2016). ‘Keeping data live: talking DTC genetic testing’. Information, Communication and Society

Schwarz, H. (2011). Making Noise. From Babel to The Big Bang and Beyond. Brooklyn: Zone Books.

Willim, R. (2013) ‘Enhancement or Distortion? From the Claude Glass to Instagram’. Sarai Reader 09: Projections.

Willim, R. (2014). Transmutations of Noise. In B. Czarniawska & O. Löfgren (Eds.), Coping with Excess. How Organizations, Communities and Individuals Manage Overflows. Cheltenham: Edward Elgar Publishers.