A new paradigm in cultural data: Focus on user needs instead of mass digitization
The year 2020 was special for cultural data and digital culture for several reasons. For the broader public, it meant visiting exhibitions and theater plays virtually from the sofa because of the COVID-19 pandemic. From a narrower, policy-making perspective, 2020 marked the Year of Digital Culture, a policy initiative that aimed to promote new forms of culture driven by the use of digital technologies. As a result of this initiative, a report will be launched in January 2021 summarizing the state of Estonian digital culture and open cultural data. Drawing partly on this report, this blog post discusses the situation of digital culture in Estonia and explains why a change of paradigm is needed in the field of cultural data.
The very form of the term digital culture suggests that culture is defined through a certain function. Digital culture does not simply mean consuming culture on a digital platform, for example listening to Arvo Pärt on Facebook. Digital culture refers to the emergence of new social activities due to using digital channels. This not only involves consumption (who wouldn’t like to watch TV while eating sandwiches) but also the emergence of new forms of culture that become possible thanks to new technologies.
The upcoming report on digital culture highlights that there is no common understanding in Estonia of what digital culture really means. Two complementary perspectives deserve to be highlighted: a) a rather clearly defined concept of digital culture applied in academia and research, and b) a looser conceptual framework used in strategic policy documents to set long-term development goals. From a strategic perspective, we don’t need to talk about digital culture to develop digital culture – we need to talk about measures that support its development. These may include language technologies, semantic interoperability of archives and databases, educational innovation, development of digital skills, intellectual property rights, digital tax, reduced VAT rates for electronic publications, and so on. With support from the Estonian Ministry of Culture, Ministry of Education and Research, and Ministry of Economic Affairs and Communications, a lot of work has been done over the past 15 years to popularize different aspects of so-called digital culture, i.e. digitized culture.
Focus on End Users instead of Digitization
The first strategic documents in the field date back to the mid-2000s. In retrospect, these strategies set out to achieve very ambitious goals in a very short timeframe. A 2009 analysis by the National Audit Office found that the goals of the strategy „Digital Cultural Heritage 2008-2010“ were impossible to achieve in a way that would yield substantial benefits for memory institutions and users of the information produced by them. As late as 2013, Estonia’s strategic goal was still to digitize the majority of Estonian cultural heritage and make it available to the public. The current action plan on digitization was adopted in 2018 and has an implementation period until the first half of 2023. The action plan aims to digitize a third of the cultural heritage held by Estonian memory institutions, as well as upgrade their information storage infrastructure.
The Digital Heritage Council, which coordinates the digitization strategies and action plans, will soon start discussing the next, post-2023 action plan. In the next action plan, the intention is to focus less on digitizing different forms of heritage and to approach data from the end user’s point of view. The priority is to answer questions such as what sort of new activities and services are needed, what technologies should be developed to make them possible, and so on.
Orientation towards a user-centric provision of cultural data entails a substantial paradigm change in the activities that have so far prioritized mass digitization of cultural heritage. This also means a paradigm change for open data. The current list of open cultural datasets published on the national open data portal may look impressive at first glance. However, the quality of the data is uneven and many users find the data difficult to use. Various cultural institutions do release open data, but often with no tangible outcomes. The recent digital public service competition ‘Su)g’ illustrates the point: only one out of 75 competitors represented the cultural field – the Tallinn Art Hall’s application for virtual exhibitions. Therefore, from an open data perspective, we can speak of ’digital culture’ as a so-called cultural practice that comes along with broader technological developments.
The perspective of open data in relation to digital culture becomes much more interesting if we peek into the future instead of the present. In winter 2020, discussions are ongoing on what datasets Estonia should submit to the Europe-wide list of high-value open data in relation to the PSI Directive. Estonia aims to add language resources to the existing list of high-value data categories. This will likely mean that in the coming years, much more attention will be given to all data feeding into the development of language technologies, which Estonian memory institutions have so far published on the open data portal in diverse formats. In other words: the archives of all Estonian memory institutions will become machine-readable, interoperable and processable using various machine-learning algorithms.
We don’t know yet what this means in practice, just as we don’t know whether and in what form we can expect data from the often-discussed ’Internet of Things’ to be made open, or what the potential economic benefits of doing so could be. This is because the 5G technology, which could facilitate the provision of such services, is not a thing in itself. Although the adoption of 5G is being widely discussed (admittedly, often from a bio-psychological aspect), we still don’t know what applications we could use in the future thanks to this technology.
All Data are Cultural Data
Our discussion of cultural data could just as well end here because Estonia has come a long way, but in practice we still have very little data that could be analyzed as cultural data. But as they say, where facts are not enough, fiction starts. At this point I would like to draw the Estonian readers’ attention to an excellent dystopia where the plot – very fittingly – unfolds around the algorithmic intrigues born out of interpreting cultural and social data. Victor Pelevin’s „iPhuck 10“ was first published in 2017 and translated into Estonian in spring 2020. Pelevin’s book imagines the future around the beginning of the 22nd century. Porfiry Petrovich belongs to the internal security forces and serves as an investigator and literature robot. Its job is to collect data and tell stories based on the data. Figuratively speaking, Porfiry Petrovich has been trained to see the forest for the trees – to see causal relationships between data and datasets, speculate with them, and create different narratives based on the data. In doing so, Porfiry Petrovich surfs around various networked devices and analyzes the data and log files stored therein – thereby, second by second, mapping the social history of the world. It knows everything, and if it doesn’t, the necessary information can either be found or derived from the data.
The daily rhythms of this dystopia are determined by algorithmic solutions that create vast amounts of data. People no longer remember a time when it used to be possible to understand and interpret societal events without the help of robots who can “see” and point to important changes when necessary. This dystopia of a datafied society is detailed enough to suggest that such a world will probably be realistic much earlier than in a hundred years, at least in technological terms.
Four aspects are important in Pelevin’s dystopia. First, if “thinking” storytelling robots continue to be very expensive, owning an efficient algorithm is a privilege that offers its owner far greater social advantages than any web-based service that we can conceive of today.
Second, future datafied and data-driven societies risk creating new forms of inequality that we cannot even imagine yet: while so-called social data are created by all members of society, only those who are rich or competent enough will benefit from the data.
The third aspect is in fact the reason why Pelevin should be cited when talking about cultural data in Estonia in 2020: any data can be cultural data if those interpreting the data attribute a cultural narrative to the data. In other words, what makes data ’cultural’ is the interpretation of said data as part of the cultural context. Essentially this means that in the broadest sense of the term, all data are cultural data. For example, the Cultural Data Analysis Center (CUDAN), operating at Tallinn University since spring 2020, applies a very broad approach to cultural data, without restricting itself to one or another definition that would limit researchers’ freedom. In principle, any dataset originating in the social context can be explored for cultural issues.
Fourth, as Pelevin’s literature robot suggests, the key question is what happens to the data later. What part of cultural data is in public use, how much of the data belongs to the private sector, what are the data used for and what regulations govern these processes? For example, are there any kinds of data that should not be reused beyond the purpose for which they were collected?
Let’s come back to Estonia at the end of 2020. Pelevin’s Porfiry Petrovich is largely the type of data user foreseen in our national open data policy: it uses data that are transparent, available to everyone, and accompanied by detailed metadata (PP, of course, also has many qualities that are not yet regulated today). It is worth noting that the use of future cultural data is also not regulated in the current Estonian legislation. And why should it be? After all, this kind of new and unimaginable data use is not yet a reality today.
In November 2018, President Kersti Kaljulaid gave a speech at the e-government conference UX Tulevikku in Tallinn. Her presentation focused on the dilemmas related to the co-existence of humans and robots. From her presentation, the media quite unjustly picked up only the idea that „it is high time to start talking about humans’ relationships, including sexual relationships, with robots – what is forbidden and what is not.“ It is not even important whether the president was inspired by Pelevin, whose book had appeared the year before, or whether this is simply a sign of the times we live in. What is certain is that the services designed in a datafied society, and the social reality growing out of them, are only limited by our own thinking. Even just a couple of years ago, the type of technological dystopias depicted in the TV series Black Mirror seemed like exciting intellectual entertainment. To the 40+ generation, George Orwell’s „1984“ has always been something more than just a warning novel that reads like a bad dream. Those born in the middle of the 1970s still remember what a totalitarian society looked like. Yet, just ten years ago, even this generation could not imagine that social profiling of the kind already practiced in China, Russia and elsewhere was actually possible. Not to mention customer service chatbots that have been solving our daily problems in more innovative organizations for several years already. They, too, process cultural data.
Nevertheless, as long as real-life services have not caught up with literature, it is worth remembering that dystopia as a genre has at least three cultural functions: 1) warning as a certain modus operandi of social critique: a vision on the theme of ’what if…’; 2) entertainment as the only means to take this message to broader audiences; and 3) dystopia as a genre that reflects our current understanding of what is possible in the first place. Depicting one or another scenario as realistic means it has become an object of discussion.
See the existing action plans and composition of the council on the website of the Ministry of Culture: https://www.kul.ee/et/kultuuriparandi-digiteerimine-0
More information on the “Su(g” digital service competition: https://medium.com/digiriik/selgusid-riigi-digiteenuste-konkursi-su-g-finalistid-e2a47472f1d2
Aivar Pau, „Kaljulaid: on aeg rääkida seksist robotitega“, PM, 08.11.2018, https://tehnika.postimees.ee/6448926/kaljulaid-aeg-on-raakida-seksist-robotiga-video