The lessons that OSINT provides to open-data portals
Open-source intelligence (OSINT) is the practice of collecting and analysing information gathered from open sources to produce actionable intelligence. This intelligence can support, for example, national security, law enforcement and business intelligence. OSINT takes open-source data collected for one purpose and repurposes it to shed light on hidden topics. The concept sounds counterintuitive: using open data to reveal information that organisations want to keep secret.
In the war in Ukraine, OSINT has been used to get a better idea of the movements of Russian military equipment and to understand the real progress of the war. The sources include first-hand video footage collected by ordinary citizens, as well as data points collected, for example, by websites that track aircraft and train movements.
Open sources that feed into OSINT can be divided into six categories of information flow:
- Public media – print newspapers, magazines, and television.
- Internet – online publications and blogs, discussion groups such as forums, and social media websites, such as YouTube, Twitter and Instagram.
- Public government data – public government reports, budgets, press conferences, hearings, and speeches.
- Professional and academic publications – journals, conferences, academic papers, and theses.
- Commercial data – commercial imagery, business and financial assessments, and databases.
- Grey literature – technical reports, patents, business documents, unpublished works, and newsletters.
OSINT-related challenges and the role for open-data portals
Data collected and repurposed for OSINT is unusual in that it cannot be checked against an objective reality to confirm that it is being collected properly. Gathered weather data, for example, can be compared against the observable reality of rain, whereas OSINT data seeks to reveal an uncertain reality. Consequently, portals that hold OSINT data (OSINT portals) operate in an environment of uncertainty in which understanding and measuring data quality is key.
In the war in Ukraine, for example, local residents who film and photograph destroyed equipment focus attention on Russian losses rather than on Ukrainian losses. Given that Russian soldiers have reportedly had their mobile phones confiscated, the available information is naturally skewed.
At the same time, some actors may be interested in manipulating the story that OSINT tells. For example, moderators of OSINT data must sort through pictures and videos that purport to show separate incidents but actually show the same incident from different angles. Within the fog of war, unreliable data comes from both innocent mistakes and purposeful deceptions. These manipulations multiply the complexity of ensuring the validity of the data being collected, which ultimately determines its value. OSINT gains value not through the amount of data collected, but through the meticulous research required to ensure its authenticity.
Clearly, OSINT provides real benefits with some caveats, creating a wealth of data with uncertain reliability. OSINT data needs to be vetted and put into context before being distributed via a portal or platform. Open-data portals that distribute OSINT must take a proactive role in ensuring data quality through selection and validation. Without this context-giving role, the open data that OSINT portals collect would quickly be dismissed as noise or as a source of propaganda.
This is an important lesson: collecting vast quantities of data without reorganising and qualifying it can reduce its value. Open-data platforms are not merely conduits through which data flows freely; they also play a valuable role as arbiters of quality. Ultimately, developers and users need to be able to access the data easily, but also to trust that the information they are using can be relied upon to provide accurate results for the intended purpose.
Further examples of OSINT portals
While OSINT has received media attention because of the war in Ukraine, it has been gaining momentum for the last decade. Several kinds of institutions have been incorporating OSINT into their practice, including national security and law enforcement agencies, and non-governmental organisations such as Bellingcat, the Center for Information Resilience and Oryx.
Bellingcat, an independent international collective of researchers, investigators and citizen journalists, has been a particularly heavy user of the kind of open data that OSINT relies on. In one piece of reporting, Bellingcat attempted to identify key suspects in the downing of Malaysia Airlines Flight 17 over eastern Ukraine in 2014. Based on images taken from social media and phone intercepts that the Ukrainian Security Service made available on YouTube, Bellingcat published a report outlining evidence that a Buk missile launcher downed the airliner. They also published the names of the individuals they believed to be responsible. Other examples of their work include exposing the illegal shipping of precursors of the nerve agent sarin to Syria by Belgian companies, revealing the use of drones by non-state actors in Syria and Iraq, and most recently mapping incidents of civilian harm in Ukraine.
The Center for Information Resilience, a non-profit organisation based in the United Kingdom, dedicates itself to countering misinformation, exposing human rights abuses and combatting online behaviour that is harmful to women and minority groups. Currently, they are part of the OSINT community of researchers who are studying the war in Ukraine with the aim of providing reliable information on the conflict through verified open-source evidence. Recently, after combing through information on the Russia-Ukraine Monitor Map, they published their findings on the Yalivshchyna burial site and the mass graves that appeared after the Russian invasion.
Oryx is a blog run by two military analysts since 2014 that is devoted to investigating and sharing information about conflict research, OSINT and military history. A recent example of their work is a list of Russian and Ukrainian vehicles and equipment destroyed or captured since the invasion on 24 February 2022. This list includes only vehicles and equipment for which photographic or video evidence of the destruction is available. Users are welcome to submit links or files of additional photos and videos showing destroyed vehicles and equipment, which the team verifies before adding them to the list.
OSINT and open-data portals: a powerful alliance
Open-source intelligence relies heavily on ordinary, unpaid citizens in much the same way that open-source software relies on ordinary, unpaid developers. Without this army of unpaid OSINT supporters, who collect raw data and check its quality, OSINT would lack its authority and quality. The decentralised and crowd-sourced nature of OSINT erodes the ability of centralised authorities, whether government or corporate, to hide certain truths. This is not to say that crowd-sourced judgements always lead to sound analysis. As with any data source, information should be corroborated and continually questioned. This, however, makes open-data portals all the more important in providing a governance structure and framework that allows OSINT to thrive.