Disaster analysis: Like looking for a needle in a haystack
With its forensic disaster analysis, the Center for Disaster Management and Risk Reduction Technology (CEDIM) has launched a new kind of interdisciplinary research. TOPICS GEO talked to project leader Stefan Hinz about the possibilities and limitations of mobile crowdsourcing for the analysis of natural disasters.
The phenomenon of mobile crowdsourcing first appeared just over ten years ago. That was when the first smartphones came onto the market, enabling every user to record specific data and provide them immediately in digital format. At CEDIM we have been using crowdsourcing to analyse natural disasters for around two years. We have been very impressed with the results, especially in terms of the speed of information provision. We can currently localise many disasters worldwide within the first three minutes after people are affected.
What are the biggest problems in terms of practical use?
You need to distinguish between purely technology-based and scientific challenges. On the technology side, the first task is to define practical, standardised interfaces. The difficulty here is that, depending on the operating system of the smartphone or the app used, the definition of data and attributes can vary greatly. In practice, finding solutions in this area requires a lot of time and effort, since it involves recording and processing millions of individual data submissions. From a scientific perspective, the challenge is to extract and correctly assess the relevance of a piece of information. Of course, most users do not tweet or post images on the internet in order to help us evaluate disaster-related data. So we have to filter out the information that is most useful for our analysis from a mass of unstructured data. This can be like looking for the proverbial needle in a haystack.
Where is the search most productive? Are there particular social networks that you prefer?
We are forced to concentrate on particular applications because otherwise our resources would simply not be sufficient. One of our favourites is the short message service Twitter. It is used worldwide and users respond to developments very quickly. Then there are also special apps and services, like Ushahidi or Google Crisis Response, that are used during natural disasters. The advantage with these is that the information you obtain is provided systematically, if only to a limited degree. This is because you need to have someone on the spot who is registered with the special service, and who distributes suitable information.
How do you go about analysing the enormous amount of data?
Since we cannot monitor all the information communicated, we initially restrict the selection of data on a regional basis. This is done automatically using algorithms, into which we incorporate background information such as population density. One approach that has proved useful is to compile statistics over a longer period on how many reports are normally made on certain topics. Once this kind of background noise is defined as the normal state, special features are easy to spot. Using text analysis techniques, you can then determine whether a particular word, for example “storm”, occurs more frequently than usual. In a second stage, you need to establish whether the anomaly is connected with a natural event, or whether a different reason is responsible.

Since 2008, Dr. Stefan Hinz has been Professor of Photogrammetry and Remote Sensing at the Karlsruhe Institute of Technology (KIT). His main areas of research are developing methods for automated image and data analysis, including applications in the field of geoinformatics. Within CEDIM, he is project leader for research activities in the field of crowdsourcing.
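The baseline idea described in the answer on data analysis can be illustrated with a short sketch. This is not CEDIM's actual algorithm: the z-score test, the threshold of three standard deviations, the per-100,000 normalisation and all function names are assumptions chosen only to show the principle of comparing a current count against a long-term "normal state".

```python
from statistics import mean, stdev


def reports_per_100k(count, population):
    """Normalise a raw report count by population (assumed way to use density data)."""
    return count * 100_000 / population


def is_anomalous(history, current, threshold=3.0):
    """Return True if `current` lies more than `threshold` standard
    deviations above the mean of the historical counts."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any increase counts as unusual.
        return current > baseline
    return (current - baseline) / spread > threshold


# Hypothetical hourly counts of messages containing "storm" in one region.
history = [4, 6, 5, 7, 5, 6, 4, 5]

print(is_anomalous(history, 6))    # within normal variation
print(is_anomalous(history, 40))   # far above the background noise
```

A flagged count only marks a statistical anomaly. As the interview notes, a second stage is still needed to decide whether a natural event is actually responsible.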
How many different terms do you analyse on a regular basis?
Depending on the language, a targeted search is performed for between 20 and 50 specific words. Of course, this will depend on the level of specialisation, in other words on whether a term like hail is included in the list. The selection must not be too big, as otherwise evaluation becomes more and more complex. Besides, our service at CEDIM does not focus exclusively on monitoring current events. As part of our research work, we try to refine the analysis and develop new versions that remedy past shortcomings.

In focus, Munich Re Topics Geo 2014
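A targeted word search of the kind described above might look like the following minimal sketch. The watch lists are hypothetical and far shorter than the 20 to 50 words per language mentioned in the interview, and the whole-word, case-insensitive matching is an assumption rather than a description of the CEDIM system.

```python
import re

# Hypothetical watch lists; real lists would hold 20 to 50 terms per language.
WATCH_LISTS = {
    "en": ["storm", "flood", "earthquake", "hail"],
    "de": ["sturm", "hochwasser", "erdbeben", "hagel"],
}


def count_terms(messages, language):
    """Count how many messages mention each watch-list term at least once.

    Matching is case-insensitive and on whole words only, so inflected
    forms such as "storms" are not counted in this simple version.
    """
    terms = WATCH_LISTS[language]
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    counts = {term: 0 for term in terms}
    for message in messages:
        # Use a set so repeated words in one message are counted once.
        for hit in {h.lower() for h in pattern.findall(message)}:
            counts[hit] += 1
    return counts


messages = [
    "Huge storm over the city, STORM!",
    "Hail and storm damage on our street",
    "Nice weather today",
]
print(count_terms(messages, "en"))
```

Every additional term adds one more count series to monitor and interpret, which is one reason a short list keeps the evaluation manageable.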
What steps do you take when you identify an anomaly for a particular region?
Our work is actually done once we have ascertained, after analysing the online data, that a serious event is involved. An automatic e-mail alert is then sent to our cooperation partners. The local authorities and aid organisations decide what happens on the ground. That is not our role. However, we have also taken on the task of analysing the course of a natural disaster retrospectively. We want to identify certain causal links. For example: What effect did a landslide have on energy supplies in a region, and in turn, what repercussions did that have on the transport system? If we can identify how different factors in a disaster interact, we can prepare better for future events.
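The causal links mentioned in the landslide example can be pictured as a directed graph of cause and effect. The sketch below is purely illustrative: the links in `EFFECTS` are invented for the example, and the breadth-first traversal is just one simple way to list the knock-on effects of an event, not a method taken from the interview.

```python
from collections import deque

# Hypothetical cause -> effect links, loosely based on the landslide example.
EFFECTS = {
    "landslide": ["power outage", "road blocked"],
    "power outage": ["rail signalling failure"],
    "rail signalling failure": ["train services suspended"],
    "road blocked": [],
}


def downstream_effects(event, effects=EFFECTS):
    """Return every effect reachable from `event`, in breadth-first order."""
    found = []
    queue = deque([event])
    while queue:
        current = queue.popleft()
        for consequence in effects.get(current, []):
            # Skip anything already recorded so cycles cannot loop forever.
            if consequence != event and consequence not in found:
                found.append(consequence)
                queue.append(consequence)
    return found


print(downstream_effects("landslide"))
```

Listing direct effects before indirect ones mirrors the question in the text: first the impact on energy supplies, then the repercussions for the transport system.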
Many crisis managers are somewhat critical of crowdsourcing in terms of the data quality involved. Can big data and crowdsourcing really live up to their promised potential for crisis management?
quality standard here because there is no reference available for it. For example, how do you define a 100% recall rate for crowdsourcing? But that does not have to be the decisive factor. The added value lies in obtaining key information more rapidly when a disaster occurs. So it’s all the more important that the algorithms provide measures of quality and confidence about the identified information. At any rate, crowdsourcing has more than proved its value overall, because you obtain data from many different sources, rather than from a single or just a few monitoring points.
How could norms and standards be harmonised to enhance the potential of big data?
We have to be realistic about this. The manufacturers of smartphones and the developers of social media platforms have commercial objectives. Crowdsourcing for humanitarian purposes is not very high on their agenda. That being so, we have to try to make the best possible use of what the market offers us.
What are the prospects for cooperation between different crowdsourcing platforms?
Such approaches are already being used in the science sector, although efforts at an operational level are still at a very early stage. We have found that relief organisations are happy to receive any information they can from crowdsourcing. But there is definitely a need for more active cooperation.
The general public has become much more aware of the topic of data protection. How can contributors to crowdsourcing be protected against data abuse?
That is a subject that is not just relevant for crowdsourcing. Many apps force the user to approve the use of certain data, such as their location or contact lists, even if these are not needed for the use of the app. Each person should have full control over their privacy. However, in reality things are a little different. After all, who is really sure what rights they have granted to the operator of an app? I think that this is a job for the politicians. They need to push for standardised regulations similar to those that apply for roaming within Europe. Greater transparency would certainly be an advantage for crowdsourcing. As a user, I can make a conscious choice once I know who would like to view which data.
If you venture a look ahead, how do you see mobile crowdsourcing developing over the next few years?
If you just consider the sophisticated algorithms that online retailers and advertising marketers are already using to analyse customer behaviour, we can expect major progress. If you extend these technological possibilities to crowdsourcing for natural disasters, substantial improvements could be achieved, especially in terms of the quality of information selection. Of course, politicians may scotch these developments on data protection grounds, in which case progress will certainly be less dynamic.