HiDALGO: Merging HPC and High-Performance Data Analytics to Address Global Challenges

The HiDALGO plenary meeting took place at HLRS earlier this year.

HLRS is the technical lead for an EU-funded project that is developing potential solutions relevant to migration, air pollution, and the spread of information through social media.

As transportation and communications networks have grown, they have brought the world closer together. Whatever positive effects this has had for human development, it has also created a situation in which many local or regional challenges that societies now face have become global. Addressing climate change, fighting disease pandemics, and managing forced migration, for example, are all issues that not only profoundly affect individual nations but also cross international boundaries.

With the arrival of more powerful computing technologies, demand has been growing among governments and other decision-making entities for tools that provide real-time forecasts they can use to manage sudden challenges. As more data related to these kinds of problems becomes available, high-performance computing (HPC) incorporating simulation, data analytics, and artificial intelligence holds the potential to provide such tools. However, it can only do so if it can meet a range of technical challenges, such as creating and adopting the frameworks needed to manage and analyze the large, complex datasets involved.

Late last year, the High-Performance Computing Center Stuttgart (HLRS), in collaboration with project coordinator ATOS and 11 other institutions from seven countries, began a new research project called HiDALGO aimed at achieving this goal. Funded under the Horizon 2020 Framework Programme of the European Union, HiDALGO is developing novel computational methods, algorithms, and software for modeling complex processes that arise in connection with global challenges. This includes improving simulation quality by incorporating more data sources — including collections of batched data as well as real-time streamed data — and combining existing models into more comprehensive coupled models.

A key scientific focus of HiDALGO is on the integration of high-performance data analytics with high-performance computing. As large, multidimensional data sets representing different facets of global challenges become available, it is clear that HPC will play an essential role in their processing and analysis. Developing methods to efficiently manage and analyze the enormous data sets necessary to represent such problems, as well as coupling a diverse set of data sources and computational models, are formidable challenges that HiDALGO is addressing.

Pilot project focuses on predicting forced migration

At the same time, HiDALGO is supporting pilot projects in which these methods could have a practical impact. One, led by Dr. Derek Groen in the Computer Science department at Brunel University London, is focused on developing realistic models of forced migration during war.

Groen, along with colleagues Diana Suleimenova and David Bell, has been developing an open source simulation code called Flee, which predicts the destinations of refugees escaping conflict situations. Using a computational approach called agent-based modeling, Flee inserts virtual displaced persons into a simulated conflict situation; each agent moves through the virtual world based on a set of predefined rules until it reaches a safe location. Their goal is to develop a tool that decision makers could use — along with up-to-date data from conflict zones — to predict refugee movements and allocate relief resources to locations closest to a humanitarian crisis.
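The agent-based approach described above can be illustrated with a toy sketch. This is not Flee's code or API: the location graph, the two movement rules, and the names `ROUTES`, `SAFE`, `simulate`, and `move_chance` are all hypothetical and chosen only to show the general idea of agents following predefined rules until they reach a safe location.

```python
import random

# Hypothetical location graph: each location maps to its neighbours.
ROUTES = {
    "conflict_town": ["crossroads"],
    "crossroads": ["conflict_town", "camp_a", "camp_b"],
    "camp_a": [],
    "camp_b": [],
}
# Hypothetical set of safe locations where agents stop moving.
SAFE = {"camp_a", "camp_b"}


def simulate(num_agents, max_steps, move_chance=0.5, seed=0):
    """Move agents along ROUTES and return a count of agents per location."""
    rng = random.Random(seed)
    positions = ["conflict_town"] * num_agents
    for _ in range(max_steps):
        for i, here in enumerate(positions):
            if here in SAFE:
                continue  # Rule 1: an agent at a safe location stays there.
            if rng.random() < move_chance:
                # Rule 2: otherwise, move to a randomly chosen neighbour.
                positions[i] = rng.choice(ROUTES[here])
    counts = {}
    for place in positions:
        counts[place] = counts.get(place, 0) + 1
    return counts


if __name__ == "__main__":
    # Print how many of 1,000 agents ended up at each location.
    print(simulate(num_agents=1000, max_steps=50))
```

A realistic model would replace the random choice with rules informed by data such as distances, conflict intensity, or border closures; the sketch only shows the basic loop of agents, rules, and time steps.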

Although Flee was initially run on a local desktop, Groen's group is now working with staff at HLRS and the Institute of Communications and Computer Systems (Greece) to optimize the code for large-scale supercomputers, and has already been able to run it efficiently on more than 400 compute cores. More computing power should enable the team to develop increasingly realistic models that incorporate more of the factors that affect refugees' movements, such as an individual's ethnicity and language, weather conditions, or dynamic situations on the ground like border closures. Groen's group also works with HLRS and the KNOW Center in Austria to develop visualization tools that present the results of HPC simulations in a format that would enable decision-makers to quickly forecast and react to sudden population movements.

Developing general tools for high-performance data analytics

In other pilot projects underway as part of HiDALGO, researchers are developing tools and sensor networks to forecast and minimize air pollution in cities, and to identify and prevent the spread of false or malicious messages over online social networks. In each case, the goal is not only to address a specific global challenge, but also to produce advances in the science of high-performance computing that will improve the use of high-performance data analytics in an HPC framework.

In conjunction with the pilot projects, HiDALGO is also investigating how artificial intelligence could support the development of more realistic models. Although the specific methods are still in development, the project is exploring how AI could help simulation researchers identify and tune the parameters in their algorithms that are most significant for the particular problems they are investigating, which could in turn accelerate model development.

"HiDALGO is showing that high-performance computing and high-performance data analytics are not only useful for scientific research or optimizing engineering designs," says Bastian Koller, Managing Director of HLRS. "Instead, the project demonstrates that HPC also has an important role to play in helping to address some of the most difficult challenges that we as a society are facing."


Christopher Williams