Complexity, Risk, and Uncertainty

Today, both society and the economy generate a complex torrent of data. If this unprecedented flow of information is to be made useful, we require new tools and methods for its analysis.

The measurement, interpretation, and communication of complexity and risk is a key part of modern science. Te Pūnaha Matatini researchers working within the Complexity, Risk, and Uncertainty theme (formerly called Complex Data Analytics) are building tools for understanding and dealing with complex systems by developing the underlying theory. This includes work on optimising stochastic systems from supply chains to healthcare, inferring numbers of New Zealand birds from their calls (AviaNZ), and building a library of New Zealand soils from their spectral signatures. Public engagement with science is also a key part of Te Pūnaha Matatini’s work, and the researchers in this theme are working on ways to improve scientist-public interactions.

 

2018-2020 Projects

The reflexive scientist

In recent years, scientist-led science communication in New Zealand has been funded, celebrated and rewarded to the extent that Shaun Hendy, in 2014, said we were living in a “golden age of science communication”. Much of this work, however, is based on instinct, personal preference, or prior activities that appeared to be well received by the audience, media, or colleagues. The same is true internationally: while the international science community invests a considerable amount of time in public outreach, there is limited effort to evaluate, peer-review, and subsequently improve these activities. While most research in this area focuses on the communication’s target audiences (or ‘publics’) and mechanisms for engagement, an exciting and emerging literature within public engagement with science is beginning to focus on scientists.

This project proposes that new mechanisms are required for reflexive analysis of science communication activities by scientists, and has three parallel strands: theoretical, practical, and reflexive. It will contribute towards building a new theoretical model for public engagement by scientists, develop a New Zealand framework for best practice science communication and public engagement, and establish novel mixed methods approaches to public engagement activity design and evaluation.

Three-year outcome: A new theoretical model for public engagement by scientists and the development of a professional learning community.

 

Communicating complexity, risk, and uncertainty to different publics

Effective communication of complexity, risk and uncertainty is one of the biggest challenges of science communication. Translation of information from scientist to communications professional to media to publics often results in the loss of nuances and in changed meanings and messages. At the same time, new technologies and social-media platforms are changing the ways in which scientists can engage and the types and sizes of audiences/publics they can reach, and are providing them with new metrics on the spread and perception of their messages.

This research project uses tools, methods and theories from science communication, public engagement with science and media studies to describe and analyse communications efforts by Te Pūnaha Matatini researchers to different publics and through different media channels. These include traditional scientific channels (scientist -> academic journal -> institutional press release -> news media story), new media channels (scientist or institution -> social media -> comments and responses), and story (e.g., from Māori scientists to Māori audiences). Data analytics will be used to measure the propagation of outreach and communication efforts. Texts (including tweets, interview transcripts, press releases, and media reports) will be analysed, for example using simple text mining, sentiment analysis, and modelling of diffusion on social networks, to track how complexity, risk and uncertainty are communicated and interpreted at different stages of the media cycle.
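
As a rough illustration of what "modelling of diffusion on social networks" can mean, the sketch below simulates an independent cascade, one common textbook diffusion model. It is not necessarily the model this project uses. The follower graph, the account names, and the sharing probability are all made up for the example.

```python
import random
from collections import deque


def independent_cascade(followers, seeds, share_prob, rng):
    """Simulate one cascade and return the set of accounts that shared.

    followers: dict mapping an account to the accounts that see its posts.
    share_prob: chance that a single exposure leads to a re-share.
    """
    reached = set(seeds)
    frontier = deque(seeds)
    while frontier:
        account = frontier.popleft()
        for follower in followers.get(account, []):
            # Each sharing account gets one independent attempt per follower.
            if follower not in reached and rng.random() < share_prob:
                reached.add(follower)
                frontier.append(follower)
    return reached


# Hypothetical toy network (assumption, not project data).
followers = {
    "scientist": ["journalist", "colleague"],
    "journalist": ["reader_a", "reader_b"],
    "colleague": ["reader_b"],
    "reader_a": [],
    "reader_b": ["reader_c"],
}

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
runs = [
    len(independent_cascade(followers, ["scientist"], 0.5, rng))
    for _ in range(1000)
]
mean_reach = sum(runs) / len(runs)
print(f"mean reach over 1000 simulated cascades: {mean_reach:.2f}")
```

Averaging over many simulated cascades gives an expected reach that could, in principle, be compared with observed propagation metrics.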

Three-year outcome: The development of models for the communication of complexity, risk, and uncertainty, particularly visualisation models.

 

Healthcare analytics: Medical signals

Medical data is collected continuously, and yet the primary form of recording is still often nurses looking at instantaneous snapshots and writing down the values. This provides the evidence that clinicians use for diagnosis, often of the form rule X –> action Y. The team at Wellington ICU are keen to use the plethora of personalised data streams now available to improve upon this, and Te Pūnaha Matatini have the opportunity to contribute. There is a whole host of simple things that could be done to improve every part of the process, not least simply storing the data so that evidence-based medicine can play its role. However, far more interesting questions (from the Te Pūnaha Matatini point of view) are how to present visualisations of data that is changing, how to detect changes in the waveform without generating excessive false positives, and what can be automated as responses to monitoring of vital signs. A simple example of the latter is hydration: automatic regulation of hydration through monitoring and reacting to changes in vital signs as they occur, not hours later when a nurse happens to take a reading at the right time.
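
To make the false-positive problem concrete, here is a minimal sketch of a two-sided CUSUM detector, a standard change-detection technique chosen purely for illustration. It is not a description of the methods the project will develop. The readings, baseline, slack, and threshold are hypothetical values.

```python
def cusum_alarms(samples, target, slack, threshold):
    """Return the indices at which a two-sided CUSUM detector raises an alarm.

    target: expected baseline level of the signal.
    slack: deviation per sample that is tolerated without accumulating.
    threshold: accumulated deviation required before an alarm is raised.
    """
    alarms = []
    high = low = 0.0
    for i, x in enumerate(samples):
        # Accumulate evidence of an upward and a downward shift separately.
        high = max(0.0, high + (x - target - slack))
        low = max(0.0, low + (target - x - slack))
        if high > threshold or low > threshold:
            alarms.append(i)
            high = low = 0.0  # restart accumulation after each alarm
    return alarms


# Hypothetical heart-rate-like readings: one isolated spike (index 4),
# followed later by a sustained upward shift (from index 9).
readings = [70, 71, 69, 70, 85, 70, 69, 71, 70, 78, 79, 80, 79, 81, 80]
print(cusum_alarms(readings, target=70, slack=2, threshold=20))
```

A fixed cut-off such as "alarm above 75" would fire on the isolated spike at index 4. The accumulating detector ignores it and only fires once the shift persists, which is the trade-off between sensitivity and false alarms described above.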

We will work with the clinical team in the ICU at Wellington hospital to study a variety of questions in applied machine learning and data visualisation. The application of the methods that we develop is of primary interest to this team. Our work will be a combination of theory and practice building on the current state of the art, but since we will be interacting directly with the clinical team we will remain linked to the real questions of medical relevance.

Three-year outcome: The development of tested and usable tools for clinicians to utilise real-time monitoring of digitally stored and analysed medical records.

 

Healthcare analytics: Investigating strategies and effects in patient pathways

Patient pathways are complex, involving many steps and the use of resources that are often limited, e.g., surgical teams, or required by many different pathways, e.g., imaging and diagnostics. Prioritising one patient pathway’s access to a resource may have adverse effects for other pathways that also need that resource. Determining effective prioritisation strategies for patients requires modelling of the complexity of patient pathways, their use of resources, and the effect of different prioritisation strategies. It also requires the consideration of appropriate metrics, in order to define a “good” prioritisation strategy.

We plan to combine analytic methods with simulation. Analytic methods allow for the characterisation of “optimal” prioritisation within a given set of metrics. However, they are usually limited in the scale and complexity of the system that can be considered. This is not an issue for simulation, which can consider a variety of complexities to accurately model patient pathways and resources, including bookings. The use of simulation to model real-world patient pathways and evaluate the efficacy of different prioritisation metrics for these pathways is a key component of this research and also provides validation of the analytic methods that will be developed: because simulation models permit the modelling of very complex situations, we would use them to test findings from simpler analytical models in a more realistic setting. They may also be used to demonstrate, with historical data provided by healthcare providers, the effects and efficacy of different prioritisation schemes (and other “what if” scenarios) before they are implemented in practice. Our aim is to test first on simulation models, then run a trial, and use the results from both to aid policy decisions.
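
The kind of "what if" comparison described here can be sketched with a toy simulation of a single shared resource (say, one imaging slot) under two prioritisation rules. This is a deliberately simplified illustration, not the project's simulation model. The patient tuples, urgency levels, and service times are invented.

```python
import heapq


def simulate(patients, key):
    """Serve patients one at a time on a single shared resource.

    patients: list of (arrival_time, urgency, service_time), sorted by arrival.
    key: maps a patient tuple to a sort key; among those waiting, the patient
         with the smallest key is served next (no pre-emption).
    Returns the mean waiting time for each urgency level.
    """
    waiting = []  # heap of (key, patient)
    waits = {}    # urgency -> list of observed waiting times
    clock = 0.0
    i = 0
    while i < len(patients) or waiting:
        # If the resource is idle and nobody is waiting, jump to the next arrival.
        if not waiting and patients[i][0] > clock:
            clock = patients[i][0]
        # Admit everyone who has arrived by the current time.
        while i < len(patients) and patients[i][0] <= clock:
            heapq.heappush(waiting, (key(patients[i]), patients[i]))
            i += 1
        _, (arrival, urgency, service) = heapq.heappop(waiting)
        waits.setdefault(urgency, []).append(clock - arrival)
        clock += service
    return {u: sum(w) / len(w) for u, w in sorted(waits.items())}


def first_come_first_served(patient):
    return patient[0]  # order by arrival time only


def urgent_first(patient):
    return (patient[1], patient[0])  # urgency 1 before 2, then by arrival


# Hypothetical patients: (arrival_time, urgency, service_time); 1 = most urgent.
patients = [(0, 2, 3), (1, 2, 3), (2, 1, 3)]

print("first come, first served:", simulate(patients, first_come_first_served))
print("urgent first:            ", simulate(patients, urgent_first))
```

In this toy case the urgent patient waits less under the second rule, while the mean wait of the routine patients rises. That is exactly the cross-pathway effect a prioritisation metric has to weigh.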

Three-year outcome: Development of fit-for-purpose patient prioritisation pathways for the New Zealand health system.

 

Networks in practice

Graphs and networks are an extremely useful way to represent many forms of agents and their interactions, enabling users to identify potential disease spread on a network of people meeting, to find groups of neurons that perform some cognitive task and so are linked together in a brain, or to build efficient supply chains for companies that manufacture objects from components made by other companies. Many of these networks exhibit additional properties, such as having a short average distance between nodes relative to the average number of connections each node has, or having nodes with a huge number of connections (called hubs). Networks with these properties are known as complex networks, and there is a huge body of work studying them, showing that everybody’s favourite system is some form of complex network, and discussing what this means. What has received less focus is a more practical question: what benefit is there from knowing this? The answer could help to explain why these networks arise at all, but more pragmatically, it will enable algorithms to be designed that take advantage of the network structure for efficiency or accuracy gains. This is the traditional approach in computational graph theory, which has not transferred to complex networks for some reason; we hope to remedy this.

In this project, we will use a variety of application areas to seek common ground for what types of network arise, and devise and study algorithms that can utilise the structures in the networks. For example, small world networks – which have short average distances between nodes considering the number of connections each node has – should enable efficient parallel search and fast routing algorithms. Scale-free networks, which exhibit self-similarity in their scaling, should have efficient visualisation algorithms. We will look at three application areas: (1) biology – including phylogenetic trees and networks, (2) human interaction networks, and (3) digital humanities; seeking common ground and underlying interpretational aims for the data in those areas. We will work backwards from there to see what common algorithmic requirements there are, and how they can be exploited, developing novel algorithms that utilise the network structure if possible.
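
The small-world property mentioned above can be illustrated with a short experiment: build a ring lattice, measure the mean shortest-path length, then add a few long-range links and measure again. This is only a sketch of the property itself, not of the algorithms the project aims to develop. The graph size and the shortcut links are arbitrary choices.

```python
from collections import deque


def ring_lattice(n, k):
    """Build a ring of n nodes, each linked to its k nearest neighbours per side."""
    graph = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            graph[v].add((v + step) % n)
            graph[(v + step) % n].add(v)
    return graph


def mean_distance(graph):
    """Mean shortest-path length over all connected ordered pairs (BFS per node)."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(40, 2)
before = mean_distance(lattice)

# Hypothetical long-range links; each adds one connection to two nodes.
for a, b in [(0, 20), (10, 30), (5, 25)]:
    lattice[a].add(b)
    lattice[b].add(a)
after = mean_distance(lattice)

print(f"mean distance before shortcuts: {before:.2f}, after: {after:.2f}")
```

A handful of shortcuts barely changes the number of connections per node but shortens typical paths, which is the structural feature a routing or search algorithm could try to exploit.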

Three-year outcome: Development of new approaches in network science that utilise the structures of complex networks, as opposed to merely identifying them.

 

Visualising and analysing collections of evolutionary (and other) networks

Inferring the evolutionary history of all life on Earth has long been a fascinating problem in biology. Traditionally, phylogenetic (evolutionary) trees have been used to analyse ancestral relationships between organisms. Recent investigations into horizontal gene transfer and hybridisation, which are processes that result in mosaic patterns of relationships, challenge the model of a phylogenetic tree. Indeed, it is now widely acknowledged that graphs with cycles, called phylogenetic networks, are better suited to represent evolutionary histories (see Fig. 1). In comparison to trees, they provide a more accurate picture of the relationships between entities such as species, cancer cells, viruses, and languages. However, phylogenetic networks are much more complicated to reconstruct and analyse than trees.

The purpose of this project is to develop the first set of tools to visualise and analyse collections of phylogenetic networks with regard to several similarity measures. Most phylogenetic network reconstruction algorithms return collections of equally optimal solutions. For example, for a particular grass data set, the reconstruction of a so-called most parsimonious phylogenetic network that embeds two given phylogenetic trees returns more than 2000 equally optimal networks. While the computation of pairwise distances gives a valid first approach in attempting to analyse such a set of networks, a more holistic approach is desirable, one that requires an exact representation of the relationships among all networks under consideration (e.g. computing a network that summarises the information of a set of input networks).
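
As a sketch of the pairwise-distance starting point, the code below groups items by single-linkage clustering on a precomputed distance matrix. Clustering is one of the multivariate tools this project names, but the particular method here is an assumption made for illustration. The matrix is invented, and the example does not say how a distance between two phylogenetic networks would be computed.

```python
def single_linkage_clusters(dist, cutoff):
    """Group items so that any pair within `cutoff` ends up in the same cluster.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Membership is transitive: chains of close pairs are merged.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical distances between five equally optimal networks.
dist = [
    [0, 1, 2, 7, 8],
    [1, 0, 1, 8, 7],
    [2, 1, 0, 7, 9],
    [7, 8, 7, 0, 1],
    [8, 7, 9, 1, 0],
]
print(single_linkage_clusters(dist, cutoff=2))
```

With a collection of thousands of networks, a grouping like this can reduce the set to a few representative families, although it captures only pairwise information and not the relationships among all networks at once.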

Three-year outcome: Investigate the use of tools from multivariate data analysis such as clustering and multidimensional scaling to visualise and analyse collections of phylogenetic networks.

 

Hidden networks: Hybrid approaches for the history of science

The history of science in New Zealand is notable for the ways in which participant and institutional histories dominate, leading to what Kate Hannah (2017) describes as “a further absenting of women from New Zealand’s scientific history.” This project, which draws on previous experimentation with the application of computational techniques to historical data, will develop a new, blended – or hybrid – historiography, using network science, narrative history methods and approaches, close reading, and reading against the grain, in order to bring to the surface women and other underrepresented minorities’ contributions to science.

We are developing a comprehensive literature review of existing qualitative and quantitative data sources on women in science in New Zealand, and working to make these available via a repository and series of publications, including New Zealand Dictionary of Biography and Wikipedia entries. This resource will enable recommendations for digitisation priorities for New Zealand scientific institutions and repositories, and contribute to a widely accessible repository for digitised records and data sets. Phase two of the project focuses on connecting, comparing, and evaluating data sources and data-gathering methods, such as oralcy, oral history, memoir, organisational records, data sets, participant histories, and collective biographies, assessing the contribution of rich, diverse data sources to the utility of network approaches in historical research. Phase three develops and tests a new theoretical approach to historical data, through integrating computational and traditional methods. This novel theoretical approach will enable revisionism of established histories of science in New Zealand, and uncover invisible nodes and clusters within the history of science.

Three-year outcome: Develop an exploratory empirical model which prioritises marginalised histories, using new approaches in network science which utilise the structures of complex networks.