Exposing and explaining misbehaviours of deep learning systems

Zohdinasab, Tahereh

Doctoral thesis

Università della Svizzera italiana

Zohdinasab, Tahereh
Tonella, Paolo (Degree supervisor)
Riccio, Vincenzo (Degree supervisor)

2024

PhD: Università della Svizzera italiana

Search based software engineering

Assessing the quality of Deep Learning (DL) systems is crucial, as they are increasingly adopted in safety-critical domains. Researchers have proposed several input generation techniques for DL systems. While such techniques can expose failures, they do not explain which features of the test inputs influenced the system's misbehaviour. This research investigates methodologies for overcoming the challenges of testing DL systems, with a particular focus on generating targeted test cases and interpreting system behaviours. To this aim, we propose three novel testing approaches for DL systems: DEEPHYPERION-CS, DEEPATASH, and DEEPTHEIA. DEEPHYPERION-CS explores the feature space at large using Illumination Search and provides a unique characterisation of a DL system's quality through an interpretable map, which represents the highest-performing (i.e., misbehaving or closest to misbehaving) inputs in the space of the relevant, domain-specific features. We introduce a novel methodology to guide users in manually defining and quantifying feature dimensions effectively. Our empirical study shows that DEEPHYPERION-CS is more effective than state-of-the-art DL testing tools in generating failure-inducing inputs associated with highly diverse features. DEEPATASH is a focused test generator, i.e., a solution for generating failure-inducing inputs with specific features. It can address the development-to-operation (dev2op) data shift phenomenon by focusing on interesting feature values observed in operational environments. To further improve test generation efficiency, DEEPATASH-LR integrates a surrogate model into the process. Experimental results show that both DEEPATASH and DEEPATASH-LR are effective in generating focused test inputs and in improving the quality of the original DL systems, without regression, through fine-tuning on data with the targeted features.
DEEPTHEIA is a fully automated illumination-based test generator capable of autonomously extracting features and exploring the feature space using diffusion models. It overcomes two limitations of illumination-based approaches such as DEEPHYPERION: the need for human expert involvement in defining the features, and the need for generative input models that can be mutated during the search process. Finally, we provide a thorough comparison of explanatory techniques used to understand DL system misbehaviours, including our newly proposed feature maps, shedding light on both their comprehensibility and their limitations. Our findings advance testing methodologies and improve the interpretability of the causes of DL misbehaviours.
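The idea of an illumination-based feature map mentioned in the abstract can be illustrated with a minimal sketch in the style of MAP-Elites, a well-known illumination algorithm. This is not the DEEPHYPERION-CS implementation: the toy `margin` function (a stand-in for a DL system's confidence margin, where a negative value means misbehaviour), the two feature dimensions, the grid size, and the mutation operator are all hypothetical choices made for illustration.

```python
import random


def illuminate(evaluate, features, mutate, seeds, cells=10, iterations=2000, rng=None):
    """Generic MAP-Elites-style loop: keep the best input found in each feature cell."""
    rng = rng or random.Random(0)
    archive = {}  # cell coordinates -> (fitness, input)

    def try_insert(x):
        # Discretise each feature value (assumed to lie in [0, 1]) into a cell index.
        cell = tuple(min(int(f * cells), cells - 1) for f in features(x))
        fit = evaluate(x)
        # Lower fitness means closer to (or beyond) misbehaviour; keep the lowest per cell.
        if cell not in archive or fit < archive[cell][0]:
            archive[cell] = (fit, x)

    for s in seeds:
        try_insert(s)
    for _ in range(iterations):
        # Pick a random elite from the current map and mutate it.
        _, parent = rng.choice(list(archive.values()))
        try_insert(mutate(parent, rng))
    return archive


def clamp(v):
    return min(max(v, 0.0), 1.0)


def mutate(x, rng):
    # Hypothetical mutation: small Gaussian perturbation, kept inside [0, 1].
    return tuple(clamp(v + rng.gauss(0.0, 0.1)) for v in x)


def features(x):
    # Hypothetical feature dimensions: the first two components of the input.
    return x[0], x[1]


def margin(x):
    # Hypothetical stand-in for a confidence margin; negative means misbehaviour.
    return 0.5 - x[2] * (x[0] + x[1]) / 2


archive = illuminate(margin, features, mutate, seeds=[(0.5, 0.5, 0.5)])
failing = {cell for cell, (fit, _) in archive.items() if fit < 0}
print(f"{len(archive)} cells filled, {len(failing)} contain a misbehaving input")
```

The resulting `archive` plays the role of the map: each filled cell holds the input closest to misbehaviour for that combination of feature values, so cells with a negative margin show which feature combinations are associated with failures in this toy setting.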

Collections

USI Faculty of Informatics

Language

English

Classification

Computer science and technology

License

License undefined

Open access status

green

Identifiers

NDP-USI 2024INF004
ARK ark:/12658/srd1328318
URN urn:nbn:ch:rero-006-121977

Persistent URL

https://n2t.net/ark:/12658/srd1328318

Statistics

Document views: 727
File downloads (2024INF004): 770

Software testing

Deep learning

Search based software engineering
