In the last decade, machine learning has developed into a key
technology. Its applications are everywhere: they are changing the
world we live in and influencing the way scientific discoveries are
made.
However, machine learning algorithms are never perfect - an algorithm that
always makes a correct prediction cannot exist. Rather, different
algorithms are based on different assumptions, make
different types of mistakes, and have different strengths and weaknesses.
This is where our research comes in.
Our focus is to understand machine learning algorithms from a
theoretical point of view, to uncover their implicit mechanisms, and
to give formal statistical guarantees for their performance. This
enables us to reveal the fundamental assumptions, biases, strengths
and weaknesses of widely used machine learning algorithms.
Explainable machine learning
Many machine learning models are so complex that their behaviour is
hard to understand, even for experts. The field of explainable
machine learning tries to solve this problem by constructing
algorithms that "explain" the model's behaviour. In our research
group we investigate whether such explanation algorithms are reliable
(a toy sketch of such an algorithm follows the list below). Here are
links to some example papers of our group:
- From Shapley Values to Generalized Additive Models and back, 2022. [arXiv]
- Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts, FAccT 2022. [arXiv]
- Explaining the Explainer: A First Theoretical Analysis of LIME, AISTATS 2020. [pdf]
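To make this concrete, here is a minimal sketch in the spirit of local
surrogate explanations such as LIME (our own toy construction, not the
exact method analysed in the papers above): we query a black-box model
around a single input and fit a weighted linear model whose
coefficients serve as the explanation.

    import numpy as np

    rng = np.random.default_rng(0)

    # Our own stand-in for a trained model that we can only query.
    def black_box(X):
        return np.sin(3 * X[:, 0]) + X[:, 0] * X[:, 1]

    # The input whose prediction we want to explain.
    x0 = np.array([0.5, -0.2])

    # 1. Sample perturbations around x0 and query the black box.
    Z = x0 + 0.1 * rng.standard_normal((500, 2))
    y = black_box(Z)

    # 2. Weight the samples by their proximity to x0.
    w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / (2 * 0.1 ** 2))

    # 3. Fit a weighted linear model; its coefficients are the "explanation".
    A = np.hstack([Z, np.ones((len(Z), 1))])         # add an intercept column
    coef, *_ = np.linalg.lstsq(A * np.sqrt(w)[:, None],
                               y * np.sqrt(w), rcond=None)
    print("local feature weights:", coef[:2])

Whether the numbers produced by such a procedure can actually be
trusted is exactly the kind of question we study.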
When machine learning algorithms provably fail
Most machine learning researchers work on deriving new or better
machine learning algorithms, and their analyses typically focus on
demonstrating that these algorithms work well in many cases. We find
it intriguing to reveal under which circumstances popular machine
learning algorithms provably fail (a numerical illustration of one
such failure follows the list below). Some example publications in
this direction are:
- Too Relaxed to Be Fair, 2020. [pdf]
- When do random forests fail?, 2018. [pdf]
- Peer grading in a course on algorithms and data structures: machine learning algorithms do not improve over simple baselines, 2016. [pdf]
- Getting lost in space: Large sample analysis of the commute distance, 2010. [pdf]
- A Sober Look on Clustering Stability, 2005. [pdf]
- Limits of spectral clustering, 2004. [pdf]
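As a concrete illustration of one such failure (the phenomenon behind
the commute-distance paper above): on large geometric graphs, the
commute distance between two vertices degenerates to a quantity that
depends on their degrees only, so it stops reflecting the global
structure of the graph. A self-contained numerical sketch, using our
own toy construction:

    import numpy as np

    rng = np.random.default_rng(1)

    # Our own toy setup: a random geometric graph on n uniform points,
    # connecting every pair of points within distance eps.
    n, eps = 400, 0.25
    X = rng.uniform(size=(n, 2))
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    A = ((D2 < eps ** 2) & (D2 > 0)).astype(float)   # adjacency matrix

    deg = A.sum(axis=1)
    L = np.diag(deg) - A                 # unnormalized graph Laplacian
    Lp = np.linalg.pinv(L)               # its Moore-Penrose pseudoinverse
    vol = deg.sum()                      # volume of the graph

    # Exact commute distance between two vertices i and j:
    # C_ij = vol * (Lp_ii + Lp_jj - 2 * Lp_ij).
    i, j = 0, 1
    C = vol * (Lp[i, i] + Lp[j, j] - 2 * Lp[i, j])

    # The degenerate large-graph limit depends on the degrees only.
    print("commute distance :", C)
    print("degree-only limit:", vol * (1 / deg[i] + 1 / deg[j]))

Already at this graph size the two printed numbers are close, and in
the large-graph limit they coincide, even though the commute distance
was designed to capture far more than vertex degrees.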
Understanding implicit assumptions and hidden biases
Machine learning algorithms always come with hidden assumptions and
biases. However, these are rarely stated explicitly; they are
implicit consequences of the way the algorithm has been built. Part
of our work is to make these biases explicit, to formalize them in
the language of mathematics, and to reveal the implicit inductive
principles invoked by particular algorithms (a toy example of such a
bias follows the list below). Some examples of this line of work are
the following papers:
- NetGAN without GAN: From Random Walks to Low-Rank Approximations, 2020. [pdf]
- Explaining the Explainer: A First Theoretical Analysis of LIME, 2020. [pdf]
- How the result of graph clustering methods depends on the construction of the graph, 2013. [pdf]
- Phase transition in the family of p-resistances, 2011. [pdf]
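As a toy example of what we mean by an implicit bias (our own
illustration, not taken from the papers above): a linear model and a
nearest-neighbour model fitted to the same five points extrapolate in
completely different ways, because each encodes a different implicit
assumption about which functions are plausible.

    import numpy as np

    # Our own toy data: five points on the curve y = x^2.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = x ** 2

    # Bias 1: linear least squares assumes the trend continues linearly.
    slope, intercept = np.polyfit(x, y, deg=1)

    # Bias 2: 1-nearest-neighbour regression assumes the function is locally
    # constant, so its predictions never leave the range of observed labels.
    def one_nn(x_new):
        return y[np.argmin(np.abs(x - x_new))]

    x_new = 10.0                         # far outside the training range
    print("linear model:", slope * x_new + intercept)   # keeps growing (38.0)
    print("1-NN model:  ", one_nn(x_new))               # stays at 16.0

Neither prediction is wrong per se; each is the logical consequence of
an inductive principle that is nowhere stated explicitly.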
Formal mathematical guarantees
Machine learning algorithms are never perfect - yet some of them can
be proven to pass basic suitability tests, while others cannot. For
example, one might be able to prove that an algorithm is
statistically consistent: as the algorithm gets to see more and more
data, its prediction performance converges to the best possible one.
Part of our work consists in giving such formal guarantees (a small
simulation illustrating consistency follows the list below). Example
papers are:
- Foundations of Comparison-Based Hierarchical Clustering, 2019. [pdf]
- Uniqueness of ordinal embedding, 2014. [pdf]
- Consistency of spectral clustering, 2008. [pdf]
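To make the notion of consistency concrete: writing R(f_n) for the
risk of a classifier trained on n samples and R* for the Bayes risk,
consistency means that R(f_n) converges to R* as n grows. The small
simulation below (our own sketch, illustrating the textbook fact that
k-nearest-neighbour classification is consistent when k grows like
sqrt(n)) shows the test error approaching the Bayes risk:

    import numpy as np

    rng = np.random.default_rng(2)

    # Our own toy problem: X uniform on [0, 1], label 1 with probability
    # p(x) = x.  The Bayes risk is E[min(p(X), 1 - p(X))] = 0.25.
    def sample(n):
        X = rng.uniform(size=n)
        y = (rng.uniform(size=n) < X).astype(int)
        return X, y

    def knn_predict(X_train, y_train, X_test, k):
        # Majority vote among the k nearest training points (1d, brute force).
        d = np.abs(X_train[None, :] - X_test[:, None])
        idx = np.argpartition(d, k, axis=1)[:, :k]
        return (y_train[idx].mean(axis=1) > 0.5).astype(int)

    X_test, y_test = sample(1000)
    for n in [100, 1000, 5000]:
        X_train, y_train = sample(n)
        k = int(np.sqrt(n))              # k grows with n, but k/n -> 0
        err = np.mean(knn_predict(X_train, y_train, X_test, k) != y_test)
        print(f"n={n:5d}  k={k:3d}  test error={err:.3f}  (Bayes risk: 0.250)")

Establishing guarantees of this kind for more intricate procedures,
such as spectral clustering, is considerably harder and is the subject
of the papers above.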
Many more directions
Our research encompasses many other directions; please have a look at
our publication list.