Welcome to quanda’s documentation!¶

quanda is a toolkit for quantitative evaluation of data attribution methods in PyTorch.

Note

quanda is under active development. Note the release version to ensure reproducibility of your work. Contributions, bug reports, and feature requests are welcome.

Fig. 1: quanda provides a unified and standardized framework to evaluate the quality of Training Data Attribution methods in different contexts and from different perspectives.¶

Note

This page describes quanda’s purpose, design and features. For a quick start on quanda, please refer to the Quickstart page.

Training Data Attribution (TDA) is a new avenue in the interpretation of neural networks. While some methods attempt to estimate the counterfactual effects of training new models on the subsets of the training dataset, this ground truth is noisy and hard to compute. Therefore, the community has proposed evaluating these methods’ performance on a downstream task, or measuring how well a method satisfies desired heuristic properties.

quanda is designed to meet the need of a comprehensive and systematic evaluation framework, as well as a unified interface for attributors. Please visit the Background page for a detailed explanation of TDA, including citations to the relevant literature.

Library Features¶

Here we list the main components of quanda along with basic explanations of their function. We refer the reader to the Contribution guide, the API reference and the Basic Usage for further details. Below is a schematic representation of the components of quanda:

Fig. 2: Components and their interactions in quanda¶

Explainers

quanda provides a unified interface for various TDA methods, symbolized by the Explainer base class. The interface design prioritizes ease of use and easy extensions, allowing users to quickly wrap their implementations to use within quanda.

Metrics

quanda provides a set of metrics to evaluate the effectiveness of TDA methods. These metrics are based on the latest research in the field. Most Metric objects in quanda are used to compute the evaluation scores from attributions over a test set. The Metric objects are designed to be easily extendable, allowing users to define their own metrics.

Benchmarks

Note that many metrics require training models in controlled settings, e.g. with mislabeled samples that are known. This means that the corresponding Metric objects can only be used if the user has prepared this controlled setup. Furthermore, Metric objects require generating the attributions beforehand. quanda provides a benchmarking tool to evaluate the performance of TDA methods on a given model, dataset and problem. For each Metric object, quanda provides a Benchmark object. The Benchmark objects handle the creation of the controlled setup, training the model, generating the attributions and evaluating them using the corresponding Metric object, if needed. Finally, we provide precomputed benchmarks, which can be used by initializing the object with the load_pretrained method. These precomputed benchmarks allow the user to skip the creation of the controlled setup to directly start the evaluation process, while providing a standard benchmark for practitioners and researchers to compare their methods with.

Supported TDA Libraries¶

Library	Reference
Captum (Similarity Influence, Arnoldi Influence Functions, TracIn)	Caruana et al., 1999; Schioppa et al., 2022; Koh and Liang, 2017; Pruthi et al., 2020
Representer Point Selection (Representer Point Selection)	Yeh et al., 2018
TRAK (TRAK)	Park et al., 2023
Kronfluence (Kronfluence)	Grosse et al., 2023
Dattri (Influence Functions: Explicit / CG / LiSSA / DataInf, Arnoldi, EK-FAC, TracInCP, Grad-Dot, Grad-Cos, TRAK)	Deng et al., 2024

Evaluation Metrics¶

In this section, we list the evaluation criteria that are currently available in quanda.

Name	Reference	Description	Type
Linear Datamodeling Score	Park et al., 2023	Measures the correlation between the (grouped) attribution scores and the actual output of models trained on different subsets of the training set. For each subset, the linear datamodeling score compares the actual model output with the sum of attribution scores from the subset using Spearman rank correlation.	Ground Truth
Class Detection / Subclass Detection	Hanawa et al., 2021	Measures the proportion of identical classes or subclasses in the top-1 training samples over the test dataset. If the attributions are based on similarity, they are expected to be predictive of the class of the test datapoint, as well as different subclasses under a single label.	Downstream Task Evaluator
Shortcut Detection	Yolcu et al., 2025	Assuming a known shortcut, or Clever-Hans effect has been identified in the model, this metric evaluates how effectively a TDA method can identify shortcut samples as the most influential in predicting cases with the shortcut artifact. This process is referred to as Domain Mismatch Debugging in the original paper.	Downstream Task Evaluator
Mislabeled Data Detection	Koh and Liang, 2017	Computes the proportion of noisy training labels detected as a function of the percentage of inspected training samples. The samples are inspected in order according to their global TDA ranking, which is computed using local attributions. This produces a cumulative mislabeling detection curve. We expect to see a curve that rapidly increases as we check more of the training data, thus we compute the area under this curve.	Downstream Task Evaluator
Top-K Cardinality	Barshan et al., 2020	Measures the cardinality of the union of the top-K training samples. Since the attributions are expected to be dependent on the test input, they are expected to vary heavily for different test points, resulting in a low overlap (high metric value).	Heuristic
Model Randomization	Hanawa et al., 2021	Measures the correlation between the original TDA and the TDA of a model with randomized weights. Since the attributions are expected to depend on model parameters, the correlation between original and randomized attributions should be low.	Heuristic
Mixed Datasets	Hammoudeh and Lowd, 2022	In a setting where a model has been trained on two datasets: a clean dataset (e.g. CIFAR-10) and an adversarial (e.g. zeros from MNIST), this metric evaluates how well the model ranks the importance (attribution) of adversarial samples compared to clean samples when making predictions on an adversarial example.	Heuristic
Mean Reciprocal Rank (MRR)	Akyurek et al., 2022	For fact-tracing settings, measures the mean reciprocal rank of the highest-ranked entailing proponent across fact queries.	Downstream Task Evaluator
Recall@k	Akyurek et al., 2022	For fact-tracing settings, measures the proportion of facts for which an entailing proponent appears in the top-k retrievals.	Downstream Task Evaluator
Tail Patch	Chang et al., 2025	For fact-tracing settings, measures the incremental change in target-sequence probability after taking a single training step on retrieved proponents.	Downstream Task Evaluator

Metric Interpretation Guideline¶

Metric	Output range	Better
ClassDetection	`[0, 1]`	higher
SubclassDetection	`[0, 1]`	higher
MislabelingDetection	`[0, 1]`	higher
ShortcutDetection	`[0, 1]`	higher
MixedDatasets	`[0, 1]`	higher
TopKCardinality	`[0, 1]`	higher
ModelRandomization	`[-1, 1]`	closer to 0
LinearDatamodelingScore	`[-1, 1]`	higher
MRR	`[0, 1]`	higher
RecallAtK	`[0, 1]`	higher
TailPatch	`[-1, 1]`	higher

Benchmarks¶

quanda comes with a number of pre-computed benchmarks that can be conveniently used for evaluation in a plug-and-play manner. We are planning to significantly expand the number of benchmarks in the future. Currently available benchmarks span vision (MNIST / LeNet, CIFAR-10 / ResNet-9, AWA2 / ResNet-50), text classification (QNLI / BERT), and causal language modeling (T-REx / GPT-2 fine-tuned on OpenWebText). The benchmark IDs listed below are to be passed to load_pretrained.

Metric	Type	Modality	Benchmark IDs (Dataset / Model)
TopKCardinalityMetric	Heuristic	Vision	mnist_top_k_cardinality (MNIST / LeNet) cifar_top_k_cardinality (CIFAR-10 / ResNet-9) awa2_top_k_cardinality (AWA2 / ResNet-50)
		Text	qnli_top_k_cardinality (QNLI / BERT)
ModelRandomizationMetric	Heuristic	Vision	mnist_model_randomization (MNIST / LeNet) cifar_model_randomization (CIFAR-10 / ResNet-9) awa2_model_randomization (AWA2 / ResNet-50)
		Text	qnli_model_randomization (QNLI / BERT)
MixedDatasetsMetric	Heuristic	Vision	mnist_mixed_datasets (MNIST / LeNet) cifar_mixed_datasets (CIFAR-10 / ResNet-9) awa2_mixed_datasets (AWA2 / ResNet-50)
		Text	qnli_mixed_datasets (QNLI / BERT)
ClassDetectionMetric	Downstream Task Evaluator	Vision	mnist_class_detection (MNIST / LeNet) cifar_class_detection (CIFAR-10 / ResNet-9) awa2_class_detection (AWA2 / ResNet-50)
		Text	qnli_class_detection (QNLI / BERT)
SubclassDetectionMetric	Downstream Task Evaluator	Vision	mnist_subclass_detection (MNIST / LeNet) cifar_subclass_detection (CIFAR-10 / ResNet-9) awa2_subclass_detection (AWA2 / ResNet-50)
MislabelingDetectionMetric	Downstream Task Evaluator	Vision	mnist_mislabeling_detection (MNIST / LeNet) cifar_mislabeling_detection (CIFAR-10 / ResNet-9) awa2_mislabeling_detection (AWA2 / ResNet-50)
		Text	qnli_mislabeling_detection (QNLI / BERT)
ShortcutDetectionMetric	Downstream Task Evaluator	Vision	mnist_shortcut_detection (MNIST / LeNet) cifar_shortcut_detection (CIFAR-10 / ResNet-9) awa2_shortcut_detection (AWA2 / ResNet-50)
MRRMetric	Downstream Task Evaluator	Causal LM	gpt2_trex_openwebtext_ft_mrr (T-REx / GPT-2 fine-tuned on OpenWebText)
RecallAtKMetric	Downstream Task Evaluator	Causal LM	gpt2_trex_openwebtext_ft_recall_at_k (T-REx / GPT-2 fine-tuned on OpenWebText)
TailPatchMetric	Downstream Task Evaluator	Causal LM	gpt2_trex_openwebtext_ft_tail_patch (T-REx / GPT-2 fine-tuned on OpenWebText)
LinearDatamodelingMetric	Ground Truth	Vision	mnist_linear_datamodeling (MNIST / LeNet) cifar_linear_datamodeling (CIFAR-10 / ResNet-9) awa2_linear_datamodeling (AWA2 / ResNet-50)
		Text	qnli_linear_datamodeling (QNLI / BERT)

Citation¶

If you find quanda useful and want to use it in your research, please cite it using the following BibTeX entry:

@misc{bareeva2024quandainterpretabilitytoolkittraining,
      title={Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond},
      author={Dilyara Bareeva and Galip Ümit Yolcu and Anna Hedström and Niklas Schmolenski and Thomas Wiegand and Wojciech Samek and Sebastian Lapuschkin},
      year={2024},
      eprint={2410.07158},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2410.07158},
}

If you are using quanda for your scientific research, please also make sure to cite the original authors for the implemented metrics and TDA methods.

Usage

API Reference

quanda

Background

Community

Contribution Guide for quanda