Understanding Black-box Predictions via Influence Functions

Pang Wei Koh and Percy Liang. International Conference on Machine Learning (ICML), 2017. Best paper award.

Overview

How can we explain the predictions of a black-box model? Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. In this paper, we use influence functions -- a classic technique from robust statistics -- to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying the training points most responsible for a given prediction. To scale up influence functions to modern machine learning settings, we develop a simple, efficient implementation that requires only oracle access to gradients and Hessian-vector products. We show that even on non-convex and non-differentiable models, where the theory breaks down, approximations to influence functions can still provide valuable information. On linear models and convolutional neural networks, we demonstrate that influence functions are useful for multiple purposes: understanding model behavior, debugging models, detecting dataset errors, and even creating visually indistinguishable training-set attacks.

A recorded talk is available at https://www.microsoft.com/en-us/research/video/understanding-black-box-predictions-via-influence-functions/.
How Influence Functions Work

Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters, without retraining the model. The idea is to compute the parameter change if a training point $z$ were upweighted by some small $\epsilon$, giving new parameters

$$\hat\theta_{\epsilon,z} := \arg\min_\theta \frac{1}{n}\sum_{i=1}^n L(z_i, \theta) + \epsilon L(z, \theta).$$

A classic result from robust statistics tells us that the influence of upweighting $z$ on the parameters $\hat\theta$ is given by

$$\mathcal{I}_{\text{up,params}}(z) := \left.\frac{d\hat\theta_{\epsilon,z}}{d\epsilon}\right|_{\epsilon=0} = -H_{\hat\theta}^{-1}\,\nabla_\theta L(z, \hat\theta),$$

where $H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^n \nabla^2_\theta L(z_i, \hat\theta)$ is the Hessian of the empirical risk. Applying the chain rule gives the influence of upweighting $z$ on the loss at a test point $z_{\text{test}}$:

$$\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) = -\nabla_\theta L(z_{\text{test}}, \hat\theta)^\top H_{\hat\theta}^{-1}\,\nabla_\theta L(z, \hat\theta).$$

Removing $z$ corresponds to upweighting it by $\epsilon = -1/n$: if there are $n$ samples, removing one changes its weight by $1/n$, so the predicted change in test loss is $-\frac{1}{n}\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}})$. The two components of influence have clean interpretations: the $H_{\hat\theta}^{-1}\,\nabla_\theta L(z_{\text{test}}, \hat\theta)$ term (called s_test in the implementations below) measures how the test loss responds to small parameter perturbations, while the training-loss gradient $\nabla_\theta L(z, \hat\theta)$ measures how strongly $z$ pushes on those parameters.

Explicitly forming and inverting the Hessian is infeasible for modern networks, so the implementation estimates $H^{-1}v$ using only Hessian-vector products (Pearlmutter's fast exact multiplication by the Hessian) inside a stochastic recursion; the precision of the output can be adjusted by using more iterations and/or more recursions when approximating the influence. The theory assumes a twice-differentiable, strictly convex loss, yet even on non-convex and non-differentiable models -- for example, an SVM's hinge loss smoothed into a SmoothHinge -- the approximations still provide valuable information.
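The efficient implementation rests on two pieces: Pearlmutter's trick for Hessian-vector products and a stochastic recursion (in the style of LiSSA, which the paper builds on) for approximating $H^{-1}v$. Below is a minimal PyTorch sketch of both; the function names and the hyperparameter defaults (damping, scale, step counts) are illustrative assumptions, not the reference implementation's exact API.

```python
# Minimal sketch: estimate s_test = H^{-1} grad_theta L(z_test, theta_hat)
# using Hessian-vector products and a LiSSA-style stochastic recursion.
# All names and default values here are illustrative assumptions.
import torch


def hvp(loss, params, vec):
    """Hessian-vector product via double backprop (Pearlmutter's trick)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)


def estimate_s_test(model, loss_fn, z_test, train_loader,
                    damping=0.01, scale=25.0, recursions=1, steps=1000):
    params = [p for p in model.parameters() if p.requires_grad]
    x_t, y_t = z_test
    v = torch.autograd.grad(loss_fn(model(x_t), y_t), params)  # test gradient
    estimates = []
    for _ in range(recursions):
        h = [vi.clone() for vi in v]          # running estimate of H^{-1} v
        for step, (x, y) in enumerate(train_loader):
            if step >= steps:
                break
            hv = hvp(loss_fn(model(x), y), params, h)
            # LiSSA update: h <- v + (1 - damping) * h - (H h) / scale
            h = [vi + (1 - damping) * hi - hvi / scale
                 for vi, hi, hvi in zip(v, h, hv)]
        estimates.append([hi / scale for hi in h])
    # average the repeated recursions to reduce the variance of the estimate
    return [torch.stack(hs).mean(0) for hs in zip(*estimates)]
```

In the repositories described below, the analogous computation is split into grad_z (per-training-point gradients) and s_test (the inverse-Hessian-vector product above).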
Validation and Applications

Do these approximations hold up in practice? On convex models, influence estimates match leave-one-out retraining almost exactly, and even on deep networks, where the convexity assumptions fail, influence estimates align well with leave-one-out retraining. The estimates also expose qualitative differences between models: comparing an Inception-V3 network against an RBF-kernel SVM (trained with a SmoothHinge loss) on the same test image, the most influential training examples show that the Inception network picked up on the distinctive characteristics of the fish itself, while the SVM relied on more superficial similarity.

Why use influence functions? They reveal insights about how models rely on and extrapolate from the training data, which supports several practical uses (a concrete sketch follows this list):

- Understanding model behavior: identifying which training points a given prediction depends on.
- Debugging models and detecting dataset errors: surfacing likely mislabeled training examples for human review.
- Dataset curation: compressing your dataset slightly to the most influential examples, which can increase prediction accuracy and reduce training time and memory requirements.
- Security analysis: creating visually indistinguishable training-set attacks, connecting influence to the data-poisoning literature (e.g., poisoning attacks against support vector machines and machine-teaching attacks on learners).

Follow-up work extends these ideas. Often we want to identify an influential group of training samples behind a particular test prediction; whereas the original influence functions use first-order approximations of the effect of removing a sample from the training set, second-order group influence functions refine this for groups of points. Later work also studies the accuracy of influence functions on larger models such as ResNet-110, and asks, more fundamentally: if influence functions are the answer, then what is the question?
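To make the debugging use concrete, here is a small sketch that scores every training point by its estimated effect on one test prediction, building on estimate_s_test above. The helper name and sign convention are my assumptions; different implementations order the scores differently.

```python
# Rank training points by estimated effect on a single test prediction.
import torch


def rank_training_points(model, loss_fn, train_points, s_test):
    """Estimate the change in test loss from removing each training point.

    A positive score means removing the point would *increase* the test
    loss, i.e. the point is helpful for this prediction; negative = harmful.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    n = len(train_points)
    scores = []
    for x, y in train_points:
        grad_z = torch.autograd.grad(loss_fn(model(x), y), params)
        # I_up,loss(z, z_test) = -grad_z . s_test
        i_up_loss = -sum((g * s).sum() for g, s in zip(grad_z, s_test))
        # removing z ~ upweighting it by -1/n
        scores.append(-i_up_loss.item() / n)
    order = sorted(range(n), key=scores.__getitem__, reverse=True)
    helpful, harmful = order, order[::-1]
    return scores, helpful, harmful
```

The helpful/harmful split here mirrors the ordered ID lists produced by the PyTorch reimplementation described below.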
Implementations

The authors' reference implementation (TensorFlow) is linked from the paper, and a reproducible, executable, and Dockerized version of these scripts is available on Codalab. Two reimplementations are also available:

- A PyTorch reimplementation of influence functions from the ICML 2017 best paper "Understanding Black-box Predictions via Influence Functions" by Pang Wei Koh and Percy Liang. Dependencies: Numpy/Scipy/Scikit-learn/Pandas. You can either install the package directly through pip or clone the repository; running the tests carries further requirements.
- An unofficial Chainer implementation of the same paper. Requirements: chainer v3, whose FunctionHook it uses.

The PyTorch package's first mode is called calc_img_wise: it calculates the grad_z values for all training images first and saves them to disk, then computes s_test and the influences for each test image in turn. Caching grad_z matters because it would otherwise have to be calculated twice per test image, and the saved values can be reused for all subsequent s_test calculations, which could potentially number in the tens of thousands. This reuse can speed up the calculation significantly, as no duplicate calculations take place.
When testing for a single test image, you can compute its influences directly; if the influence function is calculated for multiple test samples, the cached grad_z values are shared across all of them. Visualised, the output can look like this: the test image for which the influences were calculated appears on the top left, followed by training images ordered by helpfulness (most helpful first) or by harmfulness. Influence functions can of course also be used for data other than images, in which case the results are interpreted in terms of the dataset's own samples rather than pictures.
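A plausible end-to-end invocation, assembled from the workflow described above. The import name, function names, and config keys are assumptions about the package layout, so check them against the repository before use.

```python
# Hypothetical usage sketch; names below are assumptions, not a verified API.
import pytorch_influence_functions as ptif  # assumed import name

model = get_trained_model()                # hypothetical helper: your nn.Module
train_loader, test_loader = get_loaders()  # hypothetical helper: DataLoaders

config = ptif.get_default_config()         # default values for the calculation
config["recursion_depth"] = 5000           # more recursions -> higher precision
config["r_averaging"] = 10                 # repeats averaged to reduce variance

# calc_img_wise: cache grad_z for all training images, then compute s_test
# and the influences for each test image in turn.
influences, harmful, helpful = ptif.calc_img_wise(
    config, model, train_loader, test_loader)
```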
The results are returned as a dict whose structure looks similar to this: harmful is a list of numbers, which are the IDs of the training data samples ordered by harmfulness; helpful is the corresponding list ordered by helpfulness; and the dict also records the influence value of every training sample and the prediction outcome of the processed test samples. The repository README covers Requirements, Installation, Usage, Background and Documentation, the config dict, and miscellaneous parameters, and recommends changing the parameters to your liking. Two practical notes: the precision of the output can be adjusted by using more iterations and/or more recursions when approximating the influence, and the cached grad_z files can take significant amounts of disk space (100s of GBs), though with a fast SSD the workflow remains practical.
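For illustration, the results dict plausibly has a shape like the following; the field names beyond harmful/helpful are assumptions, not the package's exact schema.

```python
# Illustrative shape of the results dict; exact field names may differ.
influences = {
    "0": {                           # ID of the test sample
        "label": 3,                  # its label
        "influence": [0.2, -1.5],    # influence value of every training sample
        "harmful": [499, 12, 1],     # training-sample IDs, most harmful first
        "helpful": [87, 401, 7],     # training-sample IDs, most helpful first
    },
}
```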
Course Context

This paper also features in a graduate course on neural net training dynamics. Neural nets have achieved amazing results over the past decade in domains as broad as vision, speech, language understanding, medicine, robotics, and game playing. One would have expected this success to require overcoming significant obstacles that had been theorized to exist; while questions of optimization and generalization had consumed much of the research community's attention for simpler models, the attitude of the neural nets community was to train first and ask questions later. Apparently this worked. The class is about developing the conceptual tools to understand what happens when a neural net trains: not a prescriptive recipe, but rather the aim is to give you the conceptual tools you need to reason through the factors affecting training in any particular instance. Hopefully this understanding will let us improve the algorithms. Topics include:

- Linearization, one of our most important tools for understanding nonlinear systems, and efficient computation with Jacobian-vector products.
- Using the Hessian to diagnose slow convergence and, as in this paper, to interpret the dependence of a network's predictions on the training data.
- Second-order optimization, motivated from several perspectives (minimizing second-order Taylor approximations, preconditioning, invariance, and proximal optimization), with updates approximated via conjugate gradient or Kronecker-factored approximations.
- Momentum: the heavy ball method and why the Nesterov Accelerated Gradient can further speed up convergence; adaptive gradient methods, normalization, and weight decay, and the gotchas they introduce in deep learning systems.
- Stochastic optimization and scaling: two models that make vastly different predictions about convergence behavior (the noisy quadratic model and the interpolation regime), which optimization techniques are useful at which batch sizes, and when we can take advantage of parallelism to train neural nets.
- Infinite limits and overparameterization: systems often become easier to analyze in the limit. A classic result by Radford Neal showed that (using proper scaling) the distribution of functions of random neural nets approaches a Gaussian process, and the more recent Neural Tangent Kernel gives an elegant way to understand gradient descent dynamics in function space. Highly overparameterized models can behave very differently from more traditional underparameterized ones: linear regression already shows why it's a good idea to normalize the inputs and exhibits the double descent phenomenon whereby increasing dimensionality can reduce overfitting, and gradient descent on neural networks typically occurs on the edge of stability.
- Function-space geometry: in many cases, the distance between two neural nets is more profitably defined in terms of the distance between the functions they represent than between their weight vectors. Metrics give a local notion of distance on a manifold, leading to an important optimization tool called the natural gradient.
- Bilevel optimization: problems where optimization is explicitly built into the architecture, as in MAML or Deep Equilibrium Models, and self-tuning networks, which train a network to locally approximate the best-response function.
- Differentiable games (lecture by Guodong Zhang): things get more complicated when there are multiple networks being trained simultaneously to different cost functions; the focus is mostly on minimax optimization, or zero-sum games.

For tooling, the course uses Python and the JAX deep learning framework. In contrast with TensorFlow and PyTorch, JAX has a clean NumPy-like interface which makes it easy to use things like directional derivatives, higher-order derivatives, and differentiating through an optimization procedure; there are also several full-featured neural net libraries built on top of JAX, designed to resemble frameworks such as PyTorch or Keras. For simple architectures (e.g. multilayer perceptrons), you can use straight-up JAX so that you understand everything that's going on -- but keep in mind that some of the key concepts, such as directional derivatives or Hessian-vector products, might not be so straightforward to use in other frameworks. A JAX illustration follows below.

Coursework consists of a Colab notebook and paper presentation (25%): in groups of 2-3 you pick a paper from a list, read and understand it, and produce a Colab notebook demonstrating one of its key ideas. For the final project, you carry out a small research project relating to the course content, using whatever language and framework you like; the final report is due April 7. The schedule is tentative and will likely change as the course goes on. A sign-up sheet will be distributed via email, and all information about attending virtual lectures, tutorials, and office hours will be sent to enrolled students through Quercus.
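As a small illustration of that last point -- and of the Hessian-vector products at the heart of this paper -- here is an HVP in JAX, written forward-over-reverse. The toy loss and data are made up for the example.

```python
# Hessian-vector product in JAX: a directional derivative of the gradient.
import jax
import jax.numpy as jnp


def loss(w, x, y):
    return jnp.mean((x @ w - y) ** 2)


def hvp(w, v, x, y):
    # forward-over-reverse: JVP of the gradient function in direction v
    grad_fn = lambda w_: jax.grad(loss)(w_, x, y)
    return jax.jvp(grad_fn, (w,), (v,))[1]


x = jnp.ones((8, 3))
y = jnp.zeros(8)
w = jnp.array([1.0, -2.0, 0.5])
v = jnp.array([1.0, 0.0, 0.0])
print(hvp(w, v, x, y))  # one column of the Hessian, without ever forming it
```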
