Why Some Models Leak Data

Machine learning models use large amounts of data, some of which can be sensitive. If they're not trained correctly, sometimes that data is inadvertently revealed.

Let’s take a look at a game of soccer.

Using the position of each player as training data, we can teach a model to predict which team would get to a loose ball first at each spot on the field, indicated by the color of the pixel.

It updates in real-time—drag the players around to see the model change.

This model reveals quite a lot about the data used to train it. Even without the actual positions of the players, it is simple to see where players might be.

Click this button to move the players

Take a guess at where the yellow team’s goalie is now, then check their actual position. How close were you?

Sensitive Salary Data

In this specific soccer example, being able to make educated guesses about the data a model was trained on doesn’t matter too much. But what if our data points represent something more sensitive?

We’ve fed the same numbers into the model, but now they represent salary data instead of soccer data. Building models like this is a common technique to detect discrimination. A union might test if a company is paying men and women fairly by building a salary model that takes into account years of experience. They can then publish the results to bring pressure for change or show improvement.

In this hypothetical salary study, even though no individual salaries have been published, it is easy to infer the salary of the newest male hire. And carefully cross referencing public start dates on LinkedIn with the model could almost perfectly reveal everyone’s salary.

Because the model here is so flexible (there are hundreds of square patches with independently calculated predictions) and we have so few data points (just 22 people), it is able to “memorize” individual data points. If we’re looking to share information about patterns in salaries, a simpler and more constrained model like a linear regression might be more appropriate.

By boiling down the 22 data points to two lines we’re able to see broad trends without being able to guess anyone’s salary.

Subtle Leaks

Removing complexity isn’t a complete solution though. Depending on how the data is distributed, even a simple line can inadvertently reveal information.

In this company, almost all the men started several years ago, so the slope of the line is especially sensitive to the salary of the new hire.

Is their salary higher or lower than average? Based on the line, we can make a pretty good guess.

Notice that changing the salary of someone with a more common tenure barely moves the line. In general, more typical data points are less susceptible to being leaked. This sets up a tricky trade off: we want models to learn about edge cases while being sure they haven’t memorized individual data points.

Real World Data

Models of real world data are often quite complex—this can improve accuracy, but makes them more susceptible to unexpectedly leaking information. Medical models have inadvertently revealed patients’ genetic markers. Language models have memorized credit card numbers. Faces can even be reconstructed from image models:

Fredrikson et al were able to extract the image on the left by repeatedly querying a facial recognition API. It isn’t an exact match with the individual’s actual face (on the right), but this attack only required access to the model’s predictions, not its internal state.

Protecting Private Data

Training models with differential privacy stops the training data from leaking by limiting how much the model can learn from any one data point. Differentially private models are still at the cutting edge of research, but they’re being packaged into machine learning frameworks, making them much easier to use. When it isn’t possible to train differentially private models, there are also tools that can measure how much data is the model memorizing. Also, standard techniques such as aggregation and limiting how much data a single source can contribute are still useful and usually improve the privacy of the model.

As we saw in the Collecting Sensitive Information Explorable, adding enough random noise with differential privacy to protect outliers like the new hire can increase the amount of data required to reach a good level of accuracy. Depending on the application, the constraints of differential privacy could even improve the model—for instance, not learning too much from one data point can help prevent overfitting.

Given the increasing utility of machine learning models for many real-world tasks, it’s clear that more and more systems, devices and apps will be powered, to some extent, by machine learning in the future. While standard privacy best practices developed for non-machine learning systems still apply to those with machine learning, the introduction of machine learning introduces new challenges, including the ability of the model to memorize some specific training data points and thus be vulnerable to privacy attacks that seek to extract this data from the model. Fortunately, techniques such as differential privacy exist that can be helpful in overcoming this specific challenge. Just as with other areas of Responsible AI, it’s important to be aware of these new challenges that come along with machine learning and what steps can be taken to mitigate them.


Adam Pearce and Ellen Jiang // December 2020

Thanks to Andreas Terzis, Ben Wedin, Carey Radebaugh, David Weinberger, Emily Reif, Fernanda Viégas, Hal Abelson, Kristen Olson, Martin Wattenberg, Michael Terry, Miguel Guevara, Thomas Steinke, Yannick Assogba, Zan Armstrong and our other colleagues at Google for their help with this piece.

More Explorables