Aggregation
When you combine data from many different sources or times in order to lower the possibility of a single individual being identified.
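As a minimal sketch of the idea, the snippet below combines per-user records into per-city averages so that no single user's value appears in the output. The record layout, city names, and the `aggregate_by_city` helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-user records: (user_id, city, minutes_run).
records = [
    ("u1", "Lisbon", 30),
    ("u2", "Lisbon", 50),
    ("u3", "Oslo", 45),
    ("u4", "Oslo", 15),
    ("u5", "Oslo", 30),
]

def aggregate_by_city(rows):
    """Return the average minutes per city, dropping user identifiers."""
    groups = defaultdict(list)
    for _user_id, city, minutes in rows:
        groups[city].append(minutes)
    return {city: mean(values) for city, values in groups.items()}

city_averages = aggregate_by_city(records)
print(city_averages)
```

Averages over very small groups can still reveal individuals, so a real system would likely also enforce a minimum group size.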
Augment
When a machine, software, or function extends a person’s abilities or potential while maintaining their agency.
Automate
When a machine, software, or function performs a task without user involvement.
Binary Classification
When an ML model predicts which of two categories an example falls into, based on a set of features.
Classification
When a machine learning model identifies an object. In response to an identification question, the simplest classification is “yes” or “no”. For example, if a model were shown a picture of a cat, it could classify it as “Cat” or “Not a cat”. More complex classifications sort items into one of several groups.
Confidence Level, Model Confidence
The confidence level for a model is a statistical measure of how certain a prediction or outcome is.
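One common way to obtain confidence-like values, assumed here for illustration, is to pass a model's raw scores through a softmax so they become positive and sum to 1. The labels and scores below are hypothetical, and softmax outputs are not guaranteed to be well calibrated.

```python
import math

def softmax(scores):
    """Convert raw model scores into positive values that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores, one per label.
labels = ["cat", "dog", "llama"]
confidences = softmax([2.0, 1.0, 0.1])

best_label, best_confidence = max(
    zip(labels, confidences), key=lambda pair: pair[1]
)
print(f"{best_label}: {best_confidence:.0%}")
```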
Context Errors
Situations when the product output doesn’t make sense in the user’s current context. Often, this output is perceived as irrelevant by the user.
Counterfactuals
Rationale for why something is classified as not within the given class. Usually in the form of a statement of how the world would have to be different for a desirable outcome to occur.
Data Cascades
Compounding events that cause negative, downstream effects from data issues, and result in technical debt over time.
Data Collection and Labeling
How product teams get the data they need and apply meaningful labels to it. For example: acquiring millions of images of cats and dogs correctly labeled as “cat” or “dog”.
Data Distribution
Shows the frequency of specific values within a dataset. For example, you could find that your data includes a high number of certain values, and lower numbers of others. Often follows a “normal” distribution, or Gaussian curve, though many datasets do not.
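A quick way to inspect a distribution is to count how often each value occurs. The sketch below does this for a small, made-up list of shoe sizes and prints a rough text histogram.

```python
from collections import Counter

# Hypothetical dataset: shoe sizes recorded for ten examples.
shoe_sizes = [8, 9, 9, 10, 9, 8, 11, 9, 10, 9]

distribution = Counter(shoe_sizes)
for value, count in sorted(distribution.items()):
    # One '#' per occurrence gives a rough picture of the shape.
    print(f"{value:>2}: {'#' * count}")
```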
Data Examples
Lines in a dataset or specific pieces of data, such as a photo of a shoe or run route.
Data Features
An individual measurable property or characteristic of an observable entity. Features should be informative, discriminating, and independent.
Data Labels
Human-added descriptions for a piece of data, or example.
Explicit Data Collection
When you request information from users outright, like in feedback forms.
Explicit Feedback
Information solicited from users from within your app. For example: rating systems, review requests, forms, or surveys.
False Negatives
When the ML algorithm classifies an object as not in a certain category, when it actually is. For example, if it were searching for sneakers and failed to return several images that really did show sneakers.
False Positives
When the machine learning algorithm classifies an object as belonging to a certain category, but it is not in that category. For example, if the algorithm incorrectly identified a sneaker as a llama.
Features
Distinct data sources or machine learning calculations that influence a prediction or outcome.
Folk Theories
Invented (and usually false) ideas of how a product works based on existing mental models and assumptions.
General System Explanations
Descriptions of general system functionality, i.e. how and why it uses inputs to generate outputs.
Heuristic-Based
Based on static if-then functions, or rules based on desired situation-result pairs. If a certain situation arises, the software produces a specific result, every time.
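For contrast with a learned model, a heuristic-based function can be sketched as a handful of fixed rules. The `suggest_shoe` function and its rules are invented for illustration.

```python
def suggest_shoe(weather, distance_km):
    """Static if-then rules: the same situation always gives the same result."""
    if weather == "rain":
        return "trail shoes"
    if distance_km >= 15:
        return "cushioned shoes"
    return "everyday sneakers"

print(suggest_shoe("rain", 5))
print(suggest_shoe("sun", 20))
```

Nothing here is learned from data, so the behavior changes only when someone edits the rules.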
Implicit Data Collection
When you gather information about users passively, usually through logging behavior.
Implicit Feedback
Information about user behaviors, preferences, and needs that’s gathered from their interactions within your application or product. Often uses logging — records of what people do within your app.
Inter-labeler Reliability
A measure of consensus between different labelers performing the same task. Also known as inter-labeler agreement, or concordance.
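Two common ways to quantify this for a pair of labelers are raw percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The glossary does not prescribe a metric, so treat both as assumptions. The label lists are hypothetical.

```python
from collections import Counter

def percent_agreement(labels_a, labels_b):
    """Fraction of examples on which both labelers chose the same label."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for chance; 1 is perfect, 0 is chance level."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Note: undefined when expected == 1 (both labelers used a single label).
    return (observed - expected) / (1 - expected)

labeler_a = ["cat", "cat", "dog", "dog", "cat", "dog"]
labeler_b = ["cat", "dog", "dog", "dog", "cat", "cat"]

agreement = percent_agreement(labeler_a, labeler_b)
kappa = cohens_kappa(labeler_a, labeler_b)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```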
Labeler
A person who labels the data used to train machine learning algorithms, specifically supervised learning models. Synonym for rater.
Labeling/Labeled
A label is the description that is either given to a piece of data by a human or derived from user actions. For example, labeling a photo as “sneakers”, or run route as “hilly”.
ML Model
Mathematical algorithm that learns the statistical relationships among examples to make predictions in the future.
Machine Learning
Techniques and methods to program computers to execute tasks without super-specific rules. ML can help machines recognize patterns and adjust to unique situations.
Machine Learning (ML) Systems
Techniques and methods to develop AI, by getting computers to do something without being programmed with super-specific rules. ML can help machines recognize patterns and adjust to unique situations.
Mental Model
Users’ internal explanations of how something works. They shape how users interact with a product or feature and its perceived value.
N-Best, N-Best Classifications, N-Best Lists
Showing a certain number, “n”, of top solutions or suggestions, such as the top 5 matches for an image search.
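A minimal sketch of building an n-best list from scored candidates, using the standard library's `heapq.nlargest`. The labels and scores are hypothetical.

```python
import heapq

def n_best(scores, n):
    """Return the n highest-scoring (label, score) pairs, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])

# Hypothetical model scores for one image.
scores = {"sneaker": 0.62, "boot": 0.21, "sandal": 0.09, "llama": 0.05, "sock": 0.03}

top_three = n_best(scores, 3)
for label, score in top_three:
    print(f"{label}: {score:.2f}")
```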
Network Effect
When a person starts or stops using a product or service because the majority of their network is using it or not.
Overfitting
When a model is optimized for predictive power for a training dataset that is narrower than the ML model’s intended use.
Partial Explanations
Messages that explain one aspect of how the system works. Ideally, this is the most important aspect to the user.
Precision
The proportion of true positives correctly categorized out of all the true and false positives.
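Written as a calculation, precision is the count of true positives divided by the count of everything the model labeled positive. The sketch below uses made-up counts; returning 0.0 when there are no positive predictions is an assumption, since the ratio is undefined in that case.

```python
def precision(true_positives, false_positives):
    """True positives divided by all predicted positives."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        return 0.0  # assumption: report 0 when nothing was predicted positive
    return true_positives / predicted_positive

# Hypothetical result: 10 images returned as "sneaker", 8 of them correct.
sneaker_precision = precision(true_positives=8, false_positives=2)
print(sneaker_precision)
```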
Predictive Power
A percentage that refers to an ML model’s ability to correctly predict outcomes given a certain input. A model with a predictive power of 100 gives the correct prediction every time; a model with a predictive power of 0 is purely random.
Probabilistic
Situations where there are multiple possible outcomes, each having varying degrees of certainty of its occurrence.
Progressive Disclosures
A UX practice in which more information is revealed in subsequent screens or interactions.
Qualitative Feedback
Non-numeric feedback about how a user feels about a certain experience. Can include measures of satisfaction, happiness, verbal responses or other qualities.
Quantitative Feedback
Feedback that is numeric or converted to a number. Both implicit and explicit feedback mechanisms can be quantitative. This feedback can be fed back into your model for tuning.
Rater
Synonym for labeler.
Recall
The proportion of true positives correctly categorized out of all the true positives and false negatives.
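Written as a calculation, recall is the count of true positives divided by the count of all examples that really belong to the category. The counts below are made up, and returning 0.0 when there are no actual positives is an assumption.

```python
def recall(true_positives, false_negatives):
    """True positives divided by all actual positives."""
    actual_positive = true_positives + false_negatives
    if actual_positive == 0:
        return 0.0  # assumption: report 0 when there are no actual positives
    return true_positives / actual_positive

# Hypothetical result: 16 sneaker images exist, the model found 8 of them.
sneaker_recall = recall(true_positives=8, false_negatives=8)
print(sneaker_recall)
```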
Redaction
When some pieces of a dataset or profile are removed to lower the possibility of identifying a single user based on their data profile. You can redact certain features of data to shrink the data profile, or redact examples for a certain amount of time.
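Redacting features can be as simple as dropping selected fields from each record before it is stored or shared. The field names and the `redact` helper below are hypothetical.

```python
def redact(record, fields_to_remove):
    """Return a copy of the record without the listed fields."""
    return {key: value for key, value in record.items() if key not in fields_to_remove}

# Hypothetical user profile.
profile = {
    "user_id": "u1",
    "home_address": "12 Example Street",
    "age": 34,
    "weekly_runs": 3,
}

redacted_profile = redact(profile, {"user_id", "home_address"})
print(redacted_profile)
```

The original record is left unchanged, so the caller decides which version is kept.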
Regressions
Also known as linear regression algorithms, which try to find the best-fit line for a plot of data points on a graph. As new data points appear over time, the algorithm adjusts the line to fit.
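The best-fit line can be sketched with ordinary least squares, one standard way to fit a line and assumed here for illustration. The data points are invented. Refitting after a new point arrives shows the line adjusting.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# A new point above the old line pulls the fitted slope upward.
new_slope, new_intercept = fit_line(xs + [5], ys + [14])
print(slope, intercept, new_slope, new_intercept)
```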
Reward Function
Mathematical equation that your ML algorithm uses to optimize outputs. The function weighs some results as better than others, and optimizes for certain outcomes.
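As a hedged illustration of weighing some results as better than others, the sketch below scores an outcome by rewarding true positives and penalizing the two error types with different costs. The function, its cost parameters, and the counts are all hypothetical. Real reward functions depend on the product and the learning setup.

```python
def reward(true_positives, false_positives, false_negatives,
           false_positive_cost=1.0, false_negative_cost=5.0):
    """Score an outcome: credit for hits, weighted penalties for errors."""
    return (true_positives
            - false_positive_cost * false_positives
            - false_negative_cost * false_negatives)

# Two hypothetical candidate behaviors for the same system.
cautious = {"true_positives": 7, "false_positives": 1, "false_negatives": 2}
eager = {"true_positives": 8, "false_positives": 4, "false_negatives": 1}

# With false negatives costing 5x, the eager behavior scores higher.
print(reward(**cautious), reward(**eager))

# With equal costs, the cautious behavior scores higher.
print(reward(**cautious, false_negative_cost=1.0),
      reward(**eager, false_negative_cost=1.0))
```

Changing the costs changes which behavior the optimization favors, which is why the choice of weights is a product decision as much as a technical one.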
Second-order Effects
When the aggregate of outcomes or behaviors over time produces additional, unexpected outcomes.
Specific Output Explanations
Descriptions of how a system arrives at a specific output based on a certain input.
Supervised Learning
When you “teach” your algorithm on training data. Often this is based on examples manually labeled by humans to show “right” and “wrong” answers.
Test Data
Datasets that you use to test your ML model to make sure its predictions work on data it hasn’t encountered before.
Training Data
Datasets that you use to teach your ML model which outcomes correspond to which inputs.
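One simple way to obtain separate training and test data is to shuffle the labeled examples once and hold a fraction back. The `split_dataset` helper, the 20% fraction, and the file names below are assumptions made for illustration.

```python
import random

def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split off a held-out test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]

# Hypothetical labeled examples: (file name, label).
examples = [(f"photo_{i}.jpg", "sneaker" if i % 2 else "llama") for i in range(10)]

training_data, test_data = split_dataset(examples)
print(len(training_data), len(test_data))
```

Keeping the two sets disjoint is what lets the test data stand in for data the model has not encountered before.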
Transparency
Providing information about how a product works, including data sources, terms and conditions, privacy, permissions, and rationale behind system output.
True Negatives
When the machine learning algorithm classifies an object as NOT in a certain category and it is indeed not in that specific category. For example, it correctly classifies a llama as “not a sneaker”.
True Positives
When the machine learning algorithm classifies an object in a certain category, and the object is in that category.
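The four outcomes (true positives, false positives, true negatives, false negatives) can be counted by comparing predictions with the actual labels. The sketch below does this for one category. The label lists and the `confusion_counts` helper are hypothetical.

```python
def confusion_counts(actual, predicted, positive_label):
    """Count tp, fp, tn, fn for a single category of interest."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive_label:
            # Model said "in the category": right is tp, wrong is fp.
            counts["tp" if truth == positive_label else "fp"] += 1
        else:
            # Model said "not in the category": wrong is fn, right is tn.
            counts["fn" if truth == positive_label else "tn"] += 1
    return counts

actual = ["sneaker", "sneaker", "llama", "llama", "sneaker"]
predicted = ["sneaker", "llama", "llama", "sneaker", "sneaker"]

counts = confusion_counts(actual, predicted, positive_label="sneaker")
print(counts)
```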
Tuning
When developers adjust their machine learning algorithm based on feedback or errors to improve accuracy and performance.
Underfitting
When a model has low predictive power across a more varied dataset.