Computer Science
Grade 10
20 min
Unsupervised Learning
Tutorial Preview
1. Introduction & Learning Objectives
Learning Objectives
Define unsupervised learning and contrast it with supervised learning.
Explain the primary goal of clustering algorithms.
Identify at least three real-world applications of unsupervised learning.
Describe the key terms: feature, cluster, and centroid.
Trace the steps of the K-Means clustering algorithm on a small, 2D dataset.
Explain how the 'K' in K-Means affects the outcome of the algorithm.
Ever wonder how services like Spotify or Netflix can suggest a new song or movie you might love, without you ever telling them what to look for? 🤔 That's the magic of finding hidden patterns!
In this lesson, we'll explore Unsupervised Learning, a type of machine learning where the computer learns to find patterns in data all by itself, without being given any labeled answers.
2. Key Concepts & Vocabulary
Term: Unsupervised Learning
Definition: A type of machine learning where the algorithm is given data without explicit labels and must find patterns or structure on its own.
Example: Giving a program thousands of customer purchase histories and asking it to find natural groupings of customers, without telling it what the groups should be (e.g., 'budget shoppers', 'brand loyalists').

Term: Clustering
Definition: The task of grouping a set of objects (data points) in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups.
Example: A clustering algorithm could analyze data from a social media platform and automatically group users into clusters based on their interests, like 'sports fans', 'movie buffs', and 'gamers'.
3. Core Syntax & Patterns
The K-Means Algorithm Pattern
1. Initialize: Randomly place K centroids.
2. Assign: Assign each data point to its nearest centroid.
3. Update: Recalculate the position of each centroid to be the mean of all points assigned to it.
4. Repeat: Repeat steps 2 and 3 until the centroids no longer move significantly.
This iterative process is the core logic behind K-Means. It's used to find the best possible groupings (clusters) in the data by repeatedly refining the cluster centers.
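The four steps above can be sketched in a few lines of Python. This is a minimal illustration using NumPy, not a production implementation (real libraries like scikit-learn add smarter initialization and other refinements); the function name and parameters here are our own choices for the example.

```python
import numpy as np

def k_means(points, k, max_iters=100, init=None, seed=0):
    """Minimal K-Means sketch: Initialize, Assign, Update, Repeat."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)

    # 1. Initialize: pick K starting centroids (random data points,
    # or user-supplied positions via `init`).
    if init is None:
        centroids = points[rng.choice(len(points), size=k, replace=False)]
    else:
        centroids = np.asarray(init, dtype=float)

    for _ in range(max_iters):
        # 2. Assign: label each point with the index of its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # 3. Update: move each centroid to the mean of its assigned points
        # (a centroid with no points keeps its old position).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        # 4. Repeat: stop once the centroids no longer move significantly.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, labels
```

Notice that the stopping condition matches step 4 of the pattern: the loop ends as soon as an Update pass leaves the centroids essentially unchanged.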
Euclidean Distance Formula
For two points (x1, y1) and (x2, y2), the distance is: √((x2 - x1)² + (y2 - y1)²)
This formula is used in the 'Assign' step of K-Means to determine which centroid is 'nearest' to a given data point. It's the straight-line distance between two points.
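As a quick sketch, the formula translates directly into Python (the function name and the sample points here are just for illustration):

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between 2D points p = (x1, y1) and q = (x2, y2)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

# Example: sqrt((5 - 2)^2 + (6 - 2)^2) = sqrt(9 + 16) = 5.0
print(euclidean_distance((2, 2), (5, 6)))
```

Python's standard library also offers `math.hypot(x2 - x1, y2 - y1)`, which computes the same value.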
Sample Practice Questions
Challenging
You have data points P1(2,2), P2(2,4), P3(8,6), P4(10,6). You initialize K-Means with K=2 and centroids C1(3,3) and C2(9,5). After the first 'Assign' step, which points belong to Cluster 2 (the cluster for C2)?
A. Only P3
B. Only P4
C. P3 and P4
D. P2, P3, and P4
Challenging
Using the result from the previous question, where Cluster 2 contains P3(8,6) and P4(10,6), what are the coordinates of the new C2 after the first 'Update' step?
A. (9, 6)
B. (9, 5)
C. (8, 6)
D. (10, 6)
Challenging
A dataset consists of two long, thin, parallel groups of points (like two parallel lines). You run K-Means with K=2. The initial centroids are randomly placed, one in the middle of each line. What is the most likely final outcome?
A. Perfect clustering, with one cluster for each line of points.
B. Poor clustering, where each cluster contains half of the points from *both* lines.
C. The algorithm will fail to run because the data is not spherical.
D. The algorithm will merge the two clusters into one.