Prediction Models
Additional materials on prediction models, related to lessons on gnomes, quadrilaterals, and the animal tree
Predictive models are one of the most important and prominent areas of artificial intelligence. They can be linked to a variety of school subjects — they are applicable whenever we have distinct groups or categories of things, and the category a particular item belongs to depends on its characteristics. We've prepared some examples: for instance, quadrilaterals are classified into squares, parallelograms, kites, and so on based on the equality of their side lengths, the parallelism of their sides, and the perpendicularity of their sides and diagonals. Animals are classified into groups based on whether they have feathers, breathe through gills, produce milk, lay eggs, etc.
This page is organized as follows:
- We start with an Introduction, which contains some ideas on how to present predictive models in the classroom.
- Next, we provide some Background on Models. We recommend familiarizing yourself with at least one of the lessons before reading this section (the simplest is What are the Gnomes Doing).
- Then, we describe the technical details of three lesson ideas:
- In What are the Gnomes Doing, we observe gnomes performing different jobs. The profession of some gnomes is known, while for others, we must determine it based on their appearance. This lesson serves as a simple introduction to predictive models, specifically decision trees.
- Identifying Quadrilaterals is part of mathematics for the end of sixth or the beginning of seventh grade. In this lesson, students review geometric concepts and classify quadrilaterals. They construct a quadrilateral tree and observe how a computer constructs it from the same data.
- In The Animal Tree, students review characteristics of mammals, birds, amphibians, etc., and help the computer build an animal classification key by gradually adding new animals from different groups and describing their characteristics.
This material is being developed as part of the DALI4US project.


Chapter 1: Introduction
I vividly remember how Professor Ivan Bratko defined artificial intelligence for us computer science students. He said that it's hard to define precisely, so the best way to describe it is that something (a program, a computer, etc.) is "artificially intelligent" if it imitates human intelligence. With his characteristic smile, he added that this way, we leave the definition to the psychologists.
By definition, then, AI researchers have always been interested in mimicking skills that humans possess—well, at least those of us who are intelligent. :) One such skill is prediction.
Let's take a farmer from the Vipava Valley in Slovenia a hundred years ago, preparing for a task whose success depended on the weather in the coming days. Even without a radio weather forecast, he knew well how the weather the day after tomorrow depended on the wind direction and whether Mount Nanos had a "cap" of clouds. To orient himself with dates and the dynamics of approaching weather fronts, he relied on the fact that if there was no frost on St. Pancras' Day, it would usually rain on St. Sophia's Day (about three days later).
(The example is fictional. I have no idea whether the weather in Vipava actually depends on St. Sophia. As a resident of Ljubljana, my years of experience only allow me to predict that from September to March, it will be foggy.)
Such weather proverbs were based on years of experience. Intelligent people collected data and formulated rules over decades—not on paper but from memory. Even my father could recall the exact dates of the first snowfall in many years of his youth.
Of course, drawing conclusions from observed cases is not limited to weather. In various fields, we can collect data, recognize patterns, and create rules that help determine unknown properties in new cases.
The branch of artificial intelligence that imitates this skill — just as AI, by definition, imitates human abilities — is called machine learning. Based on data describing certain properties (attributes, features, variables), we attempt to construct a prediction model that can predict an outcome or determine a certain characteristic, such as the type of an object.
The terms prediction and prediction model can be misleading. This isn't necessarily about forecasting future events but rather about inferring or determining a property that is not yet known. Yes, meteorologists predict the weather. But when we visit a doctor, they determine—based on symptoms—which disease (hopefully none) we have. This, too, is called "prediction."
Computer-generated predictive models are everywhere. They are not only used by meteorologists: banks have systems that detect suspicious transactions, email inboxes filter out spam, mobile phone providers try to predict which customers are likely to switch to competitors, pharmaceutical companies analyze the effects of new compounds, and cars are becoming increasingly precise at recognizing road obstacles. Predictive models have long been an invisible part of our daily lives.
Modern predictive models are complex. Building deep neural networks requires massive datasets, sophisticated algorithms, powerful computers, vast storage, and a tremendous amount of energy to run. However, simple models can be built even in a classroom.
Chapter 2: Background: Models
The essence of this activity is to introduce the concept of a prediction model. A model can be built manually, based on our knowledge of a particular domain. However, machine learning algorithms construct models based on training examples.
A model is expressed in a specific formal structure. What students first discover are classification rules (decision rules). When we guide them to rewrite these rules in the form of a tree, we obtain a classification tree or decision tree.
A classification tree consists of internal nodes where, based on a certain attribute, a decision is made to follow one of the tree’s branches. The endpoints (which computer scientists call leaf nodes) classify every item into a specific group.
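Conceptually, such a tree is just a sequence of nested questions. The sketch below shows this structure in Python; the shovel attribute and the miner/cook professions are illustrative assumptions (only the buckle-and-no-shovel-means-tailor rule comes from the lesson itself):

```python
# A hand-built classification tree as nested conditionals.
# Internal nodes test an attribute; leaves assign a group.
# The attributes and professions are hypothetical examples.

def classify_gnome(has_shovel: bool, has_buckle: bool) -> str:
    if has_shovel:                 # internal node: test an attribute
        return "miner"             # leaf node: assign a group
    else:
        if has_buckle:             # another internal node
            return "tailor"        # leaf node
        else:
            return "cook"          # leaf node

print(classify_gnome(has_shovel=False, has_buckle=True))  # prints: tailor
```

Each `if` corresponds to an internal node, each `return` to a leaf: every gnome that reaches a leaf is placed into that group.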
Decision trees were among the first models built using machine learning. Although they are not as powerful as modern models, they are still widely used as examples due to their simplicity.
A model captures the essential properties of the training data. For classifying future cases (such as new gnomes whose roles we do not yet know but want to predict), we no longer need the training data—the model alone is sufficient.
Machine learning generalizes individual cases into general rules. The more examples we have, and the more accurate they are, the more reliable the model becomes.
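The same idea can be sketched in code: once a model is trained, the training examples are no longer consulted. This is a minimal sketch assuming scikit-learn is available; the tiny gnome dataset is invented for illustration:

```python
# Learning a decision tree from examples (scikit-learn;
# the toy dataset below is invented for illustration).
from sklearn.tree import DecisionTreeClassifier

# Each row encodes one gnome: [has_shovel, has_buckle] as 0/1.
X = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0]]
y = ["miner", "miner", "tailor", "tailor", "cook"]

model = DecisionTreeClassifier().fit(X, y)

# The model alone classifies a new, previously unseen gnome:
# one with a buckle and no shovel.
print(model.predict([[0, 1]])[0])  # prints: tailor
```

The model has generalized the five examples into splitting rules; the prediction for the new gnome comes from those rules, not from looking up the training rows.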
Even in fourth grade, after completing such a lesson, we were already discussing how a similar model could be created to determine whether a particular student likes a specific school meal. We would need to collect enough different meal examples, describe their characteristics, and record the student's opinion. We then explained to the students that similar, though more complex, models are used in real life—to predict whether someone has a certain disease (and which one), to forecast tomorrow’s weather, and to determine whether a particular medication will be effective.
Chapter 3: Gnomes
The lesson What are the Gnomes Doing is best concluded by showing the model built by the computer. This provides a natural transition into a discussion about how such (and even more complex) models are used in practice for real-world applications.
To support this, the activity includes a simple ready-made workflow for Orange. It is pre-configured so that in class we only need to open the windows displaying the data, the decision tree, and the predictions. Below, we provide additional explanations to help us better understand the workflow and modify or adjust it if necessary.

The Datasets widget is set up to fetch the Gnomes dataset from the web. If we had to find it manually, we would type the beginning of the dataset name (e.g., gnom...) in the top-left field. Once the dataset appears in the lower table, we double-click to select it.
Connected to the Datasets widget is the Table widget, which allows us to display the data to students. This is important because it lets them see that, besides the relevant features, the dataset also includes irrelevant attributes, such as belt color, shoe color, and hat shape. The computer will ignore these features as unimportant, just as the students did.

The Datasets widget is also connected to the Tree widget, which constructs a decision tree, and to the Tree Viewer widget, which visualizes the tree. The settings of the Tree widget are not crucial here and should not affect the tree’s structure.

The Predictions widget requires two inputs: the data for which it should generate predictions (from the Datasets widget) and the predictive model (from the Tree widget). This widget displays various outputs that we do not need; to reduce clutter, the workflow disables the display of probabilities and classification errors (top section) as well as evaluation results (between the tables, or at the bottom when the lower table is hidden).
Expanding the Workflow: Decision Rules

Students (or teachers! 😊) might be curious whether the computer can also find rules similar to the ones they created at the beginning, such as "if it has a buckle and no shovel, it’s a tailor." These are called decision rules.
We can attach the CN2 Rule Induction widget to the dataset and connect it to the CN2 Rule Viewer widget.
To obtain rules similar to those created by students, we set the quality measure to Laplace accuracy.
Additionally, we can experiment with Ordered or Unordered rules. With ordered rules, we read them one by one: a rule applies only if none of the previous rules applied to the gnome. With unordered rules, the order does not matter. The most suitable setting depends on how the students expressed their rules.
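How an ordered rule list is read can be sketched in a few lines of Python: rules are tried top to bottom, and the first rule whose condition matches decides. The rules below are illustrative (only the buckle-and-no-shovel rule comes from the lesson):

```python
# Applying an *ordered* rule list: the first matching rule decides.
# The rules themselves are hypothetical examples.

rules = [
    (lambda g: g["buckle"] and not g["shovel"], "tailor"),
    (lambda g: g["shovel"], "miner"),
    (lambda g: True, "cook"),  # default rule: applies to everyone else
]

def classify(gnome):
    for condition, profession in rules:
        if condition(gnome):
            return profession

print(classify({"buckle": True, "shovel": False}))  # prints: tailor
```

With unordered rules, each rule would instead have to stand on its own, and conflicts between matching rules would need to be resolved (for instance, by rule quality).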

We can view the rules in the CN2 Rule Viewer widget. To improve readability, we can enable the Compact View.
Chapter 4: Quadrilaterals
Quadrilaterals can be conducted entirely on paper, on a computer, or as a combination of both. Here, we describe some details related to the computer-based implementation.
Using the Website for Data Entry

The data entry form is prepared on the page https://data.pumice.si/quadrilaterals. Clicking "Create activity page" gives us two links: the first is entered on the tablets or computers that students will use for data entry, and the second is used in Orange to access the collected data.
We can conduct the lesson by distributing a single set of quadrilaterals to the whole class. In this case, we only need one link.
If we divide students into groups, each with their own set, we can decide whether all groups will enter data into a shared table (which is perfectly fine—in fact, it makes spotting errors easier) or whether each group will have a separate table. To create separate tables, simply refresh the browser page (F5, Ctrl-R, or Cmd-R) and click "Create activity page" again.

Following the link, students will see a form similar to the physical cards. When they enter the shape's number, they will see an image of the shape, circles for marking properties, and a shape selector.
They "color" the circles by clicking on them. Repeated clicks cycle through the colors: a circle turns red on the first click, blue on the second, and becomes empty again on the third. Every circle behaves the same way. (No worries, students will figure this out quickly. 😊)
Each shape must also be assigned a category before submitting the response and moving on to the next one.
Workflow
The workflow in Orange is quite extensive, but almost everything is preconfigured, so there’s not much that can go wrong. Just in case, let's go through what each component does and what to watch out for.
The two key components are Tree Viewer and Predictions. The first one allows us to examine the assembled model (the tree), and the second one is used to review its predictions. Both of these components obviously need a decision tree, which they receive from the Tree component.
Now, let's look at the beginning: there are two File components. The upper one loads the quadrilateral examples used for building the tree — hence, it is connected to the Tree component. The lower one loads the test cases and is connected to Predictions.
Additionally, there are three components for observation. The first Table component is connected to the input data. This allows us to show students that the computer learns (i.e., constructs the tree) from the same data they used. The second Table component and the Image Viewer are connected to Tree Viewer. By clicking on different parts of the tree, these two components display the corresponding shapes.
This serves two purposes. First, it helps demonstrate how the computer categorizes shapes in the same way students sorted them on their desks. Later, when working with the data collected by students, we use this Table and Image Viewer to identify errors in the dataset.
File

Both File components are preset to read data from the website. Each must have a selected URL: the first should point to https://pumice.si/en/activities/quadrilaterals/resources/quadrilaterals.xlsx, and the second to https://pumice.si/en/activities/quadrilaterals/resources/quadrilaterals-test.xlsx. If these links stop working due to website updates (and we forget to update the workflow and this document—apologies in advance! 😊), you can find the correct links in the box at the top of the activity description.
To train the model using student data, replace the URL with the one obtained from the data entry page. The component remembers a few recent links: if you want to switch back to the original data or toggle between datasets from different groups, click the arrow next to the URL input field.
Tree

The Tree component has multiple settings, but the most important one for us is Build binary tree. When enabled, the algorithm constructs a tree where each node has only two branches. For properties with multiple possible values, some values will be grouped together. For example, under the criterion Equal-length sides, one branch might contain All, while the other groups the remaining options: Two pairs and None.
Binary trees have both theoretical and practical advantages, but we don't focus on this during the lesson. Since students create trees with up to three branches per node, it's best to disable Build binary tree to match their approach.
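The difference between the two kinds of splits can be sketched as follows. The shapes and their property values below are illustrative, not the actual lesson dataset; only the value names (All, Two pairs, None) come from the text above:

```python
# Multiway vs. binary split on the attribute "Equal-length sides".
# The example shapes are invented for illustration.
shapes = [
    {"equal_sides": "All", "shape": "square"},
    {"equal_sides": "All", "shape": "rhombus"},
    {"equal_sides": "Two pairs", "shape": "kite"},
    {"equal_sides": "None", "shape": "trapezoid"},
]

# Multiway split: one branch per attribute value
# (the kind of tree students build themselves).
multiway = {}
for s in shapes:
    multiway.setdefault(s["equal_sides"], []).append(s["shape"])
print(multiway)
# prints: {'All': ['square', 'rhombus'], 'Two pairs': ['kite'], 'None': ['trapezoid']}

# Binary split: values grouped into two branches, e.g. "All"
# versus the rest (what "Build binary tree" produces).
binary = {"All": [], "Two pairs or None": []}
for s in shapes:
    key = "All" if s["equal_sides"] == "All" else "Two pairs or None"
    binary[key].append(s["shape"])
print(binary)
# prints: {'All': ['square', 'rhombus'], 'Two pairs or None': ['kite', 'trapezoid']}
```

With the binary option disabled, the node branches once per value, which matches the three-branch trees students draw.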
The other options influence how the algorithm handles errors in the data. For example, if a dataset contains 50 rhombuses and only 5 squares, it's likely that the squares were misclassified, so ignoring them might make sense. However, in this lesson, we don't have such large datasets, and mistakes will be noticeable regardless of these settings.
Tree Viewer
The most important thing to know about Tree Viewer is that clicking on a node will send the corresponding shapes to the connected components — Image Viewer and Table in our case.
Each internal tree node has a small circle where branches originate. Clicking this circle hides (or reveals with another click) the subtree below it. This can be useful when explaining the tree-building process step by step.
On the left side, we can resize the tree using the Zoom and Width controls.
Image Viewer
The Image Viewer component has two key settings.
The first setting determines which column in the dataset contains the image URLs. In the pre-configured workflow, this is already set. If the setting is lost, the component tries to guess the column by scanning its contents; if it guesses incorrectly, we can manually set it to "Image URL".
The second setting specifies the column name for the text displayed under each image. When working with student-entered data, the "Shape" column contains their classifications, while "True Shape" holds the correct answers. When using pre-prepared data, both columns contain the same values.

No one will scold us—or even give us a disapproving look—if we choose a property instead of the shape name. In fact, this is one of the easiest ways to spot mislabeled properties. If a shape in the image has two pairs of parallel sides, the text beneath it definitely shouldn’t say that it has none!
Chapter 5: Animals
Even though the lesson Animal Tree relies entirely on Orange, even a less experienced teacher has nothing to worry about—everything is already set up. The only thing to remember is that after making any changes to the Excel spreadsheet, we must save it in Excel and reload it in Orange.
The Excel spreadsheet contains two hidden rows. It's best to leave them as they are. But just so nothing remains hidden from you, here’s what they contain:
- Row 2 defines the data type for each column. It specifies that the first column contains only animal names, so this column should not appear as a criterion in the decision tree.
- Row 3 determines the role of each column. It indicates that the second column contains the target variable—the characteristic we want to predict. Without this, Orange might assume we want to predict, for example, whether an animal can fly based on its classification. (In reality, it wouldn’t; instead, it would show an error saying it doesn’t understand what we’re asking for.)
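For illustration, the layout Orange reads from such a spreadsheet might look like the sketch below. The column names and animals here are invented; the actual lesson file may use different ones:

```
Animal    Group     Feathers   Gills     Milk
string    discrete  discrete   discrete  discrete
meta      class
Eagle     bird      yes        no        no
Trout     fish      no         yes       no
```

The second row gives each column's data type, and the third its role: marking the name column as meta keeps it out of the tree's splitting criteria, while class marks the target variable the tree should predict.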