Concept Learning in Description Logics
Tutorial at KR 2023
Presenters
Maurice Funk, Leipzig University
Jean Christoph Jung, TU Dortmund University
Axel-Cyrille Ngonga Ngomo, Paderborn University
Frank Wolter, University of Liverpool
Introduction
Description Logics (DLs) are a family of well-established languages for formulating ontologies and knowledge bases. Most notably, they form the logical underpinning of the OWL standard for web ontology languages. The manual curation of DL ontologies or, more generally, DL knowledge bases is a time-consuming and expensive task and automated methods for supporting the user are urgently needed. In this tutorial, we focus on automated learning of DL concepts. Concepts are the main building blocks of DL ontologies and are also used for querying DL knowledge bases. The underlying idea is that of learning from examples, that is, given positive and negative examples (mostly in the form of labeled databases), the goal is to find a DL concept that fits these examples, typically in the presence of an ontology. A fitting concept can be thought of as a classifier that separates the positive from the negative examples. Hence, there is a close link to inductive logic programming and machine learning in general. The very idea of learning from examples can be traced back at least to the 1970s. The need for automated support has been recognized by the Knowledge Representation and Reasoning and Semantic Web communities, and consequently, quite some progress has been made in the recent years both on the theoretical and the practical side.
The goal of this tutorial is to give an introduction to the field of concept learning and highlight the most important lines of research in that area. We will cover both foundational results and practical aspects. On the theoretical side, we will discuss the separability problem, that is, given examples decide whether there is a fitting concept, and learning of concepts in Angluin’s framework of exact learning. We will also explain the relationship of the latter to learning in the probably approximately correct (PAC) sense. On the practical side, we will present an overview over existing implementations and the main underlying methods. We mostly focus on DLs from the popular ALC and EL families. It is worth mentioning that there is also the related field of ontology learning, which, however, will not be part of the tutorial.
Intended Audience
We aim to present the material in such a way that it is accessible to all KR attendees with a solid background in computer science and a good understanding of formal logic. Naturally, those familiar with description logics and description logic knowledge bases will benefit the most, but the presentation will be self-contained and attendees not familiar with description logics will be able to follow.
We pursue the following learning goals:
- the attendee will understand the general setup of concept learning;
- the attendee will know the main dimensions of concept learning and will be able to access the literature in the field in an informed way;
- the attendee will have a basic understanding of the separability problem, in particular its computational complexity and its relationship to Craig interpolation and ontology-mediated query answering;
- the attendee will know the state-of-the-art in systems for concept learning and appreciate the main challenges for improving these;
- the attendee will be able to learn concepts in a game-like fashion in Angluin's framework of exact learning.
Agenda
1. | Introduction (15 minutes) Slides |
2. | The Separability Problem (50 minutes) Slides |
3. | Neurosymbolic Concept Learning (50 minutes) Slides |
4. | Exact and PAC Learning of Concepts (50 minutes) Slides |
5. | Summary and Conclusion (15 minutes) Slides |
Content
The Separability Problem
We define separability as the fundamental decision problem underlying the concept learning task: given a knowledge base consisting of an ontology O and a database D and sets P and N of positive and negative examples from D, does there exist a concept C that fits P,N in the sense that C applies to all constants in P and to no constant in N? We discuss the following dimensions of this problem:
- weak separability vs strong separability: while demanding that O and D entail C(d) for all d in P is uncontroversial, due to the open world semantics adopted in the presence of ontologies, we might demand that O and D either do not entail C(d) for all d in N (weak separability) or the stronger condition that O and D entail the negation of C(d) for all d in N (strong separability).
- projective vs non-projective separability: while in existing systems one typically only admits symbols from O and D in C (non-projective case), it turns out that sometimes fresh symbols in C (projective case) lead to more succinct and natural fitting concepts.
- separability with or without signature restrictions: one might admit any symbol that occurs in O and D in C (case without signature restrictions) or only symbols related to a topic the user is interested in (case with signature restrictions).
All considered dimensions have their applications and, interestingly, lead to quite different characteristics of the separability problem. In the tutorial, we will particularly discuss the influence of the choices in terms of computational complexity (of deciding separability) and the link to ontology-mediated query answering and Craig interpolation.
Neurosymbolic Concept Learning
Common implementations of concept learning are based on the iterative refinement of concepts within a partially ordered search space. We will first present the foundations of this common paradigm and discuss some of its advantages. We will also consider the practical limitations of this approach to implementing concept learners for DLs from the ALC family and discuss recent approaches that aim to address these limitations. In particular, we will explore means to improve:
- the retrieval function: The retrieval of all instances of a given concept C is currently the main bottleneck in systems for concept learning. By considering the learning problem under the closed-world assumption, we will show that instance retrieval can be mapped to tensor operations, which can be performed efficiently.
- the quality function: Concept learning can be mapped to reinforcement learning and shown to be captured well by deep Q-learning under the assumption of the existence of set embeddings. We will discuss how the idea of traversing a refinement tree using classical implementations of concept learning is equivalent to myopic learning and discuss means to improve search by means of a better approximation of the cumulative discounted reward attached to each concept.
Exact and PAC Learning of Concepts
Exact learning is a game-like protocol between a learner and a teacher for learning formal objects and has been introduced in a seminal paper by Angluin. Applied to concept learning, the teacher has a (secret) target concept in mind which the learner wants to identify. For doing so, the learner can ask certain queries which the teacher has to answer faithfully, but not necessarily in the most helpful way. The main question is whether the learner has an efficient strategy for finding the target concept. The motivation of considering this setup is in applications in which logic and and domain expertise are in different hands, which is commonly the case when the ontology designer (logician, learner) interacts with a domain expert (non-logician, teacher). We will consider mainly the Horn DLs EL and ELI and allow membership and/or equivalence queries and highlight when efficient learning is (not) possible.
It is well-known that an efficient learning strategy in the framework of exact learning can be transformed into an efficient algorithm that learns from examples which are sampled from an unknown distribution in a probably approximately correct (PAC) sense, that is, with high probability the learning algorithm makes small classification
error over the unseen examples. These PAC algorithms, however, often rely on an oracle that answers membership queries. We also report on boundaries on the existence of PAC algorithms that work without such oracles, but still in a preferably efficient way.