Slides are at the bottom of the page! until I correct it—there’s a typo on slide 46, the conjunction and disjunction symbols are backwards (true: ^ = and ^ v = or). Latex should have caught this error :)
This section is largely based on Sowa’s Conceptual Structures; Sussman and Abelsons’s SICP, Suppe’s Introduction to Logic, Partee et al’s Mathematical Methods in Linguistics, Cruse’s Lexical Semantics and Thinking about Android Epistemology.
We’re concerned with the problem with getting knowledge into computers 1
We can load the knowledge into the computer, or we can build a learning system that builds its own representation from examples.
Assuming we’ve got the knowledge, how do we use it to solve problems? (And what problems are worth solving?)
Knowledge is more than just a static set of fats; it involves the ability to use those “facts” to interact with the world.
Teddy Ruxpin needs an interface with the environment.
Information retrieval problem. Which knowledge is relevant
goal: provide a method for “correct reasoning”.
Bertrand Russell and Rudolf Carnap showed how Frege’s logic w/ set theory ,could describe the constructions of objects from the data of sensation. Russell and Whitehead tried to reduce mathematics to logic; Godel’s showed these formal systems are incomplete.
Math is useful because it distills the idea to its essence (simplification) and is a precise and universal way of describing structures. For example, which would you prefer?
Sometimes the elements are never defined but left as abstractions that can be represented in different ways.
Curly brace notation. Order doesn’t matter.
One way to list sets is just to describe their members.
{5, 33, 2, 445}
{California, Colorado, Connecticut}
When its hard to enumerate, you can define some rule or property.
{x | tall(x) and cold(x) }
The complete enumeration of the elements in a set is its extension. This can be tedious when sets get big, so we can resort to definitions.
{x | x<3 and x>1 }
But this can cause problems, Russell’s paradox:
U=\{x|x ∉ x\}
U ∈ U?
Some Set theoretic equalities:
Associative Laws (X ∪ Y) ∪ X=X ∪ (Y ∪ Z) …
These make up Boolean algebra.
All, every, each, not
Logic has a long history, beginning with Aristotle ( 4th BC). First began to be combined with math by Leibniz (17).
There are many good books on the subject 2.
Sentences have truth values, and map to {True, False}. For example:
P = “it will rain today”
Q = “I will wear my umbrella hat”
IF it is raining | P -> Q
THEN I will wear an umbrella hat |
Could be complete irrelevant:
Here is inference.
P Q | P -> Q | !P v Q
-------------------------
T T | T | T
T F | F | F
T T | T | T
F F | T | T
Applying truth-preserving transformations
Either a) check consistency—try to derive a contradiction (?) or b) deriving conclusion from premises.
Leibniz (1679)’s Universal Characteristic, concepts are combinations of primitive concepts (each a prime number) that can be combined to create products of primes (more abstract concepts).
2 = ALIVE
3 = MOSTLY-WATER
5 = IS-STRUGGLING-ARTIST
7 = WORKS-IN-E15–320G
11 = STUDIES-AI
DUSTIN = 2 x 3 x 7 x 11 = 462
CONCEPT % PROPERTY == 0—> true
CONCEPT % PROPERTY != 0—> false
From Sun’s Java OOP Tutorial:
“Software objects are conceptually similar to real-world objects: they too consist of state and related behavior. An object stores its state in fields (variables in some programming languages) and exposes its behavior through methods (functions in some programming languages). Methods operate on an object’s internal state and serve as the primary mechanism for object-to-object communication. Hiding internal state and requiring all interaction to be performed through an object’s methods is known as data encapsulation — a fundamental principle of object-oriented programming.”
Natural languages, as opposed to programming languages, are used for communication between people. Linguists have divided these into different levels: 1. Prosody 2. Phonology 3. Morphology 4. Syntax 5. Semantics 6. Pragmatics
not a puzzle with separate pieces, but these are all interacting parts and the distinctions may do more harm than good, particularly on the top three.
Category-based reasoning involves classification (inferring an instance is a member of some kind) and then inheriting properties of that kind. Kind-referring words are known as generics in developmental linguistics. Typically refer to properties that are “essential, enduring, timeless” not accidental or tied to context (see Lyons 1977 via Susan Gelman: Learning words for kinds…). Often generics are over-generalizations and do not need universal quantifiers. (gelman 1998: “birds lay eggs” despite less than half of kinds of birds lay eggs)
( Endel Tulving (1972) classified memories into: episodic and semantic. Episodic is about individual activities and events. Semantic is dictionary-like knowledge like “All dogs have paws.”)
Induction problems. (Prasada 2000: problem of generic knowledge) when is a feature accidental or essential? with limited number of examples.
to learn words: children exploit inductive biases, formal morphosyntactic cues, contextual cutes, and theory-based knowledge to solve the problem of learning generic language/knowledge.
the knowledge is underspecified but adheres to their existing knowledge in appropriate ways. “birds lay eggs” -> birds are a kind of animal such that the mature female lays eggs" (Shipley 1993, 278)
Categorization and nouns People group objects into the world into categories on the basis of their “similarities”.
Semantic trait: mode of participation
Expected: properties. “expressive paradox” (common sense level)
Combinations of words are not just constrained by the meanings but grammatical properties.
Problems: (From Tom Mitchell’s lecture slides ) - "Can’t tell whether it has learned concept - fails with inconsistent training data - picks maximally specific h - there many be several concepts in H
Ordered tuples.
Functions
Kinds of relations
(skippped)
Generative grammars Hierarchy
Binding
Functions are a relation between two sets A and B, where: 1. A is the domain, and 2. Each element in A (domain) is paired with only one element in B (range).
Note: Slides are available online.