Information and uncertainty
Manipulating symbols
Last class
Typology of signs
Sign systems
Symbols
Tremendously important distinctions for informatics and computational sciences
Computation = symbol manipulation
Symbols can be manipulated without reference to content (syntactically), due to the arbitrary nature of convention
Allows computers to operate! All signs rely on a certain amount of convention, as all signs have a pragmatic (social) dimension, but symbols are the only signs which require exclusively a social convention, or code, to be understood.
Symbol manipulation
Some symbol strings have meaning (in some language)
The relation between symbols and meaning is arbitrary
Example: the cut-up method for generating poetry, pioneered by Brion Gysin and William Burroughs and often used by artists such as David Bowie, or the use of samples in electronic music
aedl: adel adle aedl aeld alde aled dael dale deal dela dlae dlea eadl eald edal edla elad elda lade laed ldae ldea lead leda
4! permutations: 4 × 3 × 2 × 1 = 24
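As a minimal sketch (using only Python's standard itertools; the word and the counting are the slide's example), one can enumerate these permutations directly:

```python
from itertools import permutations

# All orderings of the four letters a, d, e, l
perms = {"".join(p) for p in permutations("adel")}

print(len(perms))     # 24 = 4 * 3 * 2 * 1
print(sorted(perms))  # adel, adle, aedl, ..., lead, leda
# Only a few of these strings (e.g. "deal", "lead") happen to be English
# words: meaning is not a property of the symbol arrangement itself.
```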
Information theory
“The mathematical theory of communication”, Claude Shannon (1948)
Efficiency of information transmission in electronic channels
Key concept: information quantity that can be measured unequivocally (objectively)
Does not deal at all with the subjective aspects of information (semantics and pragmatics)
Information is defined as a quantity that depends on symbol manipulation alone
What’s an information quantity?
How to quantify a relation?
Information is a relation between an agent, a sign and a thing, rather than simply a thing.
The most palpable element in the information relation is the sign: symbols
But which symbols do we use to quantify the information contained in messages?
Several symbol systems can be used to convey the same message
We must agree on the same symbol system for all messages!
What’s an information quantity?
Both sender and receiver must use the same code, or convention, to encode and decode symbols from and to messages.
We need to fix the language used for communication
Set of symbols allowed (an alphabet)
The rules to manipulate symbols (syntax)
The meaning of the symbols (semantics).
A language specifies the universe of all possible messages = the set of all possible symbol strings of a given size
Shannon Information is thus defined as “a measure of the freedom from choice with which a message is selected from the set of all possible messages”
DEAL DELA DLAE DLEA DAEL DALE
EALD EADL ELAD ELDA EDLA EDAL
ALDE ALED ADLE ADEL AELD AEDL
LDEA LDAE LEAD LEDA LADE LAED
DEAL is 1 out of 4! = 4 × 3 × 2 × 1 = 24 choices.
What’s an information quantity?
Information is defined as “a measure of the freedom from choice with which a message is selected from the set of all possible messages”
Bit (short for binary digit) is the most elementary choice one can make between two items: “0” and “1”, “heads” or “tails”, “true” or “false”, etc.
A bit is equivalent to the choice between two equally likely alternatives.
Example: if we know that a coin is to be tossed, but are unable to see it as it falls, a message telling us whether the coin came up heads or tails gives us one bit of information
Decision-making
Decision-making:
Perhaps the most fundamental capability of human beings
Decision always implies uncertainty
Choice
Lack of information, randomness, noise, error
“The highest manifestation of life consists in this: that a being governs its own actions. A thing which is always subject to the direction of another is somewhat of a dead thing.”
“A man has free choice to the extent that he is rational.”
(St. Thomas Aquinas)
“In a predestinate world, decision would be illusory; in a world of perfect foreknowledge, empty; in a world without natural order, powerless. Our intuitive attitude to life implies non-illusory, non-empty, non-powerless decision… Since decision in this sense excludes both perfect foresight and anarchy in nature, it must be defined as choice in face of bounded uncertainty” (George Shackle)
Uncertainty-based information: original contributions
Information is transmitted through noisy communication channels: Ralph Hartley and Claude Shannon (at Bell Labs), the fathers of Information Theory, worked on the problem of efficiently transmitting information; i.e. decreasing the uncertainty in the transmission of information.
Hartley, R.V.L., “Transmission of Information”, Bell System Technical Journal, July 1928, p. 535.
C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, July and October, 1948.
Choices: multiplication principle
“If some choice can be made in M different ways, and some subsequent choice can be made in N different ways, then there are M × N different ways these choices can be made in succession” [Paulos]
3 shirts and 4 pants = 3 × 4 = 12 outfit choices
Hartley uncertainty
Nonspecificity: Hartley measure
The amount of uncertainty associated with a set of alternatives (e.g. messages) is measured by the amount of information needed to remove the uncertainty
A type of ambiguity
Quantifies how many yes-no questions need to be asked to establish what the correct alternative is
Number of Choices
Measured in bits
A = set of alternatives
Hartley measure: H(A) = log2(number of alternatives)
Hartley uncertainty
Number of Choices
Measured in bits
Quantifies how many yes-no questions need to be asked to establish what the correct alternative is
Menu choices:
A = 16 entrees
B = 4 desserts
How many dinner combinations? 16 × 4 = 64
H(A×B) = log2(16×4) = log2(16) + log2(4) = 4 + 2 = 6 bits
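A small sketch of the Hartley measure and its additivity over independent choices (standard library only; the entree and dessert counts are the slide's example):

```python
from math import log2

def hartley(n_alternatives: int) -> float:
    """Bits needed to single out one of n equally possible alternatives."""
    return log2(n_alternatives)

entrees, desserts = 16, 4
print(hartley(entrees * desserts))           # 6.0 bits for the full dinner choice
print(hartley(entrees) + hartley(desserts))  # 4.0 + 2.0 = 6.0: logs turn products into sums
```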
Hartley uncertainty: decision trees
Number of Choices
Measured in bits
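As an illustration of the decision-tree idea (my own sketch, not the slide's figure): halving the candidate set with each yes-no question singles out one of n alternatives in log2(n) questions.

```python
from math import log2

def questions_needed(n: int) -> int:
    """Yes-no questions needed when each question halves the candidate set."""
    questions = 0
    while n > 1:
        n = (n + 1) // 2  # keep the half containing the answer
        questions += 1
    return questions

for n in (2, 4, 16, 64):
    print(n, questions_needed(n), log2(n))  # matches log2(n) for powers of two
```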
What about probability?
Some alternatives may be more probable than others!
A different type of ambiguity
Higher frequency alternatives: less information required
Measured by Shannon’s entropy measure
The amount of uncertainty associated with a set of alternatives (e.g. messages) is measured by the average amount of information needed to remove the uncertainty
Probability distribution of letters in English text (Orwell’s 1984, in fact)
Shannon’s entropy
A = set of weighted alternatives
Probability of each alternative: pi
Measured in bits
Shannon’s measure: H(A) = -(p1 log2(p1) + p2 log2(p2) + … + pn log2(pn))
The average amount of uncertainty associated with a set of weighted alternatives (e.g. messages) is measured by the average amount of information needed to remove the uncertainty
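A minimal sketch of Shannon's measure (standard library only):

```python
from math import log2

def shannon_entropy(probs):
    """H = -sum(p * log2(p)), in bits, ignoring zero-probability alternatives."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))  # 1.0 bit: a fair coin
print(shannon_entropy([1.0]))       # 0.0 bits: a single certain alternative
print(shannon_entropy([0.25] * 4))  # 2.0 bits: four equally likely alternatives
```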
Entropy of a message
Message encoded in an alphabet of n symbols, for example:
English = 26 characters + space
Morse code = dots, dashes, and spaces
DNA: A, T, G, C
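A sketch of estimating a message's entropy from its own symbol frequencies (the sample strings are invented; a serious estimate would use a large corpus such as the Orwell text mentioned above):

```python
from collections import Counter
from math import log2

def message_entropy(message: str) -> float:
    """Estimate entropy in bits per symbol from empirical symbol frequencies."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(message_entropy("AATTGGCCAATG"))        # a DNA-like string over {A, T, G, C}
print(message_entropy("to be or not to be"))  # English letters plus space
```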
What it measures
missing information, how much information is needed to establish what the symbol is, or
uncertainty about what the symbol is, or
on average, how many yes-no questions need to be asked to establish what the symbol is.
(Entropy is zero when there is only one alternative and maximal for the uniform distribution.)
Example: Morse code
1) All dots: p1 = 1, p2 = p3 = 0.
Take any symbol: it’s a dot; no uncertainty, no question needed, no missing information. HS = -1·log2(1) = 0.
2) 50-50 dots and dashes: p1 = p2 = 1/2, p3 = 0.
Given the probabilities, need to ask one question (one piece of missing information).
HS = -(1/2·log2(1/2) + 1/2·log2(1/2)) = -log2(1/2) = log2(2) = 1 bit
3) Uniform: all symbols equally likely, p1 = p2 = p3 = 1/3.
Given the probabilities, need to ask as many as 2 questions (2 pieces of missing information). HS = -log2(1/3) = log2(3) ≈ 1.59 bits
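The three cases can be checked with the same entropy function sketched above:

```python
from math import log2

def shannon_entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

print(shannon_entropy([1, 0, 0]))        # 0.0 bits: all dots
print(shannon_entropy([0.5, 0.5, 0]))    # 1.0 bit: 50-50 dots and dashes
print(shannon_entropy([1/3, 1/3, 1/3]))  # ~1.585 bits: three equally likely symbols
```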
Bits, entropy and Huffman codes
Given a symbol set {A,B,C,D,E}
And occurrence probabilities PA, PB, PC, PD, PE
The Shannon entropy then corresponds to:
The average minimum number of bits needed to represent a symbol
Huffman coding: variable-length coding, for messages whose symbols have variable frequencies, that minimizes the number of bits per symbol
Example (with probabilities 0.375, 0.250, 0.167, 0.125, 0.083):
H = -(0.375·log2(0.375) + 0.250·log2(0.250) + 0.167·log2(0.167) + 0.125·log2(0.125) + 0.083·log2(0.083)) = 2.135 bits
Huffman code: bits per symbol = 0.375×1 + 0.250×2 + 0.167×3 + 0.125×4 + 0.083×4 = 2.208
Critique of Shannon’s communication theory
The entropy formula as a measure of information is arbitrary
Shannon’s theory measures quantities of information, but it does not consider information content
In Shannon’s theory, the semantic aspects of information are irrelevant to the engineering problem
Other forms of uncertainty
Vagueness or fuzziness
Simultaneously being “True” and “False”
Fuzzy Logic and Fuzzy Set Theory
From crisp to fuzzy sets
Fuzziness: Being and Not Being
Laws of Contradiction and Excluded Middle are Broken
(Figures: the subset “Tall People” within the set of all people, shown first as a crisp set, with membership jumping between 0 and 1, and then as a fuzzy set, with membership graded between 0 and 1.)
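A sketch of a graded membership function for “Tall People” (the height thresholds are invented for illustration):

```python
def tall_membership(height_cm: float) -> float:
    """Fuzzy membership in 'Tall People': 0 below 160 cm, 1 above 190 cm, graded in between."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30  # linear ramp between the two thresholds

for h in (150, 170, 185, 195):
    print(h, round(tall_membership(h), 2))  # 0.0, 0.33, 0.83, 1.0
```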
Papers:
1) boyd, danah and Crawford, Kate (2011). “Six Provocations for Big Data.” A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society, September 2011.
2) Ryuji Suzuki, John R. Buck and Peter L. Tyack (2006). “Information entropy of humpback whale songs.” J. Acoust. Soc. Am., 119(3), March.
3) David A. Huffman (1952). “A Method for the Construction of Minimum-Redundancy Codes.” Proceedings of the I.R.E., September.
This week’s discussion