Slide1
Today’s Discussion
Linguistic feature mining of 2 contrasting corpora:
Text from Financial Statements
  Text
  Carefully written and edited over weeks to months
  Formal: conforms to genre for financial communiqués
Transcripts of 911 Homicide Calls
  Verbal communication transcribed to text
  Unrehearsed
  Informal: includes slang
Slide2
Financial Statement Fraud: Problem and Motivation
Investors look for credibility, transparency, and clarity in financial documents to make investment decisions and to maintain confidence in companies
Management’s Discussion and Analysis (MD&A) is among the most frequently read sections of 10-Ks
Auditors need innovative ways to assess risk based not only on financial and nonfinancial measures but also on financial statement text
Slide3
Deception Is Strategic (Buller and Burgoon, 1996)
FOOTNOTE 16. RELATED PARTY TRANSACTIONS
“In 2000 and 1999, Enron entered into transactions with limited partnerships (the Related Party) whose general partner’s managing member is a senior officer of Enron. The limited partners of the Related Party are unrelated to Enron. Management believes that the terms of the transactions with the Related Party were reasonable compared to those which could have been negotiated with unrelated third parties… Subsequently, Enron sold a portion of its interest in the partnership through securitizations.”
(Enron 2000)
Slide4
Leakage Theory Applied to Fraudulent Financial Reporting (Ekman 1969)
Managers engaging in fraud cannot completely match the behavior they exhibit when truthful
Cues leak out unintentionally
Language usage should leave clues to deception
Slide5
Mining Linguistic Features for Detecting Obfuscation in Financial Reports
Do MD&A sections of fraudulent 10-Ks have a higher level of obfuscation?
Based on the research in deception detection and obfuscation, we can look for the following (among other cues) in fraudulent MD&As:
More complex words
More complex sentences
More causation words
More achievement words
Slide6
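The four cues listed for fraudulent MD&As (complex words, complex sentences, causation words, achievement words) can be approximated with simple counts. The following is a minimal sketch, not the tooling used in the study: the slides do not name the dictionaries or the extraction software, so the word lists, the vowel-run syllable heuristic, and the function names below are all assumptions made for illustration.

```python
import re

# Hypothetical mini word lists, for illustration only. The slides do not
# say which dictionaries were used, and real lists would be much larger.
CONJUNCTIONS = {"and", "but", "or", "because", "although", "while", "whereas"}
CAUSATION = {"because", "cause", "caused", "effect", "hence", "therefore", "result"}
ACHIEVEMENT = {"achieve", "achieved", "success", "win", "earn", "earned", "gain"}


def tokenize(text):
    """Lowercase the text and return word tokens (contractions stay whole)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def count_syllables(word):
    """Crude estimate: count vowel runs, discounting a silent final 'e'."""
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def cue_rates(text):
    """Return per-word rates for each cue, plus mean words per sentence."""
    tokens = tokenize(text)
    if not tokens:
        return {}
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = len(tokens)
    return {
        "three_syllable_rate": sum(count_syllables(t) >= 3 for t in tokens) / n,
        "words_per_sentence": n / len(sentences),
        "conjunction_rate": sum(t in CONJUNCTIONS for t in tokens) / n,
        "causation_rate": sum(t in CAUSATION for t in tokens) / n,
        "achievement_rate": sum(t in ACHIEVEMENT for t in tokens) / n,
    }
```

Calling cue_rates on the text of one MD&A section yields one feature row per document. The syllable heuristic over- and under-counts on many English words, so a pronunciation dictionary would be a better choice for real use.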
Our Methodology
Input: 101 MD&As with fraud problems and 101 MD&As with no fraud problems
Processing: Linguistic Extraction and Classification Tools
Features: Linguistic Cues for Deception
Output: Classified as Deceptive or Classified as Not Deceptive
Slide7
Example of Results
Greater in Fraudulent MD&As
Rate of Three Syllable Words**
Conjunctions**
Causation Words**
Achievement Words*
** = p < .05, * = p < .10
Slide8
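The results slide reports significance levels for each cue but does not name the statistical test behind them. As one way to check whether a cue rate differs between fraudulent and non-fraudulent MD&As, the sketch below uses a two-sided Monte Carlo permutation test on the difference in group means. The choice of test, the function name, and its parameters are assumptions, not the authors' method.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means.

    Assumption: this test is a stand-in; the slides do not say which
    test produced the reported p-values.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The +1 correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_perm + 1)
```

Here group_a and group_b would hold one cue rate per document, for example the causation-word rate for each of the 101 fraudulent and 101 non-fraudulent MD&As.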
Application of Automated Linguistic Analysis to Transcripts of 911 Homicide Calls for Deception Detection
Caller from Orange County, Florida
Caller from Columbia, Missouri
Slide9
911 Calls: Problem & Motivation
911 calls are a potentially rich source of verbal deception indicators
911 calls are unrehearsed, high-stakes communications
Motivation: Identify whether the linguistic content of truthful vs. deceptive 911 calls differs
Slide10
Question
Can automated linguistic analysis techniques accurately classify deceptive vs. truthful callers in transcripts of 911 homicide calls?
Based on the research in deception detection, we can look for the following (among other cues) in deceptive 911 calls, relative to truthful callers:
Higher use of they
Higher use of we
More suppressed answers, using as few words as possible (the opposite of obfuscation!)
More negation
More assent
Slide11
Methodology
Input: Twenty-five 911 Calls Labeled as Deceptive and Twenty-five 911 Calls Labeled as Truthful
Processing: Linguistic Extraction and Classification Tools
Features: Linguistic Cues for Deception
Output: Classified as Deceptive or Classified as Not Deceptive
Slide12
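The slides do not name the classifier or the validation scheme used to sort calls into deceptive and not deceptive. With only fifty labeled transcripts, one plausible setup is leave-one-out evaluation of a very simple model. The sketch below uses a nearest-centroid classifier as a stand-in; the classifier choice and the function names are assumptions.

```python
from math import dist


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Assumption: nearest-centroid is a stand-in; the slides do not say
    which classifier or validation scheme was used.
    """
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Train on every transcript except the held-out one.
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        centroids = {
            label: centroid([f for f, l in train if l == label])
            for label in {l for _, l in train}
        }
        # Predict the label whose centroid is closest to the held-out vector.
        predicted = min(centroids, key=lambda lab: dist(x, centroids[lab]))
        correct += predicted == y
    return correct / len(features)
```

Each feature vector would hold the cue rates for one call (for example, the rates of we, they, negation, and assent words), and each label would be "deceptive" or "truthful".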
Examples of Results
Variable name | Direction | Example
1st person plural | D>T | We don't know.
3rd person plural | D>T | Yes, they said, they said if they heard anything they were going to my house.
Negation | D>T | No nothing, he's gone.
Assent | D>T | Okay, they're here.
(D>T: higher in deceptive than in truthful calls)
Slide13
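The four variables in the results table can be counted directly from a transcript. The sketch below runs on the example utterances quoted in the table. The category word lists are hypothetical and deliberately tiny, since the slides do not say which dictionary was used.

```python
import re

# Hypothetical mini word lists, for illustration only; the slides do not
# name the dictionary behind these variables.
CATEGORIES = {
    "first_person_plural": {"we", "us", "our", "ours", "we're", "we've"},
    "third_person_plural": {"they", "them", "their", "theirs", "they're"},
    "negation": {"no", "not", "nothing", "never", "don't", "didn't", "can't"},
    "assent": {"yes", "okay", "ok", "yeah"},
}


def tokenize(text):
    """Lowercase the text and return word tokens (contractions stay whole)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def cue_counts(text):
    """Count tokens in each cue category, plus the total word count."""
    tokens = tokenize(text)
    counts = {
        name: sum(t in words for t in tokens)
        for name, words in CATEGORIES.items()
    }
    counts["word_count"] = len(tokens)
    return counts


# Example utterance taken from the results table.
example = "Yes, they said, they said if they heard anything they were going to my house."
print(cue_counts(example))
# third_person_plural is 4, assent is 1, word_count is 15
```

Dividing each count by word_count gives a rate that can be compared across calls of different lengths. The word count itself speaks to the "as few words as possible" cue.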
Discussion: Truthtellers
Truthtellers:
Display more negative emotion (including emotion-filled swearing) and anxiety than deceivers.
Refer to singular others (she or he).
Use more numbers to ensure responders find the address as quickly as possible or know the phone number.
Use more generic names of locations, such as ‘apartment’ or ‘garage’, to give more accurate, helpful information to responders.
Slide14
Discussion: Deceivers
Deceivers:
‘Distance’ themselves from what is said by referencing others in the 3rd person (they).
Try to ‘share the blame’ by referring to self as plural (we) rather than as singular.
Use more negation and assent words because they are trying to subdue, constrain, or suppress answers/affect.
Tell the operator to ‘wait’ or ‘hold on’ if the operator is asking them to do something, such as CPR, that they are reluctant to do.
Slide15
Questions?