PolyAnalyst Web Report Training - PowerPoint Presentation

Uploaded by isabella2 (@isabella2) on 2021-12-08

Presentation Transcript

Slide1

PolyAnalyst Web Report Training

© 2014 Megaputer Intelligence Inc.

PolyAnalyst Dictionaries

Slide2

Outline

Why do we use Dictionaries?

Dictionaries are essential to good Text Mining

Slide3

Outline

Changes in PolyAnalyst Dictionary

Old Dictionary

Companies

GeoAdministrative

Human Names

Morphology

Organizations

Phrases

Semantics

Statistics

Synonyms

Word classes

Spell Checks

Stop Lists

Word Lists

Old Dictionary Split into Multiple Parts

Slide4

Dictionaries

New Dictionary

Companies

GeoAdministrative

Human Names

Morphology

Organizations

Phrases

Semantics

Statistics

Synonyms

Word classes

Spell Checks

Entity Extraction

Sentiment Analysis

Multiple Nodes

Stop Lists

Word Lists

Keyword Extraction

Slide5

Statistics Dictionary

Statistics

Slide6

Statistics Dictionary

Keyword Extraction computes Significance from base frequencies in the Statistics Dictionary
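The slides do not give the Significance formula, so the following is only a rough sketch of the general idea: compare how often a word occurs in your text with its base frequency in a reference table. The names `base_freq` and `significance`, the frequency values, and the log-ratio score are all illustrative assumptions, not PolyAnalyst's implementation.

```python
import math
from collections import Counter

# Hypothetical base frequencies (share of all words in typical English).
# These numbers are invented for illustration only.
base_freq = {
    "the": 0.05,
    "and": 0.03,
    "received": 0.0003,
    "improved": 0.0002,
    "patient": 0.0002,
    "placebo": 0.000005,
}

def significance(tokens, base_freq, floor=1e-7):
    """Score each word by log(observed frequency / base frequency)."""
    counts = Counter(tokens)
    total = len(tokens)
    scores = {}
    for word, n in counts.items():
        observed = n / total
        expected = base_freq.get(word, floor)  # unseen words fall back to a small floor
        scores[word] = math.log(observed / expected)
    return scores

tokens = "the patient received the placebo and the patient improved".split()
scores = significance(tokens, base_freq)
ranked = sorted(scores, key=scores.get, reverse=True)  # most significant first
```

With these made-up numbers, a word that is rare in the reference table but present in the text ranks highest, while a very common word such as "the" ranks low.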

Slide7

Improving Keyword Extraction

The Default Statistics dictionary is based on a large corpus of text to estimate word frequency in typical English.

Your data might not be typical.

Slide8

Domain Specific Statistics Dictionaries

In PubMed Medical Abstracts the most significant word is “Placebo”.

“Placebo” is a common word in clinical drug trials and not helpful in this domain.

Slide9

Domain Specific Statistics Dictionaries

Train the Statistics Dictionary on Domain Data

Statistics Dictionary

Apply on our data
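As a hedged illustration of why training on domain data helps, the sketch below estimates word frequencies from a toy domain corpus and scores "placebo" against both a general table and the domain table. The functions `train_statistics` and `score`, the corpus, and the log-ratio scoring are assumptions made for this example; PolyAnalyst generates its Statistics Dictionary through its own interface.

```python
import math
from collections import Counter

def train_statistics(corpus):
    """Estimate relative word frequencies from a list of documents."""
    counts = Counter(word for doc in corpus for word in doc.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def score(word, doc_tokens, stats, floor=1e-7):
    """Log ratio of the word's frequency in the document to its base frequency."""
    observed = doc_tokens.count(word) / len(doc_tokens)
    return math.log(observed / stats.get(word, floor))

# Toy stand-ins: a general table where "placebo" is rare,
# and a domain corpus where it is common.
general_stats = {"placebo": 0.000005, "headache": 0.00002}
domain_corpus = [
    "placebo group reported headache",
    "placebo group showed no change",
    "drug group versus placebo group",
]
domain_stats = train_statistics(domain_corpus)

doc = "patients on placebo reported headache".split()
general_score = score("placebo", doc, general_stats)
domain_score = score("placebo", doc, domain_stats)
```

Against the general table "placebo" looks highly unusual; against the domain table it is no more frequent than expected, so its score drops.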

Slide10

Editing a Dictionary

All Dictionaries are in the Dictionary Manager.

Go to File -> Manage Dictionaries or press Ctrl+D

Slide11

Setting Default Dictionaries

Go to Settings -> Program options -> Project options

Slide12

Setting Default Dictionaries

Select Default Dictionaries for the project

Slide13

Training a Statistics Dictionary

The Statistics Dictionary is generated in the Index Node

Slide14

Training a Statistics Dictionary

Go to Generate -> Statistic Dictionary

Slide15

Statistics Dictionaries

In the Keyword Extraction Node Select the Statistics Dictionary

Slide16

Statistics Dictionaries

Updated keywords from new dictionaries

Slide17

Multiple Nodes Dictionaries

Synonyms

Spell Checks

Stop Lists

Multiple Nodes

Slide18

Spell Checks Dictionary

Spell Checks

Slide19

Good Spell Check Practices

Editing the default Spell Checks dictionary isn’t the best practice if you’re working in a group.

Create a project Spell Check dictionary or a personal user dictionary instead.

Slide20

Create New Dictionary

Creating a Spell Checks Dictionary

Dictionary Manager

Slide21

Inherit Default Dictionaries

Creating a Spell Check Dictionary

Slide22

Outline

Editing Spell Checks Dictionary

Improving the Spell Checks Dictionary from within the Spell Check node.

Slide23

Outline

Editing Spell Checks Dictionary

Select the Proper Dictionary

Slide24

Outline

Spell Checks Dictionary

Green color shows suggested correction.

Slide25

Outline

Spell Checks Dictionary Coding

Blue = Known Misspell from Dictionary (Confidence = 100%)

Black = Probable Misspell from Algorithm (Confidence > Threshold)

Grey = Suggested Misspell from Algorithm (Confidence < Threshold)

Empty = Unknown Misspell (Confidence = 0)
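The color coding can be read as a small decision rule. The sketch below is one possible reading: the function name `spell_check_color`, the `known_misspell` flag, and the 0.0 to 1.0 confidence scale are assumptions for illustration. The slide does not say how a confidence exactly equal to the threshold is shown; here it falls into grey.

```python
def spell_check_color(confidence, threshold, known_misspell=False):
    """Map a suggestion's confidence (0.0 to 1.0) to the display color on the slide."""
    if known_misspell:
        return "blue"    # known misspell from the dictionary (confidence = 100%)
    if confidence == 0:
        return "empty"   # unknown misspell, no suggestion available
    if confidence > threshold:
        return "black"   # probable misspell from the algorithm
    return "grey"        # suggested misspell at or below the threshold

colors = [
    spell_check_color(1.0, 0.7, known_misspell=True),
    spell_check_color(0.9, 0.7),
    spell_check_color(0.3, 0.7),
    spell_check_color(0.0, 0.7),
]
```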

Slide26

Outline

Improving Spell Checks Dictionary

Case 1) Correcting a misspell

Spell Check Algorithm is baffled.

From context we can infer the word is “commitment.”

Slide27

Outline

Improving Spell Checks Dictionary

Case 1) Correcting a misspell

Select the word and click the Add button

Slide28

Outline

Improving Spell Checks Dictionary

Case 1) Correcting a misspell

Write the corrected word and click OK

Slide29

Outline

Improving Spell Checks Dictionary

Case 2) To add a new word to the Spell Check dictionary

Right Click -> Mark as known Word

Slide30

Outline

Improving Spell Checks Dictionary

Case 2) To add a new word to the Spell Check dictionary

The new word will turn red and be added to the dictionary.

Slide31

Outline

Improving Text Mining through Synonyms

Synonyms

Slide32

Outline

Improving Text Mining through Synonyms

Many PDL functions make use of Relationships within the dictionary.

Synonym is the most common relationship.

Slide33

Outline

Dictionary Synonyms

Slide34

Outline

Edit Dictionary Synonyms Manually

Slide35

Outline

Import Dictionary Synonyms List

Synonym List

Import Dialog

Slide36

Outline

Dictionary Synonyms PDL

The thesaurus function matches all synonyms of a token.
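To illustrate what matching all synonyms of a token means, here is a minimal Python sketch. It is not PDL syntax, and the synonym groups, `build_index`, and `thesaurus_match` are hypothetical names invented for this example; a real Synonyms dictionary would supply the groups.

```python
# Hypothetical synonym groups standing in for a Synonyms dictionary.
synonym_groups = [
    {"car", "automobile", "auto"},
    {"doctor", "physician"},
]

def build_index(groups):
    """Map every word to the full group of words it is synonymous with."""
    index = {}
    for group in groups:
        for word in group:
            index[word] = group
    return index

def thesaurus_match(term, tokens, index):
    """Return the tokens that equal `term` or one of its synonyms."""
    targets = index.get(term.lower(), {term.lower()})
    return [t for t in tokens if t.lower() in targets]

index = build_index(synonym_groups)
tokens = "My auto and her automobile need a doctor".split()
hits = thesaurus_match("car", tokens, index)
```

Searching for "car" finds "auto" and "automobile" even though the literal word never appears in the text.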

Slide37

Outline

Dictionary Synonyms PDL

Slide38

Outline

Dictionary Synonyms PDL

Slide39

Outline

Stop List Dictionary

The Stop List Dictionary is a list of terms to ignore in Text Analysis.

Keyword Extraction doesn’t include terms in the stop list by default.
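The effect of a stop list can be sketched in a few lines. The `stop_list` contents and the `keyword_counts` helper below are assumptions for illustration; the `use_stop_list` flag only mimics the "by default" behavior mentioned on the slide.

```python
from collections import Counter

# Hypothetical stop list; a real Stop List Dictionary would be much longer.
stop_list = {"the", "a", "of", "is", "and", "to"}

def keyword_counts(text, stop_list, use_stop_list=True):
    """Count candidate keywords, skipping stop-list terms unless told otherwise."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if use_stop_list:
        tokens = [t for t in tokens if t not in stop_list]
    return Counter(tokens)

text = "The size of the screen is the best feature."
filtered = keyword_counts(text, stop_list)
unfiltered = keyword_counts(text, stop_list, use_stop_list=False)
```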

Slide40

Outline

Stop List Dictionary

Stop Lists

Slide41

Outline

Stop List Dictionary

Import Dialog

Slide42

Outline

Morphology Dictionary

Morphology

Slide43

Outline

Morphology Dictionary

Lemma: Abdomen

Singular: Abdomen

Singular Possessive: Abdomen’s

Plural: Abdomens

Plural Possessive: Abdomens’
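A morphology lookup of this kind can be pictured as a table from word forms to a lemma and a grammatical form. The sketch below uses only the "Abdomen" example from the slide; the `morphology` table layout and the `lemmatize` helper are assumptions for illustration, not the format of the PolyAnalyst Morphology dictionary.

```python
# Word forms mapped to (lemma, grammatical form), following the slide's example.
morphology = {
    "abdomen": ("abdomen", "singular"),
    "abdomen's": ("abdomen", "singular possessive"),
    "abdomens": ("abdomen", "plural"),
    "abdomens'": ("abdomen", "plural possessive"),
}

def lemmatize(token):
    """Return the lemma for a known word form, or the lowercased token otherwise."""
    key = token.lower().replace("’", "'")  # treat curly and straight apostrophes alike
    lemma, _form = morphology.get(key, (key, "unknown"))
    return lemma

lemmas = [lemmatize(t) for t in ["Abdomen", "Abdomen’s", "Abdomens", "Abdomens’"]]
```

All four forms reduce to the same lemma, so text analysis can treat them as one term.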

Slide44

Outline

Semantics Dictionary

Slide45

Outline

Semantics Dictionary

Dictionary Relationships

Hyponyms

Hypernyms

Meronyms

Holonyms

Synonyms

Antonyms

Slide46

Outline

Hyponyms and Hypernyms

“Cardinal”, “Eagle”, and “Ostrich” are all hyponyms of “Bird”

“Bird” is a hypernym of “Cardinal”

Slide47

Outline

Meronyms and Holonyms

“Feather” is a Meronym of “Cardinal”

“Cardinal” is a Holonym of “Feather”

Meronym = Is Part Of
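Hyponym/hypernym and meronym/holonym are inverse pairs, so storing one direction implies the other. The sketch below shows this with the bird examples from the slides; the `relations` structure and `add_relation` helper are hypothetical and do not reflect how the Semantics dictionary is stored.

```python
from collections import defaultdict

# Each relationship and its inverse.
INVERSE = {
    "hyponym": "hypernym",
    "hypernym": "hyponym",
    "meronym": "holonym",
    "holonym": "meronym",
}

# (word, relation) -> set of words that `word` stands in that relation to.
relations = defaultdict(set)

def add_relation(word, relation, other):
    """Record that `word` is a <relation> of `other`, plus the inverse fact."""
    relations[(word, relation)].add(other)
    relations[(other, INVERSE[relation])].add(word)

for bird in ("cardinal", "eagle", "ostrich"):
    add_relation(bird, "hyponym", "bird")      # "Cardinal" is a hyponym of "Bird"
add_relation("feather", "meronym", "cardinal")  # "Feather" is part of "Cardinal"
```

After these calls, looking up `("bird", "hypernym")` returns all three birds, and `("cardinal", "holonym")` returns "feather", without either fact being entered explicitly.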

Slide48

Outline

Synonyms and Antonyms

“Birdcage” is a synonym of “Aviary”

“Heat” is an antonym of “Cold”

Slide49

Outline

PolyAnalyst Dictionaries

Companies

GeoAdministrative

Human Names

Organizations

Word classes

Entity Extraction

Sentiment Analysis

Slide50

Outline

Adding Word Classes

Step 1) Create a CSV File

Vertical Entry

Horizontal Entry
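The transcript does not preserve what the two CSV layouts look like, so the sketch below rests on a guess: vertical entry puts the class name in the first row with one member per following row, and horizontal entry puts the class name and its members on a single row. The `Temperatures` members are borrowed from the example sentences later in the deck, and `read_vertical` and `read_horizontal` are hypothetical helpers; check the layout PolyAnalyst actually expects before relying on this.

```python
import csv
import io

def read_vertical(text):
    """Assumed vertical layout: class name in the first row, one member per later row."""
    rows = [row[0].strip() for row in csv.reader(io.StringIO(text)) if row and row[0].strip()]
    return {rows[0]: rows[1:]}

def read_horizontal(text):
    """Assumed horizontal layout: each row is a class name followed by its members."""
    classes = {}
    for row in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in row if c.strip()]
        if cells:
            classes[cells[0]] = cells[1:]
    return classes

vertical = "Temperatures\ndegrees\nC\nF\nFahrenheit\n"
horizontal = "Temperatures,degrees,C,F,Fahrenheit\n"

from_vertical = read_vertical(vertical)
from_horizontal = read_horizontal(horizontal)
```

Under these assumptions both layouts describe the same word class, just arranged down a column or across a row.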

Slide51

Outline

Adding Word Classes

Step 2) Create a New Dictionary

Slide52

Outline

Dictionary Import Screen

Step 3) Name the Dictionary

The inherit option clones the inherited dictionaries

Slide53

Outline

Dictionary Import Screen

Step 4) Import CSV as Word class

Slide54

Outline

New Word Class

Slide55

Outline

Use in a Lingua Mark Expression

{<,P(1)> <Temperatures,PL(SP)>:@}:Temp

Slide56

The high for Wednesday is 105 degrees

Room temperature is about 25 C

The product was left in the freezer at 11 F

75 Fahrenheit is a comfortable temperature

{<,P(1)> <Temperatures,PL(SP)>:@}:Temp

Extracted Temperature
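Judging from the example sentences, the expression appears to capture a number followed by a member of the Temperatures word class. The Python regular expression below is only an approximation of that behavior, not a translation of the Lingua Mark syntax; the `temperatures` list is a hypothetical stand-in for the imported word class.

```python
import re

# Hypothetical members of the Temperatures word class.
temperatures = ["degrees", "Fahrenheit", "Celsius", "C", "F"]

# A number, whitespace, then a class member. Longer members are listed first
# so that "Fahrenheit" is tried before the single letter "F".
members = "|".join(re.escape(w) for w in sorted(temperatures, key=len, reverse=True))
pattern = re.compile(rf"\b(\d+(?:\.\d+)?)\s+({members})\b")

sentences = [
    "The high for Wednesday is 105 degrees",
    "Room temperature is about 25 C",
    "The product was left in the freezer at 11 F",
    "75 Fahrenheit is a comfortable temperature",
]

extracted = [m.group(0) for s in sentences for m in pattern.finditer(s)]
```

Each sentence yields one match: the numeric value together with its unit word.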

Slide57

Word Classes that Convey Sentiment

Sentiment analysis relies heavily on word classes that convey sentiment.

Slide58

Word Classes that Convey Sentiment

Default Word class Dictionary

absbadadj

Accursed

Awful

Terrible

badadv

Badly

Immorally

Irresponsibly

goodadj

Accommodating

Accurate

Adequate

Sentiment Word Classes convey Polarity, Part of Speech, Degree

Slide59

Outline

Sentiment Word Classes

Sentiment Word Classes are Customizable

Domain specific additions such as slang and emoticons.

:D   ;( ;)

Slide60

Word Lists Dictionary

Wordlists are an older form of wordclasses

Lists of associated words

Default Wordlists are “Positive” and “Negative” and are used for Sentiment Analysis
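As a rough picture of how "Positive" and "Negative" word lists can feed sentiment analysis, the sketch below counts hits from each list and subtracts. The list contents reuse words shown in the default word class slide, and `sentiment_score` is a hypothetical helper; PolyAnalyst's actual sentiment analysis is more involved and is not described in these slides.

```python
# Small stand-ins for the default "Positive" and "Negative" word lists.
positive = {"accommodating", "accurate", "adequate"}
negative = {"accursed", "awful", "terrible"}

def sentiment_score(text):
    """Count positive-list hits minus negative-list hits in the text."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return sum(t in positive for t in tokens) - sum(t in negative for t in tokens)

good = sentiment_score("The staff were accommodating and accurate.")
bad = sentiment_score("Awful service, terrible food.")
neutral = sentiment_score("The package arrived on Tuesday.")
```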

Slide61

Word Lists Dictionary

Positive Word List

Slide62

Using Word Lists

In the Taxonomy Node use the Term Function

Slide63

Phrases Dictionary

Phrases Dictionary is similar to Wordlists using multiple words or “Phrases”

Slide64

Other Dictionaries

Companies

GeoAdministrative

Human Names

Organizations

Entity Extraction

Sentiment Analysis

Slide65

Outline

Default Entity Extraction

People - “Leader Alvaro Hernandez”, “Bill Martin”

Companies - “Blue Shield of California”, “Global Systems Inc.”

GeoAdministrative - “Tucson Arizona”, “Ecuador”

Units - “Second, Meter, Degree”

Slide66

Outline

Dictionaries are essential to good Text Mining

Slide67

Contacting Megaputer

Questions?