Slide1
Pig from Alan Gates’ book (In preparation for exam 2)
Slide2
Introduction
Download Pig from pig.apache.org (onto timberlake or your local computer/laptop).
Unzip and untar it. You are set to go.
You can execute in local mode for learning purposes. Later on you can test it on your Hadoop installation.
Navigate to the directory where Pig is installed.
./bin/pig -x local
This puts you in the grunt shell, running in local mode.
Slide3
Data and pig Script
Create a data directory (called data) in the directory where bin is located.
Download from GitHub all the data files related to the Pig book and store them in the data directory:
NYSE_dividends
NYSE_daily
Etc.
Now go through the examples in chapters 1-4, either by typing them in line by line or by creating script files.
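A script file is just the same statements saved in a text file. As a rough sketch of what a file such as Mystockanalysis.pig might contain (the contents, alias names, and the relative path data/NYSE_dividends are assumptions for illustration, not taken from the book), the following computes the average dividend per stock symbol:

```pig
-- Hypothetical contents for Mystockanalysis.pig; adjust the path to your setup.
-- Load the dividends file with the schema used elsewhere in these notes.
divs = load 'data/NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
-- Collect all records that share a stock symbol.
grpd = group divs by symbol;
-- Compute the average dividend for each symbol.
avgdiv = foreach grpd generate group as symbol, AVG(divs.dividend) as avg_dividend;
-- Print the result to the console.
dump avgdiv;
```

The relative path assumes Pig is started from the directory that contains both bin and data.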
A script such as Mystockanalysis.pig can be executed by
./bin/pig -x local Mystockanalysis.pig
or line by line in grunt.
Slide4
Chapter 1
Hello world of Pig.
“Mary had a little lamb” example.
Go through the example on p. 3.
Create a “mary” file in your data directory.
Type in the commands line by line as on p. 3.
Now create a ch1.pig file out of the commands.
Run the script file using the pig command.
Try some other commands not listed there.
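For orientation, the commands in this exercise amount to a word count over the lines of the file. A minimal sketch, assuming the file is saved as data/mary and Pig is started from the install directory; the alias names here are illustrative, and the book's listing on p. 3 may differ in detail:

```pig
-- Load each line of the file as a single field named 'line'.
lines = load 'data/mary' as (line:chararray);
-- TOKENIZE splits a line into a bag of words; flatten turns
-- that bag into one record per word.
words = foreach lines generate flatten(TOKENIZE(line)) as word;
-- Group identical words together.
grpd = group words by word;
-- Count the occurrences of each word.
cntd = foreach grpd generate group, COUNT(words);
-- Print the (word, count) pairs.
dump cntd;
```

Saved as ch1.pig, the same statements can be run with ./bin/pig -x local ch1.pig.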
Understand the examples discussed on pp. 5-6.
Slide5
Chapter 2
Discusses installing and running Pig.
Go through the example on p. 14.
That’s all.
Slide6
Chapter 3
Discusses the grunt shell, which is the interactive prompt you get in local mode.
pig -x local
results in the grunt prompt:
grunt>
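A short grunt session might look like the sketch below. It assumes the data directory described earlier and uses illustrative alias names; it is not the example from the book.

```pig
grunt> -- Load the dividends file; fields without declared types default to bytearray.
grunt> divs = load 'data/NYSE_dividends' as (exchange, symbol, date, dividend);
grunt> -- Keep only a few records so the output stays readable.
grunt> first10 = limit divs 10;
grunt> -- dump triggers execution and prints the records.
grunt> dump first10;
grunt> quit
```

Nothing runs until dump (or store) is issued; the earlier statements only build up the plan.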
See the example on page 20.
Slide7
Chapter 4
Pig data model
Scalars like: int, long, float, double, etc.
Complex types:
Map: a chararray-to-element mapping, sort of like a key-value pair
Tuple: an ordered collection of Pig elements, e.g. ('bob', 55)
Bag: an unordered collection of tuples
Nulls
Schemas: Pig has a lax attitude towards schemas.
Explicit:
dividends = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
Or you could say
divs = load 'NYSE_dividends' as (exchange, symbol, date, dividend);
See the table on page 28.
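One way to see what the two load statements do differently is to ask Pig for the schema with describe. The sketch below assumes the file lives at data/NYSE_dividends relative to where Pig was started; the grouping step is added only to show a complex type appearing in a schema.

```pig
-- Explicit schema: every field has a declared type.
dividends = load 'data/NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
describe dividends;

-- Names only: fields without a declared type default to bytearray.
divs = load 'data/NYSE_dividends' as (exchange, symbol, date, dividend);
describe divs;

-- Grouping produces a complex type: each record holds the group key
-- plus a bag of the tuples that share that key.
grpd = group dividends by symbol;
describe grpd;
```

describe only reports the schema; it does not run a job over the data.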
See the examples on pp. 28-30.
Slide8
Chapter 5
Pig Latin
Look at the examples on pp. 33-50.
Commands discussed are:
Load, store, dump
Relational operations: foreach, filter, group, order by, distinct, join
Data operations: limit, sample, parallel.
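The commands listed above can be tried together in one script. This is a sketch under stated assumptions, not an example from the book: the file paths, the alias names, the output directory, and the column layout given for NYSE_daily are all assumptions to be checked against the actual data files.

```pig
-- load: read both files (paths and the NYSE_daily schema are assumptions).
daily = load 'data/NYSE_daily' as (exchange:chararray, symbol:chararray, date:chararray,
        open:float, high:float, low:float, close:float, volume:int, adj_close:float);
divs = load 'data/NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);

-- foreach: project only the columns of interest.
prices = foreach daily generate symbol, date, close;

-- filter: keep records that satisfy a condition.
paid = filter divs by dividend > 0.0;

-- distinct: remove duplicate records.
syms = foreach divs generate symbol;
uniq = distinct syms;

-- group, with a parallel clause requesting 2 reducers
-- (the clause has no effect in local mode).
grpd = group paid by symbol parallel 2;
avgdiv = foreach grpd generate group as symbol, AVG(paid.dividend) as avg_dividend;

-- order by: sort by average dividend, largest first.
srtd = order avgdiv by avg_dividend desc;

-- join: match closing prices with dividends on symbol and date.
jnd = join prices by (symbol, date), paid by (symbol, date);

-- limit and sample: take the first 5 records, and roughly 10% of the records.
top5 = limit srtd 5;
smpl = sample daily 0.1;

-- dump prints to the console; store writes to a directory,
-- which must not already exist.
dump uniq;
dump top5;
dump smpl;
store jnd into 'output/dividend_days';
```

Running it line by line in grunt, with a describe after each statement, is a quick way to see how each operator changes the schema.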