Introduction to Shell scripting


tatiana-dople
Uploaded On 2016-12-19

Presented by Shailender Nagpal and Al Ritacco, Research Computing, UMASS Medical School


Presentation Transcript

Slide1

Introduction to Shell scripting

Presented by:

Shailender Nagpal, Al Ritacco

Research Computing

UMASS Medical School

Slide2

AGENDA

Shell basics: Scalars, Arrays, Expressions, Printing

Built-in commands, Blocks, Branching, Loops

String and Array operations

File operations: text processing utilities SED, AWK

Writing custom functions

Providing input to programs

Shell scripting strategies

Using Linux scripts with the LSF cluster

Slide3

What is Shell scripting?

A series of Linux commands in a text file that can be executed in a Linux shell in top-down fashion

The Linux shell provides a high-level, general-purpose, interpreted, interactive programming environment

Simple iterative, top-down, left-to-right programming style for users to create small and largish programs

Mainly for automating Linux tasks, but also for writing integrated workflows

Slide4

Features of Shell scripting

Linux code is for Linux operating system only

Easy to use and lots of resources are available

Procedural programming, not strongly "typed"

Similar programming syntax to other languages: if, for, do, functions, etc.

Provides limited methods to manipulate data: scalars, arrays

Statements don't need to be terminated by a semicolon (but can be)

Slide5

Advantages of Shell scripting

Not as fully featured as C, Java, Perl or Python, but still very useful for automation, file processing and workflow development, making it advantageous to use in certain applications like Bioinformatics

Fewer lines of code than C, Java. Similar to Perl, Python

No compilation necessary. Prototype and run!

Run every line of code interactively

Vast command library

Save coding time and automate computing tasks

Code is even more concise than Perl and Python

Slide6

Types of linux "shells"

Shells provide a user interface (command prompt) to the underlying Unix operating system

They give users an environment to execute commands upon login

Many shells are available, which are mostly the same, but with some minor differences

Bourne shell (sh), C shell (csh), TC shell (tcsh), Korn shell (ksh), Bourne Again Shell (bash)

Which "shell" are you using?

echo $SHELL

Slide7

Shell features

FEATURES              Bourne  C    TC   Korn  BASH
Command history       No      Yes  Yes  Yes   Yes
Command alias         No      Yes  Yes  Yes   Yes
Shell scripts         Yes     Yes  Yes  Yes   Yes
Filename completion   No      Yes  Yes  Yes   Yes
Command line editing  No      No   Yes  Yes   Yes
Job control           No      Yes  Yes  Yes   Yes

Slide8

First Shell program

The obligatory "Hello World" program

#!/usr/bin/bash
# Comment: 1st program: variable, echo
name="World"
echo "Hello $name"
echo "Hello ${name}"

Save with a ".sh" extension (for example, hello.sh), then at the Linux shell:

chmod 755 hello.sh # Make it executable
./hello.sh
Hello World
Hello World

Slide9

Understanding the code

The first line of a Shell script requires an interpreter location, which is the path to the "bash" shell

#!/path/to/bash

2nd line: A comment, beginning with "#"

3rd line: Declaration of a string variable

4th, 5th lines: echoing some text to the shell with a variable, whose value is interpolated with the $ sign

The quotes are not echoed, and "$name" is replaced by "World" in the output.

Slide10

Second program

Report summary statistics of DNA sequence

#!/usr/bin/bash
dna="ATAGCAGATAGCAGACGACGAGA"
# printf avoids the trailing newline that echo would add to the character count
dna_length=`printf "%s" "$dna" | wc -m`
echo "Length of DNA is $dna_length"
echo "Number of A bases are" `echo $dna | grep -o "A" | wc -l`
echo "Number of C bases are" `echo $dna | grep -o "C" | wc -l`
echo "Number of G bases are" `echo $dna | grep -o "G" | wc -l`
echo "Number of T bases are" `echo $dna | grep -o "T" | wc -l`
echo "Number of GC dinucleotides are" `echo $dna | grep -o "GC" | wc -l`
gc=$((`echo $dna | grep -o "G" | wc -l`+`echo $dna | grep -o "C" | wc -l`))
gc_per=`echo "$gc/$dna_length*100" | bc -l`
printf "G+C percent content is %2.1f\n" $gc_per

Quick summary; re-use the code to find motifs, RE sites, etc.

Slide11

Linux Commands

ls cp rm mv cd mkdir pwd rmdir
cat head tail clear vi passwd less more
history export alias function
date who whoami last exit wc grep man sort uniq
gzip tar file ssh rsh
scp rsync ftp echo touch
file cut tee dos2unix ps
bg fg wait top time
who df du screen last
chmod chown chgrp grep/egrep sed
awk test expr csplit diff

Slide12

Linux Commands (…contd)

find locate finger history host
hostname jobs join kill ln
mail make mount umount nl
nohup passwd ps pstree nice
renice rlogin rsh set setenv
tee test top tr unalias
uname untar unless unzip uptime
vmstat wget which while xargs
zip env su sudo emacs
pico nano bzip2 sleep disown
source exec bash umask paste
svn free banner fgrep crontab

Slide13

Linux Commands (…contd)

if then else elif fi

for do done

while case esac

Slide14

Application commands

allegro bedtools blast bowtie bwa
clustalW crossbow cufflinks fasta fastx
maq maqview mfold plink
polyphen primer3 prinseq samtools snpEff
sratools tophat vcftools vmd namd

To run the "blast" command for example, run this:

blastall -p blastn -d nr -i query.fa

Slide15

Shell comments

Use the "#" character at the beginning of a line to add comments to your code

Helps you and others to understand your thought process

Let's say you intend to sum up an array of numbers

# (sum from 1 to 100 of X)

The code would look like this:

sum=0 # Initialize variable called "sum" to 0
for i in $(seq 1 100); do # Use "for" loop to iterate over 1 to 100
    sum=$(( $sum + $i )) # Add the current value of i to the previous sum
done
echo "The sum of 1..100 is $sum" # Report the result

Slide16

Shell script: Variables

Variables

Provide a location to "store" data we are interested in

Strings, decimals, integers, characters, arrays, …

What is a character – a single letter or number

What is a string – an array of characters

What is floating point – a number with a decimal point, such as 4.7 (sometimes referred to as a real)

Variables can be assigned or changed easily within a Shell script

Slide17

Variables and built-in keywords

Variable names should represent or describe the data they contain

Do not use meta-characters; stick to letters, digits and underscores. Begin variable names with a letter

Shell scripting as a language has keywords that should not be used as variable names. They are reserved for writing syntax and logical flow of the program

Examples include: if, then, fi, for, while, do, done, case, function, etc.

Slide18

Special shell variables

Shell built-in variables available:

$# - Number of command line arguments
$* - All arguments, joined into a single word when quoted as "$*"
$@ - All arguments, each kept as a separate word when quoted as "$@"
$$ - Process ID of the current shell or script
$! - Process ID of the last program put into the background
$? - Exit code of the last command executed
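The difference between $* and $@ only becomes visible when they are quoted. The following is a minimal sketch, not part of the original slides; the function names count_words and show_args are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# count_words is a hypothetical helper: it reports how many arguments it received.
count_words() {
    echo "$#"
}

# show_args forwards its own arguments in two different ways.
show_args() {
    echo "Number of arguments: $#"
    echo "Forwarded with \"\$*\": $(count_words "$*") word(s)"
    echo "Forwarded with \"\$@\": $(count_words "$@") word(s)"
}

show_args "first arg" second third

# $? holds the exit code of the most recent command.
false
echo "Exit code of 'false': $?"
```

Quoting "$@" is usually the safer choice when passing arguments on to another command, because arguments that contain spaces stay intact.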

Slide19

Shell "Environment" variables

Try out the commands

env
printenv

Variables that control the behavior of the shell are called Environment Variables

An important variable is the "PATH" variable, which lists the directories that are searched, in order, for commands to execute

Try:

which man

Slide20

Variables, Arrays

Variables that hold single strings are string variables

Variables that hold single integers are integer variables

rank=3
score=5.3
dna="ATAGGATAGCGA"

Collections of variables are called arrays: for example, an array of students in a class, scores from a test, etc.

students=("Alan" "Shailender" "Chris")
scores=(89.1 65.9 92.4)
binding_pos=(9439984 114028942)

Slide21

Printing text and variables

Single quotes do not process escape sequences (such as \t and \n) or variables and are therefore generally not used

Double quotes process variables prefixed with the "$" sign. Escape sequences are not processed by "echo" (unless "echo -e" is used)

Ex:

x=1
echo "This \t is a test\nwith text $x"

Output:

This \t is a test\nwith text 1

To process escape sequences use "printf":

printf "This is a \t tab"
printf "This is a \t tab %s %s" $x $x

Slide22

Printing arrays

Array variables can also be echoed as an array with a default delimiter, but another way to echo arrays is to put them in a loop and echo them as scalars

students=("Alan" "Shailender" "Chris")
echo "students\n" # Does not work!
printf "%s %s %s\n" ${students[@]} # Method 1
printf "%s %s %s\n" ${students[0]} ${students[1]} ${students[2]} # Method 2

If you run this as a program, the two printf lines give this output:

Alan Shailender Chris # Method 1
Alan Shailender Chris # Method 2
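The loop approach mentioned on this slide could look like the following sketch. It reuses the students array from the slide; the variable name student is an assumption made for illustration.

```bash
#!/usr/bin/env bash
students=("Alan" "Shailender" "Chris")

# Quoting "${students[@]}" keeps each element intact, even if it contains spaces.
for student in "${students[@]}"; do
    printf "Student: %s\n" "$student"
done

# printf re-uses its format for every argument, so one "%s\n" prints one element per line.
printf "%s\n" "${students[@]}"
```

Unlike the fixed "%s %s %s" format, both variants keep working when the array grows or shrinks.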

Slide23

Math Operators and Expressions

Math operators

Eg: echo $((3 + 2))

+ is the operator

We read this left to right

Basic operators such as + - / * and ** (exponentiation, often written ^ in other tools)

Variables can be used

echo "Sum of 2 and 3 is " $((2+3))
x=3 # no spaces are allowed around "="
echo "Sum of 2 and x is " $((2+$x))

PEMDAS rules are followed to build mathematical expressions.

Floating point operations are not allowed

Slide24

Mathematical operations

Another way for integer arithmetic

let "x=3"
let "y=5"
let "z=y+x"
echo $z
let "x=x*z"
let "y++"
echo $x
echo $y

Yet another way

z=`expr $x + 4` # spaces required between operands and operator

Slide25

Floating point arithmetic

Many ways to do this. If "bc" is available, print an expression and send it to the bc calculator

x=1.5
y=2.9
echo "$x/$y" | bc -l

One can also use the "awk" program

echo `awk 'BEGIN {print 5/3}'`
z=`awk 'BEGIN { x = 1.5; y = 2.9; printf("%2.1f", y/x) }'`
echo $z
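In the awk example the numbers are typed inside the awk program itself. When the values already live in shell variables, they can be handed to awk with its -v option. A minimal sketch, where the awk variable names a and b and the shell variable ratio are assumptions:

```bash
#!/usr/bin/env bash
x=1.5
y=2.9

# -v copies a shell variable into an awk variable before the program runs.
ratio=$(awk -v a="$x" -v b="$y" 'BEGIN { printf("%.2f", b / a) }')
echo "y/x = $ratio"
```

Passing values with -v avoids pasting shell variables into the quoted awk program, which tends to become hard to read and easy to break.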

Slide26

Creating Arrays

Integer sequences can be created using the command "seq", which takes a start position, an optional increment size, and an end position, in that order

seq 1 2 10
echo $(seq 1 2 10)
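To keep the sequence as an array rather than just printing it, the output of seq can be wrapped in parentheses. A short sketch; the array name odds is an assumption:

```bash
#!/usr/bin/env bash
# Capture the output of seq (start 1, step 2, end 10) in an array.
odds=($(seq 1 2 10))

echo "Number of elements: ${#odds[@]}"
echo "First element: ${odds[0]}"
echo "All elements: ${odds[@]}"
```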

Slide27

Array Indexing

Arrays can be indexed by number to retrieve individual elements

Indexes have range 0 to (n-1), where 0 is the index of the first element and n-1 is the last item's index

nucleotides=("adenine" "cytosine" "guanine" "thymine" "uracil")
echo ${nucleotides[3]} # is equal to thymine
echo ${nucleotides[4]} # is equal to what?

Any element of an array can be re-assigned

nucleotides[4]="Uracil"
echo ${nucleotides[@]} # @ represents all elements

Slide28

Array Operations

Consider an array

data=(10 20)

To get the number of items in array

echo ${#data[@]}

To add items to the end of the array

data=(${data[@]} 30 40); echo ${data[@]}

To get the string length of a particular item in array

echo ${#data[3]}

Slide29

Array Operations (…contd)

To extract a slice of items in array

echo ${data[@]:2:2}

To find and replace items in the array

echo ${data[@]/0/5}

To remove an item at a given position

unset data[3]; echo ${data[@]}

To remove items based on patterns

echo ${data[@]/2*/}

Slide30

String Operations: Split

Shell script provides excellent features for handling strings contained in variables

A string can be split on a delimiter, for example by using the "tr" command to turn each delimiter into a newline and break the string apart

For example, to extract the words in a sentence, we use the space delimiter to capture the words

x="This is a sentence"
echo $x | tr " " "\n"
for word in `echo $x | tr " " "\n"`; do
    echo $word;
done

Slide31

String Operations

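Two quoted strings placed directly next to one another (with no space between them) are concatenated automatically. The unquoted $words below is re-split by the shell, so the doubled space collapses to one

echo "Hello "" world"
Hello  world
words=`echo "Hello "" world"`
echo $words
Hello world

Slide32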

String Sub-scripting

Once a string is created, it can be subscripted using its indices that begin with 0

word="Programming"

echo ${word:0} # "Programming"
echo ${word:3} # "gramming"
echo ${word:0:3} # "Pro"

Slices of Shell script strings cannot be assigned, eg

${word:0:1}="D" # This won't work

Slide33

String Commands

Some examples

dna="ATAGACGACGACGTCAGAGACGA"

Length of DNA is

echo "Length is" ${#dna}

Find the position (counting from 1) of the first character that belongs to a given set of characters

echo `expr index "$dna" GA`

Extract a substring

echo `expr substr $dna 1 2`

Convert to uppercase or lowercase

echo $dna | tr 'A-Z' 'a-z'

Slide34

String Functions (…contd)

Delete a pattern within a string

echo ${dna#A*A} # Delete shortest match from front
echo ${dna##A*A} # Delete longest match from front

Find and replace a string

echo ${dna//AT/GGGGG} # Replace all occurrences

Slide35

File Processing operations

There are many commands in Linux that operate directly on files, without having to open them and save data in arrays, etc.

This is a big advantage over Perl, Python

sort cut uniq awk sed tr split head tail

Slide36

File Processing operations: AWK

Consider CSV file "gene_data.txt"

awk -F "," '$2>1000' gene_data.txt
awk -F "," '$2>50 && $4<50' gene_data.txt
awk -F "," '$2>50 && $4<50 {print $1}' gene_data.txt
awk -F "," '$2>50 && $4<50 {printf("%s\t%s\t%s\n", $1,$3,$5)}' gene_data.txt
awk -F "," '$2>50 && $4<50 {printf("%s\t%f\n", $1,$4-$2)}' gene_data.txt

Slide37

File Processing operations: SED

SED is a text stream editor that operates on files as well as standard input. Its main function is to find patterns and act on them: delete or replace text

Here are some simple examples of using SED

Delete lines containing a pattern from a file

sed '/^>/d' sequence.fa # Result in STDOUT
sed -i '/^>/d' sequence.fa # In-place

Replacement of a text pattern with another text

sed 's/T/U/g' sequence.fa

Slide38

File Processing operations: CUT

Dissect the "gene_info.txt" file in a few ways

Extract the 2nd column from file (each line)

cut -f 2 -d "," gene_info.txt

Extract the 1st and 4th columns from file (each line)

cut -f 1,4 -d "," --output-delimiter=" " gene_info.txt

Extract the 10th character in each line

cut -c 10 gene_info.txt

Extract the 10th to 12th characters in each line

cut -c 10-12 gene_info.txt

Extract the 3rd and 13th characters in each line

cut -c 3,13 gene_info.txt

Slide39

File Processing operations: SORT

Sort the "gene_data.txt" file in different ways

1st column, dictionary order. Delimiter is ","

sort -k 1 -d -t "," gene_data.txt

2nd column, numerical increasing order. Delimiter is ","

sort -k 2 -n -t "," gene_data.txt

4th column, numerical decreasing order. Delimiter is ","

sort -k 4 -nr -t "," gene_data.txt

Slide40

File Processing operations: UNIQ

The "uniq" command finds consecutive lines in files or STDIN that are the same and merges them for display

The best use of the command is with delimited files where a particular field is "cut" out and sorted

How many unique chromosomes are represented in the file "gene_info.txt"?

cat gene_info.txt
cat gene_info.txt | cut -d "," -f 3 | sort | uniq
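The pipeline on this slide lists the distinct values; appending "wc -l" turns the list into a count, and "uniq -c" shows how often each value occurs. The sketch below uses a small made-up CSV (the gene names, lengths and chromosomes are illustration data, not the real gene_info.txt) so that it can be run on its own.

```bash
#!/usr/bin/env bash
# Hypothetical sample data: gene name, length, chromosome (not from the original slides).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
geneA,1200,chr1
geneB,800,chr2
geneC,950,chr1
geneD,400,chrX
EOF

# Number of distinct chromosomes: sort first so that identical values become adjacent.
n_chrom=$(cut -d "," -f 3 "$tmpfile" | sort | uniq | wc -l)
echo "Distinct chromosomes: $n_chrom"

# uniq -c prefixes each distinct value with the number of times it occurs.
cut -d "," -f 3 "$tmpfile" | sort | uniq -c

rm -f "$tmpfile"
```

The sort step matters: uniq only merges lines that are next to each other, so unsorted input would be over-counted.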

Slide41

File Processing operations: TR

"tr" translates, squeezes, and/or deletes characters from standard input, writing to standard output

In string, delete all spaces

echo "Sam Smith" | tr -d ' '

In string, replace spaces with tabs

echo "Sam Smith" | tr -s ' ' '\t'

In FASTA file, concatenate all DNA into string

sed '/^>/d' sequence.fa | tr -d '\n'

Slide42

File Processing operations: SPLIT

Let's say you want to break a FASTQ file into pieces so you can align each piece separately in parallel: how would you split the file?

One approach will be to count the reads and split by "m" equal reads

Another would be to divide into "n" pieces of somewhat equal size, which may corrupt FASTQ

Shown below:

nlines=`wc -l reads.fq | cut -f 1 -d " "`
echo $nlines/100 | bc
split -l 132000 -a 3 -d reads.fq
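One way to avoid corrupting the FASTQ file is to make the chunk size a multiple of 4 lines. This is a sketch under the assumption that every read occupies exactly 4 lines (the common FASTQ layout); the function lines_per_chunk and the example numbers are hypothetical.

```bash
#!/usr/bin/env bash
# lines_per_chunk TOTAL_LINES N_PIECES
# Hypothetical helper: returns a chunk size (in lines) that is a multiple of 4,
# so that every FASTQ record (assumed to be 4 lines) stays inside a single piece.
lines_per_chunk() {
    local total_lines=$1 n_pieces=$2
    local reads=$(( total_lines / 4 ))
    local reads_per_chunk=$(( (reads + n_pieces - 1) / n_pieces ))   # round up
    echo $(( reads_per_chunk * 4 ))
}

# Example with made-up numbers: 1,000,000 reads (4,000,000 lines) in 100 pieces.
chunk=$(lines_per_chunk 4000000 100)
echo "split -l $chunk -a 3 -d reads.fq"
```

With a real file, the total could be taken from `wc -l < reads.fq`, and the printed split command run directly.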

Slide43

File Processing operations: CAT

With the "cat" command, many file operations can be accomplished

The contents of a file can be loaded into an array

lines=(`cat filename.txt`) # splits on whitespace, so each word becomes one element
echo ${lines[@]}

Files can be loaded into STDOUT for string operations

cat filename.txt | wc -l

Files can be re-directed as output to other files with the re-direction operator

cat filename.txt >> filename2.txt # ">>" appends to filename2.txt
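Command substitution splits a file on any whitespace, so a line containing spaces ends up spread over several array elements. To load exactly one line per element, bash's "mapfile" builtin can be used. A minimal sketch, assuming bash 4 or later; the sample file is created on the fly and its contents are made up.

```bash
#!/usr/bin/env bash
# Hypothetical sample file, created here so the example is self-contained.
tmpfile=$(mktemp)
printf "first line\nsecond line\nthird line\n" > "$tmpfile"

# mapfile (bash 4 or later) reads one line per array element; -t strips the newlines.
mapfile -t lines < "$tmpfile"

echo "Number of lines: ${#lines[@]}"
echo "Second line: ${lines[1]}"

rm -f "$tmpfile"
```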

Slide44

Commands blocks in Shell script

A group of statements surrounded by braces {}?

No! Shell script does not use curly braces to delimit control-flow blocks (they are used for function bodies)

Shell script blocks begin with "then", "elif", "else", "do" and "case" statements and end in "fi", "done" and "esac" statements

Creates a new context for statements and commands

Ex:

if (( $x>1 )); then
    echo "Test"
    echo "x is greater than 1"
fi

Slide45

Conditional operations with "if-then-else"

If-then-else syntax allows programmers to introduce logic in their programs

Blocks of code can be branched to execute only when certain conditions are met

if [ condition1 is true ]; then
    <statements if condition1 is true>
else
    <statements if condition1 is false>
fi

Nested if statements are possible
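As a sketch of how "elif" and a nested "if" fit together, consider the example below. The function name classify_gc and the 40/60 thresholds are arbitrary illustration values, not something taken from the slides.

```bash
#!/usr/bin/env bash
# classify_gc is a hypothetical function; the thresholds are arbitrary illustration values.
classify_gc() {
    local gc=$1    # integer G+C percentage
    if [ "$gc" -lt 0 ] || [ "$gc" -gt 100 ]; then
        echo "invalid"
    elif [ "$gc" -ge 60 ]; then
        echo "GC-rich"
    else
        # A nested "if" inside the "else" branch
        if [ "$gc" -le 40 ]; then
            echo "AT-rich"
        else
            echo "balanced"
        fi
    fi
}

classify_gc 65
classify_gc 50
classify_gc 30
```

Note that "else" is written without a trailing semicolon, while "then" on the same line as the test needs the ";" before it.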

Slide46

Conditions/Tests

Linux supports many kinds of "tests" that result in a T/F value, which can be used in an if-then-else statement

if [ -f file.txt ]; then echo "File exists"; \
else echo "Does not exist"; fi

if [ -d dirname ]; then echo "Directory exists"; \
else echo "Does not exist"; fi

if [ "string" = "$string" ]; then echo "Identical strings"; \
else echo "Not same"; fi

if [ "string" != "$string" ]; then echo "Not identical strings"; \
else echo "Same"; fi

if [ -n "$string" ]; then echo "String not empty"; \
else echo "Empty string"; fi

Slide47

Conditions/Tests (..contd)

if [ INTEGER1 -eq INTEGER2 ]; then echo ""; else echo ""; fi
if [ INTEGER1 -ge INTEGER2 ]; then echo ""; else echo ""; fi
if [ INTEGER1 -gt INTEGER2 ]; then echo ""; else echo ""; fi
if [ INTEGER1 -le INTEGER2 ]; then echo ""; else echo ""; fi
if [ INTEGER1 -lt INTEGER2 ]; then echo ""; else echo ""; fi
if (( $num <= 5 )); then echo "Number is less than or equal to 5"; fi

Double square bracket syntax is also used. (When?)

Slide48

Rules of conditional statements

Always keep spaces between the brackets and the actual check/comparison

Always terminate the condition with ";" before a keyword such as "then" when both are on the same line, since the test is itself a shell command

Quote string variables if you use them in conditions

You can invert a condition by putting an "!" in front of it

You can combine conditions by using "-a" for "and" and "-o" for "or"
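The rules above can be seen together in one small sketch. The function name check_name and the values it compares against are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# check_name is a hypothetical helper used only for illustration.
check_name() {
    local name=$1
    # Quoting "$name" keeps the test valid even when the variable is empty.
    if [ ! -n "$name" ]; then
        echo "empty"
    elif [ "$name" = "adenine" -o "$name" = "guanine" ]; then
        echo "purine"
    else
        echo "other"
    fi
}

check_name ""
check_name "guanine"
check_name "thymine"
```

In bash, the same combined test is often written as [[ "$name" = "adenine" || "$name" = "guanine" ]], which is one common use of the double square bracket syntax.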

Slide49

Flow Control: "For" loop

"For" loops allow users to repeat a set of statements a pre-set number of times.

STAGE=$(seq 1 10)
for i in ${STAGE}; do
    echo "Stage $i"
done

The "in" syntax also accepts lists produced by other commands

for file in `ls`; do
    echo $file
done

for line in `cat gene_info.txt`; do # iterates over whitespace-separated words, not whole lines
    echo $line
done

Slide50

Iterating over arrays with "while"

Example:

nucleotides=("adenine" "cytosine" "guanine" "thymine" "uracil")

i=0
while [ $i -lt ${#nucleotides[@]} ]; do
    printf "Nucleotide is: %s\n" ${nucleotides[i]}
    i=$(($i+1))
done

Output:

Nucleotide is: adenine
Nucleotide is: cytosine
Nucleotide is: guanine
Nucleotide is: thymine
Nucleotide is: uracil

Slide51

Switch-case

Case statements allow for branching to be performed on code blocks based on different values a variable takes

Like an if-then-else statement, except instead of condition, the syntax checks for values of variable

x=1
case $x in
"1")
    echo 1 ;;
"2")
    echo 2 ;;
*)
    echo "none" ;;
esac

Slide52

Shell script File access

What is file access?

A set of Shell script commands/syntax to work with data files

Why do we need it?

Makes reading data from files easy; we can also create new data files

What different types are there?

Read, write, append

Slide53

File I/O

Low level file I/O is usually not performed in Linux

Abundance of file manipulation tools/commands

If needed though, ASCII/text files can be read line by line using shell script easily.

file="sequence.fa"
while read line; do
    # display $line or do something with $line
    echo "$line"
done < "$file"

Slide54

File read and write example

file="mailing_list"
while IFS=":" read -r -a fields; do # split each line on ":" into the array "fields"
    printf "%s %s\n" "${fields[1]}" "${fields[0]}"
    printf "%s, %s\n" "${fields[3]}" "${fields[4]}"
    printf "%s, %s %s\n" "${fields[5]}" \
        "${fields[6]}" "${fields[7]}"
done < "$file"

Input file (fields are Last name:First name:Age:Address:Apartment:City:State:ZIP):

Smith:Al:18:123 Apple St.:Apt. #1:Cambridge:MA:02139

Output:

Al Smith
123 Apple St., Apt. #1
Cambridge, MA 02139

Slide55

Functions

What is a function?

group related statements into a single task

segment code into logical blocks

avoid code and variable based collision

can be "called" by segments of other code

Functions return an exit status

Explicitly with the return command

Implicitly as the status of the last executed statement

Data (a scalar or a flat list) is passed back by echoing it

Slide56

Functions

A function can be written in any Shell script program; it is identified by the "function" keyword

Writing a function

function echostars {
    echo "***********************"
}

function exitIfError {
    if [[ $1 -ne 0 ]]; then
        echo "ERROR! - return code $1"
        exit 1
    fi
}

echostars; exitIfError $?

Slide57

Functions with Inputs and Outputs

The "echo" statement can be used to return some output from the function

function fib2 {
    result=()
    a=0; b=1
    while [ $b -lt $1 ]; do
        result=(${result[@]} $b)
        next=$(($a+$b)) # compute the next term before overwriting a
        a=$b
        b=$next
    done
    echo ${result[@]}
}

The function can then be called

source fib2.sh; fib2 100

Slide58

Providing input to programs

It is sometimes convenient not to have to edit a program to change certain data variables

Shell script allows you to read data from the shell directly into program variables with the "read" command

Examples:

echo -n "Enter your name: "
read name
Shailender
echo $name
Shailender

Slide59

Command Line Arguments

Command line arguments are optional data values that can be passed as input to the Shell script program as the program is run

After the name of the program, place string or numeric values with spaces separating them

Access them with the "$@" variable inside the program, or individually as $1, $2, $3 …

Avoid entering or replacing data by editing the program

Examples:

bash arguments.sh arg1 arg2 10 20

Slide60

Creating a bash "Shell script"

The power of Linux can be captured in a script, where commands can be placed sequentially to be executed from top to bottom, left to right

The text file containing these commands is called a "shell script"

Scripts are useful because a compilation of commands executes a task in an automated and precise manner, repeatedly

Slide61

Shell scripting strategies

Use "exit" codes

Shell scripts can be terminated abruptly with the "exit" command. It is desirable to terminate if errors occur, rather than continuing to run

Example

cd /home/sn34w/project1 # Change to "project1"
rm -rf * # Delete everything there

What if "project1" did not exist and there was an error?

Your entire current directory would get deleted!

Use of exit codes avoids this problem

Slide62

Shell scripting tips and tricks (…contd)

The "$?" special variable stores the exit code of the last Linux command: it has a value of 0 if the command was successful, otherwise 1 or more (see the list of error codes)

cd /home/sn34w/project1
status=$? # save it: every later command, even "echo", overwrites $?
echo $status
if [[ $status -eq 0 ]]; then
    rm -rf *
fi
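A more compact way to guard a destructive command is to stop as soon as "cd" fails, using "||". The sketch below is one possible pattern, not the presenters' own; the function name clean_dir is hypothetical, and the demonstration runs in a throw-away directory made with mktemp rather than in a real project directory.

```bash
#!/usr/bin/env bash
# clean_dir is a hypothetical helper: it empties a directory only if "cd" succeeded.
clean_dir() {
    local target=$1
    # The subshell ( ... ) keeps the directory change from leaking into the caller.
    (
        cd "$target" || exit 1      # stop here if the directory does not exist
        rm -rf -- ./*
    )
}

# Demonstration in a throw-away directory created with mktemp.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

clean_dir "$workdir" && echo "Cleaned $workdir"
clean_dir "$workdir/does_not_exist" 2>/dev/null || echo "Skipped: directory missing"

rmdir "$workdir"
```

Because the "exit 1" happens inside the subshell, only the helper stops; the calling script sees a non-zero exit code and can decide what to do next.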

Slide63

Useful Shell scripting tips

Pipes (|) send the output of one command to another as Standard input so that powerful constructs for operating on data become possible

Order of execution is from left to right

cat sequence.fa | grep "ACTTTA" | wc -l

A Linux command can be split across multiple lines by using the "\" character at the end of the line

cat sequence.fa | grep \
"ACTTTA" | wc -l

Slide64

Useful Shell scripting tips (…contd)

Shell expansion with wild cards

Input and Output redirection with "<", ">", and ">>"

Tab completion

Combining options/flags

Using flag names with "--"

Slide65

Useful Shell scripting tips (…contd)

Copying and pasting clipboard with left and right mouse clicks

Using multiple shells at the same time

Using semi-colon to run commands on same line

Evaluating Linux commands with backticks

Conditional execution of commands with && and ||

Slide66

Shell scripts in our home directory

Users of the bash shell have scripts in their home directory that control shell behaviors

.bashrc, executed with new interactive terminal session

.bash_profile, executed with new login session

.bash_history, contains history of commands: saves commands on exit and loads them upon start of session

.bash_logout, contains things to do upon logout

To look at any of these, say .bashrc, do:

ls -a ~ # Display hidden files in home dir
vi ~/.bashrc # Open .bashrc file in home dir

Slide67

Shell script example: Downloading the human genome

The hg19 build of the human genome can be downloaded from the UCSC website, but before it is usable, it has to be unzipped, "cleaned up", etc.

vi make_hg19.sh

Slide68

Using Shell script programs on the cluster

Shell scripts can easily be submitted as jobs to be run on the MGHPCC infrastructure

Basic understanding of Linux commands is required, and an account on the cluster

Lots of useful information, including account registration, at www.umassrc.org

Feel free to reach out to Research Computing for help: hpcc-support@umassmed.edu

Slide69

What is a computing "Job"?

A computing "job" is an instruction to the HPC system to execute a command or script

Simple Linux commands or shell/R scripts that can be executed within milliseconds would probably not qualify to be submitted as a "job"

Any command that is expected to take up a big portion of CPU or memory for more than a few seconds on a node would qualify to be submitted as a "job". Why? (Hint: multi-user environment)

Slide70

How to submit a "job"

The basic syntax is:

bsub <valid linux command>

bsub: LSF command for submitting a job

Let's say a user wants to execute a Shell script. On a Linux PC, the command is

bash countDNA.sh

To submit a job to do the work, do

bsub bash countDNA.sh

Slide71

Specifying more "job" options

Jobs can be marked with options for better job tracking and resource management

Job should be submitted with parameters such as queue name, estimated runtime, job name, memory required, output and error files, etc.

These can be passed on in the bsub command

bsub -q short -W 1:00 -R rusage[mem=2048] -J "Myjob" -o hpc.out -e hpc.err bash countDNA.sh

Slide72

Job submission "options"

Option flag or name   Description
-q                    Name of queue to use. On our systems, possible values are "short" (<=4 hrs execution time), "long" and "interactive"
-W                    Allocation of node time. Specify hours and minutes as HH:MM
-J                    Job name. Eg. "Myjob"
-o                    Output file. Eg. "hpc.out"
-e                    Error file. Eg. "hpc.err"
-R                    Resources requested from assigned node. Eg. "-R rusage[mem=1024]", "-R span[hosts=1]"
-n                    Number of cores to use on assigned node. Eg. "-n 8"

Slide73

Why use the correct queue?

Match requirements to resources

Jobs dispatch quicker

Better for entire cluster

Help GHPCC staff determine when new resources are needed

Slide74

Questions?

How can we help further?

Please check out books we recommend as well as web references (next 2 slides)

Slide75

Shell script Books

Shell script books which may be helpful

Linux Command Line and Shell Scripting Bible, 3rd Edition: http://shop.oreilly.com/product/9781118983843.do

Linux Command Line and Shell Scripting Bible, 2nd Edition: http://shop.oreilly.com/product/9781118004425.do

Linux Shell Scripting Cookbook, 2nd Edition: http://shop.oreilly.com/product/9781782162742.do

Beginning Shell Scripting: http://shop.oreilly.com/product/9780764583209.do

Slide76

Shell script References

http://en.wikipedia.org/wiki/Shell_script

http://linuxcommand.org/writing_shell_scripts.php

http://www.freeos.com/guides/lsst/

http://www.steve-parker.org/sh/sh.shtml

http://linuxconfig.org/bash-scripting-tutorial