Basic concepts of numerical methods: Number representations


liane-varnes
Uploaded On 2015-01-20


Presentation Transcript

Basic concepts of numerical methods

Number representations

The smallest addressable unit is usually an 8-bit byte (except in some word-based machines, like Cray).

Integer: usually 2 bytes, one bit reserved for the sign. The absolute value of an integer is then at most $2^{15} - 1 = 32\,767$. Often also double integers, 4 bytes. Largest value $2^{31} - 1 = 2\,147\,483\,647$.

Real number: usually 4 bytes.
- sign
- coefficient (mantissa), normalized in the range 0.1 to 1
- exponent: sign + absolute value or 2's complement

E.g. on a PC the mantissa of a single precision real number consists of 23 bits. The bit following the binary point is always 1, and need not be stored. Precision 24 bits or $\log_{10} 2^{24} \approx 7$ decimals. Exponent part 8 bits, $e = 127$ (bias) + true exponent; $1 \le e \le 254$. Range of values $2^{-126} \approx 10^{-38}$ to $2^{128} \approx 3 \times 10^{38}$.

Double precision: about twice as many significant digits and a wider range of values. On a PC the precision is about 15 decimals, and the range about $10^{-308} \ldots 10^{308}$. (If this range is not sufficient, you should probably rethink your algorithm...)

Finite precision

Precision and floating point (real number) representations are described by machine constants. "Machine epsilon" is the smallest number that, added to one, produces a sum bigger than one:

$\epsilon = \min\{x \mid 1 + x > 1\}.$

On a PC the machine epsilon of double precision real numbers is $\epsilon = 2.2 \times 10^{-16}$. NB! This is very much bigger than the smallest representable positive value.

PCs and some other systems use the IEEE extended arithmetic that includes two special values, Inf (infinity) and NaN (Not a Number).

Division by zero gives Inf, which can be, with caution, used in further calculations. However, it indicates that the algorithm is not quite decent, and in some systems the program may crash. You should find the source of the potential problem.

Operations like $0/0$, $0 \cdot \mathrm{Inf}$ and $\mathrm{Inf} - \mathrm{Inf}$ are indeterminate, and the value is NaN. All further operations with such a value will also give NaN. That certainly means that your algorithm is not working properly.
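The machine constants and special values described above can be checked directly. The following is a minimal Python sketch, assuming IEEE double precision floats (Python's `float`); the helper name `machine_epsilon` is made up for illustration. It searches only over powers of two, so what it finds is the gap between 1 and the next larger double, which is the value usually quoted as machine epsilon. Plain Python raises `ZeroDivisionError` for `1.0 / 0.0` instead of returning Inf, so the special values are taken from the `math` module.

```python
import math
import sys


def machine_epsilon():
    """Halve eps until adding half of it to 1.0 no longer changes the sum."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps


if __name__ == "__main__":
    eps = machine_epsilon()
    print("machine epsilon:", eps)  # about 2.2e-16 for IEEE doubles
    print("same as sys.float_info.epsilon:", eps == sys.float_info.epsilon)
    # The smallest positive normalized double is far smaller than epsilon.
    print("smallest normalized value:", sys.float_info.min)

    inf = math.inf
    print("inf + 1   =", inf + 1.0)          # Inf survives further arithmetic
    print("0 * inf   =", 0.0 * inf)          # indeterminate, gives nan
    print("inf - inf =", inf - inf)          # indeterminate, gives nan
    print("nan + 1   =", (inf - inf) + 1.0)  # NaN propagates
```

The loop compares `1.0 + eps / 2.0` with `1.0` rather than testing `eps` itself, so it stops at the last power of two that still changes the sum.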
Errors

If the exact value is $a$, the absolute error of its approximate value $\tilde{a}$ is

$\Delta a = \tilde{a} - a.$

The relative error is

$e = \frac{\Delta a}{a} = \frac{\tilde{a} - a}{a}.$

Often $a$ is not known, but $\Delta a$ can be estimated, e.g. from its statistical properties. An estimate of the relative error is then

$e \approx \frac{\Delta a}{\tilde{a}}.$

Rounding: $1.4901 \to 1.5$, $1.5512 \to 1.6$, 1.5002 2.5002

Truncation: the least significant part of the number is discarded. This is the way real numbers are converted to integers in Fortran.

Error propagation in arithmetic operations

Addition

In addition the errors are added, too. If the signs of the errors are random, the errors partly cancel each other.

$1.57 + 0.76 = 2.33.$

Calculate the sum by rounding the numbers to one decimal place:

$1.6 + 0.8 = 2.4$

The relative errors of the terms are

$\frac{1.6 - 1.57}{1.57} = 0.019; \quad \frac{0.8 - 0.76}{0.76} = 0.053.$

The relative error of the sum is

$\frac{2.4 - 2.33}{2.33} = 0.030.$

When all terms are positive, the relative error of the sum can never exceed the largest relative error of the terms.

Change the example a little:

$1.57 + 0.74 = 2.31.$
$1.6 + 0.7 = 2.3$

Now the relative errors of the terms are

$\frac{1.6 - 1.57}{1.57} = 0.019; \quad \frac{0.7 - 0.74}{0.74} = -0.054.$

The relative error of the sum is

$\frac{2.3 - 2.31}{2.31} = -0.004.$

Addition is a smoothing operation, if the signs of the errors of the individual terms are random.

Addition can cause problems if the magnitudes of the terms are very different. If the precision is 7 decimals, $1.0 + 3 \times 10^{-8} = 1.0$. When evaluating a long series it may be useful to calculate first the sum of the smallest terms, or to clump the terms into groups in such a way that the sum in each group is of the same magnitude.

Subtraction

In mathematics, subtraction is not essentially different from addition. When calculating with finite accuracy the difference is crucial.

Two approximations of $\pi$: $\pi_1 = 3.160494$ and $\pi_2 = 3.142857$. Express the values using three decimals and subtract:

                   exact value   approximate   abs. error   rel. error
$\pi_1$            3.160494      3.160         -0.00049     -0.00016
$\pi_2$            3.142857      3.143          0.00014      0.00005
$\pi_1 - \pi_2$    0.017637      0.017         -0.00064     -0.03612

The relative error of the difference is much bigger than the original errors, because the most significant digits of the mantissas partly cancel each other, leaving a smaller number of significant digits (catastrophic cancellation).

Beware subtraction of nearly equal values!
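Both effects, a small term being swallowed by a large one and the loss of significant digits in a difference, are easy to reproduce. The sketch below is only an illustration: it assumes that Python's `decimal` module with a 7-digit context is an acceptable stand-in for "precision 7 decimals", and the names `SEVEN_DIGITS`, `rounded_sum` and `relative_error` are invented for this example.

```python
from decimal import Context, Decimal

# Seven significant decimal digits, standing in for a 7-decimal machine.
SEVEN_DIGITS = Context(prec=7)


def rounded_sum(terms, ctx=SEVEN_DIGITS):
    """Add the terms left to right, rounding after every addition."""
    total = Decimal(0)
    for term in terms:
        total = ctx.add(total, term)
    return total


def relative_error(approx, exact):
    """Relative error (approx - exact) / exact."""
    return (approx - exact) / exact


if __name__ == "__main__":
    big = [Decimal("1.0")]
    small = [Decimal("3e-8")] * 100  # exact sum of these is 3e-6

    # Each small term is lost when it is added to the running total 1.0.
    print("large term first: ", rounded_sum(big + small))
    # Summing the small terms first lets them accumulate before meeting 1.0.
    print("small terms first:", rounded_sum(small + big))

    # Cancellation: the two approximations of pi rounded to three decimals.
    pi1, pi2 = 3.160494, 3.142857
    print("rel. error of pi1:       ", relative_error(3.160, pi1))
    print("rel. error of pi2:       ", relative_error(3.143, pi2))
    print("rel. error of difference:", relative_error(3.160 - 3.143, pi1 - pi2))
```

The second part uses ordinary double precision floats, since the three-decimal rounding is already written into the literals; the relative error of the difference comes out a couple of hundred times larger than that of either operand.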
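For example, when $x$ is small, the following expression may give problems:

$\sqrt{1 + x} - 1.$

If $x \ll 1$, the square root can be replaced with the first terms of its Taylor expansion:

$\sqrt{1 + x} - 1 \approx 1 + \frac{1}{2} x - 1 = \frac{1}{2} x.$

The expression can also be converted to another form:

$\sqrt{1 + x} - 1 = \frac{x}{1 + \sqrt{1 + x}}.$

Example: find the integral

$I_n = \int_0^1 \frac{x^n}{x + 10}\, dx.$

$I_{n+1} + 10 I_n = \int_0^1 \left( \frac{x^{n+1}}{x + 10} + \frac{10 x^n}{x + 10} \right) dx = \int_0^1 \frac{x^n (x + 10)}{x + 10}\, dx = \int_0^1 x^n\, dx = \frac{1}{n + 1},$

which gives a recurrence relation

$I_{n+1} = \frac{1}{n + 1} - 10 I_n.$

$I_0 = \int_0^1 \frac{dx}{x + 10} = \Big[ \ln(x + 10) \Big]_0^1 = \ln 11 - \ln 10 \approx 0.0953.$

$I_1 = \frac{1}{1} - 10 \times 0.0953 \approx 0.0470;$
$I_2 = \frac{1}{2} - 10 \times 0.0470 \approx 0.0300;$
$I_3 = \frac{1}{3} - 10 \times 0.0300 \approx 0.0333;$
$I_4 = \frac{1}{4} - 10 \times 0.0333 \approx -0.0833.$ ????

The denominator has almost a constant value 10; hence

$I_n \approx \frac{1}{10} \int_0^1 x^n\, dx = \frac{1}{10} \cdot \frac{1}{n + 1}.$

Thus the problem is the subtraction of nearly equal quantities.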
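The two examples above can be tried out numerically. The following Python sketch is an illustration under stated assumptions, not part of the original slides: the function names are invented, the index is written as `n`, and `digits=4` merely imitates a hand calculation with four decimals.

```python
import math


def sqrt_minus_one_naive(x):
    """sqrt(1 + x) - 1 evaluated as written; cancels badly for small x."""
    return math.sqrt(1.0 + x) - 1.0


def sqrt_minus_one_stable(x):
    """The same quantity as x / (1 + sqrt(1 + x)), with no subtraction."""
    return x / (1.0 + math.sqrt(1.0 + x))


def forward_recurrence(n_max, digits=None):
    """Return [I_0, ..., I_n_max] from I_{n+1} = 1/(n+1) - 10 * I_n.

    If digits is given, every value is rounded to that many decimals.
    """
    def keep(value):
        return value if digits is None else round(value, digits)

    values = [keep(math.log(11.0 / 10.0))]  # I_0 = ln 11 - ln 10
    for n in range(n_max):
        values.append(keep(1.0 / (n + 1) - 10.0 * values[-1]))
    return values


if __name__ == "__main__":
    x = 1e-10
    print("naive: ", sqrt_minus_one_naive(x))
    print("stable:", sqrt_minus_one_stable(x))
    print("x / 2: ", x / 2.0)

    # Four-decimal arithmetic already gives a negative "integral" at n = 4.
    print("4 decimals:", forward_recurrence(4, digits=4))

    # Full double precision only postpones the breakdown.
    n = 25
    print("I_25 by recurrence:", forward_recurrence(n)[n])
    print("estimate 1/(10(n+1)):", 1.0 / (10.0 * (n + 1)))
```

Each step of the recurrence multiplies the error already present in $I_n$ by $-10$, so even the rounding error of $I_0$ in double precision grows by a factor of about $10^{25}$ after 25 steps and swamps the true value, which lies between $1/(11(n+1))$ and $1/(10(n+1))$.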