Slide1
Master Project
Abdullah Sheneamer
MSCS Graduate Candidate, Fall 2012
DCSPM: Develop and Compile Subset of PASCAL Language to MSIL
Abdullah Sheneamer, Master project presentation, 11/xx/2012
Slide2
Outline
- Introduction to MSIL
- Related Works
- Why PASCAL to MSIL
- PASCAL Compiler
- Lexical Analyzer Design
- Symbol Table Design
- Parser and MSIL Design
- Improvements
- Evaluations
- Lessons Learned
- Future Work
- Conclusion
Slide3
Introduction to MSIL
Microsoft Intermediate Language (MSIL) is the lowest-level human-readable programming language defined by the Common Language Infrastructure (CLI) specification and used by the .NET Framework. MSIL includes instructions for loading, storing, initializing, and calling methods on objects, as well as instructions for arithmetic and logical operations.
Slide4
Related Works
“The Design and Implementation of C-like Language Interpreter” [XX11]
The authors design and implement a C-like language interpreter in C++ based on the idea of modularity. The lexical analyzer reads character strings from the source program, splits them into separate words, and constructs the internal representation of these words, that is, the TOKEN. The basic idea of the lexical analyzer design is: first, determine the start and end position of a word; second, once the word is separated, determine its attribute.
“Simple Calculator Compiler Using Lex and YACC” [Upad11]
The author describes how to develop a simple compiler for a procedural language using Lex (Lexical Analyzer Generator) and YACC (Yet Another Compiler-Compiler). Lex helps write programs whose control flow is directed by instances of regular expressions in the input stream.
Slide5
Why PASCAL to MSIL
- Allow PASCAL to run on the .NET platform
- Study how compilers in the .NET environment work
- PASCAL can now run on modern machines
- MSIL is platform independent
- JIT compilers can be optimized for specific machines and architectures
Slide6
PASCAL Compiler
Compilation process: takes PASCAL source code and produces Microsoft Intermediate Language (MSIL).
Execution process: MSIL must be converted to CPU-specific code, usually by a just-in-time (JIT) compiler. Native code is code that is compiled to run on a particular processor (such as an Intel x86-class processor) and its instruction set.
Slide7
Slide8
Compilation Process
[Figure: compilation process. Labels: PASCAL Source Code, Lexical Analysis, Parser & MSIL, Symbol Table, Error Handler, MSIL Code Output]
Slide9
Lexical Analyzer Design
After reading the next character from the input stream:
- State 0: identify the current token and decide the next state.
- State 1: handle identifiers and keywords.
- State 2: handle numbers.
- State 3: handle one-character or two-character tokens.
- States 4, 5: handle comments. Skip a line that starts with "//", or skip the data between "/*" and "*/".
Slide10
Lexical Analyzer Design (cont.)
State diagram, states 0 to 2:
- Begin: (1) lexbuf = ""; (2) state = 0.
- State 0 (INITIAL), white space: no action.
- State 0 (INITIAL), letter or @ or _: place it in lexbuf.
- State 1 (ID), letter or digit, or @ or _: place it in lexbuf.
- State 1 (ID), anything else: (1) return the last char into the input stream; (2) search for lexbuf in the symbol table; (3) insert it as an ID if not found, otherwise get the row number p; (4) build the token as [code = symbol[p, token], attr = p]; (5) enqueue the token and set lexbuf = "".
- State 2 (NUM), digit: place it in lexbuf.
- State 2 (NUM), anything else: (1) return the last char into the input stream; (2) build the token as [code: NUM, attr: value]; (3) enqueue the token and set lexbuf = "".
Slide11
Lexical Analyzer Design (cont.)
State diagram, states 3 to 5 (entered from state 0, INITIAL):
- State 3 (one or two char), unrelated character: (1) return the last char into the input stream; (2) build the token [code = ASCII(first char in lexbuf), attr = -1]; (3) lexbuf = ""; state = 0; (4) return the token to the parser.
- State 3 (one or two char), sequence is "//": state = 4.
- State 3 (one or two char), sequence is "/*": lexbuf = ""; state = 5.
- State 3 (one or two char), other character: (1) place it in lexbuf; (2) get the code for the two-character token in lexbuf; (3) build the token [code = obtained code, attr = -1]; (4) lexbuf = ""; state = 0; (5) return the token to the parser.
- State 4 (single line comment), new line: lexbuf = ""; state = 0.
- State 4 (single line comment), anything else: place it in lexbuf.
- State 5 (multiple line comment), sequence is "*/": lexbuf = ""; state = 0.
- State 5 (multiple line comment), anything else: place it in lexbuf.
Slide12
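The six states described on the lexical analyzer slides can be sketched as a small loop. This is an illustrative sketch only: it is written in Python rather than the project's C#, the function name `lex` and the table names are hypothetical, the token codes are the ones listed on the Symbol Table Design slide, keywords are assumed to be case-insensitive, and the identifier attribute is simplified to the lexeme itself instead of a symbol table row number. Comment text is skipped rather than buffered.

```python
from collections import deque

ID, NUM = 256, 257                                   # codes from the symbol table slide
KEYWORDS = {"begin": 300, "if": 323, "for": 302, "switch": 305, "while": 376}
TWO_CHAR = {"!=": 406, "==": 407, "<=": 408, ">=": 409}


def lex(source):
    """Return a queue of (code, attr) tokens for the given source text."""
    symbols = dict(KEYWORDS)        # lexeme -> code, stands in for the symbol table
    tokens = deque()
    lexbuf, state, i = "", 0, 0
    text = source + "\n"            # sentinel newline so the last token is flushed
    while i < len(text):
        ch = text[i]
        if state == 0:              # INITIAL: pick the next state
            if ch.isspace():
                pass
            elif ch.isalpha() or ch in "@_":
                lexbuf, state = ch, 1
            elif ch.isdigit():
                lexbuf, state = ch, 2
            else:
                lexbuf, state = ch, 3
        elif state == 1:            # identifiers and keywords
            if ch.isalnum() or ch in "@_":
                lexbuf += ch
            else:
                code = symbols.setdefault(lexbuf.lower(), ID)
                tokens.append((code, lexbuf))
                lexbuf, state = "", 0
                continue            # "return the last char": re-read ch in state 0
        elif state == 2:            # numbers
            if ch.isdigit():
                lexbuf += ch
            else:
                tokens.append((NUM, int(lexbuf)))
                lexbuf, state = "", 0
                continue
        elif state == 3:            # one- or two-character tokens, comment openers
            pair = lexbuf + ch
            if pair == "//":
                lexbuf, state = "", 4
            elif pair == "/*":
                lexbuf, state = "", 5
            elif pair in TWO_CHAR:
                tokens.append((TWO_CHAR[pair], -1))
                lexbuf, state = "", 0
            else:                   # single character: its code is its ASCII number
                tokens.append((ord(lexbuf), -1))
                lexbuf, state = "", 0
                continue
        elif state == 4:            # single-line comment: skip to end of line
            if ch == "\n":
                state = 0
        elif state == 5:            # multi-line comment: skip until "*/"
            lexbuf = (lexbuf + ch)[-2:]
            if lexbuf == "*/":
                lexbuf, state = "", 0
        i += 1
    return tokens


print(list(lex("if a1 >= 10 // note\nx /* skip */ == 2")))
```

Because only the four two-character tokens from the slide are in the table, a sequence such as `:=` comes out of this sketch as two single-character tokens.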
Symbol Table Design
- Every keyword is a token and has a unique integer code.
- The identifier token has code 256.
- The number token has code 257.
- Every special character is a token whose integer code equals its ASCII number.
- Two-character tokens have unique codes.

Token Code | Keyword
300 | Begin
323 | If
302 | For
305 | Switch
376 | While

Token Code | Two-Character Token
406 | !=
407 | ==
408 | <=
409 | >=
Slide13
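The coding scheme on the Symbol Table Design slide can be sketched as a single classification function. This is a sketch in Python rather than the project's C#; the function name `token_code` and the table names are hypothetical, and only the codes shown on the slide are included. Keywords are assumed to be case-insensitive.

```python
KEYWORD_CODES = {"begin": 300, "if": 323, "for": 302, "switch": 305, "while": 376}
TWO_CHAR_CODES = {"!=": 406, "==": 407, "<=": 408, ">=": 409}
ID_CODE, NUM_CODE = 256, 257


def token_code(lexeme):
    """Map a lexeme to its integer token code under the slide's scheme."""
    if lexeme.lower() in KEYWORD_CODES:
        return KEYWORD_CODES[lexeme.lower()]
    if lexeme in TWO_CHAR_CODES:
        return TWO_CHAR_CODES[lexeme]
    if lexeme.isdigit():
        return NUM_CODE
    # A lone special character is coded by its ASCII number.
    if len(lexeme) == 1 and not (lexeme.isalpha() or lexeme in "@_"):
        return ord(lexeme)
    return ID_CODE


for lexeme in ["Begin", "count", "42", ";", ">="]:
    print(lexeme, token_code(lexeme))
```

Starting the identifier, number, keyword, and two-character codes at 256 and above keeps them from colliding with the ASCII codes used for single special characters.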
Parser and MSIL Design
The parser uses most of the PASCAL grammar BNF [22], such as nested if/else and if logic expression statements.
Slide14
Parser and MSIL Design (Cont.)
Slide15
Parser and MSIL Design (Cont.)
Slide16
Improvements
Two improvements in the DCSPM compiler:
1- Lexical Analysis Improvement: Array List and Dictionary
Slide17
Improvements (Cont.)
2- MSIL Code Output Improvement
Simple Pascal code:
    begin
    a:=0; b:=1; c:=2;
    if( a== 0) then
    begin
    a:= b+c;
    end;
    end;
    end.
First MSIL listing:
    IL_0000: ldc.i4.0
    IL_0001: stloc.0
    IL_0002: ldc.i4.1
    IL_0003: stloc.1
    IL_0004: ldc.i4.2
    IL_0005: stloc.2
    IL_0006: ldloc.0
    IL_0007: ldc.i4.1
    IL_0008: ceq
    IL_000a: stloc.3
    IL_000b: ldloc.3
    IL_000c: brfalse.s IL_0012
    IL_000e: ldloc.1
    IL_000f: ldloc.2
    IL_0010: add
    IL_0011: stloc.0
    IL_0012: ret
Second MSIL listing:
    IL_0000: ldc.i4.0
    IL_0001: stloc.0
    IL_0002: ldc.i4.1
    IL_0003: stloc.1
    IL_0004: ldc.i4.2
    IL_0005: stloc.2
    IL_0006: ldloc.0
    IL_0007: ldc.i4.1
    IL_0008: ceq
    IL_000a: ldc.i4.0
    IL_000b: ceq
    IL_000d: stloc.3
    IL_000e: ldloc.3
    IL_000f: brtrue.s IL_0015
    IL_0011: ldloc.1
    IL_0012: ldloc.2
    IL_0013: add
    IL_0014: stloc.0
    IL_0015: ret
Slide18
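To see that the two MSIL listings on the previous slide compute the same values for a, b and c, they can be run on a tiny stack machine. The sketch below is in Python and is not part of the project; `parse` and `run` are hypothetical helpers that cover only the opcodes appearing in the listings. The listings are executed exactly as printed, so the comparison is against the constant 1 loaded at IL_0007, even though the Pascal source on the slide tests `a == 0`.

```python
FIRST = """
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldc.i4.1
IL_0003: stloc.1
IL_0004: ldc.i4.2
IL_0005: stloc.2
IL_0006: ldloc.0
IL_0007: ldc.i4.1
IL_0008: ceq
IL_000a: stloc.3
IL_000b: ldloc.3
IL_000c: brfalse.s IL_0012
IL_000e: ldloc.1
IL_000f: ldloc.2
IL_0010: add
IL_0011: stloc.0
IL_0012: ret
"""

SECOND = """
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldc.i4.1
IL_0003: stloc.1
IL_0004: ldc.i4.2
IL_0005: stloc.2
IL_0006: ldloc.0
IL_0007: ldc.i4.1
IL_0008: ceq
IL_000a: ldc.i4.0
IL_000b: ceq
IL_000d: stloc.3
IL_000e: ldloc.3
IL_000f: brtrue.s IL_0015
IL_0011: ldloc.1
IL_0012: ldloc.2
IL_0013: add
IL_0014: stloc.0
IL_0015: ret
"""


def parse(listing):
    """Turn 'IL_xxxx: opcode [target]' lines into (label, opcode, operand) tuples."""
    program = []
    for line in listing.strip().splitlines():
        label, rest = line.split(":", 1)
        parts = rest.split()
        op = parts[0]
        if op.startswith("br"):                 # branch: operand is the target label
            program.append((label.strip(), op, parts[1]))
        elif op[-1].isdigit():                  # short forms such as ldc.i4.0, stloc.3
            name, number = op.rsplit(".", 1)
            program.append((label.strip(), name, int(number)))
        else:                                   # ceq, add, ret
            program.append((label.strip(), op, None))
    return program


def run(program, n_locals=4):
    """Execute the program on an evaluation stack and return the local variables."""
    index = {label: i for i, (label, _, _) in enumerate(program)}
    stack, local, pc = [], [0] * n_locals, 0
    while True:
        _, op, arg = program[pc]
        pc += 1
        if op == "ldc.i4":
            stack.append(arg)
        elif op == "stloc":
            local[arg] = stack.pop()
        elif op == "ldloc":
            stack.append(local[arg])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "ceq":
            right, left = stack.pop(), stack.pop()
            stack.append(1 if left == right else 0)
        elif op == "brfalse.s":
            if stack.pop() == 0:
                pc = index[arg]
        elif op == "brtrue.s":
            if stack.pop() != 0:
                pc = index[arg]
        elif op == "ret":
            return local
        else:
            raise ValueError("unsupported opcode: " + op)


print(run(parse(FIRST))[:3], run(parse(SECOND))[:3])
```

Locals 0 to 2 hold a, b and c. Local 3 is the temporary for the condition, and it differs between the listings because the second one stores the negated comparison before branching with `brtrue.s`.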
Evaluations
1- Array list data structure vs. Dictionary data structure
Slide19
Evaluations (cont.)
Complexity of Array list vs. Dictionary

Collection | Ordering | Contiguous Storage? | Direct Access? | Lookup Efficiency | Manipulate Efficiency | Notes
Dictionary | Unordered | Yes | Via Key | Key: O(1) | O(1) | Best for high performance lookups.
ArrayList | User has precise control over element ordering | Yes | Via Index | O(n) | O(n) | Best for smaller lists
Slide20
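The O(n) versus O(1) lookup costs in the table above can be made concrete by counting the work each kind of symbol table does. The sketch below uses a Python list and dict as stand-ins for the .NET ArrayList and Dictionary; the class and method names are hypothetical and the workload (1000 distinct names, each looked up twice) is made up for illustration.

```python
class ListSymbolTable:
    """ArrayList-style table: rows kept in insertion order, linear search."""

    def __init__(self):
        self.rows = []
        self.comparisons = 0

    def lookup_or_insert(self, name):
        for p, row in enumerate(self.rows):
            self.comparisons += 1           # one string comparison per row visited
            if row == name:
                return p
        self.rows.append(name)
        return len(self.rows) - 1


class DictSymbolTable:
    """Dictionary-style table: hash lookup from name to row number."""

    def __init__(self):
        self.row_of = {}
        self.lookups = 0

    def lookup_or_insert(self, name):
        self.lookups += 1                   # one hash lookup per call
        return self.row_of.setdefault(name, len(self.row_of))


names = ["v%d" % i for i in range(1000)] * 2    # each name is seen twice
list_table, dict_table = ListSymbolTable(), DictSymbolTable()
list_rows = [list_table.lookup_or_insert(n) for n in names]
dict_rows = [dict_table.lookup_or_insert(n) for n in names]

print(list_rows == dict_rows)               # both tables assign the same row numbers
print(list_table.comparisons, dict_table.lookups)
```

Both tables return the same row number p for every name. The list version's comparison count grows with the square of the number of distinct names, while the dictionary version does one lookup per call.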
Evaluations (cont.)
2- Parser phase test
Slide21
Evaluations (cont.)
3- Initial and Improved nested If/else MSIL Code
Slide22
Evaluations (cont.)
Size of Initial and Improved nested if/else MSIL Code
Slide23
Lessons Learned
ildasm.exe: tool that converts IL to human-readable code
    C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin
ilasm.exe: tool that converts human-readable code to IL
    C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322
    or C:\Windows\Microsoft.NET\Framework\v2.0.50727
DateTime and TimeSpan
    DateTime Start = DateTime.Now;
    lex();
    TimeSpan Elapsed = DateTime.Now - Start;
    speed = "Time Elapsed of Lexical Analysis: " + Elapsed.TotalMilliseconds + "ms";
Stopwatch class
    System.Diagnostics.Stopwatch stopwatch = new System.Diagnostics.Stopwatch();
    stopwatch.Start();
    lex();
    stopwatch.Stop();
    speed = "Time Elapsed of Lexical Analysis: " + stopwatch.Elapsed.TotalMilliseconds + "ms";
Slide24
Lessons Learned (cont.)
Nested if/else logic statement
Slide25
Future Works
Many statements and data structures of the Pascal language are yet to be supported, along with generation of the related MSIL:
1- complicated case statement
2- if logic of a complex condition with multiple levels
3- assert statement
4- exit statement
5- goto statement
6- repeat statement
7- next statement
8- complicated one-dimensional array
9- two-dimensional array data structure
10- queue data structure
11- stack data structure
Slide26
Conclusion
The DCSPM compiler helps legacy Pascal run on modern machines, and its MSIL output is platform independent. MSIL code is verified for safety at runtime and can be executed in any environment supporting the CLI (Common Language Infrastructure).
A one-dimensional array has two cases when compiling to MSIL. First, when the array has one or two elements, its MSIL looks the same as the MSIL of other statements (if/else/while, etc.).
The initial lexical analysis uses an array list data structure in the symbol table, and the improved lexical analysis uses a dictionary data structure in the symbol table. I tested the two versions with the Stopwatch class.
Slide27
Conclusion (cont.)
A batch file, timer.cmd, was used to measure the running time of the MSIL results. The improved nested if/else statement is faster than the initial nested if/else statement, although both produce the same results. The experience gained in this project can serve as a foundation for developing a new programming language.
Slide28
Demo & Questions
http://cs.uccs.edu/~gsc/pub/master/asheneam/src/COMPILER/bin/Debug/
Slide29
Bibliography
[MC5tk]: http://msdn.microsoft.com/en-us/library/c5tkafs1(v=vs.71).aspx
[XX11]: Xiaohong Xiao and You Xu, “The Design and Implementation of C-like Language Interpreter”, Proceedings of 2nd International Symposium on Intelligence Information Processing and Trusted Computing (IPTC), pp. 104-107, 2011
[Upad11]: Mohit Upadhyaya, “Simple Calculator Compiler Using Lex and YACC”, Proceedings of 3rd IEEE International Conference on Electronics Computer Technology (ICECT), Vol. 6, pp. 182-187, 8-10 April 2011
[DLNYM]: C# How to Program, by H.M. Deitel, P.J. Deitel, J. Listfield, T.R. Nieto, C. Yaeger and M. Zlatkina
[L97]: Compiler Construction: Principles and Practice, by Kenneth C. Louden
[MN11]: Data Structures Using Java, by D.S. Malik and P.S. Nair
[L06]: An Introduction to Formal Languages and Automata, Fourth Edition, Peter Linz
[ASU11]: Compilers: Principles, Techniques, and Tools (2nd Edition), Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman
[AL09]: Develop a Compiler in Java for a Compiler Design Course, Abdul Sattar and Torben Lorenzen
[Assembly11]: Guide to Assembly Language [electronic resource]: A Concise Introduction, James T. Streib. London; New York: Springer, c2011.
[WFRBE89-90]: Using a Stack Assembler Language in a Compiler Course, by Dr. Gerald Wildenberg, St. John Fisher College, Rochester, NY; Bristol Polytechnic, England (1989-1990)
Slide30
Bibliography (cont.)
[LS56]: Expert .NET 2.0 IL Assembler, Serge Lidin (Lidin, Serge, 1956-). Berkeley, CA
[CodeProject]: http://www.codeproject.com/Articles/3778/Introduction-to-IL-Assembly-Language
[MHt8e]: http://msdn.microsoft.com/en-us/library/ht8ecch6(v=vs.71)
[K08]: Pro C# 2008 and the .NET 3.5 Platform, Fourth Edition
[CodeMSIL]: http://www.codeguru.com/csharp/.net/net_general/il/article.php/c4635/MSIL-Tutorial.htm
[WikiPascal]: http://en.wikipedia.org/wiki/Pascal_(programming_language)
[PagesCs]: http://pages.cs.wisc.edu/~fischer/cs536.s08/lectures/Lecture02.4up.pdf
[MArraylist]: http://msdn.microsoft.com/en-us/library/system.collections.arraylist.aspx
[MKx37]: http://msdn.microsoft.com/en-us/library/kx37x362.aspx
[WikiExpr]: http://en.wikipedia.org/wiki/Microsoft_Visual_Studio_Express#Visual_C.23_Express
[DllAssem]: http://dll-repair-tools.com/dll-files/fusiondll-the-assembly-manager
[learnExp]: http://www.learnvisualstudio.net/start-here/lesson-1-1-installing-visual-c-2010-express-edition/
[SeasPascal]: http://www.seas.gwu.edu/~hchoi/teaching/cs160d/pascal.pdf
[GeekClass]: http://geekswithblogs.net/BlackRabbitCoder/archive/2011/06/16/c.net-fundamentals-choosing-the-right-collection-class.aspx
[DotArray]: http://www.dotnetperls.com/arraylist
[Ecma]: http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-335.pdf