Syntax Analysis parsing methods questions

Whether you're a newbie or an experienced programmer, any questions, help, or just talk of any language will be welcomed here.

Moderator: Coders of Rage

Post Reply
User avatar
Mr. Achie
Chaos Rift Newbie
Chaos Rift Newbie
Posts: 31
Joined: Mon Feb 16, 2009 2:23 pm
Current Project: Basic 2d game engine
Favorite Gaming Platforms: Playstation 1
Programming Language of Choice: C++, Java
Location: The Netherlands

Syntax Analysis parsing methods questions

Post by Mr. Achie »

Hi all!

I have a question regarding the top-down and bottom-up approach being used during the Syntax Analysis of a compiler. I couldn't find a clear answer on this (even in the dragonbook).

What I don't understand:
Are these top-down/bottom-up methods being used to actually construct a parse tree? Or are these being used to check whether an input string matches a grammar, by going through an existing parse tree? And if the latter, where does the tree come from? How was it generated?

Thank you very much!
User avatar
bbguimaraes
Chaos Rift Junior
Chaos Rift Junior
Posts: 294
Joined: Wed Apr 11, 2012 4:34 pm
Programming Language of Choice: c++
Location: Brazil
Contact:

Re: Syntax Analysis parsing methods questions

Post by bbguimaraes »

Hi! Good to see a different question around here ;)

It all starts with the grammar. Say you have a simple grammar:

Code: Select all

<a> ::= "a" <a> | ""
Suppose we have the string "aa". In a top-down approach, we would have the following steps:

Code: Select all

Start with the root of the grammar:
Parsing: <a>
String: |aa$

Expand the first nonterminal:
Parsing: ["a" <a>|""]
String: |aa$

Try to match the first element. Success:
Parsing: "a" <a>
String: a|a$

Expand the first nonterminal:
Parsing: "a" ["a" <a>|""]
String: a|a$

Try to match the second element. Success:
Parsing: "a" "a" <a>
String: aa|$

Expand the first nonterminal:
Parsing: "a" "a" ["a" <a>|""]
String: aa|$

Try to match the third element. Success:
Parsing: "a" "a" ""
String: aa$|

No more nonterminal elements and string completely parsed. Success.
By contrast, in a bottom-up approach:

Code: Select all

Take the first element of the string:
Parsing:
String: a|a$

Try to find a rule that matches. Success:
Parsing: "a" <a>
String: a|a$

Take the next element of the string:
Parsing: "a" <a>
String: aa|$

Try to find a rule that matches. Success:
Parsing: "a" "a" <a>
String: aa|$

Take the next element of the string:
Parsing: "a" "a" <a>
String: aa$|

Try to find a rule that matches. Success:
Parsing: "a" "a" ""
String: aa$|

String completely parsed and no more nonterminal elements. End.
In both approaches, we end up with a parse tree that looks like this (I don't know if this is always the case, I don't think so, but I'm not certain):

Code: Select all

   <a>
  /   \
"a"   <a>
     /   \
   "a"   <a>
          |
          ""
As you can see, in a top-down approach, we start from the nonterminal elements of the grammar and build the tree by expanding the nodes down. In a bottom-up approach, we start with the terminal elements and build the tree up.

This is a really rough explanation, but since you said you read the dragon book, I felt there was no need to be 100% correct in notation and terminology. Also, this was a very simple example, with no surprises (e.g.: errors or backtracking) because it wouldn't add anything to the explanation of top-down vs. bottom-up.

Hope that helps.
User avatar
Mr. Achie
Chaos Rift Newbie
Chaos Rift Newbie
Posts: 31
Joined: Mon Feb 16, 2009 2:23 pm
Current Project: Basic 2d game engine
Favorite Gaming Platforms: Playstation 1
Programming Language of Choice: C++, Java
Location: The Netherlands

Re: Syntax Analysis parsing methods questions

Post by Mr. Achie »

Thanks for the reply! It is definitely more clear now. :)
User avatar
Mr. Achie
Chaos Rift Newbie
Chaos Rift Newbie
Posts: 31
Joined: Mon Feb 16, 2009 2:23 pm
Current Project: Basic 2d game engine
Favorite Gaming Platforms: Playstation 1
Programming Language of Choice: C++, Java
Location: The Netherlands

Re: Syntax Analysis parsing methods questions

Post by Mr. Achie »

I have one more (hopefully last :oops: ) question. It guess it is unnecessary to open a new topic though.
The question is regarding the Syntax Directed Translation.

The dragonbook explained that the front end consisted of the following phases:
Lexical analysis - Syntax Analysis - Semantic Analysis - Intermediate code generation.

According to a stanford handout:
In other words, the parsing process and parse trees are used to direct semantic analysis and the translation
of the source program. This can be a separate phase of a compiler or we can augment our conventional grammar with information to control the semantic analysis and translation. Such grammars are called attribute grammars.
Source: http://www.personal.kent.edu/~rmuhamma/ ... /phase.htm

So my question is:
I assume that the separate phase they are talking about is just the semantic analysis itself correct? But now it is pretty much happening during the parsing (syntax directed translation).
User avatar
bbguimaraes
Chaos Rift Junior
Chaos Rift Junior
Posts: 294
Joined: Wed Apr 11, 2012 4:34 pm
Programming Language of Choice: c++
Location: Brazil
Contact:

Re: Syntax Analysis parsing methods questions

Post by bbguimaraes »

Well, I'm not aware of any compiler doing this, but I think it is common on other applications of parsing. The idea is that during the syntactical analysis, if you see this:

Code: Select all

x = 1;
the output can be:
  • if the semantic analysis is tied to the syntactic analysis, produce the actual output of the compiler (either IR or machine language).
  • if the two processes are independent, produce some kind of intermediary representation, like "assignment(identifier(x), value(1))", which is later yielded to a semantic analyzer, producing the same output as above.
So the difference here is the coupling between the two operations. In the first case, syntactic and semantic analysis are directly coupled. Usually, that means the same piece of code that identifies the syntactic elements generates the semantic output. In the second case, two separate pieces of code, usually two distinct (OS) processes, handle the each phase. The output of the syntactic analyzer is piped to the semantic analyzer.

Basically, the second case separates the two processes in two layers of abstraction, with the usual advantages and disadvantages that come with this separation (which we are all acquainted with).
User avatar
Mr. Achie
Chaos Rift Newbie
Chaos Rift Newbie
Posts: 31
Joined: Mon Feb 16, 2009 2:23 pm
Current Project: Basic 2d game engine
Favorite Gaming Platforms: Playstation 1
Programming Language of Choice: C++, Java
Location: The Netherlands

Re: Syntax Analysis parsing methods questions

Post by Mr. Achie »

Thank you for all the help, bbguimaraes!
Post Reply