An Introduction to Parsing Algorithms

Opening question: how can we determine whether code written in a programming language is syntactically correct?

Leftmost Derivations

A leftmost derivation of a string rewrites only the leftmost variable at each step.
For example, aaBA => aaBa is NOT a step of a leftmost derivation, because it rewrites A while B is the leftmost variable.
If w is in L(G), then w has a leftmost derivation (Theorem 4.1.1).

Ambiguity

A grammar G is ambiguous if there is a string w in L(G) that has two distinct leftmost derivations.
The grammar S → aS | Sa | a is ambiguous because aa has two leftmost derivations:
S => aS => aa
S => Sa => aa
A language is inherently ambiguous if every grammar that generates it is ambiguous.

Example 4.1.2

The grammar S → bS | Sb | a generates the language b*ab*.
It is ambiguous because bab has two leftmost derivations:
S => bS => bSb => bab
S => Sb => bSb => bab
The language b*ab* can also be generated by these unambiguous grammars:
S → bS | aA
A → bA | λ
and
S → bS | A
A → Ab | a
There is a one-to-one correspondence between derivation trees and leftmost (or rightmost) derivations.
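The ambiguity claims above can be checked mechanically by counting leftmost derivations. The following is a minimal sketch, not part of the original slides. It assumes single-character symbols with uppercase letters as variables, and a grammar with no λ-rules and no cycles of unit rules, so that pruning by length guarantees termination. The function name count_leftmost_derivations is hypothetical.

```python
def count_leftmost_derivations(rules, form, target):
    """Count the leftmost derivations form =>* target.

    Assumption: no rule has an empty right-hand side, so a sentential
    form longer than the target can never derive it.
    """
    # Position of the leftmost variable (uppercase letter), if any.
    pos = next((k for k, c in enumerate(form) if c.isupper()), None)
    if pos is None:
        return 1 if form == target else 0
    # Prune forms that are too long or whose terminal prefix disagrees.
    if len(form) > len(target) or form[:pos] != target[:pos]:
        return 0
    return sum(
        count_leftmost_derivations(
            rules, form[:pos] + rhs + form[pos + 1:], target)
        for lhs, rhs in rules
        if lhs == form[pos]
    )


ambiguous = [("S", "bS"), ("S", "Sb"), ("S", "a")]
unambiguous = [("S", "bS"), ("S", "A"), ("A", "Ab"), ("A", "a")]

print(count_leftmost_derivations(ambiguous, "S", "bab"))    # 2
print(count_leftmost_derivations(unambiguous, "S", "bab"))  # 1
```

A count greater than 1 for some string shows that the grammar is ambiguous; a count of 1 for one string does not by itself prove that a grammar is unambiguous.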

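Graph of a Grammar

S → aS | bB | λ
B → aB | bS | bC
C → aC | λ

Nodes of the graph, level by level, obtained by applying every rule to the leftmost variable:
Level 0: S
Level 1: aS, bB, λ
Level 2: aaS, abB, a, baB, bbS, bbC
Level 3: aaaS, aabB, aa, abaB, abbS, abbC, baaB, babS, babC, bbaS, bbbB, bb, bbaC, bb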
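The levels of this graph can be generated by expanding the leftmost variable of every sentential form with each applicable rule. The sketch below is an illustration under assumptions: symbols are single characters, uppercase letters are variables, λ is written as the empty string, and the helper name next_level is hypothetical. Repeated strings are kept, as they are in the slide.

```python
# Grammar from the slide; "" stands for λ.
G = [
    ("S", "aS"), ("S", "bB"), ("S", ""),
    ("B", "aB"), ("B", "bS"), ("B", "bC"),
    ("C", "aC"), ("C", ""),
]


def next_level(rules, forms):
    """Apply every rule to the leftmost variable of each form."""
    out = []
    for form in forms:
        pos = next((k for k, c in enumerate(form) if c.isupper()), None)
        if pos is None:
            continue  # terminal strings are leaves of the graph
        for lhs, rhs in rules:
            if lhs == form[pos]:
                out.append(form[:pos] + rhs + form[pos + 1:])
    return out


level1 = next_level(G, ["S"])
level2 = next_level(G, level1)
print(level1)  # ['aS', 'bB', '']
print(level2)  # ['aaS', 'abB', 'a', 'baB', 'bbS', 'bbC']
```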

Breadth-First Top-Down Traversal

EXAMPLE

Given the grammar AE:
V = {S, A, T}
Σ = {b, +, (, )}
P: 1. S → A
   2. A → T
   3. A → A + T
   4. T → b
   5. T → (A)
parse the string (b + b).

[Figure: breadth-first search tree for the string (b + b) in AE, rooted at S, with the contents of the queue shown after each step. The first levels of the tree are S; A; T, A+T; b, (A), T+T, A+T+T.]

Breadth-First Top-Down Parsing Algorithm

input: context-free grammar G = (V, Σ, P, S)
       string p ∈ Σ*
       queue Q

1. initialize T with root S
   INSERT(S, Q)
2. repeat
   2.1. q ≔ REMOVE(Q)
   2.2. i ≔ 0
   2.3. done ≔ false
        Let q = uAv where A is the leftmost variable in q.
   2.4. repeat
        2.4.1. if there is no A rule numbered greater than i then done ≔ true
        2.4.2. if not done then
               Let A → w be the first A rule with number greater than i.
               Let j be the number of this rule.
               2.4.2.1. if uwv ∉ Σ* and the terminal prefix of uwv matches a prefix of p then
                        2.4.2.1.1. INSERT(uwv, Q)
                        2.4.2.1.2. Add node uwv to T. Set a pointer from uwv to q.
                        end if
               end if
        2.4.3. i ≔ j
        until done or p = uwv
   until EMPTY(Q) or p = uwv
3. if p = uwv then accept else reject
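The breadth-first algorithm can be sketched in Python as follows. This is an illustration under assumptions, not the original implementation: symbols are single characters, uppercase letters are variables, and the input string is written without spaces. The sketch also adds two things the pseudocode does not have, a length bound (valid only because AE has no λ-rules) and a check that skips sentential forms already seen, so that the search stops on strings that are not in the language. The names bfs_top_down and leftmost_variable are hypothetical.

```python
from collections import deque

AE = [("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)")]


def leftmost_variable(form):
    """Index of the leftmost variable in form, or None if there is none."""
    return next((k for k, c in enumerate(form) if c.isupper()), None)


def bfs_top_down(rules, start, p):
    """Return a leftmost derivation of p as a list of forms, or None."""
    parent = {start: None}      # search tree T: child form -> parent form
    queue = deque([start])
    while queue:
        q = queue.popleft()
        pos = leftmost_variable(q)  # queued forms always contain a variable
        for lhs, rhs in rules:
            if lhs != q[pos]:
                continue
            new = q[:pos] + rhs + q[pos + 1:]
            if new == p:
                # Follow the parent pointers back to the root.
                path, node = [new], q
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            npos = leftmost_variable(new)
            if (npos is not None                 # uwv is not in Σ*
                    and new[:npos] == p[:npos]   # terminal prefix matches p
                    and len(new) <= len(p)       # assumption: no λ-rules
                    and new not in parent):
                parent[new] = q
                queue.append(new)
    return None


print(bfs_top_down(AE, "S", "(b+b)"))
```

For the string (b+b) the function returns the derivation S, A, T, (A), (A+T), (T+T), (b+T), (b+b). Without the length bound, the search would not terminate on strings outside the language, because rule 3 can always produce longer forms.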

Depth-First Top-Down Traversal

AE: V = {S, A, T}, Σ = {b, +, (, )}
P: 1. S → A   2. A → T   3. A → A + T   4. T → b   5. T → (A)

p = (b + b)

Sentential form    Stack entry pushed    Result
S
A                  [S, 1]
T                  [A, 2]
b                  [T, 4]                dead end
(A)                [T, 5]
(T)                [(A), 2]
(b)                [(T), 4]              dead end
((A))              [(T), 5]              dead end
(A+T)              [(A), 3]
(T+T)              [(A+T), 2]
(b+T)              [(T+T), 4]
(b+b)              [(b+T), 4]            accept

Depth-First Top-Down Algorithm

input: context-free grammar G = (V, Σ, P, S)
       string p ∈ Σ*
       stack S

1. PUSH([S, 0], S)
2. repeat
   2.1. [q, i] ≔ POP(S)
   2.2. dead-end ≔ false
   2.3. repeat
        Let q = uAv where A is the leftmost variable in q.
        2.3.1. if u is not a prefix of p then dead-end ≔ true
        2.3.2. if there are no A rules numbered greater than i then dead-end ≔ true
        2.3.3. if not dead-end then
               Let A → w be the first A rule with number greater than i.
               Let j be the number of this rule.
               2.3.3.1. PUSH([q, j], S)
               2.3.3.2. q ≔ uwv
               2.3.3.3. i ≔ 0
               end if
        until dead-end or q ∈ Σ*
   until q = p or EMPTY(S)
3. if q = p then accept else reject
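A direct Python transcription of the depth-first algorithm might look like the sketch below. It assumes single-character symbols, uppercase variables, and an input string without spaces. The max_steps parameter is a hypothetical addition that is not in the pseudocode: with a left-recursive rule such as A → A + T the search can descend forever, so the sketch raises an error once the step limit is exceeded instead of looping.

```python
AE = [("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)")]


def dfs_top_down(rules, start, p, max_steps=10_000):
    """Return True if p is derived; raise RuntimeError after max_steps."""
    stack = [(start, 0)]   # entries [q, i]: form and last rule number tried
    steps = 0
    while stack:
        q, i = stack.pop()
        dead_end = False
        while not dead_end:
            pos = next((k for k, c in enumerate(q) if c.isupper()), None)
            if pos is None:
                break                      # q is a terminal string
            steps += 1
            if steps > max_steps:
                raise RuntimeError("step limit exceeded")
            if q[:pos] != p[:pos]:
                dead_end = True            # u is not a prefix of p
                continue
            # First rule for the leftmost variable numbered greater than i.
            j = next((n for n, (lhs, _) in enumerate(rules, 1)
                      if n > i and lhs == q[pos]), None)
            if j is None:
                dead_end = True
            else:
                stack.append((q, j))
                q = q[:pos] + rules[j - 1][1] + q[pos + 1:]
                i = 0
        if q == p:
            return True
    return False


print(dfs_top_down(AE, "S", "(b+b)"))  # True
```

Calling dfs_top_down(AE, "S", "(b)+b", max_steps=500) raises RuntimeError under these assumptions, which illustrates how left recursion defeats the depth-first strategy.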

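AE: V = {S, A, T}, Σ = {b, +, (, )}
P: 1. S → A   2. A → T   3. A → A + T   4. T → b   5. T → (A)

p = (b) + b

Sentential form    Stack entry pushed    Result
S
A                  [S, 1]
T                  [A, 2]
b                  [T, 4]                dead end
(A)                [T, 5]
(T)                [(A), 2]
(b)                [(T), 4]              dead end
((A))              [(T), 5]              dead end
(A+T)              [(A), 3]
(T+T)              [(A+T), 2]
(b+T)              [(T+T), 4]            dead end
((A)+T)            [(T+T), 5]            dead end
(A+T+T)            [(A+T), 3]
...                [(A+T+T), 2]

The search continues with (A+T+T), (A+T+T+T), and so on. Rule 3 is left-recursive and every new sentential form keeps the prefix "(", which always matches p, so the parser never backtracks far enough to apply rule 3 to the initial A. The depth-first parser does not terminate on this input.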

Bottom-Up Algorithms

Reduction: given w, find the strings w' such that w' => w. In this case w' is a reduction of w.
Pattern matching scheme: w is decomposed as w = uv, and the suffixes of u are compared with the right-hand sides of the rules.
A match is obtained when u = u1 q and there is a rule A → q; then w is reduced to u1 A v.

Reductions of (A+T)

u         v         Rule         Reduction
λ         (A+T)
(         A+T)
(A        +T)       S → A        (S+T)
(A+       T)
(A+T      )         A → A+T      (A)
(A+T      )         A → T        (A+A)
(A+T)     λ
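The pattern matching scheme can be written as a short function that tries every split w = uv and compares the suffixes of u with the right-hand sides of the rules. This is a sketch under assumptions: symbols are single characters, no rule has an empty right-hand side, and the name reductions is hypothetical.

```python
AE = [("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)")]


def reductions(rules, w):
    """List (u, v, rule, reduction) for every one-step reduction of w."""
    found = []
    for split in range(len(w) + 1):
        u, v = w[:split], w[split:]
        for lhs, rhs in rules:
            # A match: u = u1 q with a rule A -> q; w reduces to u1 A v.
            if rhs and u.endswith(rhs):
                u1 = u[:len(u) - len(rhs)]
                found.append((u, v, f"{lhs} → {rhs}", u1 + lhs + v))
    return found


for row in reductions(AE, "(A+T)"):
    print(row)
```

For (A+T) this produces the three reductions (S+T), (A+A) and (A). The rows come out in rule-number order for each split, so A → T is reported before A → A+T.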

Breadth-First Bottom-Up Algorithm

AE: V = {S, A, T}, Σ = {b, +, (, )}
P: 1. S → A   2. A → T   3. A → A + T   4. T → b   5. T → (A)

p = (b + b)

[Figure: breadth-first bottom-up search tree for p, rooted at (b+b), with the contents of the queue shown after each step. The first reductions are (T+b) and (b+T); the search ends when S is reached.]

Breadth-First Bottom-Up Parser

input: context-free grammar G = (V, Σ, P, S)
       string p ∈ Σ*
       queue Q

1. initialize T with root p
   INSERT(p, Q)
2. repeat
   q ≔ REMOVE(Q)
   2.1. for each rule A → w in P do
        2.1.1. if q = uwv with v ∈ Σ* then
               2.1.1.1. INSERT(uAv, Q)
               2.1.1.2. Add node uAv to T. Set a pointer from uAv to q.
               end if
        end for
   until q = S or EMPTY(Q)
3. if q = S then accept else reject
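The breadth-first bottom-up parser can be sketched in Python along the same lines. The assumptions are single-character symbols, uppercase variables, no λ-rules, and an input without spaces. The sketch skips forms that were already generated, which the pseudocode does not do, and the name bfs_bottom_up is hypothetical. Because v must consist of terminals only, the chain of parent pointers from S back to p spells out a rightmost derivation.

```python
from collections import deque

AE = [("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)")]


def bfs_bottom_up(rules, start, p):
    """Return a derivation of p (from start down to p), or None."""
    parent = {p: None}      # search tree T: reduced form -> form it came from
    queue = deque([p])
    while queue:
        q = queue.popleft()
        if q == start:
            path = []
            while q is not None:
                path.append(q)
                q = parent[q]
            return path
        for lhs, rhs in rules:
            k = q.find(rhs)
            while k != -1:                      # every occurrence of w in q
                v = q[k + len(rhs):]
                if not any(c.isupper() for c in v):   # v is in Σ*
                    new = q[:k] + lhs + v
                    if new not in parent:
                        parent[new] = q
                        queue.append(new)
                k = q.find(rhs, k + 1)
    return None


print(bfs_bottom_up(AE, "S", "(b+b)"))
```

For (b+b) the result is S, A, T, (A), (A+T), (A+b), (T+b), (b+b). The search always terminates for AE because reductions never lengthen the string and the unit rules do not form a cycle.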

AE: V = {S, A, T}, Σ = {b, +, (, )}
P: 1. S → A   2. A → T   3. A → A + T   4. T → b   5. T → (A)

[Figure: partial depth-first bottom-up search for AE, showing the forms (T+b), (A+b), (T+T), (A+T) and (T+A) together with the stack entries [(T, 2, +b)], [(T+b, 4, )] and [(A+b, 2, )].]

Depth-First Bottom-Up Parsing Algorithm

input: context-free grammar G = (V, Σ, P, S) with nonrecursive start symbol
       string p ∈ Σ*
       stack S

1. PUSH([λ, 0, p], S)
2. repeat
   2.1. [u, i, v] ≔ POP(S)
   2.2. dead-end ≔ false
   2.3. repeat
        Find the first j > i with rule number j that satisfies
          i) A → w with u = qw and A ≠ S, or
          ii) S → w with u = w and v = λ
        2.3.1. if there is such a j then
               2.3.1.1. PUSH([u, j, v], S)
               2.3.1.2. u ≔ qA
               2.3.1.3. i ≔ 0
               end if
        2.3.2. if there is no such j and v ≠ λ then
               2.3.2.1. shift(u, v)
               2.3.2.2. i ≔ 0
               end if
        2.3.3. if there is no such j and v = λ then dead-end ≔ true
        until (u = S) or dead-end
   until (u = S) or EMPTY(S)
3. if EMPTY(S) then reject else accept
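The depth-first bottom-up algorithm is a shift-reduce parser with backtracking, and a Python sketch of it could look as follows. It assumes single-character symbols, no λ-rules, a nonrecursive start symbol, and an input without spaces. The names dfs_bottom_up and find_rule are hypothetical, and the function returns as soon as u equals the start symbol rather than testing the stack at the end.

```python
AE = [("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)")]


def find_rule(rules, start, u, i, v):
    """First rule number j > i that can reduce a suffix of u, or None."""
    for j, (lhs, rhs) in enumerate(rules, 1):
        if j <= i:
            continue
        if lhs != start and u.endswith(rhs):       # case i)
            return j
        if lhs == start and u == rhs and v == "":  # case ii)
            return j
    return None


def dfs_bottom_up(rules, start, p):
    """Return True if p reduces to start, False otherwise."""
    stack = [("", 0, p)]    # entries [u, i, v]; "" plays the role of λ
    while stack:
        u, i, v = stack.pop()
        while True:
            j = find_rule(rules, start, u, i, v)
            if j is not None:
                stack.append((u, j, v))            # remember where to resume
                lhs, rhs = rules[j - 1]
                u = u[:len(u) - len(rhs)] + lhs    # reduce: u = qw becomes qA
                i = 0
            elif v:
                u, v = u + v[0], v[1:]             # shift one symbol
                i = 0
            else:
                break                              # dead end: backtrack
            if u == start:
                return True
    return False


print(dfs_bottom_up(AE, "S", "(b+b)"))  # True
print(dfs_bottom_up(AE, "S", "(b)+b"))  # True
```

Unlike the depth-first top-down parser, this sketch also terminates on (b)+b, because every reduction or shift makes progress on a string of bounded length.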

AE: V = {S, A, T}, Σ = {b, +, (, )}
P: 1. S → A   2. A → T   3. A → A + T   4. T → b   5. T → (A)

u         i    v         Stack entry
λ         0    (b+b)     [λ, 0, (b+b)]
(         0    b+b)
(b        0    +b)       [(b, 4, +b)]
(T        0    +b)       [(T, 2, +b)]
(A        0    +b)
(A+       0    b)
(A+b      0    )         [(A+b, 4, )]
(A+T      0    )         [(A+T, 2, )]
(A+A      0    )
(A+A)     0    λ
(A+T      2    )         [(A+T, 3, )]
(A        0    )
(A)       0    λ         [(A), 5, λ]
T         0    λ         [T, 2, λ]
A         0    λ         [A, 1, λ]
S         0    λ

Bibliographic Notes

Ambiguity: Floyd [1962], Cantor [1962], Chomsky and Schützenberger [1963].
Inherently ambiguous languages: Harrison [1978], Ginsburg and Ullian [1966].
Depth-first parsing: Denning, Dennis and Qualitz [1978].
Classic reference: Knuth, "The Art of Computer Programming, Vol. 1: Fundamental Algorithms".
