Formal Languages


Introduction to Formal Languages and Lambda Calculus


Key Concepts Introduced by Noam Chomsky

Noam Chomsky developed a theory of formal languages, which classifies languages into four categories based on their complexity and the type of automata that recognize them.

The Chomsky Hierarchy: Language Classes

  1. Regular Languages (Type 3)

    • Recognized by deterministic finite automata (DFA).
    • Defined using regular expressions.
    • Limitations:
      • Can only describe simple patterns (e.g., sequences, repetitions).
      • Lack the power to express nested or hierarchical structures.
    • Example: Identifying whether a string contains "ab" using a regular expression.
  2. Context-Free Languages (Type 2)

    • Recognized by non-deterministic pushdown automata (PDA).
    • Most programming languages fall under this category.
    • Defined using context-free grammars (CFG).
    • Recognition Process:
      • Given a string, an algorithm determines if it belongs to the language.
      • Involves parsing strategies with varying complexity.
    • Example: Arithmetic expressions with balanced parentheses. Parsing HTML or XML.
  3. Context-Sensitive Languages (Type 1)

    • Recognized by bounded Turing machines, where the tape is limited in both directions.
    • More powerful than context-free grammars but computationally expensive.
    • Example: Grammar rules where production depends on surrounding symbols. Natural language processing (NLP) requiring context-sensitive grammar.
  4. Recursively Enumerable Languages (Type 0)

    • Recognized by Turing machines with unlimited tape.
    • The most general and powerful type, covering all computable languages.
    • Example: Any problem solvable by a computer with no constraints. AI computations.

The Role of Recursion in Language Definition


Defining Grammars with Backus-Naur Form (BNF)

BNF is a formal notation used to define the syntax of programming languages through recursion.

Example: Defining Numbers Using BNF

<digit> ::= 0 | 1 | 2 | ... | 9
<number> ::= <digit> | <digit><number>

Interpretation:

Key Point:
To prevent infinite recursion, a base case is essential (e.g., defining the digit separately).

Bad Example: Incorrect BNF Definition

<number> ::= <number><digit>

Why is this incorrect?

Correcting the Issue:
To fix this, the definition should include a termination condition by allowing <number> to be a single <digit> without further recursion.


Application of Formal Grammars in Programming Languages

Example:
The syntax of Scala can be fully described using BNF notation, highlighting the flexibility of context-free grammars.

BNF in Programming Language Syntax

Key Takeaways:

Introduction to Lambda Calculus


What is Lambda Calculus?

Lambda calculus is a formal system in mathematical logic and computer science for expressing computation based on function abstraction and application. It serves as the foundation for functional programming languages and provides a simple yet powerful model of computation.

Key Characteristics:

Historical Background

Lambda calculus was introduced by Alonzo Church in the 1930s as a foundation for formalizing computation. It was initially developed as part of an attempt to understand the nature of functions and mathematical logic.

Real-World Impact: