Andy Melnikov (nponeccop) wrote,

HNC hacking guide - part 3

The code generator. This part came out more muddled, but the generator itself is just as muddled. I will improve it later.


# The Generator

The C++ Generator transforms typed terms into a C++ AST.

The main challenges are:

* a soundness theory - to refuse to generate invalid C++ programs (see the Funarg Problem and other
safety issues)
* a semantic theory - to establish a correspondence between C++ idioms and terms
(the hard parts are IO, assignments, imperative control, resource management including RAII, and
class members)
* an equational theory of terms - some terms can start generating invalid programs after
seemingly equivalent transformations.
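
To make the soundness concern concrete, here is a general C++ illustration of the funarg problem. It is not HNC output, and `make_adder` is a hypothetical name. A function value that outlives the scope of its free variables must not refer to them by reference:

```cpp
#include <functional>

// UNSAFE variant, shown only as a comment: the returned closure would refer
// to the parameter x after make_adder_bad has returned (a dangling reference).
//
//   std::function<int(int)> make_adder_bad(int x)
//   {
//       return [&x](int y) { return x + y; };
//   }

// Safe variant: the closure owns a copy of x, so it may outlive the call.
std::function<int(int)> make_adder(int x)
{
    return [x](int y) { return x + y; };
}
```

A generator that keeps closure environments on the stack has to reject (or encode differently) programs of the first kind.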

The easier but still non-trivial parts to implement are:

* an idiomatic encoding of HOFs, local functions and polymorphism
* the design of the C++ AST
* the design of the actual encoding algorithm

Fortunately, the easier parts are already implemented.


## The Encoding

Here are some ideas of how HN constructs are translated into C++ constructs:

* Scopes -> scopes or structures
* Function arguments -> function arguments
* Type variables -> template typenames
* Parametrized data types -> template structures
* Top-level functions -> free functions
* Local functions -> static or non-static member functions
* Local variables -> non-static members or local variables
* Functions with free variables -> non-static member functions
* Functions without free variables -> top-level functions or static member functions
* Polymorphic functions -> template functions
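
A minimal sketch of what this mapping could produce. It assumes a hypothetical top-level function `foo x` with two local functions: `add y = x + y`, which uses the free variable `x`, and `twice z = z * 2`, which has no free variables. All names and the exact layout are illustrative and may differ from what HNC actually emits:

```cpp
// Scope of foo -> structure (the closure environment).
struct foo_impl
{
    int x;  // variable captured by a local function -> non-static member

    // Function with a free variable (x) -> non-static member function.
    int add(int y) { return x + y; }

    // Function without free variables -> static member function.
    static int twice(int z) { return z * 2; }
};

// Top-level function -> free function.
int foo(int x)
{
    foo_impl impl = { x };                 // build the closure environment
    return foo_impl::twice(impl.add(5));   // computes (x + 5) * 2
}
```

Note how the encoding of each identifier shows up twice: once in how it is declared (member, static member, free function) and once in how it is referenced (`impl.add`, `foo_impl::twice`).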

We determine an encoding for each identifier, and then use that encoding both where the identifier
is declared and where it is used.

For definitions, the identifier type is encoded in different fields of `CppFunctionDef` (see `Bar.ag`).
The conversion of the identifier type into text is done in the pretty printer.

For expressions, each identifier is converted according to its type by an attribute grammar
during the code generation phase (see `AG\ExpressionBuilder.ag`, `AG\Qualifiers.ag`,
`CPP.Intermediate.CppQualifiers` and `CPP.BackendTools`).
In the case of expressions, the pretty printer only assembles pieces of text instead of working from a proper AST,
so the design should be improved here: more work should be moved to the pretty printer.

## The C++ AST

The C++ AST is represented using the `CPP.Intermediate.CppDefinition` data type:

```Haskell
data CppDefinition
    = CppFunctionDef
    {
        functionLevel           :: Int
    ,   functionTemplateArgs    :: [String]
    ,   functionIsStatic        :: Bool
    ,   functionContext         :: Maybe CppContext
    ,   functionReturnType      :: CppType
    ,   functionName            :: String
    ,   functionArgs            :: [CppVarDecl]
    ,   functionLocalVars       :: [CppLocalVarDef]
    ,   functionRetExpr         :: CppExpression
    }
```
`functionLevel` is the indentation level used for pretty printing.

`functionTemplateArgs` are the formal template arguments for function templates. If the list
is empty, the definition is either a non-template function or a variable.
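
As a rough illustration, the three shapes a definition can take might look as follows in the generated C++. The names `identity`, `increment` and `answer` are hypothetical, and the exact spelling HNC emits is an assumption:

```cpp
// functionTemplateArgs = ["T"], non-empty functionArgs -> function template.
template <typename T>
T identity(T x)
{
    return x;
}

// functionTemplateArgs = [], non-empty functionArgs -> plain function.
int increment(int x)
{
    return x + 1;
}

// functionTemplateArgs = [], functionArgs = [] -> variable;
// functionReturnType plays the role of the variable type here.
int answer = 42;
```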

`functionIsStatic` is only meaningful for definitions which are non-top-level functions. It tells
whether the function is static or non-static.

A 'context' is a closure environment. It is implemented using the `*_impl` structure.

`functionContext` contains the `_impl` struct. It may or may not be present.

`functionReturnType` is the variable type for variables and the return type for functions.

`functionName` is self-descriptive. It is valid for both functions and variables.

`functionArgs` contains the formal arguments of the function. If it is empty, the definition is
a variable.

`functionLocalVars` are the local variables that are represented by C++ local variables. Some HN
local variables are instead represented by members of the context structure.
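
The split between the two representations could look like the sketch below. It assumes that a local variable becomes a context member when a local function refers to it, and stays a plain C++ local otherwise. That criterion, and the names `area`, `area_impl`, `scale` and `product`, are assumptions made for illustration only:

```cpp
struct area_impl
{
    int scale;  // HN local used by a local function -> context member (contextVars)

    int scaled(int v) { return v * scale; }
};

int area(int w, int h)
{
    int product = w * h;        // HN local used only in the body -> C++ local (functionLocalVars)
    area_impl impl = { 2 };     // the local `scale = 2` lives in the context
    return impl.scaled(product);
}
```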

`functionRetExpr` is self-descriptive.

```Haskell
data CppContext
    = CppContext
    {
        contextLevel            :: Int
    ,   contextTemplateArgs     :: [String]
    ,   contextTypeName         :: String
    ,   contextVars             :: [CppLocalVarDef]
    ,   contextMethods          :: CppProgram
    ,   contextDeclareSelf      :: Bool
    ,   contextParent           :: Maybe String
    }
```

`contextLevel` is the same as `functionLevel`.

`contextTemplateArgs` are the formal (and, by coincidence, actual) template arguments for the `_impl` structure.
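
One reading of "formal and actual" is sketched below: the `_impl` struct is declared with the same type parameter names that the enclosing function template uses when it instantiates the struct. The names `pick`, `pick_impl` and `choose` are hypothetical, and this interpretation is an assumption:

```cpp
// contextTemplateArgs = ["T"] used as formal arguments of the struct.
template <typename T>
struct pick_impl
{
    T fallback;

    T choose(bool use_first, T first) { return use_first ? first : fallback; }
};

template <typename T>
T pick(bool use_first, T first, T fallback)
{
    // The same list ["T"] used as actual arguments at the instantiation site.
    pick_impl<T> impl = { fallback };
    return impl.choose(use_first, first);
}
```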

`contextTypeName` is the name of the struct (e.g. "foo_impl").

`contextVars` are the data members of the struct.

`contextMethods` is the list of methods of the struct.

`contextDeclareSelf` tells whether the `typedef foo_impl self;` type declaration is necessary in the structure. It
is necessary if any method calls a static method.
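
A small sketch of a context that needs the `self` typedef. The member names are hypothetical, and calling the static method as `self::twice` is an assumption about how the generated code spells the call:

```cpp
struct foo_impl
{
    typedef foo_impl self;  // emitted when contextDeclareSelf is true

    int x;

    static int twice(int z) { return z * 2; }

    // A method that calls a static method through the `self` alias.
    int bar(int y) { return self::twice(x + y); }
};

int foo(int x)
{
    foo_impl impl = { x };
    return impl.bar(1);     // computes (x + 1) * 2
}
```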

`contextParent` may contain the name of the parent `_impl` struct. It is not used yet, as the
"parent" functionality is the only part that is not yet complete. It should be used if a
context method calls a static method of a parent context.

Tags: fp, hn0, programming