Blog Archives

Frameworks: the Better Half of Code Generation

We define a Framework (FW) as a software structure with generic functionalities that can be adapted or extended to produce a specific application.

Frameworks can be mistaken for libraries, but the approach is totally different. These are the key distinguishing features of a FW:

  • Inversion of Control. The flow of control is dictated by the FW, not by the application using its services. This is also known as the Hollywood principle: “don’t call us, we’ll call you”.
  • Extensibility. Certain functionalities of the FW are not “closed”, as they are in libraries; on the contrary, they are designed to be extended to solve each particular problem.
  • Default Behavior. FWs have a default behavior, which defines the flow of control. As seen above, extensions allow us to adapt this default behavior to the particular problem that each application solves.
Frameworks are the base of plug-in architectures and software-ecosystem-oriented systems (Facebook, Twitter, Amazon…), but, above all, they are the perfect base architecture for generated code.

Technically, there are different ways to implement them, depending on the programming language. Where the language allows it, abstract classes, partial classes, delegates and generic types are the best features for the job. Otherwise, there are other strategies, such as interpreters or intermediate functions (a kind of visitor pattern) to resolve the extensible call.
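As an illustrative sketch of these mechanisms (a toy Python example; all the names are invented for the illustration), the framework below owns the flow of control and exposes two extension points, one abstract and one with a default behavior:

```python
from abc import ABC, abstractmethod

# Generic part of the FW: it dictates the flow of control
# (inversion of control) and exposes extension points.
class ReportFramework(ABC):
    def run(self):
        # Default behavior: the FW calls the application code,
        # not the other way around ("don't call us, we'll call you").
        rows = [self.format_row(r) for r in self.load()]
        return "\n".join(rows)

    @abstractmethod
    def load(self):
        """Extension point: each application supplies its data."""

    def format_row(self, row):
        # "Open" functionality with a default that extensions may override.
        return str(row)

# Application-specific extension: it only fills in the open parts.
class SalesReport(ReportFramework):
    def load(self):
        return [("widget", 3), ("gadget", 5)]

    def format_row(self, row):
        name, qty = row
        return f"{name}: {qty}"

print(SalesReport().run())
```

Note that `SalesReport` never drives the framework’s internals; it simply plugs its extensions into the predefined flow.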

Benefits in code generation
These are some of the features that make FWs so well suited to code generation:

  • Extensible. The work is divided into two main parts, which makes the code generator easier to define.
  • Structured. Designing the generic part on one side and the extensions on the other provides a more structured architecture.
  • Security. Handmade code and generated code live in different files, which mitigates the risk of loss or replacement.
  • Reusable. Once the FW is defined, it can be used in different projects, again making the generator easier to develop.
  • Readability. Separating the generic flow of control from the extensions makes the code easier to read.
Conclusion
When code generation is approached for the first time, it is often assumed that the system must be 100% generated. Later we realize that this is not only a complex approach, but also a design error and an unbalanced distribution of work.

To achieve a well-designed architecture and a good distribution of work, frameworks are the better half of our generated systems.

Finally, it’s important to remark that designing the generated code itself as a framework structure is also good practice, because it allows us to implement manually – by extension – those features that we don’t want to automate, and it produces systems that can be extended by a software ecosystem.

View all Academic Posts.

Language Workbench

In previous posts we addressed what DSLs are and why they are useful and necessary in software development. Once we have decided to base our development on them, we need a tool to design and use them. This tool is technically known as a Language Workbench (LW).

An LW is composed of two main modules:

  • Language design.
  • Use of the language. Programming.
Probably, in the future, it will split into two different tools, because inside and outside organizations there will be two different roles: language designers and language users (programmers).

Language design
An LW provides utilities to define the different building parts of a language:

  • Abstract syntax. The grammatical/conceptual structure of the language, also known as the meta-model.
  • Concrete syntax. The human-readable representations of those concepts, which can be textual and/or graphical. In other words, it is the definition of the visual interface for developers.
  • Static semantics. The rules and restrictions that the language must conform to (besides being syntactically correct).
  • Dynamic semantics. Mainly the translation into traditional languages, although, as we will mention later, this is where the greatest potential of this development methodology resides.
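As a toy illustration of these building parts (plain Python standing in for a real LW; every name here is invented), the sketch below defines a meta-model for a tiny “entity” language, a static-semantics check and a translation into a traditional language. The concrete syntax – a textual or graphical notation that would be parsed into these objects – is omitted:

```python
from dataclasses import dataclass

# Abstract syntax (meta-model): the concepts of a tiny "entity" language.
@dataclass
class Field:
    name: str
    type: str

@dataclass
class Entity:
    name: str
    fields: list

# Static semantics: a rule the model must conform to beyond being
# syntactically well formed.
def validate(entity):
    names = [f.name for f in entity.fields]
    if len(names) != len(set(names)):
        return [f"{entity.name}: duplicate field names"]
    return []

# Dynamic semantics: translation into a traditional language (here, Python).
def generate(entity):
    lines = [f"class {entity.name}:"]
    params = ", ".join(f.name for f in entity.fields)
    lines.append(f"    def __init__(self, {params}):")
    for f in entity.fields:
        lines.append(f"        self.{f.name} = {f.name}  # {f.type}")
    return "\n".join(lines)

customer = Entity("Customer", [Field("name", "str"), Field("age", "int")])
assert validate(customer) == []
print(generate(customer))
```
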
Use of the language
Once the building parts are defined, the tool is able to interpret them and provide a development environment (IDE). Besides editing, depending on how sophisticated the tool is, it can provide utilities such as code completion, static validation, syntax highlighting, different views and even debugging support.

This environment will also allow us to generate the code and, sometimes, it integrates the build process of the target application.

Future Potential
We could sum up by saying that this new development process is similar to the traditional one, but with the benefits of DSLs and code generation; this is the great advance argued for by researchers and supporters of the methodology.

While agreeing with this, for us the great potential, yet to be discovered, lies in the fact that a program is no longer a set of statements but a knowledge representation. Once we define the concepts and rules, semantics may be able to offer many more services than just code generation.

DSLs

Domain-Specific Languages (DSLs) are programming languages designed to define particular domains – whether technical or business domains – in a more accurate and expressive way.

They are named in opposition to General-Purpose Languages (GPLs – Java, C#, C++, etc.), providing a narrower but more accurate approach. Their goal is to cover only the domain for which they were designed, but with the most suitable grammatical structures and/or graphical abstractions.

These languages can be analysed from two different perspectives: as an evolution of code generation or as an evolution of GPLs.

From Code Generation to DSLs
There are different ways, more or less sophisticated, to generate code: macros, table-structured data, dynamic generation, parsing, CASE tools, etc., but none of them is as powerful as a language (textual or graphical) that defines, in a formal way, linguistic structures, human-readable representations and semantics.

Therefore, we can see DSLs as the most powerful way of Code Generation.

From GPLs to DSLs
GPLs are powerful because they can be used to solve all problems (they are Turing complete), but in many cases they are poorly expressive due to the big gap between the problem domain (the real world) and the solution domain (the source code). In those cases, programming and maintenance are difficult because it is not easy to understand (read and write) what the program tries to solve. For instance, compare the definition of a Web UI with its HTML code: the expressive gap is huge.

DSLs bridge those gaps.
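As a toy illustration of that bridging (a hypothetical internal DSL written in Python; the names and the output format are invented), a form is described at the problem-domain level and expanded down to the solution domain, HTML:

```python
# Problem-domain description: what the form is, not how HTML renders it.
form_spec = {
    "title": "Sign up",
    "fields": ["email", "password"],
}

def to_html(spec):
    # The generator bridges the expressive gap down to HTML.
    parts = [f"<form><h1>{spec['title']}</h1>"]
    for name in spec["fields"]:
        parts.append(f'<label>{name}<input name="{name}"></label>')
    parts.append("</form>")
    return "".join(parts)

print(to_html(form_spec))
```

Two lines of domain description expand into markup that nobody has to read or maintain by hand.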

Features and benefits of the DSLs

  • Higher level of abstraction. They define more complex concepts: more abstract and therefore more intentional and expressive.
  • Fewer degrees of freedom. Normally they are not Turing complete. They allow us to define the domain, and nothing but the domain, with the rules that govern it, which makes them very powerful (on that domain, of course).
  • Productivity. Programming with them is more efficient and streamlined.
  • Software quality. They abstract away technical complexity, reducing errors. That complexity is usually handled by the generator.
  • IDE support. Validations, type checking, code completion, etc. This is a huge advance compared with abstractions via APIs or frameworks.
  • Platform independence.
  • And all the benefits of code generation.
DSLs are common in real life; throughout history they have been created in maths, science, medicine… Now it is time to use them in software development.

10 Benefits of Code Generation

Let’s get to the point. Here is the list:

  • SW quality: in every field: performance, reliability, security…
  • Standardization: not only in source code but also in user interfaces, database structures…
  • Centralization: global policies such as error handling, exception management, data display formats, data validation, permission checks, etc. are centralized in the generator. These kinds of policies are also known as cross-cutting concerns, an issue tackled by Aspect-Oriented Programming (AOP) in traditional programming. Centralization avoids the issue.
  • Refactoring: related to the previous benefit, code refactoring is easy and safe.
  • Productivity: lower cost and lower time-to-market (or release time).
  • Analytical skills: code generation requires a deeper domain analysis before implementing the solution via the generator.
  • Design skills: it requires a good architect with a wider view.
  • Healthy growth: it prevents architecture degradation.
  • Team member integration: code generation makes it easier to induct new members into the development culture and rules.
  • Level of abstraction: programming in a more abstract way, besides being easier to understand (it is more intentional), opens the door to new features such as unit-test generation, self-documentation, data population, semantics, machine reasoning and others.
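To make the centralization point concrete, here is a toy generator (Python; all the names are invented for the example) in which the validation policy is defined once, inside the generator, and stamped onto every piece of generated code; changing the policy in one place changes all of the generated setters:

```python
def generate_setter(field):
    # Cross-cutting policy (null validation) centralized in the generator:
    # edit it here once and every generated setter changes with it.
    return (
        f"def set_{field}(record, value):\n"
        f"    if value is None:\n"
        f"        raise ValueError('{field} must not be None')\n"
        f"    record['{field}'] = value\n"
    )

# "Generate" the code and load it, as a stand-in for writing source files.
namespace = {}
for field in ("name", "email"):
    exec(generate_setter(field), namespace)

record = {}
namespace["set_name"](record, "Ada")
print(record)  # {'name': 'Ada'}
```
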
Code generation is not easy; implementing a generator requires time and effort, and even more so if it is a language workbench. But, clearly, the benefits are huge.

Best Practices in Code Generation

The most common mistake when we generate code is to see it as a black box, thinking that what matters is “what it does” and not “how it does it”. This is wrong. Here, again, quality matters.

Here are some of the features that good generated code should have:

  • Independent: manual code and generated code must live in different files or artifacts; otherwise we risk losing the former when we have to re-generate the code (and we will).
  • Immutable: it must not be changed, for two reasons: it is dangerous, as all unfamiliar code is, and for the same reason as the previous point.
  • Readable: that means meaningful variable and function names, comments, indentation, organization into folders and files, etc. The generated code should be presentable enough to be visited by developers: to learn how it works and, why not, to learn from it. We should always be proud of the code we generate.
  • Extensible: for different reasons you may need to implement some particular functionalities manually, so generated code should leave some doors open. The best way is to design the generated code with a framework approach, where manual code can extend only certain allowed functionalities, in a safe environment.
  • Structured: raising the level of abstraction requires a good knowledge of the field you are dealing with. Badly structured code can be a symptom that the domain is not completely under control. Good code generation requires a good architect.
  • Robust: generated code can fail, of course. Error handling, exception management, input validation, internal validations, etc. must always be included in the code. These kinds of safety policies can be easily implemented when generating code, and they must be one of the reasons for its quality.
  • Powerful: that said, we should see code generation as a way to write more powerful code, meaning that we can use strategies in generated code that we would never use when writing by hand (usually for maintenance reasons).
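Several of these features – independent, immutable, extensible – can be sketched together in a toy Python example (the artifacts and names are invented; the two strings stand in for two separate files):

```python
# "Generated" artifact: lives in its own file and is never edited by hand.
GENERATED = '''\
# AUTO-GENERATED - DO NOT EDIT. Re-running the generator overwrites this file.
class InvoiceBase:
    def total(self):
        amount = sum(line["qty"] * line["price"] for line in self.lines())
        return self.apply_discount(amount)  # door left open for manual code

    def apply_discount(self, amount):
        return amount  # default behavior
'''

# "Manual" artifact: a separate file that extends only the allowed hooks.
MANUAL = '''\
class Invoice(InvoiceBase):
    def lines(self):
        return [{"qty": 2, "price": 10.0}]

    def apply_discount(self, amount):
        return amount * 0.9  # hand-written rule, safe from re-generation
'''

namespace = {}
exec(GENERATED, namespace)
exec(MANUAL, namespace)
print(namespace["Invoice"]().total())  # 18.0
```
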
To sum up, best practices in code generation are a mixture of traditional best practices and a wider way of thinking.

Why Code Generation?

Code generation is not a new style or technique; it is the path programming languages have followed to deal with complexity, from binary coding to first-, second- and next-generation languages. It is what compilers have been doing for ages.

The key subject here is “dealing with complexity”: the more complex your problem is, the more abstract your thinking has to be. In other words, you need to raise the level of abstraction. And this rule applies equally to the tool used to solve the problem: the programming language.

Therefore, we can state that “raising the level of abstraction is the goal pursued by the evolution of programming languages”.

The common languages used today to solve problems (Java, C#, C++, Delphi…) are known as general-purpose languages (GPLs), and here lies the problem: they are “general purpose”, meaning they can solve “all” problems, but from a global perspective. They can solve them from a level of abstraction broad enough to reach a solution, but not as high as each particular problem requires. There is a gap between the level of abstraction we use to think about the problem and the level of abstraction we use to solve it via GPLs.

How can we bridge that gap? Obviously, with code generation.

In conclusion, to tackle a problem properly, we need a particular language that defines the solution at the level of abstraction each problem requires. To make that solution computable, we need to generate code, usually at the nearest lower level: GPLs.

Those particular languages are known as DSLs: Domain-Specific Languages, but this subject will be addressed in another post.

Finally, we can note that this approach applies not only to business problems but also to technical ones. For example, the new challenges posed by desktop-web applications require a more abstract approach that integrates all the technologies involved: HTML, CSS, JavaScript, AJAX and others.