Language Designers

Once we have a tool to design languages in an agile way, we face the most difficult task: designing them.

Designing a good language is the main part of the process, because the language will be the developers’ tool and will support the semantic intelligence of the system.

Furthermore, since the concepts of the language structure the model itself, the designer needs a deep knowledge of the domain being modeled before starting the design.

Features of a good language

  • High level of abstraction. The higher the level, the more powerful the language and the richer the semantic meaning of its concepts. A high level of abstraction also reflects a deep knowledge of the domain being modeled.
  • Simple. It should be easy to use and read. Simplicity often goes hand in hand with a high level of abstraction.
  • Different levels of complexity. While it should be simple, it should also let those who need it define details in depth.
  • Aesthetically pleasing.
  • Semantically powerful. For a language to be productive, it is enough to give its concepts a graphical representation and translations to traditional languages. If we really want it to be powerful, we must give the concepts other semantic interpretations: automatic documentation, automatic validation, inference rules, etc.
Requirements of a good designer
From the features of a good language we can derive the requirements:

  • Business oriented. Gaining the deep knowledge of the domain needed to design the language requires a strong interest in all the processes that govern it.
  • Abstraction skills. Once the domain is known, analytical skills are required to identify its purest essence at the highest possible level of abstraction.
  • Focused on semantics. The language has to be designed with the aim of finding concepts with a high capacity to represent knowledge.
  • A sense of simplicity and aesthetics.
Conclusion
This new development paradigm requires a particular profile for designing languages, one where not only the traditional analytical skills matter but also the ability to give the language usability and knowledge-representation capacity.

At first it may seem a complex task but, after all, it is part of the evolution of technology, where the profiles that contribute most are those with the greatest capacity for abstraction.
 

View all Academic Posts.
 

Interview. Kite Invest

Here is our interview on Kite Invest.


Kite Invest promotes and develops business opportunities through constant awareness in the manner in which Foreign Capital Inflow is received by some of the most prestigious financial and banking institutions.

Knowledge Representation and Software Automation

Knowledge representation is a discipline devoted to representing real-world information in a way that machines can interpret in order to solve complex problems by inference.

It has traditionally been an Artificial Intelligence subject and it has recently become very popular for its use in the field of semantics. The Semantic Web project, led by the W3C, is a clear example of it.

Although there are many approaches to representing knowledge, all of them seek to define the concepts, the relations and the rules that describe the information. The concepts allow us to classify the information, and through the relations and rules we can infer (reason) from it. Therefore, instead of having only “flat” information, we also have meta-information with which to process it.
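
As a rough illustration of concepts, relations and rules working together, here is a minimal Python sketch. The names (`concepts`, `facts`, `grandparent_rule`, `infer`) and the family data are assumptions made up for this example, and the naive loop only stands in for what a real reasoner would do.

```python
# Hypothetical toy knowledge base: every name here is illustrative.

# Concepts classify the information.
concepts = {"Alice": "Person", "Bob": "Person", "Carol": "Person"}

# Relations are stored as (subject, relation, object) triples.
facts = {
    ("Alice", "parent_of", "Bob"),
    ("Bob", "parent_of", "Carol"),
}


def grandparent_rule(known):
    """Rule: parent_of(x, y) and parent_of(y, z) implies grandparent_of(x, z)."""
    derived = set()
    for (x, r1, y) in known:
        for (y2, r2, z) in known:
            if r1 == "parent_of" and r2 == "parent_of" and y == y2:
                derived.add((x, "grandparent_of", z))
    return derived


def infer(known, rules):
    """Apply the rules repeatedly until no new fact appears (forward chaining)."""
    known = set(known)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known)
        if new <= known:
            return known
        known |= new


inferred = infer(facts, [grandparent_rule])
print(("Alice", "grandparent_of", "Carol") in inferred)  # True
```

The grandparent fact was never stored; it is obtained from the “flat” facts plus the rule, which is the meta-information mentioned above.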

From the perspective of languages
In previous posts we saw how a language is defined, in a Language Workbench, by its abstract and concrete syntax and its static and dynamic semantics. We can see that definition as the meta-information that turns a program into a knowledge representation, and therefore we can use the potential provided by this discipline.

Usually Language Workbenches are seen as code generators but, from a knowledge representation and semantics point of view, they can offer many more services. We list some of them:

  • Generate test batteries.
  • Generate data population for performance analysis.
  • Self-documenting.
  • Auto-validation.
  • Statistical analysis of the data and programs.
  • Behaviour analysis of users, customers, etc.
  • Simplify importing and exporting data. For example XBRL: eXtensible Business Reporting Language.
  • Link with standard ontologies. For example FIBO: Financial Industry Business Ontology.
  • Simplify integration with other systems.
  • Machine reasoning.
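
To make a few of these services concrete, the sketch below derives documentation, validation and boundary test cases from a single model. The `MODEL` dictionary, its field names and its constraints are hypothetical, chosen only for illustration; a real Language Workbench would work from its own meta-model.

```python
# Hypothetical model of a "Customer" concept; fields and limits are assumptions.
MODEL = {
    "name": "Customer",
    "fields": [
        {"name": "age", "type": "int", "min": 18, "max": 120, "doc": "Age in years."},
        {"name": "email", "type": "str", "doc": "Contact address."},
    ],
}


def document(model):
    """Self-documenting: render the model as plain-text documentation."""
    lines = [f"Concept {model['name']}"]
    for field in model["fields"]:
        lines.append(f"  {field['name']} ({field['type']}): {field['doc']}")
    return "\n".join(lines)


def validate(model, record):
    """Auto-validation: return the list of problems found in a record."""
    types = {"int": int, "str": str}
    problems = []
    for field in model["fields"]:
        value = record.get(field["name"])
        if not isinstance(value, types[field["type"]]):
            problems.append(f"{field['name']}: expected {field['type']}")
            continue
        if "min" in field and value < field["min"]:
            problems.append(f"{field['name']}: below {field['min']}")
        if "max" in field and value > field["max"]:
            problems.append(f"{field['name']}: above {field['max']}")
    return problems


def boundary_cases(model):
    """Test battery: one partial record per numeric boundary in the model."""
    cases = []
    for field in model["fields"]:
        for bound in ("min", "max"):
            if bound in field:
                cases.append({field["name"]: field[bound]})
    return cases
```

All three functions read the same model, so a change to a constraint is picked up by the documentation, the validator and the test cases at once.
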
Conclusion
Knowledge representation is focused mainly on data processing: structuring search-engine information, semantic analysis applied to Big Data, definition of ontologies for different businesses, and others.

Software development methodologies that give programs those capabilities are a gateway to a future with huge potential.
 

 

Language Workbench

In previous posts we addressed what DSLs are and why they are useful and necessary in software development. Once we have decided to base our development on them, we need a tool to design and use them. This tool is technically known as a Language Workbench (LW).

An LW is composed of two main modules:

  • Language design.
  • Use of the language. Programming.
In the future it will probably split into two different tools because, inside or outside organizations, there will be two different roles: language designers and language users (programmers).

Language design
An LW provides utilities to define the different building parts of a language:

  • Abstract syntax. The grammatical/conceptual structure of the language. It’s also known as the meta-model.
  • Concrete syntax. The human-readable representations of these concepts. They can be textual and/or graphical. In other words, it’s the definition of the visual interface for the developers.
  • Static semantics. The rules and restrictions that programs written in the language must conform to (besides being syntactically correct).
  • Dynamic semantics. It is mainly the translation into traditional languages though, as we will mention later, the greatest potential of this development methodology resides here.
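
The four building parts can be sketched for a toy language in a few lines of Python. The mini-language (`entity Name` followed by `field: type` lines) and the function names `parse`, `check` and `generate` are assumptions invented for this example; they do not describe how any particular Language Workbench works.

```python
from dataclasses import dataclass


# Abstract syntax (meta-model): the concepts of a hypothetical mini-language.
@dataclass
class Field:
    name: str
    type: str


@dataclass
class Entity:
    name: str
    fields: list


def parse(text):
    """Concrete syntax: 'entity Name' followed by 'field: type' lines."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    entity = Entity(lines[0].split()[1], [])
    for line in lines[1:]:
        name, type_ = (part.strip() for part in line.split(":"))
        entity.fields.append(Field(name, type_))
    return entity


def check(entity):
    """Static semantics: rules beyond syntactic correctness."""
    errors = []
    seen = set()
    for field in entity.fields:
        if field.name in seen:
            errors.append(f"duplicate field '{field.name}'")
        seen.add(field.name)
        if field.type not in {"int", "str", "float"}:
            errors.append(f"unknown type '{field.type}'")
    return errors


def generate(entity):
    """Dynamic semantics: translation into a traditional language (Python here)."""
    params = ", ".join(f"{f.name}: {f.type}" for f in entity.fields)
    body = "\n".join(f"        self.{f.name} = {f.name}" for f in entity.fields)
    return f"class {entity.name}:\n    def __init__(self, {params}):\n{body}\n"


source = """
entity Invoice
amount: float
customer: str
"""
invoice = parse(source)
print(check(invoice))     # list of rule violations, empty when the program is valid
print(generate(invoice))  # Python source for an Invoice class
```
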
Use of the language
Once the building parts are defined, the tool is able to interpret them and provide a development environment (IDE). Besides editing, and depending on how sophisticated the tool is, it can provide utilities such as code completion, static validation, syntax highlighting, different views and even debugging support.

This environment also allows us to generate the code and, sometimes, integrates the build process of the target application.

Future Potential
In short, this new development process is similar to the traditional one but with the benefits of DSLs and code generation, which is the huge advance claimed by researchers and supporters of this methodology.

We agree but, for us, the great potential yet to be discovered lies in the fact that a program is no longer a set of statements but a knowledge representation. Once we define the concepts and rules, semantics may be able to offer many more services than just code generation.
 

 

DSLs

Domain-Specific Languages (DSLs) are programming languages designed to define, in a more accurate and expressive way, particular domains, whether technical or business domains.

They are named in opposition to General Purpose Languages (GPLs: Java, C#, C++, etc.) and provide a narrower but more accurate approach. Their goal is to cover only the domain for which they are designed, but with the most suitable grammatical structures and/or graphical abstractions.

These languages can be analysed from two different perspectives: as an evolution from code generation or as an evolution from GPLs.

From Code Generation to DSLs
There are different ways, more or less sophisticated, to generate code: macros, table-structured data, dynamic generation, parsing, CASE tools, etc. None of them is as powerful as a language (textual or graphical), which defines, in a formal way, linguistic structures, human-readable representations and semantics.

Therefore, we can see DSLs as the most powerful form of code generation.

From GPLs to DSLs
GPLs are powerful because they can be used to solve any computable problem (they are Turing complete), but in many cases they are poorly expressive because of the big gap between the problem domain (real world) and the solution domain (source code). In those cases programming and maintenance are difficult because it’s not easy to understand (read and write) what the program is trying to solve. For instance, compare the definition of a Web UI with its HTML code: the expressive gap is huge.

DSLs bridge those gaps.
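
The Web UI comparison can be made tangible with a small Python sketch. The `FORM` description and the `to_html` translator are hypothetical and only hint at what a real DSL and its generator would do: the description states what the form is, while the HTML spells out how it is built.

```python
from html import escape

# Hypothetical domain-level description of a login form (problem domain).
FORM = {
    "title": "Login",
    "fields": [("user", "text"), ("password", "password")],
}


def to_html(form):
    """Translate the description into the solution domain (HTML)."""
    rows = []
    for name, kind in form["fields"]:
        label = escape(name.capitalize())
        rows.append(
            f'  <label for="{escape(name)}">{label}</label>\n'
            f'  <input id="{escape(name)}" name="{escape(name)}" type="{escape(kind)}">'
        )
    return (
        f"<form>\n  <h1>{escape(form['title'])}</h1>\n"
        + "\n".join(rows)
        + '\n  <button type="submit">Send</button>\n</form>'
    )


html = to_html(FORM)
print(html)
```

The description fits in three lines that a domain expert can read, while the generated markup is several times longer and mixes in details (ids, attributes, escaping) that say nothing about the problem.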

Features and benefits of the DSLs

  • Higher level of abstraction. They define more complex and more abstract concepts, which are therefore more intentional and more expressive.
  • Fewer degrees of freedom. Normally they are not Turing complete. They allow us to define the domain, and nothing but the domain, with the rules that govern it, which makes them very powerful (in that domain, of course).
  • Productivity. Programming with them is more efficient and streamlined.
  • Software quality. They abstract away technical complexity, which reduces errors. That complexity is usually handled by the generator.
  • IDE support. Validations, type checking, code completion, etc. This is a huge advance compared with abstractions via APIs or frameworks.
  • Platform independent.
  • And all the benefits of code generation.
DSLs are common in real life; throughout history they have been created in maths, science, medicine… Now it is time to use them in software development.
 

 

XV Madri+d Investment Forum

We’ve been selected to present at XV Madri+d Investment Forum, a forum for technology companies organized by BAN Madrid (Business ANgels Madrid) and Madri+d foundation.


Let’s talk!

10 Benefits of Code Generation

Let’s get to the point. Here is the list:

  • SW Quality: in every respect (performance, reliability, security).
  • Standardization: not only in source code but also in user interfaces, database structures, etc.
  • Centralization: global policies such as error handling, exception management, data display format, data validation, permission checks, etc. are centralized in the generator. Policies of this kind are also known as cross-cutting concerns, an issue tackled by Aspect Oriented Programming (AOP) in traditional programming. Centralization avoids the issue.
  • Refactoring: related to the previous benefit, code refactoring is easy and safe.
  • Productivity: lower cost and shorter time-to-market (or release time).
  • Analytical skills: code generation requires a deeper domain analysis before implementing the solution via the generator.
  • Design skills: it requires a good architect with a wider view.
  • Healthy growth: it prevents architecture degradation.
  • Team member integration: code generation makes it easier to introduce new members to the development culture and rules.
  • Level of abstraction: programming in a more abstract way, besides being easier to understand (it is more intentional), opens the door to new features such as unit test generation, self-documentation, data population, semantics, machine reasoning and others.
Code generation is not easy. Implementing a generator requires time and effort, even more so if it’s a language workbench, but the benefits are clearly huge.
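
The centralization benefit can be sketched with a tiny template-based generator in Python. The operations, the `TEMPLATE` string and the error policy are assumptions made up for this example; the point is only that the policy is written once, in the generator, and reaches every generated function.

```python
# Hypothetical generator: operation names and the error policy are assumptions.
OPERATIONS = {
    "half": "return value / 2",
    "inverse": "return 1 / value",
}

# The error-handling policy lives in one place: the template.
TEMPLATE = '''def {name}(value):
    try:
        {body}
    except (TypeError, ZeroDivisionError) as error:
        return "{name} failed: " + type(error).__name__
'''


def generate_module(operations):
    """Emit one function per operation, all sharing the same policy."""
    return "\n".join(
        TEMPLATE.format(name=name, body=body) for name, body in operations.items()
    )


namespace = {}
exec(generate_module(OPERATIONS), namespace)  # load the generated code

print(namespace["half"](4))
print(namespace["inverse"](0))
```

Changing the policy (for example, logging instead of returning a message) means editing the template once and regenerating, rather than touching every function by hand.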
 

 

Entrepreneur Program “Soy Emprendedor”

We’ve been selected for the program “Soy Emprendedor, Soy de la Mutua”, a Mutua Madrileña program for startups.


Let’s enjoy it!

Best Practices in Code Generation

The most common mistake when we generate code is to see it as a black box, thinking that the important thing is “what it does” and not “how it does it”. This is wrong. Here again, quality matters.

Here are some of the features good generated code should have:

  • Independent: manual code and generated code must live in different files or artifacts; otherwise there is a risk of losing the manual code when we have to regenerate (and we will).
  • Immutable: it must not be changed by hand, for two reasons: it is dangerous, like modifying any unfamiliar code, and the changes would be lost on regeneration, as in the previous case.
  • Readable: that means meaningful variable and function names, comments, indentation, organization in folders and files, etc. The generated code should be presentable enough to be read by developers: to know how it works and, why not, to learn from it. We should always be proud of the code we generate.
  • Extensible: for different reasons you may need to implement some particular functionality manually, so generated code should leave some doors open. The best way is to design the generated code with a framework approach, where manual code can extend only some allowed functionality and does so in a safe environment.
  • Structured: raising the level of abstraction requires a good knowledge of the field you are dealing with. Badly structured code can be a symptom that the domain is not completely under control. Good code generation requires a good architect.
  • Robust: generated code can fail, of course. Error handling, exception management, input validation, internal validations, etc. must always be included in the code. Security policies of this kind are easy to implement when generating code, and they must be one of the reasons for its quality.
  • Powerful: having said the above, we should see code generation as a way to write more powerful code. That means we can use strategies in the generated code that we would never use in handwritten code (usually for maintenance reasons).
To sum up, best practices in code generation are a mixture of traditional best practices and a wider way of thinking.
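
One common way to combine the Independent, Immutable and Extensible features is to generate a base class and keep manual code in a subclass. The Python sketch below assumes this arrangement; the class names, the file names in the comments and the tax rate are hypothetical, and both classes are shown in one block only for brevity.

```python
# Generated artifact: in practice it lives in its own file (say, a
# hypothetical invoice_base.py) and is regenerated, never edited by hand.
class InvoiceBase:
    """Generated code: do not edit, regenerate instead."""

    def __init__(self, amount):
        # Robust: input validation is emitted by the generator.
        if amount < 0:
            raise ValueError("amount must not be negative")
        self.amount = amount

    def total(self):
        # Fixed flow that calls an overridable hook.
        return round(self.amount + self.extra_charges(), 2)

    def extra_charges(self):
        # Extension point: the only door left open to manual code.
        return 0.0


# Manual artifact: a separate file in practice, so it survives regeneration.
class Invoice(InvoiceBase):
    def extra_charges(self):
        return self.amount * 0.21  # hypothetical tax rate


print(Invoice(100).total())
```

Manual code can change what `extra_charges` returns, but validation and the calculation flow stay under the generator's control.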
 

 

Why Code Generation?

Code generation is not a new style or technique; it’s the path followed by programming languages to deal with complexity, from binary coding to first, second and later generation languages. It’s what compilers have been doing for ages.

The key phrase here is “deal with complexity”: the more complex your problem is, the more abstract your thinking has to be. In other words, you need to raise the level of abstraction. And this rule applies equally to the tool used to solve the problem: the programming language.

Therefore, we can state that “raising the level of abstraction is the goal pursued by the evolution of programming languages”.

The common languages used today to solve problems (Java, C#, C++, Delphi…) are known as general purpose languages (GPLs), and here is the problem: being “general purpose” means they can solve “all” problems, but from a global perspective. They work at a level of abstraction wide enough to reach the solution but not as high as each particular problem needs. There is a gap between the level of abstraction we use to think about the problem and the level of abstraction we use to solve it via GPLs.

How can we bridge that gap? Obviously, with code generation.

In conclusion, to tackle a problem properly we need a particular language that defines the solution at the level of abstraction the problem requires. To make that solution computable, we need to generate code, usually at the nearest lower level: GPLs.

Those particular languages are known as DSLs: Domain-Specific Languages, but this subject will be addressed in another post.

Finally, note that this approach applies not only to business problems but also to technical ones. For example, the new challenges posed by desktop-web applications require a more abstract approach that integrates all the technologies: HTML, CSS, JavaScript, AJAX and others.
 
