Natural Languages and Programming Languages

19. August 2008 09:36 by Mrojas in General

I am a firm believer in program understanding, and I believe our computer skills will allow us to develop programs that understand other programs and maybe, in the future, even write some of them :).

I also believe that natural languages and programming languages have a lot in common.

These are just some ideas about this subject.

"A language conversion translates one language to another language, while a language-level upgrade moves an application from an older version of a language to a modern or more standardized version of that same language. In both cases, the goal is to improve portability and understandability of an application and position that application for subsequent transformation", Legacy Systems, Transformation Strategies by William M. Ulrich.

A natural language conversion is exactly that: translating one language into another.

Natural language processing and transformation have a lot in common with automated source code migration. There are many grammar studies in both areas, and many algorithms in common.
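To make the idea of a language conversion a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how any real migration tool works: it parses a tiny infix arithmetic language into a tree and then writes that tree back out in a prefix (Lisp-like) notation. The toy grammar and the names tokenize, parse, to_prefix and convert are all assumptions made up for this example.

```python
import re

# One token is a number or a single operator/parenthesis, with optional leading spaces.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the source text into a list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Build a tree for the toy grammar:
    expr   -> term (('+' | '-') term)*
    term   -> factor (('*' | '/') factor)*
    factor -> NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token.isdigit():
            return advance()
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def to_prefix(node):
    """Emit the tree in the target notation: (operator left right)."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({op} {to_prefix(left)} {to_prefix(right)})"


def convert(source):
    """Source text -> tokens -> tree -> target text."""
    return to_prefix(parse(tokenize(source)))


print(convert("1 + 2 * (3 - 4)"))  # prints (+ 1 (* 2 (- 3 4)))
```

The shape is the interesting part: analyze the source into a structure, then generate the target from that structure. A translator between natural languages would of course have to deal with ambiguity and context in a way this toy does not.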

I keep quoting:

"Comparing artificial language and natural language is very helpful to our understanding of the semantics of programming languages, since programming languages are artificial. We can see much similarity between these two kinds of languages:

Both of them must explain "given" languages.
The goal of research on the semantics of programming languages is the same as that of natural language: explanation of the meanings of a given language. This is unavoidable for natural language but undesirable for programming languages. The latter has often led to post-design analysis of the semantics of programming languages, wherein the syntax and informal meaning of the language are treated as given (such as PL/I, Fortran and C). The research on the semantics then aims to understand the semantics of the language in another way, or to sort out anomalies, omissions, or defects in the given semantics, which hasn't had much impact on language design. There is another kind of programming language that has a formal definition, such as Pascal, Ada, and SML. The given semantics allow this kind of programming language to be more robust than the previous ones.

Both of them separate "syntax" and "semantics".
Despite these similarities, the difference between the studies of natural and artificial language is profound. First of all, natural language has existed for thousands of years and nobody knows who designed it, whereas artificial languages are synthesized by logicians and computer scientists to meet some specific design criteria. Thus, 'the most basic characteristic of the distinction is the fact that an artificial language can be fully circumscribed and studied in its entirety.'

We have already developed a mature system for SYNTAX. In the 1950s, the linguist Chomsky first proposed formal language theory for English, and thus came up with Formal Language Theory, Grammar, Regular Grammar, CFG, etc. The 'first' application of this theory was to define the syntax of Algol and to build a parser for it. The landmarks in the development of formal language theory are Knuth's parser and YACC, which is a successful and 'final' application of formal language theory.
"

From Cornell University: http://www.cs.cornell.edu/info/projects/nuprl/cs611/fall94notes/cn2/cn2.html
Jing Huang
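
The point about formal language theory can be illustrated with a small sketch in Python: the same context-free grammar machinery recognizes a toy fragment of English and a toy assignment statement. Both grammars, and the names ENGLISH, ASSIGNMENT, derive and accepts, are hypothetical and invented for this example. The recognizer is a simple top-down one that tries every production; it is an assumption of this sketch, not the kind of parser that YACC generates, and it would loop forever on a left-recursive grammar.

```python
# A grammar maps each nonterminal to its productions; any symbol that is not a
# key of the dictionary is treated as a terminal (a literal word or token).
ENGLISH = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["parser"], ["program"]],
    "V": [["reads"], ["runs"]],
}

ASSIGNMENT = {
    "Stmt": [["Id", "=", "Expr", ";"]],
    "Expr": [["Term", "+", "Expr"], ["Term"]],
    "Term": [["Id"], ["Num"]],
    "Id": [["x"], ["y"]],
    "Num": [["1"], ["2"]],
}


def derive(grammar, symbol, tokens, start):
    """Yield every position where a match of `symbol` starting at `start` can end."""
    if symbol not in grammar:
        # Terminal: it must equal the next token exactly.
        if start < len(tokens) and tokens[start] == symbol:
            yield start + 1
        return
    for production in grammar[symbol]:
        positions = {start}
        for part in production:
            # Advance every candidate position through the next part of the production.
            positions = {
                end
                for position in positions
                for end in derive(grammar, part, tokens, position)
            }
            if not positions:
                break
        yield from positions


def accepts(grammar, start_symbol, sentence):
    """True if the whitespace-separated sentence can be derived from start_symbol."""
    tokens = sentence.split()
    return len(tokens) in set(derive(grammar, start_symbol, tokens, 0))


print(accepts(ENGLISH, "S", "the parser reads a program"))  # prints True
print(accepts(ASSIGNMENT, "Stmt", "x = y + 1 ;"))           # prints True
print(accepts(ENGLISH, "S", "parser the reads"))            # prints False
```

Only the grammar changes between the two cases; the algorithm does not. Real English is far too ambiguous for a grammar this small, so this shows the shared formalism and nothing more.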


I would also like to add a reference to an interesting work on pattern recognition, a technique used both in natural language processing (see for example http://prhlt.iti.es/) and in reverse engineering.
This work is by Francesca Arcelli and Claudia Raibulet from Italy, who are working with the NASA Automated Software Engineering Research Center:
http://smallwiki.unibe.ch/woor2006/woor2006paper3/?action=MimeView