5

Architecture

This chapter presents the design architecture of numEclipse. It is intended for people interested in using this tool as a research vehicle, and it will also benefit programmers writing a toolbox. Users who are not interested in programming or in writing an engine extension can skip this chapter. Like any interpreter, the design of this application can be divided into two major pieces: the front-end, which deals with the scanning and parsing of the input program, and the back-end, which actually executes the code. This chapter is organized along the same lines. We also show how to develop and deploy a custom execution engine. A number of open source tools and mathematical libraries are used in the development of numEclipse. We will also talk about their role and their interfaces with the application.

5.1 Front-end

An interpreter front-end performs two tasks. It scans the input to identify the tokens, and it parses the input into an Abstract Syntax Tree (AST). Traditionally, compiler and interpreter developers have used tools like lex and yacc to generate the lexer and parser programs from the language specification, i.e., the grammar. We took a similar approach: rather than writing the lexer and parser from scratch, we used SableCC. This tool is based on a sound object-oriented framework. Given a grammar, it generates tree-walker classes based on an extended visitor design pattern. The interpreter is built by implementing the actions on the AST nodes generated by SableCC. The following code snippet shows how the interpreter is actually invoked by the application.

Listing 5.1

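// Wrap the input so the generated lexer can push characters back.
Reader strReader = new StringReader(input);
Lexer lexer = new Lexer(new PushbackReader(new BufferedReader(strReader)));

// Build the AST, then let the interpreter walk it.
Parser parser = new Parser(lexer);
Node ast = parser.parse();
ast.apply(interpreter);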

Writing the language specification (the grammar) is the most complicated part of language development. Once the grammar is solidified, generating the Lexer and Parser classes with SableCC takes just the click of a button. Most of the effort in developing this application went into writing the class “Interpreter”. It extends the “DepthFirstAdapter” class, which is generated by SableCC. This adapter class is the AST tree walker mentioned earlier.
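The relationship between a generated tree walker and an interpreter that extends it can be illustrated with a hand-written miniature. This is only a sketch: the classes NumberNode, PlusNode, DepthFirstWalker and MiniInterpreter are hypothetical and much simpler than what SableCC generates, but they follow the same idea of nodes that accept a visitor, an adapter that walks the children, and a subclass that supplies the actions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical miniature of a visitor-based tree walker; none of these
// classes are generated by SableCC or taken from numEclipse.
interface Node { void apply(Visitor v); }

interface Visitor {
    void caseNumber(NumberNode n);
    void casePlus(PlusNode n);
}

final class NumberNode implements Node {
    final double value;
    NumberNode(double value) { this.value = value; }
    public void apply(Visitor v) { v.caseNumber(this); }
}

final class PlusNode implements Node {
    final Node left, right;
    PlusNode(Node left, Node right) { this.left = left; this.right = right; }
    public void apply(Visitor v) { v.casePlus(this); }
}

// Plays the role of DepthFirstAdapter: visits children, performs no action.
class DepthFirstWalker implements Visitor {
    public void caseNumber(NumberNode n) { outNumber(n); }
    public void casePlus(PlusNode n) {
        n.left.apply(this);
        n.right.apply(this);
        outPlus(n);
    }
    protected void outNumber(NumberNode n) { }
    protected void outPlus(PlusNode n) { }
}

// Plays the role of Interpreter: overrides only the action hooks.
class MiniInterpreter extends DepthFirstWalker {
    private final Deque<Double> stack = new ArrayDeque<>();
    @Override protected void outNumber(NumberNode n) { stack.push(n.value); }
    @Override protected void outPlus(PlusNode n) {
        double right = stack.pop();
        double left = stack.pop();
        stack.push(left + right);
    }
    double result() { return stack.peek(); }
}

class WalkerDemo {
    public static void main(String[] args) {
        // AST for 2 + (3 + 4)
        Node ast = new PlusNode(new NumberNode(2),
                new PlusNode(new NumberNode(3), new NumberNode(4)));
        MiniInterpreter interpreter = new MiniInterpreter();
        ast.apply(interpreter);
        System.out.println(interpreter.result()); // prints 9.0
    }
}
```

The walker owns the traversal order, so the interpreter only has to say what happens when a node has been fully visited. That separation is what lets a grammar change regenerate the walker without rewriting the traversal by hand.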

5.2 Back-end

So what constitutes the back-end? The back-end is where the action happens. It starts with the Interpreter class, which extends the “DepthFirstAdapter” class. This tree-walker class holds the action code for each node encountered in the AST of the input program. Here is the list of actions that happen in this class.

1. Creating variables,

2. Storing variables in a symbol table,

3. Evaluating expressions,

4. Executing statements, and

5. Calling functions.

The m-script, as mentioned in earlier chapters, does not require you to declare a variable. You can just start working with a variable and the interpreter will figure out its value and type. This poses some implementation challenges but provides a lot of flexibility to the end-user. In the previous chapter on programming, we referred to the interface class “LinearAlgebra”. It provides methods to create different types of variables. The implementation of the interface classes constitutes the execution engine. These classes not only provide the methods to create the different variables but also provide the basic arithmetic operations needed to evaluate complex expressions. A good understanding of what these classes do is essential in order to implement an alternative execution engine.

The symbol table is implemented as a hash table. In fact, it contains three hash tables: one for ordinary symbols, one for global symbols and one for constants. “Symbol” is another object, which is used to wrap any variable before it is stored in the symbol table. Each instance of the interpreter window gets its own symbol table, so the memory view only shows the symbols that are tied to the active interpreter. The symbol table extends the “Observable” class so that the memory view can register as an “Observer” and show the changes as they happen. The Symbol class implements the Serializable interface, so that variables can easily be saved to and retrieved from a file. This enabled us to save a session in a file.
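The symbol table design described above can be sketched as follows. The class and method names (SymbolTable, put, putGlobal, defineConstant, lookup) and the lookup order are assumptions made for illustration; the actual numEclipse classes may differ. Note that java.util.Observable has been deprecated since Java 9, although it is still available.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Hashtable;
import java.util.Observable;

// Hypothetical wrapper; the real Symbol class may carry more metadata.
final class Symbol implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final Serializable value;
    Symbol(String name, Serializable value) { this.name = name; this.value = value; }
}

// Three hash tables behind one observable facade.
@SuppressWarnings("deprecation")
class SymbolTable extends Observable {
    private final Hashtable<String, Symbol> ordinary = new Hashtable<>();
    private final Hashtable<String, Symbol> global = new Hashtable<>();
    private final Hashtable<String, Symbol> constants = new Hashtable<>();

    void defineConstant(String name, Serializable value) {
        constants.put(name, new Symbol(name, value));
    }

    void putGlobal(String name, Serializable value) { store(global, name, value); }

    void put(String name, Serializable value) { store(ordinary, name, value); }

    // Stores the wrapped variable and tells observers (e.g. a memory view).
    private void store(Hashtable<String, Symbol> table, String name, Serializable value) {
        Symbol symbol = new Symbol(name, value);
        table.put(name, symbol);
        setChanged();
        notifyObservers(symbol);
    }

    // Assumed lookup order: ordinary, then global, then constants.
    Symbol lookup(String name) {
        Symbol s = ordinary.get(name);
        if (s == null) s = global.get(name);
        if (s == null) s = constants.get(name);
        return s;
    }
}

@SuppressWarnings("deprecation")
class SymbolTableDemo {
    // Serializes one symbol; a session file would hold many of these.
    static byte[] save(Symbol s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        SymbolTable table = new SymbolTable();
        table.addObserver((source, arg) ->
                System.out.println("memory view update: " + ((Symbol) arg).name));
        table.defineConstant("pi", Math.PI);
        table.put("x", 3.0);
        System.out.println("x = " + table.lookup("x").value);
        System.out.println("serialized bytes: " + save(table.lookup("x")).length);
    }
}
```

Because each interpreter window would own its own SymbolTable instance, a view that observes one table never sees changes made in another.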

Expression evaluation depends entirely on the basic arithmetic operations on the different data types supported by numEclipse. As mentioned earlier, these operations are defined within the implementation classes which form the execution engine. Statements were discussed in chapter 3. They are very similar to those of other programming languages like C or FORTRAN. The correct execution of these statements is the responsibility of the Interpreter class. This functionality is fixed and cannot be modified, since it defines the semantics of the language itself.

numEclipse has a number of built-in functions and it offers the ability to integrate user-defined functions. On start-up, the application loads all the m-script and java functions into a library. All functions, built-in or user-defined, are loaded through a common mechanism using the Java Reflection APIs. The library manager also keeps track of the “dll” files added by the user, as described in the previous chapter. Java reflection is known to be slow in loading a class or method, but in numEclipse all functions are pre-loaded into a hash table, so the cost of calling a function is not so high. The library manager maintains a precedence rule for function calls. It looks up a function in the following order.

1. user-defined m-script function in the numEclipse project,

2. user-defined java function within the referenced projects in the eclipse workspace,

3. user-defined m-script function added to the preferences,

4. user-defined java function added to the preferences,

5. built-in java function and

6. built-in m-script function.

At the moment, this order is fixed, but in future we might allow the user to change this precedence rule through the preferences. This completes the overview of the interpreter back-end. For more insight, one needs to go over the source code.
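The pre-loading and precedence rule can be sketched as a set of hash tables consulted in a fixed order. Everything here is hypothetical (FunctionSource, FunctionLibrary, UserFunctions), and for brevity every function, including what would be an m-script function in numEclipse, is represented by a static Java method. The reflective lookup happens once at registration, so a later call only pays for a hash lookup and Method.invoke.

```java
import java.lang.reflect.Method;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; the declaration order mirrors the lookup order in the list above.
enum FunctionSource {
    PROJECT_MSCRIPT, WORKSPACE_JAVA, PREFERENCES_MSCRIPT,
    PREFERENCES_JAVA, BUILTIN_JAVA, BUILTIN_MSCRIPT
}

class FunctionLibrary {
    private final Map<FunctionSource, Map<String, Method>> tables =
            new EnumMap<>(FunctionSource.class);

    // The reflective lookup happens here, once, at load time.
    void register(FunctionSource source, String alias, Class<?> owner,
                  String methodName, Class<?>... parameterTypes) throws NoSuchMethodException {
        Method m = owner.getMethod(methodName, parameterTypes);
        tables.computeIfAbsent(source, s -> new HashMap<>()).put(alias, m);
    }

    // Walks the sources in declaration order; the first hit wins.
    Method resolve(String alias) {
        for (FunctionSource source : FunctionSource.values()) {
            Map<String, Method> table = tables.get(source);
            if (table != null && table.containsKey(alias)) {
                return table.get(alias);
            }
        }
        return null;
    }

    // Only static methods are supported in this sketch, hence the null receiver.
    Object call(String alias, Object... args) throws ReflectiveOperationException {
        Method m = resolve(alias);
        if (m == null) throw new NoSuchMethodException(alias);
        return m.invoke(null, args);
    }
}

// A user-defined function that deliberately shadows the built-in one.
class UserFunctions {
    public static double abs(double x) { return 42.0; }
}

class FunctionLibraryDemo {
    public static void main(String[] args) throws Exception {
        FunctionLibrary library = new FunctionLibrary();
        library.register(FunctionSource.BUILTIN_JAVA, "abs", Math.class, "abs", double.class);
        System.out.println(library.call("abs", -3.0)); // prints 3.0

        library.register(FunctionSource.WORKSPACE_JAVA, "abs",
                UserFunctions.class, "abs", double.class);
        System.out.println(library.call("abs", -3.0)); // prints 42.0
    }
}
```

Making the order user-configurable, as suggested above, would amount to replacing the fixed enum order with a list read from the preferences.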

5.3 User Interface

The very first user interface of numEclipse never saw daylight. It was built on Java Swing. We quickly realized that it did not really serve the objectives of this project. The intention behind numEclipse is not just an interpreter but rather a comprehensive development environment for scientific computing. MATLAB and GNU Octave do provide the possibility to add functions written in other programming languages, but they do not provide any integration with development tools as such. We decided to re-write numEclipse as an eclipse plug-in, and this approach opened up a whole new world of opportunities for us. In previous chapters, we showed how to write a java or C function within eclipse and how to quickly test and deploy it with numEclipse. This seamless integration would not have been possible without the eclipse platform.

We decided to follow a software engineering approach to scientific application development, so we introduced the notion of a numEclipse project. This allows project-oriented and role-based development of scientific applications. We created a new perspective for project development. We also added a wizard to create a new project. The perspective contains three new components: the interpreter window (editor), the memory view and the history view. The interpreter window is basically an editor in eclipse’s terms. We do not know of any other interpreter implementation within eclipse, so we developed the interpreter (editor) from scratch. The design of this interpreter is still in development and there are a lot of opportunities for improvement. The memory and history views were rather easy to develop. They use the observer-observable design pattern to update their corresponding information. At the moment, the interpreter window is tightly hooked to the actual interpreter, and we are trying to come up with a better design that introduces separation of concerns. This might set a precedent for future interpreter plug-ins for eclipse. Another user interface component is the numEclipse preferences, as we saw in the previous chapters. This enables us to define numEclipse-related configuration such as constants, libraries and gnuplot.

5.4 Gnuplot Interface

Our initial intent was to write Java2D/Draw2D based plotting APIs. But we quickly realized that this would be an enormous task and that there is no point in reinventing the wheel. There are already a number of open source projects providing excellent APIs for plotting. Our objective was to choose something similar to MATLAB. We started by looking at PLPlot, a set of plotting functions written in C. This project also provides a java binding to the C functions. Unfortunately, it is geared more towards Linux/UNIX users. We initially compiled the JNI-enabled dll on Windows XP and came across a lot of problems. PLPlot functions have their own window management; once a graph is plotted by a java program through the binding, the program has no control over the plot window. We also discovered that you could only have one plot at a time, which is not acceptable for our purpose. Finally, we decided to take the approach of Octave and provide an interface to gnuplot. It is an excellent tool for scientific plotting and has been developed over a long period of time. We are using version 4.0, which is very mature and stable. We provide this interface to gnuplot as a built-in toolbox. We are hoping that some users will try to write their own toolboxes for other visualization APIs or applications.

Gnuplot is an application rather than a set of APIs. It provides a command line user interface. This posed another challenge for integration. Fortunately, gnuplot also accepts user commands through a pipe. This is why the path to the gnuplot executable file must be defined in the numEclipse preferences. The following code snippet shows how gnuplot is invoked and how the link is created.

Listing 5.2

Process p = Runtime.getRuntime().exec(gnuplot);

PrintStream out = new PrintStream(p.getOutputStream());

Once a link is established, to send a command to gnuplot we use the following.

Listing 5.3

String command = ….

out.println(command);

out.flush();

As you can see, the integration with gnuplot is very straightforward. Most of the effort went into writing methods which translate numEclipse commands into gnuplot commands. On top of that, we had to store the data and state between commands. To store the temporary plotting data, we create temporary files in the user area allocated by the operating system. These files are short-lived and are deleted at the end of the session. For more information, one needs to walk through the source code, i.e., org.numEclipse.toolbox.Plot.
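The translation step can be sketched as follows. This is not the code in org.numEclipse.toolbox.Plot; GnuplotLink and its methods are hypothetical. The sketch writes the data to a temporary file and emits a standard gnuplot plot command that reads columns 1 and 2 of that file. The output stream is passed in, so the same code can write to a gnuplot process or, as in the demo, simply echo the command.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.PrintWriter;

// Hypothetical helper illustrating the command translation idea.
class GnuplotLink {
    private final PrintStream out;

    // In practice the sink would be the OutputStream of the gnuplot process.
    GnuplotLink(OutputStream sink) {
        this.out = new PrintStream(sink, true);
    }

    // Writes x/y pairs to a temporary file that is removed when the JVM exits.
    static File writeData(double[] x, double[] y) throws IOException {
        File data = File.createTempFile("plot", ".dat");
        data.deleteOnExit();
        try (PrintWriter writer = new PrintWriter(data)) {
            for (int i = 0; i < x.length; i++) {
                writer.println(x[i] + " " + y[i]);
            }
        }
        return data;
    }

    // Translates a plot(x, y) request into a gnuplot "plot" command.
    void plot(double[] x, double[] y) throws IOException {
        File data = writeData(x, y);
        // Forward slashes keep the path valid inside gnuplot quotes on Windows.
        String path = data.getAbsolutePath().replace('\\', '/');
        out.println("plot '" + path + "' using 1:2 with lines");
        out.flush();
    }
}

class GnuplotDemo {
    public static void main(String[] args) throws IOException {
        // Echo the generated command instead of starting gnuplot.
        GnuplotLink link = new GnuplotLink(System.out);
        link.plot(new double[] {0, 1, 2}, new double[] {0, 1, 4});
    }
}
```

Keeping the data in a file rather than in the command keeps each command short, at the cost of having to track and clean up the temporary files.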

5.5 Execution Engine

In this section, we will show how to develop and deploy an execution engine. The intent is to show the process with a simple example rather than building a sophisticated engine. Let’s give a different meaning to matrix computation. We redefine the matrix addition, subtraction and multiplication using the following formulae.

A ⊕ B = (A + B) mod 10

A Θ B = (A − B) mod 10

A ⊗ B = (A × B) mod 10

Here “mod” stands for the modulo operation. The result of “a mod b” is the remainder when “a” is divided by “b”. The symbols ⊕, Θ, ⊗ are used here only to distinguish the new operations; the arithmetic operator symbols remain the same within numEclipse. A good implementation of these operators would take a lot of effort. We will make a quick implementation to prove the concept.

We use the “DefaultMatrixImpl” class, which implements the interface “IMatrix” as described on the project website (http://www.numeclipse.org/interface.html). We copy the class and rename the copy to “ModuloMatrixImpl”. Then we modify the following methods.

Listing 5.4

public IMatrix mult(IMatrix m) {…}

public IMatrix plus(IMatrix m) {…}

public IMatrix minus(IMatrix m) {…}

We do not show the code of these methods as the change is extremely simple. We apply the following utility function on each element of the resultant matrix before we return the value.

Listing 5.5

private IComplex modulo10(IComplex z) {
  double re = z.getReal();
  double im = z.getImag();
  // Reduce the real and imaginary parts separately.
  return new DefaultComplexImpl(re % 10, im % 10);
}
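One detail of the utility function in Listing 5.5 is worth spelling out: Java's % operator returns a remainder whose sign follows the dividend, so a negative entry stays negative. The small sketch below contrasts that behavior with a variant that always lands in the range [0, 10). The class and method names are hypothetical, and the second method is shown only for comparison; it is not what the listing does.

```java
class ModuloSign {
    // Remainder as computed in Listing 5.5: the sign follows the dividend.
    static double remainder10(double x) {
        return x % 10;
    }

    // Alternative that always returns a value in [0, 10); shown for comparison only.
    static double floorMod10(double x) {
        return ((x % 10) + 10) % 10;
    }

    public static void main(String[] args) {
        System.out.println(remainder10(18.0)); // prints 8.0
        System.out.println(remainder10(-4.0)); // prints -4.0
        System.out.println(floorMod10(-4.0));  // prints 6.0
    }
}
```

This is why a subtraction under the modulo engine can still produce a negative entry such as −4.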

Then, we refactor the “DefaultLinearAlgebraFactory” class and copy it as “ModuloLinearAlgebraFactory” class. Then, we modify the following methods as shown.

Listing 5.6

public IMatrix createMatrix(IComplex[][] c) {
  return new ModuloMatrixImpl(c);
}

public IMatrix createMatrix(double[] d1, double[] d2) {
  return new ModuloMatrixImpl(d1, d2);
}

public IMatrix createMatrix(double[][] d1, double[][] d2) {
  return new ModuloMatrixImpl(d1, d2);
}

public IMatrix createMatrix(IMatrix m) {
  return new ModuloMatrixImpl(m);
}

public IMatrix createMatrix(int m, int n) {
  return new ModuloMatrixImpl(m, n);
}

public IMatrix createMatrix(String[][] str) {
  return new ModuloMatrixImpl(str);
}

public IMatrix createMatrix(Hashtable hash, int m, int n) {
  return new ModuloMatrixImpl(hash, m, n);
}

public IMatrix createMatrix(BigDecimal[][] b) {
  return new ModuloMatrixImpl(b);
}

The change is very simple: all we did was replace each constructor call with a call to the corresponding constructor of the class “ModuloMatrixImpl”. Notice that the change is minimal; we did not modify any other data type. We also did not modify the structure of the matrix data type. We only modified the way addition, multiplication and subtraction of two matrices work.
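The effect of the modified operations can be checked with a small self-contained sketch. It does not use the numEclipse interfaces: SimpleMatrix, Mod10Matrix and MatrixFactory are hypothetical stand-ins for the default implementation, the modulo implementation and the factory, and they are restricted to real-valued entries. The sketch also uses subclassing with a wrap hook instead of the copy-and-modify approach described above, simply to stay short.

```java
import java.util.Arrays;

// Hypothetical stand-in for the default matrix implementation (real entries only).
class SimpleMatrix {
    final double[][] a;
    SimpleMatrix(double[][] a) { this.a = a; }

    // Subclasses decide how a raw result is post-processed and wrapped.
    protected SimpleMatrix wrap(double[][] data) { return new SimpleMatrix(data); }

    SimpleMatrix plus(SimpleMatrix m) { return combine(m, 1.0); }
    SimpleMatrix minus(SimpleMatrix m) { return combine(m, -1.0); }

    // Element-wise a + sign * m.
    private SimpleMatrix combine(SimpleMatrix m, double sign) {
        double[][] r = new double[a.length][a[0].length];
        for (int i = 0; i < r.length; i++)
            for (int j = 0; j < r[0].length; j++)
                r[i][j] = a[i][j] + sign * m.a[i][j];
        return wrap(r);
    }

    // Ordinary row-by-column matrix product.
    SimpleMatrix mult(SimpleMatrix m) {
        double[][] r = new double[a.length][m.a[0].length];
        for (int i = 0; i < r.length; i++)
            for (int j = 0; j < r[0].length; j++)
                for (int k = 0; k < m.a.length; k++)
                    r[i][j] += a[i][k] * m.a[k][j];
        return wrap(r);
    }
}

// Stand-in for the modulo implementation: every result entry is reduced mod 10.
class Mod10Matrix extends SimpleMatrix {
    Mod10Matrix(double[][] a) { super(a); }

    @Override protected SimpleMatrix wrap(double[][] data) {
        for (double[] row : data)
            for (int j = 0; j < row.length; j++)
                row[j] %= 10;
        return new Mod10Matrix(data);
    }
}

// Stand-in for the factory: the only place that names the concrete class.
interface MatrixFactory {
    SimpleMatrix createMatrix(double[][] d);
}

class ModuloEngineDemo {
    public static void main(String[] args) {
        MatrixFactory factory = Mod10Matrix::new;
        SimpleMatrix a = factory.createMatrix(new double[][] {{4, 5}, {3, 9}});
        SimpleMatrix b = factory.createMatrix(new double[][] {{3, 0}, {7, 9}});
        System.out.println(Arrays.deepToString(a.plus(b).a));  // [[7.0, 5.0], [0.0, 8.0]]
        System.out.println(Arrays.deepToString(a.minus(b).a)); // [[1.0, 5.0], [-4.0, 0.0]]
        System.out.println(Arrays.deepToString(a.mult(b).a));  // [[7.0, 5.0], [2.0, 1.0]]
    }
}
```

Swapping `Mod10Matrix::new` for `SimpleMatrix::new` in the factory line gives back ordinary matrix arithmetic, which is the essence of an exchangeable execution engine.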

In order to deploy this simple engine, we export these two classes into a jar file (say modulo.jar). Then we add the jar file to the library through the numEclipse preferences. The application will ask you to restart the workspace; allow it to restart automatically. Right-click anywhere in the interpreter window and a pop-up menu will appear; select the new execution engine. Now you are ready to test the changes. In the following, we show some calculations with the default engine and then the results with the new engine.

Listing 5.7 (Default implementation)

>> A = [4 5; 3 9];
>> B = [3 0; 7 9];
>> A + B
ans =
   7.0000   5.0000
  10.0000  18.0000
>> A - B
ans =
   1.0000   5.0000
  -4.0000   0.0000
>> A * B
ans =
  47.0000  45.0000
  72.0000  81.0000

Listing 5.8 (Modulo implementation)

>> A = [4 5; 3 9];
>> B = [3 0; 7 9];
>> A + B
ans =
   7.0000   5.0000
   0.0000   8.0000
>> A - B
ans =
   1.0000   5.0000
  -4.0000   0.0000
>> A * B
ans =
   7.0000   5.0000
   2.0000   1.0000

Once the new engine is loaded through the preferences, you can switch back and forth between engines with a click of the mouse. However, there is a catch: a variable created with the default engine will keep using the arithmetic operations defined in the default engine. In other words, switching the engine does not mean that the new operations apply to variables that already exist in the workspace memory. You should clear the memory and create the variables again to use the new operations. In future, we might add a utility to convert the variables as you change the engine. In this section, we showed how to create a simple engine and how to deploy it.
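The catch follows directly from the factory design: the engine only decides which class is instantiated, and each object carries its own operations from then on. The sketch below illustrates this with scalar values instead of matrices. All names (Value, PlainValue, Mod10Value, Workspace) are hypothetical, and convertAll is only one possible shape for the conversion utility mentioned above, not an existing numEclipse feature.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.DoubleFunction;

// Hypothetical scalar stand-ins for engine-specific variables.
interface Value {
    double get();
    Value plus(Value other);
}

final class PlainValue implements Value {
    private final double v;
    PlainValue(double v) { this.v = v; }
    public double get() { return v; }
    public Value plus(Value other) { return new PlainValue(v + other.get()); }
}

final class Mod10Value implements Value {
    private final double v;
    Mod10Value(double v) { this.v = v; }
    public double get() { return v; }
    public Value plus(Value other) { return new Mod10Value((v + other.get()) % 10); }
}

class Workspace {
    // The active engine is just a factory for new variables.
    private DoubleFunction<Value> engine = PlainValue::new;
    final Map<String, Value> memory = new LinkedHashMap<>();

    void switchEngine(DoubleFunction<Value> newEngine) { engine = newEngine; }

    void assign(String name, double v) { memory.put(name, engine.apply(v)); }

    // Hypothetical conversion utility: rebuilds every variable with the active engine.
    void convertAll() {
        memory.replaceAll((name, value) -> engine.apply(value.get()));
    }
}

class EngineSwitchDemo {
    public static void main(String[] args) {
        Workspace ws = new Workspace();
        ws.assign("a", 7);
        ws.assign("b", 8);

        ws.switchEngine(Mod10Value::new);
        // Existing variables still use the default engine's addition.
        System.out.println(ws.memory.get("a").plus(ws.memory.get("b")).get()); // prints 15.0

        ws.convertAll();
        // After conversion the modulo addition applies.
        System.out.println(ws.memory.get("a").plus(ws.memory.get("b")).get()); // prints 5.0
    }
}
```

Clearing the memory and re-creating the variables, as recommended above, has the same effect as convertAll in this sketch.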
