formula translation

Formula Translation

The Project

This article describes the translation of several types of mathematical equations into a sequence
of basic arithmetic operations.
Such a process is also known as "parsing" and is necessary when plotting functions or using them in spreadsheets.
Years ago, I designed the algorithm for my equations grapher Graphics-Explorer.
However, at a second glance, this Delphi source code is rather hard to read.
So I have rewritten the code into a more comprehensible form.

The Delphi-7 project consists of 3 units:

Form1 and unit1 make a simple excerciser to test the translation.
xlate also holds debug code that may be switched on and off.
This code allows for the display of the tables generated in the translation process.

Click [ here ] to load the excersiser program.
Click [ here ] to load the complete Delphi-7 project.
Click [ here ] for the complete source listing of the translate unit.

Below is a snapshot of the exerciser, showing the function y = x*sin(3x)

Donations

Designing software is time consuming.
I offer this software for free.
However, if you make commercial use of it, or appreciate it otherwise,
be so kind to make a donation to support my website.

Formulas, Equations and Functions

A formula is a calculation which contains unknown numbers.
These numbers are represented by a character.
An example is the formula for the volume V of a sphere having radius R :....... V = (4/3)pR³
Here, p is a constant (3,14.......) and R is a variable.
By substituting p with enough digits for the needed accuracy and R
with the length of the radius, we obtain V for this sphere after the necessary calculations.

In the above example, R is the independent variable (we may give it any positive value) and
V is the dependent variable.
When the context is unknown, it is customary to call 'x' the independent and 'y' the dependent variable.

The form .....y = ....some formula with x...... is called a function
For each value of x we may calculate y.
Each pair (x,y) may represent a dot in a coordinate system.
All dots together make the graph of the function.

The form ....x = ....some formula with y........is called an inverse function.
A substituted value of y yields a value of x.
Notice that an inverse function is the result of mirroring a normal function in the line y = x.

The functions above cannot represent graphs as a circle, where a single value of x may have 2 values of y.
In that case a so called parametric function may be used.
The tric is to have an intermediate variable (in our case V) and make both x and y a function of V.
Using my translator you may simply write:

y = 5sin(v) ; x = 5cos(v) ........for a circle with it's center at (0,0) and radius 5 In rare cases as graphs of trifoliums and the like, functions do not exist.
These graphs are the result of equations like

.....x.....y.... = ....x....y...... Left and right of the equal sign is a formula with x and y.
The graph is the set of all dots (x,y) where the '= ' is true: the same value left and right of the equal sign.
My translator and drawing procedures also support this so called 'implicit' functions, however the processing time
is longer.
So, writing y = 2x - 5 yiels the same graphic as y - 2x = 5 but the latter is slower.

All functions (normal, inverse, parametric, implicit) use the same translator and result in the same tables.
The drawing procedures however are different.

To summarize the supported functions / equations:

type	name	example	meaning
1	normal	y = 3x² -2x + 1	parabolic function
2	inverse	x = 3sin(y)	inverse sine function
3	parametric	y = 3sin(x) ; x = 3cos(x)	circle
4	implicit	x² + y² = 9	same circle as in type 3

Operators and priorities

A calculation like 1 + 3*2 + 5*3² cannot be made from left to right as operations have different priorities.
Also calculations between (.......) have priority as in 3*(10 -6) where 10 - 6 has to be calculated first.
The following table shows the priorities of the operations supported

name	symbol	priority	example
function	...	7	log(x)
exponentiation	^	6	x^3
unairy -	-	5	-sin(x)
multiplication	*	4	3*x
division	/	4	x/5
addition	+	3	x+5
subtraction	-	3	x/5
transfer	=	2	y = 17
separator	;	1	y = 1; x = 2

Note: the ; operator is a dummy with the lowest priority.
It keeps the x and y variables separated in parametric functions.

An intrinsic (type 4) function like........x^2 + y^2 = 15x + 22y is changed into .....v = x^2 + y^2 - 15x + 22y
before the translation process.
The - operator is assigned a priority of 2, the = operator a priority of 1.
So, if a set of (x,y) satisfies the equation then v = 0.

Determining the type of equation

The translator is called by

procedure

var

byte

The formula text is s.
The returned xcode byte will be zero for a successfull translation and unequal to zero for a particular error.
Before the translation starts the text is converted to lower case characters and copied in string "basetext".
Also a blank character is appended to the string to mark the end. This facilitates the translation.
xlateEquation(..,..) then calls procedure detecttype to find out which type was presented.

Detecttype counts the number of x , y, = and ; characters, recording also their position in the string
while ignoring blanks.
This is accomplished by an array with separate columns for the x, y, = and ; characters and plenty of rows.
If an x is found at position 1 of the string, the number 1 is written in the next free position of the x column.
If a ; is found at position 15 in the string, then 15 is written in the column for the ;
The height of the column is the amount of characters x, y, = , ; encountered in basetext.

Detection of a type 1 function:

Detection of a type 2 function:

Detection of a type 3 function:

Detection of a type 4 equation:

Please refer to the source code for details.
Array cp[x...; , rows] holds the positions of the characters x,y,=, ;
Array cc[x..;] holds the number of characters in each column of cp[ ]

Data Structures

The numbers for all calculations are stored in an array REG[1..67] of double.
Following table shows the assignments:

register number	purpose
1..29	intermediate results during calculations
30..59	constants found in basetext string
60	x
61	y
62	v
63	a
64	b
65	c
66	pi
67	e

a,b,c are constants. Their value may be changed without the necessity to retranslate the formula.
pi = 3.14.... and e is the base of the natural logarithm.

operators and function are assigned an operation code:

code	operation	name
1	+	addition
2	-	subtraction
3	*	multiplication
4	/	division
5	^	exponentiation
6	-	unairy -
7	=	transfer
8	;	separator
9	not used	.
10	sin	sine
11	cos	cosine
12	tan	tangent
13	asin	inverse sine
14	acos	inverse cosine
15	atan	inverse tangent
16	sqrt	square root
17	log	logarithm base 10
18	ln	logarithm base e
19	exp	power of e
20	abs	absolute
21	int	rounded to integer

The Postfix Table

This is an interim table in the translation process.
Procedure makePFT builds the table.
It's entries are of type TPostfix:

type TPostfix = record
                 reg      : byte;
                 opcode   : byte;
                 priority : byte;
                end;
				
const maxPFT     = 60;	

var PFT       : array[1..maxPFT] of TPostFix;
    pfti      : byte; //#entries in PFT

Building the postfix table is the next step after the equation type was determined.
Besides of building the table, the makePFT procedure:

In TPostfix,

The priority is the sum of the biaspriority (variable biasprio) and the priority of the operation.
The biaspriority is increased by 10 after an ( opening parenthesis and decreases by 10 after a ) closing parenthesis.
So, x + 3 has priority 4, but (x + 3) has priority 14.

Examining the character string of an equation we notice the structure

Then a third column is added to the table to indicate the priority of the operation.
The objective is later to select the table entry with the highest priority to build a final table
holding all operations in descending priority.

Here is an example what the Postfix table (PFT) looks like after translating
y = 10x - 7(x-3)^2:

reg	opcode	priority
y	=	2
[30]	*	4
x	-	3
[31]	*	4
x	-	13
[32]	^	6
[33]	...	0

Note: constants are stored in registers 30,31,....as they occur in the string, so
[30] = 10, [31] = 7, [32] = 3, [33] = 2

Let's take a closer look at the way the table is built.
The characters of basetext are looked at by groups:

'a' ...'z'

'0'..'9', decimal point

(

)

+ * / ^

space

;

The equation string basetext is scanned character by character.
A case statement handles execution of a character according to it's group above.
Action depends however on the current process or previous findings.
The latter is indicated by variable scan where

type Tscan = (scNone,scAlphaScan,scNumScan,scConstant,scVariable,scFunction,
              scOperator,scOpen,scClose);

scNone:	nothing happened....{at start, after = or ; }
scAlphascan:	scanning the name of a variable or function
scNumscan:	scanning a number
scConstant:	numeric constant found
scVariable:	variable found
scFunction:	function found
scOperator:	operator found
scOpen:	( found
scClose:	) found

So, a character read from the basetext string is presented to a case statement to test for it's group.
Then for each group a new case statement tests for the state of the scan variable.
From this point in the code, helper procedures are called to build the Postfix table.
These are the helper routines that do the construction work:

procAlpha:	test alpha string for being variable or function. Get register number or operator code.
procNum:	test numeric string for being a constant, enter constant in register
procUminus:	handle unairy - as in ...( -x + 4)
procOpen:	increase biaspriority by 10
procClose:	decrease biaspriority by 10
multInsert:	insert a * operator for multiplication, as between ) (

The nested case statements take care of the syntax checking.

Example:
procAlpha will be called if scan = scAlphascan ...{we are reading characters from basetext
that might become a variable or function} and we encounter:
a blank, ( , ) , numeric character, ; or an operator.
We look at the first code lines of the case statement:

 for i := 1 to length(basetext) do
  begin
   c := basetext[i];
   case c of
    '(' : begin
           case scan of
            scNumScan : procNum;
            scAlphaScan : procAlpha;
           end;
           case scan of
            scClose,
            scVariable,
            scConstant : multinsert;
           end;//case
           procOpen;
          end;// '('

    ')' : begin
           case scan of
            scOpen,
            scOperator,
            scfunction : errorcode := 7;
            scNumScan  : procNum;
            scAlphaScan: procAlpha;
           end;//case
          procclose;
          end;//')'
		  ......................

Note: errorcode 7 is the code for syntax error.
In this case we may have found ......sqrt) , which is illegal.

Note: after the procAlpha procedure the scan variable is either scVariable or scFunction, whatever was detected.
After the procNum procedure, scan = scConstant.
after multInsert , scan = scoperator

Refer to the source code of the xlate unit for the details such as actions of the helper procedures
inside procedure makePFT.

Building the Director table.

This is the final table of the translation process.
Procedure makeDRT builds the table.
The xlateEquation procedure calls makeDRT if the Postfix table is built sucessfully, which is the case
if the errorcode = 0.

type TDirector = record
                  opcode : byte;
                  dest : byte;
                  src1 : byte;
                  src2 : byte;
                 end;
				
const maxDRT = 60;	

var DRT    : array[1..maxDRT] of Tdirector;
    dix    : byte;//index for DRT during calculations
    valid  : boolean;//during calculations
    DRTtop : byte; //#entries in DRT

The director table (DRT) holds all operations sorted to priority from high to low.
dest is the destination register, src1 and src2 are the source registers.
For the Postfix table above { y = 10x - 7(x-3)^2 } , the DRT is:

opcode	dest	src1	src2
-	[1]	x	[32]
^	[1]	[1]	[33]
*	[2]	[30]	x
*	[1]	[31]	[1]
-	[2]	[2]	[1]
=	y	...	[2]

The next figure shows the calculations and intermediate results on a downward the time scale.
[1] means : scratch register number 1.

Procedure makeDRT searches the PFT for the operation having the highest priority.
If necessary, a scratch register is assigned to hold intermediate results.
The operation and registers are filled in in the DRT and the operation is removed from the PFT.
makeDRT has 3 helper procedures:

function getfreeReg : byte	reserve and return a free scratch register for intermediate results
procedure freeReg(k : byte)	free Register k for future assignment
procedure ShiftUpPFT(n : byte)	shift [n+1] --> [n], [n+2] --> [n+1] ......

variable regreserve has bit i set for a free register i.
Initially, regreserve is preset to $3ffffffe for free scratch registers 1..29.

Say, line j of the PFT holds the operation with the highest priority.
Then source operand 1 (src1) of the DRT entry is set to reg of line j of the PFT.
Source operand 2 (src2) is set to reg of line j+1 in the PFT.
The opcode in DRT is set to the opcode of line j in the PFT.
What about the dest (destination) register in the DRT?

The situation is coded into variable regcode and then a case statement examining regcode
takes the appropriate action.
Note: in case of a function or unairy - , the reg field in the PFT equals 0.
The operand is found as field reg on the next PFT entry.
Functions and the unairy - only use src2, src1 = 0.
After shifting up the PFT to eliminate the operation just added to the DRT, the PFT reg field is set to the destination register.

Let's break down the above equation step by step, breaking down the PFT while building the DRT.
{y = 10x - 7(x-3)^2 }

Postfix table PFT

director table DRT

reg	opcode	priority
y	=	2
[30]	*	4
x	-	3
[31]	*	4
x	-	13
[32]	^	6
[33]	...	0

opcode	dest	src1	src2
-	[1]	x	[32]

Scratch register 1 is assigned for the result.

Postfix table PFT

director table DRT

reg	opcode	priority
y	=	2
[30]	*	4
x	-	3
[31]	*	4
[1]	^	6
[33]	...	0

opcode	dest	src1	src2
-	[1]	x	[32]
^	[1]	[1]	[33]

Postfix table PFT

director table DRT

reg	opcode	priority
y	=	2
[30]	*	4
x	-	3
[31]	*	4
[1]	..	0

opcode	dest	src1	src2
-	[1]	x	[32]
^	[1]	[1]	[33]
*	[2]	[30]	x

Postfix table PFT

director table DRT

reg	opcode	priority
y	=	2
[2]	-	3
[31]	*	4
[1]	..	0

opcode	dest	src1	src2
-	[1]	x	[32]
^	[1]	[1]	[33]
*	[2]	[30]	x
*	[1]	[31]	[1]

Postfix table PFT

director table DRT

reg	opcode	priority
y	=	2
[2]	-	3
[1]	..	0

opcode	dest	src1	src2
-	[1]	x	[32]
^	[1]	[1]	[33]
*	[2]	[30]	x
*	[1]	[31]	[1]
-	[2]	[2]	[1]

Postfix table PFT

director table DRT

reg	opcode	priority
y	=	2
[2]	..	0

opcode	dest	src1	src2
-	[1]	x	[32]
^	[1]	[1]	[33]
*	[2]	[30]	x
*	[1]	[31]	[1]
-	[2]	[2]	[1]
=	y	...	[2]

And the translation is completed.

Please use the exerciser in debug mode to see the PFT and DRT of other equations and functions.

Syntax

Syntax checking is done during the construction of the PFT.
The main rule is that operators and constants, variables or functions must alternate.
Two successive operators such as - - 5 are illegal. Instead, write - (-5) which is OK.
Also legal is : y = - x or y = - cos(5x), where the - is a unairy operator.
Keep in mind the difference between - 2^4 { = - 16} and (-2)^4 { = 16}
After a function , a ( must follow. Example: sin(3x).
The * multiplication operator is inserted between )..( , numeric constants and variables or functions
and between ( or ) and functions, numeric constants or variables.
Examples:

Parenthesis may be nested 20 levels deep.
A ( must be matched by a ) at each side of = or ; operators.

The following is illegal:

A translation produces an error code. Zero indicates a successfull translation.
If the errocode is nonzero, function getmessage(errorcode) gives a string with the decription of the error.

Calculations

Now with a completed director table it is easy to make calculations.
Just perform all basic operations in the table from top to bottom.
After the translation, the values of a, b, c still may be changed. Retranslation is not needed.
For a type1 function, before calculation we have to set the value of x by calling setX(...).
Then procedure calculate(var OK : boolean) is called to do the job.
The OK flag will be true after an error free calculation and false after an error
such as

After the calculation the getY function returns y.

For each operator or function there is a small procedure that makes the calculation.
A try....except....structure takes care of any arithmetic errors.
Variable dix is the index into the DRT and it runs from 1 to DRTtop.

For a type2 function we have to set y using the setY(..) procedure and getX returns the calculated x.

For a type3 function, use setV(..) and later the getX and getY functions to get x and y.

For a type4 function both x and y have to be set and the result is returned by calling getV.
Remember that a type4 equation, such as x² + y² = 25, is modified to
v = x² + y² - 25, where the - operation has priority 2 and the = has priority 1.
So, v will be zero if the equation is true for a (x,y) pair.

The exerciser

The exerciser is programmed in form1 and unit1.
It provides a TEdit to enter equations and buttons to start translation or set a,b,c constants.
Also there is a checkbox to switch on debugmode, a label to display results and a 640 * 480 pixels paintbox
for the display of the registers for numeric constants, the postfix- and the director tables.
The exerciser has been kept very simple intentionally.
The coordinate system in paintbox1 has (0,0) at pixelposition (320,240),
x runs from - 8....+8 and y from -6....+6. The scale is 40 pixels per cm.

Please look at fxlate2.html for a description of the paint procedure for each equation type.