Math Interpreter

Introduction

For various reasons, I needed to code a math evaluator in Java to evaluate, well, math. It takes in basic math expressions as strings, and evaluates the basic operators, + - * / and (.

This is designed to be a recursive descent parser, a top-down parser where each "section" or "function" evaluates a single grammar representation of the expression, and at the bottom, it calls the top (if there are brackets) recursively.

In this article, I will explain how this is done. Firstly, let’s use the BoDMAS rule. There are three "levels" of math to evaluate, factors (unary expressions), terms (multiplication) and expressions (addition and subtraction). I will explain this in detail below.

"Eating"

To provide some context, I will explain the basis of the code, called "eating" or "consuming". It’s a very simple concept, it moves the position we are analysing forward and stores any relevant information we need.

Consider the method:

                        private boolean consume(int charToEat) {
        // skip empty, speed up
        while (ch == ' ') {
                nextChar();
        }
        // check if it has reached the char to consume
        if (ch == charToEat) {
                nextChar();
                return true;
        }
        return false;
}

private void nextChar() {
        ch = (++pos < str.length()) ? str.charAt(pos) : -1;
}

                    
What this does, really, is allow us to search for a character we want. The variable str, ch and pos are local variables to the MathInterpreter object, which represents the string being evaluated, the current character, and the current position respectively.

Every time the consume() or nextChar() is called, do note that we are moving the parsing to the next character and moving the position forward.

Factor

'Factor' is a unary expression, a string consisting of a leading unary add or unary subtract symbol, followed by numbers and decimals. This is the most basic unit of an math expression.

Examples: 9 , 9.99 , -5 , -1000.8, +888

In this section, I will explain the 'parseFactor' method and how it identifies and evaluates a factor.

Firstly, we check if there are any leading unary operators

if (consume('+')) {
        // unary additon
        return parseFactor();
}
if (consume('-')) {
        // unary sub
        return -parseFactor();
}
                
                    
If it does, we return the negated(actual value) by calling parseExpression recursively on the contents of the parentheses

Now we create a local pointer to store the real numerical value in Java as a double, and also store the starting position (to call the substring later).

                        double x;
int startPos = this.pos;

                    

Next, we assume that our final product, the 'parseExpression' method, will completely evaluate all math, and use it to evaluate anything within parentheses. After this evaluation, it will return the final result as a 'factor' in the data type 'double' (64-bit floating point), and move the 'current position' of the parsing to past the closing bracket of the parenthesis.

                        if (consume('(')) {
        // parentheses
        x = parseExpression();
        consume(')');
}

                    

With the code above, we can safely assume that all brackets are accounted for. Now, we assume that we are NOT working with brackets, as it has already been resolved. We try to get the unary expression and convert it to a 'double' in Java. This is done by pushing the 'current position' forward constantly using the 'nextChar()' method until the current character is not a numerical value (0-9) or the decimal symbol (.). The final position is then used with the initial position to obtain the substring, which is then parsed to a Java 'double' using Java's built-in library.

                        else if ((ch >= '0' && ch <= '9') || ch == '.') {
        // consume numbers
        while ((ch >= '0' && ch <= '9') || ch == '.') {
                nextChar();
        }
        x = Double.parseDouble(str.substring(startPos, this.pos));
}
                    
The x is then returned. This would give us a unary expression. Look at the animation I made below for a visual explanation of how this function works. The animation below will assume the current position can only be at the start of each unary expression. You will understand why this assumption is valid after the whole article.

Press any of the buttons to start the animation at that position.

-555 + 7.326 - 777 * -7.345
Check if its '0-9' or '.'

Terms

'Term' is an expression of factors multiplied or divided together.

Examples: 5*7*8, -7*6.5/3, 2/3, 0.2/-7 .

In this section, I will explain the 'parseTerm' method and how it identifies and evaluates a term. It uses 'parseFactor' and assumes it was successful. Do note that this assumes that there are no non-unary '+' and '-' symbols, as it is resolved at a higher-level.

Firstly, it executes 'parseFactor' once, setting a local variable to obtain that factor's value.

                    double x = parseFactor();
                    

Next, it looks for any multiply '*' or divide '/', to consume it. Thereafter, it executes 'parseFactor' to get the next factor (note the position is always pushed forward after consume or parseFactor is called). The value of the new factor is then used in multiplication or division with the prior variable, 'x'. When there are no more '*' or '/', the method returns 'x', a double with all the multiplication and division processes completed.

                        while (true) {
        if (consume('*')) {
                // multiplication
                x *= parseFactor();
        } else if (consume('/')) {
                x /= parseFactor();
        } else {
                return x;
        }	
}

                    

Here are some animations to show how 'parseTerm' works visually.

Press any of the buttons to start any of the animations.

888*999*100/370
Check if its '0-9' or '.'
x = 0

Expressions

'Expression' is any mathematical statement containing + - * /.

Examples: 5*7+5*8, , 2/-3+2, 0.2/-7+0.3+0.2 .

In this section, I will explain the 'parseExpression' method which evaluates the whole expression. It uses 'parseTerm' and assumes it was successful. Since 'parseTerm' calls 'parseFactor', and 'parseFactor' passes expressions in brackets back into this function, there is no need to consider bracket cases, as it will be resolved recursively.

Firstly, it executes 'parseTerm' once, setting a local variable to obtain that term's value.

                    double x = parseTerm();
                    

Next, it searches for any add '+' or minus '-', symbol and consumes it. Thereafter, it executes 'parseTerm' to get the next resolved term (note the position is always pushed forward after consume or parseTerm is called). The value of the new term is then added to or subtracted from the local variable, 'x'. When there are no more '-' or '+' symbols, the method returns 'x', a double with all addition and subtraction functions completed.

                        while (true) {
            if (consume('+')) {
                    // addition
                    x += parseTerm();
            } else if (consume('-')) {
                    // subtraction
                    x -= parseTerm();
            } else {
                    return x;
            }
    }
}

                    

Here are some animations to show how 'parseExpression' works visually.

Press any of the buttons to start any of the animations.

888*999*100/370
Check if its '0-9' or '.'
x = 0

Putting it together

A string is put through the parser, it firstly runs 'parseExpression', which immediately calls 'parseTerm'. 'parseTerm' then immediately calls 'parseFactor'. 'parseFactor' either finds a parenthesis expression and calls 'parseExpression' with the contents of the parentheses, or finds the first unary expression (or 'factor'), which in either case, would return a singular Java 'double', and pushes the 'current position' to the end of the parenthesis or 'factor'

Now, at 'parseTerm', since it has obtained the first factor, it repeatedly consumes '*' and '/', pushes the 'current position' past the 2 operator and executes another 'parseFactor' for each '*' and '/' it consumes after it consumes it. This continues until it finds non-unary addition and subtraction, as they are not considered factors, and are neither '*' nor '/', upon encountering those two symbols, parseTerm will return a double equal to the evaluation of the term.

Back at the top, 'parseExpression' now has a 'term' from the first 'parseTerm'. Thereafter, it repeatedly consumes '-' and '+', pushes the position past the 2 operators and executes another 'parseTerm' for each '+' and '-' it consumes. If there are only the 5 operators ('+', '-', '*', '/' and '()'), this would be the end of the expression.

Conclusion

A recursive parser simply splits grammar into levels, and arranges them in the order of resolution. Thereafter, if there is any recursive grammar syntax, (e.g. parenthesis), the lowest level can call the highest level to recursively solve it.