Into Java - Part V

by Simon Grönlund

Up to this point we have rushed through quite a few basic topics on both Java and Object Oriented Programming (OOP). That is, we may now build ourselves apps with a GUI, or at least with a frame since we do not know how to interact with the stuff within the frame yet. And that will wait for a while because I would like to pace down and discuss some commonly used classes and some programming techniques.

Today we will continue with the MyFirstFrame from the previous column. Hence, please make yourself a new folder and copy MyFirstFrame.java into this folder. We will make it Find/Replace enabled, but only through the terminal window. This will guide us through the commonly used String and StringBuffer classes, point us to some pitfalls and introduce us to another programming concept, using recursion.

Recursion vs. Iteration
The easiest way to grasp programming is to think of some task being done one step after another. You may for example Obviously this is an iteration of a few lines taking place. Line 2 and 3 can of course be both lengthy and complicated, but it is still an iteration. Most people has no problem to see how the program above works, mainly it is the application we made ourselves in the February Java column
 * 1) open a new text file
 * 2) read one line from the file
 * 3) add the line to a text area
 * 4) repeat from No 2 as long as there is more lines
 * 5) close the file
 * 6) do some finishing tasks

It is not always that simple to write a computer program, many times we must break the task apart into simpler problems, but still similar to the first task. Think of a long text line and you would like to find a set word, e.g. "Java". You may let the method indexOf (in the String class) do the workload, it will let you know both if the word exist in a certain line and where the first occurrence is. But note, it will only tell you the first occurrence.

Then the iteration above is no good, since if you ask for an occurrence and is given one, how do you ask for the next one on the same line in an easy way? If there is another one? It can be solved of course, but not elegantly.

Recursion is the answer! That is, the method will call itself repeatedly if necessary, down to a point that is called "the trivial case". What is the trivial case? That may differ, but in the "find the word" case it simply is the answer "No", there is no more occurrence of "Java". Other times it is more tricky, but there is always a trivial case, else the recursion (the re-call to itself) will never end.

So, in our case, if we found an occurrence of "Java" we simply call the method again with the rest of the text line. And if there still is another occurrence another recursion will take place with the new remainder of the line and so forth. If there is no further occurrence, the method will simply go on with the line after the recursive call. That is, it will return from the trivial case.

The return
Let me stress that any method will "return" to the point it was called from. Look at the MyFirstFrameDriver example. Every call to the frame.readFile(...) will finally be completed and the "thread" comes back to the do/while loop for another turn. Since the while loop ends directly after the frame.readFile(...) call, it will start from the beginning again with the print command. do { try { System.out.print("Specify a valid file name: "); /* send user input to readFile */ frame.readFile(in.readLine); } catch (IOException e) {} // some IO error is ignored } while(true); // loops until the window is closed If recursion is used it might look like this pseudo code public DataType myMethod(DataType input) { if ( trivial case ) { return input; } else { newInput = ( recomputed input ); temp = myMethod(newInput); return temp; }    }

If the trivial case is found immediately the method will simply return as usual. But imagine that first a recursive call was made (let us call it B), and then another one (called C), to what point will the method that is now called three times return?

Obviously the last call to myMethod will return from the trivial case (line 3) back to C. But C did the last call from line 6 that says temp = myMethod(newInput); Obviously we will continue with line 7 and return to the one called C, that was B.

Back to line 6 in B then and on to line 7 that will return to the one called B, that was the original caller. And in the original caller of myMethod we will continue with the next line.

Which value was then returned this somewhat lengthy way? Imagine we got the value any from the trivial case, then any was returned into C and any was assigned to temp that seems to be the variable returned in C at line 7. Hence any is returned from C and is in turn assigned to temp in B, thus B will return any to the original caller.

Obviously the recursion might be used to compute most problems in a top-down style. But to implement these methods we have to analyse the problem carefully and start from bottom-up and both find the trivial case and start with it. Then we may continue with the different cases to consider. In a future column we will use recursion in a few ways. Today we will use recursion as mentioned above, a simple re-call with the remainder of the line and no return value. Even with no return value we will go back in the same style, from C to B to the original caller.

A recursion pitfall
In spite of working this elegantly recursion still causes some overhead since the CPU has to "remember" quite a few internal states and stuff, as to know which way to go back and reinstate everything as it looked like at the time of the recursive call. Thus a rule of thumb is "avoid recursion when there is an obvious solution by iteration." -- Algorithms + Data Structures = Programs, by Niklaus Wirth.

The Find & Replace
{| width="25%" cellpadding="5" align="Right" bgcolor="Silver"

final
The final modifier prohibits any later change to the class or variable.

Variables that do not change, like gravity, speed of light etc., are often declared final and if you like them to stick around in all instances of the class, declare them static final.

Classes that contain an immutable value, like strings, are final and you cannot change their values after creation.
 * }

Today we will add to the code we used last time. I found it convenient to change the class names to FindAndRepl and FindAndReplDriver. Remember to change the names within the class to reflect the new file names. Or, better, make the FindAndRepl.java extends the MyFirstFrame class. Since we will use the two classes String and StringBuffer, please look them up in your Java API and briefly acquaint yourself with them.

String vs. StringBuffer
Since the String class is final we cannot alter the string contained within a String object after its creation. A common mistake by programmers is to write this kind of code Obviously they believe that the temp</tt> and EOL-mark (end-of-line) will be added to str</tt>, but that is not the case. Actually every loop will instantiate a new str object, and let the old object be a case for the garbage collection. The new object will upon creation be a compound of the old str</tt>, the temp</tt> and the EOL. Is that bad then? Yes, since creation of new objects causes CPU overhead.

The correct, and a lot faster, code shall make use of StringBuffer. Let str</tt> be declared as a StringBuffer and then use the method append</tt>. StringBuffer str; String temp; while ((temp = bf.readLine) != null) { str.append(temp + "\n"); }    temp = str.toString; // convert back to String Any time you plan to have a string that shall be altered with or added to, use StringBuffer. But as you shall see, sometimes the most convenient methods to make use of resides in String and then we have to swap between String and StringBuffer. That is, as stated before, planning by pencil saves you time and sweat.

The Find & Replace Methods
In our renamed FindAndRepl class we have to add a new method, that we may name findString. It is possible to add it anywhere within the class boundary, so let us add it after the  readFile</tt>. Further, make it a nice habit to comment your code right away. If you can state the behaviour of your method in a few easily read lines, then that is a sign of you have probably found the right algorithm. If it is not an easily understood comment, your code is most likely not working as expected, as said: "the words dimly spoken, are the ones dimly thought"*. In fact, quite a few programmers write the comments first, to see what is expected from the methods, and later implement them from the comments. /*     * Get the document and finds the occurrences of the * the specified string. Second argument is the optional * replace String. This method calls a helper method. */    public void findString(String find, String replace) { String doc = text.getText; doc = helpFind(doc, find, replace); text.setText(doc); } We plan to call this method with two arguments, the string to find and the replacement string, both are objects of String type. The method will return nothing since it operates directly on the JTextArea through  text</tt>.

Fetch the text displayed in the text area as a huge (depends on the size of the displayed text) String object, assign it to the String variable doc</tt>.

Next statement is very common when using recursion, we call another method, a helper method. This time we do that since the helper method will make use of three arguments, the document to search, the string to find and the replacement string. It is the document to search that is the clue to get this recursion to work.

Initially the document will contain the complete text displayed in the text area. But if we find the string searched for, we will re-call the helper method with the remainder of the document, not the complete one. At last we will find no more occurrence of the string to find and that will be the trivial case that is returned from.

We cannot use findString</tt> to make re-calls to, since it is not possible to call it with the remainder of the document.

Imagine the helpFind</tt> works as expected, it returns the complete document with the found strings marked or the replacements done. Then, assign the return value to doc</tt> and set that altered document as the text displayed at the text area. And this method is done.

The helper method
Again, a helper method is very commonly used when working with recursion, but not only then. Whenever you find repeated code it is time to think about using a helper method, duplicated code shall be removed and hidden in such a helper method. Further such methods may help not only one but several methods within your classes.

Since a helper method normally is not called from outside the class we may specify it private</tt> and thus hide it. But it is good to ponder over private, protected</tt> or public</tt>, since the visible scopes are not the same.

Here comes the code: /*     * A helper method (hence private) that will do the * recursion. It is called from findString. If no replacement * argument is given, that is 'replace' is null, this method * will replace the found string with itself uppercased. */    private String helpFind(String txt, String fnd, String rpl) { int fIndex = txt.indexOf(fnd); if (fIndex < 0) { // no more occurrences of fnd is found return txt; // the trivial case } else { // index points to the first occurrence of fnd /* gets the first part (the substring) of the text, * up to the string searched for */ StringBuffer buff = new StringBuffer(txt.substring(0, fIndex)); /* Here comes the replacement */ if (rpl.equals("")) { // no replacement argument buff.append(fnd.toUpperCase); } else { buff.append(rpl); }            /* Here comes the recursive call, with the remainder * of the given string txt. Now buff will hold * the "middle value" of the text, that is from the * former index+length_of_fnd and up to the index of             * this turn in helpfind', and appended will be              * the value returned from the recursive call. */            /* Note that it is possible to break statements * almost at any place of your choice, Java is very * forgiving. It is still one statement, ending * with a semicolon. Note the matching pairs of             * parentheses as well. */            buff.append(                 helpFind(txt.substring(fIndex + fnd.length), fnd, rpl)                ); /* Now buff holds the same middle value and the * remainder of the text appended to it. */            return buff.toString; }    } First we will test if it is the trivial case, that is if  indexOf </tt> returned -1 which is "the string is not found". If so, return the string since it is the remainder and have to be appended upon the former part(s) of the complete document.

If no occurrence is found in the very first instance, the helper method will return immediately to the<tt> findString </tt> method.

But imagine we had the document "My very first test of this very fine recursive lesson will end now." And we are looking for "very" and there is no replacement string given. What will happen? The <tt>fIndex</tt> will be 3, since "very" starts at character number 3 of the string. Remember index 0 (zero) is counted as the first character. Hence we dive into the else clause: Since we have used the first part of the sentence, including the string to look for, we would like to send the remainder of the string, that is found from index <tt>fIndex</tt> plus the length of the <tt>fnd</tt> (since we do not like to get that part added again). Further we use the very same arguments again,<tt>fnd</tt> and <tt>rpl</tt>, since they will be the same arguments throughout the operation. The <tt>fIndex</tt> will be 20, since "very" starts at that index of the actual txt. The<tt> fIndex </tt>will be -1 since no further occurrence of "very" is found, that is "the trivial case". The if clause will be true and we return<tt> txt </tt>unchanged.
 * 1) The <tt>txt</tt> will be the complete sentence
 * 1) The StringBuffer <tt>buff</tt> will be "My ", since<tt> substring </tt> will take anything from character zero up to, but not including,<tt> fIndex</tt>.
 * 2) Since <tt>rpl</tt> equals the empty string, no replacement text was given, we will append to buff the fnd string, turned to upper case so we can see it found.
 * 3) The <tt>buff</tt> is now the "My VERY" string, held as a StringBuffer object.
 * 4) Now we will append the result from another call to<tt> helpFind</tt>. What are the arguments to use?
 * 1) The <tt>txt</tt> is now " first test of this very fine recursive lesson will end now."
 * 1) The new StringBuffer buff will be " first test of this ".
 * 2) "very" is turned into upper case.
 * 3) The <tt>buff</tt> is now the " first test of this VERY" string.
 * 4) Append the result from another call to <tt>helpFind</tt> with the remainder of txt.
 * 5) The<tt> txt </tt>is now " fine recursive lesson will end now."
 * 1) Now we can go backwards, from 3 to 2d, that will append the unaltered string to<tt> buff</tt>, that then will look like " first test of this VERY fine recursive lesson will end now." (Compare with 2c)
 * 2) One step further, from 2d we go to the conversion of the StringBuffer to String and return the result into 1d. The resulting buff in 1d will look like "My VERY first test of this VERY fine recursive lesson will end now." (Compare with 1c)
 * 3) The last step will be from 1d back to the original caller,<tt> findString</tt>, after a conversion of<tt> buff </tt>to a String.
 * 4) Finally, in<tt> findString </tt>we replaces the value in<tt> doc </tt>with the new result and put it visible in the text area.

I hope this little step-by-step guided tour will help you grasp the idea behind recursion. And that you have got some ideas on how to use the String and StringBuffer classes.

Only one thing remains. How do we get it to work?

First of all, add another line to this class: import java.awt.*; Then, turn to the FindAndReplDriver.java and add a few lines to it The lines give you an opportunity to enter an optional search string and an optional replacement string. Then <tt>findString</tt> is called with those arguments. Note, you can see the result from reading the file in the window right away, and after you entered the optional arguments you will see the next result.

The if clause may contribute to your confusion, what are we asking for really? We would like to dive into the if-block if, and only if, anything was entered and assigned to <tt>f</tt>. But it looks like we ask for the opposite, that <tt>f</tt> is the empty string. Yes, we are, but note the little exclamation mark, '!'. An exclamation mark works as a logical not that turns false into true and turns true into false. So whenever <tt>f</tt> is not equal to the empty string, we turn that false into true and, voila, the clause behaves our way.

You may compare this use of the exclamation mark with the != as opposed to ==.

Further, note that we do not test if <tt>(f == "")</tt> that will always return false. That is because that test tests if the object <tt>f</tt> has the same reference as <tt>""</tt> has. The method <tt>equals</tt> in the String class will compare both string objects character by character. Compare with this code snippet String str = "Simon"; String txt = "Simon"; if (str == txt) { /* this will never happen */ }    txt = str; // second "Simon" is discarded if (str == txt) { /* this is true */ } Since the first two objects are two different objects with different references, they will not be located at the same place in your computers memory, and hence the references are not equal. The internal state of the object is not compared, but <tt>equals</tt> will dig into the inner state of them both.

Later we assign the <tt> str </tt> reference to<tt> txt</tt>, and then their references may be compared. Because the one reference, held by two variables, refers to only one object, the variables are of course equal with respect to the state as well. And, "Yes!" I know this seems stupid! But believe me, both the mistake with comparisons is very common, and very often the comparison of references is needed. Hence, commit the two different comparison styles to memory, please.

User interaction
To be able to do consecutive searches on the former result, change the first <tt>if</tt> to a <tt>while</tt> loop. But then we have to add another inquiry for an optional string to find. The final code for today will look like

Summary
Recursion may be a most powerful tool, but as the careful reader may have noticed, this find&replace lesson could easily have been done using iterative coding style. Method <tt>indexOf(String str, int fromIndex)</tt> can be helpful then. So, what algorithm will you use then? Pick the one easiest to code, unless you have a good argument why not to do that.

String and StringBuffer are two classes heavily used in almost every application, please take your time and study the Java API carefully.

Complete code to FindAndRepl.java and to FindAndReplDriver.java

* If correctly translated.