Into Java - Part XX
Last time we briefly discussed threads and made a little demo application. This time we will continue using threads and at the same time implement the tiniest web browser ever seen. The goal of this instalment is to introduce networking. One of the main reasons Java ever surfaced was its use on the Internet: not only its networking capabilities, but also the fact that Java programs are small and portable. These days Java has also matured into a very powerful platform for networking.
A Uniform Resource Locator is better known under the acronym URL. Java has a class java.net.URL that encapsulates a lot of mechanisms around the common URLs. A URL can be broken down into:
- protocol
- host
- port number
- path
- resource
- fragment
On the Internet there are several protocols to choose from. HTTP and FTP are the most commonly known, and these basic protocols are of course handled by Java.
Host is the name of the host, e.g. www.os2ezine.com. Some hosts use much longer host names such as www.software.ibm.com or service.software.ibm.com. The rightmost two parts of the string are used by the global DNS system to find an IP number for that organization. The parts to the left specify the resource within that organization, which is itself responsible for the exact IP number of that resource.
Port number is optional since each protocol has a default port; HTTP, for example, uses port 80. A different port can be given explicitly, e.g. http://www.ncsa.uiuc.edu:8080/demoweb/url-primer.html, where port 8080 is used instead of 80.
Path is the path relative to the resource. That is, an external user never knows the exact path within the host system but the relative path from the resource's base directory.
The resources are typically web pages, but can in fact be almost any object provided by the host: JSP pages, Java servlets, CGI scripts, multimedia streams, database queries, etc.
Optionally a fragment may be added to the resource, e.g. index.html#download where download is a fragment added to the resource delimited by the sharp sign.
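The path and resource (plus any query string) are together considered a "file" by the URL class and are returned by getFile. The fragment is kept separately and is returned by getRef.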
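To see how the URL class splits a URL into these parts, here is a minimal sketch. The class name UrlParts and the helper describe are made up for illustration, and the sample URL combines the NCSA address above with the #download fragment purely as an example. On recent JDKs the string constructor is marked deprecated, but it still works.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class, not part of the tiny browser itself.
class UrlParts {
    /** Returns the parts of a URL on one line; rejects strings that are not URLs. */
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return "protocol=" + url.getProtocol()
                 + " host=" + url.getHost()
                 + " port=" + url.getPort()       // -1 when the URL names no port
                 + " path=" + url.getPath()
                 + " file=" + url.getFile()       // path plus query, without the fragment
                 + " ref=" + url.getRef();        // the fragment after '#', or null
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(
            "http://www.ncsa.uiuc.edu:8080/demoweb/url-primer.html#download"));
    }
}
```

Note that getPort returns -1 when no port is written in the URL; getDefaultPort then tells you which port the protocol will actually use.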
The Java URL class can be constructed from a single string holding a complete URL, which is the easiest way. If you are getting several resources from the same host, you had better keep one base URL object and construct a new object for each resource with the URL(URL context, String spec) constructor.
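A small sketch of the base-URL approach follows. The class RelativeUrls, the method resolve and the paths under www.os2ezine.com are assumptions chosen only to show how a spec is resolved against a context.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical example class; the paths used below are invented.
class RelativeUrls {
    /** Resolves spec against base with the URL(URL context, String spec) constructor. */
    static String resolve(String base, String spec) {
        try {
            URL context = new URL(base);
            return new URL(context, spec).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://www.os2ezine.com/archive/index.html";
        System.out.println(resolve(base, "pix/logo.gif"));  // relative to the base directory
        System.out.println(resolve(base, "/pix/logo.gif")); // relative to the host root
    }
}
```

A spec without a leading slash is resolved against the directory of the context, while a spec with a leading slash replaces the whole path.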
Let us look at the methods of URL. Many of them only return the parameters of the URL object, but openStream seems interesting, as does openConnection (that we will investigate next instalment) as well as getContent which is a killer.
The tiny browser
Recall the humble file reader we implemented in the fourth instalment? Now we shall make it over and freshen it up to be fully GUI driven and capable of browsing the web. But as always we start from the bottom, which is the driver class. First the constructor:
The constructor takes no parameter and the first part is known from former instalments. The only new part is how we make use of a tiny ini file. Note that the ini file has been saved as an object stream that we can read from to directly create the size and location. Then why do we create a dimension for the size and a point for the location above that ini-file reading block? Because the first time there is no ini file, or maybe it got corrupted. Either way setSize and setLocation need valid values.
I do not recommend this kind of hard-coded ini file though; it is hard to extend and prone to other errors. Better are key-value pairs that can be parsed, e.g. two objects in conjunction: String:"location" Point:[100,20], etc. I do still recommend object files, since they are fast and easily used without unreadable and muddled code.
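The defaults-first pattern described above might be sketched like this. The class name IniLoad, the file name tinybrowser.ini and the default size of 600 by 400 are assumptions; only the idea of reading a Dimension followed by a Point from an object stream comes from the text.

```java
import java.awt.Dimension;
import java.awt.Point;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Hypothetical stand-in for the ini-reading part of the frame's constructor.
class IniLoad {
    // Defaults are set first so that valid values always exist.
    Dimension size = new Dimension(600, 400);
    Point location = new Point(100, 20);

    IniLoad(File iniFile) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(iniFile))) {
            // The order must match the order used when the file was written.
            Dimension savedSize = (Dimension) in.readObject();
            Point savedLocation = (Point) in.readObject();
            size = savedSize;          // only overwrite once both reads succeeded
            location = savedLocation;
        } catch (IOException | ClassNotFoundException | ClassCastException e) {
            // Missing or corrupted ini file: silently keep the defaults.
        }
    }

    public static void main(String[] args) {
        IniLoad ini = new IniLoad(new File("tinybrowser.ini"));
        System.out.println(ini.size + " at " + ini.location);
    }
}
```

In the frame itself the two values would then be handed to setSize and setLocation.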
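One way the key-value idea could look is sketched below, assuming each key is written as a String immediately followed by its value object. The class KeyValueIni and its methods are invented for this illustration, and byte arrays are used instead of a file to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical key-value object file: String key, then value, repeated.
class KeyValueIni {
    static byte[] write(Map<String, Object> settings) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Map.Entry<String, Object> entry : settings.entrySet()) {
                out.writeObject(entry.getKey());
                out.writeObject(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Map<String, Object> read(byte[] data) {
        Map<String, Object> settings = new LinkedHashMap<>();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            while (true) {                       // ends with EOFException
                String key = (String) in.readObject();
                settings.put(key, in.readObject());
            }
        } catch (EOFException endOfData) {
            // Normal end of the stream: all pairs have been read.
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Corrupted settings", e);
        }
        return settings;
    }
}
```

With this layout the reader can look up "location" or "size" by name, so new settings can be added later without breaking old files.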
We continue with the doExit method that is called at exit, as seen in the window adapter above. And we look at the tiny main method at the same time.
The doExit method is the counterpart to the ini reader: it collects the frame's size and location and saves them to an object stream. With a key-value concept I would create the key string for each value and save it followed by the value.
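The saving half might be sketched as follows. The class IniSave and the method save are assumptions; in the real frame the two arguments would come from getSize and getLocation, and doExit would finish with System.exit(0).

```java
import java.awt.Dimension;
import java.awt.Point;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for the saving part of doExit.
class IniSave {
    /** Writes size, then location, as an object stream; returns false on failure. */
    static boolean save(File iniFile, Dimension size, Point location) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(iniFile))) {
            out.writeObject(size);      // same order as the reader expects
            out.writeObject(location);
            return true;
        } catch (IOException e) {
            System.err.println("Could not save settings: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        File ini = new File(System.getProperty("java.io.tmpdir"), "tinybrowser.ini");
        System.out.println("saved: " + save(ini, new Dimension(600, 400), new Point(100, 20)));
    }
}
```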
That was the outermost class. Now we will look into the BrowserPane that is added to the frame's content pane.
First there are a few private member variables, of which thread may be the most interesting one this time. We will come to it in the next class. The rest of the constructor adds to the bloated GUI <grin>. And as you see the URL input text field and the STOP button both get an actionListener, the object itself. (Note that the image to the right is highly compressed to save space.)
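A pane of this kind could be laid out roughly as below. This is only a sketch under assumptions: the class name PaneLayoutSketch, the field names and the choice of BorderLayout are mine, and the action handling is left empty here.

```java
import java.awt.BorderLayout;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.JTextField;

// Hypothetical layout of a browser pane: URL field and STOP button on top, text below.
class PaneLayoutSketch extends JPanel implements ActionListener {
    final JTextField urlField = new JTextField("http://");
    final JButton stopButton = new JButton("STOP");
    final JTextArea textArea = new JTextArea();

    PaneLayoutSketch() {
        super(new BorderLayout());
        JPanel top = new JPanel(new BorderLayout());
        top.add(urlField, BorderLayout.CENTER);
        top.add(stopButton, BorderLayout.EAST);

        stopButton.setEnabled(false);        // nothing to stop yet
        textArea.setEditable(false);

        // Both components report to this object.
        urlField.addActionListener(this);
        stopButton.addActionListener(this);

        add(top, BorderLayout.NORTH);
        add(new JScrollPane(textArea), BorderLayout.CENTER);
    }

    @Override
    public void actionPerformed(ActionEvent e) {
        // Intentionally empty in this layout sketch.
    }
}
```

An instance would be added to the frame's content pane, for example with getContentPane().add(new PaneLayoutSketch()).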
Let us continue with the methods needed.
The stream reader will be located in the thread, hence we need a way to append text to the text area. And since the lines read from the stream are stripped of their EOL characters, we add that again, a system dependent one.
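An append method along these lines is sketched below. The names AppendSketch, withEol and appendText are assumptions. Since the call arrives from the reader thread, this sketch hands the actual update to the event dispatch thread with SwingUtilities.invokeLater, which is my own precaution and not necessarily how the original listing does it.

```java
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;

// Hypothetical helper showing how a stripped line gets its EOL back.
class AppendSketch {
    private static final String EOL = System.getProperty("line.separator");

    final JTextArea textArea = new JTextArea();

    /** Re-attaches the system dependent end-of-line that readLine removed. */
    static String withEol(String line) {
        return line + EOL;
    }

    /** Safe to call from the reader thread; the update runs on the event thread. */
    void appendText(final String line) {
        SwingUtilities.invokeLater(() -> textArea.append(withEol(line)));
    }
}
```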
Since we implement the ActionListener interface, we have to add the method actionPerformed. Two objects can send action events to this method, the URL input field and the stop button. The URL input field enables the so far disabled stop button, clears the text area and creates the thread object. That is it. When all that is done this object's thread will return and wait for user input, no matter what the other thread does.
The stop button simply stops the other reading thread and disables itself.
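The two branches of actionPerformed could be sketched like this. The class ActionSketch and the field requestedUrl are hypothetical; requestedUrl merely stands in for creating and stopping the reader thread, so that the sketch stays free of networking.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;
import javax.swing.JTextArea;
import javax.swing.JTextField;

// Hypothetical listener: one method serving both the URL field and the STOP button.
class ActionSketch implements ActionListener {
    final JTextField urlField = new JTextField("http://");
    final JButton stopButton = new JButton("STOP");
    final JTextArea textArea = new JTextArea();
    String requestedUrl;                       // stand-in for the reader thread

    ActionSketch() {
        stopButton.setEnabled(false);
        urlField.addActionListener(this);
        stopButton.addActionListener(this);
    }

    @Override
    public void actionPerformed(ActionEvent e) {
        if (e.getSource() == urlField) {
            stopButton.setEnabled(true);       // loading can now be stopped
            textArea.setText("");              // clear the previous page
            requestedUrl = urlField.getText(); // the real pane creates the reader here
        } else if (e.getSource() == stopButton) {
            requestedUrl = null;               // the real pane tells the reader to stop
            stopButton.setEnabled(false);
        }
    }
}
```

Because the method returns at once in both branches, the event thread is free again to react to user input while the reading goes on elsewhere.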
With this class done we continue with the threaded reader.
First we import the necessary packages, including java.net. The swing package is only used to raise an error dialog if needed. The constructor tries to create a URL object from the URL string given. Note that we have to write fully qualified URL names, including http://. You are free to implement tests to solve that issue.
This class implements the Runnable interface and hence has to instantiate its own thread. (This time we could have implemented this class to extend Thread as well; I chose this way only because most of the time we extend other classes and add the thread to that.) We let the fresh object start itself. Recall that the start method asks the JVM to call the run method on the new thread.
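One simple test of that kind is sketched here: if the input contains no protocol, http:// is assumed. The class UrlInput and its methods are invented for this example, and returning null marks the spot where the reader would raise its error dialog.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical input helper; the check for "://" is deliberately naive.
class UrlInput {
    /** Prepends http:// when the user typed no protocol at all. */
    static String withProtocol(String input) {
        String trimmed = input.trim();
        return trimmed.contains("://") ? trimmed : "http://" + trimmed;
    }

    /** Returns the URL, or null where the reader would show an error dialog. */
    static URL toUrl(String input) {
        try {
            return new URL(withProtocol(input));
        } catch (MalformedURLException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toUrl("www.os2ezine.com"));
    }
}
```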
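The self-starting pattern can be shown in isolation. SelfStarting, await and isFinished are made-up names, and run only sets a flag where the real reader would load the page.

```java
// Hypothetical Runnable that creates and starts its own thread.
class SelfStarting implements Runnable {
    private final Thread thread;
    private volatile boolean finished = false;

    SelfStarting() {
        thread = new Thread(this);  // this object supplies the run method
        thread.start();             // last statement: run() is now called on the new thread
    }

    @Override
    public void run() {
        finished = true;            // the real reader would read the stream here
    }

    /** Waits until the thread has died naturally. */
    void await() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isFinished() {
        return finished;
    }

    public static void main(String[] args) {
        SelfStarting worker = new SelfStarting();
        worker.await();
        System.out.println("finished: " + worker.isFinished());
    }
}
```

Starting the thread must be the last thing the constructor does, since run may begin before the constructor has returned.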
As with the former file reader we need a stream reader, preferably a buffered reader. From the URL object we ask for the stream to read from, which we wrap with the buffered reader.
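Wrapping the URL's stream might look like the sketch below. To keep it runnable without a network connection, the demo method reads from a temporary local file through a file: URL; the class OpenStreamSketch and both method names are assumptions.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;

// Hypothetical example of url.openStream() wrapped in a BufferedReader.
class OpenStreamSketch {
    /** Opens the URL's stream and returns its first line, or null if it is empty. */
    static String firstLine(URL url) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            return in.readLine();
        }
    }

    /** Network-free demonstration using a file: URL. */
    static String demo() {
        try {
            File temp = File.createTempFile("tinybrowser", ".txt");
            temp.deleteOnExit();
            try (FileWriter out = new FileWriter(temp)) {
                out.write("hello from a local file\n");
            }
            return firstLine(temp.toURI().toURL());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The same firstLine call works unchanged with an http: URL, since openStream hides which protocol delivers the bytes.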
Exactly as we are used to, we continue with reading new lines until no more lines are read and the stream is finished. Each turn in the while loop we append the text to the tiny browser's text area. As soon as the stream is read, we close the stream and we leave the run method causing this thread to die naturally.
If the stop button was pressed, we see that the boolean continueLoading is set to false and will break the while loop as well, the stream closes and the thread dies.
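The reading loop with its stop flag could be sketched as follows. The flag name continueLoading is taken from the text; the class ReadLoopSketch, the StringBuilder standing in for the text area, and the Reader parameter are assumptions that make the loop easy to try out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical read loop; in the browser this body would live in run().
class ReadLoopSketch {
    // volatile, since the flag is set by one thread and read by another
    private volatile boolean continueLoading = true;
    private final StringBuilder received = new StringBuilder(); // stand-in for the text area

    /** Called when the stop button is pressed. */
    void stop() {
        continueLoading = false;
    }

    /** Reads lines until the stream ends or stop() has been called, then closes it. */
    void readAll(Reader source) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while (continueLoading && (line = in.readLine()) != null) {
                received.append(line).append('\n');
            }
        } catch (IOException e) {
            received.append("[read error: ").append(e.getMessage()).append("]\n");
        }
        // Leaving this method lets the thread die naturally.
    }

    String text() {
        return received.toString();
    }

    public static void main(String[] args) {
        ReadLoopSketch reader = new ReadLoopSketch();
        reader.readAll(new StringReader("first line\nsecond line\n"));
        System.out.print(reader.text());
    }
}
```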
This way we construct a new threaded reader every time we load a new URL. Of course we can do better with the system resources, but as threads are rather cheap to instantiate and we must create a new URL object anyway, I do not consider that much to talk about. There are worse things to avoid.
Have a nice time surfing.
Next time we will see a much more elegant web browser made much easier than this one, but harder to understand. And we will dig into the URLConnection class. CU around.
The complete IntoJava archive (updated a few days after a new issue is published).