Into Java - Part XXI
You did not do much surfing with TinyBrowser last time, did you? I do not think so; reading pages with the HTML tags left in is not for everyone. Did anyone set out to implement their own HTML parser? Not a tag stripper (such an application has already been made by [Don Hawkinson]), but a parser. I suppose not, and of course someone else has already done that. Today we will see that one is already there on your computer.
Furthermore we will look at URLConnection, another class encapsulating common services for a web resource, such as asking for its size, date etc. In fact, the URL class sometimes uses URLConnection to do some tasks.
A better Java browser
There is not much to say this time as we will use ready-made classes from the standard Java and Swing libraries. Admittedly, details aren't always clearly spelled out in the Java API, and hence it is often hard to see how to make use of hidden resources. Most of the time you'll find information in Java newsletters, on Java web pages, in Java books or from friends.
Let us start with the constructor, of which some lines are familiar from earlier Into Java instalments.
This browser will not be multithreaded, as the one last time was. Of course you can easily add capabilities such as multithreading or remembering pages visited, but that would expand this article beyond what I want it to be.
The topmost area is the text field to enter URLs into. We add a listener to it so that a hit on Enter will make the browser grab the page. This time we make no use of such things as JOptionPane dialogs, but it is not hard to add one.
The next component is new to us, the JEditorPane. I will not say anything about it, other than that it is a powerful class and you are free to explore its capacity on your own. Anyway, as you can see it has one useful method, addHyperlinkListener, which we feed with another listener. Finally we add the GUI stuff to the frame.
Next come the two listeners. Until now we have always built our listeners from ready-made adapter classes, but we are perfectly free to make listeners on our own.
Making your own listener is not hard, as you can see. But it can be confusing to see two classes inside another class. This technique is called inner classes. Remember that every time we have used an adapter it has produced a class file named xxx$1.class, numbered sequentially. That is an anonymous inner class. This one differs slightly: it results in classes like Browser$URLListener.class, which is not that anonymous. "But", you may argue, "the name of the adapter inner class is WindowAdapter, isn't it?" No, actually WindowAdapter is the super class, and our anonymous class inherits from its ancestor and overrides one or more of the ancestor's methods.
Inner classes are used for convenience, not because they are really necessary. Any inner class may be implemented as a regular class, but not always as elegantly as this time, and sometimes it gets really cumbersome. The foremost reason to use inner classes is that they have full access to the instance that made them: any method or variable, whether it is declared private or not. This simple browser is an obvious example of that, as the listeners refer directly to the GUI parts of the Browser object. The only drawback concerns anonymous inner classes: they cannot declare a constructor of their own, so one that implements an interface cannot be given any parameters when it is created. Named inner classes such as our two have no such limit.
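To make the naming and the access rule concrete, here is a minimal sketch that is not part of the Browser source. The class InnerClassDemo, its field lastCommand and the listener name CommandListener are made up for illustration; only ActionListener and ActionEvent come from the standard library.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

// Hypothetical demo class, not taken from the Browser example.
class InnerClassDemo {
    // Private state of the outer object; both listeners below may touch it.
    private String lastCommand = "(none)";

    // Named inner class: the compiler writes InnerClassDemo$CommandListener.class.
    private class CommandListener implements ActionListener {
        public void actionPerformed(ActionEvent e) {
            // Direct access to a private field of the enclosing instance.
            lastCommand = "named:" + e.getActionCommand();
        }
    }

    ActionListener namedListener() {
        return new CommandListener();
    }

    // Anonymous inner class: the compiler writes InnerClassDemo$1.class.
    ActionListener anonymousListener() {
        return new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                lastCommand = "anonymous:" + e.getActionCommand();
            }
        };
    }

    String getLastCommand() {
        return lastCommand;
    }

    public static void main(String[] args) {
        InnerClassDemo demo = new InnerClassDemo();
        ActionListener named = demo.namedListener();
        ActionListener anonymous = demo.anonymousListener();

        // Fire an event by hand, just as a text field would on Enter.
        named.actionPerformed(
                new ActionEvent(demo, ActionEvent.ACTION_PERFORMED, "go"));

        System.out.println(demo.getLastCommand());
        System.out.println(named.getClass().getName());
        System.out.println(anonymous.getClass().getName());
    }
}
```

After compiling, list the directory and you should find one class file for the outer class and one for each of the two inner classes.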
Back to our listeners: the URLListener tells the JEditorPane to set a new page, and feeds that method with the URL you typed into the text field. The Java API says that the string given to setPage is taken as a URL, and that the content type of the resource decides which kind of editor kit, and hence parser, will be used. If the string is a valid URL the JEditorPane will determine what to do with it. Confusing? Yes, a bit. It is not that clear that a lot of stuff will be done automatically. Unfortunately the JEditorPane does not support the PNG image format, so it is hard to read the OS/2 eZine web site with this browser.
The next inner class implements the HyperlinkListener interface of the javax.swing.event package. That interface has only one method that we need to implement, the hyperlinkUpdate(HyperlinkEvent e) method. That is, the JEditorPane, or any of its subcomponents, can distinguish a hyperlink from other text, and if a link is clicked this inner class is notified. The code might look confusing, but please, view the Java API and some clouds will vanish.
If everything with the link clicked is okay, a new page will be set by the JEditorPane mechanism and the URL text field will also be updated to reflect the new page viewed.
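Put together, the two listeners might look roughly like the sketch below. It is not the original Browser listing: the class name BrowserPanelSketch, the field names, the listener name LinkFollower and the helper method show are assumptions of mine, and the components are placed on a JPanel that you would add to your own frame.

```java
import java.awt.BorderLayout;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;
import javax.swing.JEditorPane;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextField;
import javax.swing.event.HyperlinkEvent;
import javax.swing.event.HyperlinkListener;

// Hypothetical stand-in for the Browser class; all names here are illustrative.
class BrowserPanelSketch {
    private final JTextField urlField = new JTextField();
    private final JEditorPane pane = new JEditorPane();
    private final JPanel panel = new JPanel(new BorderLayout());

    BrowserPanelSketch() {
        // An editable pane does not report clicks on links.
        pane.setEditable(false);
        urlField.addActionListener(new URLListener());
        pane.addHyperlinkListener(new LinkFollower());
        panel.add(urlField, BorderLayout.NORTH);
        panel.add(new JScrollPane(pane), BorderLayout.CENTER);
    }

    // Loads the address into the pane and mirrors it in the text field.
    private void show(String address) {
        try {
            pane.setPage(address);
            urlField.setText(address);
        } catch (IOException ex) {
            pane.setText("Could not load " + address + ": " + ex.getMessage());
        }
    }

    // Runs when Enter is hit in the text field.
    private class URLListener implements ActionListener {
        public void actionPerformed(ActionEvent e) {
            show(urlField.getText().trim());
        }
    }

    // Runs for every hyperlink event; only a real click is acted upon.
    private class LinkFollower implements HyperlinkListener {
        public void hyperlinkUpdate(HyperlinkEvent e) {
            if (e.getEventType() == HyperlinkEvent.EventType.ACTIVATED
                    && e.getURL() != null) {
                show(e.getURL().toString());
            }
        }
    }

    JPanel getPanel() { return panel; }
    JTextField getUrlField() { return urlField; }
    JEditorPane getPane() { return pane; }
}
```

To try it, add the panel returned by getPanel() to a JFrame, set a size and make the frame visible. Both inner classes reach the private fields urlField and pane without any extra plumbing, which is the convenience discussed above.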
Let us finish with the driver method, lean as usual.
So far you have a better browser. There are still lots of features to add, and I suppose you will not do that. Still, the idea with JEditorPane, the HyperlinkListener interface, and how to make neat inner classes is exposed through this short example.
Most of the time you will use the splendid networking capabilities of Java for other reasons, such as connecting to internet resources, and you may not always want, or have the means, to download or read such "files". For that, Java gives you the URLConnection class, which enables you to easily get additional information such as file sizes, creation or update dates and so forth. More than that, if you are allowed to, that is if you have access to the host and the particular resource, you can set several properties using a URLConnection object. We will only look into getting information though.
Anyway, there are four steps we need to follow when using the obtained URLConnection object:
- Set properties (optional). This must be done prior to any output to the url you are communicating with.
- Connect to the resource using the connect() method. This step retrieves the resource's header information, whether you will query the resource for information or not.
- Query the resource using the different get-methods. If they are insufficient you may use the getHeaderField method, which comes in two flavours; we will use one of them soon. To query headers you need to know more about what the different protocol headers can tell you, and for that you have to read the relevant RFCs (Requests for Comments) that are published on the net.
- Finally you can get an InputStream using the getInputStream method. This is exactly the same stream as you get from the openStream method on a URL object, as we did last month.
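The four steps above can be sketched as follows. This is an illustration under my own assumptions, not the code of the enhanced Browser: the class name ConnectionStepsSketch, the method describe, the five second timeout and the choice of UTF-8 for reading are all made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Date;

// Hypothetical helper class walking through the four steps.
class ConnectionStepsSketch {

    static String describe(URL url) throws IOException {
        URLConnection conn = url.openConnection();

        // Step 1 (optional): set properties, always before connecting.
        conn.setUseCaches(false);
        conn.setConnectTimeout(5000); // assumed value, in milliseconds

        // Step 2: connect; the header information arrives here.
        conn.connect();

        // Step 3: query the resource with the ready-made get-methods.
        StringBuilder info = new StringBuilder();
        info.append("type=").append(conn.getContentType()).append('\n');
        info.append("length=").append(conn.getContentLength()).append('\n');
        info.append("modified=").append(new Date(conn.getLastModified())).append('\n');

        // Step 4: read the content itself; here only its first line.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                conn.getInputStream(), StandardCharsets.UTF_8))) { // UTF-8 assumed
            info.append("first line=").append(in.readLine()).append('\n');
        }
        return info.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java ConnectionStepsSketch <url>");
            return;
        }
        System.out.println(describe(URI.create(args[0]).toURL()));
    }
}
```

A length of -1 means that the resource did not report its size, and a modification date at the very start of 1970 means that no date was given.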
You cannot create a URLConnection with new: the class is abstract, and its constructor, which takes a URL object as a parameter, is protected and meant for subclasses only. Instead you get a URLConnection object from an existing URL object using the url.openConnection() method, and that is what we will do.
This time we will add to the Browser and use URLConnection whenever we want to connect and query the resource. As the Browser class is not multithreaded, this enhanced version will stay single-threaded since we want to keep it simple.
We will add two import lines, one new GUI feature, add to the inner class URLListener and finally add a new method. Let's start from the top.
We have to add the two bottom import lines, java.util and java.net; the latter gives us both URL and URLConnection. (StringBuffer, which we will use as well, lives in java.lang and needs no import of its own.)
Further we declare a JCheckBox to be used when we want to view the information on a resource. This check box is instantiated and added to the GUI as follows. The top and bottom lines are from the former Browser constructor, only the middle two lines are new.
So far so good. You can very well compile and see the outcome, but no feature is yet implemented. Let us start with expanding the inner class URLListener with a few lines.
Now we have added an if-else control, either the new check box is selected, or not. If it is selected we make a call to the new but not yet implemented method, feeding it with the URL written in the text input field. Otherwise we will use the former, ready-made implementation. And this is it. Let us move on to the new method.
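The branch can be sketched on its own like this. The class ModeSwitchSketch, the check box label and the two methods showInfo and showPage are hypothetical stand-ins; in the real Browser the first would be the new helper method and the second the existing call to setPage.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JCheckBox;
import javax.swing.JTextField;

// Hypothetical class showing only the if-else decision in the listener.
class ModeSwitchSketch {
    final JTextField urlField = new JTextField();
    final JCheckBox infoBox = new JCheckBox("Show resource info"); // label assumed
    String lastAction = "";

    ModeSwitchSketch() {
        urlField.addActionListener(new URLListener());
    }

    private class URLListener implements ActionListener {
        public void actionPerformed(ActionEvent e) {
            String address = urlField.getText();
            if (infoBox.isSelected()) {
                showInfo(address);   // check box ticked: header information
            } else {
                showPage(address);   // otherwise: view the page as before
            }
        }
    }

    // Stand-ins that only record which branch was taken.
    void showInfo(String address) { lastAction = "info:" + address; }
    void showPage(String address) { lastAction = "page:" + address; }
}
```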
This is only a helper method, hence it is private. As I said, it gets the URL text string as an argument. On the next line, a well known convenience string, the end-of-line character.
Next comes a new line, somewhat confusing. If, and only if, we have earlier used the JEditorPane to view a web page, its inner EditorKit is set to one that can handle HTML. That kit can handle plain text too, but not very well; comment out this line and watch. The solution is to set an EditorKit that can handle text/plain before adding text to the pane. And of course we will clear the pane.
We are used to the URL class from last month, but here we only use the URL object to get us a URLConnection object. At the same time the fresh connection queries for the header information. Let us look at the continued method.
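One way to do such a reset is sketched below; it may differ from the line in the actual method. I assume here that calling setContentType("text/plain") is acceptable, since it asks the pane to install the editor kit registered for that type. The class and method names are made up.

```java
import javax.swing.JEditorPane;

// Hypothetical helper, shown apart from the Browser class.
class PlainTextResetSketch {

    // Switches the pane back to a plain text kit and empties it.
    static void prepareForPlainText(JEditorPane pane) {
        pane.setContentType("text/plain");
        pane.setText("");
    }

    public static void main(String[] args) {
        JEditorPane pane = new JEditorPane();

        // Simulate the state left behind after viewing a web page.
        pane.setContentType("text/html");
        System.out.println("before: " + pane.getContentType());

        prepareForPlainText(pane);
        pane.setText("Content-Type: text/html");
        System.out.println("after:  " + pane.getContentType());
        System.out.println(pane.getText());
    }
}
```

Leaving out the call to prepareForPlainText and printing getText() shows what the HTML kit makes of the same string.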
We will start with a few lines with variable declarations and instantiation of a StringBuffer. Recall that a long time ago we discussed how much computer resources we save by not concatenating String objects, which is by far the most CPU-hungry way to deal with character arrays. Instead we will use a StringBuffer and append shorter strings to it. Finally we will, as seen at the bottom code line, get its content as a String and feed the JEditorPane object with that one.
Another reason is that JEditorPane does not have an "append" method as JTextArea does; there is only the setText method, which has to get the full text at once. The alternative would be worse: calling setText with concatenated strings that grow bigger and bigger until there is nothing more to add (I shudder at the thought).
Starting with the first header field line we append both its keyword and the line itself to the buffer. That loop continues until there is no more to read from the resource's header. While looping, nothing is added to the pane, but do not worry, the looping is done within hundredths of a second, because the header was already downloaded at connection time.
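A loop of that kind might look like the sketch below, which uses the indexed flavour of getHeaderField together with getHeaderFieldKey. The class HeaderDumpSketch and the method dumpHeaders are my own names, and the "key: value" layout is an assumption about how you might want to present the lines.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper that collects all header lines into one String.
class HeaderDumpSketch {

    static String dumpHeaders(URLConnection conn) throws IOException {
        String eol = System.getProperty("line.separator");
        conn.connect(); // ignored if the connection is already open

        StringBuffer buffer = new StringBuffer();
        for (int i = 0; ; i++) {
            String value = conn.getHeaderField(i);
            if (value == null) {
                break; // no more header lines
            }
            // For HTTP the first line is the status line and has no key.
            String key = conn.getHeaderFieldKey(i);
            if (key != null) {
                buffer.append(key).append(": ");
            }
            buffer.append(value).append(eol);
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java HeaderDumpSketch <url>");
            return;
        }
        URLConnection conn = URI.create(args[0]).toURL().openConnection();
        System.out.print(dumpHeaders(conn));
    }
}
```

The finished String is what you would hand to setText in one go. Any IOException is simply passed on to the caller.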
Surfing around watching resource header fields will disclose that there is no given order to the lines of a header, nor do all keywords need to be used. Some of them are used but do not carry a value. Also try to connect to non-existent resources, or resources you are not allowed to get to and so forth. That way you will see different replies. All of them must be parsed and taken care of by a full-blown browser, or whatever application you build, or it will malfunction sooner or later.
A final word on this method. As you can see we do not take care of any exceptions here; on the contrary, we pass them on to the method that called this one. Why? Only because that method has already promised to take care of exceptions. And why duplicate existing code? Lazy programming is nice sometimes.
Now we are done for today. Next time we will look at sockets, what a socket is and briefly how to use it. But we will also build ourselves a little multi-multi-threaded chat-server. Why a server before the client? Because a client needs to have a server, but a server does not necessarily need a client, although that seems boring. How will we solve that and keep it busy? Stay tuned.
I must finally admit that there has not been that much Java for me the last few weeks, since I have been taking an advanced C++ course that really tires me. I now realize that I love Java a lot more than I thought. For example, I estimate that a beginner would need about two days, with a lot of hints, to do today's exercise in C++. At least if it is going to be correctly made, with correct use of pointers, references, function pointers and so forth. Not forgetting the destructors. With Java less than one day is sufficient, with sparse hints.
Implementing today's example took me no more than three hours (including taking the screen shots and reducing their file sizes), and the only error I made was forgetting to import two packages when extending with the URLConnection part. Shame on me <grin>. I take that as an example of how easily things are done in Java; any experienced Java developer should be able to make such small applications within a few hours. But I really doubt I would have done this that fast in C++, by far "no!" Not even with a mere and humble terminal based application.
But admittedly, C++ is much faster on heavy, mathematical computations since you have (if you have) full control of what is taking place. And it starts a lot faster than the JVM does. But again, what is your purpose with the application? Fast implementation, or fast computations, or GUI, or secure applications not leaking memory? There are a lot more languages other than Java and C++ to choose from. Pick the one best suited for the stuff you are working with. But rest assured, Java will stay alive for many years to come.
The complete IntoJava archive (updated a few days after a new issue is published).