In the source code directory you will find a directory called swerve that contains the source of the server and some simple test cases. For details on the source code see the section called The Organisation of the Code in Chapter 9. To build the server just type make in this directory. This will run the SML compiler, assuming that it is in your PATH.
The resulting heap and executable files can be found in the main subdirectory. You will need to edit the main/swerve script to set the path variables. The libdir variable is the absolute path where the heap file is found. This will be the main directory until you install the server somewhere else. The smlbin variable is the absolute path where you have installed the SML/NJ binaries.
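For example, the two assignments might look something like this after editing (the paths shown are only placeholders for your own installation):

```sh
# Settings in the main/swerve script -- substitute your own absolute paths.
libdir=/home/you/swerve/main         # directory containing the heap file
smlbin=/usr/local/smlnj/bin          # directory containing the SML/NJ binaries
```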
There is a server configuration in the tests/tree_norm subdirectory. This contains some test cases for manual testing of the server. (The "norm" refers to testing the normal behaviour of the server). The configuration file is norm.cfg. Before you can run the server you must edit the ServerRoot parameter to be the absolute path to the tree_norm directory. You may want to change the port number too.
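For example, assuming the same `Name = value;` parameter syntax used in the other configuration files in this chapter, the edited line might read as follows (the path is only a placeholder; check the existing line in norm.cfg for the exact form):

```
ServerRoot = /home/you/swerve/tests/tree_norm;
```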
The test tree implements the notable URL paths shown in Table 8-1.
Table 8-1. The Notable Norm URLs.
Path | Purpose |
---|---|
/ | This returns index.html in the htdocs directory. |
/secure | This returns index.html in the htdocs/secure directory. This is for testing user authorisation. |
/sub | This returns a list of the files in htdocs/sub directory. Selecting page.html tests some image handling on a complex page. |
/hw | This runs a built-in handler that just returns a "hello world" message. |
/sleep?3 | This runs the "sleep" built-in handler that delays for the number of seconds given in the query. This is used to test simultaneous requests to the same handler. |
/icons | The fancy indexing of directories refers to icons with this path. The icons are in the kit/icons directory. |
/tmp | This path points to the /tmp directory on your system. On mine, having run KDE and sometimes Gnome, there are lots of weird files to exercise the fancy indexing of the server. |
/cgi/env | This path runs the cgi-bin/printenv shell script. This is the traditional example CGI script that prints all of the environment variables that the server passes to CGI scripts. |
/cgi/test-post | This runs the cgi-bin/test-post shell script. This is similar to printenv except that it echoes stdin so that posted form data can be seen. |
/cgi/form-handler | This runs the cgi-bin/form-handler Perl script. This script runs a simple form that asks for your name and echoes it. |
/cgi/line-counter | This runs the cgi-bin/line-count Perl script. This is used to test the uploading of a text file. The script returns the number of lines in the file. |
To start the server go to the tests/tree_norm directory and use the run script. The configuration file sets the log level to Debug. The debug messages will appear in the var/errors log file. There are enough for you to trace the operation of the server. Note that the file continues to grow over successive runs of the server unless you delete it.
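A typical session looks like this, assuming the run script is executable; the exact invocation is in the script itself:

```sh
cd tests/tree_norm
./run                     # start the server with norm.cfg
tail -f var/errors        # in another terminal, watch the debug messages
```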
After you stop the server you should check that the tmp directory is empty and the var directory only contains the log file.
The test plan so far has been casual. Most of the tests consist of just poking at the links. The testing of multiple requests and authorisation is more involved. The following sections describe the test plan. I've tested it with both Netscape 4.76 and Konqueror 2.1.1.
Try out each of the following steps from the main menu, in any order.
Go to the main page at http://localhost:2020/ (or whatever port you have chosen). You should see "Welcome to the Swerve Web Server" along with a menu.
Click on "hello world". You should see a single line of text saying "hello world". This uses the text/plain MIME type.
Click on "5 second sleep". The browser should twiddle its thumbs for 5 seconds and then you see "Slept for 5 seconds". This is more useful for multiple request testing below.
Click on "sub-directory". This shows a fancy index of the htdocs/sub directory. There should be a folder icon with the empty directory. Click on it to see the absence of contents. Clicking on the "Up to higher level directory" will take you back.
Examine the README and image files. Clicking on page.html will show a page containing all of the image files.
Click on "/tmp". This will show an index of your /tmp directory. Mine contains various weird files such as sockets for KDE and X11. For example clicking on the socket in .X11-unix/X0 will result in a "Not Found" error since the file is inaccessible. You can probably find a file that is not readable by you. Clicking on it should result in the "Access Denied" error.
Click on "printenv". This should return a list of all of the CGI variables passed to the script. The HTTP_USER_AGENT variable will describe your browser. The SERVER_SOFTWARE variable describes the version of the server. Your PATH and SHELL should also appear. If you click on "printenv with a query" you will see that the QUERY_STRING variable appears. It should contain name=fred%20flintstone&page=3.
Click on "simple form". This returns the testform.html page. This contains a simple form that requests you to enter your name. If you enter Fred Flintstone and click Send the result should show that stdin contained the string given=Flintstone&family=Fred&Submit=Send.
Click on "real CGI form". This form is generated on-the-fly by the form-handler Perl CGI script. When you send your name the page is updated with the name.
Click on "file line counter". This shows a form that invites you to enter a file name. This should be the path of a text file that is around 100KB long. This is plenty large enough to ensure that the uploaded file is saved to disk in the tmp directory. You can check this in the debug messages in the var/errors file. Look for the line with "TmpFile allocate file" followed by the path to the temporary file. Check that the reported length of the file is correct.
Congratulations. It basically works.
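The printenv script used in the steps above is the traditional CGI example and is only a few lines of shell. A typical version looks like this; the actual cgi-bin/printenv may differ slightly:

```sh
#!/bin/sh
# Emit the CGI response header, a blank line, then every environment variable.
echo "Content-type: text/plain"
echo ""
env
```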
This testing is fairly slight. The first part relies on the browser using multiple connections to load the page.html file in the sub-directory described in the section called Basic Testing. Both Netscape and Konqueror will open several simultaneous connections to the server to load all of the images.
Stop the server and edit the norm.cfg file to set MaxClients to 1. Restart the server and follow the sub-directory link from the main menu. You should see problems such as missing icons. If you click on the page.html file then some of the images are broken. Click on Reload a few times; two of the images remain missing. If you study the log file it will not show any connection requesting the missing images. This is because those requests go over extra connections, which the server refuses. If you increase MaxClients to 2 then only one of the images will be missing. At 3, all of the images reappear.
This test has demonstrated that the server rejects connections according to the MaxClients parameter. If you study the log file you will see multiple simultaneous connections. Using Konqueror to reload the image page I get this pattern of Info messages:
```
2001 Sep 19 05:31:26 Info: New connection from 127.0.0.1:2525
2001 Sep 19 05:31:26 Info: New connection from 127.0.0.1:2526
2001 Sep 19 05:31:26 Info: New connection from 127.0.0.1:2527
2001 Sep 19 05:31:26 Info: End of connection from 127.0.0.1:2525
2001 Sep 19 05:31:26 Info: End of connection from 127.0.0.1:2527
2001 Sep 19 05:31:26 Info: End of connection from 127.0.0.1:2526
2001 Sep 19 05:31:26 Info: New connection from 127.0.0.1:2528
2001 Sep 19 05:31:26 Info: End of connection from 127.0.0.1:2528
```
This shows three simultaneous connections followed by a separate one for the background image.
A different concurrent path through the server can be tested using the sleep built-in handler. For this you will need two browser windows side-by-side showing the main menu. Click on 5 second sleep in both browser windows rapidly one after the other. The first window will return the "Slept for 5 seconds" message after 5 seconds. The second will return it after 10 seconds. This is because the handler is single-threaded. The second request waits in a queue for the first one to finish. Then it starts its own 5 second delay. This test demonstrates multiple simultaneous requests routed through the resource store to the same handler and handled in turn.
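A rough sketch of this pattern in CML (not the server's actual handler code) shows why the requests are served in turn: a single handler thread takes requests one at a time from a mailbox, so the second delay cannot start until the first has finished.

```sml
(* Sketch only: a single-threaded handler serving queued requests in order.
   The request type and the reply function are simplified placeholders. *)
val requests : (int * (string -> unit)) Mailbox.mbox = Mailbox.mailbox ()

fun sleepHandler () =
let
    val (secs, reply) = Mailbox.recv requests        (* next queued request *)
in
    CML.sync (CML.timeOutEvt (Time.fromSeconds (Int.toLarge secs)));
    reply ("Slept for " ^ Int.toString secs ^ " seconds");
    sleepHandler ()
end

(* The server would spawn this once at start-up: CML.spawn sleepHandler *)
```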
In the tests/tree_norm/conf directory you will find a password file, admin.pwd, and a group file, admin.grp, for testing the authorisation. (See the section called The Node Parameters). Here is the password file.
```
fred:rocks
wilma:pebbles
barney:bowling
betty:mink
```
Here is the group file.
```
Rubble: barney betty
```
These files are used by the authorisation configuration in the htdocs/secure/.swerve file. Here is the file.
```
# This allows fred directly and barney via the group.
# Wilma should be barred.
AuthType = Basic;
AuthName = "admin";
AuthUserFile = conf/admin.pwd;
AuthGroupFile = conf/admin.grp;
AuthAllowUsers = fred;
AuthAllowGroups = Rubble;
```
This allows users fred, barney and betty.
From the main test page (see the section called Basic Testing) click on "secure admin". You will be prompted for a user name and password. You should be able to demonstrate that users fred, barney and betty can only gain access with the correct password, and that user wilma cannot gain access at all. Since your browser probably remembers the user name and password for the realm (from the AuthName parameter), you will need to stop and restart your browser after each successful try.
Now for the big question. How does it perform? Not bad actually. To test it I used the httperf program which is available from Hewlett-Packard Research Labs[1].
The httperf program can generate a load for the server at a fixed number of connections per second. It reports the number of connections per second that the server actually achieved, along with statistics such as the number of concurrent connections and the milliseconds per connection. The program must be run on a different machine from the server to get an accurate measure of its performance, because it consumes a large amount of CPU time itself in order to keep its timing accurate. But for these tests I've run it on the same machine as the server.
These tests have been run on an AMD Athlon 1GHz processor with PC133 RAM. The kernel is Linux 2.2.19. The machine was idle while running the tests. The performance is compared with a standard installation of Apache 1.3.19. Both servers have logging disabled so that disk writes don't slow them down.
These performance figures were made after the improvements from profiling described in the section called Profiling the Server.
The tests fetch the index page of the tree_norm directory. This totals 957 bytes for Swerve and 1092 bytes for Apache.
The first test measures just the sequential performance. The client makes connections one after the other as fast as possible and the number of connections per second achieved is reported. The results are shown in Table 8-2.
The speed of Swerve here is terrible. But notice how it is managing exactly 20 milliseconds per connection. The default time slice for the threading in CML is 20 milliseconds. This points to a bug either in the server or the CML implementation that needs further investigation.
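One way to investigate would be to vary the scheduling quantum. CML lets you pass an explicit time slice when starting the run-time; this is only a sketch, not how Swerve's start-up code is actually written:

```sml
(* Start the CML run-time with a 5 ms time slice instead of the default. *)
fun main () =
    RunCML.doit (fn () => print "server body goes here\n",
                 SOME (Time.fromMilliseconds 5))
```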
The next test has the client generating new connections at a fixed rate for a total of 4000 connections. If the server can keep up then it will serve them all and the number of concurrent connections it handles will stay low. If it can't keep up then the number of concurrent connections will rise until it hits a limit, which is 1012 connections on my machine. This limit comes from the maximum number of sockets that the client can open at one time. If this limit is reached then the performance figure can be ignored and the server can be deemed completely swamped.
The Swerve configuration has a Timeout of 10 seconds and MaxClients of 1100. The LogLevel is Warn so that no connection messages are written to the log file. I wait at least 10 seconds between tests to let the time-outs in the Swerve server complete. This starts each run from a clean state.
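For reference, and again assuming the `Name = value;` parameter syntax, the relevant settings are along these lines; check norm.cfg for the exact form:

```
Timeout    = 10;
MaxClients = 1100;
LogLevel   = Warn;
```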
Table 8-3 shows the figures (connections/second) for Swerve. When the server starts falling behind I run multiple tests to get an idea of how variable the performance is. The httperf command line used is
```
httperf --timeout=15 --client=0/1 --server=brahms --port=2020 \
        --uri=/index.html --http-version=1.0 --rate=350 \
        --send-buffer=4096 --recv-buffer=16384 --num-conns=4000 --num-calls=1
```
Table 8-3. Concurrent Swerve Performance
Offered Rate (conn/sec) | Actual Rate (conn/sec) | Max Concurrent Connections |
---|---|---|
100 | 100 | 8 |
130 | 130 | 11 |
150 | 150 | 28 |
170 | 170 | 14 |
190 | 186 | 21 |
190 | 181 | 19 |
190 | 175 | 16 |
190 | 170 | 18 |
200 | 195 | 20 |
200 | 191 | 19 |
200 | 192 | 30 |
200 | 189 | 21 |
210 | 197 | 21 |
210 | 198 | 24 |
210 | 195 | 22 |
210 | 193 | 26 |
220 | 212 | 39 |
240 | 206 | 33 |
240 | 203 | 32 |
240 | 170 | 29 |
240 | 200 | 29 |
260 | 160 | 28 |
260 | 201 | 64 |
260 | 197 | 30 |
260 | 138 | 36 |
280 | 206 | 41 |
280 | 202 | 31 |
280 | 219 | 35 |
280 | 202 | 57 |
300 | 215 | 44 |
300 | 216 | 91 |
300 | 228 | 41 |
300 | 229 | 42 |
320 | 182 | 52 |
320 | 195 | 52 |
320 | 173 | 33 |
320 | 230 | 34 |
350 | 123 | 906 |
350 | 189 | 96 |
350 | 238 | 40 |
350 | 154 | 47 |
You can see that the throughput increases linearly up to 190 conn/sec and then starts to falter. As the load increases further, it peaks at an average of around 210 connections per second. At a load of 350 conn/sec, connections were starting to time out and the server was definitely overloaded.
Table 8-4 shows the figures for Apache.
Table 8-4. Concurrent Apache Performance
Offered Rate (conn/sec) | Actual Rate (conn/sec) | Max Concurrent Connections |
---|---|---|
100 | 100 | 13 |
130 | 130 | 7 |
150 | 150 | 8 |
170 | 170 | 10 |
190 | 190 | 12 |
200 | 200 | 145 |
200 | 200 | 133 |
210 | 210 | 33 |
210 | 210 | 123 |
220 | 220 | 76 |
240 | 240 | 165 |
260 | 260 | 178 |
280 | 280 | 199 |
300 | 190 | 1012 |
300 | 300 | 132 |
300 | 245 | 142 |
300 | 300 | 189 |
320 | 215 | 760 |
320 | 202 | 1012 |
320 | 258 | 547 |
320 | 210 | 760 |
350 | 220 | 1012 |
350 | 277 | 629 |
350 | 132 | 1012 |
350 | 243 | 1012 |
With Apache, the throughput increases linearly up to 300 conn/sec and then starts to falter. At higher loads the throughput reaches around 270 conn/sec at best, with many cases of complete overload at worst.
Comparing the two servers, I can reasonably claim that the performance of Swerve is around 2/3 that of Apache. That's not bad for a home-grown server written in SML.
I did some investigation of the internal timing of the Swerve server to see what improvements could be made. The performance figures in the previous section were obtained after the improvements from profiling.
I've had performance problems from the handling of time-outs. I need to have a flag that is set on a time-out and that can be tested. (See the section called Time-outs for more details). In an earlier implementation I created a new thread for each connection to wait for the time-out. The thread then set the flag. The trouble with this was that the time-out thread would hang around in the server after the connection was closed, until the time-out expired. This would result in thousands of threads in the server which clogged the CML run-time. The time taken to spawn a new thread rose to over 15 milliseconds. The current implementation, described in the section called The Abort Module in Chapter 9, only uses one thread and is much faster.
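For comparison, the earlier per-connection approach looked roughly like this sketch; it is simplified and not the code that was actually in the server:

```sml
(* Old approach: spawn one thread per connection that sleeps for the whole
   time-out period and then sets a flag.  The thread keeps running even after
   the connection has closed, which is what clogged the CML run-time. *)
fun naiveAbort timeout : bool ref =
let
    val flag = ref false
in
    ignore (CML.spawn (fn () => (CML.sync (CML.timeOutEvt timeout); flag := true)));
    flag
end
```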
I've left some timing code in the server which can be enabled with the -T Timing command line option and a log level of Debug. The timing code uses the MyProfile.timeIt function to measure the wall-time execution of a function, in microseconds. (See the section called The MyProfile Module in Chapter 9). Here are some typical figures for the fetching of the index page of the tree_norm test suite. (The page has been fetched several times to get it into the Linux cache).
```
Timing abort request 18
Timing abort create 47
Timing Listener setupConn 67
Timing HTTP_1_0 get 618
Timing GenNode request 165
Timing HTTP_1_0 stream_entity 641
Timing HTTP_1_0 response 764
Timing HTTP_1_0 to_store 959
Timing Listener talk 1586
Timing Listener close 53
Timing Listener release 11
Timing Listener run 1733
Timing Listener died 74
Timing Listener connection 2263
```
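Each of these lines comes from wrapping a function call in MyProfile.timeIt. A sketch of the idea is below; the real module logs through the server's Log facility and differs in detail:

```sml
(* Measure the wall-clock time of f () in microseconds and report it. *)
fun timeIt name f =
let
    val timer  = Timer.startRealTimer ()
    val result = f ()
    val usec   = Time.toMicroseconds (Timer.checkRealTimer timer)
in
    print (concat ["Timing ", name, " ", LargeInt.toString usec, "\n"]);
    result
end
```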
The measurement points are described in Table 8-5. You should study Chapter 9 for more information on what is happening in the server.
What the numbers tell me is that the server can process a request in 2.2 milliseconds and so should be able to handle 450 requests per second. But now if I run the server with 120 requests/second to get 3 or more concurrent connections I get numbers like these:
```
Timing abort request 17
Timing abort request 7
Timing abort create 184
Timing Listener setupConn 201
Timing HTTP_1_0 get 236
Timing abort request 7
Timing abort create 557
Timing Listener setupConn 564
Timing HTTP_1_0 get 150
Timing GenNode request 460
Timing abort create 465
Timing Listener setupConn 474
Timing HTTP_1_0 get 149
Timing GenNode request 226
Timing GenNode request 7
Timing HTTP_1_0 stream_entity 1890
Timing HTTP_1_0 response 2003
Timing HTTP_1_0 to_store 2495
Timing Listener talk 2740
Timing Listener close 66
Timing Listener release 39
Timing Listener run 3062
Timing HTTP_1_0 stream_entity 1695
Timing HTTP_1_0 response 1759
Timing HTTP_1_0 to_store 2477
Timing Listener talk 2633
Timing Listener close 35
Timing Listener died 140
Timing Listener connection 3678
Timing Listener release 13
Timing Listener run 3258
Timing HTTP_1_0 stream_entity 1723
Timing HTTP_1_0 response 1784
Timing HTTP_1_0 to_store 2347
Timing Listener talk 2501
Timing Listener close 32
Timing Listener release 47
Timing Listener run 3067
Timing Listener died 134
Timing Listener connection 3955
Timing Listener died 35
Timing Listener connection 3820
```
Now the performance has dropped to an average of 261 connections/sec. The time to set up a connection has jumped due to the increased time to set up the time-out. This comes from the extra overhead of the CML operations when there are more events and threads in the server. The time to send a file to the connection has doubled since this involves lots of message sending, which is now slower.
Table 8-5. Timing Measurement Points.
Name | Description |
---|---|
abort request | This measures the time for the server in the Abort module to process a request to add a new time-out. |
abort create | This measures the time for the Abort.create function to run. It includes the time to send the message to the server without waiting for any reply. |
Listener setupConn | This measures the time to create the connection object when a new connection arrives. This mainly consists of setting a socket option and starting the time-out by creating an Abort value. |
HTTP_1_0 get | This measures the time to read in the GET request from the socket including parsing all of the headers. |
GenNode request | This measures the time for the resource store to forward the request through to the handler. It does not include the time for the directory node handler to run. |
HTTP_1_0 stream_entity | This measures the time for the stream_entity function to run, which transfers the contents of the page to the socket. This includes the time for reading the page from disk. |
HTTP_1_0 response | This measures the total time to process a response. This includes the stream_entity time above along with the time to write the status and headers. |
HTTP_1_0 to_store | This measures the total time to process a request including the time to send it to the store and the time to process the response (above). |
Listener talk | This measures the total time to run the HTTP protocol for a connection including all request and response processing. |
Listener close | This measures the time to close the socket after the response has been sent. |
Listener release | This measures the time to clean up any temporary files. |
Listener run | This measures the total time for a connection including all of the above Listener measurement points. |
Listener died | This measures the time to tell the connection manager that a connection has terminated. |
Listener connection | This measures the total time for a connection from after the socket's creation until its close. |
[1] The URL is ftp://ftp.hpl.hp.com/pub/httperf