Berkeley Sockets part I
Overview

rand()

One function you'll need for assignment 5 which you may not have seen is rand(), which is declared in <stdlib.h> and returns a pseudo-random integer in the range 0 to RAND_MAX. Since that's not generally the range you want to deal with, you usually want to mod it by something: rand() % n gives a value from 0 to n-1.

Also, since you want pseudo-random numbers you presumably want them to be different from run to run. Like any PRNG, rand() requires an initial seed, which defaults to 1 if you don't provide one. To seed it, you use the srand function, which takes an unsigned int. A common and convenient seed is the return value of the time function, since the clock value will be different every time you run your program (unless you run it twice within the same second). The Monkey book uses the PID instead for some reason, which seems strange since it won't vary as much, so I'd recommend not doing that.

One important note: do not call srand more than once in a program unless you have a specific reason to do so. I've seen many people put an srand call inside a loop, which is a terrible idea; a loop which continuously calls srand(time(NULL)) and then prints the value of rand() will print the same number over and over, only changing the output once every second, since the sequence is reset on every call.
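To make the re-seeding problem concrete, here is a minimal sketch. The function name reseed_repeats is made up for this illustration; it seeds, draws one number, seeds again with the same value, and draws again. Within a single second, time(NULL) returns the same value each time, so a loop calling srand(time(NULL)) behaves the same way.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if seeding twice with the same
   value makes rand() produce the same first number both times. */
int reseed_repeats(unsigned int seed)
{
    int first, again;

    srand(seed);      /* start the sequence */
    first = rand();

    srand(seed);      /* same seed: the sequence starts over */
    again = rand();

    return first == again;
}
```

This should return 1 for any seed, which is exactly why seeding once at the top of main is the usual pattern.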

The following is example code to simulate two (fair) 6-sided dice:

int die1, die2;
srand(time(NULL)); //seed PRNG
...
die1=rand()%6 + 1;
die2=rand()%6 + 1;

Once you've seeded with srand, you can call rand() as much as you want. (Within reason; the sequence will eventually repeat, after about 2^32 calls on a typical implementation, but that's fine for most purposes.) If you ever need an even better PRNG, take a look at random() (and the corresponding srandom()).
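For reference, the dice fragment can be wrapped into small functions along these lines. This is only a sketch: the names seed_once, roll_die and roll_two_dice are mine, not from the assignment or the Monkey book.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: call this exactly once, e.g. at the top of main. */
void seed_once(void)
{
    srand((unsigned int)time(NULL));
}

/* Returns a value from 1 to 6. */
int roll_die(void)
{
    return rand() % 6 + 1;
}

/* Returns the sum of two dice, a value from 2 to 12. */
int roll_two_dice(void)
{
    int die1 = roll_die();
    int die2 = roll_die();
    return die1 + die2;
}
```

A program would call seed_once() once and then roll_two_dice() as often as it likes.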

Sockets from 10,000 ft.

The overall idea behind sockets is to provide a reasonably high-level method of communicating across arbitrary networks (LAN, Internet, etc.) without worrying about hardware. It also lets you use TCP or UDP (more on those later) with the same API. As is usually the case in C, there are a few ugly API details, but in the abstract they are very straightforward and uniform.

The ISO/OSI Model (optional)

To understand where sockets fit into the overall concept of the world, look at the chart on page 271 of the Monkey book. The ISO/OSI model is used a lot in distributed systems discussions, because it's a useful concept. The model has 7 layers (see the Monkey book for a description of each):

7. Application
6. Presentation
5. Session
4. Transport
3. Network
2. Data Link
1. Physical

The idea is that they are stacked in that order, getting lower level as they go down. The point of the model is that each layer only has to worry about the layer directly below it; as an application programmer you just say "Send this to Bob" to the Presentation layer. It does whatever it needs to do, then says "Send this to Bob" to the layer below it, and so forth down the layers, with each doing a few things to the data. When the data gets to the Network layer, that layer figures out how to map "Bob" to an address, then sends that info and the data down further. Eventually the data reaches the Physical layer (e.g. your Ethernet card) and it gets sent. Then the data is propagated back up the layers on Bob's computer, each layer unpacking the data, until eventually Bob's application layer gets the data that was originally sent, in its original form.

An important note is that while the theory is all well and good, it's not really implemented this way in the real Internet. As the history of the Internet tends to go, the theory came along after the Internet was already well underway, and no one wanted to bother to change over to the "standard".

Sockets fit in around the Transport/Network layers; they span both, just as the application layer of the real Internet is actually the top three layers of the OSI model (Application, Presentation and Session) in one big glob.

Sockets from 100 ft.

There are essentially two flavors of socket programming: connection-oriented (i.e. TCP) and connectionless (i.e. UDP). Both are useful in different applications, so there's no clear "better" method. They use the same ideas in creating and setting up the sockets themselves, however, so we'll start there.
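As a small preview of how uniform the setup is, the sketch below creates either flavor with the same socket() call; only the type argument differs. The helper names open_ipv4_socket and socket_type are invented for this example, and it assumes IPv4 (AF_INET).

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns a new IPv4 socket descriptor, or -1 on error.
   Nonzero asks for a TCP-style socket, zero for a UDP-style one. */
int open_ipv4_socket(int connection_oriented)
{
    int type = connection_oriented ? SOCK_STREAM : SOCK_DGRAM;
    return socket(AF_INET, type, 0);   /* 0 = default protocol for that type */
}

/* Hypothetical helper: asks the kernel which type a socket has.
   Returns SOCK_STREAM or SOCK_DGRAM, or -1 on error. */
int socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```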

Sockets are identified the same way in both protocols: a tuple consisting of an IP address and a port. The use of ports is a way of subdividing a single IP address so that the same IP can provide a whole set of services, by using a different socket for each one. Port numbers below 1024 are reserved, and many of them are "well known". For example, HTTP uses port 80, mail (SMTP) uses 25, ssh uses 22, telnet uses 23, etc. So each socket, when it is created, is "bound" to a certain port number, and someone who wants to communicate with it can use an IP address and port to find that socket uniquely.
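In C, that (IP address, port) tuple lives in a struct sockaddr_in. Here is a rough sketch of filling one in, assuming IPv4 and a dotted-quad address string; make_endpoint is a made-up name. One detail not covered above: the port has to be stored in network byte order, which is what htons does.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: fills *addr with an IPv4 address and port.
   Returns 0 on success, -1 if ip is not a valid dotted-quad string. */
int make_endpoint(struct sockaddr_in *addr, const char *ip, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);                  /* network byte order */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;                                 /* could not parse ip */
    return 0;
}
```

For example, make_endpoint(&addr, "127.0.0.1", 80) describes the HTTP port on the local machine.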

Once you've created and bound a socket, the two types of socket programming ("paradigms" as the Monkey book calls them) become different. The first is the way most Internet services work:

The connection-oriented paradigm (pg 279):

Server                                    Client
Create a socket with socket               Create a socket with socket
Bind the socket to a port with bind
Set up a connection queue with listen
Establish a connection with accept        Request a connection with connect
Read and write data with read and write   Read and write data with read and write

This is the standard client-server model; the server waits for requests, and whenever a client makes a request the server deals with it in some fashion. A web server, at the basic level, is just a program that does all the things in the server column using port 80.
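The steps in both columns can be tried out in a single process over the loopback interface. The sketch below is only an illustration under several assumptions of mine: it plays both roles itself (a real server and client would be separate programs), it binds to 127.0.0.1, and it uses port 0 so the OS picks a free port. The name tcp_loopback_demo is invented, and out must have room for at least one byte.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical demo: sends msg from a "client" socket to a "server" socket
   over loopback TCP and copies what the server read into out.
   Returns the number of bytes the server read, or -1 on error. */
ssize_t tcp_loopback_demo(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    ssize_t n = -1;
    int srv = -1, cli = -1, conn = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* 0 = let the OS pick a port */

    srv = socket(AF_INET, SOCK_STREAM, 0);  /* server: create */
    cli = socket(AF_INET, SOCK_STREAM, 0);  /* client: create */
    if (srv < 0 || cli < 0) goto done;
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (listen(srv, 1) < 0) goto done;      /* server: connection queue */
    /* find out which port the OS chose, so the client can use it */
    if (getsockname(srv, (struct sockaddr *)&addr, &len) < 0) goto done;

    /* client: request a connection; it waits in the listen queue */
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    conn = accept(srv, NULL, NULL);         /* server: establish it */
    if (conn < 0) goto done;

    if (write(cli, msg, strlen(msg)) < 0) goto done;
    n = read(conn, out, outlen - 1);
    if (n >= 0) out[n] = '\0';

done:
    if (conn >= 0) close(conn);
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    return n;
}
```

Note that accept returns a new descriptor for the conversation; the original listening socket stays free to accept more clients.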

The connectionless paradigm (pg 301):

Server                                           Client
Create a socket with socket                      Create a socket with socket
Bind the socket to a port with bind              Bind the socket to a port with bind
Send and receive data with recvfrom and sendto   Send and receive data with recvfrom and sendto

This is the model of two programs designed to communicate back and forth with no requirements on the ordering or integrity of the data.
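The connectionless steps look like this when both ends live in one process on the loopback interface. As before, this is a sketch under my own assumptions: the name udp_loopback_demo is invented, both sockets bind to 127.0.0.1 with OS-chosen ports, and a 2-second receive timeout is added so the demo cannot hang if the datagram is lost.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical demo: sends msg as one datagram from socket a to socket b
   and copies what b received into out.
   Returns the number of bytes received, or -1 on error or timeout. */
ssize_t udp_loopback_demo(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in a_addr, b_addr, from;
    struct timeval tv = { 2, 0 };           /* assumed 2 s receive timeout */
    socklen_t len;
    ssize_t n = -1;
    int a = -1, b = -1;

    memset(&a_addr, 0, sizeof a_addr);
    a_addr.sin_family = AF_INET;
    a_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a_addr.sin_port = 0;                    /* 0 = let the OS pick a port */
    b_addr = a_addr;

    a = socket(AF_INET, SOCK_DGRAM, 0);
    b = socket(AF_INET, SOCK_DGRAM, 0);
    if (a < 0 || b < 0) goto done;
    if (bind(a, (struct sockaddr *)&a_addr, sizeof a_addr) < 0) goto done;
    if (bind(b, (struct sockaddr *)&b_addr, sizeof b_addr) < 0) goto done;

    /* find out which port b was given, so a knows where to send */
    len = sizeof b_addr;
    if (getsockname(b, (struct sockaddr *)&b_addr, &len) < 0) goto done;
    if (setsockopt(b, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0) goto done;

    /* no connect, no accept: every datagram carries its destination */
    if (sendto(a, msg, strlen(msg), 0,
               (struct sockaddr *)&b_addr, sizeof b_addr) < 0) goto done;
    len = sizeof from;
    n = recvfrom(b, out, outlen - 1, 0, (struct sockaddr *)&from, &len);
    if (n >= 0) out[n] = '\0';

done:
    if (a >= 0) close(a);
    if (b >= 0) close(b);
    return n;
}
```

After recvfrom returns, from holds the sender's address and port, which is how a UDP server knows where to send a reply.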

A couple of things to note about these models:

TCP vs. UDP

The main difference between the connection-oriented TCP (Transmission Control Protocol) and the connectionless UDP (User Datagram Protocol) lies in the reliability of the data stream. I don't like the standard analogies, so I'll make up my own: Say you are standing in the Strosacker balcony, and your friend is on the stage, and you want to pass notes (don't ask why you can't just shout, it's an analogy).

To simulate TCP, you get a long string, and throw one end down to your friend. Each of you holds one end tight, and to pass a note down you punch a hole in it, and thread it onto the string, where gravity pulls it down to your friend, who will get every note, all in the order that you sent them.

For UDP, you write all your messages on paper airplanes and throw them to your friend. Now, most will get there (assuming you make good airplanes), but occasionally, some will go off in a random direction, or crash right away, so your friend never gets them. What's more, some of them may do a lot of looping and hovering, and take a long time to arrive, while others will go directly, so there's no guarantee they will get there in the order you sent them.

While that might make UDP sound worthless, UDP is much faster since it doesn't have as much overhead. So for some applications, it's far better. The standard example is real-time multimedia streaming: if you drop a video frame, or two frames are switched, you probably won't even notice, but if you have to slow way down to accommodate all the TCP error prevention your frame-rate will be terrible.

Next Week: Sockets Up Close

Next week we'll get to the implementation details, but instead of just throwing a list of commands at you, I'll also demonstrate how powerful sockets are at the same time. So next week, we will implement a simplistic HTTP server and a client, which use the following pseudo-code. (These are NOT the real arguments to these functions; this is a very high-level approximation of the code.)

Server:

main {
  srv=socket()
  bind(srv, 80)
  listen(srv)
  while (1) {
    clnt=accept(srv)
    if (fork()==0) {
      serveRequest(clnt)
      exit()
    }
    close(clnt) //parent: the child has its own copy
  }
}

void serveRequest(clnt) {
  request=read(clnt)
  if (request starts with "GET")
    write(clnt, "HTTP/1.0 404 Not Found\r\n\r\n")
  close(clnt)
}

Client:

main {
  client=socket()
  connect(client, hostname, 80)
  write(client, "GET /index.html HTTP/1.0\r\n\r\n")
  read(client, response)
  printf("%s", response)
  close(client)
}

Stuart Morgan, 2003