Chapter 10: VoIP test applications

Using the VoIP framework discussed in the previous chapter, I created several test applications. Some of them were only intended to test a particular component or tested the interaction of components by letting an application send voice data to itself.

Apart from these relatively simple test programs, I also created two more interesting applications. The first one is an Internet telephony application, the other one is a simple 3D environment. These applications are discussed in this chapter, but first there is a section that covers some issues which apply to both programs.

10.1 General issues

Both applications handle the VoIP-related functions in the same way. When VoIP is needed, the application starts a separate thread. In this thread, the necessary components are initialised and placed into a VoiceCall instance, after which the Step function is called continuously. When VoIP is no longer required, the application signals this to the thread, which then leaves its loop and exits.
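The structure of such a thread can be sketched as follows. This is only an illustration in Python, not the framework's own code: the class FakeVoiceCall is a hypothetical stand-in for a VoiceCall instance, its step method mirrors the Step function, and the short sleep merely imitates the time one sampling interval would take.

```python
import threading
import time


class FakeVoiceCall:
    """Hypothetical stand-in for a VoiceCall instance; it only counts steps."""

    def __init__(self):
        self.steps = 0
        self.first_step_done = threading.Event()

    def step(self):
        # A real step would process one sampling interval; here we just wait.
        time.sleep(0.001)
        self.steps += 1
        self.first_step_done.set()


def voip_thread(call, stop_event):
    # Call step continuously until the application signals that
    # VoIP is no longer required.
    while not stop_event.is_set():
        call.step()


call = FakeVoiceCall()
stop_event = threading.Event()
worker = threading.Thread(target=voip_thread, args=(call, stop_event))
worker.start()

call.first_step_done.wait(timeout=5)  # let the loop run at least once
stop_event.set()                      # the application's "stop" signal
worker.join()                         # the thread leaves its loop and exits
```

Using an event object for the stop signal lets the thread finish its current interval before it exits, which matches the behaviour described above.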

Using this continuous loop technique, it is almost certain that at the end of one sampling interval, the next one is immediately started. This is definitely the case when the system is not currently busy serving other demanding applications.

When the system is more heavily loaded, it is possible that there is a small amount of delay between the end of one interval and the start of the next. This is illustrated in figure 10.1. Since the total delay will keep building up, this will have some side effects for the communication.




Figure 10.1: Delay between sampling intervals

For the recorded voice data, this can lead to gaps in the communication. When a packet is sent with RTP, the RTP timestamp is only increased by the number of samples in the packet. This obviously does not take the added delay into account. Suppose that at a given time, the high load of the system has caused a total delay d. If a packet's timestamp is T, the time at which the packet was actually sampled is T + d. The receiver does not know about d, so it will try to play the voice data in the packet at the time represented by timestamp T. But because the packet is at least d late, it is possible that its playback time has already passed, causing a gap in the conversation.
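A small numerical illustration may help here. The sketch below assumes, purely for the example, a sampling rate of 8000 Hz and packets of 160 samples (20 ms); neither value is taken from the framework. It compares the RTP timestamps T with the actual sampling times T + d when one interval starts 80 samples late.

```python
SAMPLE_RATE = 8000        # assumed sampling rate in Hz
SAMPLES_PER_PACKET = 160  # assumed packet size (20 ms at 8000 Hz)


def rtp_timestamps(n_packets):
    # The timestamp only advances by the number of samples per packet.
    return [i * SAMPLES_PER_PACKET for i in range(n_packets)]


def actual_sample_times(extra_delays):
    # extra_delays[i] is the extra delay (in samples) that occurred
    # just before packet i was sampled; d is the accumulated total.
    times = []
    d = 0
    for i, extra in enumerate(extra_delays):
        d += extra
        times.append(i * SAMPLES_PER_PACKET + d)
    return times


timestamps = rtp_timestamps(4)
actual = actual_sample_times([0, 0, 80, 0])
lateness = [a - t for a, t in zip(actual, timestamps)]
lateness_ms = [1000 * l / SAMPLE_RATE for l in lateness]

print(timestamps)   # [0, 160, 320, 480]
print(actual)       # [0, 160, 400, 560]
print(lateness_ms)  # [0.0, 0.0, 10.0, 10.0]
```

From the third packet on, every packet is 10 ms late with respect to its timestamp, and this offset does not disappear by itself.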

The delay also has an effect on the playback of voice data. Because of the extra delay, the playback time of a packet will be later than it should be. This will increase the overall delay of the communication, which is very undesirable.

To solve these problems, a check is performed inside the loop. It compares the actual elapsed time with the time interval represented by the sum of the sample intervals, and in this way measures the extra delay caused by the load of the system. If this delay gets too large, the communication is initialised again [1].
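The check can be sketched as below. This is a minimal illustration and not the actual routine: the function names and the threshold of 0.1 seconds in either direction are assumptions made for the example. The lower bound corresponds to the case described in the footnote, where clock inaccuracies make the measured delay negative.

```python
def measure_drift(elapsed_seconds, intervals_done, interval_seconds):
    # Extra delay = real elapsed time minus the time that the completed
    # sampling intervals nominally represent.
    return elapsed_seconds - intervals_done * interval_seconds


def needs_reinit(drift, max_drift=0.1, min_drift=-0.1):
    # Thresholds are illustrative assumptions, not values from the framework.
    return drift > max_drift or drift < min_drift


# 500 intervals of 20 ms nominally represent 10 seconds.
for elapsed in (10.03, 10.15, 9.85):
    drift = measure_drift(elapsed, 500, 0.02)
    print(elapsed, needs_reinit(drift))
```

With these assumed thresholds, an elapsed time of 10.03 seconds is tolerated, while 10.15 seconds (too much delay) and 9.85 seconds (a sufficiently negative measurement) would both cause the communication to be initialised again.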

10.2 An Internet telephony application

The first application I made using the framework was a simple Internet telephony application, of which only an MS-Windows version exists. The user interface of the program is shown in figure 10.2.




Figure 10.2: An Internet telephony application

To call somebody, the user presses the 'Connect' button. The application then asks for the host to connect to and tries to establish a connection.

When there is an incoming call, this is signalled in the status window. The user can then answer the call by pressing the corresponding button.

The call is set up over a TCP connection. As long as this TCP connection exists between the two parties, they are able to talk to each other. When the TCP connection is torn down, the call is terminated. Note that the TCP connection is only a control connection and is not used to transfer the speech data. The speech data is sent with RTP, by the VoIP framework.

As was explained in the previous section, the application starts a separate thread when the VoIP part is needed. In the application, this is done as soon as the called person sends a confirmation over the TCP connection.
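The role of the control connection can be sketched as follows. The code is an illustration under stated assumptions, not the application's protocol: the confirmation message ACCEPT and the two callbacks are hypothetical, and a local socket pair stands in for the TCP connection between the two parties.

```python
import socket

CONFIRM = b"ACCEPT\n"  # hypothetical confirmation message


def run_control_connection(sock, on_confirm, on_hangup):
    # Follow the control connection: the confirmation starts the VoIP part,
    # and the end of the connection terminates the call.
    buffer = b""
    confirmed = False
    while True:
        data = sock.recv(1024)
        if not data:  # an empty read means the peer closed the connection
            on_hangup()
            return
        buffer += data
        if not confirmed and CONFIRM in buffer:
            confirmed = True
            on_confirm()


events = []
caller, callee = socket.socketpair()  # stand-in for the TCP connection
callee.sendall(CONFIRM)               # the called person accepts the call
callee.close()                        # ... and later hangs up
run_control_connection(
    caller,
    on_confirm=lambda: events.append("start VoIP thread"),
    on_hangup=lambda: events.append("stop VoIP thread"),
)
caller.close()
print(events)
```

No speech data passes through this connection; it only determines when the VoIP thread is started and stopped.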

In my opinion, the application worked quite well. The user interface may not be very sophisticated, but the program allowed good quality conversation when sufficient bandwidth was present. The required bandwidth heavily depends on the compression scheme used.

10.3 A 3D environment

The previous application was an Internet telephony application, so there were only two persons communicating and there was no need for 3D effects. The application I created next did allow multiple participants and 3D effects.

Both a Linux and an MS-Windows version of the application exist. Figure 10.3 is a screenshot of the Linux version of the application; the MS-Windows version looks almost identical.




Figure 10.3: A 3D environment

The user interface contains three major parts. The bottom half of the application window is a chat interface. If the voice quality is not good enough, the participants can still communicate by sending text messages to each other. On the right-hand side of the window, there is a window which shows the participants in the 3D environment. The remaining part of the application window shows the environment itself.

When the application is started, the user first has to join a specific 3D environment. This is done by establishing a TCP connection with a server application. Each participant in the environment will have one such connection with the server. These connections are used to transfer the text messages for the chat window and to distribute the positions of each participant. As with the telephony application, these connections are not used for the transmission of the speech data; for that purpose RTP is used.

For the transmission of the speech data, a user can either use unicasting or multicasting. If multicasting is used, the server provides a multicast address to which the user's application should send the data. One multicast group is used for the whole environment.

To reduce the amount of processing, the application checks which other participants are within a specific range. Depending on the transmission method used, the program either only sends speech information to participants in range (unicasting) or only accepts speech data from participants in range (multicasting).
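The range check can be sketched as follows. The sketch reads the paragraph above as pairing unicasting with filtering at the sender and multicasting with filtering at the receiver; the function names, the example positions and the range of 10 units are assumptions made for the illustration.

```python
import math


def in_range(a, b, max_distance):
    # Euclidean distance between two 3D positions.
    return math.dist(a, b) <= max_distance


def unicast_destinations(me, positions, max_distance):
    # Unicasting: only send speech data to the participants in range.
    return sorted(
        name
        for name, pos in positions.items()
        if name != me and in_range(positions[me], pos, max_distance)
    )


def accept_multicast_packet(me, sender, positions, max_distance):
    # Multicasting: everything arrives via the group, so the receiver
    # only accepts speech data from senders in range.
    return sender != me and in_range(positions[me], positions[sender], max_distance)


positions = {
    "alice": (0.0, 0.0, 0.0),
    "bob": (3.0, 4.0, 0.0),     # distance 5 from alice
    "carol": (30.0, 0.0, 0.0),  # distance 30 from alice
}

print(unicast_destinations("alice", positions, 10.0))              # ['bob']
print(accept_multicast_packet("alice", "bob", positions, 10.0))    # True
print(accept_multicast_packet("alice", "carol", positions, 10.0))  # False
```

In both cases the speech data of participants out of range never reaches the mixing and 3D processing stages, which is where the reduction in processing comes from.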

The application demands a lot more processing power than the Internet telephony application. Because of this, the quality can sometimes be a bit lower, but usually I found it comparable to that of the telephony application. As with the Internet telephony application, the required bandwidth depends a lot on the compression scheme which is used.

10.4 Summary

To test the VoIP framework, I created several programs, among which are an Internet telephony application and a 3D environment. The VoIP part of these applications is put in a separate thread which continuously calls the VoiceCall member function 'Step'. A check on the accumulated delay in this loop ensures that the participants stay synchronised.

The Internet telephony application is a relatively simple application which allows easy and good quality conversations over the Internet if enough bandwidth is available. The 3D environment application allows several persons to communicate with each other, with simple localisation effects being added to their speech signals. Both unicasting and multicasting can be selected to transmit the voice information. This application also allowed good quality communication. For both applications, the required bandwidth depends on the compression scheme which is used.


Footnotes:
  1. In fact, the routine also checks whether the difference is too low. A negative difference is possible because of inaccuracies in the system's clock. These, in turn, lead to incorrect measurements of the delay, so when the measured delay is sufficiently negative, the communication is also initialised again.
