Networking and Security Group

Scalable Machine Learning Network Traffic Classification in Userspace

Project Background and Aims

Accurate traffic classification is important in many scenarios. For example, traffic classified as real-time can be prioritised over non-real-time traffic, and traffic classified as malicious can be blocked. Port numbers can no longer be used to classify traffic accurately, and packet payload inspection is not possible in all situations. DIFFUSE [1] uses machine learning techniques to classify traffic based on observable traffic characteristics, such as packet lengths or inter-arrival times. DIFFUSE is integrated with the IPFW/Dummynet [2] firewall and traffic shaper and runs inside the FreeBSD or Linux kernel. More recently, the author of IPFW/Dummynet has also developed a fast userspace version based on netmap [3].
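To make the idea of classifying on observable characteristics concrete, the following is a minimal sketch in C of how per-flow features such as mean packet length and mean inter-arrival time could be accumulated and then mapped to a class. It is not taken from the DIFFUSE code base: the struct, the function names and the threshold values are all hypothetical, and the hand-written threshold rule only stands in for a trained machine learning model.

```c
#include <stdint.h>

/* Hypothetical per-flow feature accumulator (not DIFFUSE's data structure). */
struct flow_stats {
    uint64_t pkts;        /* packets seen so far */
    uint64_t bytes;       /* sum of packet lengths in bytes */
    uint64_t first_ts_us; /* timestamp of the first packet, microseconds */
    uint64_t last_ts_us;  /* timestamp of the latest packet, microseconds */
};

/* Hypothetical traffic classes used only for this illustration. */
enum flow_class { CLASS_UNKNOWN, CLASS_REALTIME, CLASS_BULK };

static void flow_stats_init(struct flow_stats *fs)
{
    fs->pkts = 0;
    fs->bytes = 0;
    fs->first_ts_us = 0;
    fs->last_ts_us = 0;
}

/* Record one packet; timestamps are assumed to be non-decreasing. */
static void flow_stats_update(struct flow_stats *fs, uint32_t len,
                              uint64_t ts_us)
{
    if (fs->pkts == 0)
        fs->first_ts_us = ts_us;
    fs->last_ts_us = ts_us;
    fs->pkts++;
    fs->bytes += len;
}

/* Mean packet length in bytes, or 0 if no packet has been seen. */
static double flow_mean_len(const struct flow_stats *fs)
{
    return fs->pkts ? (double)fs->bytes / (double)fs->pkts : 0.0;
}

/* Mean inter-arrival time in microseconds: the n-1 gaps between n packets
 * sum to (last - first), so no per-gap storage is needed. */
static double flow_mean_iat_us(const struct flow_stats *fs)
{
    if (fs->pkts < 2)
        return 0.0;
    return (double)(fs->last_ts_us - fs->first_ts_us) /
           (double)(fs->pkts - 1);
}

/* Toy rule standing in for a trained classifier. The thresholds (300 bytes,
 * 30 ms) are illustrative assumptions, not values used by DIFFUSE. */
static enum flow_class flow_classify(const struct flow_stats *fs)
{
    if (fs->pkts < 2)
        return CLASS_UNKNOWN;
    if (flow_mean_len(fs) < 300.0 && flow_mean_iat_us(fs) < 30000.0)
        return CLASS_REALTIME;
    return CLASS_BULK;
}
```

A caller would invoke flow_stats_init() once per flow, flow_stats_update() for every packet of that flow, and flow_classify() once enough packets have been observed. Keeping only running sums means the per-packet cost stays constant, which matters for code on the forwarding path.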

The goal of this project is to integrate the existing DIFFUSE kernel code into the userspace version of IPFW/Dummynet, test the integrated system, and document both the integration and the tests. The code should be developed for Linux (and/or FreeBSD). Further goals of the project are to propose an efficient method for updating the DIFFUSE userspace version when updated versions of the kernel code become available, and (if time permits) to carry out and document a preliminary performance evaluation of the developed system.

Project Skills

The project team will require:

Project management skills
Basic knowledge of IP computer networks
Basic knowledge of FreeBSD or Linux (or the ability to acquire this knowledge quickly)
Basic knowledge of the C programming language (or the ability to learn C quickly)

Initial References

[1] DIFFUSE: http://www.caia.swin.edu.au/urp/diffuse/
[2] IPFW/Dummynet: http://info.iet.unipi.it/~luigi/dummynet/
[3] netmap-ipfw: https://github.com/luigirizzo/netmap-ipfw