Statistical traffic classification by Boosting Support Vector Machines

Gómez, Gabriel - Belzarena, Pablo

Abstract:

In recent years, traffic classification based on the statistical properties of flows has become an important topic. In this paper we statistically analyze the data length of the first few segments exchanged by a transport flow. This traffic classification method may be useful for early traffic identification in real time: since it takes into account only the beginning of the flow, it can be used to trigger online actions. This work proposes the use of a supervised machine learning method for traffic identification based on Support Vector Machines (SVM). We compare the SVM classification accuracy with a more classical centroid-based approach, obtaining good results. We also propose an improvement on the classification accuracy achieved by a single SVM model, introducing a weighted voting scheme over the verdicts of a sequence of SVM models. This sequence is generated by means of the boosting technique, and the proposed method improves the classification accuracy of poorly classified classes without noticeable detriment to the other traffic classes. This work analyzes the behavior of both TCP and UDP transport protocols.


Bibliographic Details
2012
Traffic identification
Traffic classification
Support vector machines
Boosting
Telecommunications
English
Universidad de la República
COLIBRI
https://hdl.handle.net/20.500.12008/41156
Open access
Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0)
_version_ 1807522938086227968
author Gómez, Gabriel
author2 Belzarena, Pablo
author2_role author
author_facet Gómez, Gabriel
Belzarena, Pablo
author_role author
bitstream.checksum.fl_str_mv 7f2e2c17ef6585de66da58d1bfa8b5e1
9833653f73f7853880c94a6fead477b1
4afdbb8c545fd630ea7db775da747b2f
9da0b6dfac957114c6a7714714b86306
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
bitstream.url.fl_str_mv http://localhost:8080/xmlui/bitstream/20.500.12008/41156/4/license.txt
http://localhost:8080/xmlui/bitstream/20.500.12008/41156/1/license_text
http://localhost:8080/xmlui/bitstream/20.500.12008/41156/2/license_url
http://localhost:8080/xmlui/bitstream/20.500.12008/41156/3/license_rdf
collection COLIBRI
dc.creator.none.fl_str_mv Gómez, Gabriel
Belzarena, Pablo
dc.date.accessioned.none.fl_str_mv 2023-11-14T17:04:34Z
dc.date.available.none.fl_str_mv 2023-11-14T17:04:34Z
dc.date.issued.es.fl_str_mv 2012
dc.date.submitted.es.fl_str_mv 20231114
dc.description.es.fl_txt_mv Paper presented at LANC 12, Statistical traffic classification by boosting support vector machines
dc.identifier.citation.es.fl_str_mv Gómez, G., Belzarena, P. "Statistical traffic classification by boosting support vector machines". Published in Proceedings of the 7th Latin American Networking Conference, Medellín, Colombia, 4-5 Oct. 2012, pp. 9–18. https://doi.org/10.1145/2382016.2382019
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12008/41156
dc.language.iso.none.fl_str_mv en
eng
dc.rights.license.none.fl_str_mv Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0)
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
dc.source.none.fl_str_mv reponame:COLIBRI
instname:Universidad de la República
instacron:Universidad de la República
dc.subject.es.fl_str_mv Traffic identification
Traffic classification
Support vector machines
Boosting
dc.subject.other.es.fl_str_mv Telecommunications
dc.title.none.fl_str_mv Statistical traffic classification by Boosting Support Vector Machines
dc.type.es.fl_str_mv Preprint
dc.type.none.fl_str_mv info:eu-repo/semantics/preprint
dc.type.version.none.fl_str_mv info:eu-repo/semantics/submittedVersion
description Paper presented at LANC 12, Statistical traffic classification by boosting support vector machines
eu_rights_str_mv openAccess
format preprint
id COLIBRI_6d84b2c6e66ebae3ab786c97d890a9a5
instacron_str Universidad de la República
institution Universidad de la República
instname_str Universidad de la República
language eng
language_invalid_str_mv en
network_acronym_str COLIBRI
network_name_str COLIBRI
oai_identifier_str oai:colibri.udelar.edu.uy:20.500.12008/41156
publishDate 2012
reponame_str COLIBRI
repository.mail.fl_str_mv mabel.seroubian@seciu.edu.uy
repository.name.fl_str_mv COLIBRI - Universidad de la República
repository_id_str 4771
rights_invalid_str_mv Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0)
status_str submittedVersion
title Statistical traffic classification by Boosting Support Vector Machines
topic Traffic identification
Traffic classification
Support vector machines
Boosting
Telecommunications
url https://hdl.handle.net/20.500.12008/41156