Survey on Machine Learning in 5G


Rohini M

Assistant professor

Dept. of Computer Science and Engineering Coimbatore Institute of Engineering and Technology Tamil Nadu, India.

Suganya G

Assistant professor

Dept. of Information Technology Coimbatore Institute of Engineering and Technology

Tamil Nadu, India.

Selvakumar N

Assistant professor

Dept. of Computer Science and Engineering Coimbatore Institute of Engineering and Technology Tamil Nadu, India.

Shanthi D

Assistant professor

Dept. of Computer Science and Engineering Coimbatore Institute of Engineering and Technology Tamil Nadu, India

Abstract: The core of the next-generation 5G wireless network is the heterogeneous network. The goals of the upcoming 5G heterogeneous network cannot be met unless Artificial Intelligence is deployed in the network. Traditional 4G approaches are centrally managed and reactive: they need additional hardware for every update and whenever demand for network resources grows. 5G addresses these problems by using prediction and traffic learning to increase performance and bandwidth. A heterogeneous network provides better Quality of Service (QoS) and makes explicit use of the network's resources. However, its diversity makes traffic control difficult: network traffic is hard to control and manage because of differing protocols and data transfer rates. To solve this problem, advanced techniques such as Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL), which are proactive, predictive and adaptive, are employed in the 5G network. In this paper we discuss how these techniques are deployed in 5G to reduce network traffic and thereby increase the efficiency of the network.

Keywords: 5G Network, Artificial Intelligence, Machine Learning, Deep Learning.


    In recent years, with the rapid development of the Internet, networking has attracted a great deal of attention in both industry and academia. Advances in mobile communication have significantly increased data transfer rates for large volumes of data and multimedia communication services. Mobile communication is now stepping into 5G. To satisfy the demand for data traffic, network technology is moving towards heterogeneous networks, which provide ubiquitous internet access and enhanced public services [1]. The next-generation network is service-driven: a single infrastructure should efficiently provide different services, such as low-latency communication, enhanced mobile broadband and massive machine-type communication, over a heterogeneous network. A heterogeneous network comprises different layers of cells, such as femto, macro, micro and pico cells, together with relays and diverse user devices and applications. In this paper we discuss how the potential of AI is used in next-generation wireless networks through basic learning algorithms such as ML and DL.

    Fig-1: Components of Heterogeneous Network

    Fig-2: Traffic Classification approaches.

    1. Port based IP traffic classification

      TCP and UDP multiplex different streams between IP endpoints by means of port numbers. Many applications use a 'well-known' port on which other hosts can initiate communication. The application is inferred by looking up the destination port number of the TCP SYN packet in the Internet Assigned Numbers Authority (IANA) list of registered ports. This approach has limitations, however. First, some applications may not have their ports registered with IANA (for example, peer-to-peer applications such as Napster and Kazaa). An application may also use ports other than its well-known ports to get around operating system access control restrictions. In addition, server ports are sometimes allocated dynamically as needed. Although port-based traffic classification is the fastest and simplest method, several studies have shown that it performs poorly, e.g., less than 70% accuracy in classifying flows [11][12].
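The port lookup described above can be sketched in a few lines. This is only an illustration: the table below is a small hand-picked subset of well-known ports rather than the full IANA registry, and the function name `classify_by_port` and the example flows are hypothetical.

```python
# Illustrative subset of well-known ports; a real classifier would load
# the complete IANA registry instead of this hand-written table.
WELL_KNOWN_PORTS = {
    20: "ftp-data", 21: "ftp", 22: "ssh", 25: "smtp",
    53: "dns", 80: "http", 443: "https",
}

def classify_by_port(dst_port: int) -> str:
    """Return the application registered for dst_port, or 'unknown'."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")

# Hypothetical flows as (src_ip, src_port, dst_ip, dst_port).
flows = [
    ("10.0.0.1", 51000, "10.0.0.2", 80),
    ("10.0.0.3", 51001, "10.0.0.4", 6881),  # unregistered P2P-style port
]
labels = [classify_by_port(dst_port) for (_, _, _, dst_port) in flows]
print(labels)
```

The second flow shows the weakness discussed above: an application on an unregistered or dynamically chosen port cannot be identified by this method.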

    2. Payload based IP traffic classification

      This approach inspects the contents of packets to determine the application. Packet payloads are examined bit by bit to locate bit streams that contain a known signature. If such a bit stream is found, the packets can be labelled accurately. This approach is commonly used for P2P traffic detection and network intrusion detection. Its major drawbacks are that privacy laws may not allow administrators to inspect the payload, that it imposes significant complexity and processing load on the traffic identification device, and that it requires substantial computational power and storage capacity because it analyses the full payload [2].

    3. Protocol Behavior or Heuristics Based Classification

      In this method traffic is classified on the basis of connection-level patterns and network protocol behavior. It relies on identifying and observing patterns of host behavior at the transport layer.

      The advantage of this classification is that access to the packet payload is not needed [10][2].

    4. Classification based on flow statistics traffic properties:

      The preceding techniques are limited by their dependence on the inferred semantics of information gathered through deep inspection of packet content (payload and port numbers). Newer approaches rely on the statistical characteristics of traffic to identify the application [4][7][6][9]. The underlying assumption is that traffic at the network layer has statistical properties that are unique to certain classes of applications and allow different source applications to be distinguished from one another. These approaches use network- or transport-layer statistics such as the distribution of flow length, flow idle time, packet inter-arrival time and packet lengths. This method can determine the application type, but usually not the specific client. For example, it cannot determine whether a flow belongs specifically to Skype or to MSN Messenger voice traffic. The advantage of this approach is that no packet payload inspection is involved.
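As a rough sketch of how such statistical properties could be computed, the function below derives a few per-flow statistics from packet timestamps and lengths. The input layout, the function name `flow_features` and the chosen statistics are assumptions made for illustration, not the feature set of any particular study.

```python
from statistics import mean, pstdev

def flow_features(packets):
    """Compute simple statistics for one flow.

    packets: list of (timestamp_seconds, length_bytes) in arrival order.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Gaps between consecutive packets (inter-arrival times).
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "byte_volume": sum(sizes),
        "duration": times[-1] - times[0],
        "mean_packet_length": mean(sizes),
        "std_packet_length": pstdev(sizes),
        "mean_interarrival": mean(gaps) if gaps else 0.0,
        "max_idle_time": max(gaps) if gaps else 0.0,
    }

# Hypothetical three-packet flow.
example_flow = [(0.0, 100), (0.5, 300), (1.5, 200)]
features = flow_features(example_flow)
print(features)
```

Only header-level information (arrival time and length) is used, which reflects the point made above that no payload inspection is required.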


    Machine learning has been applied in almost every field to leverage its power. ML techniques have been used effectively in a variety of applications such as speech recognition, bioinformatics and computer vision. Machine learning is mainly used for prediction and classification; in networking it is mainly used for performance prediction and intrusion detection. Rather than following a fixed set of rules, machine learning constructs models that learn from data and make decisions directly, without being explicitly programmed.

    A model is trained by providing it with data sets; when exposed to new data, it can learn, predict and improve by itself. Machine learning algorithms can be classified into three categories: supervised learning, unsupervised learning and reinforcement learning [3].

    In supervised learning the model is trained on a labelled data set; when new test data is given, it uses what it learned from the training data to predict the output. Supervised learning is mainly used for regression and classification problems.

    In unsupervised learning the training data set is unlabelled, and the model finds patterns and relationships among the data. It is mainly used for clustering and association problems. In reinforcement learning the model is not given a training data set; it learns from feedback obtained by interacting with its environment.

    Fig-3: Machine Learning Work Flow

  3. SUPERVISED LEARNING ALGORITHM

    The supervised machine learning algorithms are:

      • Naïve Bayes

      • Support Vector Machine (SVM)

      • K-Nearest Neighbor

    The steps involved in supervised machine learning algorithm are:

    STEP1: Prepare data

    STEP2: Choose an algorithm

    STEP3: Fit a model

    STEP4: Choose a validation method

    STEP5: Examine fit and update until satisfied

    STEP6: Use fitted model for prediction

    Fig-4 shows the workflow of a supervised learning algorithm.

    Fig-4: Work Flow of Supervised Learning

    1. Naïve Bayes

      Naïve Bayes is a classification algorithm that relies mainly on Bayes' theorem. To control traffic in a network, a Naïve Bayes classifier is trained on flow features and then used to classify the network traffic [26].

      Fig-5: Conditional probability of Naïve Bayes classifier

      Fig-6: Naïve Bayes classifier for network traffic classification and detection.

      1. Pre-processing

        In this step, IP packets crossing the network are collected and grouped into flows by examining the packet headers. A flow is usually defined as consecutive IP packets sharing the same 5-tuple: source IP, source port, destination IP, destination port and transport layer protocol [25]. Since we focus on a statistical approach to classification, the statistical features of each flow must be extracted and discretized to represent the traffic flows.

      2. Correlation Based Feature Selection

        In this step, statistical features are extracted during pre-processing to represent traffic flows, and feature selection [16] is then applied to remove irrelevant and redundant features from the feature set. Correlation-based feature subset selection is used in the experiments; it searches for a subset of features with high class-specific correlation and low inter-correlation. The correlation coefficient is denoted by 'r', where n represents the number of instances, x denotes the attribute being tested for correlation, and y denotes the attribute tested against x. Finally, 0.75 is selected as the threshold value.
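The formula for r is not reproduced in the text. The sketch below assumes it is the standard Pearson correlation coefficient, which matches the symbols n, x and y described above, and it assumes the 0.75 threshold is applied to drop a feature that is strongly correlated with one already kept. Both are assumptions, and the feature names and helper `drop_redundant` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    denom = sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    return (n * sxy - sx * sy) / denom if denom else 0.0

def drop_redundant(features, threshold=0.75):
    """Keep a feature only if |r| with every feature kept so far
    does not exceed the threshold (assumed redundancy filter)."""
    kept = []
    for name in features:
        if all(abs(pearson_r(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-flow feature columns.
features = {
    "bytes": [100, 200, 300, 400],
    "packets": [1, 2, 3, 4],      # perfectly correlated with "bytes"
    "idle_time": [5, 1, 4, 2],
}
selected = drop_redundant(features)
print(selected)
```

A full correlation-based feature selection would also score each feature against the class label; this sketch covers only the inter-feature part.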

      3. Feature Discretization

        Discretization [30] is the process of converting numeric values into intervals and associating each interval with a nominal symbol. These symbols are then used as new values in place of the original numeric values. The resulting dataset is smaller than the original, i.e., a discretized feature has fewer possible values than a non-discretized one. The key step in discretization is the choice of intervals, which may be determined by an expert in the field or by a discretization rule. There are two approaches to discretization: one is to discretize each feature without knowledge of the classes in the training set (unsupervised discretization); the other is to make use of the classes when discretizing (supervised discretization) [15].
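A minimal sketch of unsupervised discretization is equal-width binning, shown below. The choice of equal-width intervals, the number of bins and the symbol names are assumptions for illustration; the text does not state which discretization rule was used.

```python
def equal_width_bins(values, n_bins=3):
    """Map each numeric value to a nominal symbol 'bin0'..'bin{n-1}'
    using intervals of equal width between min and max."""
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are identical.
    width = (hi - lo) / n_bins or 1.0
    symbols = []
    for v in values:
        # The maximum value would land in bin n_bins, so clamp it.
        idx = min(int((v - lo) / width), n_bins - 1)
        symbols.append(f"bin{idx}")
    return symbols

# Hypothetical packet sizes in bytes.
packet_sizes = [40, 60, 500, 900, 1500]
print(equal_width_bins(packet_sizes))
```

Five distinct numeric values are reduced to three symbols here, which illustrates how a discretized feature ends up with fewer possible values.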

      4. Naïve Bayes Classification

        A Naïve Bayes (NB) ML algorithm [6] has a simple structure consisting of a class node as the parent node of all other nodes. The basic structure of the Naïve Bayes classifier is shown in the Naïve Bayes Classifier figure, in which C represents the main class and a, b, c and d represent the feature or attribute nodes of a particular sample. No other connections are allowed in a Naïve Bayes structure. Naïve Bayes has been used as an effective classifier. It is simpler to construct than other classifiers because the structure is given a priori, so no structure learning procedure is needed. Naïve Bayes works well over a large number of datasets, particularly where the features used to characterize each sample are not strongly correlated.

        Fig: Naïve Bayes Classifier
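The classifier structure described above (one class node, independent feature nodes) can be sketched for discretized features as follows. This is a minimal illustration with Laplace smoothing added as an assumption; the feature symbols, class labels and function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import log

def train_nb(samples, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> observed values
    for x, c in zip(samples, labels):
        for i, v in enumerate(x):
            feature_counts[(c, i)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values

def predict_nb(model, x):
    """Return the class with the highest posterior log-probability."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, n_c in class_counts.items():
        score = log(n_c / total)  # log prior
        for i, v in enumerate(x):
            # Laplace-smoothed conditional probability P(feature_i = v | c).
            score += log((feature_counts[(c, i)][v] + 1) / (n_c + len(values[i])))
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical discretized flows: (packet size symbol, duration symbol).
samples = [("small", "short"), ("small", "short"),
           ("large", "long"), ("large", "short")]
labels = ["dns", "dns", "ftp", "ftp"]
model = train_nb(samples, labels)
print(predict_nb(model, ("small", "short")), predict_nb(model, ("large", "long")))
```

Because the structure is fixed in advance, training reduces to counting, which is why no structure learning step is needed.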

    2. K-Nearest Neighbor

      It is a classification algorithm that groups similar data together. When a new data point enters the model, it is assigned to a group according to its closeness to the existing data [5].

      It is a non-parametric algorithm that does not require any prior knowledge about the data, which enhances the robustness of the model. In a network, traffic can be classified with K-nearest neighbor by assigning each flow to a group. K is a positive integer (Fig-7 uses k = 1 and k = 3). For every new data point to be classified, we determine which neighbouring group it is closest to [30].

      Fig-7: K-nearest neighbor data classification for k=1 and 3.
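A minimal K-nearest neighbor sketch is given below, assuming Euclidean distance and a majority vote among the k closest labelled flows. The two features (mean packet size and mean inter-arrival time) and the class names are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label); returns the majority
    label among the k training points closest to query."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled flows: (mean packet size, mean inter-arrival time).
train = [
    ((100, 0.01), "voip"), ((120, 0.02), "voip"), ((110, 0.015), "voip"),
    ((1400, 0.5), "bulk"), ((1500, 0.4), "bulk"), ((1350, 0.6), "bulk"),
]
print(knn_predict(train, (130, 0.02)), knn_predict(train, (1450, 0.5), k=1))
```

In practice the features would usually be scaled first, since an unscaled feature with a large range (packet size here) dominates the distance.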

    3. Support Vector Machine

    It is a supervised machine learning algorithm which is mainly used for classification and regression. In this algorithm the data is plotted in an n-dimensional space, where n is the number of features used for training [13][14]. Classification is then done by finding the hyperplane that separates the two classes. In networking, the classifier is trained on network features and tested with new data, and it then learns to predict and classify newly arriving traffic.

    We must first train the classifier and then cross-validate it with the test data [17]. To get accurate predictions with an SVM classifier, a kernel function must be chosen and its parameters tuned. The process involved in using an SVM classifier is as follows:

    Step 1: Training the SVM classifier.

    Step 2: Classifying new data with the SVM classifier.

    Step 3: Tuning the SVM classifier.

    In the learning phase, labelled feature vectors are fed to the classifier; the source example describes this for fundus images of the retina [19]. In the testing phase, the feature vector of an unseen sample is fed to the classifier, which assigns it a class (a lesion type in the fundus example). For non-linear classification, SVM uses a kernel function to map the data to a higher-dimensional space.

    Fig-8: SVM hyperplane between two classes for classification.
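To make the idea of a separating hyperplane concrete, the sketch below trains a linear SVM by subgradient descent on the regularized hinge loss. It is a simplification chosen for illustration: no kernel function is used, the hyperparameters are arbitrary, and the two-dimensional data points are hypothetical.

```python
def train_linear_svm(data, epochs=100, lr=0.1, lam=0.01):
    """data: list of (feature_vector, label) with label in {-1, +1}.
    Returns the hyperplane parameters (w, b)."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin or misclassified: hinge-loss step.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only the regularizer acts.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def svm_predict(model, x):
    """Classify x by the side of the hyperplane it falls on."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable two-class data.
data = [((2, 2), 1), ((3, 3), 1), ((2, 3), 1),
        ((-2, -2), -1), ((-3, -3), -1), ((-2, -3), -1)]
model = train_linear_svm(data)
print(svm_predict(model, (4, 4)), svm_predict(model, (-4, -4)))
```

The three steps listed above map onto this sketch as training (`train_linear_svm`), classification (`svm_predict`) and tuning (adjusting `lr`, `lam` and `epochs`).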


    In unsupervised learning algorithms the training data sets are unlabelled. These algorithms are mainly used for clustering problems. In a network, traffic can be clustered based on its features. The unsupervised algorithms used in networking include K-Means and DBSCAN.

    1. K-Means

      K-means is an unsupervised learning algorithm that draws inferences from datasets using only input vectors, without referring to known, labelled outcomes. It groups data points into clusters and finds the patterns present in the dataset. A cluster is a group of data points that share certain similarities [19].

      K-means first selects k centroids at random. It then iterates over two tasks. First, each data point is assigned to its closest centroid using the standard Euclidean distance, which serves as the measure of similarity between flows. Next, each centroid is recalculated as the mean of the data points assigned to it.

      For traffic classification, the training data contains payload so that each flow can be labelled with its source application. Learning involves two steps: the first describes each cluster, and the second determines which applications make up each cluster. During classification, the packet sizes of a new flow are recorded and compared with the clusters, and the flow is assigned to the application that is dominant in the closest cluster [23][24].
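The two alternating K-means tasks can be sketched as below. The fixed number of iterations, the seeded random initialization and the sample points are assumptions made so that the example is small and repeatable.

```python
import random
from math import dist

def kmeans(points, k, iterations=20, seed=0):
    """Cluster points (tuples of numbers) into k groups.
    Returns (assignment, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Task 1: assign every point to its closest centroid.
        assignment = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        # Task 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignment, centroids

# Hypothetical flow features forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

Cluster numbers carry no meaning by themselves; mapping a cluster to an application still requires the labelling step described above.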

    2. DBSCAN

    Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a density-based clustering algorithm that forms clusters from dense regions of objects. Its parameters are eps (the neighbourhood radius) and min points (the minimum number of points required to form a dense region). Clusters are grown from core points by adding the points that are directly density-reachable or density-reachable from them.

    The data are collected using online tools, features are selected based on the packets, and the data are then fitted to the model for training and testing. Finally, the flows are predicted and classified. The steps involved in DBSCAN are as follows.

    Let Y = {Y1, Y2, Y3, ...} be the set of data points.

    1. Start with an arbitrary point that has not yet been visited.

    2. Extract the neighbours of this point using eps.

    3. If there are sufficient neighbours (at least min points), the clustering process starts and the point is marked as visited; otherwise the point is labelled as noise.

    4. If a point is found to be part of the cluster, its neighbours are also part of the cluster, and the procedure from step 2 is repeated for all neighbouring points until all points of the cluster have been determined.

    5. A new unvisited point is retrieved and processed.

    6. The process stops when all points are marked as visited.
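The steps above can be sketched as follows. This is a straightforward, unoptimized version that scans all points for every neighbourhood query; the parameter values and sample points are hypothetical, and a point counts as its own neighbour, which is one common convention.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_points):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"

    def region(i):
        # Indices of all points within eps of point i (including i).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_points:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_points:
                queue.extend(j_neighbours)  # j is a core point: expand
    return labels

# Hypothetical points: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_points=3)
print(labels)
```

Unlike K-means, the number of clusters is not given in advance, and the isolated point is reported as noise instead of being forced into a cluster.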

  5. ARTIFICIAL NEURAL NETWORK

    An artificial neural network is one of the learning algorithms used within machine learning. It consists of several layers for learning and analysing data. It learns in a way loosely modelled on the human brain and is mainly used for pattern recognition and data classification. Neural networks are trained using examples rather than being programmed explicitly. A network contains three kinds of layers:

      • Input layer

      • Hidden layer

      • Output layer

    It may also contain multiple hidden layers. Hidden layers are mainly used for feature extraction and computation. Feed-forward and feedback are the two topologies of neural networks [22].

    Fig-9: Artificial Neural Network

    1. Backpropagation algorithm

      It is the most important algorithm for training a neural network. Here it is mainly used to predict network traffic effectively in a heterogeneous network. The backpropagation algorithm is used to train the weights of a multi-layer feed-forward network.

      Each neuron has a set of weights that must be maintained. Forward propagation consists of neuron activation, neuron transfer and propagation of the result to the next layer.

      Next, the error is back-propagated: the error is calculated at the output and then propagated backwards through the hidden layer. This involves the transfer derivative and error propagation. The network is then trained by repeatedly forwarding inputs and propagating the error back [16].

      Finally, the trained network is used to predict the network traffic.
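The forward and backward passes described above can be sketched for a tiny network with one hidden layer. The sigmoid transfer function, the squared-error loss, the learning rate, the initial weights and the toy data (two normalised measurements mapped to a congestion flag) are all assumptions chosen for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def forward(x, w_hidden, w_out):
    """Forward propagation; the last weight in every row is the bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
              for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def train(data, w_hidden, w_out, lr=0.5, epochs=2000):
    """Update the weights in place with per-sample backpropagation."""
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Output error multiplied by the sigmoid transfer derivative.
            delta_out = (output - target) * output * (1.0 - output)
            # Error propagated back to each hidden neuron.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(len(hidden))]
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] -= lr * delta_out * h
            for j, row in enumerate(w_hidden):
                for i, v in enumerate(x + [1.0]):
                    row[i] -= lr * delta_hidden[j] * v

def loss(data, w_hidden, w_out):
    """Mean squared error over the data set."""
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2
               for x, t in data) / len(data)

# Hypothetical samples: [normalised load, normalised delay] -> congested?
data = [([0.1, 0.2], 0.0), ([0.2, 0.1], 0.0),
        ([0.8, 0.9], 1.0), ([0.9, 0.8], 1.0)]
w_hidden = [[0.5, 0.5, 0.0], [-0.5, -0.5, 0.0]]  # two hidden neurons
w_out = [0.5, -0.5, 0.0]
initial_loss = loss(data, w_hidden, w_out)
train(data, w_hidden, w_out)
final_loss = loss(data, w_hidden, w_out)
print(initial_loss, final_loss)
```

The error is computed once at the output and reused for every hidden neuron, which is what makes backpropagation efficient compared with estimating each weight's gradient separately.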


    Machine learning plays an important role in next-generation networks. The steps involved in applying machine learning to networking are:

      • Step1: Problem formulation

      • Step2: Data collection

      • Step3: Data analysis

      • Step4: Model construction

      • Step5: Model validation

      • Step6: Deployment and inference

    1. Problem formulation

      In machine learning the training process is time-consuming, so it is essential that the problem is formulated correctly at the beginning of the process. There should be a strong relation between the problem and the data that is collected. Machine learning problems are categorized as clustering, classification or decision making, and the problem statement should fall under one of these categories [8][15][16].

      This helps in identifying the learning model and in collecting data. When the problem is not formulated properly, it leads to an unsuitable learning model and unsatisfactory performance.

    2. Data collection

      There are two types of data collection: offline data collection and online data collection.

      In online data collection, real-time data is collected; it can be used as feedback for the model and also as re-training data. Offline data can be collected from repositories [17][18][19].

      For the classification of network traffic, we use a dataset built from a real-world traffic trace named 'wide'. The wide dataset consists of traffic flows that are randomly selected from the wide trace and carefully identified by manual inspection. It consists of 3416 instances with 7 classes (bt, dns, ftp, http, smtp, yahoomsg, ssh) and 22 attributes. The features extracted in this process [13] are listed in Table 1.

      Using monitoring and measurement tools, online and offline data can be collected effectively, which also helps secure various aspects of data collection. The collected data can also be stored for model adaptation. After data collection, the data is divided for the training (learning) phase, the validation phase and the testing phase.

    3. Data analysis

      Data analysis consists of two phases. They are:

      • pre-processing

      • feature extraction.

    Pre-processing is done to remove noise from the collected data. The features of the data are then extracted, which is a prerequisite for learning and training [10]. The types of features that can be extracted from the network are packet-level features and flow-level features.

    Training involves fitting the model to the data set that was collected at the beginning of the process.

    The tuning process helps the model to learn by itself by comparing its output with the training data.

    5. Model validation

      Cross-validation is used in the testing process to assess the accuracy of the model. This helps in optimizing the model and maintaining the overall performance of the system.

    6. Deployment and inference

      In the deployment and inference stage, the trade-offs and the stability of the model are examined to verify its accuracy and to find the best way to carry out the preceding steps.
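One common form of cross-validation is k-fold validation, sketched below. The contiguous fold layout, the helper names and the trivial majority-class model used to exercise it are assumptions; the text does not say which validation scheme or model was used.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def cross_validate(samples, labels, k, fit, predict):
    """Return the mean accuracy of fit/predict over k folds."""
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k

# Placeholder model for the demonstration: always predict the majority class.
def fit_majority(samples, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

# Hypothetical flows (ids only) and their application labels.
samples = list(range(6))
labels = ["http", "http", "http", "dns", "http", "http"]
score = cross_validate(samples, labels, 3, fit_majority, predict_majority)
print(score)
```

Because the folds are contiguous, the data should be shuffled beforehand if it is ordered by class or by time.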


    Traffic classification is a process in which network traffic is categorized into a number of traffic classes based on its parameters. Network traffic is first captured and the features of the selected data are extracted. Training is then carried out using a data sampling method; finally, the algorithm is run and the results are calculated.

    Fig-10 represents the workflow of traffic classification in networks. The extracted features are of two types:

      • Packet-level features

      • Flow-level features

    Packet-level features include the mean, root mean square and variance of packet size.

    Flow-level features include the mean flow duration and the mean number of packets per flow.


    Types of Features    Feature Description
                         Number of packets transferred
                         Volume of bytes transferred
    Packet Size          Min, max, median and standard deviation of packet size
    Packet Time          Min, max, median and standard deviation of inter-packet time




    4. Model construction

      This step involves model selection, training and tuning. A suitable learning model and algorithm must be selected according to the size of the data set.

      Fig-10: Traffic classification

      Fig 11: Workflow of machine learning in 5G networks

      Fig-11 represents the machine learning workflow in a heterogeneous network [7].


    Reinforcement learning enables the model to learn automatically and make decisions by interacting continuously with its environment. Combined with deep learning, it becomes a solution to real-world problems that were previously intractable [9].

    It has three components:

      • A policy function, which defines the behaviour of the agent

      • A value function, which evaluates each state

      • A model, which represents the knowledge learned about the environment
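As a small illustration of learning by interaction, the sketch below uses tabular Q-learning on a hypothetical five-state chain in which the agent is rewarded for reaching the last state. Q-learning is a swapped-in, model-free technique: the Q-table plays the role of the value function and the greedy policy is derived from it, while no model of the environment is learned. The environment, rewards and hyperparameters are all assumptions.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = ("left", "right")

def step(state, action):
    """Hypothetical environment: move along a chain of states 0..4.
    Returns (next_state, reward, done)."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # purely exploratory behaviour
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = nxt
    return q

q = q_learning()
# Greedy policy: in each non-terminal state pick the highest-valued action.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

The agent is never told that moving right is correct; the policy emerges from the rewards it observes while interacting with the environment.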

  9. DEEP LEARNING IN HETEROGENEOUS NETWORK

    The deep learning mechanism operates in three phases [8]. They are:

      • Initial phase

      • Training phase

      • Action or running phase

    Initial Phase

    In this phase the relevant data for the deep learning system is obtained. The traditional routing protocol OSPF is used to simulate communication between different routers under different conditions and to record the traffic patterns in the network.

    Training Phase

    The training algorithm contains two main parts:

      • The greedy layer-wise training method is used to initialize the deep learning system.

      • The backpropagation algorithm is used to fine-tune the deep neural network.

    The training phase is executed in each router.

    Running phase

    In the running phase, the system is executed and its performance is measured.


AI is used as a tool to improve 5G technologies. AI algorithms have not been widely used in networking so far because of the lack of suitable learning processes in past years. The heterogeneous network is the basis of the next-generation network, where network traffic plays a major role in degrading the performance of the network. In this paper we discussed machine learning techniques and their implementation in the 5G heterogeneous network to increase its performance by reducing network traffic.


  1. L. C. I, S. Han, Z. Xu, et al., New paradigm of 5G wireless internet, IEEE Journal on Selected Areas in Communications, vol. 34, no. 3, pp. 474-482, 2016.

  2. R. Li, Z. Zhao, X. Zhou, et al., Intelligent 5G: when cellular networks meet artificial intelligence, IEEE Wireless Communications, vol. PP, no. 99, pp. 2-10, 2017.

  3. S. Dzulkifly, L. Giupponi, F. Sai, et al., Decentralized Q learning for uplink power control, IEEE International Workshop on Computer Aided Modelling and Design of Communication Links and Networks, pp. 54-58, 2015.

  4. M. Shafi et al., 5G: A Tutorial Overview of Standards, Trials, Challenges, Deployment, and Practice, IEEE JSAC, vol. 35, no. 6, pp. 1201-1221, June 2017.

  5. D. Soldani and A. Manzalini, Horizon 2020 and beyond: on the 5G operating system for a true digital society, IEEE Vehicular Technology Magazine, vol. 10, no. 1, pp. 32-42, 2015.

  6. Y. Zhang et al., Home M2M Networks: Architectures, Standards, and QoS Improvement, IEEE Commun. Mag., vol. 49, no. 4, pp. 44-52, Apr. 2011.

  7. Z. Chen, J. Wen, and Y. Geng, Predicting future traffic using hidden Markov models, in Proceedings of the 24th IEEE International Conference on Network Protocols (ICNP), pp. 1-6, 2016.

  8. I. Portugal, P. Alencar, and D. Cowan, The Use of Machine Learning Algorithms in Recommender Systems: A Systematic Review, Expert Systems with Applications, vol. 97, pp. 205-227, May 2018.

  9. S. Maharjan et al., Dependable Demand Response Management in the Smart Grid: A Stackelberg Game Approach, IEEE Trans. Smart Grid, vol. 4, no. 1, pp. 120-132, 2013.

  10. G. Alnwaimi, S. Vahid, and K. Moessner, Dynamic heterogeneous learning games for opportunistic access in LTE-based macro/femtocell deployments, IEEE Transactions on Wireless Communications, vol. 14, no. 4, pp. 2294-2308, 2015.

  11. D. D. Nguyen, H. X. Nguyen, and L. B. White, Reinforcement Learning with Network-Assisted Feedback for Heterogeneous RAT Selection, IEEE Transactions on Wireless Communications, vol. 16, no. 9, pp. 6062-6076, 2017.

  12. U. Challita, L. Dong, and W. Saad, Deep learning for proactive resource allocation in LTE-U networks, in European Wireless 2017 - 23rd European Wireless Conference, 2017.

  13. R. Pascanu, T. Mikolov, and Y. Bengio, On the difficulty of training recurrent neural networks, Tech. Rep., 2013.

  14. C. Jiang, H. Zhang, Y. Ren, Z. Han, K. C. Chen, and L. Hanzo, Machine Learning Paradigms for Next-Generation Wireless Networks, IEEE Wireless Communications, 2017.

  15. T. E. Bogale, X. Wang, and L. B. Le, Machine Intelligence Techniques for Next-Generation Context-Aware Wireless Networks, ITU Special Issue: The impact of Artificial Intelligence (AI) on communication networks and services, vol. 1, 2018.

  16. G. Villarrubia, J. F. De Paz, P. Chamoso, and F. D. la Prieta, Artificial neural networks used in optimization problems, Neurocomputing, vol. 272, pp. 10-16, 2018.

  17. B. Bojović, E. Meshkova, N. Baldo, J. Riihijärvi, and M. Petrova, Machine learning-based dynamic frequency and bandwidth allocation in self-organized LTE dense small cell deployments, Eurasip Journal on Wireless Communications and Networking, vol. 2016, no. 1, 2016.

  18. E. Balevi and R. D. Gitlin, Unsupervised machine learning in 5G networks for low latency communications, 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC 2017), vol. 2018-January, pp. 1-2, 2018.

  19. U. Challita, L. Dong, and W. Saad, Deep learning for proactive resource allocation in LTE-U networks, in European Wireless 2017 - 23rd European Wireless Conference, 2017.

  20. R. Pascanu, T. Mikolov, and Y. Bengio, On the difficulty of training recurrent neural networks, Tech. Rep., 2013.

  21. Q. V. Le, N. Jaitly, and G. E. Hinton, A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, Tech. Rep., Google, 2015.

  22. G. Alnwaimi, S. Vahid, and K. Moessner, Dynamic heterogeneous learning games for opportunistic access in LTE-based macro/femtocell deployments, IEEE Transactions on Wireless Communications, vol. 14, no. 4, pp. 2294-2308, 2015.

  23. D. D. Nguyen, H. X. Nguyen, and L. B. White, Reinforcement Learning with Network-Assisted Feedback for Heterogeneous RAT Selection, IEEE Transactions on Wireless Communications, vol. 16, no. 9, pp. 6062-6076, 2017.

  24. M. S. Parwez, D. B. Rawat, and M. Garuba, Big data analytics for user-activity analysis and user-anomaly detection in mobile wireless network, IEEE Transactions on Industrial Informatics, vol. 13, no. 4, pp. 2058-2065, 2017.

  25. U. Challita, L. Dong, and W. Saad, Deep learning for proactive resource allocation in LTE-U networks, in European Wireless 2017 - 23rd European Wireless Conference, 2017.

  26. L.-C. Wang and S. H. Cheng, Data-Driven Resource Management for Ultra-Dense Small Cells: An Affinity Propagation Clustering Approach, IEEE Transactions on Network Science and Engineering, vol. 4697, no. c, pp. 1-1, 2018.

  27. S. P. Sotiroudis, K. Siakavara, and J. N. Sahalos, A Neural Network Approach to the Prediction of the Propagation Path-loss for Mobile Communications Systems in Urban Environments, PIERS Online, vol. 3, no. 8, pp. 1175-1179, 2007.

  28. T. M. Mitchell, Machine Learning, 1st ed. McGraw-Hill Science/Engineering/Math, 1997.

  29. T. Zhou, L. Chen, and J. Shen, Movie Recommendation System Employing the User-Based CF in Cloud Computing, Proc. 2017 IEEE Intl. Conf. Computational Science and Engineering and Embedded and Ubiquitous Computing, vol. 2, pp. 46-50, 2017.

  30. P. U. Stanford Vision Lab, Stanford University, ImageNet. [Online]. Available:
