“AI networking offers great potential to disrupt long-standing traditional networking operations to create a massive productivity increase.” Gartner, Innovation Insight: AI Networking Has the Potential to Revolutionize Network Operations
Artificial intelligence for IT operations (AIOps) is expanding into AI networking, which focuses on ongoing “day 2” network administration, maintenance, and optimization. It combines networking infrastructure and AI to automate and optimize IT operations.
AIOps is focused more broadly on the infrastructure and operations (I&O) level, whereas AI networking is specific to the networking domain (data center switching, wired, wireless, LAN, WAN, SD-WAN, multicloud).
Gartner coined the phrase in 2023, though the concept had previously existed under several names that described essentially the same thing. Vendors have referred to it as intent-based networking, autonomous networks, self-driving networks, and self-healing networks.
IT infrastructure is critical to today’s enterprise, but it can be complex and difficult to manage, and IT teams often require specific, high-level skills to identify, troubleshoot, and solve network problems. Additionally, network managers are bombarded with alerts from all angles that can be difficult to sift through and prioritize. All of this is complicated by the ongoing talent shortage of IT workers, which makes automation an urgent matter.
AI networking seeks to transform traditional IT operations and make networks more intelligent, self-adaptive, efficient, and reliable. The technology uses machine learning, deep learning, natural language processing (NLP), generative AI (genAI), and other methods to monitor, troubleshoot, and secure networks.
Core capabilities include:
Automation of networks
Tasks like network configuration, monitoring, and troubleshooting are automated by AI networking. As a result, downtime is decreased, performance is enhanced, and resource allocation is optimized. Recommendation and response are automated, as are configuration and problem management, software updates, and other tasks.
Optimized ITSM
AI networking can optimize IT service management (ITSM) by handling the most basic level 1 and level 2 support issues (like password resets or hardware glitches). Leveraging NLP, chatbots and virtual agents can field the most common and simple service desk inquiries and help users troubleshoot. AI can also identify higher-level issues that go beyond step-by-step instructions and pass them along for human support.
AI networking can also help reduce trouble ticket false positives by approving or rejecting tickets before the IT help desk acts on them. This can reduce the probability that human workers will chase tickets that weren’t real problems in the first place, were mistakenly submitted, were duplicates, or were already resolved.
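The ticket screening described above can be sketched in a few lines. This is a minimal illustration only: the `Ticket` fields and the `triage` function are hypothetical names, and the two fixed rules (already resolved, duplicate of an earlier ticket) stand in for whatever trained model a real product would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    # Hypothetical ticket shape, invented for this example.
    ticket_id: str
    device: str
    symptom: str
    resolved: bool = False


def triage(tickets):
    """Split tickets into (accepted ids, rejected (id, reason) pairs)."""
    seen = set()  # (device, symptom) pairs already accepted
    accepted, rejected = [], []
    for t in tickets:
        key = (t.device, t.symptom)
        if t.resolved:
            rejected.append((t.ticket_id, "already resolved"))
        elif key in seen:
            rejected.append((t.ticket_id, "duplicate"))
        else:
            seen.add(key)
            accepted.append(t.ticket_id)
    return accepted, rejected
```

Only the accepted tickets would reach the help desk queue; the rejected ones keep a reason so a human can audit the decision.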
These AI handoffs can improve response times and reduce IT staff workloads, allowing them to focus on strategy and more advanced tasks. AI networking can also enhance operational efficiency and reduce human error caused by alert burnout.
Improved network management and performance
AI can analyze large amounts of network data and traffic and perform predictive network maintenance. Algorithms can identify patterns, anomalies, and trends to anticipate potential issues before they degrade performance or cause unexpected network outages. IT teams can then act on these to prevent or at least minimize disruption.
Through correlation analysis, pattern recognition, and other methods, AI algorithms can target incident causes and suggest remediation actions. This can reduce the IT team’s time and effort in identifying, diagnosing, and resolving issues.
AI can look at traffic, user behavior, and system logs to pinpoint anomalies and flag potential security breaches or attacks. This can support proactive threat detection, response time, mitigation, and network protection. AI can also respond to cybersecurity issues in real time.
While it homes in on day 2 operations, AI networking also supports day 0 and day 1 functions, including network design, setup, and recommendations to optimize network performance.
Through data analysis and statistical models, AI can learn to understand a network and its policies. It can study and process predefined metrics, traffic flows, trends, and patterns, and compare them against established baselines.
Algorithms can analyze trends and use pattern recognition to make sense out of real-time and historical data. Through monitoring and observability, AI can process event and telemetry data to detect incidents as they occur.
The system does this by creating a baseline from historic data, then continually learning and refining patterns of events, with or without a human in the loop. That refinement draws on data as well as human operator input, guidelines, reaction, and interaction.
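As a rough sketch of the baseline idea, the snippet below summarizes historic samples of one metric as a mean and standard deviation, flags new samples that stray too far from it, and folds each new sample back into a sliding window. The function names, the three-sigma threshold, and the window size are assumptions made for illustration; production systems typically use richer models.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize historic samples as (mean, standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a sample more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


def update_baseline(history, value, window=100):
    """Add a new sample, keep the last `window` samples, and recompute."""
    history = (history + [value])[-window:]
    return history, build_baseline(history)
```

An operator's feedback could be modeled by choosing whether to call `update_baseline` with a flagged sample, so that confirmed incidents do not drift into the definition of normal.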
Using baseline models, time-series, and topology information, AI can compress and correlate events across telemetry domains and group related events, thus reducing the need for human intervention.
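To make event compression concrete, here is a small sketch that groups events when they are close in time and occur on the same or directly connected devices. The `correlate` function, the event tuple layout, the `neighbors` map, and the 30-second window are all hypothetical choices for this example, not a description of any vendor's algorithm.

```python
def correlate(events, neighbors, window=30):
    """Group events that are close in time and topologically related.

    events: list of (timestamp_seconds, device, message) tuples
    neighbors: dict mapping a device to the set of devices it links to
    """
    groups = []
    for event in sorted(events):  # process in timestamp order
        ts, device, _ = event
        for group in groups:
            last_ts = group[-1][0]
            devices = {e[1] for e in group}
            related = device in devices or bool(
                neighbors.get(device, set()) & devices
            )
            if ts - last_ts <= window and related:
                group.append(event)
                break
        else:
            # No existing group matched, so this event starts a new one.
            groups.append([event])
    return groups
```

With this approach, a spine port failure followed seconds later by session losses on its attached leaf switches would collapse into one group, so an operator sees a single incident rather than several separate alerts.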
AI networking continuously learns and improves associations between events and human responses, whether through explicit actions, guidance, or simple observation. This process might trigger a system to offer recommendations or take action itself based on its training and parameters.
GenAI
AI networking will increasingly use genAI and large language models (LLMs), which can offer suggestions or create specific, tailored plans of action.
For instance, Gartner posits, an engineer could ask a ChatGPT-like interface to design a leaf-spine network (consisting of two switching layers) that could support 400 servers using Vendor A. Using data (both public and organization- and industry-specific), the platform could then generate the required configurations for this specific prompt.
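For a sense of the arithmetic behind such a design, the sketch below sizes a two-layer leaf-spine fabric for a given server count. It is not how a genAI platform works internally; it only shows the kind of calculation the generated configuration would have to satisfy. The function name and every port count and speed are hypothetical inputs, not figures from Gartner or any vendor.

```python
import math


def size_leaf_spine(servers, server_ports_per_leaf, uplinks_per_leaf,
                    server_port_gbps, uplink_gbps):
    """Estimate switch counts for a two-layer leaf-spine fabric.

    Assumes every leaf has exactly one uplink to every spine, so the
    number of spines equals the number of uplinks per leaf.
    """
    leaves = math.ceil(servers / server_ports_per_leaf)
    spines = uplinks_per_leaf
    downlink_capacity = server_ports_per_leaf * server_port_gbps
    uplink_capacity = uplinks_per_leaf * uplink_gbps
    return {
        "leaves": leaves,
        "spines": spines,
        "leaf_spine_links": leaves * spines,
        "oversubscription": downlink_capacity / uplink_capacity,
    }
```

With assumed inputs of 48 server ports and 4 uplinks per leaf, 25 Gbps server ports, and 100 Gbps uplinks, 400 servers work out to 9 leaves, 4 spines, 36 leaf-spine links, and a 3:1 oversubscription ratio.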
Using a simulated nonproduction environment, enterprises can validate the impacts of network changes before they are deployed in the physical world. A combination of AI and digital twins can also work into a continuous integration/continuous delivery (CI/CD) pipeline that can allow for “what if” scenarios and ensure that the network is operating as expected.
Gartner predicts that by 2026, 50% of networking vendors will offer a digital twin capability in their tools, up from 10% in 2023.
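A “what if” check of the kind a digital twin enables can be sketched as a reachability test on a model of the topology. The example below is deliberately simplified: the network is only a list of undirected links, and `reachable` and `what_if_remove` are hypothetical helper names. A real digital twin would also model configuration, routing policy, and traffic.

```python
from collections import deque


def reachable(links, src, dst):
    """Breadth-first search over an undirected list of (a, b) links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def what_if_remove(links, removed, required_paths):
    """Return the required (src, dst) pairs that break if links are removed."""
    removed_set = {tuple(sorted(link)) for link in removed}
    remaining = [l for l in links if tuple(sorted(l)) not in removed_set]
    return [(s, d) for s, d in required_paths
            if not reachable(remaining, s, d)]
```

In a CI/CD pipeline, a proposed change could be rejected whenever `what_if_remove` returns a non-empty list, before anything touches the production network.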
Gartner asserts that AI networking can drive operational management savings of up to 25%. This is because it can reduce support calls, allow for improved troubleshooting, increase network availability, and optimize end-user experience, “that can’t reasonably be achieved by scaling manual resources,” the firm says.
Notably, AI networking simplifies network management, security, and application infrastructures even as they become more complex due to disparate data center, multicloud, colocation, and edge environments, as well as increasing abstraction layers (Kubernetes or containers).
Ultimately, Gartner says it has seen enterprises experience savings of more than 50% in areas including troubleshooting and install time.
Furthermore, because AI networking simplifies network management, workers don’t need to have deep network configuration and troubleshooting skills. Enterprises can automate via genAI and remove the manual human setup.
And, with fewer workers needed to manage the network, organizations struggling with a skills/experience gap can manage networks in-house rather than outsourcing.
Still, ambiguity remains around what AI networking means and how it works, as the definition is new and fuses different concepts that vendors have been promoting for some time.
This lack of clarity and mixing of terminology has hampered implementation: Gartner estimates that the AI networking adoption rate is less than 10%. This indicates that enterprises are interested, but need more clarification on the technology and what it does.
Inaccurate AI recommendations can lead to incorrect network configurations, creating unnecessary complexity, or causing outages or other issues. This can stem from incorrect prompts from users, or occur when a system hasn’t been trained correctly or with enough data.
Tool sprawl: Enterprises are increasingly concerned with “technical debt” and the associated costs.
Culture and buy-in: Network management personnel can be risk-averse and may not trust AI tools or unproven recommendations. Similarly, some workers may eschew the technology for fear of it replacing their jobs, or because they are content with the status quo. These factors can limit the value of AI investment and its potential benefits.
Enterprises may lack sufficient, quality data to provide proper insights or resolve issues.
Technical skills for areas such as prompt engineering may require additional time and resource investment. While the idea is to automate workflows so that human workers can focus their talents on more strategic, high-level tasks, new data science skills may emerge as systems evolve. For instance, users may need to become versed in prompt engineering or equipped with skills to effectively analyze AI outputs.
Inflated expectations: There’s a lot of hype around AI. And because the technology is so new, there is little standardization, and vendors may overstate their capabilities. This can set expectations that exceed reality, and enterprises may not get the value they’d hoped for, with systems providing incremental or negligible benefits.
Gartner predicts that by 2027, 90% of enterprises will use some AI to automate day 2 network operations. Similarly, the firm says that by 2026, genAI technology will account for 20% of initial network configuration, up from near zero in 2023.
Moving forward, AI networking will be offered via numerous methods. These include the following:
As with any new or evolving technology, enterprises should proceed with care and due diligence. Gartner and other experts make numerous suggestions as enterprises explore AI networking technologies, their use cases, and benefits. These include the following: