Back from CiscoLive two weeks ago, a mad networking week with 14,500 people, where I presented “YANG Data Modeling and NETCONF: Cisco and Industry Developments” with Carl Moberg. As part of this session, we briefly touched on telemetry.
Telemetry is a big buzzword in the networking industry these days. Like any buzzword, telemetry means different things to different people, exactly like SDN or intent-based networking (I guess this one will need its own blog entry at some point in time). Depending on the discussion, telemetry is taken to mean:
- The science and technology of automatic measurement and transmission of data.
- The mechanism to push any monitoring information to a collector (in that sense, NetFlow is a telemetry mechanism). Since it’s about streaming data on a regular basis, it is also known as “streaming telemetry”.
- The data model-driven push of information, pushing YANG objects.
- The hardware-based telemetry, pushing packet-related information from ASICs.
- Device-level telemetry, such as the pushing of information about hardware and software inventory, configuration, the enabled features, etc. with the intention to automate diagnostics, understand overall usage, and provide install base management.
During CiscoLive, I had to clarify my definition of telemetry a few times, and explain why data model-driven telemetry makes more sense. The Cisco documentation goes in that direction:
Telemetry is an automated communications process by which measurements and other data are collected at remote or inaccessible points and transmitted to receiving equipment for monitoring. Model-driven telemetry provides a mechanism to stream data from a model-driven telemetry-capable device to a destination.
Telemetry uses a subscription model to identify information sources and destinations. Model-driven telemetry replaces the need for the periodic polling of network elements; instead, a continuous request for information to be delivered to a subscriber is established upon the network element. Then, either periodically, or as objects change, a subscribed set of YANG objects are streamed to that subscriber.
The data to be streamed is driven through subscription. Subscriptions allow applications to subscribe to updates (automatic and continuous updates) from a YANG datastore, and this enables the publisher to push and in effect stream those updates.
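To make the subscription model concrete, here is a minimal in-memory sketch in Python. It is not Cisco's implementation or any real telemetry API: the `Publisher` and `Subscription` classes, the "tick" notion of time, and the YANG-like paths are all assumptions made up for illustration. It only shows the two push triggers described above, periodic and on-change.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Update = Dict[str, object]


@dataclass
class Subscription:
    """Hypothetical subscription: what to stream, to whom, and when."""
    path_prefix: str                       # subtree of the datastore to stream
    receiver: Callable[[Update], None]     # the subscriber's callback
    period: Optional[int] = None           # ticks between pushes; None = on-change


class Publisher:
    """Toy network element with a flat datastore keyed by YANG-like paths."""

    def __init__(self) -> None:
        self.datastore: Dict[str, object] = {}
        self.subscriptions: List[Subscription] = []
        self.ticks = 0

    def subscribe(self, sub: Subscription) -> None:
        self.subscriptions.append(sub)

    def set(self, path: str, value: object) -> None:
        """Write a leaf and push it to matching on-change subscribers."""
        changed = path not in self.datastore or self.datastore[path] != value
        self.datastore[path] = value
        if not changed:
            return
        for sub in self.subscriptions:
            if sub.period is None and path.startswith(sub.path_prefix):
                sub.receiver({path: value})

    def tick(self) -> None:
        """Advance time by one tick and push snapshots to periodic subscribers."""
        self.ticks += 1
        for sub in self.subscriptions:
            if sub.period is not None and self.ticks % sub.period == 0:
                snapshot = {p: v for p, v in self.datastore.items()
                            if p.startswith(sub.path_prefix)}
                sub.receiver(snapshot)


# Illustrative path only; loosely modelled on an interface list keyed by name.
OPER_STATUS = "/interfaces/interface[name='eth0']/oper-status"

pub = Publisher()
periodic_updates: List[Update] = []
on_change_updates: List[Update] = []
pub.subscribe(Subscription("/interfaces/", periodic_updates.append, period=2))
pub.subscribe(Subscription("/interfaces/", on_change_updates.append))

pub.set(OPER_STATUS, "up")    # pushed immediately to the on-change subscriber
pub.tick()
pub.tick()                    # second tick: snapshot pushed to the periodic subscriber
pub.set(OPER_STATUS, "down")  # pushed immediately again
```

Note that the subscriber never polls: once the subscription is established, the publisher decides when to send, which is the difference from SNMP-style polling.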
To justify the need for data model-driven telemetry, I prefer to start the explanation with: why do we need telemetry? We hear all types of reasons: because SNMP is boring, because SNMP is slow, because SNMP is not precise in terms of polling time, you name it. Note: I even heard “because SNMP is not secure”. Really? SNMPv3 is secure! Anyway, there is a bigger reason to focus on data model-driven telemetry. Let me try to demonstrate this:
- We know that SNMP does not work for configuration (see RFC 3535 for a justification), while it’s suitable for monitoring.
- Network configuration is based on YANG data models, with protocol/encoding such as NETCONF/XML, RESTCONF/JSON, gRPC/protobuf, etc.
- Knowing that a configuration is applied doesn’t imply that the service is running: we must monitor the service operational data at the same time as the configuration.
- There is not much correlation between the MIB modules for the monitoring and YANG modules for configuration, except maybe a few indices such as the ifIndex in RFC2863 or the interface key name in RFC7223. And “Translation of Structure of Management Information Version 2 (SMIv2) MIB Modules to YANG Modules, RFC6643” doesn’t help to map YANG and SMI.
- Any intent-based mechanism requires a quality assurance feedback loop, which is nothing else than the telemetry mechanism.
- Therefore, since the configuration is YANG data model-driven, so must be the telemetry.
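The feedback loop in this argument can be sketched in a few lines of Python. This is only an illustration under a strong assumption: both the intended configuration and the pushed operational data are flattened into dictionaries keyed by the same YANG-like paths. The function name `check_intent` and the paths are hypothetical, not taken from any real tool.

```python
from typing import Dict, List, Tuple

Mismatch = Tuple[str, object, object]


def check_intent(intended: Dict[str, object],
                 observed: Dict[str, object]) -> List[Mismatch]:
    """Return (path, intended, observed) for each intended leaf that the
    telemetry data does not confirm; a missing leaf is reported as None."""
    mismatches: List[Mismatch] = []
    for path, want in intended.items():
        got = observed.get(path)
        if got != want:
            mismatches.append((path, want, got))
    return mismatches


# Illustrative data: what was configured versus what telemetry reports.
intended = {
    "/interfaces/interface[name='eth0']/enabled": True,
    "/interfaces/interface[name='eth0']/mtu": 9000,
}
observed = {
    "/interfaces/interface[name='eth0']/enabled": True,
    "/interfaces/interface[name='eth0']/mtu": 1500,
}

drift = check_intent(intended, observed)
```

The comparison is a plain dictionary lookup only because both sides share one data model. If the monitoring data came from a differently modelled source, a mapping step would have to sit in front of this function.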
A network administrator needs to manage the network as a whole, independently of the use cases or the management protocols. Here is the issue: with different protocols come different data models, and different ways to model the same type of information. When network administrators deal with multiple protocols, network management must perform the difficult and time-consuming job of mapping data models: the one from configuration with the one from monitoring. Network management has been a difficult task with MIB modules, YANG models, IPFIX information elements, syslog plain text, TACACS+, RADIUS, etc. Therefore, I frown upon protocol design decisions that don’t simplify this data model issue. The only exception might (I want to stress might here) be the one from hardware-based telemetry: pushing telemetry directly from ASICs, at line rate, might not leave room for a YANG mapping on the network element.
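As a small illustration of that mapping job, the Python sketch below joins SNMP-style counters keyed by the integer ifIndex with YANG-style configuration keyed by interface name. The function `join_by_name` and the sample values are hypothetical; the only things taken from the real models are that IF-MIB rows are indexed by ifIndex and that the YANG interface list is keyed by name.

```python
from typing import Dict, Optional


def join_by_name(snmp_in_octets: Dict[int, int],
                 if_index_to_name: Dict[int, str],
                 yang_enabled: Dict[str, bool]) -> Dict[str, Dict[str, Optional[object]]]:
    """Merge monitoring and configuration data into one view keyed by name.

    The if_index_to_name table is the extra state a management system has to
    collect and keep current just to correlate the two data models.
    """
    merged: Dict[str, Dict[str, Optional[object]]] = {}
    for if_index, octets in snmp_in_octets.items():
        name = if_index_to_name.get(if_index)
        if name is None:
            continue  # no mapping known: this counter cannot be correlated
        merged[name] = {
            "enabled": yang_enabled.get(name),  # None if not configured
            "in_octets": octets,
        }
    return merged


# Illustrative values only.
view = join_by_name(
    snmp_in_octets={1: 1000, 2: 50},
    if_index_to_name={1: "eth0"},   # ifIndex 2 has no known name
    yang_enabled={"eth0": True},
)
```

With model-driven telemetry, the counters would already arrive keyed by the same interface name as the configuration, and the mapping table would not be needed.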
Nice simple and rational explanation.
Thanks to clarify.
Great Explanation!!