
Track tracing plugin

Since version 2.2.0, Nacos supports injecting track tracing plugins through SPI, so that a plugin can subscribe to trace events and process them in whatever way you want (such as logging or writing to storage). This document describes how to implement a track tracing plugin and how to make it work.

Attention: At present, the track tracing plugin is still in the beta stage, and its API and interface definitions may be modified in later versions. Please pay attention to the Nacos version your plugin applies to.

Track tracing in Nacos differs from tracing in the general sense. It mainly traces and records Nacos-related operations, such as service registration, de-registration, push, and status change. It is not used to trace access and requests between microservices. If you need to monitor access and requests between services, please use a dedicated tracing project.

Concepts in Track tracing Plugin

TraceEvent

Nacos instruments its important operations and defines a series of trace events named 'TraceEvent'. Combining multiple 'TraceEvent's for the same resource (such as a service or a configuration) yields the trace of that resource.

A TraceEvent includes the following fields:

| Field Name | Description |
| --- | --- |
| type | Type of the event, defined by the sub-event |
| eventTime | Time at which the event occurred |
| namespaceId | Namespace ID of the resource the event belongs to |
| group | Group name of the resource the event belongs to |
| name | Name of the resource the event belongs to, such as the service name, or the dataId for a config |
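
To illustrate how several events for the same resource combine into a trace, the following sketch groups events by namespaceId, group and name, then orders each group by eventTime. The `SimpleTraceEvent` and `TraceBuilder` classes are hypothetical stand-ins that carry only the common fields listed above; they are not part of Nacos.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in carrying only the common fields; not the Nacos TraceEvent class.
final class SimpleTraceEvent {
    final String type;
    final long eventTime;
    final String namespaceId;
    final String group;
    final String name;

    SimpleTraceEvent(String type, long eventTime, String namespaceId, String group, String name) {
        this.type = type;
        this.eventTime = eventTime;
        this.namespaceId = namespaceId;
        this.group = group;
        this.name = name;
    }

    // Identifies the resource (for example a service) the event belongs to.
    String resourceKey() {
        return namespaceId + "/" + group + "/" + name;
    }
}

final class TraceBuilder {
    // Groups events by resource, then orders each group by event time.
    static Map<String, List<SimpleTraceEvent>> buildTraces(List<SimpleTraceEvent> events) {
        Map<String, List<SimpleTraceEvent>> traces = new TreeMap<>();
        for (SimpleTraceEvent e : events) {
            traces.computeIfAbsent(e.resourceKey(), k -> new ArrayList<>()).add(e);
        }
        for (List<SimpleTraceEvent> trace : traces.values()) {
            trace.sort(Comparator.comparingLong((SimpleTraceEvent ev) -> ev.eventTime));
        }
        return traces;
    }
}
```

Each map entry is then the ordered history of one resource, for instance a registration followed later by a de-registration of the same service.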

Currently, the sub event types defined in Nacos include:

| Event Name | Description | Details |
| --- | --- | --- |
| RegisterInstanceTraceEvent | Service instance registration; mainly occurs when a service provider registers | See appendix |
| DeregisterInstanceTraceEvent | Service instance de-registration; mainly occurs when a service provider de-registers | See appendix |
| RegisterServiceTraceEvent | Service registration; unlike RegisterInstanceTraceEvent, mainly occurs when an empty service is created | See appendix |
| DeregisterServiceTraceEvent | Service de-registration; unlike DeregisterInstanceTraceEvent, mainly occurs when an empty service is removed | See appendix |
| SubscribeServiceTraceEvent | Service subscription; mainly occurs when a service is subscribed to | See appendix |
| UnsubscribeServiceTraceEvent | Service unsubscription; mainly occurs when a service is unsubscribed from | See appendix |
| PushServiceTraceEvent | Service push; mainly occurs when a service is pushed to its subscribers | See appendix |
| HealthStateChangeTraceEvent | Service instance health state change; mainly occurs when an instance's health state changes due to a heartbeat or health check | See appendix |

Plugin Development

To develop a Nacos track tracing plugin, you first need to add a dependency on the track tracing plugin API.

        <dependency>
            <groupId>com.alibaba.nacos</groupId>
            <artifactId>nacos-trace-plugin</artifactId>
            <version>${project.version}</version>
        </dependency>

Here ${project.version} is the Nacos version your plugin is developed against.

Then implement the interface com.alibaba.nacos.plugin.trace.spi.NacosTraceSubscriber, and register your implementation as an SPI service.

The methods of the interface are as follows:

| Method name | Parameters | Returns | Description |
| --- | --- | --- | --- |
| getName | void | String | The name of the plugin. If two plugins share a name, the one loaded later overwrites the one loaded first. |
| subscribeTypes | void | List<Class<? extends TraceEvent>> | The event types this plugin subscribes to. If it returns an empty list, the plugin subscribes to no events. |
| onEvent | TraceEvent | void | The logic for handling events. The event types delivered are those declared by subscribeTypes. |
| executor | void | Executor | If it returns a non-null value, Nacos uses that Executor to call onEvent; otherwise the event distribution thread calls onEvent. |
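
The method table above can be sketched as a small subscriber. To keep the example compilable on its own, `TraceEvent`, `RegisterInstanceTraceEvent` and `NacosTraceSubscriber` below are simplified stand-ins written for this sketch; in a real plugin they come from the nacos-trace-plugin dependency. The class `LoggingTraceSubscriber`, its plugin name and the `lines` list are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executor;

// Simplified stand-ins so the sketch compiles alone; real plugins import the Nacos types.
class TraceEvent {
    private final String type;
    private final String name;

    TraceEvent(String type, String name) {
        this.type = type;
        this.name = name;
    }

    String getType() { return type; }
    String getName() { return name; }
}

class RegisterInstanceTraceEvent extends TraceEvent {
    RegisterInstanceTraceEvent(String name) {
        super("REGISTER_INSTANCE_TRACE_EVENT", name);
    }
}

interface NacosTraceSubscriber {
    String getName();
    void onEvent(TraceEvent event);
    List<Class<? extends TraceEvent>> subscribeTypes();
    Executor executor();
}

// Hypothetical plugin: records one line per subscribed event.
class LoggingTraceSubscriber implements NacosTraceSubscriber {
    final List<String> lines = Collections.synchronizedList(new ArrayList<String>());

    @Override
    public String getName() {
        return "logging-trace";
    }

    @Override
    public void onEvent(TraceEvent event) {
        lines.add(event.getType() + " " + event.getName());
    }

    @Override
    public List<Class<? extends TraceEvent>> subscribeTypes() {
        List<Class<? extends TraceEvent>> types = new ArrayList<>();
        types.add(RegisterInstanceTraceEvent.class);
        return types;
    }

    @Override
    public Executor executor() {
        return null; // null: onEvent runs on the event distribution thread
    }
}
```

Under the standard Java SPI layout, the implementation would be registered by listing its fully qualified class name in a resource file named META-INF/services/com.alibaba.nacos.plugin.trace.spi.NacosTraceSubscriber inside the plugin JAR.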

Attention: It is recommended that plugin implementations use an independent Executor. Otherwise, blocking I/O operations in the plugin run on the event distribution thread, and an I/O exception or stall blocks the onEvent calls for other events, causing a backlog.
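
One possible way to follow this recommendation is to hand Nacos a small thread pool owned by the plugin. The sketch below is an assumption about how such an executor could be built with the JDK, not a Nacos-provided utility; the class name, thread name and queue capacity are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class TracePluginExecutors {
    // Builds a single daemon worker with a bounded queue.
    // When the queue is full, DiscardPolicy silently drops the new task
    // instead of blocking or throwing in the calling thread.
    static ThreadPoolExecutor newDedicatedExecutor(int queueCapacity) {
        ThreadFactory factory = new ThreadFactory() {
            @Override
            public Thread newThread(Runnable task) {
                Thread t = new Thread(task, "trace-plugin-worker");
                t.setDaemon(true);
                return t;
            }
        };
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity), factory,
                new ThreadPoolExecutor.DiscardPolicy());
    }
}
```

A plugin would typically create this executor once, keep it in a field, and return that same instance from executor(), so that slow I/O in onEvent occupies only the plugin's own worker thread.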

The nacos-group/nacos-plugin repository provides a demo implementation of a track tracing plugin. The demo subscribes to RegisterInstanceTraceEvent and DeregisterInstanceTraceEvent and prints the result information to the logs.

Degradation of Track Tracing Plugin

Because the track tracing plugin belongs to the monitoring category and does not affect Nacos data, problems in the plugin should not affect the primary work of Nacos.

Plugins should therefore use an independent Executor, as recommended above. If a backlog occurs nevertheless, subsequent events are automatically discarded once the event queue of the track tracing plugin reaches its upper limit, which keeps the overall system stable.

When events are discarded, the message "Trace Event Publish failed, event : {}, publish queue size : {}" appears in nacos.log.
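
The discard behaviour can be pictured as a bounded queue whose non-blocking offer fails once the capacity is reached. The class below is a hypothetical illustration of that general pattern, not the Nacos implementation; the class name, the capacity and the discard counter are assumptions made for the sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of "discard when the queue is full"; not Nacos code.
final class BoundedEventQueue<E> {
    private final BlockingQueue<E> queue;
    private final AtomicLong discarded = new AtomicLong();

    BoundedEventQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<E>(capacity);
    }

    // Returns true if the event was queued; returns false and counts a discard if full.
    boolean publish(E event) {
        boolean accepted = queue.offer(event); // offer never blocks the publisher
        if (!accepted) {
            discarded.incrementAndGet();
        }
        return accepted;
    }

    // Removes and returns the oldest queued event, or null if the queue is empty.
    E poll() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }

    long discardedCount() {
        return discarded.get();
    }
}
```

The publisher is never blocked by a slow consumer: once the queue is full, new events are dropped and counted, and publishing succeeds again as soon as the consumer drains an entry.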

Appendix: Sub-trace Event Details

RegisterInstanceTraceEvent

Since 2.2.0.

type: REGISTER_INSTANCE_TRACE_EVENT

Extra Content:

| Field Name | Description |
| --- | --- |
| clientIp | The source IP of the instance registration request; may be null. |
| rpc | Whether the source request is gRPC: true for gRPC, false for HTTP. |
| instanceIp | The IP or host of the registered service instance |
| instancePort | The port of the registered service instance |

DeregisterInstanceTraceEvent

Since 2.2.0.

type: DEREGISTER_INSTANCE_TRACE_EVENT

Extra Content:

| Field Name | Description |
| --- | --- |
| clientIp | The source IP of the instance de-registration request; may be null. |
| reason | The reason for de-registration; see DeregisterInstanceReason for details |
| rpc | Whether the source request is gRPC: true for gRPC, false for HTTP. |
| instanceIp | The IP or host of the de-registered service instance |
| instancePort | The port of the de-registered service instance |

DeregisterInstanceReason

| Reason | Description |
| --- | --- |
| REQUEST | De-registration comes from a client request, in other words, user-initiated de-registration. |
| NATIVE_DISCONNECTED | De-registration caused by a client disconnecting. |
| SYNCED_DISCONNECTED | De-registration caused by a client disconnecting from another server node, synced from that node. |
| HEARTBEAT_EXPIRE | De-registration caused by a heartbeat timeout, for 1.X version clients. |

RegisterServiceTraceEvent

Since 2.2.0.

type: REGISTER_SERVICE_TRACE_EVENT

Extra Content: None

DeregisterServiceTraceEvent

Since 2.2.0.

type: DEREGISTER_SERVICE_TRACE_EVENT

Extra Content: None

SubscribeServiceTraceEvent

Since 2.2.0.

type: SUBSCRIBE_SERVICE_TRACE_EVENT

Extra Content:

| Field Name | Description |
| --- | --- |
| clientIp | The IP of the subscriber |

UnsubscribeServiceTraceEvent

Since 2.2.0.

type: UNSUBSCRIBE_SERVICE_TRACE_EVENT

Extra Content:

| Field Name | Description |
| --- | --- |
| clientIp | The IP of the subscriber |

PushServiceTraceEvent

Since 2.2.0.

type: PUSH_SERVICE_TRACE_EVENT

Extra Content:

| Field Name | Description |
| --- | --- |
| clientIp | The IP of the subscriber |
| instanceSize | The number of service instances in this push |
| pushCostTimeForAll | The full cost of this push, measured from the start of the push to its end, including the wait time in the combined queue and the execution time. |
| pushCostTimeForNetWork | The network cost of this push, measured from the start of execution to the end of the push; it includes only the network cost. |
| serviceLevelAgreementTime | The actual cost of this push, measured from the service change to the end of the push. It is a reference value, not an exact measurement. |
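
The three durations share the same end point and differ only in where they start. The sketch below places them on one timeline using hypothetical timestamps in milliseconds; the `PushTimeline` class and its field names are assumptions for illustration and do not appear in Nacos.

```java
// Hypothetical timeline of one push; all timestamps are in milliseconds.
final class PushTimeline {
    final long serviceChangedAt; // the service data changed
    final long pushStartedAt;    // the push was started and entered the combined queue
    final long executeStartedAt; // the push left the queue and began executing
    final long pushEndedAt;      // the push finished

    PushTimeline(long serviceChangedAt, long pushStartedAt, long executeStartedAt, long pushEndedAt) {
        this.serviceChangedAt = serviceChangedAt;
        this.pushStartedAt = pushStartedAt;
        this.executeStartedAt = executeStartedAt;
        this.pushEndedAt = pushEndedAt;
    }

    // Start of push to end of push: queue wait plus execution.
    long pushCostTimeForAll() {
        return pushEndedAt - pushStartedAt;
    }

    // Start of execution to end of push: network cost only.
    long pushCostTimeForNetWork() {
        return pushEndedAt - executeStartedAt;
    }

    // Service change to end of push.
    long serviceLevelAgreementTime() {
        return pushEndedAt - serviceChangedAt;
    }
}
```

For example, with a service change at 1000, a push start at 1020, execution start at 1500 and push end at 1530, the three values are 510, 30 and 530 milliseconds respectively.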

HealthStateChangeTraceEvent

Since 2.2.0.

type: HEALTH_STATE_CHANGE_TRACE_EVENT

Extra Content:

| Field Name | Description |
| --- | --- |
| instanceIp | The IP or host of the service instance whose state changed |
| instancePort | The port of the service instance whose state changed |
| isHealthy | Whether the instance is healthy after the change |
| healthCheckType | The type of health check |
| healthStateChangeReason | The reason the health state changed |