Tracing

Track tracing plugin

Since version 2.2.0, Nacos supports injecting track tracing plugins through SPI. A plugin subscribes to trace events and processes them in whatever way you want (for example, logging them or writing them to storage). This document describes how to implement a track tracing plugin and how to make it take effect.

Attention: The track tracing plugin is currently in beta, and its API and interface definitions may be modified in later versions. Check which Nacos versions your plugin applies to.

Track tracing in Nacos is different from tracing in the general sense. It traces and records Nacos-related operations, such as service registration, de-registration, push, and status change. It does not trace access and requests between microservices. To monitor access and requests between services, use a dedicated tracing project.

Concepts in Track tracing Plugin

TraceEvent

Nacos instruments its important operations and defines a series of trace events named 'TraceEvent'. Combining the TraceEvents for the same resource (such as a service or a configuration) yields the trace of that resource.

A TraceEvent includes the following fields:

| Field Name | Description |
|---|---|
| type | Type of the event, defined by each sub-event |
| eventTime | Time at which the event occurred |
| namespaceId | Namespace ID of the resource the event refers to |
| group | Group name of the resource the event refers to |
| name | Name of the resource the event refers to, such as the service name, or the dataId for a configuration |
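To make the idea of combining events into a trace concrete, here is a minimal sketch. `SimpleTraceEvent`, `TraceAssembler`, `resourceKey()` and the `@@` separator are hypothetical names invented for this illustration; they are not Nacos classes. The sketch assumes only the five common fields listed above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in carrying only the common fields listed above;
// it is NOT the real Nacos TraceEvent class.
final class SimpleTraceEvent {
    final String type;
    final long eventTime;
    final String namespaceId;
    final String group;
    final String name;

    SimpleTraceEvent(String type, long eventTime, String namespaceId, String group, String name) {
        this.type = type;
        this.eventTime = eventTime;
        this.namespaceId = namespaceId;
        this.group = group;
        this.name = name;
    }

    // Events with the same namespace, group and name refer to the same resource.
    // The "@@" separator is an arbitrary choice for this sketch.
    String resourceKey() {
        return namespaceId + "@@" + group + "@@" + name;
    }
}

final class TraceAssembler {
    // Groups events by resource, then orders each group by event time.
    static Map<String, List<SimpleTraceEvent>> assemble(List<SimpleTraceEvent> events) {
        Map<String, List<SimpleTraceEvent>> traces = new LinkedHashMap<>();
        for (SimpleTraceEvent event : events) {
            traces.computeIfAbsent(event.resourceKey(), k -> new ArrayList<>()).add(event);
        }
        for (List<SimpleTraceEvent> trace : traces.values()) {
            trace.sort(Comparator.comparingLong(e -> e.eventTime));
        }
        return traces;
    }
}
```

Each list in the returned map is the time-ordered trace of one resource. A plugin that writes events to storage could apply the same grouping at query time instead of in memory.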

Currently, Nacos defines the following sub-event types. The fields specific to each event are listed in the appendix.

| Event Name | Description | Details |
|---|---|---|
| RegisterInstanceTraceEvent | Service instance registration; mainly occurs when a service provider registers | See appendix |
| DeregisterInstanceTraceEvent | Service instance de-registration; mainly occurs when a service provider de-registers | See appendix |
| RegisterServiceTraceEvent | Service registration; unlike RegisterInstanceTraceEvent, mainly occurs when an empty service is created | See appendix |
| DeregisterServiceTraceEvent | Service de-registration; unlike DeregisterInstanceTraceEvent, mainly occurs when an empty service is removed | See appendix |
| SubscribeServiceTraceEvent | Service subscription; mainly occurs when a service is subscribed to | See appendix |
| UnsubscribeServiceTraceEvent | Service unsubscription; mainly occurs when a service is unsubscribed from | See appendix |
| PushServiceTraceEvent | Service push; mainly occurs when a service is pushed to its subscribers | See appendix |
| HealthStateChangeTraceEvent | Service instance health state change; mainly occurs when an instance's health state changes because of a heartbeat or health check | See appendix |

Plugin Development

To develop a Nacos track tracing plugin, first add a dependency on the track tracing plugin API.

<dependency>
    <groupId>com.alibaba.nacos</groupId>
    <artifactId>nacos-trace-plugin</artifactId>
    <version>${project.version}</version>
</dependency>

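${project.version} is the Nacos version your plugin is developed against.

Then implement the interface com.alibaba.nacos.plugin.trace.spi.NacosTraceSubscriber and register your implementation as an SPI service. With standard Java SPI, this means adding a provider-configuration file under META-INF/services that is named after the interface and contains the fully qualified class name of your implementation.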

The interface defines the following methods:

| Method name | Parameters | Returns | Description |
|---|---|---|---|
| getName | void | String | The name of the plugin. When two plugins have the same name, the plugin loaded later overwrites the plugin loaded first. |
| subscribeTypes | void | List<Class<? extends TraceEvent>> | The event types this plugin wants to subscribe to. If it returns an empty list, the plugin does not subscribe to any event. |
| onEvent | TraceEvent | void | The logic for handling events. The types of events received are those returned by subscribeTypes. |
| executor | void | Executor | When the return value is not null, Nacos uses this Executor to call onEvent; otherwise it calls onEvent on the event distribution thread. |

Attention: It is recommended that plugin implementations return their own dedicated Executor. If onEvent performs blocking I/O on the shared event distribution thread, an I/O failure or stall blocks the onEvent calls for other events and causes a backlog.
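The following sketch shows what a subscriber with its own Executor might look like. So that it compiles on its own, it declares local stand-ins for `TraceEvent`, the two event classes and `NacosTraceSubscriber` that mirror the method table above; a real plugin would import these types from nacos-trace-plugin instead. `LoggingTraceSubscriber`, its `handled` list and `TraceDispatcher` are hypothetical names, and the dispatcher is only a simplified illustration of the executor rule, not the Nacos implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Local stand-ins (assumption): a real plugin imports these from nacos-trace-plugin.
class TraceEvent {
    final String type;
    TraceEvent(String type) { this.type = type; }
}

class RegisterInstanceTraceEvent extends TraceEvent {
    RegisterInstanceTraceEvent() { super("REGISTER_INSTANCE_TRACE_EVENT"); }
}

class DeregisterInstanceTraceEvent extends TraceEvent {
    DeregisterInstanceTraceEvent() { super("DEREGISTER_INSTANCE_TRACE_EVENT"); }
}

// Mirrors the four methods in the table above.
interface NacosTraceSubscriber {
    String getName();
    List<Class<? extends TraceEvent>> subscribeTypes();
    void onEvent(TraceEvent event);
    Executor executor();
}

class LoggingTraceSubscriber implements NacosTraceSubscriber {
    // Dedicated daemon thread, so slow I/O here never stalls the shared distribution thread.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "logging-trace-subscriber");
        t.setDaemon(true);
        return t;
    });

    // Kept only so the sketch can be inspected; a real plugin would just log.
    final List<String> handled = new CopyOnWriteArrayList<>();

    @Override
    public String getName() { return "logging-trace-subscriber"; }

    @Override
    public List<Class<? extends TraceEvent>> subscribeTypes() {
        List<Class<? extends TraceEvent>> types = new ArrayList<>();
        types.add(RegisterInstanceTraceEvent.class);
        types.add(DeregisterInstanceTraceEvent.class);
        return types;
    }

    @Override
    public void onEvent(TraceEvent event) {
        handled.add(event.type);
        System.out.println("trace event: " + event.type);
    }

    @Override
    public Executor executor() { return executor; }
}

// Hypothetical, simplified dispatcher illustrating the executor rule from the table.
final class TraceDispatcher {
    static void dispatch(NacosTraceSubscriber subscriber, TraceEvent event) {
        if (!subscriber.subscribeTypes().contains(event.getClass())) {
            return; // the plugin did not subscribe to this event type
        }
        Executor executor = subscriber.executor();
        if (executor != null) {
            executor.execute(() -> subscriber.onEvent(event)); // plugin's own thread
        } else {
            subscriber.onEvent(event); // falls back to the calling (distribution) thread
        }
    }
}
```

Returning a single-thread executor keeps events for this plugin in order while isolating the plugin's latency from other subscribers. A thread pool is an alternative when ordering does not matter.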

The nacos-group/nacos-plugin repository provides a demo implementation of a track tracing plugin. The demo subscribes to RegisterInstanceTraceEvent and DeregisterInstanceTraceEvent and prints the event information to the logs.

Degradation of Track Tracing Plugin

The track tracing plugin belongs to the monitoring category and does not affect Nacos data, so a problem in the plugin should not affect the primary functions of Nacos.

For this reason, plugin implementations should return their own dedicated Executor. Blocking I/O performed in onEvent on the shared event distribution thread blocks the onEvent calls for other events when the I/O fails or stalls, which causes a backlog.

If a backlog does occur, subsequent events are automatically discarded once the event queue of the track tracing plugin reaches its upper limit, to protect overall system stability.

When events are discarded, the message Trace Event Publish failed, event : {}, publish queue size : {} appears in nacos.log.
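The discard behavior can be pictured with a bounded queue whose producer never blocks. This is a minimal sketch of that general technique, not the Nacos publisher: `BoundedTracePublisher`, its capacity parameter and its counters are hypothetical. Only the log message text is taken from the paragraph above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of "discard when the queue is full"; not the Nacos publisher itself.
final class BoundedTracePublisher {
    private final BlockingQueue<String> queue;
    private int discarded;

    BoundedTracePublisher(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // offer() never blocks: it returns false immediately when the queue is full,
    // so a slow consumer costs dropped events rather than a stalled producer.
    boolean publish(String event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            discarded++;
            System.err.printf("Trace Event Publish failed, event : %s, publish queue size : %d%n",
                    event, queue.size());
        }
        return accepted;
    }

    int discardedCount() { return discarded; }

    int pending() { return queue.size(); }
}
```

Dropping events is acceptable here because trace data is for monitoring only. The trade-off is that a trace assembled from the surviving events may have gaps, so a nonzero discard count is worth alerting on.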

Appendix: Sub-trace Event Details

RegisterInstanceTraceEvent

Since 2.2.0.

type: REGISTER_INSTANCE_TRACE_EVENT

Extra Content:

| Field Name | Description |
|---|---|
| clientIp | Source IP of the request that registers the service instance; may be null. |
| rpc | Whether the source request is gRPC: true for gRPC, false for HTTP. |
| instanceIp | IP or host of the registered service instance |
| instancePort | Port of the registered service instance |

DeregisterInstanceTraceEvent

Since 2.2.0.

type: DEREGISTER_INSTANCE_TRACE_EVENT

Extra Content:

| Field Name | Description |
|---|---|
| clientIp | Source IP of the request that de-registers the service instance; may be null. |
| reason | Reason for the de-registration; see DeregisterInstanceReason below |
| rpc | Whether the source request is gRPC: true for gRPC, false for HTTP. |
| instanceIp | IP or host of the de-registered service instance |
| instancePort | Port of the de-registered service instance |

DeregisterInstanceReason

| Reason | Description |
|---|---|
| REQUEST | De-registration requested by the client; in other words, initiated by the user. |
| NATIVE_DISCONNECTED | De-registration caused by the client disconnecting from this server node. |
| SYNCED_DISCONNECTED | De-registration caused by the client disconnecting from another server node, then synced from that node. |
| HEARTBEAT_EXPIRE | De-registration caused by a heartbeat timeout, for 1.X clients. |
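A plugin that handles DeregisterInstanceTraceEvent will often want to tell deliberate de-registrations apart from passive ones. The sketch below assumes a local enum, `DeregisterReason`, with the four reason names listed above; the helper methods `isUserInitiated` and `isDisconnect` are hypothetical and do not exist on the Nacos type.

```java
// Local stand-in (assumption) for the reasons listed above; a real plugin
// would read the reason from the DeregisterInstanceTraceEvent it receives.
enum DeregisterReason {
    REQUEST,
    NATIVE_DISCONNECTED,
    SYNCED_DISCONNECTED,
    HEARTBEAT_EXPIRE;

    // True only when the client explicitly asked to de-register.
    boolean isUserInitiated() {
        return this == REQUEST;
    }

    // True when the instance was removed because a client connection dropped,
    // whether on this server node or on another node that synced the change.
    boolean isDisconnect() {
        return this == NATIVE_DISCONNECTED || this == SYNCED_DISCONNECTED;
    }
}
```

With such a classification, a plugin could, for example, log REQUEST at info level and raise the level for disconnects and heartbeat timeouts, which are more likely to indicate an unplanned outage.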

RegisterServiceTraceEvent

Since 2.2.0.

type: REGISTER_SERVICE_TRACE_EVENT

Extra Content: None

DeregisterServiceTraceEvent

Since 2.2.0.

type: DEREGISTER_SERVICE_TRACE_EVENT

Extra Content: None

SubscribeServiceTraceEvent

Since 2.2.0.

type: SUBSCRIBE_SERVICE_TRACE_EVENT

Extra Content:

| Field Name | Description |
|---|---|
| clientIp | IP of the subscriber |

UnsubscribeServiceTraceEvent

Since 2.2.0.

type: UNSUBSCRIBE_SERVICE_TRACE_EVENT

Extra Content:

| Field Name | Description |
|---|---|
| clientIp | IP of the subscriber |

PushServiceTraceEvent

Since 2.2.0.

type: PUSH_SERVICE_TRACE_EVENT

Extra Content:

| Field Name | Description |
|---|---|
| clientIp | IP of the subscriber |
| instanceSize | Number of service instances in this push |
| pushCostTimeForAll | Total cost of this push, from the start of pushing to its end, including the wait time in the combined queue and the execution time. |
| pushCostTimeForNetWork | Network cost of this push, from execution to the end of pushing; covers only the network time. |
| serviceLevelAgreementTime | Actual cost of this push, from the service change to the end of pushing. It is a reference value and not exact. |
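Based on the field descriptions above, the difference between the total cost and the network cost approximates the time a push spent waiting before it was executed. The sketch below illustrates that arithmetic. `PushCost` and `waitTimeBeforeExecution` are hypothetical names, and the sketch assumes both values are longs in the same time unit, which this document does not state.

```java
// Hypothetical holder for two of the PushServiceTraceEvent fields described above.
final class PushCost {
    final long pushCostTimeForAll;     // start of pushing to end of pushing
    final long pushCostTimeForNetWork; // execution to end of pushing

    PushCost(long pushCostTimeForAll, long pushCostTimeForNetWork) {
        this.pushCostTimeForAll = pushCostTimeForAll;
        this.pushCostTimeForNetWork = pushCostTimeForNetWork;
    }

    // Approximate time spent before execution began (mainly queue wait).
    // Clamped at zero in case the two measurements are slightly inconsistent.
    long waitTimeBeforeExecution() {
        return Math.max(0L, pushCostTimeForAll - pushCostTimeForNetWork);
    }
}
```

For example, a push with a total cost of 120 and a network cost of 30 spent roughly 90 waiting before execution. A consistently large wait compared with the network cost would suggest that the delay comes from queuing rather than from the network.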

HealthStateChangeTraceEvent

Since 2.2.0.

type: HEALTH_STATE_CHANGE_TRACE_EVENT

Extra Content:

| Field Name | Description |
|---|---|
| instanceIp | IP or host of the service instance whose state changed |
| instancePort | Port of the service instance whose state changed |
| isHealthy | Whether the instance is healthy after the change |
| healthCheckType | The type of health check |
| healthStateChangeReason | The reason the health state changed |