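Python Quick Start
Quick Start
This article describes how to get started with OpenTelemetry (OTel) in Python. You will learn how to instrument a simple Python application and export trace, log, and metric data to the console.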
Prerequisites
Make sure you have Python 3 installed; the commands below rely on python3 and pip.
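Demo
The following walks through instrumenting a simple Flask application with the OTel Python agent. The OTel Python agent also supports other frameworks such as Django and FastAPI. For the complete list of supported libraries and frameworks, see: OTel Python agent supported instrumentation list
Environment setup
First, create a new directory and set up a new Python virtual environment: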
mkdir otel-getting-started
cd otel-getting-started
python3 -m venv venv
source ./venv/bin/activate
Install Flask with pip:
pip install flask
Create and start an HTTP server
Create a file named app.py with the following code:
from random import randint
from flask import Flask, request
import logging

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@app.route("/rolldice")
def roll_dice():
    player = request.args.get('player', default=None, type=str)
    result = str(roll())
    if player:
        logger.warning("%s is rolling the dice: %s", player, result)
    else:
        logger.warning("Anonymous player is rolling the dice: %s", result)
    return result


def roll():
    return randint(1, 6)
Run the application with the following command, then open http://localhost:8080/rolldice in your web browser to confirm that it works.
flask run -p 8080
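As an alternative to the browser check, the endpoint can also be queried from Python. The following is a minimal sketch using only the standard library; the helper names is_valid_roll and fetch_roll are hypothetical, and the URL assumes the server started above is listening on port 8080.

```python
from urllib.request import urlopen

# The six response bodies the /rolldice route can legitimately return.
VALID_ROLLS = {str(n) for n in range(1, 7)}


def is_valid_roll(body: str) -> bool:
    """Return True if the response body is a die value from 1 to 6."""
    return body.strip() in VALID_ROLLS


def fetch_roll(url: str = "http://localhost:8080/rolldice") -> str:
    """Request one roll from the running Flask app (assumed URL)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Usage, with the Flask app running in another terminal:
#   body = fetch_roll()
#   print(body, is_valid_roll(body))
```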
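Instrument with the OTel Python agent
With zero-code (non-intrusive) instrumentation, the OTel Python agent lets you collect observability data without changing any code. For details, see: How zero-code injection works.
First, install the opentelemetry-distro package. Installing it also provides the opentelemetry-bootstrap and opentelemetry-instrument commands used in the steps below:
Step 1. Install opentelemetry-distro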
pip install opentelemetry-distro
Step 2. Use the opentelemetry-bootstrap command to install the instrumentation libraries needed to observe the application
opentelemetry-bootstrap -a install
In this example, this installs the Flask instrumentation.
Run: start the application with the OTel Python agent
export OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    --logs_exporter console \
    --service_name dice-server \
    flask run -p 8080
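The command-line flags of opentelemetry-instrument have environment-variable equivalents (for example, --traces_exporter corresponds to OTEL_TRACES_EXPORTER and --service_name to OTEL_SERVICE_NAME). As a sketch of the same launch scripted from Python, the helpers below build that environment and start the process; build_otel_env and run_instrumented are hypothetical names, not part of OpenTelemetry.

```python
import os
import subprocess


def build_otel_env(service_name: str, exporter: str = "console") -> dict:
    """Return a copy of the current environment with the OTel settings applied."""
    env = dict(os.environ)
    env.update({
        "OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED": "true",
        "OTEL_TRACES_EXPORTER": exporter,
        "OTEL_METRICS_EXPORTER": exporter,
        "OTEL_LOGS_EXPORTER": exporter,
        "OTEL_SERVICE_NAME": service_name,
    })
    return env


def run_instrumented(service_name: str = "dice-server") -> None:
    """Start the Flask app under opentelemetry-instrument (blocks until it exits)."""
    subprocess.run(
        ["opentelemetry-instrument", "flask", "run", "-p", "8080"],
        env=build_otel_env(service_name),
        check=True,
    )
```

Keeping the settings in environment variables can be convenient when the launch command itself is fixed, for example in a container entrypoint.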
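Open http://localhost:8080/rolldice in your web browser and refresh the page a few times. After a short while, you should see spans and log records printed to the console, such as the following: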
{
  "name": "/rolldice",
  "context": {
    "trace_id": "0xdb1fc322141e64eb84f5bd8a8b1c6d1f",
    "span_id": "0x5c2b0f851030d17d",
    "trace_state": "[]"
  },
  "kind": "SpanKind.SERVER",
  "parent_id": null,
  "start_time": "2023-10-10T08:14:32.630332Z",
  "end_time": "2023-10-10T08:14:32.631523Z",
  "status": {
    "status_code": "UNSET"
  },
  "attributes": {
    "http.method": "GET",
    "http.server_name": "127.0.0.1",
    "http.scheme": "http",
    "net.host.port": 8080,
    "http.host": "localhost:8080",
    "http.target": "/rolldice?rolls=12",
    "net.peer.ip": "127.0.0.1",
    "http.user_agent": "curl/8.1.2",
    "net.peer.port": 58419,
    "http.flavor": "1.1",
    "http.route": "/rolldice",
    "http.status_code": 200
  },
  "events": [],
  "links": [],
  "resource": {
    "attributes": {
      "telemetry.sdk.language": "python",
      "telemetry.sdk.name": "opentelemetry",
      "telemetry.sdk.version": "1.17.0",
      "service.name": "dice-server",
      "telemetry.auto.version": "0.38b0"
    },
    "schema_url": ""
  }
}
{
  "body": "Anonymous player is rolling the dice: 3",
  "severity_number": "<SeverityNumber.WARN: 13>",
  "severity_text": "WARNING",
  "attributes": {
    "otelSpanID": "5c2b0f851030d17d",
    "otelTraceID": "db1fc322141e64eb84f5bd8a8b1c6d1f",
    "otelServiceName": "dice-server"
  },
  "timestamp": "2023-10-10T08:14:32.631195Z",
  "trace_id": "0xdb1fc322141e64eb84f5bd8a8b1c6d1f",
  "span_id": "0x5c2b0f851030d17d",
  "trace_flags": 1,
  "resource": "BoundedAttributes({'telemetry.sdk.language': 'python', 'telemetry.sdk.name': 'opentelemetry', 'telemetry.sdk.version': '1.17.0', 'service.name': 'dice-server', 'telemetry.auto.version': '0.38b0'}, maxlen=None)"
}
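The printed span tracks the lifetime of a request to the /rolldice route. The log record emitted during the request carries the same trace ID and span ID as the span and is exported to the console by the log exporter.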
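To make the correlation concrete, the sketch below compares the IDs of a span and a log record in the shape printed by the console exporters. The helper same_span is a hypothetical name, and the dictionaries keep only the fields needed for the comparison, with values copied from the sample output above.

```python
def same_span(span: dict, log_record: dict) -> bool:
    """Return True if the log record was emitted inside the given span."""
    context = span["context"]
    return (
        log_record["trace_id"] == context["trace_id"]
        and log_record["span_id"] == context["span_id"]
    )


# Trimmed copies of the span and log record from the sample output.
span = {
    "name": "/rolldice",
    "context": {
        "trace_id": "0xdb1fc322141e64eb84f5bd8a8b1c6d1f",
        "span_id": "0x5c2b0f851030d17d",
    },
}
log_record = {
    "body": "Anonymous player is rolling the dice: 3",
    "trace_id": "0xdb1fc322141e64eb84f5bd8a8b1c6d1f",
    "span_id": "0x5c2b0f851030d17d",
}

print(same_span(span, log_record))  # True
```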
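Send a few more requests to the endpoint, then either wait a moment or terminate the application. You will see metrics in the console output, such as the following: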
{
  "resource_metrics": [
    {
      "resource": {
        "attributes": {
          "service.name": "unknown_service",
          "telemetry.auto.version": "0.34b0",
          "telemetry.sdk.language": "python",
          "telemetry.sdk.name": "opentelemetry",
          "telemetry.sdk.version": "1.13.0"
        },
        "schema_url": ""
      },
      "schema_url": "",
      "scope_metrics": [
        {
          "metrics": [
            {
              "data": {
                "aggregation_temporality": 2,
                "data_points": [
                  {
                    "attributes": {
                      "http.flavor": "1.1",
                      "http.host": "localhost:5000",
                      "http.method": "GET",
                      "http.scheme": "http",
                      "http.server_name": "127.0.0.1"
                    },
                    "start_time_unix_nano": 1666077040061693305,
                    "time_unix_nano": 1666077098181107419,
                    "value": 0
                  }
                ],
                "is_monotonic": false
              },
              "description": "measures the number of concurrent HTTP requests that are currently in-flight",
              "name": "http.server.active_requests",
              "unit": "requests"
            },
            {
              "data": {
                "aggregation_temporality": 2,
                "data_points": [
                  {
                    "attributes": {
                      "http.flavor": "1.1",
                      "http.host": "localhost:5000",
                      "http.method": "GET",
                      "http.scheme": "http",
                      "http.server_name": "127.0.0.1",
                      "http.status_code": 200,
                      "net.host.port": 5000
                    },
                    "bucket_counts": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    "count": 1,
                    "explicit_bounds": [0, 5, 10, 25, 50, 75, 100, 250, 500, 1000],
                    "max": 1,
                    "min": 1,
                    "start_time_unix_nano": 1666077040063027610,
                    "sum": 1,
                    "time_unix_nano": 1666077098181107419
                  }
                ]
              },
              "description": "measures the duration of the inbound HTTP request",
              "name": "http.server.duration",
              "unit": "ms"
            }
          ],
          "schema_url": "",
          "scope": {
            "name": "opentelemetry.instrumentation.flask",
            "schema_url": "",
            "version": "0.34b0"
          }
        }
      ]
    }
  ]
}
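The bucket_counts and explicit_bounds fields of http.server.duration can be hard to read at first. The sketch below shows how a measured value maps to a bucket, assuming the upper-inclusive boundaries used by explicit-bucket histograms: bucket i covers values greater than bounds[i-1] and up to bounds[i], and the last bucket is open-ended. The function names are hypothetical.

```python
from bisect import bisect_left


def bucket_index(value: float, explicit_bounds: list) -> int:
    """Return the index of the bucket that holds `value`.

    bisect_left finds the first bound that is >= value, which is the
    bucket whose inclusive upper bound covers the value.
    """
    return bisect_left(explicit_bounds, value)


def bucket_counts(values: list, explicit_bounds: list) -> list:
    """Count how many values fall into each bucket."""
    counts = [0] * (len(explicit_bounds) + 1)  # one extra overflow bucket
    for value in values:
        counts[bucket_index(value, explicit_bounds)] += 1
    return counts


# Bounds from the sample output; a single 1 ms request was recorded.
bounds = [0, 5, 10, 25, 50, 75, 100, 250, 500, 1000]
print(bucket_counts([1], bounds))
# [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The result matches the sample: one request of 1 ms lands in the second bucket, which covers durations above 0 ms and up to 5 ms.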
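Combining manual and automatic instrumentation
Automatic instrumentation captures telemetry from commonly used components, such as HTTP request information, but it cannot observe your application's business logic. To capture business-level telemetry, you need to add manual instrumentation. The following examples show manual and automatic instrumentation working together:
Manual instrumentation: traces
First, modify app.py from above to acquire a tracer, and use the tracer to start a new span: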
from random import randint
from flask import Flask

from opentelemetry import trace

# Acquire a tracer
tracer = trace.get_tracer("diceroller.tracer")

app = Flask(__name__)


@app.route("/rolldice")
def roll_dice():
    return str(roll())


def roll():
    # This creates a new span that's the child of the current one
    with tracer.start_as_current_span("roll") as rollspan:
        res = randint(1, 6)
        rollspan.set_attribute("roll.value", res)
        return res
Restart the application with the OTel Python agent:
export OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    --logs_exporter console \
    --service_name dice-server \
    flask run -p 8080
Now, when you send a request to the Flask web server, you will see two spans, as shown below:
{
  "name": "roll",
  "context": {
    "trace_id": "0x6f781c83394ed2f33120370a11fced47",
    "span_id": "0x623321c35b8fa837",
    "trace_state": "[]"
  },
  "kind": "SpanKind.INTERNAL",
  "parent_id": "0x09abe52faf1d80d5",
  "start_time": "2023-10-10T08:18:28.679261Z",
  "end_time": "2023-10-10T08:18:28.679560Z",
  "status": {
    "status_code": "UNSET"
  },
  "attributes": {
    "roll.value": "6"
  },
  "events": [],
  "links": [],
  "resource": {
    "attributes": {
      "telemetry.sdk.language": "python",
      "telemetry.sdk.name": "opentelemetry",
      "telemetry.sdk.version": "1.17.0",
      "service.name": "dice-server",
      "telemetry.auto.version": "0.38b0"
    },
    "schema_url": ""
  }
}
{
  "name": "/rolldice",
  "context": {
    "trace_id": "0x6f781c83394ed2f33120370a11fced47",
    "span_id": "0x09abe52faf1d80d5",
    "trace_state": "[]"
  },
  "kind": "SpanKind.SERVER",
  "parent_id": null,
  "start_time": "2023-10-10T08:18:28.678348Z",
  "end_time": "2023-10-10T08:18:28.679677Z",
  "status": {
    "status_code": "UNSET"
  },
  "attributes": {
    "http.method": "GET",
    "http.server_name": "127.0.0.1",
    "http.scheme": "http",
    "net.host.port": 8080,
    "http.host": "localhost:8080",
    "http.target": "/rolldice?rolls=12",
    "net.peer.ip": "127.0.0.1",
    "http.user_agent": "curl/8.1.2",
    "net.peer.port": 58485,
    "http.flavor": "1.1",
    "http.route": "/rolldice",
    "http.status_code": 200
  },
  "events": [],
  "links": [],
  "resource": {
    "attributes": {
      "telemetry.sdk.language": "python",
      "telemetry.sdk.name": "opentelemetry",
      "telemetry.sdk.version": "1.17.0",
      "service.name": "dice-server",
      "telemetry.auto.version": "0.38b0"
    },
    "schema_url": ""
  }
}
The parent_id of the roll span is the same as the span_id of the /rolldice span, which indicates a parent-child relationship.
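The parent-child link can be followed programmatically. As a rough sketch, the function below rebuilds the span hierarchy from parent_id values. The name span_tree_lines is hypothetical, and for brevity the sample dictionaries place span_id at the top level, whereas the console output nests it under "context".

```python
from collections import defaultdict


def span_tree_lines(spans: list) -> list:
    """Return one indented line per span, with children nested under parents."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        # Visit every span whose parent is `parent_id`, then its descendants.
        for span in children[parent_id]:
            lines.append("  " * depth + span["name"])
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # root spans have no parent (null in the JSON output)
    return lines


# IDs copied from the two spans in the sample output.
spans = [
    {"name": "roll", "span_id": "0x623321c35b8fa837",
     "parent_id": "0x09abe52faf1d80d5"},
    {"name": "/rolldice", "span_id": "0x09abe52faf1d80d5",
     "parent_id": None},
]
print("\n".join(span_tree_lines(spans)))
# /rolldice
#   roll
```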
Manual instrumentation: metrics
Modify app.py to initialize a meter and use it to create a Counter instrument that counts the number of rolls for each possible roll value.
# These are the necessary import declarations
from opentelemetry import trace
from opentelemetry import metrics

from random import randint
from flask import Flask, request
import logging

# Acquire a tracer
tracer = trace.get_tracer("diceroller.tracer")
# Acquire a meter.
meter = metrics.get_meter("diceroller.meter")

# Now create a counter instrument to make measurements with
roll_counter = meter.create_counter(
    "dice.rolls",
    description="The number of rolls by roll value",
)

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@app.route("/rolldice")
def roll_dice():
    # This creates a new span that's the child of the current one
    with tracer.start_as_current_span("roll") as roll_span:
        player = request.args.get('player', default=None, type=str)
        result = str(roll())
        roll_span.set_attribute("roll.value", result)
        # This adds 1 to the counter for the given roll value
        roll_counter.add(1, {"roll.value": result})
        if player:
            # logging expects %-style placeholders; logger.warn is deprecated
            logger.warning("%s is rolling the dice: %s", player, result)
        else:
            logger.warning("Anonymous player is rolling the dice: %s", result)
        return result


def roll():
    return randint(1, 6)
Restart the application:
export OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    --logs_exporter console \
    --service_name dice-server \
    flask run -p 8080
When you send requests to the server, you will see the dice.rolls counter metric printed to the console, with a separate count for each roll value:
{
  "resource_metrics": [
    {
      "resource": {
        "attributes": {
          "telemetry.sdk.language": "python",
          "telemetry.sdk.name": "opentelemetry",
          "telemetry.sdk.version": "1.17.0",
          "service.name": "dice-server",
          "telemetry.auto.version": "0.38b0"
        },
        "schema_url": ""
      },
      "scope_metrics": [
        {
          "scope": {
            "name": "opentelemetry.instrumentation.flask",
            "version": "0.38b0",
            "schema_url": ""
          },
          "metrics": [
            {
              "name": "http.server.active_requests",
              "description": "measures the number of concurrent HTTP requests that are currently in-flight",
              "unit": "requests",
              "data": {
                "data_points": [
                  {
                    "attributes": {
                      "http.method": "GET",
                      "http.host": "localhost:8080",
                      "http.scheme": "http",
                      "http.flavor": "1.1",
                      "http.server_name": "127.0.0.1"
                    },
                    "start_time_unix_nano": 1696926005694857000,
                    "time_unix_nano": 1696926063549782000,
                    "value": 0
                  }
                ],
                "aggregation_temporality": 2,
                "is_monotonic": false
              }
            },
            {
              "name": "http.server.duration",
              "description": "measures the duration of the inbound HTTP request",
              "unit": "ms",
              "data": {
                "data_points": [
                  {
                    "attributes": {
                      "http.method": "GET",
                      "http.host": "localhost:8080",
                      "http.scheme": "http",
                      "http.flavor": "1.1",
                      "http.server_name": "127.0.0.1",
                      "net.host.port": 8080,
                      "http.status_code": 200
                    },
                    "start_time_unix_nano": 1696926005695798000,
                    "time_unix_nano": 1696926063549782000,
                    "count": 7,
                    "sum": 6,
                    "bucket_counts": [1, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    "explicit_bounds": [0.0, 5.0, 10.0, 25.0, 50.0, 75.0, 100.0, 250.0, 500.0, 750.0, 1000.0, 2500.0, 5000.0, 7500.0, 10000.0],
                    "min": 0,
                    "max": 1
                  }
                ],
                "aggregation_temporality": 2
              }
            }
          ],
          "schema_url": ""
        },
        {
          "scope": {
            "name": "diceroller.meter",
            "version": "",
            "schema_url": ""
          },
          "metrics": [
            {
              "name": "dice.rolls",
              "description": "The number of rolls by roll value",
              "unit": "",
              "data": {
                "data_points": [
                  {
                    "attributes": {
                      "roll.value": "5"
                    },
                    "start_time_unix_nano": 1696926005695491000,
                    "time_unix_nano": 1696926063549782000,
                    "value": 3
                  },
                  {
                    "attributes": {
                      "roll.value": "6"
                    },
                    "start_time_unix_nano": 1696926005695491000,
                    "time_unix_nano": 1696926063549782000,
                    "value": 1
                  },
                  {
                    "attributes": {
                      "roll.value": "1"
                    },
                    "start_time_unix_nano": 1696926005695491000,
                    "time_unix_nano": 1696926063549782000,
                    "value": 1
                  },
                  {
                    "attributes": {
                      "roll.value": "3"
                    },
                    "start_time_unix_nano": 1696926005695491000,
                    "time_unix_nano": 1696926063549782000,
                    "value": 1
                  },
                  {
                    "attributes": {
                      "roll.value": "4"
                    },
                    "start_time_unix_nano": 1696926005695491000,
                    "time_unix_nano": 1696926063549782000,
                    "value": 1
                  }
                ],
                "aggregation_temporality": 2,
                "is_monotonic": true
              }
            }
          ],
          "schema_url": ""
        }
      ],
      "schema_url": ""
    }
  ]
}
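Send telemetry to an OTel Collector
The OTel Collector is a key component of most production deployments. Some advantages of using the OTel Collector:
● A single telemetry sink shared by multiple services, which reduces the overhead of switching exporters
● Traces can be processed centrally before being sent to a backend, avoiding duplicated processing
● Traces can be aggregated across multiple services running on multiple hosts
Unless you have only a single service or are just experimenting, we recommend using a Collector in production deployments.
Configure and start a local OTel Collector
First, save the following OTel Collector configuration to a file in the /tmp/ directory (the Docker command below expects /tmp/otel-collector-config.yaml):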
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  # NOTE: Prior to v0.86.0 use `logging` instead of `debug`.
  debug:
    verbosity: detailed
processors:
  batch:
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
      processors: [batch]
    metrics:
      receivers: [otlp]
      exporters: [debug]
      processors: [batch]
    logs:
      receivers: [otlp]
      exporters: [debug]
      processors: [batch]
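With this configuration, the Collector receives data over the OTLP protocol (that is, the OTel Python agent and the OTel Collector communicate via OTLP) and prints the received data to the Collector's console. You can also export the data to Prometheus or Jaeger. For detailed configuration, see: OTel Collector configuration
Then run the following Docker command to pull and run the OTel Collector with this configuration: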
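Each pipeline in the configuration refers to receivers, processors, and exporters by name, and those names must be defined in the matching top-level section. The sketch below illustrates that relationship with a plain dictionary that mirrors the YAML; it is not how the Collector itself validates configuration, and undefined_components is a hypothetical helper.

```python
def undefined_components(config: dict) -> list:
    """List (pipeline, kind, name) entries that a pipeline uses but the config never defines."""
    missing = []
    for pipeline, spec in config["service"]["pipelines"].items():
        for kind in ("receivers", "processors", "exporters"):
            for name in spec.get(kind, []):
                if name not in config.get(kind, {}):
                    missing.append((pipeline, kind, name))
    return missing


# The same pipeline definition is used for traces, metrics, and logs.
pipeline = {"receivers": ["otlp"], "exporters": ["debug"], "processors": ["batch"]}

# Dictionary form of the Collector configuration shown above.
config = {
    "receivers": {
        "otlp": {
            "protocols": {
                "grpc": {"endpoint": "0.0.0.0:4317"},
                "http": {"endpoint": "0.0.0.0:4318"},
            }
        }
    },
    "exporters": {"debug": {"verbosity": "detailed"}},
    "processors": {"batch": None},
    "service": {
        "pipelines": {
            "traces": dict(pipeline),
            "metrics": dict(pipeline),
            "logs": dict(pipeline),
        }
    },
}

print(undefined_components(config))  # []
```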
docker run -p 4317:4317 \
    -v /tmp/otel-collector-config.yaml:/etc/otel-collector-config.yaml \
    otel/opentelemetry-collector:latest \
    --config=/etc/otel-collector-config.yaml
You now have an OTel Collector instance running locally and listening on port 4317.
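Before pointing the application at the Collector, it can help to confirm that something is accepting connections on port 4317. The following is a minimal sketch using only the standard library; is_port_open is a hypothetical helper, and a successful TCP connection only shows that a process is listening, not that it speaks OTLP.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 4317,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage, with the Collector container running:
#   print(is_port_open())
```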
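Modify the OTel Python agent launch command to export traces and metrics over OTLP
The next step is to change the command so that traces and metrics are sent to the OTel Collector over OTLP instead of being printed to the console.
First, install the OTLP exporter: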
pip install opentelemetry-exporter-otlp
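opentelemetry-instrument will detect the package you just installed and default to OTLP export the next time it runs.
Start the application
Run the application as before, but without exporting to the console: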
export OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
opentelemetry-instrument --logs_exporter otlp flask run -p 8080
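By default, opentelemetry-instrument exports traces and metrics over OTLP/gRPC and sends them to localhost:4317, which is where the OTel Collector started above is listening.
Now, when you access the /rolldice route, you will see output in the OTel Collector process instead of the Flask process: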
2022-06-09T20:43:39.915Z        DEBUG   debugexporter/debug_exporter.go:51   ResourceSpans #0
Resource labels:
     -> telemetry.sdk.language: STRING(python)
     -> telemetry.sdk.name: STRING(opentelemetry)
     -> telemetry.sdk.version: STRING(1.12.0rc1)
     -> telemetry.auto.version: STRING(0.31b0)
     -> service.name: STRING(unknown_service)
InstrumentationLibrarySpans #0
InstrumentationLibrary app
Span #0
    Trace ID       : 7d4047189ac3d5f96d590f974bbec20a
    Parent ID      : 0b21630539446c31
    ID             : 4d18cee9463a79ba
    Name           : roll
    Kind           : SPAN_KIND_INTERNAL
    Start time     : 2022-06-09 20:43:37.390134089 +0000 UTC
    End time       : 2022-06-09 20:43:37.390327687 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> roll.value: INT(5)
InstrumentationLibrarySpans #1
InstrumentationLibrary opentelemetry.instrumentation.flask 0.31b0
Span #0
    Trace ID       : 7d4047189ac3d5f96d590f974bbec20a
    Parent ID      :
    ID             : 0b21630539446c31
    Name           : /rolldice
    Kind           : SPAN_KIND_SERVER
    Start time     : 2022-06-09 20:43:37.388733595 +0000 UTC
    End time       : 2022-06-09 20:43:37.390723792 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> http.method: STRING(GET)
     -> http.server_name: STRING(127.0.0.1)
     -> http.scheme: STRING(http)
     -> net.host.port: INT(5000)
     -> http.host: STRING(localhost:5000)
     -> http.target: STRING(/rolldice)
     -> net.peer.ip: STRING(127.0.0.1)
     -> http.user_agent: STRING(curl/7.82.0)
     -> net.peer.port: INT(53878)
     -> http.flavor: STRING(1.1)
     -> http.route: STRING(/rolldice)
     -> http.status_code: INT(200)

2022-06-09T20:43:40.025Z        INFO    debugexporter/debug_exporter.go:56   MetricsExporter {"#metrics": 1}
2022-06-09T20:43:40.025Z        DEBUG   debugexporter/debug_exporter.go:66   ResourceMetrics #0
Resource labels:
     -> telemetry.sdk.language: STRING(python)
     -> telemetry.sdk.name: STRING(opentelemetry)
     -> telemetry.sdk.version: STRING(1.12.0rc1)
     -> telemetry.auto.version: STRING(0.31b0)
     -> service.name: STRING(unknown_service)
InstrumentationLibraryMetrics #0
InstrumentationLibrary app
Metric #0
Descriptor:
     -> Name: roll_counter
     -> Description: The number of rolls by roll value
     -> Unit:
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: AGGREGATION_TEMPORALITY_CUMULATIVE
NumberDataPoints #0
Data point attributes:
     -> roll.value: INT(5)
StartTimestamp: 2022-06-09 20:43:37.390226915 +0000 UTC
Timestamp: 2022-06-09 20:43:39.848587966 +0000 UTC
Value: 1
If the OTel Collector is configured to export to Jaeger, the corresponding trace records will also appear in Jaeger.