Add SystemTap/DTrace static markers

Summary

DTrace and SystemTap are monitoring tools, each providing a way to inspect what the processes on a computer system are doing. Both use domain-specific languages that allow a user to write scripts which:
- filter which processes are to be observed
- gather data from the processes of interest
- generate reports on the data

Ceph can be built with embedded "markers" that can be observed by a SystemTap script, making it easier to monitor what the Ceph processes on a system are doing.

Owners

  • Haomai Wang (UnitedStack)
  • Danny Al-Gaaf (Deutsche Telekom AG)
  • Marc Koderer (Deutsche Telekom AG)

Interested Parties

  • Guang Yang (Yahoo!)
  • Saket Sinha (Google Summer of Code 2014)

Current Status

Detailed Description

MySQL, PostgreSQL, glibc, CPython and others already support static markers, which are of great benefit to developers. Ceph already has an internal "perf counter" component that covers a subset of this role (see ceph.com/docs/master/dev/logs/#performance-counters). Compared to SystemTap/DTrace, however, it has clear limitations: it cannot be programmed at run time, it lacks flexibility, and it needs extra tooling to display, visualize or format the output of "perf dump". Moreover, SystemTap/DTrace static markers are aimed at Ceph developers, whereas perf counters are geared more toward system administrators.

Static markers in Ceph can be enabled at build time, for example:
./configure --enable-systemtap

Initially, we would like to add markers in the critical I/O path, similar to the existing perf counters.
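
As an illustration of what a marker in an I/O path could look like, here is a minimal C++ sketch. It assumes the `<sys/sdt.h>` header from SystemTap's SDT development package; `DTRACE_PROBE2` is a real macro from that header, but the provider name `ceph_sketch`, the probe name `write_start`, the `SKETCH_ENABLE_SYSTEMTAP` macro (standing in for whatever `--enable-systemtap` would define) and the `sketch_write` function are all hypothetical and are not taken from the Ceph source. When the macro is undefined or the header is missing, the marker compiles to a no-op.

```cpp
// Sketch only: all names prefixed with "sketch"/"SKETCH" are hypothetical.
#include <cstdint>

#if defined(SKETCH_ENABLE_SYSTEMTAP) && __has_include(<sys/sdt.h>)
#include <sys/sdt.h>  // real header shipped with SystemTap's SDT package
#define SKETCH_PROBE2(provider, name, a, b) DTRACE_PROBE2(provider, name, a, b)
#else
// Build option off (or header unavailable): the marker costs nothing.
#define SKETCH_PROBE2(provider, name, a, b) \
  do { (void)(a); (void)(b); } while (0)
#endif

// Hypothetical write path. The marker exposes offset and length so that a
// tracing script can read them without any change to this code.
std::uint64_t sketch_write(std::uint64_t offset, std::uint64_t length) {
  SKETCH_PROBE2(ceph_sketch, write_start, offset, length);
  // ... the real I/O work would happen here ...
  return offset + length;  // end offset, so the sketch has a visible result
}
```

With markers compiled in, a SystemTap script could attach to the probe via `process("...").mark("write_start")` and read the two values as `$arg1` and `$arg2`; the binary path is left open here because it depends on the build.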

Work items

Identify our Use Case
The following tasks can be performed on Linux with SystemTap:
1. Track scheduling time
a) Peek into the kernel scheduler
b) Visualize current activity
c) Evaluate system load
2. Tweak I/O usage: identify which process is responsible for the most I/O
3. Kernel profiling: record stack traces
4. Call graph tracing: start tracing when a trigger is hit, then visualize all functions that are invoked and how long each step takes
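
For the "how long each step takes" use case, one possible approach is to emit a pair of markers around an operation so that a script can subtract the two timestamps. The sketch below does this with a small RAII guard. It again assumes `<sys/sdt.h>` and its real `DTRACE_PROBE1` macro; the `SketchOpTrace` class, the `sketch_handle_op` function and the probe names `op_enter` and `op_exit` are hypothetical and are not part of Ceph.

```cpp
// Sketch only: class, function, provider and probe names are hypothetical.
#include <cstdint>

#if __has_include(<sys/sdt.h>)
#include <sys/sdt.h>  // real header shipped with SystemTap's SDT package
#define SKETCH_PROBE1(provider, name, a) DTRACE_PROBE1(provider, name, a)
#else
// Header unavailable: fall back to a no-op so the sketch still builds.
#define SKETCH_PROBE1(provider, name, a) do { (void)(a); } while (0)
#endif

// Fires an "enter" marker on construction and an "exit" marker on
// destruction, so every return path of the traced scope is covered.
class SketchOpTrace {
 public:
  explicit SketchOpTrace(std::uint64_t op_id) : op_id_(op_id) {
    SKETCH_PROBE1(ceph_sketch, op_enter, op_id_);
  }
  ~SketchOpTrace() { SKETCH_PROBE1(ceph_sketch, op_exit, op_id_); }

  SketchOpTrace(const SketchOpTrace&) = delete;
  SketchOpTrace& operator=(const SketchOpTrace&) = delete;

 private:
  std::uint64_t op_id_;
};

// Hypothetical operation handler; the doubling stands in for real work.
int sketch_handle_op(std::uint64_t op_id, int payload) {
  SketchOpTrace trace(op_id);  // markers bracket the whole function body
  return payload * 2;
}
```

Passing the same `op_id` to both markers lets a script match each exit to its enter even when several operations are in flight.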

Coding tasks

  1. Add basic working infrastructure
  2. Add more static markers

Build / release tasks


Documentation tasks

  1. Add usage and build documentation

Pad

http://pad.ceph.com/p/cdsgiant-syste...static-markers
http://pad.ceph.com/p/GH-systemtap-l...static-markers

Resources

SystemTap/DTrace: LTTng: