Project

General

Profile

Actions

Bug #25211

closed

bug in PerfCounters

Added by hongpeng lu over 5 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

when we call PerfCounters::inc() and read_avg() at the same time, maybe the result is not what we want.

show the code here

void PerfCounters::inc(int idx, uint64_t amt)
{
  if (!m_cct->_conf->perf)
    return;

  assert(idx > m_lower_bound);
  assert(idx < m_upper_bound);
  perf_counter_data_any_d& data(m_data[idx - m_lower_bound - 1]);
  if (!(data.type & PERFCOUNTER_U64))
    return;
  if (data.type & PERFCOUNTER_LONGRUNAVG) {
    data.avgcount.inc();
    data.u64.add(amt);
    data.avgcount2.inc();
  } else {
    data.u64.add(amt);
  }
}

/// read <sum, count> safely
    pair<uint64_t,uint64_t> read_avg() const {
      uint64_t sum, count;
      do {
        count = avgcount.read();
        sum = u64.read();
      } while (avgcount2.read() != count);
      return make_pair(sum, count);
    }

ok, here is the problem, if we have two thread:
assume data.u64 = s; data.avgcount = data.avgcount2

     thread1 : read             |    thread2: inc(v)
                                |
     count   = (avgcount=n)     |    
                                |    avgcount = (n+1)
                                |    u64 = (s+v)
     sum = (s+v)                |
     count    == avgcount2 == n |    
                                |    avgcount2 = (n+1)
                               \/    

we get (s+v, n), but correct result is (s+v, n+1)

same problem:

     thread1 : read_avg()              thread2: inc(v)

                                  |    avgcount = (n+1)
                                  |    
     count = (avgcount=(n+1))     |    
     sum = (data.u64 = s)         |
                                  |    data.u64 = (s+v)   
                                  |    avgcount2 = (n+1)
     count    == avgcount2 == n+1 |
                                  \/

we get (s, n+1), but correct result is (s+v, n+1)

we should swap the position of avgcount and avgcount2:

/// read <sum, count> safely
    pair<uint64_t,uint64_t> read_avg() const {
      uint64_t sum, count;
      do {
        count = avgcount2.read();     // read avgcount2 first instead of avgcount
        sum = u64.read();
      } while (avgcount.read() != count);
      return make_pair(sum, count);
    }

Actions #2

Updated by Josh Durgin over 5 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Kefu Chai over 5 years ago

  • Description updated (diff)
Actions #4

Updated by Kefu Chai over 5 years ago

  • Assignee set to hongpeng lu
Actions #5

Updated by Kefu Chai over 5 years ago

  • Status changed from Fix Under Review to Resolved
Actions

Also available in: Atom PDF