Deadlock in Defects-issue783-offload1 test · Issue #31 · hypertable/hypertable · GitHub

Deadlock in Defects-issue783-offload1 test #31

@nuggetwheat

The scan is stuck because it can't find the range:

RANGE SERVER range not found - (a) 2[0098861..??]

This range is in the process of being loaded by rs3, but rs3 can't update the METADATA table. Here's the stack trace from rs3:

Thread 77 (Thread 0x7f726c809700 (LWP 1474)):
#0  0x000000399d4df1b3 in poll () from /lib64/libc.so.6
#1  0x00000000007a5485 in Hypertable::TableMutatorAsyncScatterBuffer::create_redo_buffer(unsigned int) ()
#2  0x000000000079ab3f in Hypertable::TableMutatorAsync::buffer_finish(unsigned int, int, bool) ()
#3  0x00000000007a2bbd in Hypertable::TableMutatorAsyncScatterBuffer::finish() ()
#4  0x00000000008337b5 in Hypertable::TableMutatorAsyncHandler::run() ()
#5  0x0000000000794b36 in Hypertable::TableMutator::wait_for_flush_completion(Hypertable::TableMutatorAsync*) ()
#6  0x0000000000795843 in Hypertable::TableMutator::flush() ()
#7  0x0000000000592565 in Hypertable::Apps::RangeServer::load_range(Hypertable::ResponseCallback*, Hypertable::TableIdentifier const&, Hypertable::RangeSpec const&, Hypertable::RangeState const&, bool) ()
#8  0x00000000006a9d9b in Hypertable::RangeServer::Request::Handler::LoadRange::run() ()
#9  0x0000000000566c35 in Hypertable::ApplicationQueue::Worker::operator()() ()
#10 0x00007f72819dc082 in thread_proxy () from /opt/hypertable/doug/0.9.8.5/lib/libboost_thread.so.1.54.0
#11 0x000000399d8079d1 in start_thread () from /lib64/libpthread.so.0
#12 0x000000399d4e89dd in clone () from /lib64/libc.so.6

Over in rs1, it appears that the METADATA range 0/0[0/0:??..??] is stuck at the end of run_compaction, blocked on the LiveFileTracker mutex. Here's a snippet of the rangeserver dump:

RANGE 0/0[0/0:??..??]
...
state=3

State 3 is RELINQUISH_LOG_INSTALLED. Here's the stack trace from rs1:

Thread 34 (Thread 0x7f1964493700 (LWP 1575)):
#0  0x000000399d4df1b3 in poll () from /lib64/libc.so.6
#1  0x00000000007a5485 in Hypertable::TableMutatorAsyncScatterBuffer::create_redo_buffer(unsigned int) ()
#2  0x000000000079ab3f in Hypertable::TableMutatorAsync::buffer_finish(unsigned int, int, bool) ()
#3  0x00000000007a2bbd in Hypertable::TableMutatorAsyncScatterBuffer::finish() ()
#4  0x00000000008337b5 in Hypertable::TableMutatorAsyncHandler::run() ()
#5  0x0000000000794b36 in Hypertable::TableMutator::wait_for_flush_completion(Hypertable::TableMutatorAsync*) ()
#6  0x0000000000795843 in Hypertable::TableMutator::flush() ()
#7  0x000000000067cbdc in Hypertable::MetadataNormal::write_files(std::basic_string, std::allocator > const&, std::basic_string, std::allocator > const&, long, unsigned int) ()
#8  0x0000000000656ecd in Hypertable::LiveFileTracker::update_files_column() ()
#9  0x00000000005dcf60 in Hypertable::AccessGroup::run_compaction(int, Hypertable::AccessGroup::Hints*) ()
#10 0x00000000006964c5 in Hypertable::Range::relinquish_compact() ()
#11 0x0000000000696e18 in Hypertable::Range::relinquish() ()
#12 0x00000000005b01df in Hypertable::MaintenanceQueue::Worker::operator()() ()
#13 0x00007f19950ac082 in thread_proxy () from /opt/hypertable/doug/0.9.8.5/lib/libboost_thread.so.1.54.0
#14 0x000000399d8079d1 in start_thread () from /lib64/libpthread.so.0
#15 0x000000399d4e89dd in clone () from /lib64/libc.so.6

I've also seen the same thread blocked here:

Thread 34 (Thread 0x7f1964493700 (LWP 1575)):
#0  0x000000399d80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x000000399d809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x000000399d8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000566028 in boost::mutex::lock() ()
#4  0x0000000000656cc4 in Hypertable::LiveFileTracker::update_files_column() ()
#5  0x00000000005dcf60 in Hypertable::AccessGroup::run_compaction(int, Hypertable::AccessGroup::Hints*) ()
#6  0x00000000006964c5 in Hypertable::Range::relinquish_compact() ()
#7  0x0000000000696e18 in Hypertable::Range::relinquish() ()
#8  0x00000000005b01df in Hypertable::MaintenanceQueue::Worker::operator()() ()
#9  0x00007f19950ac082 in thread_proxy () from /opt/hypertable/doug/0.9.8.5/lib/libboost_thread.so.1.54.0
#10 0x000000399d8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x000000399d4e89dd in clone () from /lib64/libc.so.6

I also see a number of threads (in rs1) stuck here:

Thread 87 (Thread 0x7f19862e3700 (LWP 1418)):
#0  0x000000399d80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x000000399d809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x000000399d8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000005660b8 in boost::unique_lock::lock() ()
#4  0x000000000065669f in Hypertable::LiveFileTracker::remove_references(std::vector, std::allocator >, std::allocator, std::allocator > > > const&) ()
#5  0x00000000005d6c40 in Hypertable::AccessGroup::release_files(std::vector, std::allocator >, std::allocator, std::allocator > > > const&) ()
#6  0x000000000066c057 in Hypertable::MergeScannerAccessGroup::~MergeScannerAccessGroup() ()
#7  0x000000000066c3b9 in Hypertable::MergeScannerAccessGroup::~MergeScannerAccessGroup() ()
#8  0x0000000000671556 in Hypertable::MergeScannerRange::~MergeScannerRange() ()
#9  0x0000000000563325 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
#10 0x000000000058f898 in Hypertable::Apps::RangeServer::create_scanner(Hypertable::RangeServer::Response::Callback::CreateScanner*, Hypertable::TableIdentifier const&, Hypertable::RangeSpec const&, Hypertable::Lib::ScanSpec const&, Hypertable::QueryCache::Key*) ()
#11 0x00000000006a0bc5 in Hypertable::RangeServer::Request::Handler::CreateScanner::run() ()
#12 0x0000000000566c35 in Hypertable::ApplicationQueue::Worker::operator()() ()
#13 0x00007f19950ac082 in thread_proxy () from /opt/hypertable/doug/0.9.8.5/lib/libboost_thread.so.1.54.0
#14 0x000000399d8079d1 in start_thread () from /lib64/libpthread.so.0
#15 0x000000399d4e89dd in clone () from /lib64/libc.so.6

and here ...

Thread 70 (Thread 0x7f197b8d2700 (LWP 1456)):
#0  0x000000399d80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x000000399d809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x000000399d8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000005660b8 in boost::unique_lock::lock() ()
#4  0x0000000000656a36 in Hypertable::LiveFileTracker::add_references(std::vector, std::allocator >, std::allocator, std::allocator > > > const&) ()
#5  0x00000000005db064 in Hypertable::AccessGroup::create_scanner(boost::intrusive_ptr&) ()
#6  0x000000000068b8ef in Hypertable::Range::create_scanner(boost::intrusive_ptr&, std::shared_ptr&) ()
#7  0x000000000058f713 in Hypertable::Apps::RangeServer::create_scanner(Hypertable::RangeServer::Response::Callback::CreateScanner*, Hypertable::TableIdentifier const&, Hypertable::RangeSpec const&, Hypertable::Lib::ScanSpec const&, Hypertable::QueryCache::Key*) ()
#8  0x00000000006a0bc5 in Hypertable::RangeServer::Request::Handler::CreateScanner::run() ()
#9  0x0000000000566c35 in Hypertable::ApplicationQueue::Worker::operator()() ()
#10 0x00007f19950ac082 in thread_proxy () from /opt/hypertable/doug/0.9.8.5/lib/libboost_thread.so.1.54.0
#11 0x000000399d8079d1 in start_thread () from /lib64/libpthread.so.0
#12 0x000000399d4e89dd in clone () from /lib64/libc.so.6
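
To make the suspected shape of the problem concrete, here is a minimal standalone sketch. It is not Hypertable code: `FileTrackerSketch`, `update_holding_lock`, `update_after_snapshot`, and `scanner_progresses` are hypothetical names, and the sketch assumes (the traces suggest it but do not prove it) that the tracker mutex is held while the METADATA write is in flight. The second method shows one common way to avoid blocking scanners, snapshotting under the lock and writing after releasing it; that is an illustration only, not the fix adopted by the project.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a file tracker guarded by one mutex.
class FileTrackerSketch {
public:
  // Scanner path: takes the mutex briefly to record a file reference.
  void add_reference(const std::string &file) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_referenced.push_back(file);
  }

  // Assumed problem pattern: the mutex stays held while a blocking
  // write (modelled by `blocking_write`) is in flight.
  template <typename Write>
  void update_holding_lock(Write blocking_write) {
    std::lock_guard<std::mutex> lock(m_mutex);
    blocking_write(m_referenced);
  }

  // Alternative: copy the state under the lock, then write unlocked.
  template <typename Write>
  void update_after_snapshot(Write blocking_write) {
    std::vector<std::string> snapshot;
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      snapshot = m_referenced;
    }
    blocking_write(snapshot);
  }

private:
  std::mutex m_mutex;
  std::vector<std::string> m_referenced;
};

// Returns true if a scanner thread can call add_reference() while the
// simulated write is still stalled.
bool scanner_progresses(bool hold_lock_during_write) {
  FileTrackerSketch tracker;
  std::promise<void> write_started, release_write;
  std::shared_future<void> release = release_write.get_future().share();

  // Simulates a write that cannot complete until something else happens.
  auto stalled_write = [&](const std::vector<std::string> &) {
    write_started.set_value();
    release.wait();
  };

  std::thread compaction([&] {
    if (hold_lock_during_write)
      tracker.update_holding_lock(stalled_write);
    else
      tracker.update_after_snapshot(stalled_write);
  });
  write_started.get_future().wait();  // the write is now stalled

  auto scan = std::async(std::launch::async,
                         [&] { tracker.add_reference("cs1"); });
  bool progressed = scan.wait_for(std::chrono::milliseconds(200)) ==
                    std::future_status::ready;

  release_write.set_value();  // unblock the write so both threads can exit
  compaction.join();
  scan.wait();
  return progressed;
}
```

Calling `scanner_progresses(true)` models the traces above: the scanner thread waits on the mutex for as long as the write is stalled. With `scanner_progresses(false)` the scanner is not held up by the write, though any writer that itself depends on the stalled range would still be stuck.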
