Similar to issue #25, concurrency problems occur when a transaction abort races with a read.
1) Issue description: graph_db::abort_transaction() can cause concurrency issues, just like #25 (closed).
2) How to reproduce?
Consider three transactions:
Txn-1: adds a node.
Txn-2: reads the node.
Txn-3: tries to update the node but eventually aborts (Txn-2 and Txn-3 run concurrently).
As usual, insert the test case below into "transaction_test.cpp" and comment out all other test cases.
3) Test case and debug code:
NOTE: To save time (instead of copying the test code and building it under your own home directory), go to the path "/home/arth0746/03_july_cocurr_issues/poseidon_core/build/test" on dbpmem and run the test case directly as ./transaction_test (you will of course need permission to access another user's home directory).
`
TEST_CASE("Test concurrency between update abort and read" "[transaction]") {
#ifdef USE_PMDK
auto pop = prepare_pool();
auto gdb = create_graph_db(pop);
#else
auto gdb = create_graph_db();
#endif
node::id_t nid = 0;
barrier b1{}, b2{}, b3{};
// Just add a node
auto tx = gdb->begin_transaction(); // This is Txn-1
nid = gdb->add_node("Actor",
{{"name", boost::any(std::string("Mark Wahlberg"))},
{"age", boost::any(48)}});
gdb->commit_transaction();
// Thread 1: read the node
auto t1 = std::thread( [&] () { // This is Txn-2
// read the node
auto tx = gdb->begin_transaction();
b1.notify(); // Inform txn-3 to start
b2.wait(); // wait until Txn-3 updates
auto &n = gdb->node_by_id(nid);
b3.notify(); // inform Txn-3 that Txn-2 now holds a reference to the dirty version
auto nd = gdb->get_node_description(n); // crashes here with a segmentation fault
REQUIRE(nd.label == "Actor");
REQUIRE(get_property<int>(nd.properties, "age") == 48);
gdb->commit_transaction();
});
// Thread 2: update the same node
auto t2 = std::thread( [&] () { // This is Txn-3
// update the node
b1.wait(); // ensure that update starts after read Txn
auto tx = gdb->begin_transaction();
auto &n = gdb->node_by_id(nid);
gdb->update_node(n, //update
{
{ "age", boost::any(52)},
},
"Updated Actor");
b2.notify(); // Inform txn-2 that update is done but not yet committed or aborted
b3.wait(); // wait until Txn-2 has accessed a dirty version
gdb->abort_transaction(); // abort the update
});
//---------------------------------------------------------------
t1.join(); t2.join();
#ifdef USE_PMDK
drop_graph_db(pop, gdb);
#endif
}
`
4) Failing output:
arth0746@dbpmem:~/03_july_cocurr_issues/poseidon_core/build/test$ ./transaction_test
transaction_test is a Catch v2.9.1 host application.
Run with -? for options
Test concurrency between update abort and read[transaction]
/home/arth0746/03_july_cocurr_issues/poseidon_core/test/transaction_test.cpp:93
...............................................................................
/home/arth0746/03_july_cocurr_issues/poseidon_core/test/transaction_test.cpp:93: FAILED:
due to a fatal error condition:
SIGSEGV - Segmentation violation signal
test cases: 1 | 1 failed
assertions: 1 | 1 failed
Segmentation fault (core dumped)
===============================================================================
What is really happening in this issue?
- Txn-3 starts after Txn-2.
- Txn-3 manages to execute update_node(..) before Txn-2 has a chance to execute node_by_id(nid). As a result, Txn-2 gets a reference to the dirty version.
- Txn-3 then executes abort_transaction(), which deletes the dirty node (to which Txn-2 is still holding a reference!).
- Txn-2 then calls get_node_description() on the deleted dirty node, resulting in a segmentation fault.
- Conclusion: the critical portions of "graph_db::abort_transaction()" must be made thread-safe, just like "graph_db::commit_transaction()". [Such issues become very hard to detect with 20 to 30 concurrent transactions.]
- A similar issue can occur with relationships too.