Can two stores be reordered in such a singleton implementation?

In the following singleton 'get' function, can other threads see instance as not-null, but almost_done still false? (Say almost_done is initially false.)

    Singleton *Singleton::Get() {
        auto tmp = instance.load(std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_acquire);
        if (tmp == nullptr) {
            std::lock_guard<std::mutex> guard(lock);
            tmp = instance.load(std::memory_order_relaxed);
            if (tmp == nullptr) {
                tmp = new Singleton();
                almost_done.store(true, std::memory_order_relaxed); // 1
                std::atomic_thread_fence(std::memory_order_release);
                instance.store(tmp, std::memory_order_relaxed); // 2
            }
        }
        return tmp;
    }

If they can, why? What's the rationale?

I know nothing can "get out" of an acquire-release section, but can't 2 enter it and be reordered with 1?

I'm aware I don't need such complex techniques for thread-safe singletons in C++, and yes, there's not much sense in almost_done, this is purely for learning.

Why wouldn't you just move the line std::lock_guard<std::mutex> guard(lock); to be the first line in the function? Wouldn't that solve any synchronization issues?
– Omid CompSCI
Jul 1 at 20:36

@OmidCompSCI as I said this is purely for learning anyway, but this is just standard double-checked locking with outer if for reducing contention after instance is initialized
– ledonter
Jul 1 at 20:39

@ledonter the return statement is missing
– Olivier Sohn
Jul 1 at 20:52

@iMajuscule fixed
– ledonter
Jul 1 at 20:54

I think the "std::atomic_thread_fence(std::memory_order_acquire);" should be written after "if (tmp == nullptr) {" : when tmp is not nullptr, we don't want to do that I guess? (I know it's just for learning but still that would make a bit more sense I think)
– Olivier Sohn
Jul 1 at 21:00

1 Answer
Your code shows a valid implementation of the Double-Checked-Locking pattern (DCLP).
Synchronization is handled by either the std::mutex or the std::atomic instance, depending on the order in which threads enter the code.

can other threads see instance as not-null, but almost_done still false?

No, this is not possible.

The DCLP pattern guarantees that every thread whose initial load returns a non-null value, and which then passes the acquire fence, sees instance pointing at valid memory and almost_done==true,
because the acquire fence has synchronized with the release fence.
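
The fence pairing described above can be sketched in isolation. This is a minimal illustration, not the code from the question: flag, ptr, payload, writer, reader, run_once and run_many are hypothetical names, with flag standing in for almost_done and ptr for instance. The writer mirrors stores 1 and 2 around a release fence; the reader does a relaxed load followed by an acquire fence.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins: flag plays the role of almost_done, ptr of instance.
std::atomic<bool> flag{false};
std::atomic<int*> ptr{nullptr};
int payload = 0;

void writer() {
    payload = 42;                                         // plain (non-atomic) write
    flag.store(true, std::memory_order_relaxed);          // store 1
    std::atomic_thread_fence(std::memory_order_release);  // release fence
    ptr.store(&payload, std::memory_order_relaxed);       // store 2
}

// Returns true if, after seeing ptr set, the reader also sees flag and payload.
bool reader() {
    int* p;
    while ((p = ptr.load(std::memory_order_relaxed)) == nullptr) {
        std::this_thread::yield();                        // wait for store 2
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // pairs with the release fence
    return flag.load(std::memory_order_relaxed) && *p == 42;
}

// Runs one writer and one reader and reports what the reader observed.
bool run_once() {
    flag.store(false);
    ptr.store(nullptr);
    payload = 0;
    bool ok = false;
    std::thread r([&ok] { ok = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return ok;
}

// Repeats the experiment; the assertion should never fire.
void run_many(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        assert(run_once());
    }
}
```

Because the reader's relaxed load reads the value written after the release fence, and the acquire fence follows that load, everything the writer did before its fence is visible to the reader after its own fence. Dropping the acquire fence from reader() would remove that guarantee.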

A reason one might think it is possible, is in the small window of opportunity where the first thread (#1) is holding the std::mutex while a second thread (#2) is entering the first if-statement.

Before #2 locks the std::mutex, it may observe a value for instance (still pointing at unsynchronized memory because the mutex is responsible for that, but hasn't synchronized yet).
But even if that happens (a valid scenario in this pattern), #2 will see almost_done==true since the release fence (called by #1) orders the store-relaxed to almost_done
before the store-relaxed to instance and that same order is observed by other threads.
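
To try this scenario out, here is a rough, self-contained harness around the Get() function from the question. The class layout (static members, a public almost_done) and the count_violations helper are assumptions made for illustration; only the body of Get() comes from the question. Each thread calls Get() and then does a relaxed load of almost_done, which is ordered after the acquire fence (or the mutex) inside Get().

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

class Singleton {
public:
    static Singleton* Get();
    static std::atomic<bool> almost_done;   // public only so the harness can read it
private:
    Singleton() = default;
    static std::atomic<Singleton*> instance;
    static std::mutex lock;
};

std::atomic<bool> Singleton::almost_done{false};
std::atomic<Singleton*> Singleton::instance{nullptr};
std::mutex Singleton::lock;

Singleton* Singleton::Get() {
    auto tmp = instance.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    if (tmp == nullptr) {
        std::lock_guard<std::mutex> guard(lock);
        tmp = instance.load(std::memory_order_relaxed);
        if (tmp == nullptr) {
            tmp = new Singleton();
            almost_done.store(true, std::memory_order_relaxed);   // 1
            std::atomic_thread_fence(std::memory_order_release);
            instance.store(tmp, std::memory_order_relaxed);       // 2
        }
    }
    return tmp;
}

// Hypothetical harness: counts threads that obtained the instance
// but then read almost_done as false.
int count_violations(int num_threads) {
    assert(num_threads > 0);
    std::atomic<int> violations{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&violations] {
            Singleton* s = Singleton::Get();
            if (s != nullptr &&
                !Singleton::almost_done.load(std::memory_order_relaxed)) {
                violations.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return violations.load();
}
```

A run such as count_violations(8) is expected to return 0. A test like this cannot prove the ordering, since a passing run may just reflect the hardware it ran on; the guarantee comes from the fence pairing, not from the observation.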

I'm not asking about other threads calling the same function, I'm asking about a thread #2 just doing two relaxed loads, for almost_done and then for instance, without any acquire operations. And anyway, why does a release fence prevent stores from being reordered? I didn't find anything like that in the standard (or frankly anywhere). If you'll say "by definition", please provide it, because apparently it doesn't exist in the standard :(
– ledonter
Jul 1 at 22:00

@ledonter A release fence prevents all preceding memory operations from being reordered past subsequent writes. Check out Jeff Preshing's article on the topic (or this one)
– LWimsey
Jul 1 at 22:19

That's exactly what I'm asking about! I read that article and yes, it says it as you quoted it, and a couple of other articles do too. But a lot of people are saying that actually anything (both reads and writes) can enter an acquire-release section, e.g. from the bottom. This is the source of my confusion. How is it defined by C++? Or is it not defined by C++ and just defined by the architecture? And more importantly, what's the rationale for allowing anything to enter while restricting just stores from the top / loads from the bottom? Is it patterns like this?
– ledonter
Jul 1 at 22:41

@ledonter The release fence orders the store to almost_done (in thread #1) before the store to instance. What that means is that thread #2 observes both stores in that same order. What it does not mean is that thread #2 is allowed to use that ordering without also setting an acquire fence. The reason is that memory ordering is a 2-way street; i.e. anything done by thread #2 can also be observed out-of-order by thread #1. To guarantee a valid inter-thread happens-before relationship, you must use the synchronizes-with relationship as defined by the C++ standard.
– LWimsey
Jul 2 at 0:38

@ledonter Your question was about how other threads could see the variable, and not how they could use it. This is a very subtle difference you may not have realized when posting the question. The C++ standard defines the synchronizes-with relationship (with both acquire and release) since that is what a C++ program needs to reliably access data shared between threads.
– LWimsey
Jul 2 at 0:42
