Memory Corruption

After several years of lurking on various sites and blogs where professionals discuss and help each other with problems they encounter at work, I already knew that debugging memory corruption was among the most difficult tasks out there. Until now, though, I had never realized how annoying and frustrating it could be.

This bug took me weeks to fix, and I had to enlist the help of my team leader to read the disassembly, because after following about three lines of it my brain was already dying.

The code is similar to the following:

#include <iostream>
#include <memory>
using namespace std;

class D { 
	public: D () {} 
	void CantCallThisFunc() {}
};

const shared_ptr<D> RootProblemFunc (const shared_ptr<D>& d_ptr) { 
	d_ptr->CantCallThisFunc(); 
	return d_ptr;
}

class C { 
	const shared_ptr<D>& m_d_ptr; // BUG: stores a reference that can outlive the object it refers to 
public: 
	C (const shared_ptr<D>& d_ptr) : m_d_ptr(d_ptr) {} 
	operator shared_ptr<D>() { 
		return RootProblemFunc(m_d_ptr); 
	}
};

C MakeC (const shared_ptr<D>& d_ptr) { 
	return C(d_ptr);
}

class S { 
	const shared_ptr<D> m_d_ptr;
public: 
	S (shared_ptr<D> d_ptr) : m_d_ptr(d_ptr) {} 
	shared_ptr<D> GetDPtr() const { 
		return m_d_ptr; 
	} 
	C ProblemFunc() const { 
		return MakeC(GetDPtr()); // GetDPtr() returns a temporary that dies at the end of this statement 
	}
};

int main() { 
	shared_ptr<S> s; 
	{ 
		shared_ptr<D> d (new D()); 
		s.reset(new S(d)); 
	} 
	shared_ptr<D> new_d = s->ProblemFunc(); 
	return 0;
}

This code has undefined behavior: depending on your luck, it will appear to work or it will crash. But this is actually a super-simplified version of code spread over 4 files and 2 projects in a solution with over 3 million lines of code, and the actual version causes the program to crash every single time.

The solution? Remove the ‘&’ from the m_d_ptr member of class C.
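Here is a minimal sketch of what that fix looks like, reduced to the pieces that matter. The names FixedC, MakeFixedC and ProblemFuncLike are hypothetical stand-ins for C, MakeC and S::ProblemFunc; they are not from the real code base, and the sketch assumes nothing else in the real class depends on the member being a reference.

```cpp
#include <cassert>
#include <memory>

class D {
public:
    void CantCallThisFunc() {}
};

// Hypothetical counterpart of class C: the member is a shared_ptr held by
// value (no '&'), so FixedC owns its own share of the D object.
class FixedC {
    const std::shared_ptr<D> m_d_ptr;
public:
    explicit FixedC(const std::shared_ptr<D>& d_ptr) : m_d_ptr(d_ptr) {}
    operator std::shared_ptr<D>() const {
        m_d_ptr->CantCallThisFunc();
        return m_d_ptr;
    }
};

// Hypothetical counterpart of MakeC.
FixedC MakeFixedC(const std::shared_ptr<D>& d_ptr) {
    return FixedC(d_ptr);
}

// Hypothetical counterpart of S::ProblemFunc. The temporary shared_ptr is
// destroyed at the end of the return statement, but by then FixedC has
// already copied it, so nothing dangles.
FixedC ProblemFuncLike(const std::shared_ptr<D>& owner) {
    return MakeFixedC(std::shared_ptr<D>(owner));
}
```

With this version, converting the returned object to a shared_ptr<D> in the caller only touches the copy stored inside FixedC, which is alive for as long as the FixedC object is.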

Starting with the main function, a copy of the shared_ptr<D> is passed as an argument to the S object being initialized. When ProblemFunc() is called, it calls GetDPtr(), which returns a copy of S::m_d_ptr, and passes that copy to the MakeC() function. Up to this point, we’ve been passing around copies, so there was no problem with using the shared_ptr at all.

However, the parameter of MakeC is a reference, and this is where it got confusing. MakeC receives a reference to the temporary copy returned by GetDPtr. MakeC then creates a new C object by passing that reference on to C’s constructor. The reference is stored as C::m_d_ptr and the new C object is then returned.

We see in main, though, that the returned C object is assigned to a shared_ptr<D> variable, so C’s conversion operator is invoked. Inside the operator, RootProblemFunc is called with C::m_d_ptr passed as the argument. That reference is then used to call D::CantCallThisFunc, and this is where the error occurs and the program crashes.

We are using C::m_d_ptr, which is a reference to a temporary that “died” at the end of the return statement in ProblemFunc, right after MakeC returned and before main ever touched the C object. Removing the ‘&’ from C::m_d_ptr makes it hold a copy of that object instead, so we can use it even after the function has returned.
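The lifetime rule at work here is that a temporary bound to a reference parameter is destroyed at the end of the full expression that created it. It can be observed safely with a small sketch. Tracer, Log, Callee, Outer and Run are all hypothetical names made up for this illustration; the sketch only records when the temporary is constructed and destroyed and never touches it afterwards.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log shared by the sketch.
std::vector<std::string>& Log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type that records its own construction and destruction.
struct Tracer {
    Tracer() { Log().push_back("ctor"); }
    ~Tracer() { Log().push_back("dtor"); }
};

// Plays the role of MakeC: takes the temporary by const reference.
int Callee(const Tracer&) {
    Log().push_back("callee returns");
    return 0;
}

// Plays the role of ProblemFunc: the Tracer temporary lives until the end
// of this return statement, then it is destroyed before control gets back
// to the caller.
int Outer() {
    return Callee(Tracer());
}

void Run() {
    Log().clear();
    Outer();
    Log().push_back("back in caller");
}
```

After Run(), the log reads "ctor", "callee returns", "dtor", "back in caller". The temporary is already gone by the time the caller gets control back, which is exactly the window in which main uses the C object.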

Of course, there are many questions to ask before simply removing the reference. After all, there must have been reasons why that class was made to hold a reference in the first place. Or maybe there weren’t. Perhaps all the functions should have just accepted and returned references, so that d’s life would be tied to s’s.
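For comparison, here is a rough sketch of that alternative, where the getter hands out a reference to the member instead of a copy. RefC and RefS are hypothetical names mirroring C and S; this is not what we ended up doing, and it is only safe under the assumption that the RefS object outlives every RefC created from it.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class D {
public:
    void CantCallThisFunc() {}
};

// Hypothetical counterpart of class C: still stores a reference.
class RefC {
    const std::shared_ptr<D>& m_d_ptr;
public:
    explicit RefC(const std::shared_ptr<D>& d_ptr) : m_d_ptr(d_ptr) {}
    operator std::shared_ptr<D>() const {
        m_d_ptr->CantCallThisFunc();
        return m_d_ptr;
    }
};

// Hypothetical counterpart of class S.
class RefS {
    const std::shared_ptr<D> m_d_ptr;
public:
    explicit RefS(std::shared_ptr<D> d_ptr) : m_d_ptr(std::move(d_ptr)) {}
    // Returns a reference to the member, so no temporary copy is created.
    const std::shared_ptr<D>& GetDPtr() const { return m_d_ptr; }
    // The reference inside RefC now refers to this object's member, which
    // stays valid for as long as this RefS object is alive.
    RefC ProblemFunc() const { return RefC(GetDPtr()); }
};
```

The trade-off is that the lifetime requirement is invisible at the call site: a RefC that escapes and outlives its RefS would dangle in the same way as before.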

In any case, after some deliberation and consultation with my team leader, that was the solution I ended up with. We considered the use-cases of the particular function it was used for and decided it didn’t make much difference.

I’m just glad it’s over.
