Don't crash when walking callstack roots in the debugger#124402

Open
leculver wants to merge 1 commit into dotnet:main from leculver:dac-crash

@leculver
Contributor

Changeset 1fa1745 introduced a stack-walking regression that crashes when we try to unwind or report registers. The real fix is in cgenamd64.cpp and excep.cpp, which now point the context at a local copy before we try to use it.

I've also added some changes in gcinfodecoder.cpp that aren't strictly necessary but would have avoided the problem to begin with.

Fixes #124401.

@dotnet-policy-service
Contributor

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This pull request fixes a critical crash that occurs when debuggers walk callstack roots on Linux amd64 coredumps. The root cause is that context pointers in DAC (Data Access Component) builds resolve through a cache that can evict entries before the pointers are consumed, leading to invalid memory access.

Changes:

  • Fixed AMD64 FaultingExceptionFrame to use local context pointers in DAC builds instead of frame member references
  • Extended SoftwareExceptionFrame context pointer fix from X86-only to all architectures
  • Added defensive checks in GC info decoder to use captured register values in DAC scenarios across all architectures
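
The eviction hazard described above can be sketched in isolation. This is a minimal illustration and not the runtime's code: `TinyCache`, `Context`, `ContextPointers`, and both functions are hypothetical names, and the two-slot cache only stands in for the DAC's host-side cache. To keep the example free of undefined behavior, eviction here overwrites a slot instead of freeing it, so the stale pointer observes another address's data rather than faulting.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical two-slot cache standing in for a DAC-style host-side cache.
struct TinyCache {
    struct Slot {
        std::uint64_t targetAddr;
        std::uint64_t value;
        bool used;
    };
    std::array<Slot, 2> slots{};  // value-initialized: all fields zero/false
    std::size_t next = 0;         // round-robin eviction cursor

    // Simulated read of target memory: the "value" at an address is addr * 10.
    static std::uint64_t ReadTarget(std::uint64_t addr) { return addr * 10; }

    // Returns a host pointer to the cached copy, evicting round-robin on a miss.
    std::uint64_t* Resolve(std::uint64_t addr) {
        for (Slot& s : slots) {
            if (s.used && s.targetAddr == addr) return &s.value;
        }
        Slot& victim = slots[next];
        next = (next + 1) % slots.size();
        victim.targetAddr = addr;
        victim.value = ReadTarget(addr);
        victim.used = true;
        return &victim.value;
    }
};

// Hypothetical, heavily reduced register context and context-pointer set.
struct Context { std::uint64_t Rbx; };
struct ContextPointers { std::uint64_t* Rbx; };

// Hazard: the context pointer refers to cache-owned storage.
std::uint64_t ReadThroughCachePointer(TinyCache& cache) {
    ContextPointers ptrs{ cache.Resolve(0x100) };
    cache.Resolve(0x200);
    cache.Resolve(0x300);   // evicts the slot that held 0x100
    return *ptrs.Rbx;       // now the data for 0x300 (7680), not 0x100
}

// Fix pattern: copy into a local context first, then point at the local copy.
std::uint64_t ReadThroughLocalCopy(TinyCache& cache) {
    Context local{ *cache.Resolve(0x100) };
    ContextPointers ptrs{ &local.Rbx };
    cache.Resolve(0x200);
    cache.Resolve(0x300);   // eviction no longer affects ptrs
    return *ptrs.Rbx;       // still the data for 0x100 (2560)
}
```

With a fresh `TinyCache`, `ReadThroughCachePointer` returns 7680 while `ReadThroughLocalCopy` returns 2560. In the scenario the overview describes, the evicted entry results in invalid memory access rather than a merely wrong value, which is why pointing at a local copy matters.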

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/coreclr/vm/amd64/cgenamd64.cpp | Added DACCESS_COMPILE guards in FaultingExceptionFrame::UpdateRegDisplay_Impl to point context pointers at local pCurrentContext copy instead of m_ctx members |
| src/coreclr/vm/excep.cpp | Expanded DACCESS_COMPILE context pointer fix from TARGET_X86 to all architectures in SoftwareExceptionFrame::UpdateRegDisplay_Impl |
| src/coreclr/vm/gcinfodecoder.cpp | Added DACCESS_COMPILE && TARGET_UNIX guards to use GetCapturedRegister instead of GetRegisterSlot in ReportRegisterToGC and GetStackSlot across all architectures |


-OBJECTREF* pObjRef = GetRegisterSlot( regNum, pRD );
+OBJECTREF* pObjRef;
+#if defined(DACCESS_COMPILE) && defined(TARGET_UNIX)

Member

Why is this Unix specific?

Contributor Author

GetCapturedRegister is only defined under TARGET_UNIX && !FEATURE_NATIVEAOT. I can probably change that if you want me to expand its availability. I'm not deeply familiar with this area of the codebase anymore, so I was just trying to follow conventions.

Contributor Author

@janvorli Something like this I think? I updated the PR to widen that function to all platforms.

Member

@jkotas Feb 13, 2026

GetCapturedRegister was a workaround for HelperMethodFrames. We got rid of HelperMethodFrames. GetCapturedRegister is dead code at the moment and can be deleted (or kept under DECODE_OLD_FORMATS to make it clear that it is not used anymore).

Does the debugger fix require the GetCapturedRegister part of the change?

Contributor Author

Ahh, that's great! No, it doesn't. Let me strip that out of here then; no need to complicate dead code. I'll retest without this to make sure.

Contributor Author

OK, I've removed that side of the code. It was only an extra check here; the current changeset fixes the issue.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Copilot AI review requested due to automatic review settings February 13, 2026 21:29
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@janvorli
Member

Looking at the FaultingExceptionFrame::UpdateRegDisplay_Impl variants, it seems the other architectures would suffer from the same issue.

@leculver
Contributor Author

I'll take a look and keep working on it when I'm back in the office Tuesday morning.

Development

Successfully merging this pull request may close these issues.

DAC crash when walking stack roots

3 participants