The Parts

The time has come to reflect on my misadventures in PC building. Last Christmas I bought components to build my sister a midrange AM4 PC, which included the following:

  • MB : ASUS ROG STRIX Gaming II (WiFi)
  • CPU: AMD Ryzen 7 5700G
  • RAM: 4x8GB G.SKILL Trident Z
  • SSD: Samsung 980 Pro 2TB (replacing the original Samsung 980 Pro 1TB)

Total spend, including some odds and ends (thermal paste, cooler, etc.), was $589. Given that the last PC was basically a decade old, I’m perfectly happy with the cost of these components. (Last PC was AMD FX-8150, built in 2012, with a refresh of failed parts in 2018.)

Absent from the list, of course, is a GPU. She has a 1650 which is being replaced with a used 1080. (It is going to spend its retirement playing the Sims at 1080p, I’m sure it will be fine.)

The Customer Complaint

The whole reason my sister is getting a new PC is because the old one is possessed. TOPH is basically a whole new machine at this point, it has had:

  1. A CPU upgrade (8150 to 8350)
  2. A replacement motherboard (ASUS TUF to some random GIGABYTE)
  3. A replacement SSD after her old one shit itself (Sandisk to Crucial)

I think the only thing that had survived from my original build was a DVD drive and some RAM. Even with all these augmentations, problems abound:

  1. The DVD drive perpetually tries to eject itself. (???)
  2. The system is extremely slow to boot, but otherwise runs OK.
  3. It’s a bulldozer, lmao, it’s slow.

Clue 1: Dying hardware. Keep these six things in the back of your mind, I didn’t.

The Build

The system went together super easily. The motherboard has an integrated IO shield which is great, not cutting myself on that stamped piece of shit was the highlight of the build for sure. I did cut myself on the old CPU cooler, but at this point I have accepted that all PC builds require blood sacrifice.

Her PC has a lot less going on than mine, and most of the cable management was already done. Building in a midtower case for a change was actually kind of nice: it’s light, easy to maneuver, and you don’t need to worry about running out of length on cables when trying to hide them.

With everything assembled, I booted into Memtest86+ and let it do one pass. No issues. I decided that was good enough to try a Windows install. Windows 11 installed so fast that I legitimately thought it would continue copying data after the reboot. Modern flash storage is amazing.

First Issue: Microsoft Account

You now have to make sure the PC cannot reach the internet before OOBE starts, use a secret cheat code to open an elevated command prompt (Shift+F10), and run a command (OOBE\BypassNRO.cmd) to disable the network requirement. It is imperative that you do this before Windows realizes you actually have network access, if you’re past that screen: get fucked, I guess?

This is the only way to set up Windows 11 Pro without an MS account. I knew this was a thing, but hadn’t had to deal with it yet. Absolutely ridiculous.

Fuck you, Microsoft. This is a $200 product, in retail packaging. I should not have to jump through hoops to avoid being a revenue stream.

Second Issue: Drivers

There is an “ROG Strix Gaming WiFi” and “ROG Strix Gaming WiFi II.” This is an important distinction, one that was completely lost on me when browsing ASUS’ horrendously broken site at 1024x768 resolution.

Thankfully the main difference seems to be a Mediatek wireless chipset instead of an Intel part, and the Intel driver refused to install, so no issues there. The more terrifying part is I almost applied the wrong BIOS, but thankfully ASUS’ flash utility actually validates the board model.

I can’t believe motherboards have unnecessary sequels these days. We truly live in interesting times.

After installing the WiFi driver the machine immediately blue screened. It, of course, did this right in the middle of Windows Update. So my brand new Windows install was now hopelessly corrupted.

Third Issue: Blue Screens

I rebooted the machine and it became increasingly unstable, as in it would run for increasingly shorter periods of time between bluescreens. Eventually it would basically boot, log in, and immediately crash. I initially thought it was the WiFi card, since the problem turned up shortly after I had figured out that driver situation, so I tried disabling that in the BIOS. At first it seemed to help, but later it crashed with a different blue screen.

The two exceptions I saw repeatedly were:

  • CRITICAL_PROCESS_DIED
  • UNEXPECTED_STORE_EXCEPTION

The problem with modern Google is this: googling any bugcheck will bring you to only one of these three things:

  1. “Top 10” blogspam advising you how to use some PC cleaner malware to fix all your bluescreen woes.

  2. Microsoft Technet MVPs telling you to run sfc /scannow. Wow. So helpful.

  3. Forums buried in the internet graveyard, where the netizens require you to post a minidump before they will give any advice.

YOU FOOLS. I AM GOOGLING THE ISSUE BECAUSE THERE IS NO MINIDUMP. WHAT I WOULD GIVE FOR THE GUIDING HAND OF KERNEL MEMORY.

There was also a weird “pointer-chasey” error from Windows that I saw once. (I don’t remember the error, just that it had a memory address and was obviously some kind of memory protection violation.) Thankfully I didn’t chase this lead too hard, because (with the power of hindsight) barking down the “bad RAM” troubleshooting tree would have been a huge waste of time.

The strangest thing to me is that I reinstalled Windows twice: once on the 980 Pro, and again on my old 950 Pro. Both times the install finished flawlessly, it was only post-install that there was any instability. This is why I kept assuming it must be some kind of driver issue, but…

Clue 2: decreasing stability over time spent powered on.

Red Herring 1: The SSD

Some bad Google advice said UNEXPECTED_STORE_EXCEPTION indicated bad SSDs. This is, of course, patently false. (It indicates an issue with Windows memory compression. Now I suppose this could hit storage, if things were being paged out. This system is doing virtually nothing and has 32GB of RAM though, so page file usage was at a cool 0.2%.)

Regardless of this being bad advice (in hindsight) I blindly followed it, and threw my old 950 Pro into the system. My reasoning was thus: this CPU only supports PCIe 3.0 on a board that otherwise supports PCIe 4.0, and I was using an SSD that supports PCIe 4.0. I figured there may be some weird compatibility issue. The 950 Pro was the only NVMe drive I had with an old enough interface to rule that possibility out.
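For what it’s worth, a mixed-generation PCIe link is supposed to just train down to the fastest generation both ends support, so on paper the 980 Pro should have quietly run at 3.0 speeds. Here is a rough Python sketch of that arithmetic. The function names are made up for illustration, and the numbers are theoretical x4 link rates, not measured drive throughput.

```python
# Raw per-lane rate in GT/s. PCIe 3.0 and 4.0 both use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0}
ENCODING = 128 / 130


def negotiated_gen(*supported_gens):
    """A link trains to the highest generation that every party supports."""
    return min(supported_gens)


def bandwidth_gb_s(gen, lanes=4):
    """Theoretical one-direction link bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING * lanes / 8


CPU_GEN = 3  # the 5700G tops out at PCIe 3.0

for name, ssd_gen in (("980 Pro", 4), ("950 Pro", 3)):
    gen = negotiated_gen(CPU_GEN, ssd_gen)
    print(f"{name}: PCIe {gen}.0 x4, about {bandwidth_gb_s(gen):.2f} GB/s")
```

Both drives land on the same PCIe 3.0 x4 link, which is why swapping in the 950 Pro was a way to rule out a negotiation problem rather than a performance test.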

To my surprise the system was stable enough to run a gamut of synthetic benchmarks. I naively assumed this must have been the solution, and began to figure out Samsung’s RMA process. That is a rant for another day, but suffice it to say I’m never building a PC again with parts that are outside their retail return window.

Not wanting to wait for Samsung’s RMA process, I bought a 990 Pro to do a little swapperoo:

  1. I get a shiny new 990 Pro
  2. Sister dearest gets a 980 Pro 2TB
  3. I have a spare 980 Pro 1TB (after RMA) for my RAID array, or whatever.

I figured the story would end here, but it didn’t. I was able to play Persona 5 for a few hours with no issue. I left the system to idle overnight, newly cloned replacement SSD attached, and to my great dismay it had bluescreen’d again by morning.

Clue 3: crash with no appreciable system tasks or load.

Red Herring 2: The WiFi Card, again.

The event viewer showed some weird errors about the WiFi card, which now was refusing to operate. It showed no networks, and was “Code 10”, aka out to lunch, in Device Manager.

At this point I’m pretty convinced it must be a motherboard fault. I try to reseat everything on my lunch break: CPU, RAM, SSD, etc. I go to power on the computer and: nothing. No lights, no fan spin, zilch. (When the PSU main is switched on there is a brief flicker of USB lights and that’s it.)

My heart sinks, did I bend a pin? Thermal paste in the socket? Isopropyl where it shouldn’t be? ESD damage in my haste? I’m now convinced that I just made a big mess even bigger.

I go back to work in an absolutely sour mood. I left the computer completely unplugged. (Just in case there was something shorted.) I come back, plug it in, and it boots right up.

What. The. Fuck.

I decide I’m going to pull the whole motherboard out and build out of the case, but when I go to power off the machine it hangs for a long time. “That’s weird”, I thought, because there were no programs running and Windows Update was done at this point. Eventually it shuts down and I notice the fans flicker as the machine shuts off.

Power Supply

I had just upgraded to an ATX 3 power supply, so as fortune would have it I have a spare EVGA 850 T2 kicking around. I plug that shit in on a lark because of the fan brown-out:

  1. 10 passes of MemTest86+ overnight
  2. Windows can idle on twitch.tv for 9 hours
  3. System stable under prime95 (blend) plus SSD benchmarks
  4. System stable with discrete GPU plus Heaven benchmark
  5. System idled overnight.
  6. Haven’t seen it hang during shutdown since.
  7. The MB powering on/off subjectively looks less “brown-out-y”

So in summary a bad power supply looked like the following failures, all simultaneously:

  1. Bad mobo (wifi/storage/pcie issues)
  2. Bad RAM (hangs, BSOD, memory protection faults)
  3. Dead short / dead CPU
  4. Bad regulation

Reflecting back on the old build this could also explain:

  1. Dead mobo
  2. Dead CPU
  3. Possessed DVD drive (which I can’t reproduce anymore)
  4. Dead SSD

The best part is I have no way to test it. The voltages as reported by the system were fine. The power supply tested off-load also seemed fine. You would have probably had to scope it, while loaded, during a transient to actually see the issue. I don’t have a high voltage differential probe, or a substitute DC load, so I ain’t doin that shit.
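To illustrate why the system-reported voltages can look fine while the rail is misbehaving, here is a toy Python simulation. Everything in it is hypothetical: the 200 microsecond sag to 10.5 V, its timing, and the assumption that monitoring software polls about once a second are all invented for illustration, not measurements from this power supply. The only real number is the ±5% tolerance ATX allows on the 12 V rail.

```python
NOMINAL_V = 12.0
V_MIN, V_MAX = NOMINAL_V * 0.95, NOMINAL_V * 1.05  # ATX allows +/-5% on the 12 V rail


def rail_voltage(t_us):
    """Hypothetical rail: steady 12.0 V apart from one 200 us sag to 10.5 V."""
    return 10.5 if 250_000 <= t_us < 250_200 else NOMINAL_V


def out_of_spec(period_us, duration_us=2_000_000):
    """Sample every period_us microseconds; return the readings that fall outside spec."""
    bad = []
    for t_us in range(0, duration_us, period_us):
        volts = rail_voltage(t_us)
        if not V_MIN <= volts <= V_MAX:
            bad.append((t_us, volts))
    return bad


slow = out_of_spec(period_us=1_000_000)  # one reading per second (assumed monitoring pace)
fast = out_of_spec(period_us=10)         # 100 kS/s, closer to what a scope would capture

print(f"1 Hz polling saw {len(slow)} bad readings")
print(f"100 kHz sampling saw {len(fast)} bad readings")
```

In this made-up scenario the slow poller never lands inside the sag, while the fast sampler catches it on every sample that falls within the 200 microsecond window. That is the gist of why a transient like this would need a scope on a loaded supply to show up.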

I think this is also the first time I’ve ever solved an issue by RGB LED fans. So, uh, thanks manufacturers for putting LEDs on everything, I guess?

Despair

[Image: sad-jill]