Recently my bench greeted not one, but two old and pretty exotic GPUs: a pair of Nvidia 9800GX2s. For those unaware, these are essentially two underclocked 9800GTXs SLI-ed in one package. Pretty cool, if I do say so myself. Surprisingly, they work, at least to some degree: one craps out after installing drivers, and the second one initializes only when it detects an HDMI output.

The first GPU is in not-so-great visual condition.

Its physical condition is even worse.

Man, where do I even start with this one…

Firstly, this is silicone-based thermal paste, which is widely known for pretty bad heat conductivity: usually about 1 W/(m·K), compared to about 8 W/(m·K) for pastes with metal additives.

Secondly, there is NO WAY that the paste on the VRAM ICs did anything besides making them even hotter. Such a thick application does the reverse of conducting heat.

Lastly, why is there a damaged trace?

Many questions, few answers.
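To put the paste complaint into rough numbers: the conductive thermal resistance of a flat layer is its thickness divided by (conductivity × area). Below is a minimal Python sketch of that formula. The 10 mm × 12 mm package footprint and both layer thicknesses are made-up illustrative values, not measurements from this card; only the two conductivity figures come from the text above.

```python
def layer_thermal_resistance(thickness_m, conductivity_w_mk, area_m2):
    """Conductive thermal resistance of a flat layer, in K/W: R = t / (k * A)."""
    if thickness_m <= 0 or conductivity_w_mk <= 0 or area_m2 <= 0:
        raise ValueError("thickness, conductivity and area must be positive")
    return thickness_m / (conductivity_w_mk * area_m2)


# Hypothetical footprint of one memory package (assumption, not measured).
AREA = 10e-3 * 12e-3  # m^2

# Thin layer of metal-filled paste: 0.1 mm thick (assumed), 8 W/(m*K).
thin_metal = layer_thermal_resistance(0.1e-3, 8.0, AREA)

# Thick blob of silicone paste: 1 mm thick (assumed), 1 W/(m*K).
thick_silicone = layer_thermal_resistance(1.0e-3, 1.0, AREA)

print(f"thin metal-filled layer: {thin_metal:.2f} K/W")
print(f"thick silicone layer:    {thick_silicone:.2f} K/W")
print(f"ratio:                   {thick_silicone / thin_metal:.0f}x")
```

With these assumed numbers, ten times the thickness combined with one eighth of the conductivity gives about 80 times the thermal resistance, which is why layer thickness matters at least as much as the paste itself.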

I started by repairing the broken trace and cleaning the card.

I want you to look closely at the cores. One is marked G92-450-A2 and the other G92-451-A1. There is hardly any information on the second revision, but supposedly there are no changes to the transistors themselves; it’s only a manufacturing revision.

Time for the harder thing.

VRAM repair.

“Why harder?”, I hear you ask. Because MATS version 295 doesn’t support 3D controllers; it only accepts VGA adapters. This is a problem, since the dead VRAM is on the second PCB, which identifies as a 3D controller.
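For context, “VGA adapter” versus “3D controller” is a distinction made by the PCI class code: display devices have base class 0x03, with subclass 0x00 for VGA compatible controllers and 0x02 for 3D controllers. That this is the field MATS checks is an assumption here, not something confirmed. A small Python sketch that decodes the 24-bit class value (the function and table names are made up for illustration):

```python
# Subclasses of PCI base class 0x03 (display controller).
DISPLAY_SUBCLASSES = {
    0x00: "VGA compatible controller",
    0x01: "XGA compatible controller",
    0x02: "3D controller",
    0x80: "Display controller",
}


def describe_display_class(class_code: int) -> str:
    """Decode a 24-bit PCI class code (base class, subclass, prog-if)."""
    base = (class_code >> 16) & 0xFF      # top byte: base class
    subclass = (class_code >> 8) & 0xFF   # middle byte: subclass
    if base != 0x03:
        return "not a display device"
    return DISPLAY_SUBCLASSES.get(subclass, "unknown display subclass")


print(describe_display_class(0x030000))
print(describe_display_class(0x030200))
```

On Linux the raw value can be read from `/sys/bus/pci/devices/<address>/class`, and `lspci` prints the same classification in text form.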

My idea was to swap the VBIOS for one with the “correct” identifier.

There was only one problem: Nvflash didn’t let me flash the “incorrect” VBIOS file. I was back at square one, with “lspci” showing me only this output.

What did I do?

Did I give up?

Did I sell the GPU as “unknown condition”?

NOPE. I did it the old way. The hard way. The true way.

Since the dictionary attack was unsuccessful, I opted for a brute-force solution: I replaced half the VRAM.

This is where the real problem started.

ALL voltage rails (besides the external 6-pin PCI-E connector) were shorted. That could indicate the true extent of the problems caused by the bad thermal paste.

The desoldered GPU chip looks perfect, which means that no moisture damage was caused while replacing the VRAM, but it’s completely shorted. It just died.

Sure enough, after removing the core all resistances returned to normal.

I can’t find a suitable core to install anywhere; however, a G92-420 should be close enough, so maybe I will repair this PCB later.

Second card

This Leadtek card doesn’t look very good, but it’s still better than the Gainward one.

Moreover, it even works! HDMI quirk included, but I don’t find that worth fussing over.

Let me tell you something: this card is a major pain to disassemble. There are countless different screws, parts of the heatsink overlap each other, and some metal parts need to be bent before disassembly. This GPU is a nightmare to service.

Both cores on this card are G92-450s, manufactured almost 2 months earlier.

This is a bridge chip that, long story short, allows SLI to happen. If you look closely at the bottom right corner, you will find that a small piece of the die has chipped off. The damage is so small that it’s unlikely to cause any problems.

At this point I looked closely at the area near the DVI connectors. However, I didn’t find anything missing or broken, so this card will be an HDMI-only GPU for now.

Surprise!

A wild blue thermopad appeared!

This card used to have the same type of extra squishy white pads on it and I was all out of thermal putty, so I decided to flatten the blue pad with an ordinary kitchen rolling pin.

+1 point to creative problem solving, -10 points to intelligence.

How did I know when to stop rolling?

I didn’t. I was looking for an even spread of the thermal paste and no light leaking between the pads and the VRAM ICs.

How was it?

Not good. I couldn’t adjust the thickness properly: some pads were too thick and some were too thin. I’m sure that given enough time and thermal pads I could get it to work, but thermal putty is just a better solution.

So I waited.

I opted for a different putty and it’s fine overall. It’s a bit thicker than the Upsiren one, but it supposedly has almost twice the conductivity.

GPUs almost ready to be reassembled.

Thankfully it’s still working, and even the idle temperatures are a bit lower. I have tested both GPUs for quite some time and let me tell you, they could substitute for an oven. 100 degrees Celsius wasn’t a problem to reach in Furmark, and 80+ in Unigine Heaven was normal. Additionally, I was testing them during some of the hottest days in Poland and my room was almost melting from the heat.

Thanks for reading!
