I spent hours troubleshooting my NAS until a strip of electrical tape fixed what software couldn’t


I recently went through the difficult process of moving my home server to a new box, and while I was at it I decided to add more storage. Not just any storage, but used enterprise SAS drives, complete with an LSI card I found on eBay for $40. I have very little hands-on experience with these drives, and I knew that by going this route I was exposing myself to potentially painful troubleshooting, but the solution was not what I expected. After hours of throwing software solutions at a hardware problem, I threw a Hail Mary in the form of some electrical tape over a couple of PCIe pins, and it was the miracle fix I needed.


The first problem was power

Not a good first symptom, and an inexplicable one at that

The first problem I encountered wasn’t really a showstopper, but it wasn’t encouraging. On every cold boot with the HBA card installed, the system would repeatedly shut down on overcurrent and power cycle before finally coming to life. Not only was this discouraging, it made no sense: my 650W PSU should be able to handle two SAS drives and an HBA card that needs no auxiliary power. This was the first sign of things to come, and it would muddy the rest of my troubleshooting efforts. I had no idea if the PSU was really to blame; it was pretty old and could certainly stand to be replaced, but I wanted to get this system up and running sooner rather than later, and I didn’t have a spare kicking around.


The HBA option ROM was the next hurdle

Or so I thought

An image of the SAS configuration utility

Once the system actually powered on, it would skip past the BIOS screen and go straight to the LSI card’s option ROM, which would list the drives, tell me the card was working, and let me enter the configuration menu. If I let it try to boot into Proxmox, it froze at the BIOS screen pictured here. Not a hard crash, no errors, just a complete lockup.

This made me suspect that I had to remove the option ROM from the card so that the system wouldn’t try to boot from it. I first tried disabling CSM and blocking every way the system could try to boot from the ROM, but nothing worked. BIOS settings also weren’t being saved between boots, most likely due to a dead CMOS battery.

Attempting to flash the ROM took me down a multi-hour path of various methods to flash the card, only one of which eventually worked. I fought with flashing tools in four different environments before the flash finally went through via a modified EFI shell and sas2flash. I reprogrammed the card with its SAS address, rebooted, watched the screen appear, and it hung in exactly the same place. Back to square one.


What was actually happening with my HBA card

It can happen to you

Here’s what was really going on. Tucked into a PCIe slot, alongside the high-speed data lanes everyone thinks of, sits a small, slow side channel called SMBus. It is a control bus, used not for transferring your data but for housekeeping like reading a sensor or identifying a card.

My HBA and motherboard were both trying to use that bus during the first moments of POST, and they were stepping on each other badly enough to stall the entire boot. This does not happen with every card and motherboard combination, but it is common with certain LSI cards installed in consumer motherboards. The fact that my workstation PC booted fine with the card installed puzzled me until I found a very informative video from Art of Server showing the tape mod for my particular card and describing the exact failure mode I was experiencing.

What did I have to lose? Nothing else had worked, so I grabbed the electrical tape, cut a very small strip, covered the pins as described, and booted into Proxmox with my LSI card and SAS drives detected.
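If you want a quick way to confirm that the host actually sees the card after a fix like this, a minimal Python sketch is below. It assumes a Linux host such as Proxmox with `lspci` (from pciutils) installed, and it assumes the HBA reports itself under the "Serial Attached SCSI controller" device class; a card running RAID firmware may show up under a different class. The function names are mine, purely for illustration.

```python
import shutil
import subprocess

# Assumption: an HBA in plain pass-through mode is listed by lspci
# under this device class name.
SAS_CLASS = "Serial Attached SCSI controller"


def find_sas_controllers(lspci_output):
    """Return the lspci lines that describe a SAS controller."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if SAS_CLASS in line
    ]


def read_lspci():
    """Run lspci and return its text output, or an empty string if it is missing."""
    if shutil.which("lspci") is None:
        return ""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    controllers = find_sas_controllers(read_lspci())
    if controllers:
        for line in controllers:
            print("Found:", line)
    else:
        print("No SAS controller listed (or lspci is not available).")
```

An empty result is not proof of a dead card, but a listed controller is a good sign that the slot, the card, and the tape are all cooperating.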

Photo of the SMBus pins taped over on the LSI card

The reason this works is very simple. The two SMBus pins on the connector, positions B5 and B6, are defined as optional by PCI-SIG, with no behavior required of the add-in card. Simply covering these two pins on the card resolves the boot conflict. Kapton tape is the preferred material for doing this, but I used normal electrical tape to good effect.
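To make the pin positions concrete, here is a small Python sketch of the first eleven contacts on side B of the card edge, the short section before the key notch. The signal names follow the commonly published PCIe pinout as I understand it, so treat the table as an assumption and check it against the video or documentation for your specific card before you reach for the tape. The helper name `pins_to_cover` is made up for this example.

```python
# Side B of the PCIe card edge, pins 1-11 (the section before the key notch).
# Assumption: names follow the commonly published pinout; verify for your card.
SIDE_B_PINS = {
    1: "+12V",
    2: "+12V",
    3: "+12V",        # reserved in early revisions of the spec
    4: "GND",
    5: "SMCLK",       # SMBus clock
    6: "SMDAT",       # SMBus data
    7: "GND",
    8: "+3.3V",
    9: "JTAG TRST#",
    10: "+3.3V aux",
    11: "WAKE#",
}


def pins_to_cover(pinout):
    """Return the side-B pin labels that carry SMBus signals."""
    return [
        f"B{number}"
        for number, signal in sorted(pinout.items())
        if signal.startswith("SM")
    ]


if __name__ == "__main__":
    print("Cover these pins:", ", ".join(pins_to_cover(SIDE_B_PINS)))
```

The pins on either side of the pair carry ground, so keep the strip narrow enough to cover only B5 and B6.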


Before diving in, take some time to read up on the quirks of your hardware

This whole saga could have been avoided if I had done a little more research on the LSI card I bought. I researched the drives pretty thoroughly and knew what to expect, but neglected to do enough preliminary reading about potential quirks like this. This problem is quite common, and I could have saved myself many painful hours of troubleshooting. Even so, I don’t regret going with SAS drives for the NAS, because the HBA gives me plenty of room to add cheap enterprise storage in the future.


