Proxmox 8 - Realtek GbE and HP Smart Array P222 network weirdness

Absolute Proxmox n00b here, with little Linux experience but a good amount of Windows experience (sysadmin for 25+ years).

I’m trying to troubleshoot what is probably a fairly basic networking configuration issue - hoping someone can steer me in the right direction!

I got some pointers from this post by @bladewdr, but I'm not sure yet whether it has anything to do with my issue; I’m starting to think not…

I’m running a Lenovo mini desktop with a Realtek PCIe GbE Family NIC and two drives:

1 x NVMe 250GB with Windows 11
1 x SATA Samsung 2.5" SSD via mobo SATA connector with Proxmox 8

Networking works fine in Windows, and works initially after a fresh install of Proxmox 8.
The router has a DHCP reservation for the Lenovo at 192.168.2.8, but I also specify the same static IP during the Proxmox install just to be safe.

After the first install and an update, I can connect to the management interface in a browser, and my /etc/network/interfaces looks like this:

auto lo
iface lo inet loopback

iface enp3s0f0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.2.8/24
        gateway 192.168.2.1
        bridge-ports enp3s0f0
        bridge-stp off
        bridge-fd 0

… and lspci displays the following:

root@proxmox:~# lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
... <snip>
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 7
01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ef
01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset SATA Controller
01:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset Switch Upstream Port
02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0e)
04:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN770 NVMe SSD (rev 01)
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev da)
05:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
05:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
05:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
05:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor (rev 01)
05:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
root@proxmox:~#

All is as you would expect, and there is no sign of an issue with the NIC or driver.
I then shut down and plug in the RAID adapter. On booting up, things look normal, but I can no longer connect to the management interface remotely, and pinging the router (192.168.2.1) from the Proxmox local console gives a "Destination Host Unreachable" error.

If I reboot into Windows on the NVMe drive, the network card takes about 20 seconds to sort itself out, but then the link light comes on at the switch and everything is fine.

With the RAID adapter installed, lspci shows the following changes (the first line is the new entry):

01:00.0 RAID bus controller: Hewlett-Packard Company Smart Array Gen8 Controllers (rev 01)
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ef
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset SATA Controller
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset Switch Upstream Port
03:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0e)
05:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN770 NVMe SSD (rev 01)

I then ran lspci -v -s 04: to enumerate the resources on that part of the bus and got a bunch of info, including:

        Kernel driver in use: r8169
        Kernel modules: r8169

This was what put me onto the forum post I mentioned at the top, but when I tried to run the deb command in step 2, I got a bash error telling me the command doesn’t exist.

I have tried editing /etc/network/interfaces as follows and rebooting:

auto lo
iface lo inet loopback

iface enp4s0f0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.2.8/24
        gateway 192.168.2.1
        bridge-ports enp4s0f0
        bridge-stp off
        bridge-fd 0

… to match the updated PCIe bus addressing, but it doesn’t seem to fix the issue.

As a complete Proxmox n00b, I have no idea what else to try. Is the problem likely to be the Realtek driver, or have I missed another file that I need to update to account for the change on the PCIe bus? Any suggestions very gratefully received.

Okay, I worked out the solution to my own issue…

Apparently, once the networking code tries to treat the RAID adapter as a network card and assign an IP address to it, something breaks, and either the switch shuts the port down due to traffic weirdness or Proxmox does something similar.

If I edit /etc/network/interfaces to use enp4s0f0 before shutting down and adding the RAID card, it works fine.
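
One way to take the guesswork out of the new name is to ask sysfs which interface the kernel bound to the NIC's PCI address, rather than deriving it from the lspci output by hand. This is a minimal sketch, assuming the NIC sits at 04:00.0 in PCI domain 0000 as in the listing above; the function name nic_for_pci and its optional second argument are made up for illustration.

```shell
#!/bin/sh
# Print the interface name(s) the kernel created for one PCI device.
#   $1  full PCI address including the domain, e.g. 0000:04:00.0
#   $2  sysfs root (defaults to /sys; only there so the function can be
#       pointed at a test directory)
nic_for_pci() {
    net_dir="${2:-/sys}/bus/pci/devices/$1/net"
    if [ -d "$net_dir" ]; then
        ls "$net_dir"
    else
        echo "no network interface found for PCI device $1" >&2
        return 1
    fi
}

# Example (run on the Proxmox host):
#   nic_for_pci 0000:04:00.0
```

Whatever it prints is the name to use in the iface and bridge-ports lines. Running ip -br link lists the same names for all interfaces at once.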

Okay, I will likely be back tomorrow once I come across my next hurdle… 🙂

I’ll go through the rest of this in the morning, but step 2 in that post isn’t a command; it’s a line you need to add to /etc/apt/sources.list.

Once you've added the line, run apt update to refresh your package lists.
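
As a rough sketch of those two steps, the helper below appends a line to a sources file only if it is not already there. The function name add_apt_source is hypothetical, and the actual deb line has to be copied from step 2 of the linked post; it is deliberately not reproduced here.

```shell
#!/bin/sh
# Append a repository line to an APT sources file unless it is already present.
#   $1  the exact "deb ..." line to add
#   $2  sources file (defaults to /etc/apt/sources.list)
add_apt_source() {
    src_file="${2:-/etc/apt/sources.list}"
    # -x: match whole lines only, -F: treat the line as a fixed string
    if ! grep -qxF -- "$1" "$src_file" 2>/dev/null; then
        printf '%s\n' "$1" >> "$src_file"
    fi
}

# Example (as root), followed by the package list refresh:
#   add_apt_source "<deb line from step 2 of the linked post>"
#   apt update
```

Because the helper checks for an exact match first, running it twice does not leave a duplicate entry in the file.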

Inserting the RAID controller adds a device ahead of everything else on the PCIe bus, so every bus number after it shifts by one; your NIC moves from 03:00.0 to 04:00.0. Because the "predictable" interface name is derived from the PCI address, the name changes with it. I had this happen on my workstation too. Predictable Ethernet names are not predictable at all, and PCIe/BIOS/whatever plays tricks like this all the time. The NetworkInterfaceNames page on the Debian Wiki explains how to get the old-school eth0 name back. Since you seem to have only one network interface, this should work fine for you, whether or not the RAID card is inserted.
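
Besides going back to an eth0-style name, systemd can also pin a name to the card's MAC address with a .link file, so the name no longer depends on the PCI bus number. This is a minimal sketch, assuming a systemd-based Proxmox 8 host; the MAC address, the name enlan0, and the helper make_link_file are placeholders for illustration.

```shell
#!/bin/sh
# Emit a systemd .link file that gives the NIC with the given MAC a fixed name.
#   $1  MAC address of the NIC (shown by "ip link")
#   $2  the name to assign
# Type=ether keeps the rule from also matching the bridge, which takes
# over the MAC address of its port.
make_link_file() {
    printf '[Match]\nMACAddress=%s\nType=ether\n\n[Link]\nName=%s\n' "$1" "$2"
}

# Example (as root; the MAC below is a placeholder):
#   make_link_file aa:bb:cc:dd:ee:ff enlan0 > /etc/systemd/network/10-enlan0.link
# Then use enlan0 in the iface and bridge-ports lines and reboot.
```

On some setups the initramfs may also need regenerating (update-initramfs -u) before the new name takes effect at boot; treat that as something to check rather than a given.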

As for understanding what’s going on with the RAID card, dmesg may give you some hints. You should be able to see which driver the kernel attempts to load for it, and if it really is being treated as a network device, you may be able to blacklist that particular module.
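
To see which driver actually ended up bound to the card, sysfs can be queried directly. This is a minimal sketch, assuming the RAID controller is at 01:00.0 in PCI domain 0000 as in the lspci output earlier in the thread; the function name driver_for_pci and its optional second argument are hypothetical.

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, if any.
#   $1  full PCI address including the domain, e.g. 0000:01:00.0
#   $2  sysfs root (defaults to /sys; only there for testing)
driver_for_pci() {
    drv_link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$drv_link" ]; then
        # The symlink points at the driver's directory; its last path
        # component is the driver name.
        basename "$(readlink "$drv_link")"
    else
        echo "no driver bound to $1" >&2
        return 1
    fi
}

# Example (on the Proxmox host):
#   driver_for_pci 0000:01:00.0
#   dmesg | grep -i '01:00.0'
```

A storage driver here (hpsa would be the expected one for a Smart Array card, though that is an assumption) would suggest the card is not being treated as a network device at all, which points back at the renamed NIC. lspci -k -s 01:00.0 shows the same driver information.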