Siduction Forum
Siduction Forum => Software - Support => Topic started by: dibl on 2024/04/01, 15:34:06
-
It's been a very long time since I had any kind of trouble managing graphics drivers, but today it feels like something has changed. This post was moved from the "Upgrades > Kernel 6.8x" item because it's not really on that topic.
At the end, I'll post a few of the search results that I have found, none of which has helped me switch from the nouveau driver to the nvidia driver.
First, let me say my two desktop screens are beautiful and working perfectly with the nouveau driver, so it's not a problem for me if I can't go back to nvidia. But curiosity compels me to ask around, to find out what I've missed.
$ inxi -SG
System:
Host: dibl-MOW Kernel: 6.7.6-1-siduction-amd64 arch: x86_64 bits: 64
Desktop: KDE Plasma v: 5.27.10 Distro: siduction 22.1.2 Masters_of_War -
kde - (202303151559)
Graphics:
Device-1: NVIDIA GP106 [GeForce GTX 1060 6GB] driver: nouveau v: kernel
Display: x11 server: X.Org v: 21.1.11 with: Xwayland v: 23.2.4 driver: X:
loaded: modesetting unloaded: fbdev,vesa dri: nouveau gpu: nouveau
resolution: 1: 1920x1200~60Hz 2: 1920x1080~60Hz
API: EGL v: 1.5 drivers: nouveau,swrast
platforms: gbm,x11,surfaceless,device
API: OpenGL v: 4.5 compat-v: 4.3 vendor: mesa v: 24.0.4-1 renderer: NV136
don@dibl-MOW:~$ sudo apt policy xserver-xorg-video-nouveau
xserver-xorg-video-nouveau:
Installed: (none)
Candidate: 1:1.0.17-3
Version table:
1:1.0.17-3 500
500 https://deb.debian.org/debian unstable/main amd64 Packages
don@dibl-MOW:~$ sudo apt policy nvidia-driver
nvidia-driver:
Installed: 550.67-0siduction2
Candidate: 550.67-0siduction2
Version table:
*** 550.67-0siduction2 500
500 https://liquorix.net/siduction/fixes unstable/non-free amd64 Packages
100 /var/lib/dpkg/status
535.161.08-1 500
500 https://deb.debian.org/debian unstable/non-free amd64 Packages
don@dibl-MOW:~$ nvidia-settings
ERROR: NVIDIA driver is not loaded
(nvidia-settings:72640): GLib-GObject-CRITICAL **: 07:38:05.012: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
** (nvidia-settings:72640): CRITICAL **: 07:38:05.013: ctk_powermode_new: assertion '(ctrl_target != NULL) && (ctrl_target->h != NULL)' failed
don@dibl-MOW:~$ lsmod | grep nouveau
nouveau 2514944 42
drm_gpuvm 16384 1 nouveau
mxm_wmi 12288 1 nouveau
drm_exec 12288 1 nouveau
gpu_sched 32768 1 nouveau
i2c_algo_bit 12288 1 nouveau
drm_ttm_helper 12288 1 nouveau
ttm 69632 2 drm_ttm_helper,nouveau
drm_display_helper 143360 1 nouveau
button 16384 1 nouveau
video 57344 2 asus_wmi,nouveau
wmi 20480 5 video,asus_wmi,wmi_bmof,mxm_wmi,nouveau
So nouveau is loaded, nvidia-driver and all related packages including dkms, nvidia-kernel-dkms, firmware-misc-nonfree, and firmware-nvidia-gsp are installed.
I set /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="nouveau.blacklist=1 quiet systemd.show_status=1"
and updated grub.
I added /etc/default/modprobe.d/blacklist-nouveau.conf:
blacklist nouveau
options nouveau modeset=0
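A side note on the blacklist file: modprobe normally reads its configuration only from directories such as /etc/modprobe.d, /run/modprobe.d and /usr/lib/modprobe.d (or /lib/modprobe.d), so a file stored anywhere else would be ignored. The following is a minimal sketch, not taken from the original post, that lists which .conf files in a set of directories really blacklist a module. The function name find_blacklist is made up for illustration.

```shell
# Print every *.conf file in the given directories that blacklists MODULE.
# Usage: find_blacklist MODULE DIR...
# (find_blacklist is a hypothetical helper, not part of kmod.)
find_blacklist() {
    mod=$1
    shift
    for dir in "$@"; do
        # Silently skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        # -l prints only the names of files with a matching line;
        # "no match" is not treated as an error.
        grep -lE "^[[:space:]]*blacklist[[:space:]]+$mod([[:space:]]|\$)" \
            "$dir"/*.conf 2>/dev/null || true
    done
}
```

It could be called as: find_blacklist nouveau /etc/modprobe.d /run/modprobe.d /usr/lib/modprobe.d /lib/modprobe.d. The command modprobe --showconfig | grep nouveau shows what modprobe itself has picked up.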
~/.xsession-errors -- Successful boot, nouveau not blacklisted:
Xsession: X session started for don at Mon Apr 1 08:22:50 AM EDT 2024
dbus-update-activation-environment: setting DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
dbus-update-activation-environment: setting DISPLAY=:0
dbus-update-activation-environment: setting XAUTHORITY=/tmp/xauth_mpNVdL
dbus-update-activation-environment: setting XDG_CURRENT_DESKTOP=KDE
localuser:don being added to access control list
dbus-update-activation-environment: setting QT_ACCESSIBILITY=1
qt.svg: /usr/share/wallpapers/mow.svg:286:33: Could not resolve property: #linearGradient2611
(This last line is related to my "live" wallpaper setup and is not a problem)
Booting with nouveau blacklisted:
Xsession: X session started for don at Mon Apr 1 07:18:41 AM EDT 2024
dbus-update-activation-environment: setting DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
dbus-update-activation-environment: setting DISPLAY=:0
dbus-update-activation-environment: setting XAUTHORITY=/tmp/xauth_AABZNz
dbus-update-activation-environment: setting XDG_CURRENT_DESKTOP=KDE
localuser:don being added to access control list
dbus-update-activation-environment: setting QT_ACCESSIBILITY=1
qt.svg: /usr/share/wallpapers/mow.svg:286:33: Could not resolve property: #linearGradient2611
kdeinit5_wrapper: Warning: connect(/run/user/1000/kdeinit5__0) failed: : No such file or directory
Error: Can not contact kdeinit5!
org.kde.startup: "kdeinit5_shutdown" () exited with code 255
startkde: Starting up...
startkde: Shutting down...
startkde: Done.
I reviewed Debian wiki:
https://www.startpage.com/sp/search?query=debian+wiki+nvidia+graphics&cat=web&pl=opensearch&language=english
and
https://wiki.debian.org/NvidiaGraphicsDrivers/Troubleshooting
and Debian forums.
Now I'm scratching my head. Is it old age? ;-)
-
And what does your /etc/X11/xorg.conf.d/20-nvidia.conf (or similar) look like?
I don't know if it is still needed these days, but I've got this config with
Section "Device"
Identifier "Device 0"
Driver "nvidia"
EndSection
-
My system is only a hybrid ("Optimus") system, so I can't really tell how it should read, but ...
kdeinit5_wrapper: Warning: connect(/run/user/1000/kdeinit5__0) failed: : No such file or directory
Error: Can not contact kdeinit5!
org.kde.startup: "kdeinit5_shutdown" () exited with code 255
... why is there no "/run/user/1000/kdeinit5__0" file? It is unlikely, that this is "nvidia"-dependent.
Also
startkde: Starting up...
startkde: Shutting down...
startkde: Done.
seems to say, that you did not start it by/from a display manager (sddm)?
[A display manager should have the same problem starting with the nvidia-driver.]
Also for me this reads "plasma...", not "...kde".
The KDE people - to my knowledge - now prefer wayland over xorg. Is a mismatch possible? System assumes xorg, but default is wayland?
Finally, you could test in terminal mode whether the nvidia driver modules are loaded at all (lsmod).
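The lsmod check suggested above can be narrowed to the GPU drivers. This is only a sketch: the helper name loaded_gpu_modules is invented here, and it assumes the relevant module names start with "nouveau" or "nvidia".

```shell
# Read `lsmod` output on stdin and print only the loaded modules whose
# name starts with "nouveau" or "nvidia".
# (loaded_gpu_modules is a hypothetical helper name.)
loaded_gpu_modules() {
    # Column 1 of lsmod is the module name; helper modules that merely
    # list nouveau in their "Used by" column are not printed.
    awk '$1 ~ /^(nouveau|nvidia)/ { print $1 }'
}
```

It could be called as: lsmod | loaded_gpu_modules. An empty result would mean that neither driver is currently loaded.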
-
The real problem is that, as long as the nouveau module is loaded, no nvidia blob can be used.
-
The real problem is that, as long as the nouveau module is loaded, no nvidia blob can be used.
Exactly correct! And we all know what happens to the terminal display when I modprobe -r nouveau. :-X
Thanks @geier0815 -- my /etc/X11/xorg.conf.d does not contain the 20-nvidia.conf file, but as far as I know, this is no longer needed. My other nvidia desktop does not have it, and it is running the nvidia driver correctly.
I have been setting "Wayland" in my sddm login for over a year with no issues on that.
Maybe I can figure out a way to use the chroot helper on the siduction ISO to unload nouveau and load nvidia.
Or maybe I go to a VESA driver first, then nvidia. But how to remove nouveau? The xserver-xorg-video-nouveau package is already removed.
-
But how to remove nouveau?
With my Ubuntu 18.04 LTS [I never upgraded beyond that because of "forced snap"], I always used
module-load=nvidia blacklist=nouveau
in the grub kernel-load line ("linux").
With Debian 11: nouveau.modeset=0
With Debian 12: nomodeset
Then I could "rmmod (=modprobe -r) nouveau".
Maybe one of those, or a combination, can help you too :).
The xserver-xorg-video-nouveau package is already removed.
You may even (re-)install that; it is just the Xorg part. The problematic part is the nouveau kernel driver, which is included automatically in the kernel package.
-
Thank you, @ro_sid. Tomorrow will bring new energy to attack this issue. ;D
-
Digging, digging, digging ...
I watched closely while running apt install --reinstall nvidia-driver. It appears to me that the dkms process to build the nvidia module is not actually working (I have dkms and nvidia-kernel-dkms installed). Here is the output of the installation command:
Reading package lists...
Building dependency tree...
Reading state information...
The following package was automatically installed and is no longer required:
libabsl20220623t64
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 520 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 https://cdn.liquorix.net/siduction/fixes unstable/non-free amd64 nvidia-driver amd64 550.67-0siduction2 [520 kB]
Fetched 520 kB in 1s (637 kB/s)
(Reading database ... 282371 files and directories currently installed.)
Preparing to unpack .../nvidia-driver_550.67-0siduction2_amd64.deb ...
Unpacking nvidia-driver (550.67-0siduction2) over (550.67-0siduction2) ...
Setting up nvidia-driver (550.67-0siduction2) ...
root@dibl-MOW:~#
That's all of it -- no dkms module build output.
Following that exercise, here is the output of modprobe nvidia
modprobe: FATAL: Module nvidia not found in directory /lib/modules/6.7.6-1-siduction-amd64
Question: Is the 550.67 driver ONLY for kernel 6.8 and later? I'm still on 6.7.
-
Oh, well, sorry, the name of the Nvidia-Siduction-driver 550 is "nvidia-current", not just "nvidia"!
And I built the 550 driver version immediately before the actual one (supplied by towo, also) successfully with kernel version 6.7(.10), too!
[I deleted that kernel now, but I see no reason, why this version should not compile with a 6.7 kernel, also.]
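To see which module names are really available to modprobe, the DKMS output directory can be listed. A minimal sketch, assuming the modules live below /lib/modules/<kernel>/updates/dkms as on the systems in this thread; list_nvidia_modules is a made-up helper name.

```shell
# Print the loadable names (file name without the .ko suffix) of all
# nvidia* kernel modules found below the directory given as $1.
# (list_nvidia_modules is a hypothetical helper name.)
list_nvidia_modules() {
    find "$1" -type f -name 'nvidia*.ko*' 2>/dev/null |
        # Strip the leading path, then .ko and any compression suffix.
        sed 's|.*/||; s|\.ko.*$||' |
        LC_ALL=C sort
}
```

It could be called as: list_nvidia_modules "/lib/modules/$(uname -r)/updates/dkms". Any name it prints can then be inspected with modinfo.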
-
nvidia-driver is only a metapackage, so reinstalling it will never trigger a rebuild of the kernel modules.
dpkg-reconfigure nvidia-kernel-dkms
would do that, as would
apt install --reinstall nvidia-kernel-dkms
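After either command, the output of dkms status should show whether the module was built for the running kernel. The sketch below assumes that each status line names the module, the kernel version and the word "installed"; the exact line format differs between dkms versions, and dkms_installed_for is a made-up helper name.

```shell
# Succeed if the `dkms status` text on stdin contains a line that mentions
# MODULE ($1) and KERNEL ($2) and is marked "installed".
# (dkms_installed_for is a hypothetical helper; the line format is assumed.)
dkms_installed_for() {
    module=$1
    kernel=$2
    # -F treats module and kernel names as fixed strings, not patterns.
    grep -F "$module" | grep -F "$kernel" | grep -q 'installed'
}
```

It could be called as: dkms status | dkms_installed_for nvidia-current "$(uname -r)" && echo "module built for this kernel".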
-
Thanks, @towo. With your hints, I was able to see it build the nvidia-current module. But, of course the GPU is already bound to the nouveau module, and so when it is unloaded the screen is lost, and apparently some keyboard input is also terminated. I attempted some commands "while blind" like
rmmod nouveau && modprobe nvidia-current && systemctl isolate multi-user.target
(didn't work). I know the keyboard is not totally dead, because Ctrl-Alt-Del will still send the reboot signal.
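One possible reason for a chain like this to stop early: rmmod refuses to unload a module whose use count is not zero, and with && nothing after the failed command runs. The following sketch reads the use count from lsmod output before trying; module_use_count is a made-up helper name.

```shell
# Read `lsmod` output on stdin and print the use count (third column)
# of the module named in $1; prints nothing if it is not loaded.
# (module_use_count is a hypothetical helper name.)
module_use_count() {
    awk -v m="$1" '$1 == m { print $3 }'
}
```

It could be called as: lsmod | module_use_count nouveau. A value above 0 means something is still holding the module.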
-
Normally, the nvidia packages do everything that is needed, including blacklisting nouveau.
If nouveau is loaded anyway, you have to search for the cause; maybe your initrd was not rebuilt after the nvidia install.
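Whether the initrd still carries nouveau can be checked from its file listing. This is a sketch that assumes initramfs-tools (which provides lsinitramfs) is in use; modules_in_listing is a made-up helper name.

```shell
# Read an initramfs file listing on stdin and print the names of the
# kernel modules in it that match the extended regex given as $1.
# (modules_in_listing is a hypothetical helper name.)
modules_in_listing() {
    pattern=$1
    # Keep only *.ko files (optionally compressed, e.g. .ko.xz) and reduce
    # each path to the bare module name; "no match" is not an error.
    sed -n 's|.*/\([^/]*\)\.ko\(\.[a-z0-9]*\)\{0,1\}$|\1|p' |
        grep -E "$pattern" || true
}
```

It could be called as: lsinitramfs "/boot/initrd.img-$(uname -r)" | modules_in_listing 'nouveau|nvidia'. If nouveau shows up there, update-initramfs -u rebuilds the image for the running kernel.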
-
I have discovered that with "nomodeset" included in the grub command options line, it is possible to rmmod nouveau after booting to runlevel 3, tty1. This tty goes black, but I can then log in on another tty and isolate the multi-user target, which gives me back tty1. lsmod verifies that nouveau is not loaded.
Then I can use either of towo's commands to build the nvidia modules. It shows they are installed in their correct directory.
The command modprobe nvidia-current returns quickly to the prompt with no feedback or error message. However, lsmod shows no loaded nvidia modules.
I have rebuilt the initramfs at this point, but of course that has no effect.
So, I am now officially in the nouveau-lovers club. ;D
-
[...]
The command modprobe nvidia-current
returns quickly to the prompt with no feedback or error message. However, lsmod shows no loaded nvidia modules.
I have never experienced such a result! Either "modprobe" complained in some way (e.g. no "such" hardware available), or the corresponding module had been loaded, though not necessarily being "in use" (i.e. a "0" and no "using modules" shown by lsmod).
I have rebuilt the initramfs at this point, but of course that has no effect.
Well, if it (=nvidia(-current)) is (now) really included in the initramfs-image, you might be able to use the "module-load=nvidia-current" kernel option now with grub.
Once this really works as desired, we can get rid of all the grub options, because as @towo already pointed out, a proper install should set this up automatically. But first we need a proof of concept.
So, I am now officially in the nouveau-lovers club. ;D
Well, as far as I have read - but not experienced -, it should work fine as long as there is no "3d-fast-action" required (e.g. in games) and no GPU-computations or "-multi-media-live-decodings".
-
.... you might be able to use the "module-load=nvidia-current" kernel option now with grub.
I will test that today.
... it should work fine as long as there is no "3d-fast-action" required (e.g. in games) and no GPU-computations or "-multi-media-live-decodings".
I don't need Cuda or the most extreme capabilities of the Nvidia driver. It's just been my habit for many years to run the desktops with Nvidia graphics and their driver. It won't cause any problem to use Nouveau.
Thank you, @ro_sid, for your attention and ideas. :D
-
... if it (=nvidia(-current)) is (now) really included in the initramfs-image, you might be able to use the "module-load=nvidia-current" kernel option now with grub.
Tried this -- after unloading nouveau, building nvidia-current, and running "modprobe nvidia-current", I updated initramfs, edited the grub-cmd line, updated grub, and rebooted.
nouveau driver was automatically loaded and working beautifully. :'(
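After a reboot like this, it may be worth confirming that the edited option really reached the running kernel. A minimal sketch that tests a command line string for one exact parameter; cmdline_has is a made-up helper name.

```shell
# Succeed if the parameter $1 appears as a whole word in the kernel
# command line text $2 (for example the contents of /proc/cmdline).
# (cmdline_has is a hypothetical helper name.)
cmdline_has() {
    # Pad with spaces so the parameter only matches as a complete token.
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

It could be called as: cmdline_has nomodeset "$(cat /proc/cmdline)" && echo "option is active".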
-
Thanks for reporting. I am really mystified! Your system must be something special :).
I have looked it up in "my" configuration: the package "nvidia-kernel-support" should make it impossible for the nouveau driver/module to load by blacklisting it. That package is normally installed by the nvidia driver metapackage.
At this time, I am clueless, sorry.
-
Thanks, @ro_sid, for your efforts. Today, we can be slightly less clueless.
I carefully stepped through the kabuki dance of edits to the grub command options and the /etc/modprobe.d/blacklist.nouveau, and rebuilt the nvidia modules and modprobed, and saw the same result. Frustrated, I issued modprobe nvidia*. I was surprised to see this error output, which did not come out of modprobe nvidia-current:
modprobe: FATAL: Module nvidia-driver_error.txt not found in directory /lib/modules/6.8.3-1-siduction-amd64
I'm way over my head on this, but does this mean something is missing from the driver package?
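A likely explanation, offered as an assumption rather than a confirmed diagnosis: the shell expands an unquoted nvidia* against the files in the current directory before modprobe runs, so modprobe was handed a file name instead of a module pattern. The sketch below reproduces that effect in a scratch directory; show_args and demo_dir are made up for illustration.

```shell
# Print each argument exactly as the called command would receive it.
# (show_args is a hypothetical stand-in for modprobe.)
show_args() {
    printf '[%s]\n' "$@"
}

# Scratch directory containing one file whose name starts with "nvidia".
demo_dir=$(mktemp -d)
touch "$demo_dir/nvidia-driver_error.txt"

# Unquoted: the shell replaces nvidia* with the matching file name.
( cd "$demo_dir" && show_args nvidia* )     # prints [nvidia-driver_error.txt]

# Quoted: the pattern is passed through unchanged.
( cd "$demo_dir" && show_args 'nvidia*' )   # prints [nvidia*]

rm -r "$demo_dir"
```

If that is what happened, a file named nvidia-driver_error.txt exists in the directory where the command was run.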
-
[...]
modprobe: FATAL: Module nvidia-driver_error.txt not found in directory /lib/modules/6.8.3-1-siduction-amd64
I'm way over my head on this, but does this mean something is missing from the driver package?
I do not think so. This is the contents of my directory "/lib/modules/6.8.3-1-siduction-amd64/updates/dkms":
-rw-r--r-- 1 root root 24720 Apr 4 00:06 bbswitch.ko
-rw-r--r-- 1 root root 203864 Apr 4 00:07 nvidia-current-drm.ko
-rw-r--r-- 1 root root 1758288 Apr 4 00:07 nvidia-current-modeset.ko
-rw-r--r-- 1 root root 6776 Apr 4 00:07 nvidia-current-peermem.ko
-rw-r--r-- 1 root root 2839576 Apr 4 00:07 nvidia-current-uvm.ko
-rw-r--r-- 1 root root 62502600 Apr 4 00:07 nvidia-current.ko
where bbswitch.ko is from another, although related, package.
-
I have the same 5 nvidia-x modules and the file sizes match yours.
During boot, I noticed an error about nvidia-persistenced.service not starting. I'm looking into that.