Author Topic: [EN] Problem with random rebooting  (Read 1439 times)

Offline ninefix

  • User
  • Posts: 5
[EN] Problem with random rebooting
« on: 2024/01/17, 08:01:51 »
Hi all, I've been using Siduction for many years without experiencing any particular problems, but for the past month I've been struggling to understand why my system suddenly reboots. It happens both while I'm working on it and while I'm away from the PC, at random.

My machine type: Laptop System: SCHENKER product: SCHENKER_DOCK_15_SDO15L18

Spec:

CPU: 6-core Intel Core i7-8700T, 64GB RAM, Intel UHD Graphics 630

There is no CPU temperature problem: I'm actively monitoring the sensors, and the CPU and system temperatures stay under 50 °C.

I've already run a full RAM test, and it found no problems.

I suspect that the problem is the video card.

I've experimented with different kernels and different kernel parameters, as suggested in various posts on the internet about problems with Intel video cards, but so far the problem persists.

Has anyone experienced something similar, or does anyone have any hints?

Thanks so much for your time

Offline hendrikL

  • Administrator
  • User
  • *****
  • Posts: 933
Re: Problem with random rebooting
« Reply #1 on: 2024/01/17, 09:18:06 »
Please give us the output of "inxi -Sa" and the output of "inxi -G".

Offline ninefix

  • User
  • Posts: 5
Re: Problem with random rebooting
« Reply #2 on: 2024/01/17, 09:28:46 »
Code:
inxi -Sa
System:
  Host: a0nf69 Kernel: 6.6.10-1-siduction-amd64 arch: x86_64 bits: 64
    compiler: gcc v: 13.2.0 clocksource: tsc available: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.6.10-1-siduction-amd64
    root=UUID=009a439f-e425-43ca-815f-7e46fbc5adfc ro quiet
    systemd.show_status=0 splash intel_idle.max_cstate=1 i915.enable_dc=0
    i915.enable_psr=0 i915.enable_guc=0 i915.enable_execlists=0
    ahci.mobile_lpm_policy=1
  Desktop: KDE Plasma v: 5.27.10 tk: Qt v: 5.15.10 wm: kwin_x11 vt: 2
    dm: SDDM Distro: siduction 17.1.0 Patience - kde - (201703051755)
    base: Debian GNU/Linux trixie/sid


Code:
inxi -G
Graphics:
  Device-1: Intel CoffeeLake-S GT2 [UHD Graphics 630] driver: i915 v: kernel
  Device-2: Generalplus 808 Camera driver: snd-usb-audio,uvcvideo type: USB
  Device-3: Chicony USB2.0 Camera driver: uvcvideo type: USB
  Display: x11 server: X.Org v: 21.1.10 with: Xwayland v: 23.2.3 driver: X:
    loaded: modesetting unloaded: fbdev,vesa dri: iris gpu: i915 resolution:
    1: 1920x1080~60Hz 2: 1920x1080~60Hz 3: N/A
  API: EGL v: 1.5 drivers: iris,swrast platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6 vendor: intel mesa v: 23.3.2-2 renderer: Mesa Intel
    UHD Graphics 630 (CFL GT2)

Offline dibl

  • siduction community member
  • Global Moderator
  • User
  • *****
  • Posts: 2.358
    • Land of the Buckeye
Re: Problem with random rebooting
« Reply #3 on: 2024/01/17, 16:51:14 »
Random reboots, in my experience, are almost always caused by a hardware fault. Could be internal power supply, a cold solder joint, a failing component on the system board, etc. Electronic components eventually break down from the heating and cooling cycles.
System76 Oryx Pro, Intel Core i7-11800H, SSD 970 EVO Plus;  Asus ROG STRIX X299-E, Core i7-7740X, Nvidia GTX-1060, dual monitors, SSD 860 EVO

Offline ninefix

  • User
  • Posts: 5
Re: Problem with random rebooting
« Reply #4 on: 2024/01/27, 08:59:25 »
I've replaced the NVMe SSD with a new one, and the problem seems to happen less often.

I suspect the problem could be caused by the NVMe drive running hot, because I see that it can sometimes reach 85-90 °C under load.
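
One way to keep an eye on the drive temperature without extra tools is to read it from sysfs. This is only a minimal sketch, assuming a kernel that exposes the NVMe temperature through the hwmon interface under /sys/class/hwmon with the sensor name "nvme" and values in millidegrees Celsius. The function name read_nvme_temps and its hwmon_root parameter are made up for illustration.

```python
from pathlib import Path


def read_nvme_temps(hwmon_root="/sys/class/hwmon"):
    """Return {sensor: degrees Celsius} for every NVMe hwmon sensor found.

    Assumption: NVMe devices show up as hwmon entries whose "name" file
    contains "nvme", with tempN_input files holding millidegrees Celsius.
    """
    temps = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return temps
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        if not name_file.is_file() or name_file.read_text().strip() != "nvme":
            continue
        for inp in sorted(dev.glob("temp*_input")):
            # Use the human-readable label (e.g. "Composite") when present.
            label_file = dev / inp.name.replace("_input", "_label")
            label = label_file.read_text().strip() if label_file.is_file() else inp.name
            try:
                millidegrees = int(inp.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor not readable right now, skip it
            temps[f"{dev.name}/{label}"] = millidegrees / 1000.0
    return temps


if __name__ == "__main__":
    for sensor, celsius in read_nvme_temps().items():
        print(f"{sensor}: {celsius:.1f} °C")
```

Running it in a loop during heavy disk activity should show whether the new drive still gets close to the values above.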

Offline dibl

  • siduction community member
  • Global Moderator
  • User
  • *****
  • Posts: 2.358
    • Land of the Buckeye
Re: Problem with random rebooting
« Reply #5 on: 2024/01/27, 11:39:44 »
Quote from: ninefix
"I see that it can reach 85-90 °C sometimes (under load)"

There's the problem -- 85 °C is too hot. Most common electronics are rated for 0-70 °C or 0-80 °C. It is shutting down to protect itself from thermal failure.

https://www.electronics-cooling.com/2004/02/the-temperature-ratings-of-electronic-parts/

https://en.m.wikipedia.org/wiki/High-temperature_operating_life

You need to make sure the fan is working and there's no accumulation of dust.
« Last Edit: 2024/01/27, 11:53:32 by dibl »

Offline ro_sid

  • User
  • Posts: 223
Re: Problem with random rebooting
« Reply #6 on: 2024/01/27, 11:55:35 »
This is why PCIe 4.0 and, even more so, 5.0 SSDs are (often) mixed blessings. I do not understand the hype about them: it is speed versus heat. You need good cooling, which often also means a lot of space for heat sinks, as well as fans.

Offline edlin

  • User
  • Posts: 542
Re: Problem with random rebooting
« Reply #7 on: 2024/01/27, 12:31:10 »
The temperatures are too high! Samsung, for example, specifies an operating temperature of 0°C to 70°C for its 990 EVO NVMe™ M.2 SSD - 2 TB. And this range is known to me as the typical operating range for NVMe SSDs.

What temperatures do you achieve with the new SSD? If the values are that high again, then you should immediately look around for a suitable heat sink! High temperatures do not necessarily have to be immediately destructive, but they do significantly reduce the service life (Arrhenius relationship).
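
To put a rough number on the Arrhenius relationship: the acceleration factor between two temperatures is exp((Ea / k) * (1/T_ref - 1/T_hot)), with temperatures in kelvin. The sketch below is only an illustration under stated assumptions. The activation energy of 0.7 eV is a commonly used placeholder, not a value published for any particular SSD, and the function name arrhenius_acceleration is made up here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_ref_c, t_hot_c, activation_energy_ev=0.7):
    """Factor by which ageing speeds up at t_hot_c compared with t_ref_c.

    Temperatures are given in degrees Celsius. The default activation
    energy (0.7 eV) is an assumed placeholder, not a vendor figure.
    """
    t_ref_k = t_ref_c + 273.15
    t_hot_k = t_hot_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_ref_k - 1.0 / t_hot_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Compare the 70 °C upper rating with the 85 °C reported in this thread.
    print(f"{arrhenius_acceleration(70, 85):.2f}x faster ageing (under the assumed Ea)")
```

With that assumed activation energy, running at 85 °C instead of 70 °C comes out at somewhat more than double the ageing rate. The real figure depends on the failure mechanism.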

edlin
The wise man learns from everything and everyone,
the ordinary man from his experiences,
and the fool knows everything better.

Socrates

Offline ninefix

  • User
  • Posts: 5
Re: Problem with random rebooting
« Reply #8 on: 2024/01/29, 15:55:48 »
Quote from: ro_sid
"This is why PCIe 4.0 and more so 5.0 SSDs are (often) mixed blessings. I do not understand the hype about them - speed versus heat. You need a good cooling, which often also means a lot of space for heat sinks, as well as fans."

I know, but that is a problem because there is no space between the SSD and the notebook's back cover.

Offline ninefix

  • User
  • Posts: 5
Re: Problem with random rebooting
« Reply #9 on: 2024/01/29, 16:02:11 »
Quote from: edlin
"The temperatures are too high! Samsung, for example, specifies an operating temperature of 0°C to 70°C for its 990 EVO NVMe™ M.2 SSD - 2 TB. And this range is known to me as the typical operating range for NVMe SSDs.

What temperatures do you achieve with the new SSD? If the values are that high again, then you should immediately look around for a suitable heat sink! High temperatures do not necessarily have to be immediately destructive, but they do significantly reduce the service life (Arrhenius relationship)."

I can see the nvme-pci temperature reported by lm-sensors reaching 84 °C sometimes.

Furthermore, I have this in the log. The following warning/error was logged by the smartd daemon:

Code:
Device: /dev/nvme0, Critical Warning (0x02): Temperature

Device info:
Samsung SSD 980 250GB, S/N:S64BNJ0R607134D, FW:1B4QFXO7, 250 GB
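
For context, the 0x02 in the smartd message is a bit field. Below is a minimal sketch of decoding it, assuming the bit meanings of the Critical Warning byte in the NVMe SMART / Health Information log as I understand them from the NVMe base specification (worth double-checking there). The function name decode_critical_warning is made up for illustration.

```python
# Assumed bit layout of the NVMe "Critical Warning" byte (verify against the spec).
NVME_CRITICAL_WARNING_BITS = {
    0: "available spare capacity below threshold",
    1: "temperature above or below the configured threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
}


def decode_critical_warning(value):
    """Return the descriptions of all warning bits set in value."""
    return [
        text
        for bit, text in sorted(NVME_CRITICAL_WARNING_BITS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # 0x02 is the value from the smartd message above.
    for warning in decode_critical_warning(0x02):
        print(warning)
```

Under that reading, 0x02 sets only bit 1, which matches the "Temperature" text smartd printed.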

I don't really know what I can do. I've already cleaned the dust out of the notebook, and I use a laptop cooling pad.

It's also true that I use the computer heavily, with 20-25 GB of RAM normally in use.

Offline ro_sid

  • User
  • Posts: 223
Re: Problem with random rebooting
« Reply #10 on: 2024/01/29, 17:55:12 »
Try a "cooler" SSD. I am using PCIe 3.0 SSDs (2 TB) from Samsung in USB housings. They run (much) cool(er), as I can feel just by touching the housing. Maybe you can borrow such an SSD to test reboot stability.
Also check whether you can improve the thermal contact between the SSD and your laptop housing, if the housing is metal or some other good thermal conductor.
That is all that comes to my mind for now.