Siduction Forum

Siduction Forum => Software - Support => Topic started by: jyp on 2021/11/09, 20:42:48

Title: k10temp problem with latest kernel
Post by: jyp on 2021/11/09, 20:42:48
Tdie is no longer shown with the latest kernel. I reconfigured lm-sensors, but with no result.
Not a big thing but surprising and a bit annoying.

Code: [Select]
:~$ uname -a
Linux gamma 5.15.1-1-siduction-amd64 #1 SMP PREEMPT siduction 5.15-1 (2021-11-06) x86_64 GNU/Linux

:~$ sensors
...
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +45.2°C
Tccd1:        +31.0°C
...

Code: [Select]
~$ uname -a
Linux gamma 5.14.15-2-siduction-amd64 #1 SMP PREEMPT siduction 5.14-15.1 (2021-10-28) x86_64 GNU/Linux

:~$ sensors
...
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +41.1°C
Tdie:         +41.1°C
Tccd1:        +30.8°C
...

Thanks for your attention
jyp
Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/09, 23:50:44
Confirmed here:
No sensors at all found on current kernel.

On the other hand with kernel 5.14.16-2-siduction-amd64:

Code: [Select]
$ sensors
amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:      718.00 mV
vddnb:       999.00 mV
edge:         +31.0°C
power1:      1000.00 uW

nvme-pci-0300
Adapter: PCI adapter
Composite:    +36.9°C  (low  =  -0.1°C, high = +79.8°C)
                       (crit = +83.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +33.5°C
Tdie:         +33.5°C
Title: Re: k10temp problem with latest kernel
Post by: fams on 2021/11/10, 07:42:22
Same here (Linux Ryzen 5.15.1-1-siduction-amd64 #1 SMP PREEMPT siduction 5.15-1 (2021-11-06) x86_64 GNU/Linux)
Code: [Select]
amdgpu-pci-0a00
Adapter: PCI adapter
vddgfx:      943.00 mV
fan1:          10 RPM  (min =    0 RPM, max = 4600 RPM)
edge:         +45.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       17.03 W  (cap =  48.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +36.0°C 
Tccd1:        +32.5°C 

nvme-pci-0100
Adapter: PCI adapter
Composite:    +36.9°C  (low  = -273.1°C, high = +84.8°C)
                       (crit = +84.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)

gigabyte_wmi-virtual-0
Adapter: Virtual device
temp1:        +31.0°C 
temp2:        +30.0°C 
temp3:        +36.0°C 
temp4:        +34.0°C 
temp5:        +33.0°C 
temp6:        +41.0°C 

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +16.8°C  (crit = +20.8°C)



I remember reading somewhere that Tdie and Tctl are linked by a fixed offset (set by AMD) and give the hottest point.
This would be an explanation for that omission.
Tccd1 should be the temperature of one Core Complex Die, so one value for each core. The question is why only one value (for core 1) is shown.
But maybe a more competent person comments here...
Some googling may help, too   ;)
Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/10, 10:01:53
Have you found this statement (https://www.hwinfo.com/forum/threads/is-cpu-tctl-value-is-still-a-tdie-value-on-ryzen.4977/) yet?
Quote
That depends on particular model and Tctl=Tdie only if the CPU doesn't use an offset (Tctl_offset).
If you see in sensors a "CPU (Tctl/Tdie)" value, it means they are same. Otherwise there will be 2 values shown Tctl and Tdie.

I don't have a Ryzen and everything is fine with me.   ;)
(https://i.imgur.com/ibo9wk9t.png) (https://i.imgur.com/ibo9wk9.png)
Code: [Select]
sensors -u
coretemp-isa-0000
Adapter: ISA adapter
Core 0:
  temp2_input: 36.000
  temp2_max: 74.000
  temp2_crit: 100.000
  temp2_crit_alarm: 0.000
Core 1:
  temp3_input: 36.000
  temp3_max: 74.000
  temp3_crit: 100.000
  temp3_crit_alarm: 0.000
Core 2:
  temp4_input: 33.000
  temp4_max: 74.000
  temp4_crit: 100.000
  temp4_crit_alarm: 0.000
Core 3:
  temp5_input: 32.000
  temp5_max: 74.000
  temp5_crit: 100.000
  temp5_crit_alarm: 0.000

radeon-pci-0100
Adapter: PCI adapter
temp1:
  temp1_input: 50.000
  temp1_crit: 120.000
  temp1_crit_hyst: 90.000

atk0110-acpi-0
Adapter: ACPI interface
Vcore Voltage:
  in0_input: 1.064
  in0_min: 0.800
  in0_max: 1.600
 +3.3 Voltage:
  in1_input: 3.248
  in1_min: 2.970
  in1_max: 3.630
 +5 Voltage:
  in2_input: 5.040
  in2_min: 4.500
  in2_max: 5.500
 +12 Voltage:
  in3_input: 11.928
  in3_min: 10.200
  in3_max: 13.800
CPU FAN Speed:
  fan1_input: 691.000
  fan1_min: 600.000
  fan1_max: 7200.000
CHASSIS1 FAN Speed:
  fan2_input: 225.000
  fan2_min: 600.000
  fan2_max: 7200.000
CHASSIS2 FAN Speed:
  fan3_input: 795.000
  fan3_min: 600.000
  fan3_max: 7200.000
POWER FAN Speed:
  fan4_input: 1430.000
  fan4_min: 600.000
  fan4_max: 7200.000
CPU Temperature:
  temp1_input: 23.500
  temp1_max: 60.000
  temp1_crit: 95.000
MB Temperature:
  temp2_input: 37.000
  temp2_max: 45.000
  temp2_crit: 95.000
Title: Re: k10temp problem with latest kernel
Post by: Mister00X on 2021/11/10, 13:02:19
I've tested this on my ryzen laptop.

With kernel 5.14.16-2 the output of sensors is:
Code: [Select]
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +40.0°C 
Tdie:         +40.0°C 

BAT0-acpi-0
Adapter: ACPI interface
in0:          12.07 V 
curr1:            N/A 

nvme-pci-0100
Adapter: PCI adapter
Composite:    +42.9°C  (low  = -273.1°C, high = +81.8°C)
                       (crit = +84.8°C)
Sensor 1:     +42.9°C  (low  = -273.1°C, high = +65261.8°C)

amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:           N/A 
vddnb:            N/A 
edge:         +40.0°C 

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +39.0°C  (crit = +125.0°C)
temp2:         +0.0°C  (crit = +200.0°C)

With kernel 5.15.1-1 it's:
Code: [Select]
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +45.1°C 

BAT0-acpi-0
Adapter: ACPI interface
in0:          12.01 V 
curr1:            N/A 

nvme-pci-0100
Adapter: PCI adapter
Composite:    +50.9°C  (low  = -273.1°C, high = +81.8°C)
                       (crit = +84.8°C)
Sensor 1:     +50.9°C  (low  = -273.1°C, high = +65261.8°C)

amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:           N/A 
vddnb:            N/A 
edge:         +45.0°C 

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +45.0°C  (crit = +125.0°C)
temp2:         +0.0°C  (crit = +200.0°C)

So Tdie is missing on 5.15.1-1.
One side effect caused by this is that htop is no longer able to display the cpu temperature :-/

I will also test this on my ryzen desktop PC and will report back.
Title: Re: k10temp problem with latest kernel
Post by: whistler_mb on 2021/11/10, 13:21:31
On my desktop PC with an AMD 4700G it behaves the same as on Mister00X's laptop.

Code: [Select]
~$ uname -r
5.14.16-1-siduction-amd64

~$ sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +49.8°C
Tdie:         +49.8°C

nvme-pci-0a00
Adapter: PCI adapter
Composite:    +38.9°C  (low  =  -0.1°C, high = +86.8°C)
                       (crit = +89.8°C)
Sensor 1:     +39.9°C  (low  = -273.1°C, high = +65261.8°C)

amdgpu-pci-0b00
Adapter: PCI adapter
vddgfx:      724.00 mV
vddnb:       724.00 mV
edge:         +41.0°C
power1:        0.00 W

Code: [Select]
~$ uname -r
5.15.1-1-siduction-amd64

~$ sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +47.2°C


nvme-pci-0a00
Adapter: PCI adapter
Composite:    +37.9°C  (low  =  -0.1°C, high = +86.8°C)
                       (crit = +89.8°C)
Sensor 1:     +37.9°C  (low  = -273.1°C, high = +65261.8°C)

amdgpu-pci-0b00
Adapter: PCI adapter
vddgfx:      731.00 mV
vddnb:       937.00 mV
edge:         +38.0°C
power1:      1000.00 uW
Title: Re: k10temp problem with latest kernel
Post by: fams on 2021/11/10, 14:11:33
Tdie and Tctl are linked by a fixed offset (in your case 0), so omitting one of them seems to be correct.
What is missing is Tccd1 (and others...) that seem to give the real (not peak) cpu core temperature(s).
On my laptop with Bullseye Kernel 5.10.0 and Ryzen 7 5700u I have only Tdie and Tctl with the same value, too.
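
For readers who want the relation spelled out: hwmon reports temperatures in millidegrees Celsius, and Tdie is Tctl minus a per-model offset. The following is only a minimal sketch of that arithmetic, not the driver's code. The function names tdie_from_tctl and millideg_to_c are made up for illustration, and the offset of 20000 mentioned below is just an example value.

```shell
# tdie_from_tctl TCTL_MILLI OFFSET_MILLI
# Both arguments are in millidegrees Celsius, the unit used by hwmon's tempN_input files.
# Hypothetical helper: prints Tctl minus the offset.
tdie_from_tctl() {
    echo $(( $1 - $2 ))
}

# millideg_to_c MILLI
# Hypothetical helper: converts millidegrees to degrees Celsius with one decimal.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}
```

With an offset of 0, tdie_from_tctl 41100 0 gives back 41100, which matches the identical Tctl and Tdie lines in the 5.14 outputs above. With an assumed offset of 20000, a Tctl of 65000 would correspond to a Tdie of 45000.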
Title: Re: k10temp problem with latest kernel
Post by: Mister00X on 2021/11/10, 17:14:46
So I have now also tested it on my Ryzen desktop PC. Tdie is missing on the 5.15.1 kernel too, and with it the temperatures in htop.

Edit: This appears to be intended though according to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/hwmon/k10temp.c?id=02a2484cf8d17a2acf3b9b151147bafaa55ad38c

Title: Re: k10temp problem with latest kernel
Post by: jyp on 2021/11/10, 18:48:30
To say the least, it seems that temperature monitoring on AMD Ryzen is not very reliable.

For instance, issuing <sensors>, I get _Tccd1 = +29.0°C_ and conky shows _Tccd1 = +37.0°C_ using the same source (/usr/bin/sensors | grep Tccd1). Really strange.
Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/11, 00:13:50
Remaining question:
What is the reason for the different results with the two kernels?

Code: [Select]
CPU: Quad Core AMD Ryzen 3 PRO 4350GE with Radeon Graphics

Legacy kernel:

Code: [Select]
$ uname -a
Linux tuxxy2-sid 5.14.16-2-siduction-amd64 #1 SMP PREEMPT siduction 5.14-16.1 (2021-11-04) x86_64 GNU/Linux

$ sensors
amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:      718.00 mV
vddnb:       724.00 mV
edge:         +29.0°C
power1:        0.00 W

nvme-pci-0300
Adapter: PCI adapter
Composite:    +36.9°C  (low  =  -0.1°C, high = +79.8°C)
                       (crit = +83.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +32.4°C
Tdie:         +32.4°C

$ inxi -s
Sensors:   System Temperatures: cpu: 40.0 C mobo: N/A gpu: amdgpu temp: 31.0 C
           Fan Speeds (RPM): N/A

Current kernel:

Code: [Select]
$ uname -a
Linux tuxxy2-sid 5.15.1-3-siduction-amd64 #1 SMP PREEMPT siduction 5.15-1.2 (2021-11-10) x86_64 GNU/Linux

$ sensors
amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:      718.00 mV
vddnb:       999.00 mV
edge:         +28.0°C
power1:      1000.00 uW

nvme-pci-0300
Adapter: PCI adapter
Composite:    +37.9°C  (low  =  -0.1°C, high = +79.8°C)
                       (crit = +83.8°C)
Sensor 1:     +37.9°C  (low  = -273.1°C, high = +65261.8°C)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +31.1°C

$ inxi -s
Sensors:   Message: No sensor data found. Is lm-sensors configured?
Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/11, 09:39:27
Quote from: samoht
Remaining question:
What is the reason for the different results with the two kernels?

You may read the linked pages....    ;D
Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/12, 08:29:56
Quote
Remaining question:
What is the reason for the different results with the two kernels?

Quote
$ inxi -s
Sensors:   System Temperatures: cpu: 40.0 C mobo: N/A gpu: amdgpu temp: 31.0 C
           Fan Speeds (RPM): N/A

versus

Quote
$ inxi -s
Sensors:   Message: No sensor data found. Is lm-sensors configured?
Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/12, 10:02:09
I had meant that it was because of the 'unification'
of the values Tctl and Tdie for certain models.

However, you mean something else.

1. Linux has NEVER been at the table for the production of new hardware at the corporations.

2. You have not configured sensors.

3. It might be worth looking at hwmon, which seems the better choice for laptop machines.
Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/13, 00:13:32
Thanks for trying to help.

Quote
... you have not configured sensors.

No, that message comes only with the current kernels

Quote
$ inxi -s
Sensors:   Message: No sensor data found. Is lm-sensors configured?

but not with the 5.14 kernel releases.
Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/13, 10:37:03
Sorry, what's wrong with running sensors-detect again?  Or have you already done that?

You don't say anything about hwmon either. I would be interested in the output of this command:
Code: [Select]
$ for m in /sys/class/hwmon/* ; do echo -n "$m = " ; cat $m/name ; done
Everything with the "new" kernels, of course.
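
A slightly extended variant of that loop could also print each temperature channel's label, which shows directly whether the driver still exposes a Tdie channel. This is only a sketch: the function name list_hwmon_labels is invented here, and it assumes the usual hwmon sysfs layout with tempN_input files and optional tempN_label files.

```shell
# list_hwmon_labels [ROOT]
# Hypothetical helper: prints "chip label millidegrees" for every temperature
# channel below ROOT (default: /sys/class/hwmon).
list_hwmon_labels() {
    root="${1:-/sys/class/hwmon}"
    for dev in "$root"/hwmon*; do
        # Skip entries without a readable name (also covers an unmatched glob).
        [ -r "$dev/name" ] || continue
        name=$(cat "$dev/name")
        for input in "$dev"/temp*_input; do
            [ -r "$input" ] || continue
            label_file="${input%_input}_label"
            if [ -r "$label_file" ]; then
                label=$(cat "$label_file")
            else
                # No label file: fall back to the channel name, e.g. "temp1".
                label=$(basename "${input%_input}")
            fi
            printf '%s %s %s\n' "$name" "$label" "$(cat "$input")"
        done
    done
}
```

Calling list_hwmon_labels without arguments walks the real /sys/class/hwmon. No root is needed for reading these files.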
Title: Re: k10temp problem with latest kernel
Post by: whistler_mb on 2021/11/13, 14:53:24
sensors-detect doesn't find any sensors.

Code: [Select]
~# uname -r
5.15.2-1-siduction-amd64

Code: [Select]
~# sensors-detect
# sensors-detect version 3.6.0
# System: LENOVO 90Q3004BGE [IdeaCentre 5 14ARE05]
# Board: LENOVO 3728
# Kernel: 5.15.2-1-siduction-amd64 x86_64
# Processor: AMD Ryzen 7 4700G with Radeon Graphics (23/96/1)

This program will help you determine which kernel modules you need
to load to use lm_sensors most effectively. It is generally safe
and recommended to accept the default answers to all questions,
unless you know what you're doing.

Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no):
Module cpuid loaded successfully.
Silicon Integrated Systems SIS5595...                       No
VIA VT82C686 Integrated Sensors...                          No
VIA VT8231 Integrated Sensors...                            No
AMD K8 thermal sensors...                                   No
AMD Family 10h thermal sensors...                           No
AMD Family 11h thermal sensors...                           No
AMD Family 12h and 14h thermal sensors...                   No
AMD Family 15h thermal sensors...                           No
AMD Family 16h thermal sensors...                           No
AMD Family 17h thermal sensors...                           No
AMD Family 15h power sensors...                             No
AMD Family 16h power sensors...                             No
Hygon Family 18h thermal sensors...                         No
Intel digital thermal sensor...                             No
Intel AMB FB-DIMM thermal sensor...                         No
Intel 5500/5520/X58 thermal sensor...                       No
VIA C7 thermal sensor...                                    No
VIA Nano thermal sensor...                                  No

Some Super I/O chips contain embedded sensors. We have to write to
standard I/O ports to probe them. This is usually safe.
Do you want to scan for Super I/O sensors? (YES/no):
Probing for Super-I/O at 0x2e/0x2f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'...                                      Yes
Found unknown chip with ID 0x8638
Probing for Super-I/O at 0x4e/0x4f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'...                                      No

Some systems (mainly servers) implement IPMI, a set of common interfaces
through which system health data may be retrieved, amongst other things.
We first try to get the information from SMBIOS. If we don't find it
there, we have to read from arbitrary I/O ports to probe for such
interfaces. This is normally safe. Do you want to scan for IPMI
interfaces? (YES/no):
Probing for `IPMI BMC KCS' at 0xca0...                      No
Probing for `IPMI BMC SMIC' at 0xca8...                     No

Some hardware monitoring chips are accessible through the ISA I/O ports.
We have to write to arbitrary I/O ports to probe them. This is usually
safe though. Yes, you do have ISA I/O ports even if you do not have any
ISA slots! Do you want to scan the ISA I/O ports? (YES/no):
Probing for `National Semiconductor LM78' at 0x290...       No
Probing for `National Semiconductor LM79' at 0x290...       No
Probing for `Winbond W83781D' at 0x290...                   No
Probing for `Winbond W83782D' at 0x290...                   No

Lastly, we can probe the I2C/SMBus adapters for connected hardware
monitoring devices. This is the most risky part, and while it works
reasonably well on most systems, it has been reported to cause trouble
on some systems.
Do you want to probe the I2C/SMBus adapters now? (YES/no):
Using driver `i2c-piix4' for device 0000:00:14.0: AMD KERNCZ SMBus
Module i2c-dev loaded successfully.

Next adapter: SMBus PIIX4 adapter port 0 at 0b00 (i2c-0)
Do you want to scan it? (YES/no/selectively):
Client found at address 0x52
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 Yes
    (confidence 8, not a hardware monitoring chip)
Client found at address 0x53
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 Yes
    (confidence 8, not a hardware monitoring chip)

Next adapter: SMBus PIIX4 adapter port 2 at 0b00 (i2c-1)
Do you want to scan it? (YES/no/selectively):

Next adapter: AMDGPU DM i2c hw bus 0 (i2c-2)
Do you want to scan it? (yes/NO/selectively): yes

Next adapter: AMDGPU DM i2c hw bus 1 (i2c-3)
Do you want to scan it? (yes/NO/selectively): yes
Client found at address 0x49
Probing for `National Semiconductor LM75'...                No
Probing for `National Semiconductor LM75A'...               No
Probing for `Dallas Semiconductor DS75'...                  No
Probing for `National Semiconductor LM77'...                No
Probing for `Analog Devices ADT7410/ADT7420'...             No
Probing for `Maxim MAX6642'...                              No
Probing for `Texas Instruments TMP435'...                   No
Probing for `National Semiconductor LM73'...                No
Probing for `National Semiconductor LM92'...                No
Probing for `National Semiconductor LM76'...                No
Probing for `Maxim MAX6633/MAX6634/MAX6635'...              No
Probing for `NXP/Philips SA56004'...                        No
Probing for `SMSC EMC1023'...                               No
Probing for `SMSC EMC1043'...                               No
Probing for `SMSC EMC1053'...                               No
Probing for `SMSC EMC1063'...                               No
Client found at address 0x4a
Probing for `National Semiconductor LM75'...                No
Probing for `National Semiconductor LM75A'...               No
Probing for `Dallas Semiconductor DS75'...                  No
Probing for `National Semiconductor LM77'...                No
Probing for `Analog Devices ADT7410/ADT7420'...             No
Probing for `Analog Devices ADT7411'...                     No
Probing for `Maxim MAX6642'...                              No
Probing for `Texas Instruments TMP435'...                   No
Probing for `National Semiconductor LM73'...                No
Probing for `National Semiconductor LM92'...                No
Probing for `National Semiconductor LM76'...                No
Probing for `Maxim MAX6633/MAX6634/MAX6635'...              No
Probing for `NXP/Philips SA56004'...                        No

Next adapter: AMDGPU DM i2c hw bus 2 (i2c-4)
Do you want to scan it? (yes/NO/selectively): yes

Next adapter: AMDGPU DM aux hw bus 0 (i2c-5)
Do you want to scan it? (yes/NO/selectively): yes

Sorry, no sensors were detected.
Either your system has no sensors, or they are not supported, or
they are connected to an I2C or SMBus adapter that is not
supported. If you find out what chips are on your board, check
https://hwmon.wiki.kernel.org/device_support_status for driver status.

And here the output of hwmon
Code: [Select]
~# for m in /sys/class/hwmon/* ; do echo -n "$m = " ; cat $m/name ; done
/sys/class/hwmon/hwmon0 = nvme
/sys/class/hwmon/hwmon1 = k10temp
/sys/class/hwmon/hwmon2 = amdgpu
Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/13, 15:38:15
My results on similar hardware and software confirm those of @whistler_mb:

Code: [Select]
$ uname -r
5.15.2-1-siduction-amd64

Code: [Select]
# sensors-detect
# sensors-detect version 3.6.0
# System: LENOVO 11JJ0002GE [ThinkCentre M75q Gen 2]
# Board: LENOVO 3190
# Kernel: 5.15.2-1-siduction-amd64 x86_64
# Processor: AMD Ryzen 3 PRO 4350GE with Radeon Graphics (23/96/1)

This program will help you determine which kernel modules you need
to load to use lm_sensors most effectively. It is generally safe
and recommended to accept the default answers to all questions,
unless you know what you're doing.

Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no):
Module cpuid loaded successfully.
Silicon Integrated Systems SIS5595...                       No
VIA VT82C686 Integrated Sensors...                          No
VIA VT8231 Integrated Sensors...                            No
AMD K8 thermal sensors...                                   No
AMD Family 10h thermal sensors...                           No
AMD Family 11h thermal sensors...                           No
AMD Family 12h and 14h thermal sensors...                   No
AMD Family 15h thermal sensors...                           No
AMD Family 16h thermal sensors...                           No
AMD Family 17h thermal sensors...                           No
AMD Family 15h power sensors...                             No
AMD Family 16h power sensors...                             No
Hygon Family 18h thermal sensors...                         No
Intel digital thermal sensor...                             No
Intel AMB FB-DIMM thermal sensor...                         No
Intel 5500/5520/X58 thermal sensor...                       No
VIA C7 thermal sensor...                                    No
VIA Nano thermal sensor...                                  No

Some Super I/O chips contain embedded sensors. We have to write to
standard I/O ports to probe them. This is usually safe.
Do you want to scan for Super I/O sensors? (YES/no):
Probing for Super-I/O at 0x2e/0x2f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'...                                      Yes
Found unknown chip with ID 0x8638
Probing for Super-I/O at 0x4e/0x4f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'...                                      No

Some systems (mainly servers) implement IPMI, a set of common interfaces
through which system health data may be retrieved, amongst other things.
We first try to get the information from SMBIOS. If we don't find it
there, we have to read from arbitrary I/O ports to probe for such
interfaces. This is normally safe. Do you want to scan for IPMI
interfaces? (YES/no):
Probing for `IPMI BMC KCS' at 0xca0...                      No
Probing for `IPMI BMC SMIC' at 0xca8...                     No

Some hardware monitoring chips are accessible through the ISA I/O ports.
We have to write to arbitrary I/O ports to probe them. This is usually
safe though. Yes, you do have ISA I/O ports even if you do not have any
ISA slots! Do you want to scan the ISA I/O ports? (YES/no):
Probing for `National Semiconductor LM78' at 0x290...       No
Probing for `National Semiconductor LM79' at 0x290...       No
Probing for `Winbond W83781D' at 0x290...                   No
Probing for `Winbond W83782D' at 0x290...                   No

Lastly, we can probe the I2C/SMBus adapters for connected hardware
monitoring devices. This is the most risky part, and while it works
reasonably well on most systems, it has been reported to cause trouble
on some systems.
Do you want to probe the I2C/SMBus adapters now? (YES/no):
Using driver `i2c-piix4' for device 0000:00:14.0: AMD KERNCZ SMBus
Module i2c-dev loaded successfully.

Next adapter: SMBus PIIX4 adapter port 0 at 0b00 (i2c-0)
Do you want to scan it? (YES/no/selectively):
Client found at address 0x50
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 Yes
    (confidence 8, not a hardware monitoring chip)
Probing for `EDID EEPROM'...                                No
Client found at address 0x51
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 Yes
    (confidence 8, not a hardware monitoring chip)

Next adapter: SMBus PIIX4 adapter port 2 at 0b00 (i2c-1)
Do you want to scan it? (YES/no/selectively):

Next adapter: AMDGPU DM i2c hw bus 0 (i2c-2)
Do you want to scan it? (yes/NO/selectively): yes
Client found at address 0x49
Probing for `National Semiconductor LM75'...                No
Probing for `National Semiconductor LM75A'...               No
Probing for `Dallas Semiconductor DS75'...                  No
Probing for `National Semiconductor LM77'...                No
Probing for `Analog Devices ADT7410/ADT7420'...             No
Probing for `Maxim MAX6642'...                              No
Probing for `Texas Instruments TMP435'...                   No
Probing for `National Semiconductor LM73'...                No
Probing for `National Semiconductor LM92'...                No
Probing for `National Semiconductor LM76'...                No
Probing for `Maxim MAX6633/MAX6634/MAX6635'...              No
Probing for `NXP/Philips SA56004'...                        No
Probing for `SMSC EMC1023'...                               No
Probing for `SMSC EMC1043'...                               No
Probing for `SMSC EMC1053'...                               No
Probing for `SMSC EMC1063'...                               No

Next adapter: AMDGPU DM i2c hw bus 1 (i2c-3)
Do you want to scan it? (yes/NO/selectively): yes

Next adapter: AMDGPU DM i2c hw bus 2 (i2c-4)
Do you want to scan it? (yes/NO/selectively): yes

Next adapter: AMDGPU DM aux hw bus 0 (i2c-5)
Do you want to scan it? (yes/NO/selectively): yes

Next adapter: AMDGPU DM aux hw bus 2 (i2c-6)
Do you want to scan it? (yes/NO/selectively): yes

Sorry, no sensors were detected.
Either your system has no sensors, or they are not supported, or
they are connected to an I2C or SMBus adapter that is not
supported. If you find out what chips are on your board, check
https://hwmon.wiki.kernel.org/device_support_status for driver status.

Code: [Select]
# for m in /sys/class/hwmon/* ; do echo -n "$m = " ; cat $m/name ; done
/sys/class/hwmon/hwmon0 = nvme
/sys/class/hwmon/hwmon1 = k10temp
/sys/class/hwmon/hwmon2 = amdgpu

Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/13, 19:24:03
^Thanks @whistler_mb, @samoht, that at least gives us something to go on.

So, since we know from @Mister00X's link that the values Tdie and Tctl have been unified on certain models, and sensors cannot detect anything further in your case, you will have to use the Tctl value instead in Conky, the bar, and so on.

Reading from hwmon directly is the better fit where you want to avoid keeping the CPU busy with execi commands all the time (think laptops) and so save a little CPU.
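
As a rough sketch of the hwmon approach: the outputs in this thread show k10temp as hwmon1 on two machines and as hwmon2 on another, so it seems safer to look the chip up by name than to hard-code the index. The function names hwmon_dir_by_name and cpu_temp_c are invented for this example, and the sketch assumes that temp1_input of k10temp is the Tctl channel.

```shell
# hwmon_dir_by_name NAME [ROOT]
# Hypothetical helper: prints the first hwmon directory whose name file equals NAME.
hwmon_dir_by_name() {
    root="${2:-/sys/class/hwmon}"
    for dev in "$root"/hwmon*; do
        [ -r "$dev/name" ] || continue
        if [ "$(cat "$dev/name")" = "$1" ]; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}

# cpu_temp_c [ROOT]
# Hypothetical helper: prints k10temp's temp1 (assumed to be Tctl) in degrees Celsius.
cpu_temp_c() {
    dir=$(hwmon_dir_by_name k10temp "${1:-}") || return 1
    # hwmon stores millidegrees; convert to degrees with one decimal.
    awk '{ printf "%.1f\n", $1 / 1000 }' "$dir/temp1_input"
}
```

A monitor could then call cpu_temp_c, or read the file found by hwmon_dir_by_name directly, instead of parsing the output of sensors on every update.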

By the way, the '$' is my mistake, because the command does not need root.   :-[
I hope all clarities are removed.   ;D
Title: Re: k10temp problem with latest kernel
Post by: jyp on 2021/11/13, 20:34:22
Thank you for keeping the thread alive
Here is some input from my new system

Code: [Select]
System:    Host: kappa Kernel: 5.15.1-3-siduction-amd64 x86_64 bits: 64 Desktop: KDE Plasma 5.23.3
           Distro: siduction 21.2.0 Farewell - kde - (202109171658)
Machine:   Type: Desktop Mobo: ASUSTeK model: ROG STRIX X570-E GAMING v: Rev X.0x serial: 210686050900556
           UEFI: American Megatrends v: 3604 date: 04/14/2021
CPU:       Info: 6-Core model: AMD Ryzen 5 5600G with Radeon Graphics bits: 64 type: MT MCP cache: L2: 3 MiB
           Speed: 2313 MHz min/max: 1400/3900 MHz Core speeds (MHz): 1: 2313 2: 2026 3: 1413 4: 1414 5: 1413 6: 1413 7: 1665
           8: 1696 9: 1414 10: 1696 11: 1413 12: 1413
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Cezanne driver: amdgpu v: kernel
           Display: x11 server: X.Org 1.20.11 driver: loaded: amdgpu,ati unloaded: fbdev,modesetting,vesa resolution:
           1: 1080x1920~60Hz 2: 2560x1440~60Hz
           OpenGL: renderer: AMD RENOIR (DRM 3.42.0 5.15.1-3-siduction-amd64 LLVM 12.0.1) v: 4.6 Mesa 21.2.5
Audio:     Device-1: Advanced Micro Devices [AMD/ATI] Renoir Radeon High Definition Audio driver: snd_hda_intel
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio driver: snd_hda_intel
           Sound Server-1: ALSA v: k5.15.1-3-siduction-amd64 running: yes
           Sound Server-2: PipeWire v: 0.3.39 running: yes
Network:   Device-1: Intel Wi-Fi 6 AX200 driver: iwlwifi
           IF: wlan0 state: up mac: 84:1b:77:fd:b5:97
           Device-2: Realtek RTL8125 2.5GbE driver: r8169
           IF: enp5s0 state: down mac: 7c:10:c9:43:7a:5a
           Device-3: Intel I211 Gigabit Network driver: igb
           IF: enp6s0 state: down mac: 7c:10:c9:43:7a:59
Bluetooth: Device-1: Intel AX200 Bluetooth type: USB driver: btusb
           Report: hciconfig ID: hci0 state: up address: 84:1B:77:FD:B5:9B bt-v: 3.0
Drives:    Local Storage: total: 5.46 TiB used: 183.32 GiB (3.3%)
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 1TB size: 931.51 GiB
           ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 980 1TB size: 931.51 GiB
           ID-3: /dev/sda vendor: Western Digital model: WD2003FZEX-00SRLA0 size: 1.82 TiB
           ID-4: /dev/sdb vendor: Western Digital model: WDS200T2B0A-00SM50 size: 1.82 TiB
Partition: ID-1: / size: 90.76 GiB used: 9.57 GiB (10.5%) fs: ext4 dev: /dev/nvme0n1p1
           ID-2: /boot/efi size: 499 MiB used: 288 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p3
Swap:      ID-1: swap-1 type: partition size: 5.86 GiB used: 0 KiB (0.0%) dev: /dev/nvme0n1p2
Sensors:   Message: No sensor data found. Is lm-sensors configured?
Info:      Processes: 290 Uptime: 1h 41m Memory: 30.73 GiB used: 3.05 GiB (9.9%) Shell: Bash inxi: 3.3.07

Code: [Select]
# for m in /sys/class/hwmon/* ; do echo -n "$m = " ; cat $m/name ; done
/sys/class/hwmon/hwmon0 = nvme
/sys/class/hwmon/hwmon1 = nvme
/sys/class/hwmon/hwmon2 = k10temp
/sys/class/hwmon/hwmon3 = asus
/sys/class/hwmon/hwmon4 = iwlwifi_1
/sys/class/hwmon/hwmon5 = hidpp_battery_0
/sys/class/hwmon/hwmon6 = amdgpu

Code: [Select]
# sensors
amdgpu-pci-0b00
Adapter: PCI adapter
vddgfx:      718.00 mV
vddnb:       793.00 mV
edge:         +28.0°C 
power1:        0.00 W 

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +44.0°C 

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +31.9°C 

nvme-pci-0300
Adapter: PCI adapter
Composite:    +41.9°C  (low  = -273.1°C, high = +81.8°C)
                       (crit = +84.8°C)
Sensor 1:     +41.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +43.9°C  (low  = -273.1°C, high = +65261.8°C)

nvme-pci-0a00
Adapter: PCI adapter
Composite:    +33.9°C  (low  = -273.1°C, high = +81.8°C)
                       (crit = +84.8°C)
Sensor 1:     +33.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)

Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/13, 22:31:17
I am stubborn and out of curiosity:  ;)
I do not understand why inxi -s shows some temperatures with the 5.14 kernel and no temperatures at all with the 5.15 kernel.

Quote
$ inxi -s
Sensors:   System Temperatures: cpu: 40.0 C mobo: N/A gpu: amdgpu temp: 31.0 C
           Fan Speeds (RPM): N/A

versus

$ inxi -s
Sensors:   Message: No sensor data found. Is lm-sensors configured?
Title: Re: k10temp problem with latest kernel
Post by: Mister00X on 2021/11/14, 00:52:16
I assume it is for the same reason that htop no longer displays temperatures for me. I think those applications depended on k10temp or sensors providing them with the temperature as Tdie. Now that Tdie no longer exists on systems where it is identical to Tctl, those applications look for the wrong variable, are unable to find it, and display an error or, in the case of htop, N/A.
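
To illustrate what a tolerant lookup might look like, here is a small sketch that reads the text output of sensors on standard input and prefers Tdie but falls back to Tctl. It is not how htop or inxi actually work internally; the function name cpu_temp_from_sensors is made up, and the parsing assumes the "Label:  +NN.N°C" layout shown in the outputs in this thread.

```shell
# cpu_temp_from_sensors
# Hypothetical helper: reads `sensors` output on stdin and prints the CPU
# temperature as a bare number, using Tdie if present and Tctl otherwise.
# Exits with status 1 if neither label is found.
cpu_temp_from_sensors() {
    awk '
        $1 == "Tdie:" { tdie = $2 }
        $1 == "Tctl:" { tctl = $2 }
        END {
            v = (tdie != "") ? tdie : tctl
            if (v == "") exit 1
            gsub(/[^-0-9.]/, "", v)   # strip the plus sign and the unit
            print v
        }
    '
}
```

It would be used as sensors | cpu_temp_from_sensors, and it keeps working on both the 5.14 and the 5.15 style of output.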

Title: Re: k10temp problem with latest kernel
Post by: samoht on 2021/11/14, 09:37:39
Thanks, @Mister00X, you opened my eyes,

My fault: I neglected - did not understand or had forgotten your previous posts  >:(

Quote
So I have now also tested it on my ryzen desktop PC Tdie is missing on the 5.15.1 Kernel too and with it the temperatures in htop.

Edit: This appears to be intended though according to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/hwmon/k10temp.c?id=02a2484cf8d17a2acf3b9b151147bafaa55ad38c

Unfortunately, @jyp's statement seems to be true:

Quote
To say the least, it seems that temperature monitoring on AMD Ryzen is not very reliable.

Thanks to all for comments,
Greetings,
Tom
Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/14, 13:00:46
@Mister00X,
I didn't know that htop can also be used to monitor temperatures. I only noticed it now through man htop. Thanks for that.
My programs for this have always been glances or bpytop. But those also need sensors.   :P

@jyp,
thanks for your details.

With hwmon you could monitor 6 temperature values on your system:   ;D
- 2 SSD (nvme), but probably not worth it because they don't get hot.
- CPU (k10temp)
- MB (probably asus)
- wireless module (iwlwifi_1)
- battery (hidpp_battery_0)
- graphics card (amdgpu)

A nice place to start for the whole problem is the still up-to-date wiki of my friend Sector11. (http://conky.pitstop.free.fr/wiki/index.php5?title=Using_Sensors_(en))

And, last but not least, the temperatures of the WesternDigital hard disks can certainly be read out with hddtemp ;)  , although that is no longer true either.
Quote
hddtemp (0.3-beta15-54) unstable; urgency=medium

  hddtemp has been dead upstream for many years and is therefore in a minimal
  maintenance mode. It will be shipped in the Debian Bullseye release, but
  will not be present in the Debian Bookworm release.

  Nowadays the 'drivetemp' kernel module is a better alternative. It uses the
  Linux Hardware Monitoring kernel API (hwmon), so the temperature is returned
  the same way and using the same tools as other sensors.

  Loading this module is as easy as creating a file in the /etc/modules-load.d
  directory:

    echo drivetemp > /etc/modules-load.d/drivetemp.conf

 -- Aurelien Jarno <aurel32@debian.org>  Tue, 02 Feb 2021 20:27:44 +0100
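
The file in /etc/modules-load.d only takes effect at the next boot, so before relying on drivetemp it may help to check whether the module is loaded right now. A minimal sketch, assuming drivetemp is built as a loadable module (a built-in driver would not appear in /proc/modules); the function name module_loaded is invented for this example.

```shell
# module_loaded NAME [FILE]
# Hypothetical helper: succeeds if NAME appears in the first column of
# FILE (default: /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example, module_loaded drivetemp || modprobe drivetemp (as root) would load the module immediately if it is missing; afterwards the drives should show up in the output of sensors like the other hwmon devices.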
Title: Re: k10temp problem with latest kernel
Post by: jyp on 2021/11/15, 22:03:34
@unklarer

Thank you for the link; I will carefully read that.
Also, I need to get acquainted with hwmon.


Title: Re: k10temp problem with latest kernel
Post by: unklarer on 2021/11/16, 10:38:21
^You're welcome.
I'm happy to help.   :)