Topic: Kernel 3.15 series -- new ATA error output


Offline dibl

Kernel 3.15 series -- new ATA error output
« on: 2014/05/15, 15:40:50 »

I have been investigating a boot error message for a week or two, and I finally realized that it comes only from the 3.15rc kernel series.


The system is this:

Code: [Select]
don@imerabox:~$ inxi -v7
System:    Host: imerabox Kernel: 3.15-rc5-siduction-amd64 x86_64 (64 bit gcc: 4.9.0)
           Desktop: KDE 4.12.4 (Qt 4.8.6) info: plasma-desktop dm: lightdm
           Distro: aptosid 2011-02 Ἡμέρα - kde-lite - (201107131633)
Machine:   Mobo: ASUSTeK model: P6X58D-E v: Rev 1.xx Bios: American Megatrends v: 0803 date: 08/06/2012
CPU:       Quad core Intel Core i7 950 (-HT-MCP-) cache: 8192 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 25178
           Clock Speeds: 1: 3146 MHz 2: 3146 MHz 3: 3146 MHz 4: 3146 MHz 5: 3146 MHz 6: 3146 MHz 7: 3146 MHz
           8: 3146 MHz
Graphics:  Card: NVIDIA GF100 [GeForce GTX 480] bus-ID: 05:00.0 chip-ID: 10de:06c0
           Display Server: X.Org 1.15.1 driver: nvidia Resolution: 1920x1200@59.9hz
           GLX Renderer: GeForce GTX 480/PCIe/SSE2 GLX Version: 4.4.0 NVIDIA 337.12 Direct Rendering: Yes
Audio:     Card-1 NVIDIA GF100 High Definition Audio Controller
           driver: snd_hda_intel bus-ID: 05:00.1 chip-ID: 10de:0be5
           Card-2 Intel 82801JI (ICH10 Family) HD Audio Controller
           driver: snd_hda_intel bus-ID: 00:1b.0 chip-ID: 8086:3a3e
           Card-3 Logitech QuickCam Communicate STX driver: USB Audio usb-ID: 004-002 chip-ID: 046d:08d7
           Sound: Advanced Linux Sound Architecture v: k3.15-rc5-siduction-amd64
Network:   Card: Marvell 88E8056 PCI-E Gigabit Ethernet Controller
           driver: sky2 v: 1.30 port: d800 bus-ID: 06:00.0 chip-ID: 11ab:4364
           IF: eth0 state: up speed: 100 Mbps duplex: full mac: 20:cf:30:5c:41:1d
           WAN IP: 99.126.220.109 IF: vmnet8 ip: 172.16.27.1 ip-v6: fe80::250:56ff:fec0:8
           IF: eth0 ip: 192.168.1.70 ip-v6: fe80::22cf:30ff:fe5c:411d
           IF: vmnet1 ip: 172.16.203.1 ip-v6: fe80::250:56ff:fec0:1
Drives:    HDD Total Size: 2136.5GB (51.3% used)
           ID-1: /dev/sda model: OCZ size: 60.0GB serial: OCZ-Q17BKCW4VFG6ZBG7
           ID-2: /dev/sdb model: OCZ size: 60.0GB serial: OCZ-0C2Z27B5QEE4L0H2
           ID-3: /dev/sdd model: WDC_WD1002FAEX size: 1000.2GB serial: WD-WCAW34278337
           ID-4: /dev/sdc model: WDC_WD1002FAEX size: 1000.2GB serial: WD-WCAW34194718
           ID-5: /dev/sde model: KINGSTON_SS100S2 size: 16.0GB serial: 16GAA0002142
           Optical: /dev/sr0 model: ASUS DRW-24B1ST rev: 1.04 dev-links: cdrom,cdrw,dvd,dvdrw
           Features: speed: 48x multisession: yes
           audio: yes dvd: yes rw: cd-r,cd-rw,dvd-r,dvd-ram state: running
Partition: ID-1: / size: 19G used: 16G (89%) fs: ext4 dev: /dev/sda1
           label: N/A uuid: bea3a748-3411-4024-acd0-39f3882ddaf9
           ID-2: /mnt/REVODATA size: 55G used: 11G (20%) fs: ext4 dev: /dev/sdb1
           label: revodata uuid: ec21f5b3-7fd4-4f4b-af8d-cf787b147ae8
           ID-3: /mnt/SDA2 size: 37G used: 31G (84%) fs: ext4 dev: /dev/sda2
           label: SDA2 uuid: 8cfe2acc-7572-4b45-b25f-ed021bb1d78b
           ID-4: /boot size: 495M used: 223M (48%) fs: ext2 dev: /dev/sde1
           label: N/A uuid: ac7da829-aebb-46f0-806c-04a4d81a945a
           ID-5: /mnt/DATA size: 1.9T used: 951G (52%) fs: btrfs dev: /dev/sdc label: N/A uuid: N/A
           ID-6: swap-1 size: 15.48GB used: 0.00GB (0%) fs: swap dev: /dev/sde2
           label: N/A uuid: 0d939b7d-48f1-47dd-aebe-77e7bd8c3503
RAID:      No RAID data: /proc/mdstat missing-is md_mod kernel module loaded?
Unmounted: No unmounted partitions detected
Sensors:   System Temperatures: cpu: 42.0C mobo: 35.0C gpu: 0.0:45C
           Fan Speeds (in rpm): cpu: 1962 psu: 0 sys-1: 0 sys-2: 0 sys-3: 0
Info:      Processes: 325 Uptime: 3:43 Memory: 2074.2/5965.6MB
           Init: systemd v: 204 runlevel: 5 default: 5 Gcc sys: 4.8.2 alt: 4.6/4.7/4.9
           Client: Shell (bash 4.3.111 running in konsole) inxi: 2.1.28


The hard drive layout is a little complex, because /dev/sda, an OCZ Revodrive PCIe SSD, is not recognized as a bootable device, so I had to add a small Kingston SSD for /boot. /dev/sdb is the second half of the OCZ SSD -- the card was intended for Windows users to run as RAID0.

Code: [Select]
root@imerabox:/# blkid -c /dev/null -o list
device                                    fs_type        label           mount point                                   UUID
------------------------------------------------------------------------------------------------------------------------------------------------------------
/dev/sda1                                 ext4                           /                                             bea3a748-3411-4024-acd0-39f3882ddaf9
/dev/sda2                                 ext4           SDA2            /mnt/SDA2                                     8cfe2acc-7572-4b45-b25f-ed021bb1d78b
/dev/sdb1                                 ext4           revodata        /mnt/REVODATA                                 ec21f5b3-7fd4-4f4b-af8d-cf787b147ae8
/dev/sdd                                  btrfs                          (in use)                                      9025bea6-b615-470a-8759-df1b13f63b52
/dev/sdc                                  btrfs                          (in use)                                      9025bea6-b615-470a-8759-df1b13f63b52
/dev/sde1                                 ext2                           /boot                                         ac7da829-aebb-46f0-806c-04a4d81a945a
/dev/sde2                                 swap                           <swap>                                        0d939b7d-48f1-47dd-aebe-77e7bd8c3503


The OS is installed on /dev/sda1.  VMs are saved on /mnt/SDA2 and /mnt/REVODATA. Other user data are on /mnt/DATA, a BTRFS filesystem on two WD1002FAEX hard drives.


The boot error, pasted from dmesg, looks like this:

Code: [Select]
[    5.361251] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    5.371405] ata2.00: ATA-8: OCZ-REVODRIVE, 1.37, max UDMA/133
[    5.371409] ata2.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    5.381394] ata2.00: configured for UDMA/100
[    5.381720] scsi 1:0:0:0: Direct-Access     ATA      OCZ-REVODRIVE    1.37 PQ: 0 ANSI: 5
[    7.471353] ata3: SATA link down (SStatus 0 SControl 0)
[    9.561492] ata4: SATA link down (SStatus 0 SControl 0)
[    9.561764] scsi 4:0:0:0: Direct-Access     ATA      WDC WD1002FAEX-0 05.0 PQ: 0 ANSI: 5
[    9.562366] scsi 5:0:0:0: Direct-Access     ATA      WDC WD1002FAEX-0 05.0 PQ: 0 ANSI: 5
[    9.563109] scsi 11:0:0:0: Processor         Marvell  91xx Config      1.01 PQ: 0 ANSI: 5
[    9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[    9.579190] ata12.00: irq_stat 0x40000001
[    9.579816] scsi 11:0:0:0: CDB:
[    9.579817] Inquiry: 12 01 00 00 ff 00
[    9.579824] ata12.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
[    9.579824]          res 50/00:02:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
[    9.582188] ata12.00: status: { DRDY }
[    9.586256] scsi 13:0:0:0: CD-ROM            ASUS     DRW-24B1ST       1.04 PQ: 0 ANSI: 5


It always occurs immediately after the Marvell controller is found. Until today, I assumed that this was some problem with the HDD or SSD hardware. But yesterday I ran the extended SMART test on all drives, and all passed, with no scary messages. (The WDs take 4 hours to run that test.)
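
For anyone reading the CDB line in that dump, the six bytes can be decoded by hand. The following is only a minimal Python sketch, not anything taken from the kernel: the helper name `decode_inquiry_cdb` is made up for illustration, and the field layout assumed is that of the standard 6-byte SCSI INQUIRY command. Read that way, the failing request looks like an INQUIRY with the EVPD bit set, asking for vital product data page 0x00 from the device at scsi 11:0:0:0 (the Marvell 91xx Config entry), which would fit the timing of the message.

```python
def decode_inquiry_cdb(hex_bytes):
    """Decode a 6-byte SCSI INQUIRY CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 6 or cdb[0] != 0x12:
        raise ValueError("not a 6-byte INQUIRY CDB")
    return {
        "opcode": cdb[0],                             # 0x12 = INQUIRY
        "evpd": bool(cdb[1] & 0x01),                  # vital product data requested
        "page_code": cdb[2],                          # which VPD page
        "allocation_length": (cdb[3] << 8) | cdb[4],  # big-endian, bytes 3-4
        "control": cdb[5],
    }


# CDB bytes copied from the dmesg output above
fields = decode_inquiry_cdb("12 01 00 00 ff 00")
print(fields)
```

This only interprets the bytes; it does not say why the device answers the request with an error.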

Today I booted a couple of older kernels to check it, and to my surprise, this error only occurs when booting a 3.15 kernel.

The Marvell 91xx is actually a 91a3 "SATA III" or 6 Gb/s SATA controller. It has only 2 ports, and the WD drives are connected to it -- they are 6 Gb/s drives. I ran dmidecode, but it does not list the Marvell controller, only the two 6 Gb/s SATA ports.  Hardinfo shows it like this:

Code: [Select]
f7d00000-f7dfffff           : PCI Bus 0000:01
f7de0000-f7deffff           : Marvell Technology Group Ltd. Device 91a3
f7dff800-f7dfffff           : Marvell Technology Group Ltd. Device 91a3
f7dff800-f7dfffff           : AHCI SATA low-level driver


I booted 3.12-6, 3.13-4, 3.14-0.towo.2, and 3.14-1, and none of them produced the error. But the error occurs on 3.15-rc2 (at 9.593092), and the current 3.15-rc5 (at 9.579824).  I zipped the dmesg output for all of these, and it is attached.
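
Checking each saved dmesg log for the error can be scripted. This is a rough Python sketch under a few assumptions: the function name `find_ata_exceptions` is hypothetical, and the regular expression is written only against the "exception Emask" line format shown in the paste above, so other kernels may word it differently.

```python
import re

# Matches lines such as:
# [    9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
ATA_EXCEPTION = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"(?P<port>ata\d+(?:\.\d+)?): exception Emask (?P<emask>0x[0-9a-fA-F]+)"
)


def find_ata_exceptions(dmesg_text):
    """Return (timestamp, port, emask) for every ATA exception line."""
    hits = []
    for line in dmesg_text.splitlines():
        m = ATA_EXCEPTION.match(line)
        if m:
            hits.append((float(m["time"]), m["port"], m["emask"]))
    return hits


# Three lines copied from the 3.15-rc5 dmesg output
sample = """\
[    9.563109] scsi 11:0:0:0: Processor         Marvell  91xx Config      1.01 PQ: 0 ANSI: 5
[    9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[    9.579190] ata12.00: irq_stat 0x40000001
"""
print(find_ata_exceptions(sample))
```

Run over the text of each saved log (read with something like `pathlib.Path(name).read_text()`), an empty list would correspond to a kernel that boots without the error.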

If there is more information that I can give, just ask.
« Last Edit: 2014/05/15, 15:43:52 by dibl »
Asus ROG STRIX X299-E, Core i7-7740X, Nvidia GTX-1060, dual monitors, SSD 860 EVO, 2@WD1003FZEX in BTRFS

Offline dibl

Re: Kernel 3.15 series -- new ATA error output -- UPDATE
« Reply #1 on: 2014/05/30, 02:42:07 »
Being a cautious and conservative type, I bought a pair of replacement drives and SATA 6 Gb/s cables, installed them, and restored my backup today.  As I suspected, the ATA error message remains when running kernel 3.15-rc7:


Code: [Select]
[    7.630148] ata3: SATA link down (SStatus 0 SControl 0)
[    9.707003] ata4: SATA link down (SStatus 0 SControl 0)
[    9.707211] scsi 4:0:0:0: Direct-Access     ATA      WDC WD1000DHTZ-0 04.0 PQ: 0 ANSI: 5
[    9.707688] scsi 5:0:0:0: Direct-Access     ATA      WDC WD1000DHTZ-0 04.0 PQ: 0 ANSI: 5
[    9.708295] scsi 11:0:0:0: Processor         Marvell  91xx Config      1.01 PQ: 0 ANSI: 5
[    9.723637] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[    9.724661] ata12.00: irq_stat 0x40000001
[    9.725289] scsi 11:0:0:0: CDB:
[    9.725291] Inquiry: 12 01 00 00 ff 00
[    9.725297] ata12.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
[    9.725297]          res 50/00:02:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
[    9.727653] ata12.00: status: { DRDY }
[    9.731814] scsi 13:0:0:0: CD-ROM            ASUS     DRW-24B1ST       1.04 PQ: 0 ANSI: 5
[    9.741275] scsi 14:0:0:0: Direct-Access     ATA      KINGSTON SS100S2 D100 PQ: 0 ANSI: 5
[    9.745773] sd 0:0:0:0: [sda] 117231408 512-byte logical blocks: (60.0 GB/55.8 GiB)
[    9.745776] sd 1:0:0:0: [sdb] 117231408 512-byte logical blocks: (60.0 GB/55.8 GiB)
[    9.745789] sd 4:0:0:0: [sdc] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    9.745792] sd 4:0:0:0: [sdc] 4096-byte physical blocks
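
To confirm that the error is the same one as before the drive swap, the two pasted error blocks can be compared with the timestamps removed. This is just a Python sketch: `strip_timestamps` is a made-up helper, and the two strings are the ata12.00 lines copied from the 3.15-rc5 and 3.15-rc7 outputs in this thread.

```python
import re

# Leading "[    9.578165] " style timestamp, plus the one space after it
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamps(text):
    """Return the lines of a dmesg excerpt without their timestamps."""
    return [TIMESTAMP.sub("", line) for line in text.splitlines()]


rc5 = """\
[    9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[    9.579190] ata12.00: irq_stat 0x40000001
[    9.579816] scsi 11:0:0:0: CDB:
[    9.579817] Inquiry: 12 01 00 00 ff 00
[    9.579824] ata12.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
[    9.579824]          res 50/00:02:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
[    9.582188] ata12.00: status: { DRDY }
"""

rc7 = """\
[    9.723637] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[    9.724661] ata12.00: irq_stat 0x40000001
[    9.725289] scsi 11:0:0:0: CDB:
[    9.725291] Inquiry: 12 01 00 00 ff 00
[    9.725297] ata12.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
[    9.725297]          res 50/00:02:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
[    9.727653] ata12.00: status: { DRDY }
"""

same_error = strip_timestamps(rc5) == strip_timestamps(rc7)
print(same_error)
```

Apart from the timestamps the two blocks match line for line, which is consistent with the drives and cables not being the cause.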
