I have been investigating a boot error message for a week or two, and I finally realized that it occurs only with the 3.15-rc kernel series.
The system is this:
don@imerabox:~$ inxi -v7
System: Host: imerabox Kernel: 3.15-rc5-siduction-amd64 x86_64 (64 bit gcc: 4.9.0)
Desktop: KDE 4.12.4 (Qt 4.8.6) info: plasma-desktop dm: lightdm
Distro: aptosid 2011-02 Ἡμέρα - kde-lite - (201107131633)
Machine: Mobo: ASUSTeK model: P6X58D-E v: Rev 1.xx Bios: American Megatrends v: 0803 date: 08/06/2012
CPU: Quad core Intel Core i7 950 (-HT-MCP-) cache: 8192 KB
flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 25178
Clock Speeds: 1: 3146 MHz 2: 3146 MHz 3: 3146 MHz 4: 3146 MHz 5: 3146 MHz 6: 3146 MHz 7: 3146 MHz
8: 3146 MHz
Graphics: Card: NVIDIA GF100 [GeForce GTX 480] bus-ID: 05:00.0 chip-ID: 10de:06c0
Display Server: X.Org 1.15.1 driver: nvidia Resolution: 1920x1200@59.9hz
GLX Renderer: GeForce GTX 480/PCIe/SSE2 GLX Version: 4.4.0 NVIDIA 337.12 Direct Rendering: Yes
Audio: Card-1 NVIDIA GF100 High Definition Audio Controller
driver: snd_hda_intel bus-ID: 05:00.1 chip-ID: 10de:0be5
Card-2 Intel 82801JI (ICH10 Family) HD Audio Controller
driver: snd_hda_intel bus-ID: 00:1b.0 chip-ID: 8086:3a3e
Card-3 Logitech QuickCam Communicate STX driver: USB Audio usb-ID: 004-002 chip-ID: 046d:08d7
Sound: Advanced Linux Sound Architecture v: k3.15-rc5-siduction-amd64
Network: Card: Marvell 88E8056 PCI-E Gigabit Ethernet Controller
driver: sky2 v: 1.30 port: d800 bus-ID: 06:00.0 chip-ID: 11ab:4364
IF: eth0 state: up speed: 100 Mbps duplex: full mac: 20:cf:30:5c:41:1d
WAN IP: 99.126.220.109 IF: vmnet8 ip: 172.16.27.1 ip-v6: fe80::250:56ff:fec0:8
IF: eth0 ip: 192.168.1.70 ip-v6: fe80::22cf:30ff:fe5c:411d
IF: vmnet1 ip: 172.16.203.1 ip-v6: fe80::250:56ff:fec0:1
Drives: HDD Total Size: 2136.5GB (51.3% used)
ID-1: /dev/sda model: OCZ size: 60.0GB serial: OCZ-Q17BKCW4VFG6ZBG7
ID-2: /dev/sdb model: OCZ size: 60.0GB serial: OCZ-0C2Z27B5QEE4L0H2
ID-3: /dev/sdd model: WDC_WD1002FAEX size: 1000.2GB serial: WD-WCAW34278337
ID-4: /dev/sdc model: WDC_WD1002FAEX size: 1000.2GB serial: WD-WCAW34194718
ID-5: /dev/sde model: KINGSTON_SS100S2 size: 16.0GB serial: 16GAA0002142
Optical: /dev/sr0 model: ASUS DRW-24B1ST rev: 1.04 dev-links: cdrom,cdrw,dvd,dvdrw
Features: speed: 48x multisession: yes
audio: yes dvd: yes rw: cd-r,cd-rw,dvd-r,dvd-ram state: running
Partition: ID-1: / size: 19G used: 16G (89%) fs: ext4 dev: /dev/sda1
label: N/A uuid: bea3a748-3411-4024-acd0-39f3882ddaf9
ID-2: /mnt/REVODATA size: 55G used: 11G (20%) fs: ext4 dev: /dev/sdb1
label: revodata uuid: ec21f5b3-7fd4-4f4b-af8d-cf787b147ae8
ID-3: /mnt/SDA2 size: 37G used: 31G (84%) fs: ext4 dev: /dev/sda2
label: SDA2 uuid: 8cfe2acc-7572-4b45-b25f-ed021bb1d78b
ID-4: /boot size: 495M used: 223M (48%) fs: ext2 dev: /dev/sde1
label: N/A uuid: ac7da829-aebb-46f0-806c-04a4d81a945a
ID-5: /mnt/DATA size: 1.9T used: 951G (52%) fs: btrfs dev: /dev/sdc label: N/A uuid: N/A
ID-6: swap-1 size: 15.48GB used: 0.00GB (0%) fs: swap dev: /dev/sde2
label: N/A uuid: 0d939b7d-48f1-47dd-aebe-77e7bd8c3503
RAID: No RAID data: /proc/mdstat missing-is md_mod kernel module loaded?
Unmounted: No unmounted partitions detected
Sensors: System Temperatures: cpu: 42.0C mobo: 35.0C gpu: 0.0:45C
Fan Speeds (in rpm): cpu: 1962 psu: 0 sys-1: 0 sys-2: 0 sys-3: 0
Info: Processes: 325 Uptime: 3:43 Memory: 2074.2/5965.6MB
Init: systemd v: 204 runlevel: 5 default: 5 Gcc sys: 4.8.2 alt: 4.6/4.7/4.9
Client: Shell (bash 4.3.111 running in konsole) inxi: 2.1.28
The hard drive layout is a little complex, because /dev/sda, an OCZ Revodrive PCIe SSD, is not recognized as a bootable device, so I had to add a small Kingston SSD for /boot. /dev/sdb is the second part of the OCZ SSD -- it was intended for Windows users to use as RAID0.
root@imerabox:/# blkid -c /dev/null -o list
device fs_type label mount point UUID
------------------------------------------------------------------------------------------------------------------------------------------------------------
/dev/sda1 ext4 / bea3a748-3411-4024-acd0-39f3882ddaf9
/dev/sda2 ext4 SDA2 /mnt/SDA2 8cfe2acc-7572-4b45-b25f-ed021bb1d78b
/dev/sdb1 ext4 revodata /mnt/REVODATA ec21f5b3-7fd4-4f4b-af8d-cf787b147ae8
/dev/sdd btrfs (in use) 9025bea6-b615-470a-8759-df1b13f63b52
/dev/sdc btrfs (in use) 9025bea6-b615-470a-8759-df1b13f63b52
/dev/sde1 ext2 /boot ac7da829-aebb-46f0-806c-04a4d81a945a
/dev/sde2 swap <swap> 0d939b7d-48f1-47dd-aebe-77e7bd8c3503
The OS is installed on /dev/sda1. VMs are saved on /mnt/SDA2 and /mnt/REVODATA. Other user data are on /mnt/DATA, a BTRFS filesystem on two WD1002FAEX hard drives.
The boot error, pasted from dmesg, looks like this:
[ 5.361251] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[ 5.371405] ata2.00: ATA-8: OCZ-REVODRIVE, 1.37, max UDMA/133
[ 5.371409] ata2.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32)
[ 5.381394] ata2.00: configured for UDMA/100
[ 5.381720] scsi 1:0:0:0: Direct-Access ATA OCZ-REVODRIVE 1.37 PQ: 0 ANSI: 5
[ 7.471353] ata3: SATA link down (SStatus 0 SControl 0)
[ 9.561492] ata4: SATA link down (SStatus 0 SControl 0)
[ 9.561764] scsi 4:0:0:0: Direct-Access ATA WDC WD1002FAEX-0 05.0 PQ: 0 ANSI: 5
[ 9.562366] scsi 5:0:0:0: Direct-Access ATA WDC WD1002FAEX-0 05.0 PQ: 0 ANSI: 5
[ 9.563109] scsi 11:0:0:0: Processor Marvell 91xx Config 1.01 PQ: 0 ANSI: 5
[ 9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[ 9.579190] ata12.00: irq_stat 0x40000001
[ 9.579816] scsi 11:0:0:0: CDB:
[ 9.579817] Inquiry: 12 01 00 00 ff 00
[ 9.579824] ata12.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
[ 9.579824] res 50/00:02:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
[ 9.582188] ata12.00: status: { DRDY }
[ 9.586256] scsi 13:0:0:0: CD-ROM ASUS DRW-24B1ST 1.04 PQ: 0 ANSI: 5
It always occurs immediately after the Marvell controller is found. Until today, I assumed that this was a hardware problem with one of the HDDs or SSDs. But yesterday I ran the extended SMART test on all drives, and all passed, with no scary messages. (The WDs take 4 hours to run that test.)
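For what it is worth, the CDB bytes printed in the log ("12 01 00 00 ff 00") can be decoded by hand. The sketch below assumes the standard 6-byte SCSI INQUIRY layout (opcode, EVPD bit, page code, 2-byte allocation length, control); the function name decode_inquiry_cdb is just an illustrative helper, not something from the kernel or any library.

```python
def decode_inquiry_cdb(hex_bytes):
    """Decode a 6-byte SCSI INQUIRY CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 6 or cdb[0] != 0x12:
        raise ValueError("not a 6-byte INQUIRY CDB")
    return {
        "opcode": cdb[0],                            # 0x12 = INQUIRY
        "evpd": bool(cdb[1] & 0x01),                 # 1 = a VPD page is requested
        "page_code": cdb[2],                         # 0x00 = list of supported VPD pages
        "allocation_length": (cdb[3] << 8) | cdb[4], # big-endian, bytes 3-4
        "control": cdb[5],
    }

# Bytes copied from the "Inquiry:" line in the dmesg excerpt.
fields = decode_inquiry_cdb("12 01 00 00 ff 00")
print(fields)
```

Read this way, the failing command is an INQUIRY with the EVPD bit set for page 0x00 and a 255-byte allocation length, and the log shows it addressed to scsi 11:0:0:0, the "Marvell 91xx Config" processor device rather than one of the disks.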
Today I booted a couple of older kernels to check, and to my surprise, the error occurs only when booting a 3.15 kernel.
The Marvell 91xx is actually a 91a3 "SATA III" (6 Gb/s) SATA controller. It has only 2 ports, and the WD drives are connected to it -- they are 6 Gb/s drives. I ran dmidecode, but it does not list the Marvell controller, only the two 6 Gb/s SATA ports. Hardinfo shows it like this:
f7d00000-f7dfffff : PCI Bus 0000:01
f7de0000-f7deffff : Marvell Technology Group Ltd. Device 91a3
f7dff800-f7dfffff : Marvell Technology Group Ltd. Device 91a3
f7dff800-f7dfffff : AHCI SATA low-level driver
I booted 3.12-6, 3.13-4, 3.14-0.towo.2, and 3.14-1, and none of them produced the error. But the error occurs on 3.15-rc2 (at 9.593092) and on the current 3.15-rc5 (at 9.579824). I zipped the dmesg output for all of these, and it is attached.
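Anyone who wants to repeat the comparison across the saved dmesg outputs could scan them for libata exception lines with something like the sketch below. It assumes the line format seen in the excerpt above ("[ time] ataN.NN: exception Emask ..."); the names EXCEPTION_RE and find_ata_exceptions are made up for this example.

```python
import re

# Matches lines such as:
# [ 9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
EXCEPTION_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+(ata\d+(?:\.\d+)?): exception Emask (0x[0-9a-fA-F]+)"
)

def find_ata_exceptions(dmesg_text):
    """Return (timestamp, port, emask) for every ATA exception line found."""
    hits = []
    for line in dmesg_text.splitlines():
        m = EXCEPTION_RE.match(line)
        if m:
            hits.append((float(m.group(1)), m.group(2), m.group(3)))
    return hits

# Three lines copied from the 3.15-rc5 excerpt in this post.
sample = """\
[ 9.563109] scsi 11:0:0:0: Processor Marvell 91xx Config 1.01 PQ: 0 ANSI: 5
[ 9.578165] ata12.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
[ 9.579190] ata12.00: irq_stat 0x40000001
"""
hits = find_ata_exceptions(sample)
print(hits)
```

To check a saved log, read the file into a string and pass it to find_ata_exceptions; an empty list would mean that kernel booted without the exception.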
If there is more information that I can give, just ask.