Dec 17 20:20:26 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]
Dec 17 20:20:27 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]
Dec 17 20:20:28 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]
Dec 17 20:20:28 why dbus-daemon[1173]: dbus[1173]: [system] Activating service name='net.reactivated.Fprint' (using servicehelper)
Dec 17 20:20:28 why dbus[1173]: [system] Activating service name='net.reactivated.Fprint' (using servicehelper)
Dec 17 20:20:28 why dbus-daemon[1173]: Launching FprintObject
Dec 17 20:20:28 why dbus[1173]: [system] Successfully activated service 'net.reactivated.Fprint'
Dec 17 20:20:28 why dbus-daemon[1173]: dbus[1173]: [system] Successfully activated service 'net.reactivated.Fprint'
Dec 17 20:20:28 why dbus-daemon[1173]: ** Message: D-Bus service launched with name: net.reactivated.Fprint
Dec 17 20:20:28 why dbus-daemon[1173]: ** Message: entering main loop
Dec 17 20:20:29 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]
Dec 17 20:20:30 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]
Dec 17 20:20:31 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]
This is the tail of /var/log/messages. With all those mentions of killings and daemons, it totally looks like notes from a homicidal maniac with a medieval tilt.
Farther back the log has these entries:
Dec 17 16:17:19 why dbus-daemon[1173]: dbus[1173]: [system] Successfully activated service 'org.freedesktop.PackageKit'
Dec 17 16:26:35 why kernel: [65687.762948] CPU6: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762956] CPU2: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762956] CPU1: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762965] CPU0: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762969] CPU4: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762972] CPU7: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762977] CPU5: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.762977] CPU3: Package power limit notification (total events = 2)
Dec 17 16:26:35 why kernel: [65687.773971] CPU5: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773975] CPU2: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773982] CPU4: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773984] CPU6: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773988] CPU3: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773992] CPU7: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773997] CPU1: Package power limit normal
Dec 17 16:26:35 why kernel: [65687.773999] CPU0: Package power limit normal
Dec 17 16:26:48 why kernel: [65700.686031] [Hardware Error]: Machine check events logged
Dec 17 16:26:48 why mcelog[1166]: Hardware event. This is not a software error.
Dec 17 16:26:48 why mcelog[1166]: MCE 0
Dec 17 16:26:48 why mcelog[1166]: CPU 2 THERMAL EVENT TSC 7740a8dc9110
Dec 17 16:26:48 why mcelog[1166]: TIME 1324157195 Sat Dec 17 16:26:35 2011
Dec 17 16:26:48 why mcelog[1166]: Processor 2 below trip temperature. Throttling disabled
Dec 17 16:26:48 why mcelog[1166]: STATUS c0000000881b0c00 MCGSTATUS 0
Dec 17 16:26:48 why mcelog[1166]: MCGCAP c09 APICID 4 SOCKETID 0
Dec 17 16:26:48 why mcelog[1166]: CPUID Vendor Intel Family 6 Model 42
Dec 17 16:26:48 why mcelog[1166]: Hardware event. This is not a software error.
Dec 17 16:26:48 why mcelog[1166]: MCE 1
Dec 17 16:26:48 why mcelog[1166]: CPU 1 THERMAL EVENT TSC 7740a8dcb633
Dec 17 16:26:48 why mcelog[1166]: TIME 1324157195 Sat Dec 17 16:26:35 2011
Dec 17 16:26:48 why mcelog[1166]: Processor 1 below trip temperature. Throttling disabled
Dec 17 16:26:48 why mcelog[1166]: STATUS c0000000881b0c00 MCGSTATUS 0
Dec 17 16:26:48 why mcelog[1166]: MCGCAP c09 APICID 2 SOCKETID 0
Dec 17 16:26:48 why mcelog[1166]: CPUID Vendor Intel Family 6 Model 42
Dec 17 16:26:48 why mcelog[1166]: Hardware event. This is not a software error.
It killed something in cold blood and is now gloating over it?
According to the followup to that post, it's a kernel issue that sprang up after 2.6.38.7. The kernel is not able to balance CPU/GPU loads well on Sandy Bridge processors, and that gives rise to these errors. Updating the BIOS does not solve the issue.
Why does it ALL have to happen in my machine?! Seriously, this is not even my super-sucky original machine. It's a machine from my bro. Does the linux universe hate me or what?!
I followed that thread to the bugzilla on kernels: :::link:::
It's apparently not a BIOS issue anymore. It's now seen as an issue with how the kernel handles CPU and GPU loads on Intel Sandy Bridge chips. They made a patch for it, but now are worried about how it would affect performance on these chips.
So if /var/log/messages claims it's a hardware issue, that might not really be true; it might actually be about how the new kernel is treating the CPU/GPU. Confusion ++
Not that the productivity-ruining freezes are not serious, but the freezes don't seem to occur anywhere near these BIOS-related hardware thermal problems in the log...
With my fabulous luck, what if I run into one of these issues with the new BIOS: :::link:::
I am wondering if it might not be a safer bet to live with these thermal issues (or non-issues) rather than inherit some really serious problems...
Btw, I posted that VLC related error in the earlier post. Something connected with LIRC.
I have had this issue for quite a while. He didn't run into it. I am updating my bios as soon as I can (if there is an update, that is).
How about the other 1000 errors that are not connected to this particular error? :) The BIOS-related error was a whole 3 hours before the freeze actually happened. The errors with the freeze (and around that timeframe) have no mentions of any thermal problems. They seem to be glitches in how Fedora is handling mounted drives.
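For anyone who wants to check this kind of gap on their own log, here is a rough sketch in Python. It only measures the time between the first line containing one string and the first line containing another, so it gives the distance between the thermal event and the first udevd timeout in the excerpts above, not the freeze itself. The helper names and the hard-coded year are my own assumptions (syslog lines carry no year; 2011 is taken from the mcelog TIME line).

```python
from datetime import datetime

# Assumption: syslog timestamps omit the year, so one is supplied here.
# 2011 is taken from the mcelog "TIME ... 2011" line in the log above.
ASSUMED_YEAR = 2011

def parse_stamp(line):
    """Parse the leading 'Dec 17 20:20:26' style timestamp of a syslog line."""
    return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")

def first_match(lines, needle):
    """Return the timestamp of the first line containing needle, or None."""
    for line in lines:
        if needle in line:
            return parse_stamp(line)
    return None

def gap_between(lines, earlier_needle, later_needle):
    """Return the time from the first earlier_needle line to the first later_needle line."""
    start = first_match(lines, earlier_needle)
    end = first_match(lines, later_needle)
    if start is None or end is None:
        return None
    return end - start

# Two lines copied from the excerpts in this thread.
sample = [
    "Dec 17 16:26:48 why mcelog[1166]: CPU 2 THERMAL EVENT TSC 7740a8dc9110",
    "Dec 17 20:20:26 why udevd[11705]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [12021]",
]

gap = gap_between(sample, "THERMAL EVENT", "timeout: killing")
print(gap)
```

To run it against the real log, read the lines from /var/log/messages (this usually needs root) and pass them in place of `sample`.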
Oh, I have the latest BIOS firmware on my laptops. However, if your BIOS is triggering thermal events, it really is not Fedora's fault. I never thought to mention that, but it does seem to be related to a hardware issue: "Hardware event. This is not a software error." I guess that's a good thing in a way. There is also the possibility that your device is truly having thermal issues. Did your brother not run into this?