(The post below lays out a system crash that has been plaguing my home system for a few months now. The updates are at the top – think of it like an email thread. Start from the bottom if you want the whole saga.)
Update 5: OK, bug report submitted to Apple along with a consistent test case. The issue does really seem like a memory leak in the Spotlight plugin for emails, but I’ll leave it to the experts to sort out. It’s Radar rdar://problem/15695276 and I’ve submitted a copy to OpenRadar here (sans the email file, since it has real email addresses in there).
I went through and deleted every email larger than ~27MB, then turned on Spotlight indexing for mail. After that finished, I turned on Time Machine again. I haven’t seen the memory spike at all. So, this seems like the culprit.
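For anyone who wants to hunt for the same kind of oversized messages, here is a rough sketch of how the search could be scripted. It is not what I actually ran. The function name `find_large_emlx` is made up for illustration, the `~/Library/Mail` location is an assumption about where Mail.app keeps its files, and the 27MB cutoff is just the rough threshold from above. It only lists files; it does not delete anything.

```python
import os

def find_large_emlx(root, min_bytes=27 * 1024 * 1024):
    """Return (size, path) pairs for .emlx files under root larger than
    min_bytes, biggest first. Names ending in .partial.emlx match too."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".emlx"):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if size > min_bytes:
                hits.append((size, path))
    hits.sort(reverse=True)
    return hits

if __name__ == "__main__":
    # Assumed default Mail.app location; adjust if yours differs.
    mail_root = os.path.expanduser("~/Library/Mail")
    for size, path in find_large_emlx(mail_root):
        print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

Deleting the messages from inside Mail.app, rather than removing the files by hand, is probably the safer way to act on the list.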
On a related note, as I was submitting the bug report to Apple, I copied the email file to my MacBook. Immediately, it started feeling sluggish and stuttering. Looked at Activity Monitor and, sure enough, memory was going absolutely nuts. The MBP has twice as much RAM as the iMac, though, so I think it’s been able to recover when this happens (though it has locked up a few times that I can remember… probably because of this).
I’ll update if/when Apple confirms anything.
Update 4: no luck – crashed again after an hour or so. I found a bunch of other large emails still on the disk, so I think I need to clean them up. Or, I’m just wrong. Either way, more debugging later this week.
Update 3: Solved, maybe! So, I was able to narrow this down to files in the mail folder. I happened to inspect the mdworker processes in Activity Monitor and saw they were always in .emlx files around when the memory would spike. So, taking that as a clue, I told Spotlight to ignore that folder under the Spotlight Privacy tab and suddenly the machine stayed up. But… I also had to shut down Time Machine because that somehow uses mdworker or causes it to hit those folders, leading to another crash.
So, the next step was to try and log what files mdworker was accessing. There’s probably a more elegant way to do this, but I ended up using fs_usage and then opensnoop, which are both part of OS X. They both let you see what files a process is interacting with while the process is running (opensnoop does it using DTrace hooks). The final command line was opensnoop -a -n mdworker | tee mdworker.log.
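Once a log like mdworker.log exists, pulling out the last few .emlx files that were opened before a crash can be scripted. The sketch below is only an illustration: `last_emlx_opens` is a made-up helper, and because I am not certain of the exact column layout that opensnoop -a prints, it does not parse columns at all. It just assumes each relevant line contains the full path, starting at a Mail folder prefix you pass in and ending in .emlx.

```python
def last_emlx_opens(log_lines, mail_root, keep=10):
    """Return up to `keep` most recently opened .emlx paths, newest first.

    Assumes (not verified against real opensnoop output) that each relevant
    line contains the full path, starting with mail_root and ending in .emlx.
    """
    seen = []
    for line in log_lines:
        start = line.find(mail_root)
        if start == -1:
            continue
        end = line.find(".emlx", start)
        if end == -1:
            continue
        path = line[start:end + len(".emlx")]
        # Move a repeated path to the end so order reflects the latest open.
        if path in seen:
            seen.remove(path)
        seen.append(path)
    return seen[::-1][:keep]
```

To use it, open mdworker.log, pass the file object as `log_lines`, and give your own Mail folder (for example the expanded form of ~/Library/Mail) as `mail_root`. The paths at the top of the result are the ones worth checking for size.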
I then unblocked the Mail folder from the privacy settings and let Spotlight go (left Time Machine off for the first run). I let the machine crash a few times. After a few restarts, it was clear that the largest mdworker processes had last touched large emails that were in partial emlx (.partial.emlx) files. I manually ran mdimport against some representative files (mdimport -d4 /path/to/file) and was able to recreate the near 5GB kernel_task behavior. One email (~30MB on disk), in particular, added 7-8GB of swap space. It was crazy.
So, went and disabled TM & Spotlight again, went into Mail.app and tried to delete the files and kicked off spotlight again. It all worked. Time Machine just finished, too. I think this may be sorted out. Fingers crossed – will wait a few days before declaring victory.
So, the only bummer is that in my zeal to see if those emails were the issue, I deleted them before backing them up somewhere. So… no test case to send off to Apple. These sorts of emails show up now and again for me (they’re basically digests of an attachment heavy PR mailing list), so I will probably have another sample case soon.
Update 2: crashed again. FML. I posted a screenshot of the Activity Monitor at time of death: https://twitter.com/sujal/status/410482157644423168
Update: So, this last reboot, mdworker was still running, but memory was fine. Then, Time Machine kicked on and started prepping a backup. That’s when memory usage spiked and memory pressure went red. I killed the TM backup, memory returned to normal, but then a few moments later, went crazy again. Hmm – it looks like at least one mdworker is indexing Mail right now… wonder if this is a variation of the Gmail thing in Mavericks?
I’m hoping the Mac mavens among you can help me find some ideas on how to debug an issue I’m seeing now with both of my Macs running Mavericks. I’m going to file a bug report with Apple soon, but based on history, that will take a while and I really can’t deal with this for much longer. I may just downgrade.
After some amount of time, anywhere from minutes to a few hours, my iMac 27″ (from 2010) will randomly freeze, hard. Tapping on the Magic Trackpad won’t do anything, hitting a key on the keyboard will sometimes get the backlight to go on, but no screensaver will be visible. Just a dark screen. The only way to recover is to reboot.
Disk Utility & DiskWarrior say everything is fine with the drives.
memtest passed when I ran it off the recovery partition, but when I ran it in my normal logged-in state, the whole memory pressure red/swap going nuts thing happened and the system froze.
The most consistent symptoms I can see pre-crash, based on logs and live monitoring using Activity Monitor, are the following. This is the situation just before it crashes:
— Activity Monitor shows memory pressure is high
— kernel_task memory usage is listed at 4.68GB (or more, but in that ballpark)
— mdworker has 3-4 processes running, each listed at ~500MB
So, why would mdworker make kernel_task use so much RAM?
(Also, I’ve tried resetting my spotlight cache, removing old unused Spotlight plugins… no luck)
I tried turning off Time Machine, which seemed to help. My current theory is that when this crash happens, my computer is in the high memory pressure state, caused by mdworker, and then Time Machine kicks in trying to back up and the world just stops.
I’m trying to catch that, but I’m busy enough that I doubt I will catch it happening…
Any ideas on where to look next or something to try?
(iMac has 8GB of RAM, but today my MacBook with 16GB of RAM just exhibited the same symptoms, and has been less stable than I’d like with Mavericks… I’m wondering if it’s just more stable because it has more RAM…)