2.6 — Instruments primer
Opening scenario
The app is shipping. QA reports: “The Feed scrolls smoothly on iPhone 15 Pro but stutters on iPhone 12 mini.” App Store reviews are starting to mention it. Your tech lead asks: “Can you confirm whether it’s CPU, GPU, or main-thread blocking — and pinpoint the function?”
You can answer “I think it’s the image decoder,” or you can answer with a flame graph, a CPU sample, and the name of the exact function. The second answer comes from Instruments.
Instruments is the perf-engineering tool that ships with Xcode. It’s a separate app (Xcode → Open Developer Tool → Instruments, or ⌘I to launch with the current build). It has ~30 templates, each focused on a kind of measurement. You’ll regularly use only a handful of them.
The instruments you’ll actually use
| Template | Question it answers |
|---|---|
| Time Profiler | “Which functions are eating CPU?” |
| Allocations | “Where is memory growing?” |
| Leaks | “Which allocations are leaked (never freed)?” |
| Hangs | “Where is the main thread blocked?” |
| Animation Hitches | “Why are we dropping frames?” |
| Energy Log | “Why is the battery draining?” |
| Network Link Conditioner | (Not Instruments per se — but adjacent) “How does the app behave on a slow connection?” |
| App Launch | “Why does cold launch take 1.4 seconds?” |
| Core Data | “Which fetches are slow?” |
| File Activity | “Why is the app I/O-bound?” |
Concept → Why → How → Code
Time Profiler — the workhorse
The single most useful Instrument. Records the call stack of every thread at a fixed sampling interval (usually 1 ms) and aggregates them. The output:
- Call Tree view: time spent in each function, hierarchically. Toggle “Invert Call Tree” to list leaf functions first, ranked by the time spent in their own code (where the work is actually done).
- Flame Graph view: a visual stack-depth chart.
The workflow:
- ⌘I in Xcode → choose Time Profiler template
- Click record (red button), perform the slow action, click stop
- In the Call Tree, Hide System Libraries (sidebar) to drop noise from libdispatch, Foundation, etc.
- Sort by “Weight” descending
- The top entries are where time is going
For a typical scroll-stutter bug, you’ll see something like:
89% MyApp -[FeedCell drawRect:]
72% UIGraphicsImageRenderer.image
72% ImageDecoder.decode(_:)
65% data(contentsOf:) ← synchronous I/O on the main thread
Diagnosis in a single screen: image decode is synchronous and on the main thread. Fix: move decode to a background queue, cache decoded images.
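A minimal sketch of that fix, assuming Swift Concurrency is available. `DecodedCache` and its `decode` closure are hypothetical names, not part of any Apple API; in a real app the closure would read the file and decode the image. The actor runs each decode in a detached task (off the caller's thread) and stores the task so a second request for the same key reuses the result instead of decoding again.

```swift
/// Hypothetical cache: decodes each key at most once, away from the caller's thread.
/// It never evicts entries; a real image cache would also need a size limit.
actor DecodedCache<Key: Hashable & Sendable, Value: Sendable> {
    private var tasks: [Key: Task<Value, Error>] = [:]
    private let decode: @Sendable (Key) throws -> Value

    init(decode: @escaping @Sendable (Key) throws -> Value) {
        self.decode = decode
    }

    func value(for key: Key) async throws -> Value {
        // Reuse a finished or in-flight decode for this key.
        if let existing = tasks[key] {
            return try await existing.value
        }
        let decode = self.decode
        // Detached: the expensive work does not inherit the caller's actor (e.g. the main actor).
        let task = Task.detached(priority: .userInitiated) { try decode(key) }
        tasks[key] = task
        do {
            return try await task.value
        } catch {
            tasks[key] = nil  // allow a retry after a failed decode
            throw error
        }
    }
}

// Usage sketch: the closure stands in for reading and decoding an image.
func exampleUsage() async throws -> Int {
    let cache = DecodedCache<String, Int> { key in key.utf8.count }
    _ = try await cache.value(for: "photo-1")     // decoded in a detached task
    return try await cache.value(for: "photo-1")  // served from the stored task
}
```

From a cell, you would `await` the cache and assign the result once it returns, so the main thread only does the final assignment. After the change, profile again: the decode frames should move off the main thread's call tree.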
Allocations — where does memory go
Records every allocation and deallocation. Useful views:
- All Heap & Anonymous VM — total memory over time
- Persistent Bytes — what’s still alive
- Mark Generation — record a “snapshot” before an action and after; the diff shows what was newly allocated. Excellent for finding “this screen leaks 2 MB every time I push it.”
The workflow for finding memory that survives an action it should not outlive:
- Start recording
- Mark generation (button: small flag icon) — call this baseline
- Perform the action (push screen, return)
- Mark generation again
- Inspect “Allocations between snapshots” — anything that should have been freed but wasn’t is your leak
Leaks — find what’s never freed
The Leaks instrument runs alongside Allocations and detects classic leaks: blocks of memory with no reference path from any root. In Swift with ARC, pure leaks are rare; cycle-based leaks are more common (and don’t always trigger Leaks — they may need the Memory Graph Debugger from Ch 2.5).
When Leaks fires, it gives you the allocation stack — the line of code that allocated the leaked object. Usually that’s enough to find the cycle.
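For reference, here is the kind of cycle that allocation stack usually leads to, as a small sketch. `FeedViewModel` and `survivesScope` are made-up names for illustration. The leaky binding stores a closure that captures `self` strongly, so the object and its closure keep each other alive; the safe binding captures `self` weakly.

```swift
final class FeedViewModel {
    var onUpdate: (() -> Void)?
    private(set) var refreshCount = 0

    /// Leaky: self owns onUpdate, and onUpdate captures self strongly.
    func bindLeaky() {
        onUpdate = { self.refreshCount += 1 }
    }

    /// Safe: the closure holds only a weak reference back to self.
    func bindSafely() {
        onUpdate = { [weak self] in self?.refreshCount += 1 }
    }
}

/// Returns true if the view model is still alive after its only owner went out of scope.
func survivesScope(_ bind: (FeedViewModel) -> Void) -> Bool {
    weak var reference: FeedViewModel?
    do {
        let viewModel = FeedViewModel()
        bind(viewModel)
        reference = viewModel
    }
    return reference != nil
}

print(survivesScope { $0.bindLeaky() })   // true: the cycle keeps it alive
print(survivesScope { $0.bindSafely() })  // false: deallocated as expected
```

The same check works as a unit test: hold a `weak` reference, drop the owner, and assert the reference is `nil`.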
Hangs — find what blocks the main thread
The Hangs instrument flags every span where the main thread was blocked > 250 ms. Drill in: you see the call stack at the moment of the hang. The fix is usually to move the work to a background queue.
For shipped apps running on iOS 16+, hangs are also surfaced in Xcode’s Organizer → Hangs, with hang-rate metrics from real devices.
Animation Hitches
For scroll smoothness specifically. Records each frame that misses its display deadline. Tells you whether the bottleneck is CPU prep, GPU draw, or main-thread blocking. The new tool for what used to be diagnosed with MetricKit + intuition.
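To make “missing a frame deadline” concrete, here is the arithmetic as a simplified sketch. It assumes each frame's budget is simply 1000 ms divided by the refresh rate and treats any time over that budget as hitch time; Instruments measures hitches against the actual expected presentation time, so treat this as a model, not a reimplementation. The function names are hypothetical.

```swift
/// Frame budget in milliseconds for a display refresh rate in Hz.
/// 60 Hz gives about 16.67 ms; 120 Hz gives about 8.33 ms.
func frameBudgetMs(refreshRateHz: Double) -> Double {
    1000.0 / refreshRateHz
}

/// Milliseconds of hitch per second of recorded frames (simplified model).
func hitchTimeRatio(frameDurationsMs: [Double], refreshRateHz: Double) -> Double {
    let budget = frameBudgetMs(refreshRateHz: refreshRateHz)
    // Only the time beyond the budget counts as hitch time.
    let hitchMs = frameDurationsMs.reduce(0) { $0 + max(0, $1 - budget) }
    let totalSeconds = frameDurationsMs.reduce(0, +) / 1000.0
    return totalSeconds > 0 ? hitchMs / totalSeconds : 0
}
```

Normalizing to milliseconds per second lets you compare a 3-second scroll with a 30-second one, and a 60 Hz device with a 120 Hz one.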
App Launch
Launches your app with detailed timing of:
- pre-main(): dynamic linking, ObjC runtime setup
- main() to first frame: your application(_:didFinishLaunchingWithOptions:) and initial view rendering
Goal: cold launch to first frame in under 400 ms on a mid-tier device, which is the target Apple recommends in its launch-time guidance. App Store reviewers notice anything over 1 second.
The most common bottleneck: a startup work list — analytics SDK init, crash reporter init, push notification setup — all happening synchronously in application(_:didFinishLaunching…). Defer non-critical SDKs to after first paint.
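One way to structure that deferral, as a sketch. `StartupTasks` is a hypothetical helper, and which SDKs count as critical is an assumption you would decide per app (the crash reporter is treated as critical here only for illustration).

```swift
/// Hypothetical startup coordinator: critical work runs immediately,
/// everything else waits until the first frame is on screen.
final class StartupTasks {
    private var deferred: [(name: String, work: () -> Void)] = []
    /// Names of tasks in the order they actually ran (handy for logging and tests).
    private(set) var log: [String] = []

    /// Runs work synchronously, for the few things launch cannot proceed without.
    func runNow(_ name: String, _ work: () -> Void) {
        work()
        log.append(name)
    }

    /// Queues work that can wait until after first paint.
    func deferUntilFirstFrame(_ name: String, _ work: @escaping () -> Void) {
        deferred.append((name: name, work: work))
    }

    /// Call once after the first screen has appeared.
    func firstFrameDidRender() {
        let pending = deferred
        deferred.removeAll()
        for task in pending {
            task.work()
            log.append(task.name)
        }
    }
}
```

In the app delegate you would call `runNow` and `deferUntilFirstFrame` from the launch callback, then call `firstFrameDidRender()` from the first screen's `viewDidAppear(_:)`. Re-run the App Launch template afterwards to confirm the deferred work left the launch path.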
Counters / os_signpost
To attribute time to your logical operations (not just method names), use os_signpost in your code:
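import os
// Create the signposter once and reuse it; subsystem and category group the data in Instruments.
let signposter = OSSignposter(subsystem: "com.example.MyApp", category: "FeedLoad")
// Begin a named interval; keep the returned state to close the same interval later.
let interval = signposter.beginInterval("loadFeed", id: signposter.makeSignpostID())
await loadFeed()
// End the interval so Instruments can draw its duration on the timeline.
signposter.endInterval("loadFeed", interval)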
In Instruments, your loadFeed interval will appear as a colored band on the timeline: in the os_signpost instrument for a custom category like “FeedLoad”, or on the Points of Interest track if you create the signposter with the .pointsOfInterest category. You can correlate it with CPU spikes, memory growth, and network traffic across all timelines simultaneously. Critical for understanding “what was happening when the frame dropped?”
Network Link Conditioner
Not part of Instruments — it’s a separate tool (download from developer.apple.com → Additional Downloads → “Additional Tools for Xcode” → install Network Link Conditioner). It throttles your Mac’s network to simulate 3G, Edge, lossy WiFi, etc. Run your app under “Edge” once a quarter — you’ll find loading states you didn’t know were absent and timeouts that are too aggressive.
On a device: Settings → Developer → Network Link Conditioner (appears after you’ve connected to Xcode with developer mode enabled).
In the wild
- Lyft’s iOS engineers measure cold launch with Instruments and os_signpost instrumentation, and gate releases on a launch budget (configurable per device class).
- Apple’s WWDC sessions on perf (every year, multiple) demo the Time Profiler and os_signpost. The 2023 “Analyze hangs with Instruments” session is a 25-minute crash course.
- Robinhood’s iOS team uses MetricKit (shipped data from real users) + Instruments (local reproductions) as a complementary pair: MetricKit surfaces the symptom on real devices, Instruments reproduces it locally for fixing.
- NYT iOS publishes scroll-smoothness perf budgets internally and gates merges on Animation Hitches numbers.
Common misconceptions
- “Profile in Debug builds.” No! Always profile in Release (Edit Scheme → Profile → Build Configuration: Release). Debug builds are compiled without optimizations and may have sanitizers enabled, which can make functions appear 5–10× slower. Your Time Profiler results will be lies.
- “More cores → faster” is not a fix you can produce. Instruments shows you wall-clock time. If your work is single-threaded by design (UI updates on the main thread), more cores don’t help; the time spent on that one thread is what you have to reduce.
- “Leaks instrument catches all memory leaks.” It catches the classic “no path from any root” leaks. It can miss retain cycles, and it does not flag abandoned memory that is still reachable from a root but never used again. Use the Memory Graph Debugger for those.
- “Time Profiler tells you where the bug is.” It tells you where the time is. The bug may be that you’re calling that function too often, not that the function itself is slow. Always check both: “Is this function slow?” and “Why is it called 1000 times when 1 would do?”
- “os_signpost is for advanced users.” It’s for anyone who wants meaningful timelines. Add 5 signposts to your app on day one: login, feed load, image decode, push handler, scroll lifecycle. You’ll thank yourself the first time you debug a perf issue.
Seasoned engineer’s take
Always profile on the lowest-tier supported device. Your M3 MacBook running an iPhone 16 Simulator is not your audience. Your audience is the iPhone 12 mini with 4 GB of RAM, two upgrade cycles old. If the app is smooth there, it’s smooth everywhere. If you only profile on flagships, you’ll ship a stuttering app and not know it.
Set perf budgets per scenario. Cold launch < 400 ms. Feed first-frame < 300 ms. Image load to on-screen < 100 ms. Tap to screen-push < 16 ms (one frame at 60 Hz). Write them down in a PERF_BUDGETS.md. When you ship a feature, measure against the budget. You can’t manage what you don’t measure.
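The same budgets can also live in code, so a script or test can compare them with measured numbers. This is a sketch: `PerfBudget`, the scenario keys, and `violations` are hypothetical names, and the measurements would come from wherever you record them (Instruments exports, signpost data, or test metrics).

```swift
struct PerfBudget {
    let scenario: String
    let limitMs: Double
}

// The budgets from the text above, keyed by made-up scenario names.
let budgets = [
    PerfBudget(scenario: "coldLaunch", limitMs: 400),
    PerfBudget(scenario: "feedFirstFrame", limitMs: 300),
    PerfBudget(scenario: "imageLoadOnScreen", limitMs: 100),
    PerfBudget(scenario: "tapToPush", limitMs: 16),
]

/// Returns one message per scenario whose measurement exceeds its budget.
/// Scenarios without a measurement are skipped, not treated as failures.
func violations(measuredMs: [String: Double], budgets: [PerfBudget]) -> [String] {
    budgets.compactMap { budget -> String? in
        guard let measured = measuredMs[budget.scenario], measured > budget.limitMs else {
            return nil
        }
        return "\(budget.scenario): \(measured) ms exceeds budget of \(budget.limitMs) ms"
    }
}
```

A CI step could fail the build whenever `violations` returns a non-empty list for numbers collected on a known device baseline.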
Wire os_signpost to your top 10 critical paths on day one. App startup, login, feed load, image cache hit/miss, push tap handling, deep link resolution, screen transitions. Then any perf investigation starts with “let me look at the existing signposts” — not “let me instrument this from scratch under pressure.”
TIP: Build an OSSignposter helper for measuring async functions in a single line: await signposter.withInterval("loadFeed") { await loadFeed() }. Two-line wrapper, used everywhere. Means you’ll actually add signposts instead of skipping them for “later.”
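One possible shape for that helper, as a sketch. `withInterval` is not an Apple API; it is a hypothetical extension built on the real `beginInterval`, `endInterval`, and `makeSignpostID` calls. The availability annotation assumes OSSignposter's iOS 15 / macOS 12 minimum.

```swift
import os

@available(iOS 15.0, macOS 12.0, *)
extension OSSignposter {
    /// Hypothetical helper: wraps an async operation in a signpost interval
    /// and returns (or rethrows) whatever the operation produces.
    func withInterval<T>(
        _ name: StaticString,
        _ operation: () async throws -> T
    ) async rethrows -> T {
        let state = beginInterval(name, id: makeSignpostID())
        // defer closes the interval on both the success and the error path.
        defer { endInterval(name, state) }
        return try await operation()
    }
}
```

The `defer` is the design choice that matters: an interval that is never ended shows up as open-ended in Instruments, so closing it on every exit path keeps the timeline trustworthy.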
WARNING: Don’t profile in the iOS Simulator for CPU- or GPU-heavy work. The Simulator runs on your Mac’s CPU (which is much faster than any device); GPU translation is approximate. Always profile on a physical device for any perf claim you’ll act on.
Interview corner
Question: “How would you investigate a report of scroll stuttering in a feed?”
Junior answer: “I’d add print statements to the cell layout code and see what’s slow.” → Won’t pass — print doesn’t quantify.
Mid-level answer: “I’d profile in Instruments with the Time Profiler template on a physical lower-tier device, in a Release build. I’d start recording, scroll the feed for ~10 seconds, stop, and look at the call tree with system libraries hidden. The top functions by weight are the candidates. If image decoding is high, that points to synchronous decode; if drawRect: is high, that points to Core Graphics work happening per-cell; if layoutSubviews dominates, that’s AutoLayout. I’d cross-check with Animation Hitches to confirm the frames being missed correlate with the heavy work.” → Strong.
Senior answer: Plus: “I’d also instrument the scroll lifecycle with os_signpost so the timeline shows when each cell appears, which lets me correlate CPU spikes with specific cells. Beyond Instruments, I’d reach for MetricKit to confirm whether real users experience the same hitches — sometimes local repro is a Simulator-only artifact. Once I have a fix, I’d put a perf budget in CI: a script that fails the build if scroll hitches exceed a threshold on a known device baseline. Otherwise the regression will sneak back in 6 months later.” → Senior signal: production data + perf budgets in CI.
Red-flag answer: “I’d add a usleep(1000) to slow things down so it doesn’t look like it’s stuttering.” → That candidate is going to be the source of your perf regressions, not the solution.
Lab preview
Lab 2.3 gives you a starter app with a deliberate CPU hotspot and a deliberate memory leak. You’ll use Time Profiler + Allocations to find both, fix them, and confirm the fix in a second profiling run.
Next: the device that makes the bug reproduce — and the simulator that hides it. → Simulator vs device