7.7 — ARKit basics
Opening scenario
A retail client wants an “place this couch in your living room” feature. You open Xcode, see ARKit, RealityKit, SceneKit, Reality Composer Pro, USDZ, QuickLook — and wonder which one you’re supposed to use. The answer: in 2026, RealityKit on top of ARKit for new code, with QuickLook + USDZ for any “view this product in AR” feature that doesn’t need custom interaction. SceneKit still ships but is in maintenance; SpriteKit is unrelated (2D). This chapter is the just-enough-to-be-dangerous tour.
| Context | What it usually means |
|---|---|
| Reads “ARView / RealityView” | Has placed objects in AR |
| Reads “plane detection” | Knows the world-tracking basics |
| Reads “anchors / entities” | Has structured a scene |
| Reads “occlusion / people occlusion” | Has worked on realism |
| Reads “image / object tracking” | Has built marker-based AR |
Concept → Why → How → Code
Concept
- ARKit — the sensor + tracking layer. Owns the camera feed, world tracking, plane/image/object/face/body detection, LiDAR depth, ARCoachingOverlay.
- RealityKit — the rendering + simulation layer. PBR materials, physics, audio, ECS-style
Entitymodel. - RealityView (SwiftUI, iOS 18+) — modern SwiftUI host. Older code uses
ARView(UIKit). - Reality Composer Pro — visual scene editor; outputs
.realityand USDZ bundles. - QuickLook + USDZ — drop a USDZ file into a SwiftUI view (
ARQuickLookView) and Apple handles AR-place, scale, rotate, share. Zero custom code.
Why
ARKit is the only way to use the device’s full sensor fusion (camera, gyro, IMU, LiDAR) for AR. RealityKit gives you a modern, Swift-native scene graph with PBR rendering that visually matches Apple’s reference apps without authoring an OpenGL/Metal renderer.
How — capabilities & Info.plist
NSCameraUsageDescription = "AR features require the camera to overlay 3D content on your view."
ARKit availability check:
import ARKit
guard ARWorldTrackingConfiguration.isSupported else {
// Older device — fall back to 2D or QuickLook USDZ
return
}
Simplest AR — QuickLook with a USDZ
import QuickLook
import SwiftUI
struct ARQuickLookView: UIViewControllerRepresentable {
let url: URL // path to a .usdz file
func makeUIViewController(context: Context) -> QLPreviewController {
let controller = QLPreviewController()
controller.dataSource = context.coordinator
return controller
}
func updateUIViewController(_ controller: QLPreviewController, context: Context) {}
func makeCoordinator() -> Coordinator { Coordinator(url: url) }
final class Coordinator: NSObject, QLPreviewControllerDataSource {
let url: URL
init(url: URL) { self.url = url }
func numberOfPreviewItems(in c: QLPreviewController) -> Int { 1 }
func previewController(_ c: QLPreviewController, previewItemAt index: Int) -> QLPreviewItem {
url as QLPreviewItem
}
}
}
Tap the AR button, the user gets free plane detection, scale, rotate, occlusion, share. 90% of e-commerce “view in your room” features should stop here.
Custom AR with RealityView (iOS 18+)
import RealityKit
import ARKit
import SwiftUI
struct PlaceCubeView: View {
var body: some View {
RealityView { content in
// One-time setup
let arConfig = ARWorldTrackingConfiguration()
arConfig.planeDetection = [.horizontal]
arConfig.environmentTexturing = .automatic
content.camera = .spatialTracking
// Anchor will attach to first detected horizontal plane
let anchor = AnchorEntity(plane: .horizontal, classification: .floor, minimumBounds: [0.2, 0.2])
let mesh = MeshResource.generateBox(size: 0.1)
let material = SimpleMaterial(color: .systemTeal, isMetallic: false)
let model = ModelEntity(mesh: mesh, materials: [material])
anchor.addChild(model)
content.add(anchor)
}
.ignoresSafeArea()
}
}
Older path — ARView (UIKit, still common in 2026 codebases)
import ARKit
import RealityKit
final class ARSceneController: UIViewController {
private let arView = ARView(frame: .zero)
override func viewDidLoad() {
super.viewDidLoad()
view.addSubview(arView)
arView.frame = view.bounds
arView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
let config = ARWorldTrackingConfiguration()
config.planeDetection = [.horizontal, .vertical]
config.frameSemantics = [.personSegmentationWithDepth] // people occlusion (A12+)
arView.session.run(config)
let coaching = ARCoachingOverlayView()
coaching.session = arView.session
coaching.goal = .horizontalPlane
coaching.frame = view.bounds
view.addSubview(coaching)
let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
arView.addGestureRecognizer(tap)
}
@objc func handleTap(_ tap: UITapGestureRecognizer) {
let point = tap.location(in: arView)
guard let result = arView.raycast(from: point, allowing: .estimatedPlane, alignment: .horizontal).first else { return }
let anchor = AnchorEntity(world: result.worldTransform)
let box = ModelEntity(mesh: .generateBox(size: 0.1),
materials: [SimpleMaterial(color: .systemPink, isMetallic: false)])
box.generateCollisionShapes(recursive: true)
anchor.addChild(box)
arView.scene.addAnchor(anchor)
arView.installGestures([.translation, .rotation, .scale], for: box)
}
}
Image tracking — marker-based AR
let config = ARImageTrackingConfiguration()
guard let images = ARReferenceImage.referenceImages(inGroupNamed: "ARImages", bundle: nil) else { return }
config.trackingImages = images
config.maximumNumberOfTrackedImages = 4
arView.session.run(config)
Reference images live in Assets.xcassets → AR Resource Group, tagged with their physical size (the tracker needs the real-world width).
Object tracking & body / face tracking
ARObjectScanningConfiguration+ARReferenceObject— scan a 3D object once, then track in subsequent sessions.ARFaceTrackingConfiguration— front camera, requires A12+ Bionic. Powers Memoji, FaceTime filters.ARBodyTrackingConfiguration— full skeleton tracking, A12+.
LiDAR niceties (Pro iPhones, iPads)
config.sceneReconstruction = .meshWithClassification
config.frameSemantics.insert(.sceneDepth)
Gives you a real-time mesh of the room with surface classification (wall/floor/ceiling/window). Use for occlusion of virtual objects behind real geometry and for physics interactions with real walls.
visionOS — same APIs, different scale
On visionOS, RealityKit is the primary UI framework, not a special-case AR layer. Most of the scene-graph code transfers; the input model (gaze + pinch) and the spatial UX patterns are different. Covered in Chapter 12.
In the wild
- IKEA Place, Wayfair View in Room — QuickLook USDZ + custom Reality Composer scenes.
- Apple Maps Look Around — not strictly AR but uses related camera+geo fusion.
- Snapchat / Instagram AR filters — they ship their own engine on top of
ARFaceTrackingConfiguration. - Measure (Apple) — pure ARKit + raycast + simple geometry.
- Pokémon Go — runs both ARCore on Android and ARKit on iOS, falls back to gyro-only.
Common misconceptions
- “AR drains the battery so much I shouldn’t ship it.” Modern devices handle 15-30 minutes of AR fine. Be mindful of idle AR — don’t keep
ARSessionrunning on a screen the user isn’t interacting with. - “SceneKit is the modern choice.” SceneKit is in maintenance. RealityKit is where Apple invests; new features (Object Capture, hover effects, visionOS) land in RealityKit.
- “I need a Mac with USDZ tools to make a model.” Reality Composer Pro (free with Xcode) and even SwiftUI’s
Model3Dview handle USDZ. For complex models, Blender → USDZ via the official exporter works. - “Plane detection works in any lighting.” It needs visual features. Plain white walls, glossy floors, and low-light all degrade tracking. Use
ARCoachingOverlayViewto guide the user. - “ARKit can save the user’s AR session to share with another user.” It can — via
ARWorldMapserialization — but only between the same user’s devices on iOS. Cross-user shared AR isMultipeerConnectivity+ sharing collaboration data.
Seasoned engineer’s take
The hardest part of shipping AR is the UX before the AR session starts. Users don’t know they need to wave the phone around in good light; they tap the AR button, see a black screen for 3 seconds, and bounce. Standard mitigations:
ARCoachingOverlayView— Apple’s official “move your phone” UI. Always present it.- Pre-AR onboarding — a single still-frame explainer: “Point at the floor and move your phone slowly.”
- Fallback to 3D-only mode when tracking fails (Reality Composer scene viewed without the camera background).
Engineering: keep the ARKit code in a dedicated controller, don’t sprinkle anchors across view models. ARSession is a long-lived stateful object; restarting it costs ~500ms of re-tracking, which feels like a freeze.
For most apps that “want AR,” the answer is QuickLook + USDZ. Custom RealityKit work is justified when you need interactive behavior (drag-to-resize beyond QuickLook’s, multi-object scenes, physics) or specialized tracking (images, objects, body).
TIP: Cache USDZ files in
Application Support, not in the bundle, when products are downloaded dynamically. Bundle USDZs inflate IPA size dramatically and trigger over-the-air install warnings at 200MB.
WARNING: Never store raw
ARFramecapturedImage data to disk without a strong privacy reason and user consent. The camera frame contains the user’s living room.
Interview corner
Junior: “What’s the simplest way to let a user place a 3D product in their room?”
Ship a USDZ file and present it with
QLPreviewController. The system handles AR-place, scale, rotate, occlusion, and the AR button. No custom AR code needed.
Mid: “How would you let the user tap to place a model and then drag, rotate, scale it?”
Run an
ARWorldTrackingConfigurationwith horizontal-plane detection. On tap,arView.raycast(from:point, allowing: .estimatedPlane, alignment: .horizontal)returns a world transform; create anAnchorEntity(world:)and aModelEntity. CallgenerateCollisionShapeson the entity, thenarView.installGestures([.translation, .rotation, .scale], for: entity). UseARCoachingOverlayViewto guide the user through plane detection.
Senior: “Design an AR retail app that ships hundreds of products, with reliable performance on a 5-year-old iPhone XS.”
Two-tier model. (1) Lightweight catalog: thumbnails + USDZ files hosted on a CDN, downloaded on demand into Application Support, capped to ~50MB cache with LRU eviction. (2) AR mode: QuickLook for default “place in room”; custom RealityView only for SKUs that need configurator behavior (color swap, modular assembly). Tracking config: horizontal-plane only on older devices; horizontal+vertical+person-occlusion on A12+. Reserve LiDAR-only features for Pro devices. Optimize USDZ assets: <5MB per model, materials baked, normal maps preferred over high-poly geometry. Show
ARCoachingOverlayViewalways; on tracking failure, fall back to a 360° spin viewer (no camera background) so the user still sees the product. Telemetry: log “AR placed”, “AR dismissed without place”, “tracking lost” to identify catalog SKUs with broken assets.
Red flag: “We use SceneKit because it’s been around longer and is more stable.”
In 2026, Apple is investing in RealityKit, not SceneKit. New AR features (people occlusion, scene reconstruction, hover effects, visionOS) land in RealityKit first. SceneKit still works but is a dead-end choice for greenfield code.
Lab preview
ARKit doesn’t have a dedicated lab in Phase 7 (the 4 labs prioritize the more common patterns), but the Lab 7.1 — Weather + Map app stretch goal includes “place a 3D weather icon in AR at the location’s coordinate” — a 50-line addition that exercises this chapter end-to-end.
Next: 7.8 — CoreML & Create ML