Vision framework API, VNDetectHumanHandPoseRequest, VNDetectHumanBodyPoseRequest, person segmentation, face detection, VNImageRequestHandler, recognized points, joint landmarks, VNRecognizeTextRequest, VNDetectBarcodesRequest, DataScannerViewController, VNDocumentCameraViewController, RecognizeDocumentsRequest
Install: `npx add-skill https://github.com/CharlesWiltgen/Axiom/blob/main/.claude-plugin/plugins/axiom/skills/axiom-vision-ref/SKILL.md -a claude-code --skill axiom-vision-ref`

Installation paths: `.claude/skills/axiom-vision-ref/`

# Vision Framework API Reference

Comprehensive reference for Vision framework computer vision: subject segmentation, hand/body pose detection, person detection, face analysis, text recognition (OCR), barcode detection, and document scanning.

## When to Use This Reference

- **Implementing subject lifting** using VisionKit or Vision
- **Detecting hand/body poses** for gesture recognition or fitness apps
- **Segmenting people** from backgrounds or separating multiple individuals
- **Face detection and landmarks** for AR effects or authentication
- **Combining Vision APIs** to solve complex computer vision problems
- **Looking up specific API signatures** and parameter meanings
- **Recognizing text** in images (OCR) with VNRecognizeTextRequest
- **Detecting barcodes** and QR codes with VNDetectBarcodesRequest
- **Building live scanners** with DataScannerViewController
- **Scanning documents** with VNDocumentCameraViewController
- **Extracting structured document data** with RecognizeDocumentsRequest (iOS 26+)

**Related skills**: See `axiom-vision` for decision trees and patterns, `axiom-vision-diag` for troubleshooting

## Vision Framework Overview

Vision provides computer vision algorithms for still images and video.

**Core workflow**:

1. Create request (e.g., `VNDetectHumanHandPoseRequest()`)
2. Create handler with image (`VNImageRequestHandler(cgImage: image)`)
3. Perform request (`try handler.perform([request])`)
4. Access observations from `request.results`

**Coordinate system**: Lower-left origin, normalized (0.0-1.0) coordinates

**Performance**: Run requests on a background queue. Vision is resource-intensive and blocks the UI if run on the main thread.

## Subject Segmentation APIs

### VNGenerateForegroundInstanceMaskRequest

**Availability**: iOS 17+, macOS 14+, tvOS 17+, visionOS 1+

Generates a class-agnostic instance mask of foreground objects (people, pets, buildings, food, shoes, etc.)
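
**Coordinate conversion (illustrative)**: Because Vision reports normalized coordinates with a lower-left origin (see the overview), results usually need converting before they are drawn in a top-left-origin space such as UIKit. The following is a minimal sketch of that conversion, not part of the Vision API. The helper names `topLeftPoint(fromNormalized:imageSize:)` and `topLeftRect(fromNormalized:imageSize:)` are hypothetical, and the sketch assumes the image is displayed unscaled and unrotated.

```swift
import Foundation

/// Hypothetical helper (not a Vision API): converts a normalized point with a
/// lower-left origin (0.0-1.0 on each axis) into pixel coordinates with a
/// top-left origin.
func topLeftPoint(fromNormalized point: CGPoint, imageSize: CGSize) -> CGPoint {
    CGPoint(x: point.x * imageSize.width,
            // Flip the y axis: 0.0 at the bottom becomes `height` at the top-left origin.
            y: (1.0 - point.y) * imageSize.height)
}

/// Hypothetical helper (not a Vision API): the same conversion for a
/// normalized bounding box. The rect's origin is its lower-left corner in
/// Vision's space, so the flipped y is measured from the rect's top edge.
func topLeftRect(fromNormalized rect: CGRect, imageSize: CGSize) -> CGRect {
    CGRect(x: rect.origin.x * imageSize.width,
           y: (1.0 - rect.origin.y - rect.size.height) * imageSize.height,
           width: rect.size.width * imageSize.width,
           height: rect.size.height * imageSize.height)
}

// Usage: a point three quarters of the way up a 1000x800 image
// maps to x: 250, y: 200 in top-left pixel coordinates.
let converted = topLeftPoint(fromNormalized: CGPoint(x: 0.25, y: 0.75),
                             imageSize: CGSize(width: 1000, height: 800))
```

If the image is scaled or letterboxed inside a view, an additional transform from image pixels to view points would be needed on top of this.
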
#### Basic Usage

```swift
let request = VNGenerateForegroundInstanceMaskRequest()
let handle