Experimenting with Object Capture in Xcode … Puppy!

A while ago I heard about the Object Capture feature in Apple’s RealityKit, and wanted to see how it performs with a few things I encounter in the real world. The Object Capture API itself is free to Mac owners using Xcode, and there are multiple videos online explaining its usage, including one from WWDC 2021. Basically, you can take several photos of any object from multiple angles, and the software will process all of those photos and generate a 3D model of that object through a process referred to as “photogrammetry.” Pretty incredible when you think about it.

Having just moved to Bilbao last week, what better object to try this with than Puppy, the fan-favorite statue outside the Guggenheim Museum. The WWDC video I linked provides better instructions on how this is all done, and I’ll leave the source code descriptions to others (you can find plenty if you look), but the workflow itself goes something like this:

  • Take several photos (usually dozens) of your subject from many different angles. You can see a few of the photos I took of Puppy above; I used 18 in total.
  • In your script, specify the file paths for your folder of photos, and names of the files you will generate.
  • Create a PhotogrammetrySession object that points to your folder, along with any other configuration details.
  • Optionally create an asynchronous task to display status as the process runs, to entertain you for a couple minutes while you wait.
  • Run the .process method with arguments defining the outputs and the level of detail (see the sketch just below).
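To make that concrete, here is roughly what such a script can boil down to, written as a command-line-style Swift sketch. The folder and file paths are placeholders (substitute your own), and the choice of medium detail is just one of several options, so treat this as a starting point rather than the definitive version; the WWDC sample code is more complete.

```swift
import Foundation
import RealityKit

// Folder of input photos and the output model path -- placeholders,
// substitute your own locations.
let inputFolder = URL(fileURLWithPath: "/path/to/PuppyPhotos", isDirectory: true)
let outputFile = URL(fileURLWithPath: "/path/to/Puppy.usdz")

// Configuration for the session; the photos were not taken in a strict sequence.
var configuration = PhotogrammetrySession.Configuration()
configuration.sampleOrdering = .unordered

do {
    // Create the session that points at the photo folder.
    let session = try PhotogrammetrySession(input: inputFolder,
                                            configuration: configuration)

    // Optional: watch the output stream so there is something to look at
    // while the model is being built.
    Task {
        for try await output in session.outputs {
            switch output {
            case .requestProgress(_, let fractionComplete):
                print("Progress: \(Int(fractionComplete * 100))%")
            case .requestError(_, let error):
                print("Request failed: \(error)")
            case .processingComplete:
                print("Done.")
                exit(0)
            default:
                break
            }
        }
    }

    // Kick off processing: a single request for a medium-detail .usdz model.
    try session.process(requests: [.modelFile(url: outputFile, detail: .medium)])
} catch {
    print("Could not create or run the session: \(error)")
    exit(1)
}

// Keep the script alive until the session reports completion.
RunLoop.main.run()
```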

And, after all that, it actually worked! Below is a 3D model of Puppy in .usdz format that I generated, being rendered inside of Xcode.

I found I was also able to render the same model directly on my iPhone. There’s even an AR mode where you can view the object in the context of the room you’re in. Here is Puppy sitting alongside my laptop at the table I’m working on.
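As a rough illustration of that AR view (a sketch of my own, not how the stock viewer is implemented, and the “Puppy” asset name is an assumption), the generated .usdz can be loaded into a RealityKit ARView on iOS and anchored to a horizontal plane such as the tabletop:

```swift
import RealityKit
import UIKit

// A minimal sketch: load the generated .usdz (assumed to be bundled as
// "Puppy.usdz") into an ARView and anchor it on the first horizontal
// plane ARKit finds -- e.g. the table.
final class PuppyARViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()

        let arView = ARView(frame: view.bounds)
        arView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(arView)

        // Load the model produced by the photogrammetry session.
        guard let puppy = try? Entity.loadModel(named: "Puppy") else {
            print("Could not load Puppy.usdz from the app bundle.")
            return
        }

        // Anchor the model to a horizontal plane in the real world.
        let anchor = AnchorEntity(plane: .horizontal)
        anchor.addChild(puppy)
        arView.scene.addAnchor(anchor)
    }
}
```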

This is an incredibly powerful feature of Xcode and the Swift programming language, and one I look forward to experimenting with further. As a mechanical engineer, I’ve pondered its potential as a reverse engineering tool. Say that rather than an artistic sculpture, I had found a machine part that I wanted to replicate but didn’t have the drawing for. Or maybe I know something is made of steel and want to know approximately what it weighs, but don’t have a scale handy. Or maybe it’s a large item that is fixed to the ground. These aren’t new ideas; optical scanners have been around for many years now, but being able to do it with your cell phone camera is undeniably cool.
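To make the weight idea concrete, here’s a back-of-the-envelope sketch; the volume figure is a made-up placeholder standing in for whatever a CAD tool reports after importing the scanned mesh:

```swift
// Rough mass estimate from a scanned volume and a known material density.
// The volume below is a placeholder, not a real measurement.
let estimatedVolume = 0.0025      // m^3, assumed to come from the scanned mesh
let steelDensity = 7850.0         // kg/m^3, typical for carbon steel
let estimatedMass = estimatedVolume * steelDensity
print("Estimated mass: \(estimatedMass) kg")   // ≈ 19.6 kg
```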


A Few Challenges and Shortcomings

I was altogether impressed with Object Capture, but also observed a few things that I hope might see continued development and improvement, along with a couple of lessons learned from my own usage. First, it’s important to note that while you may use your phone to take the pictures, the API itself is macOS only. I built up my script in a Swift Playground, which, when using the default settings, would display an error when trying to instantiate the PhotogrammetrySession object.

This is fixed by changing the platform to macOS, which is in a panel on the right-hand side of the Swift Playground.
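Even with the platform set correctly, it can be worth guarding on hardware support before building the session. This is a small addition of my own rather than something the original script required:

```swift
import Foundation
import RealityKit

// Object Capture is macOS-only and also requires capable hardware,
// so checking support up front gives a clearer failure than an error
// deep inside session creation.
guard PhotogrammetrySession.isSupported else {
    print("Object Capture is not supported on this machine.")
    exit(1)
}
```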

For capturing the subject itself, the WWDC video in fact recommends something on the order of 100 images for best results, greatly exceeding the 18 that I got of Puppy. You can see why if you look more closely at the rendering. In the three images below:

  • The top of Puppy’s head is filled in with gray when viewed from above. This is as expected: Puppy is quite tall, and I didn’t get any photos from above.
  • The fence around Puppy gets superimposed on Puppy’s legs. I imagine that in a flat image, the reconstruction process has no way of differentiating between the fence in front and the subject behind. I wonder whether taking the same angles at a few different distances would have let it see past the fence.
  • A section of Puppy’s belly appears to be missing. This is on the shadow side. I noticed this with another object I tried scanning: if there is too much shadow to discern the form, the process just plugs in empty space.

I should also say that Puppy was a “good” example where the process “just worked.” That wasn’t the case in all my attempts; in fact, my first try was with Maman, the spider statue on the other side of the Guggenheim. I’m not even sure what I’m looking at here.

That’s not really a fair example, though: Maman has a substantially more complicated geometry, with a whole lot more clutter in the background from buildings, bridges, and people. Altogether, learning Object Capture has been a positive experience, and it’s a tool that I hope to make use of in my iOS application development career.
