This presentation is part of the 2017 3D Digital Documentation Summit.

Cloud Based 3D Digital Photogrammetry Pertev Paşa Mosque (Izmit, Turkey)

Jonathan:            Good afternoon. I’ve really enjoyed today and hopefully we’ll be able to share with you a project that’s got slightly different goals than some of the things we were looking at earlier today. We’re going to be talking about a project we worked on outside of Istanbul in Turkey, the Pertev Paşa mosque, where we were documenting … This mosque is part of a broader effort for planned conservation of the city center of Izmit as well as several other mosques in the region.

When we started this project, we established a few goals and one of them was, “What can we do to document this mosque very quickly with a great deal of accuracy and using standard technology?” We didn’t have the ability to carry in large pieces of equipment, we didn’t have high-end computers and we wanted to do this quickly, cost-effectively, and we wanted to create a 3D digital model.

Ultimately, as we’ll talk about a little bit later in the presentation, the goal became to turn this into a virtual environment. So, we were able to turn this into a VR environment. Unfortunately, you’ll have to see that on a monitor, because to bring all the rest of the equipment to create that virtual environment was a little bit overbearing.

The idea of 3D documentation, this idea of photogrammetry, isn’t anything new. It’s been around, as we heard a little bit earlier, for quite a number of centuries, but it was really in the 19th century that it took hold, with this idea of overlaying multiple photographs to create a three-dimensional model of an object, whether that’s something that’s interpreted, whether you’re developing precise photogrammetry, or you’re presenting precise measurements. And it was relatively the same process that we were using even when I was in school in the ’80s, when we were doing stereo-photogrammetry, overlaying two images to create a three-dimensional object. It’s really not until the late ’80s into the ’90s that we started to get into the digital technology. But even then, it was really dealing with overlaying two objects and then tracing or creating a two-dimensional drawing from this idea of photogrammetry.

One of the most significant developments that we have right now, and we’ve alluded to it in several of the previous presentations, is this idea of using these photographs to develop highly accurate three-dimensional digital models, doing this through a series of two-dimensional photographs. This is the idea of what we call convergent photogrammetry: multiple viewpoints that can be correlated to create a three-dimensional representation.

Fundamentally what we’re talking about is two algorithms in contemporary 3D photogrammetry. The first one is what we call a structure from motion algorithm and the second one is a multi-view stereo algorithm. The structure from motion algorithm takes a series of photographs, or takes a photograph, and based on the lighting, based on the physical parameters of the image, will determine the physical characteristics, the geometry of the camera and the location of the photograph relative to the position of the object.

When you take multiple views and multiple photographs, it will then construct this three-dimensional matrix which will locate where each of those photographs was taken from, and then it can start to stitch together common points within each photograph to create a geometry. Based on that framework, that matrix that’s developed by structure from motion, the software then uses the next level, which is a multi-view stereo algorithm. What the multi-view stereo algorithm does is it looks at the color, the shadowing, the lighting parameters of the photograph and it comes up with the most likely three-dimensional shape that explains the photographs that were taken. These have been in development for actually about 10-15 years and just in the last, say, 3-5 years they’ve become extraordinarily sophisticated and very accurate.
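To make the structure from motion idea a little more concrete: once the positions of two photographs are known, a feature matched in both can be located by intersecting the two viewing rays. The Python sketch below illustrates only that single geometric step, under simplifying assumptions (the camera centres and ray directions are already known and given as plain tuples). The name `triangulate_midpoint` is hypothetical and is not taken from any package mentioned in the talk; real software solves for many cameras and points at once.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def _scale(a, s):
    return tuple(x * s for x in a)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Locate a point seen from two cameras.

    c1, c2: camera centres (x, y, z); d1, d2: viewing directions toward the
    matched feature (they need not be unit length). Returns the midpoint of
    the shortest segment joining the two rays, since rays from real photos
    rarely cross exactly.
    """
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the two views give no depth information")
    s = (b * e - c * d) / denom  # distance parameter along ray 1
    t = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = _add(c1, _scale(d1, s))
    p2 = _add(c2, _scale(d2, t))
    return _scale(_add(p1, p2), 0.5)

if __name__ == "__main__":
    # Two cameras two units apart, both sighting the point (1, 0, 5).
    print(triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 0.0, 5.0),
                               (2.0, 0.0, 0.0), (-1.0, 0.0, 5.0)))
```

The parallel-ray check reflects a practical point: two photographs taken looking in the same direction contribute little to locating a feature, which is one reason varied vantage points matter.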

The fundamental process when you’re dealing with photogrammetry and you’re dealing with creating this three-dimensional shape is you’re going to first collect your images, or your photo sets; the computer algorithms will compute the parameters for each camera image, next create this matrix, and finally construct the three-dimensional form based on that set of images. But it’s also important to note that the accuracy of this final three-dimensional model is only as good as the quality of your images and the number of images that you used. If you’re using a cellphone and you can only upload maybe 40 photographs, you’re not going to get as accurate a model, versus some of the software that was mentioned earlier, which has, for practical purposes, no limit to how many photographs you can upload, and the level of detail you can get is obviously relative.

There’s a lot of software out there, a lot of applications. So, when we’re thinking about this, we’re not creating software, we’re just trying to think about how we can use that within our cultural heritage documentation and interpretation. There’s everything from the simple cellphone applications that you may see all the way to some much more sophisticated software. These are changing all the time. For example, the Autodesk 123 is a very common one; it was very appropriate for, say, your cellphone or some very simple photogrammetry. I think it had a maximum of about 40 pictures for your cellphone application, 70 for the desktop, and we just got an email within the last few weeks that says they’re going to be phasing that out.

Chris:                     It’s discontinued already.

Jonathan:            It’s already disc … Actually by March 31st that has been phased out. So you can still mine your data, but you’re not going to be able to create that. And that was sort of a staple, a lot of people were using that. When we started our project, when we were in Turkey, we were using Autodesk Recap Photo, which is a little bit more high-end; you can get up to about 250 photos per photo set. That was good till about last year. Now they’ve [remorphed 00:07:20] it, and it’s now become Autodesk Remake. There are some differences in how the interfaces work. You have to download an application to your computer, it’s not all cloud-based, and then you have your options for cloud-based versus local processing of your data, which can add some advantages or disadvantages depending on how you want to work with it.

There’s a variety of different methods. We’re using SLR cameras to collect our data, and we’re using Autodesk Remake, which does have a limitation of 250 photos per photo set that it will work with, but there are also others, like the earlier-mentioned Agisoft PhotoScan, which is a little bit more powerful; it can use many more photographs in its photo sets and get larger models created. Once we use this, we process these photos … So, we’ve collected our data, we’ve processed it into the initial three-dimensional digital model, and we then go through a series of post-processing refinements where we actually have to manipulate that model to make it accurate. There’s a lot of distortions that have to be corrected, there’s some geometries that can be simplified, there’s some geometries that can be overlaid as well, and it all depends on what that final use is. We’ll go into a lot more detail on that a little bit later. We’re using [Rhino 00:08:50], we’re using Autodesk Maya and Mudbox. Maya is three-dimensional modeling, Mudbox deals more with surfaces, and then also sometimes we’ve been getting into using some Makerware.

We’ll talk a little bit about the mosque to give you some perspective on what we’re presenting today. We’re at the eastern edge of the Sea of Marmara in the town of Izmit. For this project, our data collection all happened in September of 2015. We were part of a partnership with Kocaeli University in Izmit, Turkey that had this large project funded by the Scientific and Technological Research Council of Turkey, known as TUBITAK. We were looking at a series of mosques as well as some urban centers surrounding these mosques.

Kocaeli, the region of Kocaeli in the city of Izmit, has a very long history. It was first settled around 1200 BCE and became very prominent during the Roman period; in fact it was one of the four regional capitals during Roman times. During the Byzantine era it kind of fell off in importance as Constantinople rose to importance; a lot of different people came in, there were a lot of raids, a lot of destruction, so what we see from the Roman period really is pretty minimal. Again, during the Ottoman era it grew to prominence and it continues with that prominence today. It’s really a gateway city. It’s a gateway between the ports and the roads for trade and traveling East to West and vice versa. It did go through some serious earthquakes. There’s been a lot of serious earthquakes there, but the most recent one was in 1999, and that’s what prompted the initiative that we’ve been part of.

The mosque itself was built in the latter part of the 16th century and was part of a larger complex that included a mosque, a poor house, a travelers’ lodging, a bath, school, market, fountains and a kitchen. Today, really only the fountains and a portion of the school remain, as well as some of the walls and gates and the mosque itself. The interior of the mosque is what we’re really going to be focusing on today. We did look at both the interior and the exterior, and we’ll talk at the very end about some of the challenges that we had with that in dealing with the photogrammetry. But we focused mainly on this room called the Musalla, which is the prayer hall, and on documenting the prayer hall.

There’s a lot of unique features, and there’s a lot of commonalities, and these become very important on how you process the data and what we need to ensure that the 3D documents really are accurate. Some of the things that you’ll notice that are important is you got the band of calligraphy, that on first glance may look very similar, but it’s all about unique lettering and unique text, which becomes critical in trying to match up those photographs. You’ll also notice that we have what we’ll call squinches, which are these four half domes at the four corners of the building. And, again, those become very critical that inside there’s decorative calligraphy in those as well, and that became a critical component on how we were able to match up and how the software was able to accurately create the model.

With that I’m going to turn it over to Chris and he’s going to walk you through some of the processes that we went through and some of the captures that we were able to create.

Chris:                     Sure. As we got into the photogrammetry process, we broke it down. One of the other goals while we were there was to teach the students a documentation method that they can take with them. We had to achieve that from start to finish, including an example project, within a week. So photogrammetry suited itself pretty well to that, whereas something like laser scanning is a lot more time-intensive both to capture and then process. So, photogrammetry lent itself pretty well.

This was an example of … We went out and I sketched the mosque and its grounds, and then we assigned some different smaller portions … We made what was ultimately an unsuccessful attempt to capture the outside of the mosque that was hindered mostly by a lot of trees that made it difficult to ever get any decent matchable photographs from various vantage points. We broke the process of photogrammetry down into four pretty succinct steps. That is you plan your campaign, you capture your photos, you process it, you send it to whatever software program you’re using and then optionally there’s extra post-processing you can do. For us for the mosque that really didn’t end up happening until well after we returned back to the States after the trip.

Of course, depending on the size and complexity of the project you can repeat those steps as many times as you’d like. Indicated here you can see the … we planned to break down the porch and the sides, and the outside of the mosque into separate campaigns, just due to the limit of photographs you can submit at once. And then the idea would’ve been to combine all of those completed photogrammetry models into one super model, something very similar to the laser scanning method where you do multiple scans and then combine them later.
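The idea of breaking a large capture into separate campaigns that each fit under a per-submission limit can be sketched as follows. This is only an illustration: the 250-photo figure is the per-set limit quoted earlier in the talk, while the function name `split_into_photo_sets` and the `overlap` parameter (repeating a few photos at the seam between consecutive sets, on the assumption that shared imagery may help when the models are later combined) are hypothetical and not part of any Autodesk tool.

```python
def split_into_photo_sets(photos, max_per_set=250, overlap=0):
    """Split an ordered list of photos into sets no larger than max_per_set.

    overlap is the number of photos repeated at the start of each following
    set (an assumption made for illustration, not a documented requirement).
    """
    if max_per_set < 1:
        raise ValueError("max_per_set must be at least 1")
    if not 0 <= overlap < max_per_set:
        raise ValueError("overlap must be smaller than max_per_set")
    step = max_per_set - overlap
    sets = []
    start = 0
    while start < len(photos):
        sets.append(photos[start:start + max_per_set])
        if start + max_per_set >= len(photos):
            break  # the last set already reaches the end of the list
        start += step
    return sets

if __name__ == "__main__":
    names = [f"IMG_{i:04d}.jpg" for i in range(600)]  # hypothetical file names
    for number, photo_set in enumerate(split_into_photo_sets(names, overlap=25), 1):
        print(number, len(photo_set), photo_set[0], photo_set[-1])
```

In practice the campaigns in the talk were divided by area (porch, sides, exterior) rather than by simple count, so a split like this would only be a starting point.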

I’m sure a lot of you are already very familiar with the photogrammetry process, so I’ll try to keep it brief, but for those of you still learning, we taught a lot of the basics of how to do a photogrammetry campaign, and the golden rule that we’ve discovered along the way was that you need to always minimize the number of inconsistencies that you have, whether that’s your camera settings, whether that’s controlling the subject itself. Whatever you do, try to minimize any sort of change that happens from when you take your first photograph to when you take your last one.
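One simple way to act on that golden rule is to check, before uploading, whether any camera setting changed during the campaign. The sketch below assumes the settings have already been read from each photo into a plain dictionary; the keys shown (`iso`, `aperture`, `shutter`) and the function name `inconsistent_settings` are hypothetical examples, not the output of any particular tool.

```python
def inconsistent_settings(photos):
    """Return the settings that vary across a photo set.

    photos: list of dicts, one per photo, mapping a setting name to its value.
    The result maps each varying setting to the set of distinct values seen.
    """
    if not photos:
        return {}
    keys = set()
    for photo in photos:
        keys.update(photo.keys())
    varying = {}
    for key in sorted(keys):
        # A setting missing from a photo shows up as None and counts as a change.
        values = {photo.get(key) for photo in photos}
        if len(values) > 1:
            varying[key] = values
    return varying

if __name__ == "__main__":
    campaign = [
        {"iso": 400, "aperture": 8.0, "shutter": "1/60"},
        {"iso": 400, "aperture": 8.0, "shutter": "1/60"},
        {"iso": 800, "aperture": 8.0, "shutter": "1/60"},  # ISO changed mid-campaign
    ]
    print(inconsistent_settings(campaign))
```

A check like this only covers the camera side; changes in the subject or the lighting still have to be controlled in the field.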

Here’s your exposure settings. On one side you’ve got something overexposed and then underexposed. So then you end up with the problem of one picture looks like a black and blue dress and the other one it’s white and gold, and then everyone gets in a big fight and the software gives up and it goes home. The other one is shadows. Here I intentionally leaned over and got in the way of a light, and created a shadow that creates an artificial point that the software might try and latch on to and say, “Oh, this is a point, let’s look for it in other pictures.” It can’t find it, so it gives up.

Same sort of thing would happen on a really large building on a cloudy day or a partially cloudy day when you’ve got clouds moving in and out, you’ve got shadows that are appearing and disappearing. So, again, try and minimize that. Either shoot on a day with full sun or ideally when it’s overcast, it’s actually your best conditions.

The second thing to do is to determine which of the two basic methods of capture you need: one is inside-out, one is outside-in. Outside-in is when you’re looking inward at a sculpture and you can get around the outside of it and everything is … it generally ends up being a convex shape, more or less. And then inside-out tends to be a little more complex of a capture method, where you have to, I think the best method that we’ve developed so far is to move around a space that you’re trying to capture the interior of, or something that wraps around you, and move from spot to spot taking lots of panoramic pictures. Just kind of sweep, blast your camera all over the place.

The analogy that I kept trying to tell them was: imagine you’re spray painting and you want to put multiple coats on everything. So you can see this wall from here; now go over there, paint it, move over there, paint it, and keep moving on till you’ve seen everything a few times and covered it with multiple coats of paint.
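The "multiple coats" idea can be checked with a simple tally of how many photographs see each part of the space. The sketch below is a hypothetical bookkeeping aid, not something the capture software does for you: the surface names and photo numbers are invented, and the default threshold of three views is the figure Jonathan gives near the end of the talk for how many photographs each point should appear in.

```python
def under_covered(observations, min_views=3):
    """List surfaces seen in fewer than min_views distinct photographs.

    observations: dict mapping a surface name to the photo ids that show it.
    """
    return sorted(
        surface
        for surface, photo_ids in observations.items()
        if len(set(photo_ids)) < min_views
    )

if __name__ == "__main__":
    seen_in = {
        "floor": [1, 2, 3, 4],
        "dome": [5, 6, 7],
        "balcony wall": [8],  # only one coat of paint so far
    }
    print(under_covered(seen_in))
```

Anything the function returns is a candidate for another pass with the camera before leaving the site.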

For the mosque itself the method that we used … Actually most of the mosque was captured in one very quick round. The first day, when we were just touring around the entire site, the multiple mosques, we were able to enter the mosque and access only the balcony. So, while we were there our tour guides were telling us about the place, and I was moving around in between, and from these six points along the balcony … You’ve got these three balconies, and I stood at either end of each one, and just did a low pass, a mid pass and a high pass looking into the mosque itself. It was just on a whim with a little bit of planning, but then we processed them overnight and had a model that was effectively the floor, the back or front wall, the two sides and most of the dome, so that was almost 80% coverage of the entire interior of the mosque just from one very quick photo campaign.

That was entirely captured out of just 129 photographs. That was done in, I would say, probably about half an hour. And then a couple days later we were allowed to come back and access the entire mosque. So, this is looking essentially from the vantage point of one of those balconies and showing all the other photo locations that we were able to get to, including the mimbar; they were able to climb on the steps and get some high vantage points from lots of different angles, which was very advantageous, and then capture back towards the balcony, which, of course, you can’t capture when you’re standing on it.

This is a look at the interface of Autodesk Recap as it stands right now, because as we already mentioned, it’s a very dynamic field. So, the interfaces, the entire programs themselves are constantly changing. So this is an optional registration step where, once you have all your photos, you can pick two different pictures and then manually identify various points that you know show up in both. These are two different pictures, but we’ve identified where the railing meets this post, the top of this arch and the bottom corner of this doorway. We identified them in two photographs, and then you can manually do that through all 250 photographs with multiple points and end up with countless dozens of points, just to give the two algorithms a helping hand to get them started.

That’s the finished product, what it looks like in the Recap. So, as Jonathan mentioned earlier, one of the unique advantages we had that lent itself, it was a very lucky coincidence almost, was that the style of ornamentation in a mosque is always calligraphy and they do different calligraphy all the way around, which, to someone who does not read Arabic, it all looks very similar to me. But when you look at the geometry of it, it’s highly variable, which creates essentially a tracking target for the software to latch on to.

This is a closeup example of those four squinches. There’s the calligraphy inside the dome and then these two spots also on either side of each dome, and each one is unique so that when you captured a photograph that partially contained one of the squinches, the software was able to identify which dome, because as you look at the mosque, the bottom has a bilateral symmetry, so either side reflects itself. When you get up to the level of the squinches, you’ve got four directional rotational symmetry and then when you get up to the dome itself it’s completely identical in 24 different directions from a geometric standpoint, but once you add the texture and the visual component, then there’s actually some differentiation to identify which way you’re looking.
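The symmetry problem described here can be illustrated with a toy count of how many rotations of a ring of surfaces look the same as the original. This is only a sketch of the ambiguity, not of how any photogrammetry package matches features; the labels and the function name `indistinguishable_rotations` are invented for the example.

```python
def indistinguishable_rotations(faces):
    """Count rotations of a ring of faces that look identical to the original.

    faces: list of labels describing what each face looks like, in order
    around the ring. A count of 1 means only the unrotated view matches,
    so the orientation can be identified unambiguously.
    """
    n = len(faces)
    return sum(1 for k in range(n) if faces[k:] + faces[:k] == faces)

if __name__ == "__main__":
    plain_dome = ["plain"] * 24        # geometry only: 24 identical segments
    plain_squinches = ["squinch"] * 4  # geometry only: 4 identical corners
    lettered_squinches = ["text A", "text B", "text C", "text D"]  # unique calligraphy

    print(indistinguishable_rotations(plain_dome))          # 24
    print(indistinguishable_rotations(plain_squinches))     # 4
    print(indistinguishable_rotations(lettered_squinches))  # 1
```

Once each squinch carries its own lettering, only one orientation matches, which is the role the unique calligraphy plays for the software.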

As we mentioned, the other goal that we were trying to develop out of this was, in addition to capturing accurately, we also wanted to have this as an experience that we could bring back and give to other people without having to take them to Turkey. Because that, obviously, gets cost-prohibitive very quickly. So, what we had to do then was start to simplify the model that we had. So, after processing the mosque and having the entire model we decimated the mesh, which is essentially reducing it to try and preserve the overall shape, the overall form and any substantial changes, but at the same time trying to eliminate any duplicate meshes or lots of detail in places where it’s completely unnecessary.

You can see this was our original mesh, which is very dense, and that was the final mesh that it was decimated down to. Then there’s the associated texture maps above. The original one was pretty detailed, you can see a slight loss of quality, but it was still enough so that you could see the detail in the calligraphy in the overall impression of what the space was like.

The original model was 252 megabytes and composed of about 11.8 million polygons. But once it was remastered into the lower optimized version, it was an 8.9 megabyte file that was made up of only 340 thousand polygons. One of the processes that we went through, this was something that a [GA 00:24:35] in my lab did, was create a proxy geometry. I think it gets back to the [inaudible 00:24:43] presentation of what is the accurate thing and what is the intended thing and where do you draw that line, and when you’re trying to optimize something for virtual reality or to run a live-rendered view on a computer, you’re going to want to trend toward maybe less accurate but more of what the intent of the design was.
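The scale of that reduction is easier to appreciate as ratios. The short calculation below uses only the four figures quoted in the talk; the helper name `reduction` is just for this illustration.

```python
def reduction(before, after):
    """Return (factor, percent_removed) for a before/after pair of sizes."""
    factor = before / after
    percent_removed = 100.0 * (1.0 - after / before)
    return factor, percent_removed

if __name__ == "__main__":
    # Figures as quoted in the talk.
    polygon_factor, polygon_percent = reduction(11_800_000, 340_000)
    size_factor, size_percent = reduction(252.0, 8.9)

    print(f"polygons: {polygon_factor:.1f}x fewer ({polygon_percent:.1f}% removed)")
    print(f"file size: {size_factor:.1f}x smaller ({size_percent:.1f}% removed)")
```

That works out to roughly 35 times fewer polygons (about 97% removed) and a file about 28 times smaller (about 96% removed).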

Here you can roughly see the highly dense dome mesh that was created, and then in red a proxy geometry dome that is an actual sphere, versus the rough modeled look of the photogrammetry capture. But more importantly, you can see how massive those mesh faces are versus the original mesh. That’s just one example of the many different spots where the model was carved up but then replaced with something that was nearly accurate but, more importantly, much less heavy from a computer processing standpoint. That’s just another view of it.

The other thing, in addition to reducing some of the geometry, is also reducing the texture maps of the model itself. On the left side you’ll see one of the raw texture files that comes out of photogrammetry. If you’ve ever dug into a photogrammetry model, you’ll probably recognize the very haphazard [quote 00:26:10] work that is one of those texture maps. It’s very perspective-based and kind of random, but then on the right is the remastered file that our GA processed. Everything is kind of chopped up a little more logically. The larger areas get more presence, the smaller areas get less. The floor area was completely repainted, essentially; he took it into a combination of Mudbox and ZBrush, and straightened out some lines, evened out color tones and all of that.

Then we’ll go to the live model here of what we ended up with. You can see the floors. We originally had a problem where it was kind of bumpy, but it was all leveled out and replaced with flat mesh. A lot of these niches were ballooned out because we didn’t focus on them directly. Then from the outside, if we go behind the walls, they were replaced with very simple, just six-sided, actually just five-sided little cubes. Then out here you can see the entire scope of the model and how everything was … I guess it would’ve been good to show the original with all the balloony stuff. There you go.

Then, also, for example the chandelier was completely replaced and modeled by hand, because something that small and that detailed is, of course, a very poor candidate for a photogrammetry campaign, so that was recreated. Additionally, for virtual reality the other consideration you have is trying to run this live on a computer where you can move around quickly without dropping frames, which, when you’re in virtual reality, leads to some motion sickness. And then there’s also the consideration that when you’re doing VR, you have to render two cameras at once; rather than just the one for your screen, you have to do one for each eyeball, so you’re effectively doubling the demands on your machine. That was the ‘why’ of what we did.
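The doubling described here can be put in rough numbers. The sketch below assumes a headset refresh rate of 90 Hz and a monitor at 60 Hz; both figures are assumptions made purely for illustration, since the talk gives no refresh rates, and it further assumes the two eye views cost the same and are rendered one after the other.

```python
def per_view_budget_ms(refresh_hz, views_per_frame):
    """Time available to render each view if every frame must arrive on time."""
    frame_ms = 1000.0 / refresh_hz
    return frame_ms / views_per_frame

if __name__ == "__main__":
    # Assumed refresh rates, for illustration only.
    monitor = per_view_budget_ms(refresh_hz=60, views_per_frame=1)
    headset = per_view_budget_ms(refresh_hz=90, views_per_frame=2)
    print(f"monitor: {monitor:.1f} ms per view")
    print(f"headset: {headset:.1f} ms per view")
```

Under those assumptions each eye view gets about a third of the time a single monitor view would, which is the pressure behind decimating the mesh and simplifying the textures.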

I’m going to give it back to Jonathan at this point.

Jonathan:            There we go, thanks.

If you started looking at the advertisements and you look at the little videos that you get on YouTube about 3D photogrammetry, about 123D Catch, even getting into some of the Remake and into some of the other software packages, they make it look really easy and really fast. But the truth is it does take a lot of post-processing, it does take a lot of understanding the geometries and, unless you have really strong computing power, it does take a reduction in that to make it feasible and workable on my desktop or my laptop that I can carry with me to the site. With that, there’s a lot of drawbacks.

One of the biggest drawbacks that we found is just simply the size of the files and the number of photographs that you can upload reasonably. So, if you’re working in the field, and for us it was, I think, we had a total of 238 images to create just the Musalla, the prayer hall. It took about two hours to upload, because we had a very slow internet connection. But then we were able to process it on the cloud so we came back in six hours and we had the basic geometry and the basic model created. It did take about 20 hours of post-processing back at the lab to get it into the state that you saw there that Chris was zooming in and out of.

If you get into larger complexes and you still want a certain level of detail, and you want to use these small photo sets, you run into the issue of trying to piece things together, doing that in [Rhino 00:30:55] or some other type of software once you pull them all together.

Another limitation that’s really quite important is this idea of surfaces. This is actually one of the airplanes that Chris flies, from when he was first trying this out. These structure from motion algorithms have a hard time distinguishing between surfaces and materials that don’t have any distinguishing features, particularly when they’re smooth, particularly when they’re reflective and when they’re monochromatic. So, in the mosque, where you had very similar features, the squinches, the quarter domes, the fact that that calligraphy was unique was critical to the success of this project. But when you have long, white walls or long surfaces that don’t have a lot of detail, it becomes more difficult.

One of the solutions that Chris has come up with is Post-it notes: different colors of Post-it notes, putting them in several places, just trying to distinguish the surfaces. But what ends up happening is you get holes and voids where the software isn’t able to process this.

The idea of repetitive features: this is the twelve-sided fountain that’s outside the mosque. So, [inaudible 00:32:21] about a fountain outside of a mosque where you wash before you enter for prayers. This was perfectly symmetrical all around, with twelve surfaces that the software just couldn’t distinguish. So, one of the challenges we ran into here is you get very distorted geometries because the software can’t distinguish one photograph from another.

One of the things we also forgot to mention a little bit earlier is the goal of having each point found in at least three photographs. It’s really critical as you’re doing this. But when the computer can’t distinguish which points come from which surface, you get very distorted geometries. So, there are some real advantages to this in terms of very quick on-site data collection; by using the cloud-based processing you’re minimizing the need to have powerful computing power at your location or back at your offices or at your labs, and you’re then able to take those geometries and manipulate them or reduce them as you see fit. But there are also some limitations in what can be done in these larger-scale photogrammetric processes. Thank you.



The use of photogrammetry as a tool to aid in the documentation of cultural heritage has a long history as a means to create scalable documents from 2-D photographs. Recent advances in technology paired with the availability of cloud-based processing present ever-growing opportunities to document heritage sites. This can be achieved with minimal training, a consumer-grade camera or smartphone, and an internet connection. Application-dependent web servers manage almost the entire photogrammetric process, including image registration, object matching, photo-stitching, 3-D mesh generation and rendering. These cloud-based applications make 3-D photogrammetry more accessible and cost-effective than ever.

The increasingly rapid advancements in photogrammetric technology have become possible due to significant progress in calculation software, three-dimensional generation software, automation, and sensor technology. Digital 2-D images captured from consumer-grade cameras, combined with the development of easily accessible photogrammetric software, provide opportunities to document heritage sites quickly, easily, and without the need for expensive, difficult-to-transport equipment. There is no longer the need for time-consuming post-capture image processing to orient and stitch together images to generate a useable model.

This paper presents the application of close-range, cloud-based 3-D digital photogrammetry to the 16th-century Pertev Paşa Mosque in Izmit, Turkey. Pertev Paşa Mosque is a single-domed mosque completed in 1579, built during the Ottoman reign of Selim II. Located at the eastern end of the Sea of Marmara in the city of Izmit, it is part of a complex that includes a walled courtyard with gates, a fountain, and a school. The mosque sustained damage during an earthquake in 1999.

Cloud Based Digital Photogrammetry starts with the on-site collection of digital photographs. While the photographs can be taken with a device as common as a smart phone, a camera with greater resolution, clarity, and picture quality will produce a more accurate and detailed digital 3-D model. Digital SLR cameras with a quality lens are capable of capturing images with the consistency and quality this requires. Field planning prior to capturing the photographs includes a few control measurements to provide accurate scale to the finished 3-D model. Variables such as lighting, access, obstructions, and photo sequence are important considerations, especially with complex subjects or subjects with repetitive features. With proper planning, the required photography can be completed quickly. The interior 3-D model of the Pertev Paşa Mosque was created utilizing 235 photographs which were captured in less than two hours.

Photographs are uploaded to the cloud-based server for processing by automated modeling software (e.g., Autodesk ReMake). The software analyzes the 2-D images to create a 3-D polygonal textured mesh model.

The 3-D mesh models are generated as several file types. These files can be downloaded to a local computer for post-processing or manipulation. For example, for larger projects, multiple 3-D mesh models can be created and merged together. A building’s interior 3-D model can be inserted into the 3-D mesh model of the building’s exterior. This can be done with surface modeling software (e.g., Maya or Rhino). In the case of Pertev Paşa Mosque, the model was further refined through Maya and Mudbox to eliminate “ballooning” and other deviant geometry inherent to the photogrammetry process.

Through cloud-based 3-D photogrammetry, the ability to accurately document heritage sites, even in remote locations, is more accessible than ever before. There are still challenges to be overcome, such as the scale of the building, occlusions, and repetitive details. While this method does have a base level of dimensional accuracy, it does not have the precise dimensional accuracy of 3-D laser scanning. However, the low cost, accessibility, minimal field time, and simplicity of use make it a viable option in our documentation tool bag.

Speaker Bio

Jonathan Spodek, FAIA, FAPT is a Professor of Architecture at Ball State University teaching in the areas of architectural design studio and building technology that include building documentation, historic building construction materials and techniques, and evaluation/diagnostic methods. Jonathan’s research interests focus on non-destructive building evaluation. For more than ten years, Jonathan has co-led international workshops on Heritage Conservation exposing a diverse group of emerging architects to international perspectives of building conservation, architecture, and heritage. Beyond the university, Jonathan has been involved with several professional organizations including the American Institute of Architects, the Historic American Buildings Survey, and the Association for Preservation Technology.

Christopher Harrison, Assoc. AIA is an architectural graduate and Virtual 3D Designer and Modeler for the Institute for Digital Intermedia Arts (IDIA Lab) at Ball State University. Chris was part of the team to digitally recreate Hadrian’s Villa working with virtual archaeologist and scholar Dr. Bernard Frischer. He is currently working on the digital photogrammetry of the prehistoric earthworks of the Native American Adena-Hopewell people at Indiana’s Mounds State Park.
