Mind Your Image Metadata | Stefanie Molin
Photo by Johan Viirok for PyCon Estonia.
This is me presenting at the PyCon Estonia conference earlier this year. The picture was taken by one of the official event photographers. I don’t remember seeing the photographer take this picture, and, even if I had, I most definitely couldn’t tell you the make and model of the camera or the settings used to take the photograph. Except, now that I have access to the resulting image, I can.
Most devices (cameras, smartphones, etc.) record a variety of metadata when generating images. Some of it, like information on the camera itself, may be innocuous (although you may not want people seeing what kind of device you have), but other fields, like the latitude and longitude, can be extremely dangerous, depending on where you happen to be when the picture is taken and what you do with it afterwards.
Image metadata is stored using the Exchangeable Image File Format (EXIF). To see all of an image’s EXIF metadata, you can run exiftool on the command line (may require installation first) or use the Pillow Python package. Here, I’ve run it on the original file and reorganized some of the output for readability and brevity (there are well over 300 pieces of information on this particular file):
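If you would rather stay in Python, the Pillow route mentioned above might look roughly like the sketch below. It is a minimal example, not the command I ran: `read_exif` is a hypothetical helper name, and the demo image and its two tag values are made up so the snippet runs without a real photo. Only top-level tags are listed; nested blocks such as the GPS data live in separate sub-directories that this sketch does not walk.

```python
from io import BytesIO

from PIL import ExifTags, Image


def read_exif(source):
    """Return the top-level EXIF tags of an image as a {name: value} dict."""
    with Image.open(source) as img:
        exif = img.getexif()
        # Map numeric tag IDs to readable names; unknown IDs are kept as-is.
        return {
            ExifTags.TAGS.get(tag_id, tag_id): value
            for tag_id, value in exif.items()
        }


# Build a tiny in-memory JPEG with two made-up tags (hypothetical values).
demo_exif = Image.Exif()
demo_exif[0x010F] = "ExampleMake"   # Make
demo_exif[0x0110] = "ExampleModel"  # Model
buffer = BytesIO()
Image.new("RGB", (8, 8), "white").save(buffer, format="JPEG", exif=demo_exif)
demo_jpeg = buffer.getvalue()

for name, value in read_exif(BytesIO(demo_jpeg)).items():
    print(f"{name}: {value}")
```

To inspect a file on disk instead, pass its path to `read_exif` in place of the in-memory buffer.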
If your device includes location metadata by default, GPS coordinates for the exact location of every picture you take will be stored. In this case, the latitude and longitude correspond to Mektory in Tallinn, Estonia:
Having location information on your photos can be a nice feature for recording memories (e.g., travel photos). However, odds are you don’t change your location settings with each picture you take, and, depending on where you choose to upload your pictures, your sensitive metadata may be accessible to others. Some platforms may remove the metadata for you, but you would need to remember to test it out first.
For example, imagine you take a headshot from the comfort of your home and put it on your website. Unless you removed the metadata before uploading it (or configured your device not to record it), someone could download it from your website and look at the metadata to see where you live. Depending on what other things you put on your website, they could also figure out places you frequent, where your friends and family live, etc.
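To get a sense of how little effort this takes: EXIF stores each GPS coordinate as degrees, minutes, and seconds plus a hemisphere reference (N/S or E/W), and turning that into the decimal pair a map search accepts is a one-line formula. The sketch below uses a hypothetical helper name, `dms_to_decimal`, and illustrative round numbers rather than the coordinates of any real photo.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus an N/S/E/W reference
    to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -decimal if ref in ("S", "W") else decimal


# Illustrative values only, not taken from any real photo.
latitude = dms_to_decimal(59, 30, 0, "N")
longitude = dms_to_decimal(24, 45, 0, "E")
print(f"{latitude:.4f}, {longitude:.4f}")  # 59.5000, 24.7500
```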
I created exif-stripper to make it easy to protect myself. It can be used as a pre-commit hook, command line utility, or Python package. For my website, I use it as a pre-commit hook, so I can add images without having to think about it. Just add the following to your .pre-commit-config.yaml file:
When used as a pre-commit hook, exif-stripper blocks any commits that have image metadata (EXIF and also extended attributes on some operating systems). In addition to blocking the commits, it also removes the image metadata. Here, I ran the hook manually on the file we have been working with. Notice that the check fails, and the tool has removed the image’s metadata:
When I run exiftool on the image afterward, only the following general information remains:
Note that this time I’m showing the full output of the tool, meaning we removed roughly 300 metadata values from the file. As you would expect, this also helps reduce file sizes: the image was initially 942 kB and is now 669 kB (nearly a 30% reduction in size). The reduction achieved will depend on how much metadata the file starts with.
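For a rough idea of what stripping metadata involves, here is a minimal Pillow sketch. It is not how exif-stripper is implemented; it simply copies the pixels into a fresh image and saves that, which is one way to leave the metadata behind. The function names `strip_metadata` and `percent_reduction`, along with the padded demo tag, are assumptions made for illustration. Note that re-encoding a JPEG this way is lossy, and palette-based images would need their palette copied too.

```python
from io import BytesIO

from PIL import Image


def strip_metadata(image_bytes, fmt="JPEG"):
    """Re-encode an image from its pixels only, leaving metadata behind."""
    with Image.open(BytesIO(image_bytes)) as img:
        # A fresh image starts with an empty info dict, so nothing carries over.
        clean = Image.new(img.mode, img.size)
        clean.paste(img)
    out = BytesIO()
    clean.save(out, format=fmt)
    return out.getvalue()


def percent_reduction(before, after):
    """Size saved, as a percentage of the original size."""
    return (before - after) / before * 100


# Build a demo JPEG carrying a made-up EXIF tag, then strip it.
exif = Image.Exif()
exif[0x010E] = "x" * 2000  # ImageDescription, padded so the saving is visible
original = BytesIO()
Image.new("RGB", (8, 8), "white").save(original, format="JPEG", exif=exif)

stripped = strip_metadata(original.getvalue())
with Image.open(BytesIO(stripped)) as img:
    print(len(img.getexif()))  # number of EXIF tags left after stripping

# The sizes quoted above: 942 kB before, 669 kB after.
print(f"{percent_reduction(942, 669):.0f}%")  # 29%
```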
For the time being, as this was something I initially built for my website, I’m being overly cautious and removing all metadata. In the future, I plan to add ways to control what is removed or kept; for example, there could be an option to keep information about how the picture was taken (aperture, shutter speed, flash, camera specifications, etc.), while removing everything else. Contributions are welcome, but please review the contributing guide first.
Let me know in the comments below or on social media (LinkedIn or X) how you keep your sensitive metadata safe.