It's considerably more expensive than $0, and just because $79 doesn't seem like a big issue in the heat of passion doesn't mean that you won't kick yourself for it later.
+1 for TestDisk. That gem literally saved my butt years ago when a controller failure on my old NAS almost destroyed all my data by turning all the disks (Linux soft RAID1) into unreadable doorstops. And it wasn't just a damaged partition table or anything like that; each file had to be recovered and copied elsewhere.
Time Machine backups should live on Apple filesystems. The ext4 or Btrfs volumes on Synology boxes are a time bomb for Apple's proprietary data blobs. It has something to do with the filesystem attributes.
I asked my father to just give up on it and manage pictures through the file system. Photos (formerly iPhoto) databases also get corrupted on a Synology NAS. It's a matter of time. The strange thing is that it can go well for months on end, giving you a false sense of security.
My advice: do iCloud or local-drive backups, or stay away from Time Machine and Apple databases entirely (although my father also had a lot of problems with an "incorrectly unplugged" external HFS+ drive). I can't find the sources right now, but after my father's last drama (and there have been several) I did some intensive searching, and this was my conclusion.
Filesystem likely doesn’t have anything to do with it.
What's more likely is that Apple's notoriously unreliable implementation of SMB is causing the problem (and that's the only option now that AFP support on Mac is dead)
I have a Synology DS220+ and connecting to it from a Windows machine vs a MacOS machine is like night and day.
On Windows, it literally feels like the NAS is an extension of my local hard drive. Browsing huge directories of thumbnails is snappy, file and folder names appear instantly. It’s a dream.
On MacOS, connecting to anything over SMB is a total nightmare. Aside from the constant mounting and unmounting (fun!), it’s just plain unreliable and slow.
And people have been complaining about this for years.
What's even funnier: I have a friend who works for Apple, and apparently some teams there use NAS storage and deal with the exact same annoyances!
If Apple's own employees have this problem, there's little hope they'll ever fix it for customers.
I’d put my money on Apple’s SMB implementation being the root cause of this file corruption issue that has been all over the Reddit Synology user forums lately.
The Time Machine over NAS situation is very frustrating.
It used to run over Apple's proprietary AFP protocol. With the exception of Apple's now-discontinued Time Capsule product line, every NAS implements AFP using the open-source Netatalk, presumably a reverse-engineered take on the protocol. And it's unreliable.
With recent versions of macOS, Time Machine switched to the SMB protocol. Apple has a custom SMB implementation, and every NAS uses Samba. And it's still unreliable!
I guess the only reliable solution to use Time Machine over network is to use a Mac with File Sharing over SMB enabled. At least both ends run Apple's SMB implementation.
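If you go that route, the setup is roughly as follows; a sketch only, and the user/host/share names here are made up:

    # on the sharing Mac: enable File Sharing, share a folder, and tick
    # "Share as a Time Machine backup destination" in its Advanced Options
    # then, on the client Mac:
    sudo tmutil setdestination -p "smb://backupuser@server-mac.local/TMShare"
    tmutil destinationinfo    # confirm the destination was registered
    tmutil startbackup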
Compound this issue with not being able to disable Spotlight indexing of Time Machine backups on a NAS. Often I find mds indexing for 4-6 hours at a time, using 20-40% CPU, with no way to stop or disable it (aside from disabling Time Machine). It's a real shame that Apple is allowing a growing list of paper cuts to fester on an otherwise pretty solid OS (looking at you, Thunderbolt Display kernel panics, unreliable external display support, etc.).
Cool, TIL about mdutil. I still cannot disable it, as mdutil seems to be having trouble parsing the path to the share. I've tried in bash/zsh/sh with every quoting/escaping pattern I can think of, and I must be missing something.
(shows up in `mdutil -s -a`)
    /Volumes/Backups of Saucy's MacBook Pro:
    Indexing enabled.

    sudo mdutil -i off /Volumes/Backups\ of\ Saucy's\ MacBook\ Pro
    Error: could not resolve path `/Volumes/Backups of Saucy’s MacBook Pro'.
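One guess I still want to try (purely an assumption): the volume name may contain a typographic apostrophe (’) rather than the plain ' I've been escaping, in which case letting the shell fill in the exact name sidesteps the quoting entirely:

    # let the shell supply the exact volume name instead of typing the apostrophe by hand
    ls -d /Volumes/Backups*
    sudo mdutil -i off /Volumes/Backups\ of\ Saucy*
    mdutil -s -a    # confirm indexing now shows as disabled for that volume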
In my experience, if you use a file share consistently, it becomes stale over time and has to be remounted.
One thing that I've noticed with Apple is that there is a "happy path" that they design and test. If you happen upon the magic combo, you're good; if you're off the path, you are on untested ground. My guess is that Apple tests against a specific Windows or Samba SMB version/config, and doesn't look for regressions outside of that.
I had a Time Capsule. It had the same issue with Time Machine rejecting the backup and requiring a re-do every year or so. (The Time Capsule ran NetBSD; I wonder what they used for an AFP/SMB stack. Did they port the macOS one over or use Netatalk/Samba?)
Time Machine over a network has always been unreliable.
For those who don't know: the Time Capsule was Apple's wireless router with a built-in hard drive for Time Machine backups.
I had one too, and reading this article I was wondering how reliably that actually worked. (Thankfully, I think I only ever had to do some very minor "get back a previous file" restores on it. I remember it being slow...)
Time Machine's "Floating Time Tunnel" user interface for browsing backups and restoring files is such a useless pretentious piece of shit. I DO NOT CARE for it taking over the entire screen with its idiotic animation, that prevents me from browsing current Finder folders at the same time or DOING ANYTHING ELSE like looking at a list of files I want to retrieve on the same screen.
It even sadistically blacks out every other connected display, and disables Alt-Tab, as if it was so fucking important that it had to lock you out of the rest of your system while you use it.
You can't just quickly Alt-Tab to flip back to another app to check something before deciding which file to restore and then Alt-Tab back to where you were. No, that would be too easy, and you'd miss out on all that great full screen animation. It not only takes a long time to start up and play its opening animations, but when you cancel it, it SLOWLY animates and cross fades back to the starting place, so you LOSE the time and location context that you laboriously browsed to, and then you have to take even more time and effort to get back to where you just were.
It was designed by a bunch of newly graduated Trump University graphics designers on cocaine, with absolutely NO knowledge or care in the world about usability or ergonomics or usefulness, who only wanted to have something flashy and shiny to buff up their portfolios and blog about, and now we're all STUCK with it, at our peril.
Crucial system utilities should not be designed to look and operate like video games, turning a powerful multitasking Unix operating system interface into a single-tasking PlayStation game interface. ESPECIALLY not backup utilities. There is absolutely no reason it needs to take over the entire screen and lock out all other programs, and have such a ridiculously gimmicky and useless user interface.
Whatever the fuck is wrong with Apple has been very very wrong since the inception of Time Machine and is STILL very wrong. How can you "Think Different" if you're not bothering to think at all?
>Core Animation will allow programmers to give their applications flashy, animated interfaces. Some developers think Core Animation is so important, it will usher in the biggest changes to computer interfaces since the original Mac shipped three decades ago.
>"The revolution coming with Core Animation is akin to the one that came from the original Mac in 1984," says Wil Shipley, developer of the personal media-cataloging application Delicious Library. "We're going to see a whole new world of user-interface metaphors with Core Animation."
>Shipley predicts that Core Animation will kick-start a new era of interface experimentation, and may lead to an entirely new visual language for designing desktop interfaces. The traditional desktop may become a multilayered three-dimensional environment where windows flip around or zoom in and out. Double-clicks and keystrokes could give way to mouse gestures and other forms of complex user input.
>The Core Animation "revolution" is already starting to happen. Apple's iPhone at the end of the month will see people using their fingers to flip through media libraries, and pinching their fingers together to resize photos.
>The "Delicious generation" is a breed of young developers who embrace interface experimentation and brash marketing. The term "Delicious generation" was meant as an insult, but they wear it as a badge of honor.
>Shipley's initial release of Delicious Library, with its glossy, highly refined interface, gave birth to a new breed of developers dubbed the "Delicious generation." For these Mac developers, interface experimentation is one of the big appeals of programming.
[...]
>Apple has been ignoring its own HIG for some time in applications like QuickTime, and is abandoning them completely in upcoming Leopard applications like Time Machine.
>Functionality-wise, Time Machine is a banal program -- a content-version-control system that makes periodic, automated backups of a computer's hard drive.
>But Apple's take on the age-old task of incremental backups features a 3-D visual browser that allows users to move forward and backward through time using a virtual "time tunnel" reminiscent of a Doctor Who title sequence. It's completely unlike any interface currently used in Mac OS X.
[...]
>While it seems logical to speculate that interfaces like those of Time Machine and Spaces will lead to the end of the familiar "window" framework for desktop applications altogether, many Mac developers predict that the most basic elements of the current user interface forms won't disappear entirely.
Not sure if the tone gets you downvoted but I can completely feel your rage…
I thought the animation was intentionally there to keep you engaged and hide the fact that Time Machine restoring is super slow, especially over network.
Sorry for my frustrated tone, I've been grinding my gears about that for many years. Backups are not a game. The fact that I always have to "play the Time Machine Game" when under the stress of needing to retrieve something from my backup tends to make me pretty angry at the interface, yes!
Even if Time Machine were 100% reliable and didn't randomly trash your backup all the time, asking users to wait through all those gratuitous, vainglorious Doctor Who animations to find out whether or not they're screwed is not very "Delicious".
If only they'd applied all that unbridled creativity to something harmless like the About This Mac box instead of the backup interface.
I have a media server I keep movies and TV shows on, and connect to Samba from my Mac. I agree, the performance is ridiculously bad. Part of it, I suspect, is Finder.
I wonder if NFS works any better? Or maybe Apple's old AFS/AFP, which I think used to be more solid than SMB on Macs? Though I read something about Apple deprecating (or removing?) AFP support recently.
From my long experience building high-end NAS servers and struggling with Macs, generally over 10GigE connections:
Using SMB, a typical Mac Pro can't do much better than 150-200 MB/s to/from the NAS.
The very same Mac booted into Windows via Boot Camp reads/writes at 1 GB/s on the same NAS.
Back on macOS, using either NFS or AFP, the Mac easily reads/writes at 1 GB/s on the NAS.
The SMB implementation in macOS is utterly broken, and has been for ages. NFS generally works fine, but some programs, such as QuickTime Pro and, more annoyingly, the Finder, sometimes have trouble with it.
Unfortunately, the fastest and most reliable option by a large margin is still AFP, using Netatalk. If you take care of regularly cleaning the CNID database, it works like a charm. I have many customers using servers with hundreds of TB of storage over AFP and it just works.
There's no SMB on my home network at all, since I don't have any Windows machines to support. For file storage/access, I use NFS, which works beautifully across my Macs and Linux boxes. Speeds are appropriate for a 1Gbps link. For Time Machine, it's netatalk pointed at an HFS+ volume:

    [Time Machine]
    path = /mnt/backups
    time machine = yes

This setup has been rock-solid for close to a decade, and has persisted across different NAS hardware. The current NAS is a Buffalo TeraStation minus its crappy built-in software, plus a barebones Debian install.
They do, under the hood (or did at one point). The problem is that it doesn't work well: it's functional, but it just doesn't give the level of performance needed at the top end (where Wi-Fi isn't a good enough connection).
They used to, a very long time ago, and when they did it was an outdated version. Nowadays they have a proprietary alternative that sucks.
Modern Samba is limited by the TCP stack. You can get 10 Gbps to work as long as your TCP stack is performant enough, with very basic tuning on the server side. The client "just works".
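By "very basic tuning" I mostly mean raising the kernel's socket buffer limits on the server; roughly something like this (the values are illustrative, not gospel):

    # /etc/sysctl.d/90-smb-10g.conf -- illustrative values for a 10GbE file server, tune to taste
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 1048576 16777216
    net.ipv4.tcp_wmem = 4096 1048576 16777216
    # apply with: sysctl --system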
macOS stopped shipping Samba with 10.7 Lion. Even before that, the Samba it shipped was outdated.
Not to get political but I always thought it was a mistake to make "modern" samba GPLv3. All it did was make all these hardware vendors either stick with the older GPLv2 stuff or... I dunno what else they can do besides write their own implementation.
Samba is too much of an "infrastructure" codebase. It should have been BSD licensed. Vendors would have strong incentives to merge changes back into the mainline... none of them want to have their own wacko implementations.
Samba operating at the application layer and working the way it does means you can totally ship a computer that runs Samba with the immediate tooling around Samba also being GPLv3 and the rest being proprietary.
Even when they did, some 10+ years ago, it was still far from perfect for whatever reason. But no one cared much because we were all taught to just use AFP.
Ten years ago Apple abandoned Samba. But Samba back then was much worse than Samba today, which is of equal quality to the Windows Server stack but harder to configure.
Yes, you can use iscsi. I used iscsi initiator on macOS to make time machine backups on an exported volume from an opensolaris zfs tank. Worked fine for many years, although I no longer use that approach.
You have to install iscsi initiator on the Mac, since macOS doesn’t come with it.
Unfortunately, despite being a nightmare, SMB still seems to be the best option for networked filesystems on macOS. AFP is now officially deprecated/unsupported, and the NFS client seems to be even worse than Samba/SMB. One last hope was macFUSE/sshfs, but this also seemed to be more or less broken when I tried it (extremely slow speeds, issues with disconnecting, etc.).
I only rarely need to work with Macs, but Apple rewrote its SMB implementation sometime in the 10.12 or 10.13 timeframe (to avoid GPL licensing of the existing open-source one, I believe) and it's been terrible ever since. Even the OSS one wasn't great but I remember reliability was much improved when forced to CIFSv1 (the original simple-but-insecure protocol) than when using any of the newer versions.
...I suppose anyone who has the time to work on this problem could find the last open-source release that Apple used, and port it to the newer versions of macOS.
It works, and what they've done is impressive, but let's face it: it is not even close to being as ergonomic and plug-and-play as it is in a Windows-only environment.
It is a shame that Microsoft still hasn't truly made SMB an open protocol.
I will have to check this out. Do you believe that this will help iOS?
I have a Synology, which I tend to run a little behind updates, since it is not directly connected to the internet. Trying to look through files on an iPad or iPhone is ... painful. The search does not work in the slightest. It loses the connection. Scanning a directory takes forever and is not cached ...
You can’t downgrade and there are some issues with upgrading (eg takes some work to properly migrate some packages like Plex over), so upgrade carefully and thoughtfully, but it’s been fine for me.
> What's more likely is that Apple's notoriously unreliable implementation of SMB is causing the problem (and that's the only option now that AFP support on Mac is dead)
How is AFP support on the Mac dead? I'm doing Time Machine backups to a Synology via AFP.
> What's more likely is that Apple's notoriously unreliable implementation of SMB is causing the problem (and that's the only option now that AFP support on Mac is dead)
Are we sure this isn't a Synology issue? I'm all-SMB for both shares and Time Machine/CCC backups on a QNAP NAS and have never had an issue. (Caveat: I moved from Lightroom to Photos this year, and am now using iSCSI APFS volume for that.)
>And people have been complaining about this for years.
And now that we're coming up on 2022, let me say this has been the case for decades. Their SMB implementation has gotten better in the past 5-6 years, but it is still far from the fit and finish on Windows and Linux.
WebDAV is also really crappy on macOS; I was sitting next to my father, and my Ubuntu laptop with GNOME Files literally got 4x his macOS performance connecting to my Nextcloud box.
Are these applications really adding so much value that it's worth it with their proprietary catalogs? My personal solution for photo storage is a normal directory tree with files. I have Piwigo set up to catalog those so that I can browse by shooting date and tags, but the file hierarchy stays untouched. I can pull files from there to edit with any application and there's a lot of proven tools to keep the files in good shape and backed up.
Yes, definitely. In the case of Lightroom, it is preserving your original input photo along with a list of edits and workflows you've applied to it. It also manages a database of preview images, so browsing full resolution albums is fast
There are no destructive edits in Lightroom unless you really go out of your way to cause the destruction
It also has client-side face recognition / clustering which relies on a local database, indexing by geographic location for GPS-tagged images, etc.
Essentially nobody needs Lightroom until they try it, after which it easily becomes impossible to live without and there is no replacement
I've used Capture One Pro for a similar amount of time and anything else (LR included) feels like a step down. Probably mostly inertia and familiarity by this point.
Yeah, my goal is to be able to run my next photo software natively on a linux box, and to my knowledge Capture One is only available on MacOS and Windows. It does look like great software, though.
I switched from only LR to only RawTherapee for all new shots going forward about a year ago. Never looked back. Might not be suitable for a professional expecting smooth conventional workflow, but excellent for an enthusiast who enjoys full control and nerding over raw data interpretation.
You don't need the Lightroom catalogue to do most of what it does, including caching previews. I strongly prefer to use Adobe Bridge and Adobe Camera Raw because I have access to nearly all the tools in Lightroom without its wonky and cumbersome catalogue management process.
If you're in the Apple ecosystem with an iPhone, some iPads and a MacBook, you sort of roll into it. Is it worth it? Not sure; it sure is easy and just works. But you're on your own when you try to interface with your data in "non-conventional" ways.
I myself use NextCloud for everything, I recently moved from Android to iOS and it's nice to see most things working... except that NextCloud has issues making previews from .heic pictures (or I should say heic picture containers containing heif images coded in hevc :s), and so the drama starts again. It's always plug and pray outside the Apple ecosystem, always ymmv.
Haha, yes, it felt subconsciously wrong to call it non-conventional. Now that you mention it, I totally agree that it is Apple who is in the game of doing things the non-conventional way. The opposite of the Unix philosophy: "Take all the things, throw them together, and make sure they only work in one, blessed, linear way."
Every once in a while I'll make an external backup of my parents' photo collection. Every single time, it's a royal pain to figure out where Apple has decided to put the iPhoto directory. There's no UI that points to it, no settings to configure the save location. It's like you're supposed to forget that these are your pictures to access as you please, and instead treat them as iPhoto's pictures, only to be accessed in the blessed interface.
I love that I can quickly click on a person and then get all the photos with that person. The same for a location.
The "memories" slideshows that iPhone or gphoto generate are sometimes also very nice to see.
The search functionality also comes in handy once in a while, so I can search for pictures of certain things.
For sharing photos, the shared albums are also very easy to work with, both for me to create them and for the receiver to import any interesting photos into their library.
Giving up the file system is difficult, but it's a godsend. I hated losing the Events when moving from iPhoto to Photos, but in practice no one wants to organize photos, and I certainly did not enjoy it, as anal as I was.
Nowadays we don't have events anymore; it's just a continuous flow of random photos that may or may not belong to a specific event. The great part is that photos are always in chronological order and I never have to deal with "files" (copies, same names, etc.).
The only exception is professionals, and Photos.app definitely isn't intended for them.
I too prefer chronological order instead of some "organized" collection of directories. My point is that this is something that a photo indexing app of some sort (Piwigo in my case) should do and leave the files be.
Like many here, I had the exact same issue mentioned by the poster: Time Machine on Synology just kept failing, with incredibly unhelpful error messages that basically said, let's trash it and back up again.
I used AFP, which was recommended back then, and that worked really well for years (since 2011). But since (maybe) Catalina, issues started creeping in and it would just randomly fail. It used to be once in a while, then it became a weekly occurrence before I gave up.
Samba isn't better; my mounted shares get randomly disconnected overnight fairly often too (even now on Monterey), and switching from the old Synology to a fresh dedicated NAS machine didn't change a thing.
At that point I think in general it's just "local networking" that became less reliable around that time, whether it's some power saving feature, or something else up the stack, I don't know.
The only scenario where Time Machine works flawlessly for me is using an external SSD drive for backup, formatted as APFS. At least for now.
At this stage I think Time Machine is barely fit for purpose for backing up over the network. I've lost days on this issue over the years too.
I have always been totally confused as to whether I should be using AFP or SMB (tried both). As others have said, SMB often seems very unreliable, and AFP is supposedly being deprecated...
Mounting network shares from my Synology to Mac(s) is never flawless either. As other comments have noted, this experience is very much worse than what it used to be like in Windows (not that I've mounted network shares in Windows for a while).
And same exact experience on mounting shares by the way.
When setting up my new NAS last week, I ended up booting a Windows PC to check if Samba was correctly configured because macOS kept throwing weird inscrutable errors semi randomly.
I know it's a terrible protocol, but it has certainly become worse on macOS over the past 3 years or so.
I have a couple of 6tb usb disks formatted as a Btrfs RAID 1 volume plugged into a Rpi4, it's not the fastest but it has been reliable for over a year. I always make sure I stop the backup before pulling the cord on my laptop though. My wife hadn't been doing this until recently and she has had more problems with both over the network and directly connected usb disks; she has both. At some point I'm going to create a duplicate setup in my house and send Btrfs snapshots over the internet to have an offsite backup, but I haven't got round to this yet. Currently our offsite backup is just taking a laptop home...
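The send/receive part, when I get to it, should be roughly like this (host and paths are made up):

    # read-only snapshot (send requires read-only; it must live on the same btrfs filesystem)
    btrfs subvolume snapshot -r /mnt/backups /mnt/backups-snap-2021-12
    # first full send to the offsite box
    btrfs send /mnt/backups-snap-2021-12 | ssh root@offsite btrfs receive /mnt/offsite
    # subsequent runs only ship the delta relative to the previous snapshot
    btrfs send -p /mnt/backups-snap-2021-12 /mnt/backups-snap-2022-01 \
        | ssh root@offsite btrfs receive /mnt/offsite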
Time Machine backups on remote drives do live inside an Apple file system. They are stored in a mountable (sparse bundle) disk image.
The critical setting for reliability is to use AFP and not SMB. To this end I have two Synology NAS devices—a multi drive unit shared as SMB for general use and a single drive unit shared as AFP for Time Machine. While I do have occasional backup trees go bad (once every two or three years) the backups themselves are still fine and so I just start a fresh one.
That's not what Apple recommends. Time Machine is pretty shit these days. With their cloud/services strategy, it has simply been abandoned.
I use time machine, but I don't trust it, so I have other solutions as well
It doesn't matter what Apple recommends—the recommendations are just dead wrong. AFP works and SMB does not. I religiously check backups and I know they are working.
Slackware is actually the one distro I've had deep success with for AFS. One thing that made everything less buggy was to find out what your UID (user ID) was on the OS X side, e.g. 1001, and just make sure the corresponding AFS user/share on the other end used the same one. No idea why it matters, but with Linux's AFS implementation paired with OS X it seems to be crucial for some reason, and it means a lot fewer headaches.
So therefore: user 'osxking' with UID '1001' connects best to user 'osxking' with UID '1001' on your Slackware AFS server. Good luck man! It will work! <3 Happy AFS'ing, Slackware served me well!
> The critical setting for reliability is to use AFP and not SMB.
I've been using Time Machine with a QNAP SMB target (for as long as Time Machine has supported SMB) without problems. Reading this thread makes me wonder if the problem is actually Synology NASs.
We tried it with AFP as well; it did not help. After some time those Time Machine backups got corrupted.
We suspected that maybe AFP writes failed because computers were disconnected from the network before the buffer was written to the server. But there are no visual indicators for that, and we did not want to debug it. We just switched to a 3rd-party backup solution.
I used Time Machine on an external USB HDD and after some time it got corrupted. That happened multiple times over the years.
Then I tried Time Machine using AFP and SMB on a remote NAS. Those also got corrupted multiple times.
The lesson I learned was: do not use Time Machine.
> It has something to do with the filesystem attributes.
Synology can do all the attributes that HFS+/APFS can do. They don't use the standard Samba *_xattr modules though; they use their own, and the result is all the @eaDir stuff. Do not delete it!
On the other hand, Time Machine is perfectly capable of damaging its archive on the local drive just fine.
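For comparison, the stock Samba way of exposing Apple metadata (the *_xattr/fruit modules mentioned above, which is roughly what most non-Synology NASes ship) looks something like this in smb.conf:

    # smb.conf share stanza, roughly how stock Samba handles Apple metadata and Time Machine
    [TimeMachine]
        path = /srv/timemachine
        vfs objects = catia fruit streams_xattr
        fruit:metadata = stream
        fruit:time machine = yes
        ea support = yes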
omg reading this made me relive that horror I encountered too, of course with pics of the little one.
Now I have 3 backups in 2 locations (one of which is just a re-export of all the pictures in raw), using Lightroom. I don't trust Photos anymore, and yes, they're on Apple filesystems.
My solution was for the LR database to live on local SSD, and the library/catalog on an SMB share on a ZFS server, with weekly backup to AWS Glacier. Other than having to reconnect anytime I wake up the machine from sleep, it works pretty well, and I never have worried about corruption.
The only flaw in the plan AFAICT is that if a bug in LR or the Mac introduces corruption, ZFS will happily store the checksummed corrupted file. I should probably add ZFS snapshots.
I suggest using Carbon Copy Cloner [0]. I have been bitten by Time Machine corrupting itself and I'm never going back. It works well and they have excellent documentation for pretty much every scenario. And USB backups are bootable. I'm mostly using it with backup to a NAS.
I have a 3x backup combo of Time Machine to local Synology (and I may as well not bother with this), Arq Backup to Arq Cloud (not flawless but I trust it more than Time Machine) and CCC to local USB-C SSD drives.
CCC is the only backup I actually trust out of the 3, but it's not automatic and relies on my plugging my backup SSDs in occasionally to clone the whole drive. (Which is more an issue with my workflow than CCC itself).
That last iPhoto update killed my album, and I've lost a shit tonne of photos and videos. And I couldn't even use my backups either, because when I opened them up, they too needed to read the entire album to do facial recognition etc. It took a few days for each backup, but I only knew photos/videos were gone after all drives had been "updated".
One of these days, I'll plan on writing my own photo management system, with recoverable indexing, optional facial scanning, and zero phoning home.
I have all my photos prior to a certain date in an old iPhoto collection. Can anybody recommend a good way to get photos out of iPhoto without having to manually export thousands of photos?
If you‘re ok with just the raw files, right click the library and "Show Package Contents". Then sort by folder size to find the bulk of your photos or create a spotlight query that aggregates all images above a certain size from subfolders. If you edited a lot of photos, it might be messy. If you didn‘t, I think they are all in a folder called Originals, grouped by year then import session.
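If you'd rather script it than dig through Finder, something roughly like this should pull the images out (the library path and extensions are guesses; adjust for your version):

    # copy every image out of the library's Originals folder, keeping the year/roll structure
    rsync -av --prune-empty-dirs \
        --include='*/' --include='*.[Jj][Pp][Gg]' --include='*.[Nn][Ee][Ff]' --include='*.[Cc][Rr]2' \
        --exclude='*' \
        ~/Pictures/"iPhoto Library"/Originals/ ~/Desktop/iPhotoExport/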
It's so easy to think you have backups when you don't.
I have the following setup: my main machine is Windows and my photos are on a local NTFS drive, that is backed up (mirrored) on a Netgear NAS via rsync; I rotate the backup drives on a weekly basis.
Every time the backup job runs it sends an email to tell me how it went. The email is sent via gmail. This used to work well but at some point (a year ago maybe?) gmail decided this was "unsecure" and stopped forwarding the emails. Instead it sent a notice that "somebody tried to send an email and we blocked it".
I couldn't be bothered to fix it and accepted the gmail warning notice in lieu of the actual Netgear backup report.
Then eventually I upgraded the email notification system... only to find out that the backups were failing systematically.
Luckily nothing was lost as the main drive was fine; I was able to fix the problem and do the backups correctly.
But of course it could easily have gone a different way: the main drive could have failed and with empty mirrors there would have been no solution, and I would have had only myself to blame.
It is so easy to tell oneself that everything is a-okay when it really isn't.
This is like a story where someone says they strap a blindfold on before driving and then the moral is "it's so easy to drive on the wrong side of the road and cause accidents"
Yeah okay, but I should have added that during the (long) time when the emails were being sent correctly, the backups never failed. So I had a high confidence everything was ok.
I don't remember the exact specifics, but it had to do with filename encodings. One file had special characters in its name; rsync on Windows (DeltaCopy) was reporting it as existing, but it could not be transmitted, so the remote server kept asking for it, or something like that, and after some back and forth the whole process failed.
I picked up a new DSLR and found out my copy of Lightroom 3 didn't support the new RAW format. Not willing to pay $125USD/year for the "latest and greatest" version of Lightroom, I found out I could use Adobe's free DNG conversion tool to just convert the fancy new format to DNG and continue using my bought-and-paid-for copy of Lightroom.
Also, major plus: it supports lossy conversion, which churns out files ⅓ the size of the originals, with no perceptible loss in quality. I ended up converting my entire photo catalog, saving hundreds of GBs of disk space. The tool has a CLI as well.
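From memory, a lossy batch conversion looks something like this; the flag names may be off, so double-check against Adobe's DNG Converter command-line documentation before relying on it:

    # macOS path; the converter also exists for Windows
    "/Applications/Adobe DNG Converter.app/Contents/MacOS/Adobe DNG Converter" \
        -lossy -p1 -d ~/Pictures/converted ~/Pictures/raw/*.NEF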
> DNG strips out most of the unrecognized meta data (such as Active D-Lighting, Picture Controls, Focus Point, etc) from RAW files, making it impossible to retrieve this data from DNG in the future.
The last time I used a DSLR was when I was doing color research in 2012. At the time, raw format was the only thing that preserved all necessary information to make scientific observations.
Most people don't need to care about such things. I just wanted to mention you're irretrievably throwing away metadata when you DNG-dong your pictures.
It's probably a worthwhile trade in most cases though.
Great point about throwing away metadata, and it's probably worthwhile to make sure that one's flavor of RAW files (CR2/NEF/etc.) can be reliably read in the future, when the software necessary to read them inevitably disappears from the cloud.
To be fair: this will use more disk space, but if you want to be able to keep the RAW and use DNG this is the way to go. I wasn't sure if I needed it so when I imported in Lightroom I quickly switched to embedding the original RAW image in the DNG just in case.
Adobe never backports the raw converters to older versions of Lightroom. If you shoot raw and get a new camera, you have to upgrade the software unless you do the additional convert-to-DNG step. Now that Adobe wants a subscription, it puts those of us who like Lightroom but don't use it super frequently in a bind.
The dng converter is a useful tool, though if using Lightroom it’s an extra step. Usually camera makers supply some software that can do the same.
Generally the raws have a lot more information than the lossy photos, so if you need to do some editing (bring up shadows or darken highlights), it's worth keeping the raw around. But generally JPEGs are quite good. (In Photoshop I've converted a raw, loaded both images, and did a diff; you can see where the changes from compression happen, but it's quite minor.)
I use Lightroom enough that I don't mind the subscription but that's the problem with subscriptions in general. Assuming they're priced reasonably fairly (which IMO Adobe's photo subscription is), they're fine for programs you use routinely. They're not so good for something you just need now and then and only need to upgrade for specific reasons.
Agree. The 'Photographer Bundle' also comes with PS and includes all the mobile apps. It's actually a good deal if you use any of them regularly. I even pay for the 1 TB of space, so it also acts as another place my photos are copied to (in addition to iCloud, TM, and Backblaze).
The other thing is I just compared (again) raw conversion from my z5 in LR, Raw Power, Affinity, and some others and LR still does the best job. DAM is also an issue outside LR.
I completely understand I'm an outlier though in that I still use large cameras - the z5 is nearly brand new.
I use Lightroom Classic (LRC), but recently tried Lightroom CC and it's pretty good. I can sync an album from LRC to LRCC, and then share the album instantly online, so friends/family can see the RAW-quality photos. When I make some edits or want to adjust the album, it all syncs automatically.
For me this is the perfect combo/workflow and I'm happy to pay the subscription. LRCC has some bugs and the gallery view could be better though.
The best thing is that I manage the photo files and folder structure locally with LRC, and can back them up easily, and maintain a good archive without having to sync everything.
I found Apple Photos' way of taking full control over your photos super painful. The app is terrible and some tasks are so inefficient. Changing your primary photo library to another disk, for example, will take days unless you have an SSD. This is absurd. The library is one folder called `Photos.photolibrary`, but the app seems to need to read every single file. And there are so many other issues too.
The problem with CC is if you're storing a lot of RAW photos, you'll run out of cloud storage within a year. I started off going this route and now I feel stuck.
Yeah. I have 100K photos which I've done some pruning of but I really don't want to lock myself into having to pay Adobe for sync space. I believe Lightroom Classic also still has some organizational features that haven't been brought forward yet. For the most part, I'm content with working on my photos on my desktop system. I don't feel a lot of need to have everything synced everywhere.
I only use CC for albums I want to share. LRC will always be my source of truth. The cool part of the workflow, is you can just tag photos you want to share in LRC, and then they sync and they're in the cloud for people to view at max resolution.
Although the gallery feature of CC is a bit buggy. A better flow might be to sync an album to Dropbox - but syncing plugins are always unreliable I find.
Only a year! Try a day or two. Their base subscription gives 20GiB AFAICT. Yesterday I went birding, and got 13.8GiB of RAW files, AFTER deleting out-of-focus or mis-exposed shots. I started with ~500 photos, after that pruning I've got 240. Each one is ~60MiB, since I've got a 60.2MP camera (Sony α7Riv). File size is definitely a disadvantage of such high-resolution cameras.
When Lightroom CC first came out it had _very_ few features compared to Lightroom Classic. I don't need the presentation tools, but have the develop and organization features caught up at all?
Personally I wouldn't use that NAS for storing data until figuring out what went wrong, since every single file was corrupted. This is way too high an error rate and unlikely to be caused by a hard drive or network issue. It's probably an issue caused by NFS/SMB/... client/server bugs.
Edit: Quick googling finds many people using Macs report files are corrupted when written to a Synology NAS over SMB.
This is confusing, it sounds like the RAW files themselves are corrupt so why bring any discussion of the Lightroom catalog into it?
Having said that, people need to take care backing up their catalog, because without that single catalog file you will lose decades of photo editing work in Lightroom.
Given how photos are "imported" into a Lightroom catalog, some might think that's where they live... On a Mac, IIRC (it's been a while since I used LR on a Mac), if you look at your catalog, it's a single large "file" with all the metadata, edits, originals, etc., in one place...
This isn't really correct; the original image is always stored in the original place, or the place you select to copy it to.
That's why I also think the title is a bit misleading, because it was never the Lightroom catalog that was the issue; it was the RAW images themselves.
This sort of thing scares me. It's why I started running consistency checks on my important archives (like my photo library), which I keep backed up in multiple places. We tend to think that in a digital world bits are just bits and do not get corrupted — which is decidedly untrue.
I wrote my own consistency checker, as I wasn't happy with what was out there. I wanted it to be simple, and maintainable in the long term (>10 years horizon). See https://github.com/jwr/ccheck if you need something like this. I now update my checksums regularly and check for corruption.
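(If you'd rather stick to stock tools, the bare-bones version of the same idea is just a checksum manifest; assuming your archive lives at ~/Photos:)

    # build a manifest of SHA-256 checksums for every file in the archive
    cd ~/Photos && find . -type f ! -name checksums.sha256 -exec shasum -a 256 {} + > checksums.sha256
    # later: re-verify and print only the failures
    cd ~/Photos && shasum -a 256 -c checksums.sha256 | grep -v ': OK$'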
It used to scare me, too. Then I changed my attitude to the one presented by Stanley Kubrick's famous movie character and stopped worrying. What was the worst thing that would happen if I lost it all? It turns out, for most files, it wouldn't matter much. So basically I identified just a couple of crucial files and I keep their encrypted copies everywhere. As for the rest, I don't care that much, three copies (on my main machine, external drive, and Hetzner) are enough.
I do care very much about my family photo archive. I have photos on paper and glass that are over 120 years old, and I'm afraid digitization has made us care not nearly enough about the longevity of our data.
I take yearly backups on to Blu-Ray M-DISC discs. They're ceramic instead of organic so they're not susceptible to the same kind of oxidation issues most regular optical media has. I make a few copies of the last couple of years of important documents and images (I get some overlap) and store those at a few different locations. Usually other family members I trust.
The amount of important documents and images that I really care about is only a small percentage of all my data. Most of the two-year combinations fit on a single 25GB disc, but it's not terribly expensive to get a 50GB or 100GB disc if needed.
Of course, there's a chance I won't be able to find a Blu-Ray reader in 20-30 years but I imagine there will be some other way to transfer over this dataset when that time comes.
As to the durability of these M-DISCs, I have three 25GB discs I've been testing durability of over the last several years. One sits somewhere outside, often somewhere around the patio table or on a cart under some shade unprotected. Another kicks around on my desk unprotected and gets moved around a lot. Finally a third one sits in the same disc case with the actual data I'm trying to preserve. All of the discs have the exact same data. Every now and then I compare them. The outside one has definitely had a bit of corruption but is still mostly readable. The desk one has a couple of files that it does some retries on (the seek time to the file is higher than expected) but has no corruption. The one in the sleeve is practically perfect after several years.
Photos from 120 years ago survive, but photos from 50 years ago are already fading. Color photos tend to not last unless they were printed just right (I have no idea how to know which chemistry was right - though by now everyone knows).
How many of those paper photos survived, versus how many were never taken in the first place because the money seemed better spent elsewhere?
> We tend to think that in a digital world bits are just bits and do not get corrupted — which is decidedly untrue.
That it's not true is pretty much the reason why ZFS was created, though lots of people still don't want to hear it, including companies (APFS only CoWs and checksums metadata, for instance).
Agreed. I’m researching changing my unraid server over to ZFS as soon as possible. Looks very doable although my hope is that the developers support it out-of-the-box sometime soon.
Still, using rsync with checksumming enabled can get your Photos library messed up if you sync (or just copy) to a non-APFS or HFS+ file system. Be very careful.
I read that some of the key information is in attributes unique to Apple's file systems, I can't find the source but that was my conclusion after trying to figure out what went wrong in the Time-Machine/Synology/Rsync-via-webdav system I set up for my father some years ago.
If you rsync ext4 files to exfat, you also have issues, but those are very clearly reported when you attempt to do so.
Extended attributes are messy. E.g. the equivalent on Windows (ADS, not EA, which also exist separately) can store arbitrary amounts of data, I'd expect resource forks on macOS to be similar. Meanwhile Linux only supports 64 KB of data for them, though most Linux-native file systems have an even lower one block limit. And then there's Solaris where extended attributes are actually an FS namespace.
So even if both file systems support extended attributes, does not mean that you can actually preserve them.
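On macOS you can at least see what's at stake before copying (the file name here is just an example):

    ls -l@ ~/Pictures/somephoto.jpg      # long listing plus each extended attribute and its size
    xattr -l ~/Pictures/somephoto.jpg    # dump attribute names and contents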
This is the advantage of the "newfangled" backup tools which don't just copy files from A to B: they can do (but don't always do!) a much better job at a) not caring about your backup destination's file system and b) not losing data that's more complicated than "name and contents".
A mirrored ZFS setup with one offline backup and one cloud backup will prevent nonsense like this.[1] If you know what you are doing, you can do a lot of stuff with ZFS even on a single disk.[2]
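A minimal sketch of both setups (device and dataset names are placeholders):

    # mirrored pool across two disks
    zpool create -o ashift=12 tank mirror /dev/disk/by-id/ata-DISK_A /dev/disk/by-id/ata-DISK_B
    # single-disk variant: keep two copies of every block so a scrub can self-heal bit rot
    zpool create single /dev/disk/by-id/ata-DISK_C
    zfs create -o copies=2 single/photos
    # run periodically (cron/systemd timer) to detect and repair silent corruption
    zpool scrub tank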
The problem is I get home at night and I want to spend time with my kids, not admin my ZFS system. Which is why I'm running a really old version of FreeNAS; it has ownCloud (not Nextcloud), which has failed to start since the last time I updated FreeNAS, and I haven't figured out how to make it work again, mostly for lack of time/interest in doing it.
There is still administration required. Sure ZFS itself is easy, but the rest of the ecosystem on top of that is hard. You need a lot more software than zfs to make something useful.
I am not sure. Quite possibly bad data was written over SMB, not necessarily that the drives failed. As you can see in the article, the damage to the images is systemic and the drives don't seem to be failing. More likely Lightroom/macOS/MacBook RAM/Samba wrote (accepted) bad data.
His entire approach to storage of critical data was a disaster waiting to happen. Storing data on FAT-formatted USB drives and then transferring them to a Btrfs file system over Samba? Why?
If you run an OS with first-class ZFS support and your files arrive on the platform as soon as possible over as few intermediaries as possible, the chances of such mishaps are greatly reduced.
Okay, let’s see how we can practically improve this (I am interested in a good solution as well):
1. Camera stores images/videos on an SD card.
2. When the card is full, images are copied in the field on the USB drive. I imagine the user has a 128G SD card and multiple photo sessions will exhaust internal storage quickly, especially on a mac with 256G-512G SSD, hence an external drive. APFS may be used instead of FAT but then the drive won’t be readable on Windows or Linux. Mac doesn’t allow writes to an NTFS filesystem unless you buy a 3rd party driver of unknown reliability. I guess one can try to combine your linked guide with https://github.com/spl/zfs-on-mac to get ZFS on a USB drive but all of that is done at your own risk.
3. When the user gets home, they can plug the drive into a Linux/BSD machine with ZFS. They copy the files locally.
4. They proceed with their favorite workflow on another OS accessing files on ZFS over SMB. Bad writes from macOS over the network can still bork files on ZFS, though you can apply a readonly policy to RAW images.
Steps 2 and 4 still look quite hard to work around for me. A bad SD card reader, bad RAM, or bugs in any of the 3 machines are still a risk.
> get ZFS on a USB drive but all of that is done at your own risk.
I don't consider ZFS on portable devices to be more risky than FAT32. You can always copy the files to two devices instead of one. Or carry around multiple SD cards.
> though you can apply a readonly policy to RAW images.
Snapshots will protect you if files are corrupted after they make their way to the file system.
> Bad SD card reader and bad RAM or bugs in any of the 3 machines are still a risk.
Sure they are. But they are not in your control (except RAM, where you can always pay for ECC). I would worry about things that are.
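Concretely, both protections are one-liners (dataset names are placeholders):

    # make ingested originals immutable at the dataset level (toggle off while importing)
    zfs set readonly=on tank/photos/raw
    # cheap point-in-time snapshot after each import
    zfs snapshot tank/photos/raw@import-2021-12-05
    # if something gets mangled later, roll back to the known-good state
    zfs rollback tank/photos/raw@import-2021-12-05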
Don't complicate things. I did lots of research of different NAS solutions etc. but in the end I just got a huge external USB drive and then automatic backup with backblaze for $60 per year. Works great and super cheap, plus I have offsite backup in case of fire or theft.
And you can add one or two additional USB drives for redundancy. I suspect that a lot of people building NASs using btrfs configured as RAID whatever would be better with a few local USB drives plus Backblaze.
I did the whole "Save it to a large usb drive" method for backing up personal files for a while.
I finally upgraded to a Synology NAS and I feel a little silly for not doing it earlier.
It's far more convenient given that I bounce between 3 machines during the day (personal laptop, work laptop, desktop), and my wife can easily access files on her machines as well.
Added bonuses are that it comes back up on its own after a power outage, doesn't require my desktop to be on for me to hit it externally or for a successful backup, and I don't have to remember where I last placed the usb drive.
Basically - USB drives totally work, but the NAS is better in pretty much every way outside of price (and possibly some configuration, if the person isn't very technical).
You can still make good use of the USB drive with Hyper Backup on the Synology! I have a backup job set to run nightly but keep the drive disconnected (so something like a power surge can't take out all the disks at once), and then once a month I plug in the drive before I go to sleep and unplug it the next day. The backup job gracefully fails for the rest of the month when the drive is not plugged in, but basically it just means I don't have to log into the Synology just to run the backup job.
I have a second drive with an identical backup job, and that drive lives in my sister's house 364 days of the year.
There's still the risk of losing a month's worth of files, but it's a pretty low-effort and cheap backup system apart from that.
1. I will need to connect the USB drive every time I use a file.
2. Wireless HDDs are expensive. I think most people actually don't need a NAS where the files are shared by everyone on the network. They just want to access their personal files wirelessly.
This is incredibly frightening. I’ve seen data rot up close, and as a former data janitor, I can sympathize.
My advice is to control everything down to the wall socket.
I considered the Synology boxes for my home NAS, but my “spider sense” was tingling, so I built around an HP MicroServer Gen8 with an LSI PCIe card, and mirrored drives. It’s on a double-conversion UPS (Smart-UPS RT). It runs FreeNAS on VMware ESXi, and the drives present as raw volumes to ZFS. No data integrity issues after ~3 years. Note the trend: clean power, good server-y hardware, matched drives in a simple RAID, properly resourced ZFS, etc.
But yeah, spinning disks are generally not a “backup.”
Even in this setup there's a caveat: HP sells Microservers with non-ECC hardware too. I think after human error, in your setup the likeliest threat to data integrity is faulty memory modules propagating corruption to disks (in essence you'll have a dirty buffer in faulty RAM). ZFS helps you catch this, but it's not magic so be sure to have the ECC version.
Lightroom is not a word I am comfortable reading. Not since my archive copy became uninstallable in macOS (32 bit installer) and Adobe required a monthly subscription just for me to have access to my catalog.
I don't use Apple's products. I've never used Time Machine. I have a Synology NAS though. My understanding, based upon reading the article and the comments here, is that Apple's proprietary Time Machine format is to blame for this corruption. Can anyone here shed some light on the exact cause, and whether this is of any concern to a non-Apple user using a Synology NAS?
My first thought was that there must be something wrong with the cables and occasionally a packet gets damaged. And at the end of the article, there it is: powerline.
If you are using an unreliable connection, use something that can verify the transfer; rsync, for example.
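Something along these lines (assuming rsync 3.x, not the ancient one Apple bundles; the destination is made up):

    # -a preserve metadata, -X carry extended attributes, --checksum compare actual contents,
    # -i show exactly what gets (re)copied
    rsync -aXi --checksum ~/Photos/ nas:/volume1/photos/
    # a second pass with --dry-run that prints no file lines means source and destination match
    rsync -aXi --checksum --dry-run ~/Photos/ nas:/volume1/photos/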
> My understanding based upon reading the article, an the comments here, is that Apple's proprietary Time Machine format is to blame here for this corruption.
How did you get that? From reading the article it sounded like they just moved their RAW files to the NAS (no Time Machine involved). One thing they use for backup is Time Machine; the RAW files came from a USB drive that they had wiped, but they forgot to attach it during any backup, so the files were omitted from the Time Machine backup. The details were light in the post, so I'm inferring most of this. I'm not sure how "Apple's proprietary Time Machine format" (it's a disk image, and the actual data is just files) would be to blame.
Time Machine on a NAS (unrelated to Synology) can have problems. macOS generally tells you "verification" failed and recommends starting a new backup, which sucks, is time consuming, and leaves you at risk of data loss during the initial backup, but is way better than keeping around a corrupt backup. I've used a Time Machine network backup as one form of backup for many, many years, both hand-rolled and on a Synology. The times I got a corrupted backup, I had either manually mounted the backup disk image while a backup started, or disconnected or closed my laptop while backing up. I'm not sure if the problem is on Apple's end or in the server's implementation of Time Machine or the SMB or AFP implementation. I've generally been able to "repair" these backups by running disk utility (fsck) and changing a value in a config file, but that can take a very long time over a network and I'm not sure I'd trust that backup anyway.
I think the cause of the problem was close to what the author suspects. Something got corrupted in reading, transferring, or writing the data. Maybe it was non-ECC RAM, bits flipped in transit (due to hardware or protocol), or a corrupt disk. The source data seemed OK since he could recover it, and the disks appear OK since they wrote fine a second time; but maybe it wrote to a different area of the disk? I've had RAM issues that only showed up under certain circumstances. Maybe "Enable data checksum for advanced data integrity" was on the second time? I kind of think confirming data was written correctly and using a filesystem that protects against bitrot (or keeping hashes of the files yourself) are the most accessible ways to prevent this. Unfortunately, knowing the hashes no longer match just means you know the files are corrupt; it won't help you unless you have another copy.
I do md5sum (shasum more recently) out of habit for anything large and at least a bit important to me. I started this about 15 years ago when I had a couple of eSATA drives and noticed corrupted files when copying. I've been checking file integrity since, and occasionally I encounter corrupted files - often enough to make me continue my checks.
It was an eye-opener to me to realize that data corruption is much more common than most people think.
On the flip side, because single (or a few) bit flips often go unnoticed, people overestimate the impact of data corruption. The idea that if your image or video looks good then it must be intact is flawed. Even software binaries survive a bit of corruption pretty well.
I think in the long run file system based integrity checks everywhere would be great. For now, in a world with a multitude of storage technologies and file systems, shasums will have to do.
In addition to backups, error correction data is a must for things you care about. With PAR2 you can add that to any type of file: https://en.wikipedia.org/wiki/Parchive
Bitrot is a thing, but mitigating it using par files seems suboptimal to me. The biggest problem is that you have to manually create the files (what if you forget?) or write a script to do that for you (what if the script silently fails?). You're better off using RAID or a filesystem that has it built in (e.g. ZFS or ReFS).
It doesn't need to be manual. par2 is a command-line program, so it can be automated. Yes, ZFS is better for in-flight data, but it makes sense to create par2 files for your backup archive files, if you're then sending those files somewhere else (which of course you should).
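The manual workflow is small enough to script, for what it's worth (file names are just examples):

    # create recovery data with ~10% redundancy alongside the archive
    par2 create -r10 photos-2021.par2 photos-2021.tar
    # later: check integrity, and repair if any blocks are damaged
    par2 verify photos-2021.par2
    par2 repair photos-2021.par2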
>It's likely I messed up by not checking the "Enable data checksum for advanced data integrity" option when I created the shared folder. Still, this option is unchecked by default and it's not really clear that leaving it that way could lead to data corruption
What's the point of using btrfs if Synology disables one of its signature features?
I found that my enthusiast-level approach to photography doesn't get enough use out of Adobe's CC plan to justify paying ~$10/month.
What I do is import images from the SD card and do a base color correction. What I found uniquely useful is the face recognition that Lightroom offers.
Nonetheless, I decided to move on to either open-source software or software that I can buy outright. At the moment I'm testing ON1, but I wanted to check whether somebody on HN has their own solution they'd like to share, or something to recommend.
Managing large volumes of digital photos is still an unsolved problem.
I have tens of thousands and they are probably my most treasured possession.
It’s quite concerning that they are languishing on Google Photos, with a few partial backups floating around that I’m not confident in.
I have had a few attempts at cleaning it up but haven’t found the right software.
I would also like to print my favourites, which probably also amounts to thousands of pictures, but working through the sheer volume is quite intimidating.
Feels like the whole thing needs a better workflow around it.
Lightroom. Import all photos from all sources and have Lightroom MOVE THE ORIGINALS into a new root folder with the following structure: YYYY/MM
Keep this folder backed up! It's more important than the metadata catalog but now all the raw imagery is in one place.
Then maintain the Lightroom catalog with lots of metadata: use the facial recognition system for offline person recognition, the flag system for a 2-pass deletion review, the star system for ranking photos by desirability, and add tags to everything. Go through each filter mechanism and develop a strategy for using each one.
Use the image gallery filter system to the limit.
If you go through images by date & time captured you can usually bulk tag images much more quickly than if you have to switch between events.
I've got my photos in a directory that I sync between all of my devices (that have a large enough hard drive) and a VPS using Nextcloud. I have Photoprism installed for organizing and viewing my photos on the web/Android. Daily backups with Restic to Storj DCS via an s3 API, and monthly manual backups with Restic to my USB hard drive. My photos are currently about 100GB and VPS/s3 storage is cheap.
I think it's a solved problem, just lacking in good turn-key solutions.
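For the curious, the Restic side of that setup is only a few commands; roughly (the bucket name and credentials are placeholders, and the gateway URL is just Storj's documented S3-compatible endpoint, so treat it as one too):

export AWS_ACCESS_KEY_ID=...         # S3 gateway credentials (placeholders)
export AWS_SECRET_ACCESS_KEY=...
export RESTIC_PASSWORD=...           # encrypts the repository
# one-time: initialise the repository on the S3-compatible endpoint
restic -r s3:https://gateway.storjshare.io/my-photos init
# daily cron: back up, then verify repository integrity
restic -r s3:https://gateway.storjshare.io/my-photos backup ~/Photos
restic -r s3:https://gateway.storjshare.io/my-photos check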
This (well, worse corruption than this) is one of my worst data nightmares. Losing pictures of my children, relatives who are now deceased, etc.
I know it may sound like overkill, but I switched from a Synology NAS to ZFS for the checksumming specifically to avoid this fate.
Not a knock on Synology specifically; I imagine all NAS vendors are about the same in this regard. What I wish is that there were a consumer-friendly, low-power NAS running ZFS … (?) so I don't have to maintain a server.
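For anyone considering the same move: once the pool exists the checksumming is automatic, and the only ongoing chore is a periodic scrub. A minimal sketch (pool and device names are placeholders):

# mirrored pool: every block is checksummed, and bad copies are repaired from the other mirror
zpool create tank mirror /dev/sda /dev/sdb
# run periodically (e.g. monthly): re-reads all data and verifies/repairs checksums
zpool scrub tank
zpool status -v tank   # shows scrub results and any checksum errors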
The whole ordeal was largely of my own making. I never anticipated that the simple act of copying files from one place to another could go so horribly wrong.
It isn't the user's own making or the user's fault at all! (A voice is literally screaming this inside my head.) Tech shouldn't be like this. The DS218 supports Btrfs, which prevents certain kinds of file corruption, but that's not what's happening here either.
I have been saying (or ranting) about this for more than a decade now. [1] And every time the answer was that there is no market for it, or that consumers aren't willing to pay a premium for a high-quality NAS. The Kobol NAS was my hope for a high-quality, reasonably priced, and reasonably easy-to-set-up-and-manage NAS. But they have closed down as well [2].
I really hope Kobol releases all of their work as open source, including the hardware design. Maybe someone (or I) could crowdfund it once the chip shortage is over.
I also have no idea how to organize photos and backups well.
As an engineer, I'm good doing it for enterprise stuff, but it's really challenging for personal stuff.
I don't trust Google products anymore, I got burned too many times when they killed their products. They work, but I just don't want to use them anymore.
I currently have a 100GB OneDrive subscription for $3, so that works for now, but I'm pretty close to the storage limit. I'm not a Microsoft fan, but considering it's integrated with Windows now, I assume it will stick around for a long time.
The same is probably true of Apple and iCloud.
I've relocated internationally a dozen times, so a NAS doesn't work for me.
There are also other options for just storage like S3 or Azure storage, but the price seems to be almost the same as OneDrive.
> I've relocated internationally a dozen times, so a NAS doesn't work for me.
Same here.
I used to use 2x Seagate portable 5TB. Would mirror one to the other every week.
Now I use a Western Digital MyBook 20TB in RAID 1, and backup this weekly to my portable externals using Carbon Copy Cloner for macOS (ideally I would have an identical second one for this).
And run Backblaze for offsite backup.
I avoided a NAS because I prefer the speed of USB3 versus network - super quick to view RAW and video, and for ease of relocation.
I also have a 1TB portable SSD for my Apple Photos library, to sync iPhone content without filling up my MacBook disk.
The WD MyBooks are a bit dodgy though; there are so many horror stories. They encrypt all data, so if your RAID config is corrupted... which is supposedly stored on the hard disk, then you are screwed. Also, when you tell it to sleep when inactive, the laptop freezes for 5-10s whenever you open a file dialog while the drive wakes up. There is also a bug in the utility application that says "can't access drive", which makes you think your drive is broken, but it's actually just a software bug; not running the utility software fixes it. I also think it can cause some unable-to-sleep issues for macOS. And sometimes it is just plain loud.
+1 for OneDrive. I pay USD 6, I think, for 1TB with a valid Office 365 license. That's a pretty sweet deal. Not a Microsoft fan either, however this works well: tight integration with Linux CLI tools, the Android OneDrive app is excellent, et cetera.
In the vast majority of cases spontaneous data corruption happens in _transit_ due to RAM glitches.
All modern drives implement forward error correction on a per-sector basis. This allows the drive to automatically repair up to 10% of damage to any given sector, in which case correct data is returned to the requestor and the sector is either tagged for relocation or relocated right away. In cases when the data can't be cleanly recovered, the read request is failed.
That is, the chances of a read request returning mangled data from a disk are next to zero. Meaning, in turn, that if you do see data corruption, it happened before the data hit the disk - i.e. it happened in transit.
Software bugs are an altogether different issue, and a far more exotic one.
The context of the OP's post and my reply is the case where copying in bulk with a mature tool yields corruption in a small fraction of the data. In this case the cause is in-transit corruption (rather than at-rest corruption, the fairly common belief dubbed "bitrot").
File I/O is one of the more error-prone activities, and one that is poorly understood by the general software dev community.
It is actually rare to see a piece of code that is NOT broken.
I once observed a piece of software write a million files, all empty, because it could not handle the situation where it could create a file but not write to it.
It is quite normal for files to end up truncated because they were written directly to the destination file rather than through a temp file: the script gets killed, and the next instance assumes the file exists, so it resumes from the next one.
Most software can do the job if everything works, but almost none can correctly handle every error condition.
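The standard defence against that particular failure mode is to never write to the final path directly: write to a temporary name, flush, then rename into place, since a rename is atomic on the same filesystem. A shell sketch of the idea (variable names are illustrative):

# write the data to a temporary name first
cp -- "$src" "$dest.part"
sync
# atomically move it into place; a killed script leaves only "$dest.part" behind,
# never a truncated "$dest" that the next run would wrongly treat as complete
mv -- "$dest.part" "$dest"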
I tend to end up with multiple copies of photos and other docs: SD card, laptop, backup drives, and USB thumb drives. Deduposaur helps me to move files into my latest backup drive. It flags duplicates, modified files, and files that I previously deleted. It also checks each file's SHA-256 digest to detect corruption.
The program is usable but needs some usability improvements and more tests. The license is Apache 2.0. I am happy to receive problem reports, feature requests, and pull requests.
I once got a bit of a scare when many of my raw files had weird color patterns in them. Even the backups. Turns out I had faulty RAM, and Lightroom apparently happened to always hit the faulty parts when reading a raw image.
One thing that surprised me: since they are savvy enough to use the command line and write custom scripts, they would probably have had no issue using PhotoRec instead of buying commercial recovery software.
Not that this kind of money is relevant when it's your personal files that are at stake, just a small observation plus a recommendation for whoever reads this and didn't know about it:
I use Lightroom on Windows and also Logic Pro X on Mac, and both save to my Synology. This post is terrifying me, because I don't know an easy way to validate that my Logic Pro audio projects are not corrupted on the NAS. With a photo you can just look at it. With an audio project I would need to listen to the whole thing, and when there are 32 tracks or more to be checked, that can be unmanageable.
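One way to check without listening to anything, provided you still have a local or backup copy to compare against: run rsync in checksum + dry-run mode between the two copies (paths here are hypothetical):

# -r recurse, -c compare by checksum, -n dry run, -i itemize differences
rsync -rcni ~/Music/Logic/ /Volumes/music-share/Logic/
# lines starting with ">f" are files whose contents differ; attribute-only
# differences (e.g. timestamps) show up as ".f" lines and can be ignored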
For Lightroom, I keep the library on my local SSD drive and the photos on the NAS. One thing to note is that you can import photos into a Lightroom library in place (from wherever they currently reside), or you can have Lightroom copy or move them to a location of your choice, with a few options around how to organize the files. In this case, the Lightroom software itself could well be the cause of the corruption, as it is doing the copying. I copy the files to a destination on the NAS using a mapped drive letter, but it works with UNC paths as well. On Windows, I have never experienced any corruption in about 8 years of using the Synology.
It looks like he’s using HFS or APFS. These filesystems still lack data checksums for data on disk, aside from any of the other issues he ran into.
The only sane way to deal with unique/irreplaceable data (i.e. one’s own photos) where the authoritative copy is on a mac is to make several independent copies, with several kept offline, and assume that the macOS or Adobe’s terrible house of cards that is the creative suite is going to fuck everything up at some point.
I say this as someone with >1000GiB in a Lightroom catalog, run on a mac.
I have a copy on an always-connected external disk (made with rsync && rsync -c), a copy on a ZFS NAS (made with rsync && rsync -c user@host:), a Time Machine backup that is only connected weekly, and an offsite rsync to a USB drive, with a SHASUMS file sidecar.
I also copy the RAWs into the LR masters photo structure when I import them, and they go first from the camera into a Syncthing folder (pre-import) that is immediately and automatically replicated onto four machines in three buildings (two of which are ZFS with autosnapshots).
I had a similar experience. Lost close to 9 years of photos, including my wedding photos. I eventually got them back via a similar method as the author; it took me a year to go through them all. After that experience I went fully to the cloud. The cause was a ReFS mirror deciding to "mirror" my corrupt drive onto my working drive. Great job, MS. I also stopped using Microsoft for my RAID stuff.
Yes I forget if it was on or not. You are correct. On the Windows OS (not server) you have to run a command to enable it. I will never forget the feeling I had watching the space on my good drive dropping precipitously as it was wiped away into “oblivion.”
While I use Photoshop CameraRAW to edit RAW files I don’t use Lightroom or any file library management software. Everything goes into folders on my PC on a dedicated drive which is shared so our Macs can access them and also backed up to BackBlaze. Any JPEGs that I want to share with family are then copied to iCloud photo album or posted to Flickr etc.
I was using a WD MyCloud as a secondary backup, but I retired it when we moved house a couple of years ago, so that's not a runner anymore. I have a 32TB RAID6 volume on my personal office server and I'll probably use that as a separate offsite backup.
We do use Time Machine with a Time Capsule that I’ve recently replaced the disk in. I don’t really trust it and also pay for iCloud Drive for most of my other data. Work also provides Google Drive which I use for data that is shared with others.
It’s all a bit of a mess but I think I’ve got some protection in place.
The (not latest) version of Lightroom that I use stores and organizes all the pictures in an accessible directory. The only issue I had when I started using Lightroom was with the import method: I had not selected the "Copy" method and realized that only after I deleted the originals. Now I always double-check. I would also advise using multiple Lightroom catalogs; Lightroom makes that easy. Also, I don't think Lightroom is the source of the issue here. Anyway, that's why it's important to always have at least two backup solutions, with one being offline.
Something to be aware of about the Lightroom catalog itself (which is unrelated to the OP's problem) is that it's an SQLite database. (Or it was when I ran into my problem about a decade ago, but I believe it still is.)
I had hardware-related flakiness with a Windows system at the time. I swapped it out for an iMac but sometime later I discovered that there was some residual corruption of my Lightroom catalog even though I had restored a backup that seemed to be OK. Fortunately I was able to use some SQLite tools to largely fix the catalog.
Samba (on Linux, not just Mac) has always had corruption issues for large files related to locks and filesystem caching quirks. Windows and the protocol itself have a different approach to locks than *NIX. Many have gotten more stable results by disabling caching (though this hurts performance on low-bandwidth / high-latency file shares).
Also from what I recall, Apple SMB is based on a forked version of Samba from 10 years ago. They did this when Samba went GPLv3. Not clear how well they’re keeping up with quality improvements.
My solution: use WebDAV on your Synology for transferring files from a Mac or Linux host. It has its own quirks but is far more reliable. Unfortunately this won't work for Time Machine, but it would work with "rsync -aXv --partial".
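For anyone wanting to try this: Synology's WebDAV Server package defaults to ports 5005/5006 (HTTP/HTTPS), and the share can be mounted from the macOS command line as well as via Finder's "Connect to Server…". A rough sketch (hostname and share name are placeholders, and mount_webdav should prompt for credentials):

mkdir -p /Volumes/photo-dav
mount_webdav -i https://diskstation.local:5006/photo /Volumes/photo-dav
# then transfer with rsync; --partial keeps partial files so interrupted copies can resume
rsync -aXv --partial ~/Pictures/raw/ /Volumes/photo-dav/raw/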
Important backups should be stored in a variety of ways. I store my backups redundantly on multiple external drives (solid state and spinning disks), using multiple filesystems (ext4 and exFAT). I do use ZFS for daily/active storage but not for backups. Besides that, I also store the data in AWS S3 and a duplicate in Backblaze B2. These solutions are scalable; iCloud/Google Drive is fine if you only have 2TB or less.
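For the cloud legs, the sync commands are simple enough to drop into a cron job; a rough sketch (bucket names are placeholders, and the b2 CLI syntax is from memory):

# push the same backup set to two independent providers
aws s3 sync /backups/photos s3://my-photo-backups --storage-class STANDARD_IA
b2 sync /backups/photos b2://my-photo-backups
# on subsequent runs both commands only upload new or changed files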
Just store the raw files directly in a regular file tree structure and avoid any kind of catalog/index/db file.
I do find ZFS very interesting, but in practice it has been troublesome to work with; I've gotten weird errors when trying to remove a vdev.
Slightly off topic:
I recently switched to Capture One for all of my photography needs. I would highly recommend it over Lightroom. Rather than subscribing to a service you can just buy it outright.
It has all the features of Lightroom with some extras.
Similarly, I recently switched the other way. I recommend Lightroom over Capture One. The subscription is a slight downside, but Capture One is similarly expensive if you want to keep up with the latest version; I know you don't have to, but if you do, it adds up.
Moreover, Capture One is missing some features that Lightroom has: the ability to filter collections by name, a shortcut for 'increase/decrease rating', filtering photos by 'x stars and higher', showing all images in a folder including those in subalbums/subfolders, and being able to mark photos with a 'Reject' flag/rating (and easily hiding, not removing, those).
I tried switching to Capture One as well, and enjoyed using it for the most part, but I couldn't fully switch for one primary reason.
I'm still fairly new to photography and use a few preset packs that I've paid for as a base before making modifications to them to my liking. I downloaded a program that converts Lightroom presets to LUTs, and those worked decently, but they were visibly inferior due to the fact that the converter cannot handle camera/lens profiles.
I ended up just going back to Lightroom Classic for now, but hopefully once I progress enough to make my own edits from scratch I'll be able to switch back to Capture One.
I switched to using SD cards for Time Machine-like backups (using BackInTime) plus monthly archival on big disks on Linux many years ago; it requires some organization due to space restrictions, though. I distinctly remember when iPhoto (5?) wanted me to convert/store my private photos "in the database" rather than as plain files on my old PowerBook, which I refused to do, knowing all too well it would've been a bad idea.
Nikon RAW files store a full-resolution JPEG version of the photo inside the RAW file (this may be true of other RAW formats too). I had a similar problem to what's described in the article a few years ago, and for damaged photos that I didn't have backups of, I was able to get the JPEG out of the RAW by running this command:
exiv2 -e p4 image_name.NEF
Apparently this jpg is what the camera shows on its backscreen when you're previewing the image/RAW file.
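If you have a whole folder to salvage, the same command can be looped; a small sketch (the preview number can differ per camera, and `exiv2 -p p file.NEF` lists what's embedded):

# extract the embedded full-size JPEG preview from every NEF in the folder
for f in *.NEF; do
    exiv2 -e p4 "$f"
done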
Anyone using lightroom should also consider configuring it to always drop the XMP sidecar file with your raw files. If the DB is ever corrupted beyond repair, you'll at least have a record of your current edits per file.
I keep my stuff in files only carefully categorised in directories. I use Beyond Compare to do a byte level comparison between what’s on my disk and what’s on the (one of the several disconnected) backup volumes I use. That identifies corruption issues at either side of the fence.
I lost about a year of family photos thanks to this sort of issue once so I am very wary of problems.
We use Beyond Compare to do the same for a vast set of municipal documents. We periodically update the collection with new documents. Beyond Compare shows the new documents as blue (orphans) in the working directory. Interestingly, we occasionally see a red document in the backup, indicating a difference at the byte level. The documents appear the same in the word processor, so we don't know what's going on. These are Windows boxes and a Synology NAS (SMB), so no Apple involved.
Last year, all of my photos and presets disappeared after updating the Lightroom app. That was 2+ years of edits that are just gone, lost, unrecoverable.
I do photography as a hobby so I never saw a need for backing up photos and I never paid for the subscription (which would include cloud storage) because I didn’t use any of the tools that came along with the subscription.
Based on this, and on my own experiences, it looks like it's time to move on from Synology. But... to what?
I've had a DS412+ since, well, 2012. I've dutifully replaced drives as they went bad, run disk scrubs, file system scrubs, and (in the last nine years) had to occasionally run time machine restores from it. It worked like a champ each time, including "restoring" from an old computer to a brand new one. However, in the last couple of years, we couldn't get my husband's machine to back up or restore successfully, and now out of our three macs, only one is able to successfully back up at this point, and (looking at this) I'm skeptical that it's actually working.
That's not the conclusion the author came to - why is it the one you came to?
I use a Synology with a Mac nearly every day. I don't use it for Time Machine, but that's just because I use different backup strategies, like storing the files directly on the Synology, or copying them over after local editing depending on the type of file. Also, like the author learned, I check my backups periodically to make sure the file is the same on the other side.
> I'm skeptical that it's actually working
You should test it then. If you've never tested you backups, you have none.
> it looks like it's time to move on from Synology
I don’t think this is a Synology problem. No Windows users are reporting this issue. On the other hand, NAS devices have always been treated as second class citizens by Apple.
If reliable network attached storage is important to you, maybe it’s time to move on from MacOS?
As much as I hate Windows, using my NAS with it is a dream. It’s how the experience of network storage was supposed to feel.
My bet is the problem is related to the USB transfer. I've read about similar problems migrating data to Synology via USB. Don't remember where I read that, though...
This is mostly not about Lightroom, or photos, or any of those specifics. This is about (a) a failure in a migration followed by (b) *failure of the user to verify the migration before deleting the original copies*:
>To make matters worse, I deleted the photos from the USB hard drive, without verifying the new catalog.
That line is, not coincidentally, where I stopped reading.
It's not obvious to amateur users that these safety measures must be in place for data migration projects. Members of my family only learnt after the first data loss.
What I find works best for storing data is a regular USB drive. No software like Lightroom, no NAS. Just a disk; you move data with a regular file manager.
You don't get all the cool features, but the process is pretty safe, and you don't worry about hackers because it is offline most of the time.
Arq.app can backup external disks to most of the big cloud storage providers really well. Unlike Backblaze, they don’t delete these backups if you don’t connect your drive for a specific amount of time.
I’ve used Arq to backup about 2TB of external drives to AWS Glacier. It even let me set the encryption key for the data myself!
I used to backup with Arq to S3 and the restores were very fast. With Glacier, Arq initiates the data retrieval and waits for file availability in the UI. Glacier can take a day or so, laptop needs to be open and connected to internet during the time for the transfer to complete. Given I'm primarily backing up external drives, I find this UX not an issue, but if you wanted backups with instant accessibility, backing up to S3 really isn't that pricy.
I have a second backup disk that holds snapshots of the main backup disk. It is unlikely that two disks from two manufacturers will break at the same time, so it gives me a fairly good level of safety.
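If it helps anyone, one cheap way to keep such snapshots is rsync's --link-dest, which hard-links files that haven't changed against the previous snapshot, so each snapshot only costs the space of what changed (paths are illustrative, and the snapshot disk's filesystem must support hard links):

today=$(date +%F)
# new snapshot directory; unchanged files become hard links into the previous one
rsync -a --link-dest=/Volumes/backup2/latest /Volumes/backup1/ "/Volumes/backup2/$today/"
# repoint "latest" at the snapshot just taken
rm -f /Volumes/backup2/latest
ln -s "$today" /Volumes/backup2/latest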
Apple wants you to buy a whole new MacBook for the extra storage and only use their iCloud for backup. Apple intentionally has bad implementations of network filesystems. They know it, but as per usual, they know their customers won't do anything about it.
I use Lightroom and I have over 3TB of photos. This is my current workflow; there's no reason why it would not work on a Mac:
1. Lightroom runs on Windows 10 system
2. I add new images onto it from one set of cards. The cards are pulled out and sit on a desk; a previous set of cards is put back in the camera and wiped. At this point there are images in Lightroom and images on the cards.
3. Every night, the Windows box rsyncs the Lightroom catalog and raw data onto a Linux box running RAID-1.
4. Every night, after the rsync is completed, the Linux box triggers a backup of the rsynced catalog and files into an S3-compatible remote bucket.
This means that I have all images and catalog stored at least twice locally and once remotely. I have also validated that I can recover from a failure of Windows disk, Linux server and remote backup.
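For anyone wanting to replicate steps 3 and 4, the Linux side boils down to a short nightly script. A hedged sketch, written as a pull from the Linux box (host names, paths, and the S3 endpoint are placeholders, not necessarily what I run):

#!/bin/sh
set -e   # stop if the rsync fails, so we never upload a half-synced copy
# 1. pull the Lightroom catalog and raw files onto the RAID-1 array
rsync -a --delete lightroom-pc:/c/Lightroom/ /srv/raid1/lightroom/
# 2. replicate the synced copy to an S3-compatible bucket
aws s3 sync /srv/raid1/lightroom/ s3://lightroom-backup/ --endpoint-url https://s3.example.com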
It will be able to help you more often than not, and it's FOSS.
[0] https://www.cgsecurity.org/wiki/PhotoRec
[1] https://www.cgsecurity.org/wiki/TestDisk