The scenario they wanted to handle is not a failing deletion request, but multip...

nso · on Jan 28, 2017

There could be other necessary cleanups in the delete function that need to be completed aside from the deletion of the physical file. Other than that I generally agree, but I also know never to assume too much about code written by other people.

I guess another way of solving this would be to put each file in its own folder on the file system, giving the folder the same name as the original file and renaming the file to something, "file" or whatever. Then you create one symlink per email pointing to the file, and the email links to that symlink. When deleting the email you delete the symlink. After deleting the symlink you check whether any symlinks are left in the folder, and if none are, you can safely delete the folder.

On receiving the first email with filehash X:

  mv ./original_unique_name.file ./original_unique_name.file.tmp
  mkdir ./original_unique_name.file
  mv ./original_unique_name.file.tmp ./original_unique_name.file/newfilename
  ln -s newfilename ./original_unique_name.file/symlink_for_email_1

On receiving subsequent emails with filehash X:

  ln -s newfilename ./original_unique_name.file/symlink_for_email_2
  ln -s newfilename ./original_unique_name.file/symlink_for_email_3
  ln -s newfilename ./original_unique_name.file/symlink_for_email_4
 ... etc

On delete email Y:

  unlink ./original_unique_name.file/symlink_for_email_$Y
  [ -z "$(find ./original_unique_name.file/ -maxdepth 1 -type l)" ] && rm -rf ./original_unique_name.file/

This avoids the need for a counter, since each email links to its own symlink.

Will cost a few bytes per email, but seems like they have some to spare!
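
Put together as a rough shell sketch, under some assumptions: `STORE`, `store_file`, `link_email` and `delete_email` are names made up for illustration, the hash is used as the folder name, and nothing here handles two processes touching the same folder at once.

```shell
#!/bin/sh
# Sketch of the per-email symlink scheme. STORE, store_file, link_email and
# delete_email are hypothetical names, not taken from any real mail system.
STORE="${STORE:-./store}"

# First email with a given hash: move the file into a folder named after the hash.
store_file() {  # usage: store_file <path-to-file> <hash>
    mkdir -p "$STORE/$2" && mv "$1" "$STORE/$2/file"
}

# Every email (including the first) gets its own symlink next to the file.
# A relative target is resolved from the symlink's directory, hence just "file".
link_email() {  # usage: link_email <hash> <email-id>
    ln -s file "$STORE/$1/email_$2"
}

# Deleting an email removes its symlink; the folder goes once no symlinks remain.
# The check and the rm are two separate steps, so this is not safe under concurrency.
delete_email() {  # usage: delete_email <hash> <email-id>
    rm -f "$STORE/$1/email_$2"
    if [ -z "$(find "$STORE/$1" -maxdepth 1 -type l)" ]; then
        rm -rf "$STORE/$1"
    fi
}
```

Usage would be something like `store_file ./attachment.bin X; link_email X 1` for the first email, `link_email X 2` for the next, and `delete_email X 1` on delete.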

PSIAlt · on Jan 28, 2017

1. This has nothing to do with distributed storage

2. Symlinks are not required to point to an existing file

3. Symlinks are not atomic

4. You still need to maintain a filedb (or how do you resolve which storage contains a given file?)

nso · on Jan 28, 2017

1. Depends on your distributed storage. If you use a clustered file system, your argument is void.

2. While true, I fail to see how that is relevant. What part of the flow I described would be broken?

3. While true, I again fail to see the relevance. Are you just listing characteristics of symlinks? A field in a database could exist with a filepath pointing to a non-existing file as well.

4. Sure. I was describing how to avoid the counter and magic number, not the database.
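
For what it's worth, the dangling-link case from point 2 is easy to detect if it ever matters. A minimal sketch, where `is_dangling` is a hypothetical helper name and only the standard `test` operators `-L` and `-e` are used:

```shell
# is_dangling is a hypothetical helper: succeeds if $1 is a symlink
# whose target does not exist.
is_dangling() {
    [ -L "$1" ] && [ ! -e "$1" ]
}
```

`-L` is true for any symlink, while `-e` follows the link and is only true if the target exists, so the combination picks out exactly the broken links.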