
Checking for same sizes on images before loading?

jjweimer
I would like to confirm that all image files in a folder are exactly the same size *before I load the image files*. For example, I want to raise a flag when I choose to open all image files in a folder that holds 10 images at 2 MB each and one image at 1.5 MB. The intent is to prevent loading a folder into a stack when the images in the folder are mismatched in size.
Is there an efficient way to do this, even if it involves breaking out to the shell or DOS level with ExecuteScriptText?
How about using Open and FStatus? It was fast enough (<0.1 seconds) for ~80 files.
Edit: Changed the file size wave from a double to a long unsigned integer. I don't think the size will ever be negative or a non-integer.
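For anyone who wants to try the idea outside Igor, here is a minimal Python sketch of the same check (it is not the Igor code from this post). It collects the size in bytes of each matching file and flags the folder when the sizes are not all equal. The function names and the default extension list are made up for illustration.

```python
import os

def file_sizes(folder, extensions=(".tif", ".tiff")):
    """Return {file name: size in bytes} for matching files in folder.

    The default extension list is an assumption for this sketch.
    """
    sizes = {}
    for entry in os.scandir(folder):
        if entry.is_file() and entry.name.lower().endswith(extensions):
            # st_size is the byte count reported by the file system,
            # comparable to V_logEOF in the Igor approach.
            sizes[entry.name] = entry.stat().st_size
    return sizes

def all_same_size(sizes):
    """True when every file has the same size (or there are no files)."""
    return len(set(sizes.values())) <= 1
```

Only file-system metadata is read here, so no image data is loaded.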
August 9, 2023 at 10:08 pm - Permalink
Instead of Open, you could also try GetFileFolderInfo, which reports the file size in V_logEOF.
August 10, 2023 at 02:44 am - Permalink
The help for GetFileFolderInfo says that V_logEOF is the number of bytes in the data fork, while V_logEOF from FStatus is the total number of bytes in the file, which had always made me think that those values would be different. However, when I checked several file types (.h5, .png), the size was the same from both methods. Maybe one of the WaveMetrics folks can chime in.
However, using GetFileFolderInfo in the loop is several times slower than using Open (~0.12 s versus ~0.04 s for 80 files).
August 10, 2023 at 07:29 am - Permalink
I also didn't understand that part, but it is probably fine for finding very different file sizes. But it makes sense that GetFileFolderInfo is slower, since it grabs more info. Better use Open then.
August 10, 2023 at 09:20 am - Permalink
They are both the number of bytes in the data fork.
The FStatus documentation would be more precise if it said "The number of bytes in the opened fork" which is always the data fork.
The Open operation has always opened the data fork only.
Apple dropped support for resource forks a long time ago so the distinction between data fork and resource fork is moot at this point.
August 10, 2023 at 11:46 am - Permalink
Thanks @KZarzana. I will use the approach you've suggested.
August 10, 2023 at 03:50 pm - Permalink
In case anyone needs it, here is a version that checks three things. In my applications, the number of files must be four or more; otherwise, the stack becomes an RGB image. I do not allow mixtures of file types to create a stack. Finally, I check for the same file size.
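The three checks can be sketched roughly as follows. This is a Python illustration of the logic only, not the Igor version referred to above; the function name, the return convention, and the rule for skipping hidden files are assumptions.

```python
import os

MIN_FILES = 4  # from the post: fewer than four files would load as an RGB image

def check_stack_folder(folder):
    """Return (ok, reason) for a folder of candidate stack images."""
    names = sorted(
        n for n in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, n)) and not n.startswith(".")
    )
    # Check 1: enough files to form a stack.
    if len(names) < MIN_FILES:
        return False, "fewer than four files"
    # Check 2: a single file type. Note that ".tif" and ".tiff" count as
    # different types in this simple comparison.
    extensions = {os.path.splitext(n)[1].lower() for n in names}
    if len(extensions) != 1:
        return False, "mixed file types"
    # Check 3: identical size on disk for every file.
    sizes = {os.path.getsize(os.path.join(folder, n)) for n in names}
    if len(sizes) != 1:
        return False, "file sizes differ"
    return True, "ok"
```

The checks run from cheapest to most specific, so the reason string names the first test that failed.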
August 11, 2023 at 08:19 am - Permalink
Jeff- I see you allow tiff, png and jpg. If no compression is applied, then the number of bytes in the file will be the same as the number of bytes in the ultimate image. But if any compression is done, then different images may wind up with different file sizes. Especially with jpg, the file size will depend on the quality setting and the amount of high-frequency features in the image. In tiff and png images, I would imagine that large patches of zeroes would compress almost to nothing.
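A quick way to see the effect is to compress two buffers of identical length, one all zeroes and one random. The Python sketch below uses zlib (Deflate, the scheme used by PNG and one of the compression options in TIFF) on raw bytes rather than on real image files, so treat it as an illustration only; the 256 x 256 8-bit image size is an arbitrary choice.

```python
import os
import zlib

n_bytes = 256 * 256          # pixel data of one hypothetical 8-bit 256 x 256 image

flat = bytes(n_bytes)        # all zeroes, like a large blank patch
noisy = os.urandom(n_bytes)  # random values, full of high-frequency content

flat_size = len(zlib.compress(flat))
noisy_size = len(zlib.compress(noisy))

# Same pixel count, very different compressed sizes.
print("uncompressed:", n_bytes, "bytes each")
print("compressed zeroes:", flat_size, "bytes")
print("compressed noise:", noisy_size, "bytes")
```

Two images with the same dimensions can therefore differ in file size, and a file-size check alone would reject them.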
August 11, 2023 at 09:42 am - Permalink
Thanks for the heads-up, John. I've restricted stack creation to TIFF images only. As to the possibility of missing true size differences for compressed TIFFs, only one case will fail in my revised approach: when the compressed TIFF files on the drive are all **exactly** the same size, but at least one of the loaded images (out of a minimum of four) has a different uncompressed size than the others. I'll take this as an edge case for someone with greater motivation to tackle.
August 11, 2023 at 03:53 pm - Permalink
If I read this correctly, the only passing cases will be sets of TIFFs with zero compression, the false positive edge case that you mention, or a set of TIFFs where the compression fortuitously gives the same file size. Why not check the file header (actually, the Image File Directory/Directories) for the image width(s) and height(s)? That's what you're really trying to check, right?
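As a rough illustration of that idea, here is a Python sketch that reads the width and height from the first Image File Directory of a classic TIFF file. It is not the Igor snippet from this thread. It assumes a classic (non-BigTIFF) file, looks only at the first IFD, and accepts the two dimension tags stored as either 2-byte SHORT or 4-byte LONG values; the function name is made up.

```python
import struct

TAG_WIDTH, TAG_HEIGHT = 256, 257  # ImageWidth and ImageLength tags
TYPE_SHORT, TYPE_LONG = 3, 4      # 2-byte and 4-byte unsigned integers

def tiff_dimensions(path):
    """Return (width, height) from the first IFD of a classic TIFF file."""
    with open(path, "rb") as f:
        header = f.read(8)
        # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
        order = {b"II": "<", b"MM": ">"}.get(header[:2])
        if order is None or len(header) < 8:
            raise ValueError("not a TIFF file")
        # Bytes 2-3 hold the magic number 42, bytes 4-7 the first IFD offset.
        magic, ifd_offset = struct.unpack(order + "HI", header[2:8])
        if magic != 42:
            raise ValueError("not a classic TIFF (magic number is not 42)")
        f.seek(ifd_offset)
        (n_entries,) = struct.unpack(order + "H", f.read(2))
        dims = {}
        for _ in range(n_entries):
            entry = f.read(12)  # tag, type, count, then a 4-byte value field
            tag, field_type, _count = struct.unpack(order + "HHI", entry[:8])
            if tag not in (TAG_WIDTH, TAG_HEIGHT):
                continue
            if field_type == TYPE_SHORT:
                # SHORT values sit in the first two bytes of the value field.
                (value,) = struct.unpack(order + "H", entry[8:10])
            elif field_type == TYPE_LONG:
                (value,) = struct.unpack(order + "I", entry[8:12])
            else:
                raise ValueError("unexpected type for a dimension tag")
            dims[tag] = value
    if TAG_WIDTH not in dims or TAG_HEIGHT not in dims:
        raise ValueError("width or height tag missing from first IFD")
    return dims[TAG_WIDTH], dims[TAG_HEIGHT]
```

Only the header and one directory are read, so this stays fast even for large images, and comparing the returned tuples across a folder tests the dimensions directly instead of the file size.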
August 14, 2023 at 02:32 am - Permalink
August 15, 2023 at 04:33 am - Permalink
LOL, I wonder if this is some deliberate joke by the creators of TIFF.
August 15, 2023 at 05:20 am - Permalink
Thanks Tony. I had thought that I eventually might do a read-only ImageLoad operation to capture the TAGs. Your approach may be less cumbersome.
August 15, 2023 at 10:04 am - Permalink
The TIFF format is described here
@chozo, the documentation describes it as "an arbitrary but carefully chosen number"
@jjweimer, I edited the snippet to properly handle the case where height and width are encoded as 2-byte integers. I doubt that there are actually any files where this is the case.
August 16, 2023 at 04:19 am - Permalink