We went through several “solutions” before finding the real solution (see updates below).
The problem turned out to be that Apple has added some stuff to their home-brewed implementation of smb. This results in OSX smb client asking the smb server for information that most smb implementations simply do not provide. This is the root cause of the issue.
In our case, our smb server is Samba running on Linux. But even though we “fixed” the issue on the server side, it’s actually a client-side problem: the OSX client does not follow the smb spec. And so if you’re using Windows or some other OS/smb client, the principles here still apply.
Spoiler Alert: Our fix was to enable the vfs_fruit module in Samba. If you’re not using Samba, you’ll need some way to mimic the behavior that the OSX client is expecting.
Our Network Topology
Here at the Plazko.com Tempe Technology Center, or the PTTC as we affectionately call it (ok I just kinda made that name up. It’s really just called “the office”.), we have a small network consisting of maybe ten Windows machines (mix of Windows 10, Windows 8, and Windows 7), a few Android phones and tablets, a few Ubuntu Trusty machines (some virtual, some physical), a couple of Apple OSX Yosemite iMacs and iBooks, an iPad, a couple of iPhones, and a Windows 10 Mobile phone. Most of these machines are connected together via a WiFi connection to our consumer-grade router, a Linksys EA4500, with some using the 5 GHz band but most using the 2.4 GHz band. A few of the machines connect via wired (RJ45) Ethernet, a couple directly, and the rest via a little Ethernet hub.
We also have a Western Digital My Passport removable hard drive attached directly to the Linksys router via USB2. I refer to this hard drive as the “shared drive” in the remainder of this article.
The router has a sharing feature that allows you to share partitions of the hard drive over the network using Smb, FTP, and/or UPnP Media Server. With regard to Smb, I don’t know enough about it to give the details. I always thought of “Smb” as a short way of saying “Samba” and so I assume that the Linksys is running a Samba Server. However, I don’t know if the server that is presumably running on the Linksys speaks Smb1 or Smb2 or Smb3 or some combination of those. I also don’t know which of those protocol versions, if any, are compatible with one another; so, for all I know, running Smb3 could imply that the server inherently supports Smb2 and Smb1, but I cannot say for sure. I also do not know what OS is running on the Linksys. We are using the Linksys firmware v 2.0.37; we are not using the cloud-based firmware.
Why we can’t just change the environment to make the problem go away
We have a few people here who use Macs, either because they insist on it and they have the front-office-power to make such demands, or because they are in the creative department and thus they require Macs, because, well, all the other creative professionals use Macs in all the commercials and magazine ads, but, besides creating a terrible run-on sentence, I also digress.
And what is the other thing that creative professionals usually need along with their 4k Retina Macs? They need access to lots of big files, of course. Nothing takes more disk space than, for example, uncompressed hi-resolution video. It would make sense to just segregate the creative department and give them their own network and simply share the My Passport off of one of their workstations. Except that we also have a few managers who want/need access to the files as well, and they are using Windows machines. Now, for all I know, using one of the Macs as the Smb server might solve the issues for everyone, but I don’t know because I didn’t try it. I’m more experienced with networking in non-Apple environments, and despite the fact that OSX is a Unix-like system, much like Linux, and I am confident in my Linux networking, I am just not confident in my knowledge of the secrets of OSX networking. And so, I would prefer to not use the OSX workstations as the file servers. Another option would of course be to use one of the Ubuntu Servers as the file server, and connect the My Passport to that machine. But I just sort of assumed, based on the tons of complaints out there, that it was Apple’s Smb implementation that was broken, and changing the Smb server to another non-OSX machine would not help.
The Problems: Listing the files on the shared hard drive from the OSX machines is Painfully Slow and transferring files to/from the drive is Also Too Slow
On the Apple machines, when you use Finder to browse the contents of the shared drives, it sometimes takes several minutes (sometimes more than 10 minutes!) to list the contents of a single folder. I believe this problem, at least the variant we experienced, occurs when you have folders that have a ton of sub-folders. In our case, the main “videos” folder contains thousands of child folders, and that was the folder everyone wanted to access.
Finder doesn’t give you the impression it is finding any files or doing any processing. It just shows a blank folder, as if there were no files/folders in it. And then after many minutes, the blank folder will all of a sudden be populated with all of its children. This is especially annoying when troubleshooting, and some of your folders are actually empty. You don’t know if Finder is still loading the file list or if the folder is in fact simply empty.
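One way to tell an empty folder from one that Finder is still loading is to count its entries from a terminal instead. This is only a minimal sketch: the helper name count_entries and the mount point in the usage comment are made up for illustration, and it assumes the share is already mounted on the Mac.

```shell
#!/bin/sh
# count_entries DIR (hypothetical helper name)
# Prints how many entries DIR contains. Hidden files such as .DS_Store are
# counted; "." and ".." are not. File names containing newlines would be
# counted more than once.
count_entries() {
    ls -A "$1" | wc -l | tr -d ' '
}

# Example (hypothetical mount point; substitute your own):
#   count_entries "/Volumes/shared/videos"
```

If this prints 0 the folder really is empty; any other number means Finder simply has not finished listing it yet.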
Some of the Solutions I Tried that Did Not Work, or I Did Not Try
The easiest/quickest work-around people offered is to force the client to use Smb1 by using the cifs:// protocol in the URI instead of smb://. Unfortunately, for me this did not work.
There are many other suggestions floating around the internet. Many of those deal with making changes to the number of “credits” the server issues to the client, but they are mostly specific to making Windows registry changes when Smb server is running on Windows Server. In our case, I have no idea how the Linksys implementation (presumably Samba Server) issues credits, but I’m pretty certain it doesn’t deal with Windows Registry!
Other common suggestions include to use/not use NetBios over TCP, to connect to the server IP Address instead of its NetBios name, to use/not use OpenDNS servers for the client’s domain name resolution, putting the IP address of the server in the client’s HOSTS file, using Path Finder or MuCommander or another third-party file browser on the Mac instead of using Finder, editing nsmb.conf to set notify_off=true, specify a workgroup, replace Apple’s broken custom implementation of Smb client with Samba, use third party network configuration apps, connect the Apple to the AD domain, rollback to OSX 10.5 or earlier, and changing settings in Finder to not show shares and to not calculate folder sizes. One of the humorous solutions that was offered was to throw away your Macs and replace them with Windows machines. One solution that I did not try, that actually sounds like it might work, is to delete all of the .DS_Store files on the shared drive and set the Mac clients to no longer write .DS_Store files by running defaults write com.apple.desktopservices DSDontWriteNetworkStores true.
Here are some of the endless posts and suggested fixes for this issue.
In that last one at discussions.apple.com, I actually found the solution that worked for us. It should be noted that many of the suggested solutions in the threads I listed worked for many people. They just didn’t work for us and our specific issue.
The Solution We Thought Worked: Turn Off Delayed TCP Acknowledgements
!NOTE: This solution did not work permanently… See the updates later on in this post!
The solutions, advice, commentary, suggestions, and opinions provided in this article
may have negative and/or other effects that I have not considered or do not know about.
We offer no warranties or guarantees of any kind that this solution will work and/or
not work and/or that it will and/or will not ruin your computer, ruin and/or damage
your network, ruin and/or damage your hard drive, ruin and/or damage your business,
ruin and/or damage your life, or ruin, damage, hurt, offend, or otherwise cause any
effect of any kind on any person or any place or any thing.
I don’t know exactly what a delayed TCP ack is or what it is used for. And, I don’t know that I actually turned it off. All I know is that I changed this setting, and the problem is completely gone.
To make the change one time to see if it works for you, without rebooting (you will need to do this on each Mac individually):
Open a terminal
Run this command:
sudo sysctl -w net.inet.tcp.delayed_ack=0
Close and re-open Finder and see if the problem goes away
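Before changing the setting, it may be worth writing down the value the Mac is currently using so the change can be undone. The steps above can be wrapped in a small sketch like the following; the function name show_delayed_ack is made up for illustration, and the macOS check is only there because the net.inet.tcp.delayed_ack key is not present on other systems.

```shell
#!/bin/sh
# show_delayed_ack (hypothetical helper name)
# Prints the current delayed-ACK setting and the command that would restore it.
show_delayed_ack() {
    if [ "$(uname -s)" != "Darwin" ]; then
        echo "not macOS: net.inet.tcp.delayed_ack is not available here"
        return 0
    fi
    # -n prints only the value, without the key name
    current=$(sysctl -n net.inet.tcp.delayed_ack) || return 1
    echo "current value: $current"
    echo "to restore later: sudo sysctl -w net.inet.tcp.delayed_ack=$current"
}

# Example, run on each Mac before making the change:
#   show_delayed_ack
```

A value set with sysctl -w lasts only until the next reboot, so restarting the Mac also undoes the one-time change.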
To make the change permanent:
Create/edit the file /etc/sysctl.conf
sudo vim /etc/sysctl.conf
Add this line to the configuration file:
net.inet.tcp.delayed_ack=0
If there is already a line that says net.inet.tcp.delayed_ack, then change its value to 0 so it matches this example.
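If you would rather not edit the file by hand on every Mac, the "add the line, or change it if it is already there" step can be scripted. The following is a rough sketch rather than a tested deployment tool; set_conf_key is a hypothetical helper name, and it needs to run as root to write to /etc/sysctl.conf.

```shell
#!/bin/sh
# set_conf_key FILE KEY VALUE (hypothetical helper name)
# Rewrites FILE so that it ends with exactly one "KEY=VALUE" line. Any older
# assignment of KEY is dropped; every other line is kept unchanged.
# FILE is created if it does not exist yet.
set_conf_key() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        awk -v k="$key" '
            index($0, "=") == 0 { print; next }   # no "=": not an assignment
            {
                name = $0
                sub(/=.*/, "", name)                # keep the text before "="
                gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim spaces and tabs
                if (name != k) print
            }' "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write through the existing file so its owner and permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (run as root, for instance from a script started with sudo):
#   set_conf_key /etc/sysctl.conf net.inet.tcp.delayed_ack 0
```

Running it twice leaves a single line in the file, so it is safe to repeat.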
Why did I try all of the other solutions, but kept ignoring this one?
Many of the proposed fixes had a lot of “thanks, that fixed it!” type of responses. So, I would get excited and try the fix, only to find that it didn’t work for me.
I have seen the suggestion of turning off the delayed acks before, but I just ignored it for a few reasons:
It was rarely offered as a suggestion, whereas the other solutions seemed to be repeated many times by many people
It never got any “wow, that worked!” type of replies.
It seemed over-technical and over-specific, as if some super geeky network engineer who knows too much about TCP/IP made the suggestion based only on his knowledge that delayed acks can cause network latency. I suspected, based on this, that it might improve overall network performance by some amount, but that it didn’t actually directly affect the exact problem we were having. This may still be true. Maybe Finder is still doing 10,000 too many operations but I just don’t see it because it is doing them all really fast now. I will wait for the results of the DS_Store suggestion (see Important Notes #3 below) to decide which is the case.
Important Notes
1. Before trying any solution, including the one that worked for us, I try to “clean the slate” as much as possible. In this case, I went to the “connect to server” dialog in Finder and, under the recent servers list (the drop-button with a little clock icon on it), I clicked the “clear recent servers” button so that the machine would forget all of its memorized server connections. I also made sure to “undo” any of my previous fix attempts, which included deleting the nsmb.conf file and using default (DHCP) networking and DNS settings. I do not know if this makes a difference but it should be kept in mind.
2. Another article I came across suggested setting net.inet.tcp.delayed_ack=2 to put it into “compatibility mode”. I do not know if this is better than turning it off completely (which is presumably what I do when I set it to 0), or if it works at all. But sometimes turning off a default feature is not the best idea when that feature has a specific option that allows it to stay on and work. I think it is worth trying and learning more about.
3. I have read hypotheses (some written as fact) that the reason folders with a lot of child folders are slow is because Finder is reading and processing the .DS_Store files in every one of the folders. The suggested fix is to delete all of the .DS_Store files from the shared drive and then to tell the clients to stop using the DS_Store files. Given that I can see thousands of DS_Store files on our shared drive, I think this may have also fixed our issue or may be the actual correct fix for the issue. I plan on giving it a try. Right now I am just so glad to have the issue “fixed” that I am afraid to mess with anything. Basically, after you delete all of the DS_Store files from the shared drive, you run the following command from the terminal on the Mac clients: defaults write com.apple.desktopservices DSDontWriteNetworkStores true
4. If you are not familiar with using a terminal, you may want to have a friend or professional help. Or, even better, fire up an old computer that you do not care about, and learn the basics of using a terminal, before trying these or any solutions.
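The .DS_Store cleanup mentioned in the notes above can be sketched with find. The helper names and the mount point are made up for illustration, and it is worth looking at what the list step prints before running the delete step.

```shell
#!/bin/sh
# list_ds_store DIR (hypothetical helper name)
# Prints the path of every .DS_Store file below DIR without changing anything.
list_ds_store() {
    find "$1" -type f -name '.DS_Store' -print
}

# delete_ds_store DIR (hypothetical helper name)
# Removes every .DS_Store file below DIR. Nothing else is touched.
delete_ds_store() {
    find "$1" -type f -name '.DS_Store' -exec rm -f {} +
}

# Example (hypothetical mount point; substitute your own):
#   list_ds_store /Volumes/shared | wc -l
#   delete_ds_store /Volumes/shared
#   defaults write com.apple.desktopservices DSDontWriteNetworkStores true
```

The defaults command has to be run on each Mac; otherwise Finder will recreate the files the next time someone browses the share.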
UPDATE (a few days later)
After about an hour of working well, one of the Macs is now slow again. I’m going to try the DS_Store fix and then I will post an update.
The DS_Store fix did not ultimately solve the issues.
UPDATE 3 (June 7, 2016) with the ACTUAL SOLUTION
Frustrated with this issue, combined with a few other needs, my boss agreed to buying a new machine to act as a local domain controller and file server. This would allow me to remove the hard disk from the router and connect it directly to Ubuntu Server and share it using Samba. This would give me the advantage of having full control over the server instead of relying on the Linksys firmware and Smb implementation, allowing me to change any server-side configurations that might be contributing to the problem. I also hoped that it would just be fixed automatically by itself, simply by moving to the latest and greatest of hardware and software.
After setting up the new box with Ubuntu Server Xenial and sharing the drive with Samba, there was no difference. We still had the same problems. So… on to the Samba configs.
The setting I was most interested in was the one that controls how many credits the server grants to a client. It made sense to me that the server wasn’t issuing enough credits and this was causing the delay. Unfortunately, I found that the default value for this setting in Samba on Ubuntu Xenial was already plenty big.
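To see what value a Samba server is actually using, testparm can print the effective configuration, defaults included. The sketch below assumes the setting in question is Samba's smb2 max credits parameter (the text above does not name it); smb2_max_credits is a hypothetical helper name.

```shell
#!/bin/sh
# smb2_max_credits (hypothetical helper name)
# Reads "testparm -sv" output on stdin and prints the value assigned to the
# "smb2 max credits" parameter, if it appears.
smb2_max_credits() {
    awk -F'=' '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)        # trim the parameter name
        }
        name == "smb2 max credits" {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)       # trim the value
            print value
        }'
}

# Example, run on the Samba server (-s skips the prompt, -v includes defaults):
#   testparm -sv 2>/dev/null | smb2_max_credits
```

If nothing is printed, the parameter is not in the output at all, which would suggest a Samba build that does not know about it.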
A Serving of Fruit with Every Meal
Thankfully I stumbled across this post at spiceworks (by TimFitz) and he talked about a vfs_fruit module for Samba that fixes the problem. I enabled the module by adding the following to smb.conf on the server under the troubled share: vfs objects = fruit streams_xattr.
I restarted Samba and, fingers-crossed, so far it seems to have fixed the problem.
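For reference, here is roughly what the change looks like as a share section. Only the vfs objects line comes from the fix described above; the share name, the path, and the helper name fruit_share_stanza are placeholders, so treat this as a sketch rather than our exact configuration.

```shell
#!/bin/sh
# fruit_share_stanza NAME PATH (hypothetical helper name)
# Prints an smb.conf share section for NAME that enables vfs_fruit.
fruit_share_stanza() {
    cat <<EOF
[$1]
    path = $2
    vfs objects = fruit streams_xattr
EOF
}

# Example (hypothetical share name and path). For a share that already
# exists, add just the "vfs objects" line to its section instead of
# appending a second section with the same name.
#   fruit_share_stanza videos /mnt/passport/videos | sudo tee -a /etc/samba/smb.conf
#   testparm -s                   # check that the file still parses
#   sudo systemctl restart smbd   # restart Samba on Ubuntu Xenial
```

Running testparm before the restart catches typos in smb.conf while the old configuration is still serving clients.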
...after following some of the Samba discussion threads linked here,
I have found what seems like a solution to this problem -- much to my cautious relief.
Basically, in Samba 4.2.0 and greater (which my 6.2.0 ReadyNAS has) there is support for an extension to vfs called vfs_fruit.
This is designed to deal with the problem described here (as originally linked to by @David_CSG here) -- or, rather,
the multiple problems introduced by (my semi-interested interpretation) Apple's free-wheeling extending of SMB2...
From the vfs_fruit man page:
Having shares with ADS support enabled for OS X client is worthwhile
because it resembles the behaviour of Apple's own SMB server
implementation and it avoids certain severe performance degradations
caused by Samba's case sensitivity semantics.
Update 6/30 – Fix Confirmed!!
I am extremely happy to report that the vfs_fruit module has effectively solved not only the slow folder listing in Finder but has also improved file transfer speeds to/from the shared drive by at least 20x.