Brock Palen writes that creating too many small files can kill application performance on Lustre, but there is a remedy.
Lustre metadata (the record that a file exists) lives on its own server. There is only one for an entire file system. That huge, tens-of-petabytes file system? Only one metadata server. This can be a bottleneck. To open a file, the client first talks to this MDS (metadata server), which tells the client which OSS (object storage server) to write data to. Lustre will have many OSSs. If the client keeps creating new files, or opening and closing the same file, it keeps making that trip back to the single MDS. If the client creates one file, doesn't close it, and keeps writing to it, the client never speaks to the MDS again, just to the many OSS nodes. The client not only avoids the extra network round trips to the MDS, it also avoids this single-server bottleneck.
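To make the contrast concrete, here is a minimal Python sketch of the two access patterns described above. It uses only ordinary file I/O, so it runs on any file system; the function names, the `record_NNNNNN.dat` naming scheme, and the offset index are illustrative assumptions, not part of Lustre or of Palen's article. On Lustre, the first pattern would issue one metadata request per record, while the second opens a single file once and then only writes data.

```python
import os
import tempfile


def write_many_small_files(directory, records):
    """Anti-pattern: one file per record, so one open/create per record.

    On Lustre, each open() is a metadata operation handled by the MDS.
    """
    for i, record in enumerate(records):
        path = os.path.join(directory, f"record_{i:06d}.dat")
        with open(path, "wb") as f:
            f.write(record)


def write_single_file(path, records):
    """Remedy: open one file once and keep appending records to it.

    Returns a list of (offset, length) pairs so that individual
    records can still be located later.
    """
    index = []
    with open(path, "wb") as f:
        for record in records:
            index.append((f.tell(), len(record)))
            f.write(record)
    return index


def read_record(path, offset, length):
    """Fetch one record from the combined file using its index entry."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


if __name__ == "__main__":
    records = [f"sample {i}\n".encode() for i in range(1000)]
    with tempfile.TemporaryDirectory() as tmp:
        write_many_small_files(tmp, records)      # 1000 files created
        combined = os.path.join(tmp, "combined.dat")
        index = write_single_file(combined, records)  # 1 file created
        print(read_record(combined, *index[42]))
```

The index is kept in memory here for brevity; a real application would need to persist it, or use a container format such as HDF5 that tracks record locations itself.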
Read the Full Story.