To move our media files from one computer to another, it is critical that their names can be understood by the different file systems and encodings they meet.
This article sets out a set of characters which meets all these criteria. It was originally based on content from the Wikipedia online encyclopedia, especially the articles Filenames, Comparison of file systems and ASCII character encoding, and on Naming a File from MSDN. Please add other references to improve this article.
Introduction
If you want to make sure your files can be safely moved between different types of computers, you need to consider what your files can and can't be called and how they can and can't be organised. For example, a file called uk_census_of_15.5.1851.txt may not be understood everywhere, because some systems (ISO 9660 discs among them) allow only one period in a name. And even though your computer might let you make a file called birth_certificate_of_André_Mollier.jpg, it may not open on all computers because of that accented é.
If you follow the rules below then your directories and files will be handled without issues on all of the following:
- Servers, USB drives, CDs, DVDs, Blu-ray discs and HD DVDs
- Hard drives formatted with FAT32, NTFS or EXT
- Computers running Windows 95 and later
- Any POSIX-compliant system (Linux, Unix, OS X), and much more
If your operating system dates from before 1995, be aware that ISO 9660:1988 level 1 does not support the rules described on this page. The later Joliet extensions fixed several of these issues.
File and directory names
A list of unsafe characters would take up far too much space, so here is the list of what is safe. Note that a space is not a safe character.
- a-z Lowercase alphabetical characters without any accents (see below)
- A-Z Uppercase alphabetical characters without any accents (see below)
- 0-9 Numerals
- - Hyphens/dashes (except at the start of a file name)
- _ Underscores
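As a rough illustration of the list above, the following Python sketch converts a piece of text into one that uses only the safe characters. The function name to_safe_characters is made up for this article and the approach is only one of several possible: accented letters are reduced to their unaccented form, and anything else outside the list (spaces included) is replaced with an underscore. It is meant for the part of the name before the extension, since a period would also be replaced.

```python
import re
import unicodedata


def to_safe_characters(text):
    """Return text using only a-z, A-Z, 0-9, hyphen and underscore.

    Hypothetical helper for illustration, not part of any library.
    """
    # Split accented letters into base letter + combining accent,
    # then drop the combining accents so only the plain letter is left.
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Whatever is still outside the safe list becomes an underscore.
    return re.sub(r"[^A-Za-z0-9_-]", "_", plain)


print(to_safe_characters("birth certificate of André Mollier"))
# birth_certificate_of_Andre_Mollier
```

Letters that have no unaccented decomposition (ø, for instance) are not transliterated by this sketch; they simply become underscores.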
The characters from the list of safe characters must be used with some care:
Always use mixed case, because an all-uppercase name such as MYFILE.txt can become myfile.txt without warning, whereas MyFile.txt will not be changed by most file systems. Windows ignores capitalisation when comparing names, so never rely on case alone to tell two files apart.
Hyphens should not start a file name, because many command-line programs and scripts treat anything that begins with a hyphen as an option.
The period (full stop), while allowed, may only be used to mark the start of a file extension. It may not be used to start a directory name (a limitation of ISO 9660).
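The character rules above can be checked mechanically. The sketch below is one possible reading of them, not a definitive test: the name is_safe_name is invented for this article, and it assumes that an extension consists only of letters and numerals. It does not check the advice about mixed case.

```python
import re

# Name: safe characters only, and the first one may not be a hyphen.
# Extension (optional): exactly one period followed by letters or numerals.
# The shape of the extension is an assumption made for this sketch.
SAFE_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_-]*(\.[A-Za-z0-9]+)?")


def is_safe_name(name):
    """Return True if a file or directory name follows the rules above."""
    return SAFE_NAME.fullmatch(name) is not None


for candidate in [
    "uk_census_1851.txt",
    "uk_census_of_15.5.1851.txt",              # more than one period
    "birth_certificate_of_André_Mollier.jpg",  # accented letter
    "-notes.txt",                              # starts with a hyphen
    "my file.txt",                             # contains a space
]:
    print(candidate, is_safe_name(candidate))
```

Only the first candidate passes; each of the others breaks one of the rules noted in its comment.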
Illegal file names
- CON, PRN, AUX, CLOCK$, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8 and LPT9 (reference); also . and .. (reference)
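A check against this list might look like the sketch below. The function name is_reserved_name is made up for this article. It makes two assumptions beyond the list itself: that capitalisation does not matter (con is treated like CON), and that a reserved name followed by an extension (CON.txt) is best avoided as well.

```python
# The names from the list above.
RESERVED = (
    {"CON", "PRN", "AUX", "CLOCK$", "NUL"}
    | {f"COM{n}" for n in range(10)}
    | {f"LPT{n}" for n in range(10)}
)


def is_reserved_name(name):
    """Return True if name is, or begins with, an illegal file name."""
    if name in (".", ".."):
        return True
    # Assumption: compare only the part before the first period,
    # ignoring case, so that "con.txt" is also caught.
    stem = name.split(".", 1)[0]
    return stem.upper() in RESERVED


print(is_reserved_name("CON"))          # True
print(is_reserved_name("lpt1.txt"))     # True
print(is_reserved_name("console.txt"))  # False
```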
Limits
There are limits imposed by operating systems and file formats. The lowest of each of these limits (for systems after 1994) is listed below.
- The number of directories in any path on a CD must not be more than eight, e.g. (the CD itself)/2/3/4/5/6/7/file.txt (the limit of ISO 9660). Is this valid after 1995? --DuncanNZ 12:21, 16 October 2008 (EDT)
- The number of directories on a CD is limited to 65,535 (the limit of ISO 9660 on Windows)
- The length of a file's path, e.g. /genealogy/sources/uk_census_1851.txt, is limited to 256 characters (the limit of Windows Path Size)
- The length of a file's name, e.g. uk_census_1851.txt, is limited to 31 characters including the period and extension (the limit of the Macintosh HFS file system)
- The size of a file is limited to 2 gigabytes (the limit of ISO 9660 and the Macintosh HFS file system)
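The path-related limits in this list can be tested with a few lines of Python. This is only a sketch under stated assumptions: the name limit_problems is invented for this article, paths are assumed to use / as the separator, the disc or drive itself is counted as the first of the eight levels (as in the CD example above), and the 31-character limit is applied to directory names as well as file names. The limits on the total number of directories and on file size are not checked.

```python
MAX_LEVELS = 8          # ISO 9660, counting the disc itself as level one
MAX_PATH_LENGTH = 256   # Windows path size
MAX_NAME_LENGTH = 31    # Macintosh HFS, period and extension included


def limit_problems(path):
    """Return a list of problems found in path; empty if none were found."""
    parts = [part for part in path.split("/") if part]
    problems = []
    # Assumption: the disc or drive itself is the first level.
    levels = len(parts) + 1
    if levels > MAX_LEVELS:
        problems.append(f"path is {levels} levels deep (limit {MAX_LEVELS})")
    if len(path) > MAX_PATH_LENGTH:
        problems.append(
            f"path is {len(path)} characters long (limit {MAX_PATH_LENGTH})"
        )
    for part in parts:
        if len(part) > MAX_NAME_LENGTH:
            problems.append(
                f"{part!r} is {len(part)} characters long (limit {MAX_NAME_LENGTH})"
            )
    return problems


print(limit_problems("/genealogy/sources/uk_census_1851.txt"))  # []
```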
Needing clarification
The following are not recommended but I can't find any reference to why they could be a problem. They are all ASCII characters.
# | Number sign | Yes | Not reserved (ref) |
& | Ampersand | Yes | Not reserved (ref) |
' | Apostrophe | Yes | Not reserved (ref). Some websites have trouble handling file names containing apostrophes (PHP Bug #33198) |
( and ) | Parentheses | Maybe | Unclear (ref) |
+ | Plus | Yes | Not reserved (ref) |
, | Comma | Yes | Not reserved (ref) |
; | Semicolon | Yes | Not reserved (ref) |
= | Equals sign | Yes | Not reserved (ref) |
@ | At sign | Yes | Not reserved (ref) |
[ and ] | Square brackets or box brackets | Yes | Not reserved (ref) |
^ | Caret | Yes | Not reserved (ref) |
_ | Underscore | Yes | Not reserved (ref) |
{ and } | Curly brackets | Yes | Not reserved (ref) |
~ | Tilde | Yes | Not reserved (ref) |
External links
- Digital Cameras and Genealogy
- Filenames as a Strategy to Managing Your Image Assets and Recommendations for Limitations on Image Filenaming from the Controlled Vocabulary website
- How to make file names cross platform
- Filenames: Which characters to use, Which characters to avoid
- Mac and Windows OS File/Folder naming rules