The Tab-Indented List Format Explained
I'm a great believer in the sharing of knowledge. That's why everything about Keyword Lists, or at least everything that I know about Keyword Lists, is shared for free on this website. If you want to build your own keyword list, then this page explains what you need to know. If you want to edit an existing list, then please see the Keyword List Editing page.
Hierarchical Keyword Structure
The most basic list of keywords is known as a 'Flat File'. It consists of a collection of keywords separated by commas, as in this example:
Mammal, Household Pets, Dog, Fox Terrier, Natural World, Plant, Tree, Scots Pine
Whilst manageable in small numbers, a large file with thousands of keywords would soon be completely unwieldy and impossible to use. Enter the Hierarchical Keyword Structure. This organizes your keywords into a collapsible structure, so you only need to look at the ones that you are currently interested in. Instead of scrolling through an ever-growing list of keywords, you can 'drill down' and locate them in a logical way, through intuitive categories on different levels.
When you first view a hierarchical keyword list in Lightroom, it is collapsed, which makes it easier to explore as only the top categories are shown. To expand the list, click on the small triangles that face to the right...
Here we see the same example list, with all its categories expanded. 'Fox Terrier' is contained within 'Dog', which is contained within 'Household Pets', which is contained within 'Mammal'. The same structure is shown for the next top-level category: 'Natural World', which contains 'Plant', which contains 'Tree', which contains 'Scots Pine'. The small triangles to the left of these keywords indicate that these levels are collapsible. The numbers on the right, which are all zero, indicate that no photos in the Lightroom database have yet been tagged with these keywords.
While Lightroom gives you this organizational structure from within the program itself, the interface it provides to create such a structure is not the best to work in, especially when your keyword lists become more and more complicated. It is therefore easier to use an external program to build your keyword lists. When complete, the lists can be easily imported into Lightroom and used in exactly the same way as if you had created them from within Lightroom.
Programs To Build A Keyword List
The good news is that all you need to build a basic keyword list is a Text Editor program. You might already have a favorite one that you are used to. The essential requirements for a suitable Text Editor program are:
- Saves its files in the UTF-8 format
- Visualizes hidden characters, such as TAB and SPACE
Other useful functions are:
- Global and selective sorting
- Box selection
- Spell checking
- Case changing and Capitalizing
- Search and Replace
Won't a Word Processor program do just as well? You might be quite familiar with your favorite Word Processor program, and quite like the way it works. Unfortunately, Word Processors, whilst trying to be helpful, ordered, and pretty, save more than just the basic text information. Font height and color, paragraphs and page footers: none of these are useful in a Photo Keyword List if we then wish to import it into Lightroom.
My suggestion for a suitable Text Editor program is 'PSPad'. It's free to download and use (though they do ask for a donation), and is just the thing to get you started.
NB: When you start PSPad for the first time, click 'Settings > Program Settings > Editor (part 2)'. You should then check the box marked 'Real Tabs', and set the 'Tab Width' to 6 or more. I'd also recommend that you click 'Format > Font' and choose a fixed-width font, like Courier New.
Here we see a screen capture of a simple Keyword List displayed in 'PSPad'. I've clicked 'View > Special Characters' to get the program to display the characters that are normally hidden. TAB is indicated by the little '>>' symbol. SPACE is indicated by a dot or period. CARRIAGE RETURN/LINE FEED is indicated by a backwards-P at the end of each line. (It's called a 'Pilcrow'.) The top-level categories in the Keyword List (Mammal, and Natural World) are positioned at the left of their respective lines, with no preceding characters. The next level down in the keyword hierarchy has one TAB character in front of it. Each subsequent level down in the keyword hierarchy has one more TAB character in front of it than its parent level. It is thus quite easy to visualize the hierarchical structure when you look at the text file.
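To make the idea concrete, here is a minimal Python sketch of how the leading TAB characters map onto levels in the hierarchy. It is only an illustration, not how Lightroom or any particular tool reads the file: the function names `parse_tab_list` and `full_paths` are my own, and the ' / ' separator is used purely for display.

```python
def parse_tab_list(text):
    """Return (depth, keyword) pairs; depth is the number of leading TABs."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # this sketch simply skips blank lines
        keyword = line.lstrip("\t")
        depth = len(line) - len(keyword)
        entries.append((depth, keyword))
    return entries


def full_paths(entries):
    """Expand each entry into its path from the top-level category down."""
    stack = []
    paths = []
    for depth, keyword in entries:
        del stack[depth:]  # forget any levels at or below this depth
        stack.append(keyword)
        paths.append(" / ".join(stack))
    return paths


# The example list from this page, written with explicit TAB characters.
sample = (
    "Mammal\n"
    "\tHousehold Pets\n"
    "\t\tDog\n"
    "\t\t\tFox Terrier\n"
    "Natural World\n"
    "\tPlant\n"
    "\t\tTree\n"
    "\t\t\tScots Pine\n"
)

for path in full_paths(parse_tab_list(sample)):
    print(path)
```

Each printed line shows a keyword together with all the categories that contain it, which is the same information the TAB characters carry in the text file.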
Advanced Keyword List Building
A text editor is fine to start with, but what happens if you want to sort your list, spot errors, identify duplicates, search and replace, visualize the list, and perform other functions? Surprisingly, there are remarkably few programs designed to work with tab-indented lists. I used to create my own tools to build my own lists, as there was nothing else that did what I needed. Eventually I put all of the tools together, added a raft of new features, and now sell the program from this website - see my Tab-List Tools page.
Keyword List Building Rules
- There are a number of special characters that you must not include in your Keyword Lists. Obviously your Keywords should not include the TAB character, as this is used to lay out the list. You should also avoid the use of commas, semicolons, greater-than (>) and less-than (<) characters, and the pipe character ( | ). I have also added the query or question mark (?) to the list. It is not an illegal character like the other ones, but its presence often indicates a transliteration or change of character encoding error, so it is best to avoid it. If you are Copying-And-Pasting to your list, it's easy to miss these characters, so ALWAYS do a search for them before finalizing your list.
- It is not possible to use Comments in a Lightroom keyword list.
- If extra tabs are included, the category won't show. In this example, Category2 won't show:
TAB TAB TAB Category2
- Empty lines (TAB carriage-return) cause an error.
- Lightroom automatically puts entries on the same tab-level into alphabetical order. Where entries MUST be kept in a specific order, they can be prefixed with a number or ~ symbol.
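Several of the rules above are mechanical enough to check with a few lines of code. The following Python sketch is one possible way to do it, under my own assumptions: the function name `check_keyword_list` and the wording of the messages are invented for illustration, and 'excess indent' is taken to mean a line indented more than one TAB deeper than the line before it. It is not a substitute for a full checking tool.

```python
# Characters the rules above say to avoid inside a keyword.
UNWANTED = set(",;<>|?")


def check_keyword_list(text):
    """Return a list of (line_number, message) problems found in the text."""
    problems = []
    previous_depth = -1  # so that the first line is expected to have no TABs
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        keyword = line.lstrip("\t")
        depth = len(line) - len(keyword)
        if depth > previous_depth + 1:
            problems.append((number, "excess indent"))
        if "\t" in keyword:
            problems.append((number, "TAB inside or after keyword"))
        found = sorted(UNWANTED.intersection(keyword))
        if found:
            problems.append((number, "unwanted characters: " + " ".join(found)))
        previous_depth = depth
    return problems


# A deliberately faulty list: an over-indented line, a line holding only
# a TAB, and a keyword containing a comma.
faulty = "Mammal\n\t\t\tCategory2\n\t\nDog, Cat\n"
for number, message in check_keyword_list(faulty):
    print(number, message)
```

A list that follows the rules should produce no output at all.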
Error-checking the Keyword List
Spotting such errors by eye in a tab-indented keyword list is a very difficult task. PSPad and many other text editors come with basic spelling checkers, but that is as far as they can help. To identify and correct blank lines, excess indents, missing brackets, trailing TABs, and the presence of illegal and unwanted characters in your Keyword List, you really need some kind of tool that can do the job for you automatically.
The good news is that such a tool is now available here on this website, and it runs on all recent Windows computers. You can use it to check for errors, correct them, and sort your own tab-indented keyword lists. Actually, it has a lot of features unique to tab-indented lists. You can find it here: Tab-List Tools.
Keyword List Categories
If you look at the image below, you will see that some words are surrounded by square brackets. These words are called 'Category Words'. They are not normally exported in the keyword list when you add keywords to your images, so Category Words are useful to help define a path within your Keyword List that you wouldn't necessarily wish to add to the keywords in the photograph.
It is a good practice to make your categories all UPPERCASE since Lightroom gives no indication when it displays the Keyword List that a word in the list is a category, and not just a normal keyword.
Sometimes you might need a word to be a category, yet still wish to add it to the keyword output. In that case, the simplest method is to duplicate the word, once with square brackets and once without.
If you DID wish to include all the 'Category Words' as keywords when you export, you can check the Lightroom option 'Write Keywords as Lightroom Hierarchy'.
Here we see what the Keyword List above looks like when it has been imported into Lightroom. But wait a minute - what has happened to the words beneath 'baby' and 'child'? The 10 words concerned aren't showing at all.
Keyword List Synonyms
The missing 10 words are not shown because they have been classed as 'synonyms'. My dictionary defines 'synonym' as "Two words that can be interchanged in a context are said to be synonymous relative to that context". In other words, and especially where Keyword Lists are concerned, a synonym is a word that we always wish to be added to the Keyword List when its 'father-word' is selected. So, in this example, when you choose the keyword 'baby', the synonyms 'babies', 'infant', 'infants', 'new born', and 'newborn' will be chosen as well. Synonyms appear on separate lines below the 'father-word' and are indented with one extra TAB character. They are also surrounded by curly braces. When used in Lightroom, synonyms don't appear in the keyword list, yet you can filter and search with them.
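The two markers described on this page, square brackets for Category Words and curly braces for synonyms, can be told apart with a very small amount of code. This Python sketch assumes exactly those two conventions and nothing more. The function names are my own, and the category name '[AGE]' in the sample is made up for illustration; only 'baby' and its five synonyms come from the example above.

```python
def classify(entry):
    """Classify one entry (leading TABs already removed) by its delimiters."""
    if entry.startswith("[") and entry.endswith("]"):
        return ("category", entry[1:-1])
    if entry.startswith("{") and entry.endswith("}"):
        return ("synonym", entry[1:-1])
    return ("keyword", entry)


def synonyms_by_keyword(text):
    """Map each ordinary keyword to the synonyms listed directly beneath it."""
    result = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines in this sketch
        kind, word = classify(line.strip("\t"))
        if kind == "keyword":
            current = word
            result[current] = []
        elif kind == "synonym" and current is not None:
            result[current].append(word)
        else:
            current = None  # a category does not own synonyms here
    return result


sample = (
    "[AGE]\n"  # hypothetical category name
    "\tbaby\n"
    "\t\t{babies}\n"
    "\t\t{infant}\n"
    "\t\t{infants}\n"
    "\t\t{new born}\n"
    "\t\t{newborn}\n"
)

print(synonyms_by_keyword(sample))
```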
Saving the Keyword List
When you have finished building your Keyword List, save it as a text file, with the '.txt' suffix. The format of the file is important: if you have the option, then set the file encoding to UTF-8. On a Mac the line endings must be standard Unix. To learn how to import the file into Lightroom, see my Lightroom Keyword Import page.
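If you generate or clean up your list with a script rather than a text editor, the same two settings apply. The Python sketch below writes UTF-8 with Unix (LF) line endings. The function names and the file name in the usage note are hypothetical, and ending the file with a single final line break is my own assumption rather than a documented Lightroom requirement.

```python
def encode_keyword_list(lines):
    """Join the lines with Unix line endings and encode them as UTF-8."""
    text = "\n".join(lines) + "\n"  # assumption: one final line break
    return text.encode("utf-8")     # Python's utf-8 codec adds no BOM


def save_keyword_list(lines, path):
    """Write the list in binary mode so no line endings are translated."""
    with open(path, "wb") as handle:
        handle.write(encode_keyword_list(lines))
```

For example, `save_keyword_list(["Mammal", "\tDog"], "keywords.txt")` would write a two-line file to a file called 'keywords.txt'.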
Other Keyword List Building Methods
If you have a spreadsheet program like Excel then you can also use that to build your keyword lists. Some people prefer to use spreadsheets to build their Keyword Lists as the column and row layout can be more intuitive and help prevent you from accidentally deleting or including extra TAB characters.
Here we see an example of the same list as before being built in an Excel spreadsheet. My Office Assistant has also snuck into the shot! To save the Keyword List in the correct format for Lightroom, click 'File > Save As > Save As Type > Text (Tab delimited)'. Open the file in your Text Editor to check that it has not added extra trailing TAB characters to some of the lines - if so, they will need to be deleted. You can use 'PSPad' to do that by clicking 'Search > Replace' then in the window that opens, add "\t$", WITHOUT the quote-marks, into the 'Find' box, and leave the 'Replace' box empty. Click 'Direction > Entire Scope', and then 'OK'.
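The same clean-up can be sketched in Python for anyone who prefers a script. Note that this is a variation on the search pattern above, not the identical one: it uses '\t+' so that several trailing TABs on one line are removed in a single pass, and it also copes with Windows (CR LF) line endings. The function name `strip_trailing_tabs` is my own.

```python
import re

# One or more TABs that sit at the end of a line, optionally followed by
# a carriage return (which is left in place).
TRAILING_TABS = re.compile(r"\t+(?=\r?$)", flags=re.MULTILINE)


def strip_trailing_tabs(text):
    """Remove trailing TABs from every line; leading TABs are untouched."""
    return TRAILING_TABS.sub("", text)


exported = "Mammal\t\t\r\n\tDog\t\r\n\t\tFox Terrier\r\n"
print(repr(strip_trailing_tabs(exported)))
```

Only the TABs at the ends of lines are removed, so the indentation that defines the hierarchy is preserved.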
You should note that Microsoft Excel 2003 has no means of choosing the character-set that it uses in a '.csv' export, and defaults to ANSI. If you have foreign and accented characters in your list, you are likely to find that they will be corrupted during the export process. The way around this is to choose 'File > Save As' and set the 'Save As Type' to Unicode Text. This will save the file with the character-set 'UTF-16'. You can then open that in your text editor, and copy-and-paste it to a new 'UTF-8' file, which should retain all of the foreign and accented characters in your list. Don't forget to do a 'File Compare' to see that this is so.
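As an alternative to copying and pasting between files, the conversion from UTF-16 to UTF-8 can be sketched in a few lines of Python. This assumes the file begins with a byte-order mark, which Python's 'utf-16' codec uses to work out the byte order. The function names and any file names you pass in are hypothetical.

```python
def utf16_to_utf8(data):
    """Decode UTF-16 bytes (with a byte-order mark) and re-encode as UTF-8."""
    text = data.decode("utf-16")  # the BOM is read and then discarded
    return text.encode("utf-8")


def convert_file(source_path, target_path):
    """Read a UTF-16 file and write its contents to a new UTF-8 file."""
    with open(source_path, "rb") as source:
        data = source.read()
    with open(target_path, "wb") as target:
        target.write(utf16_to_utf8(data))


# A quick check in the spirit of 'File Compare': the text must be the
# same before and after the conversion.
original = "Caf\u00e9\t\u00c9cole\r\n"
converted = utf16_to_utf8(original.encode("utf-16"))
print(converted.decode("utf-8") == original)
```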