Backing up and recovering Endnote libraries

I try to back up every file that gets changed in my working directory every day or two. This is not a problem except for one program: Endnote, still my main bibliography manager. Unlike 99% of all programs, even if you do nothing but open and close an Endnote library (*.enl) file, Endnote will mark anywhere from 10 to 12 files as changed, some of them buried in sub-sub-directories. This makes backing up a pain, and tempts me to folly, like backing up every month instead of every other day.

Here is my somewhat random method to minimize Endnote’s backup workload. NOTE: This method is based on MY habits and needs, which are highly individual (weird). YOUR needs are different from mine, so be sure to think about what you need before you copy it.

Anyway, here is my routine. Any time the *.enl file is changed, back it up. The .enl file does NOT change every time you open it, and when it DOES change, it means your data has really changed, so, no problem.

However, some things DO change without there being a difference in your REAL data; this is the stuff in the two (or three) data sub-directories associated with the .enl file, and this is where I get headaches.

Say you have a library file called dummy.enl. All versions of Endnote since 7 or 8 will create an associated directory called dummy.Data. Newer versions (don’t know since when) can have as many as three sub-directories under dummy.Data. Here is what they are:

dummy.Data\PDF: If you attach any files (pdf, etc.) to your library items using the “copy to local” option, they go in this sub-directory. Obviously, not every .Data directory will have this. I have stern habits about how I use this feature, which I won’t go into here.

dummy.Data\trash (since Endnote X1 at least): This has two things: a file called trash.enl and another sub-directory, dummy.Data\trash\rdb. This will be filled with anywhere from 6 to 12 files for every Endnote library. Yuck.

When backing up, make sure that everything in the “trash” subdirectories is ignored by your backup programs. Endnote saves stuff you delete (“put in the trash”), and then asks you if you want to get rid of it every time you open and close the library. Do not treat this feature as anything other than a safety measure, to use in case you accidentally zap stuff. Either use it on the spot, or delete it all when you close the file. Backing this up is a waste of time and CPU cycles, and will only confuse you when you actually need to restore your backups. Instead, focus on the following…
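If your backup program has no exclude list, the same rule can be sketched in a few lines of Python. This is only an illustration, not part of Endnote or of any particular backup tool: the function name `iter_backup_candidates` is made up, and it assumes the layout described above (a folder named `trash` directly under a `*.Data` directory).

```python
import os

def iter_backup_candidates(root):
    """Yield every file path under root, skipping Endnote trash folders.

    Assumption: the trash lives in a folder literally named 'trash'
    directly inside a '<library>.Data' directory, as described above.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath.lower().endswith(".data"):
            # Prune in place so os.walk never descends into trash
            # (and therefore never sees trash/rdb either).
            dirnames[:] = [d for d in dirnames if d.lower() != "trash"]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place is the documented way to stop `os.walk` from descending into a directory, so trash.enl and everything under the trash copy of rdb never even get looked at.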

dummy.Data\rdb: The only important stuff for real backups is what’s in the .Data\rdb sub-directory. This is all in the form of MyISAM tables, which means that for each table there are three files: *.frm, *.myd, *.myi. Yuck again. The number of tables will vary from library to library, depending on what features of Endnote you use, but there is real data (or meta-data) stored here, so you have to be careful with this stuff.

Before you decide what to back up, however, you need to know what Endnote does in case this stuff is lost. If you have dummy.enl and no dummy.Data directory, Endnote will silently create a new dummy.Data directory with all the basic files it uses. This only works if dummy.enl is not corrupted. If there are problems with dummy.enl, and you don’t have the .Data directory for it, you are probably toast. However, since I back up like mad, this is not my situation. What this silent recovery means for me is that some of these basic files don’t need to be backed up. Specifically, all of the .frm files are more or less unchanging. For some reason, one file called csort.frm gets rewritten almost every time you open a library, but this is meaningless; the contents never change, so make sure it goes on the exclude list.

For the individual tables, this means that you need, at most, to back up the .myd and .myi files, representing data and indexes for the MyISAM tables. For each of my Endnote libraries, I have a maximum of 6 tables under rdb: csort, jterms, misc, refs, refs_ext, and terms. Not ALL of these tables are essential for your library to work. Apparently csort saves information about the sort order you are using. This is pretty unimportant, yet all three MyISAM files (csort.myd, csort.myi, and csort.frm) are rewritten every time you open a library. Put this on your exclude list. jterms and terms are where Endnote keeps the info for your term lists. As long as your library is intact, these can be regenerated without any problem, so I exclude them too. refs is all the info in your library: every field for every reference. So if they use this, what’s the *.enl file for? Thoroughly redundant, but that’s Endnote’s business. I have no idea what refs_ext does, or how important it is; I’ll have to get back to you on that. The misc table is what it says: it includes petty things like the size and location of windows, and important things that represent a real investment in time, like groups and groupsets.

This makes the misc table a real pain to back up. It is rewritten every time you open a library, so the date changes for misc.myd and misc.myi. Sometimes the contents of misc.myd don’t change at all despite the rewrite, but because the misc table holds my group information, I absolutely want this backed up whenever I am working on groups.

I have one library with over 1400 references in it; the groups are an essential part of my data for these references. Unfortunately, I can’t tell when this important data has changed (must back it up) and when just the windows have changed (who cares, no backup). Bad news for backer uppers. Although the refs table is redundant, and can therefore be skipped, I like redundancy for real data; in addition, refs.myd apparently only changes when .enl changes; since I back up .enl, why not back up refs as well?

So the sticking point for the table data files is backing up misc, and doing it very, very often. You can’t rely on date stamps or size to determine when to back up.
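One partial workaround, which is my own suggestion and not something Endnote offers, is to compare file contents instead of date stamps. The sketch below records a SHA-256 digest of the file and reports a change only when the bytes differ. The function names and the JSON state file are hypothetical. Note the limit: this only filters out rewrites that leave misc.myd byte-for-byte identical; it still cannot tell a changed group from a moved window. Run it with the library closed.

```python
import hashlib
import json
import os

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def changed_since_last_backup(path, state_file):
    """Return True if the content of path differs from the recorded digest.

    state_file is a small JSON file (a made-up convention for this sketch)
    mapping file paths to the digest seen at the previous check. When a
    change is detected, the new digest is written back to it.
    """
    state = {}
    if os.path.exists(state_file):
        with open(state_file, "r", encoding="utf-8") as f:
            state = json.load(f)
    digest = file_digest(path)
    changed = state.get(path) != digest
    if changed:
        state[path] = digest
        with open(state_file, "w", encoding="utf-8") as f:
            json.dump(state, f, indent=2)
    return changed
```

A call such as `changed_since_last_backup(r"dummy.Data\rdb\misc.myd", "backup_state.json")` returns True the first time and after any real content change, and False when the file was merely rewritten with the same bytes.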

In addition to the myd data files, the myi index files change constantly. It is very tempting to ignore these changes, since most of it is meaningless MyISAM table management, but beware: if you have added rows or changed indexed fields for any of these tables, and you don’t have the index that goes with it, Endnote will refuse to open your library with the warning: “This library appears to be damaged. Please verify that no other user has this library open simultaneously with write access.” It will then demand that you use the “Recover library” function to open the library. Recover library may or may not get back all your groups. I had at least one case where it did not, and this is a bad memory.

So here is my backup regime: exclude everything in the .Data\trash directory. Back up all *.enl files, and all rdb\refs.myd files, whenever they change. If you are messing with groups, you MUST back up the 3 misc files under rdb to be sure this data is safe. If I know I have not touched my groups in a blue moon, I happily skip this, but this means I cannot automatically put them on the exclude list. csort is meaningless and is a definite exclude, and terms and jterms are both excludable. If you back up an myd file, you must back up its companion myi file, or Endnote will make you “recover” your whole library; this consists of making a copy of the library called XXX-saved in the same directory you are working in, and an associated .Data directory as well. I currently back up the refs_ext table on the same schedule as the misc table, because I don’t know what it’s for.
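The regime above can be written down as a small decision function. This is a sketch under assumptions, not a finished tool: the name `should_back_up` and the `groups_touched` flag are made up, the table names are the six from my own libraries, and I have chosen to skip every .frm file (on the grounds that Endnote regenerates them) and to keep any table I don’t recognize, just to be safe.

```python
from pathlib import PurePath

ALWAYS = {"refs"}                        # real reference data
WHEN_GROUPS_CHANGE = {"misc", "refs_ext"}  # groups live in misc; refs_ext rides along
NEVER = {"csort", "jterms", "terms"}     # sort order and regenerable term lists
DATA_AND_INDEX = {".myd", ".myi"}        # MyISAM data file and its index

def should_back_up(filename, groups_touched):
    """Decide whether one file from a library (or its rdb folder) gets backed up.

    groups_touched: True if groups were edited since the last backup.
    Trash folders are assumed to be excluded before this is called.
    """
    p = PurePath(filename)
    table, ext = p.stem.lower(), p.suffix.lower()
    if ext == ".enl":
        return True
    if ext not in DATA_AND_INDEX:
        return False  # e.g. .frm: assumed to be regenerated by Endnote
    if table in NEVER:
        return False
    if table in WHEN_GROUPS_CHANGE:
        return groups_touched
    # 'refs', plus any unknown table (kept as a conservative assumption)
    return True
```

Because the decision depends only on the table name, a .myd file and its .myi companion always get the same answer, which is exactly the pairing that keeps Endnote from demanding a recovery.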

The long and short: exclude the trash directory, and jterms, terms, and csort in the rdb directory, to cut down on your backup space and time. .enl, refs.myd, and refs.myi are must-haves. The files misc.myd and misc.myi should also be saved frequently when you are working on groups; otherwise, they are very skippable. All the *.frm files you need can be generated by opening the .enl file after renaming its .Data directory.

So, you have saved a little bit of time on your backups and a little space as well. The real reason Endnote is a pain to back up is the programmer’s failure to properly factor the table data: groups, which are an essential part of libraries, should have their own table, sorting options should not. But as big Tony often said, “Whaddaya gonna do?”
