New FLLD list on line

My old list of foreign language and literature programs in Taiwan and Hong Kong linked on the right side of this blog ceased to be useful about 15 years ago. I didn’t realize quite how out of date it was until I started to revise it a couple of weeks ago.

In the 15 years since I compiled the list, whole categories of schools have ceased to exist, as have categories of departments, programs, and specializations. An example: there is no longer even one teachers college (師範學院) in Taiwan. They have either ascended to the ranks of normal universities (師範大學), or been absorbed, amoeba-like, into other schools.

Despite these extinctions and transmogrifications, however, Taiwan’s tertiary level foreign language programs are still flourishing in extremely various ways. With the help of my assistant Ignatius Liu, I’ve put together a new page of links listing over 130 departments and programs. The list is still not complete, however, and additions will appear as I find them.

Use the side link for the new version, or click here and give it a try. If you click on a link that doesn’t respond, try again later; even some rather large schools can have poor network connections, it seems (hrmph, hrmph). If you get a page not found error, though, let me know. I hope the page format looks more professional than before. There is a real English version as well, the new default view.

Next project: Chinese language and literature departments!

Posted in school | Comments Off on New FLLD list on line

JFKRA Release 6: The two missing files

NARA’s announcement of JFKRA release 6 stated that 3,539 files were posted on NARA’s website. As I noted in an earlier post (here), I was only able to find 3537, leaving two unaccounted for. This is the kind of thing that drives me crazy, so I’ve been looking for them for almost a month now.

Although I cannot definitively answer this question, I now have a suggestion about the discrepancy. In reviewing my figuring for my post on NARA’s file replacement (here), I noticed that there are two duplicate listings in release 6. Using the latest NARA spreadsheet of JFKRA releases as a reference, row 2356 lists document #124-10204-10000, posted as docid-32585200.pdf. Row 3424 then lists the same document, 124-10204-10000, posted as the same file, docid-32585200.pdf. This happens again at rows 3849 and 3944, which list document #157-10002-10002, posted as docid-32281842.pdf, twice. Perhaps this is the reason that NARA thought that it posted 3539 files, but actually posted only 3537.
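A check of this kind can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the post: the tuple layout (row number, RIF number, file name) is an assumption about how the spreadsheet might be loaded, and the first row is invented filler. The four duplicate rows are the ones cited above.

```python
from collections import defaultdict

# Assumed layout: (spreadsheet row number, RIF number, posted file name).
rows = [
    (100, "000-00000-00000", "docid-00000000.pdf"),  # invented row, for contrast
    (2356, "124-10204-10000", "docid-32585200.pdf"),
    (3424, "124-10204-10000", "docid-32585200.pdf"),
    (3849, "157-10002-10002", "docid-32281842.pdf"),
    (3944, "157-10002-10002", "docid-32281842.pdf"),
]


def find_duplicate_listings(rows):
    """Return {(rif, filename): [row numbers]} for pairs listed more than once."""
    seen = defaultdict(list)
    for row_no, rif, filename in rows:
        seen[(rif, filename)].append(row_no)
    return {key: nums for key, nums in seen.items() if len(nums) > 1}


duplicates = find_duplicate_listings(rows)
for (rif, filename), nums in duplicates.items():
    print(rif, filename, "listed at rows", nums)

# Each duplicated pair inflates a row-based file count by (listings - 1).
overcount = sum(len(nums) - 1 for nums in duplicates.values())
print("overcount:", overcount)
```

With two pairs each listed twice, the overcount comes to 2, which matches the gap between 3539 and 3537.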

Another possibility is that, in both these cases, NARA actually posted two different files with the same name. The effect of this would of course be that the second file overwrote the first file. This is what happened with the ‘replacement files’ I discussed earlier. In those cases, I was able to see that there were originally two different files because I had downloaded the earlier version of the file before it was overwritten (replaced) by the later version. In the case of 124-10204-10000 and 157-10002-10002, however, the first version of the file would have been overwritten a few minutes or seconds later, when the second version was uploaded on top of it. I doubt that anyone could have been lucky enough to download the first version in such a short time, so the only ones who would have seen these theoretical different versions of the two documents are the folks at NARA.

Going back to an earlier issue, I also noted a while ago that there is an earlier set of 9 files that have the same problem: These are files listed in the spreadsheet for release 1, then again, with the same record number, and the same file name, in the spreadsheet for release 3. I did not discuss these in my post on replacement files because I do not have copies of the first (July) release versions of these files; I downloaded only a part of these files at that time, and therefore missed that opportunity. These release 1 and release 3 duplicate listings have the same possibilities as the duplicate listings in the release 6 spreadsheet: 1) it could just be a typo; 2) there could have been two versions of these files as well, and the earlier version was overwritten by the later version, as happened to the replacement files.

Although I can’t say what the case was for any of these “duplicate listings”, I’ve put up a list of them here, and I’ll try a letter to NARA after I’ve gone through the remaining issues in release 6.

Postscript

I reviewed the zip files I downloaded immediately after release 1 in July, and found 5 of the 9 pdf files that were posted again in release 3 in November. The release 1 version and the release 3 version are in all cases byte for byte identical. This negates any suspicion of NARA ‘replacing’ an earlier version of these July files with a later, different version in November.
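For anyone who wants to repeat a byte for byte comparison, here is a minimal Python sketch. It is not necessarily how the comparison above was done; the helper names are hypothetical, and the temporary files with invented content stand in for the real July and November pdfs.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True only if both files have identical size and identical content hash."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return sha256_of(path_a) == sha256_of(path_b)


# Stand-in files (invented content) in place of the real release 1 and 3 pdfs.
with tempfile.TemporaryDirectory() as tmp:
    july = os.path.join(tmp, "release1.pdf")
    november = os.path.join(tmp, "release3.pdf")
    altered = os.path.join(tmp, "altered.pdf")
    for path, data in [(july, b"%PDF sample"), (november, b"%PDF sample"),
                       (altered, b"%PDF sampld")]:
        with open(path, "wb") as fh:
            fh.write(data)
    identical = same_bytes(july, november)
    changed = not same_bytes(july, altered)

print(identical, changed)
```

Keeping the hashes of every downloaded file also makes it possible to notice a silent replacement later, without storing a second copy of each pdf.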

Posted in History, JFK ARCA | Comments Off on JFKRA Release 6: The two missing files

JFKRA Release 6: Replacing files at NARA

This post continues a discussion of the sixth release of JFKRA records from NARA. This time I will look at a quirk of release 6 that I call replacement files.

As I noted a while ago (here and here), in releases 1-5 there were a number of records listed twice in NARA’s spreadsheet of documents posted on line.

These came in two varieties. In one set there were actually two files posted at NARA: one from release ‘A’ and one from release ‘B’, both files presenting the same document, but with various differences between them (see the earlier posts for a discussion).

In the second set, the same record was listed twice, but each listing referred to the same file, so there was only one file posted. I had thought this was simply an error on the part of the spreadsheet editor(s), but release 6 now has me wondering.

In fact, release 6 has 45 instances of this type of duplication, as listed in this link.

For each of these instances, I had already downloaded the file prior to the posting of the release 6 files. When release 6 was posted, however, I discovered that these 45 files had changed, and were now different from the files I had downloaded earlier. The earlier versions of these 45 files are no longer available on the NARA site.

How to describe this situation? Let’s just say that NARA replaced the older versions with newer versions. Since I had downloaded the earlier versions, I was able to compare them with the newer versions that replaced them. This post summarizes what I found.

Type 1: markings changed, text unchanged

In some cases, the new version is actually a new scan; in other cases, it seems to be the same scan, but is marked differently. This sort of thing is, well, unfortunate. I am sure that no one intended it would be necessary to engage in the study of what bibliographers call ‘accidentals’ (physical variations in a printed work, as opposed to different wording) when reading the JFKRA releases, but there you are.

One way to distinguish these different versions is to look at the ‘case #’ of the file. The case # can appear in two or three places. Some files have a stamp with the case number on it (usually preceded by boilerplate text reading something like ‘Released under the John F. Kennedy Assassination Records Collection Act of 1992’, etc.). In other cases there is no stamp, and the case # may be added in the header or footer of the file.

In 38 of the 45 files that were replaced, these case #’s either changed, or the earlier version had NO case # and the release 6 version has added a case #, so this is a convenient way to distinguish them. These differences are all indicated in the table linked to above. In one case, the case # was the same, but one file had a stamp and the other didn’t. Other than this kind of ‘accidental’, these files are the same. Why then replace them? Ask NARA.

Type 2: markings unchanged, text changed

For 6 of the remaining 7 files the case # did not change. In these files, however, the text changed, with redactions in the earlier versions removed in the later versions. Since the point of this whole exercise is to release more complete versions of the documents, I don’t think there will be too many complaints about this.

The problem with this method of silently replacing early versions of files with later versions is that some people may worry that NARA could get cold feet, and replace an unredacted version of a document with a redacted version.

I am reasonably sure that NARA has not gotten cold feet, but there is one case where this actually happened: RIF # 124-90035-10121 (docid-32144601.pdf). The current version has a redaction that was not there in the earlier version which I downloaded on November 19.

What did the earlier unredacted version say? Send me 5 dollars US by PayPal and I’ll tell you. (Kidding, just kidding.)

The redacted version reads:

Enclosed herewith for the Bureau are two copies of cover page B, pages 1 and 5 and for PG three copies of pages 1 and 5 and two copies of cover page B. The amendments were made necessary by the fact that on 1/12/60 [three lines blanked out]

The redaction is footnoted “JFK Law 11(a)” which is the exemption for IRS records. However, the earlier version shows that the words deleted were as follows:

PLATO CACHERIS turned over to SA PENNYPACKER additional checks of ESCO which included salary checks of WEINHEIMER and checks to other persons on behalf of WEINHEIMER which are pertinent to the report and Agent’s work papers.

This has nothing to do with information from the IRS; it was information given to SA Pennypacker (the author of both the report and this memo) by Plato Cacheris. I find this citation of JFK 11(a) quite dubious, and I’ll complain about it in a minute.

First, though, what is this obscure item talking about anyway? A sort of summary of the story is here. Edward Weinheimer, who apparently “fixed” union problems for people, was charged with perjury for claiming he wasn’t paid to fix problems for a company called ESCO, when he really was. Plato Cacheris was a DoJ attorney who was working the case.

Why this text was removed, retroactively, is very hard to understand. Citing JFK 11(a) is just not reasonable. Other JFKRA documents on this case were released by NARA here and here. Why weren’t they redacted as well? I don’t understand.

Regardless, however, I await my gold citizen’s badge for exposing this coverup. What? What’s that? What does Weinheimer, or ESCO, or Pennypacker, or Cacheris have to do with the assassination of President Kennedy? What a question! Even a simpleton such as me knows the answer to that! And if you send me 10 dollars US by PayPal, I’ll tell you.

Posted in History, JFK ARCA | Comments Off on JFKRA Release 6: Replacing files at NARA

JFK Records Act Releases: Errata

I had hoped to have another post on JFKRA release 6 from NARA, but there are some problems with the release metadata. For one thing, there are some typos in the RIF numbers that have made it hard to figure out what some of the documents released are. I have a tentative list of RIF number errata here. There are also a lot of duplicates in release 6. I do not understand where these are coming from. Perhaps I should write to NARA. Anyway, I haven’t given up yet; more dull posts are coming.

Posted in History, JFK ARCA | Comments Off on JFK Records Act Releases: Errata

Wisconsin by air

A video from number one son Nick, who got his drone pilot license earlier this year. Credit for his hand and eye coordination, and sense of picture composition, goes to his mom, not me.

Posted in quotidiana | Comments Off on Wisconsin by air

JFK Records Act Releases: 12-15 update

The National Archives and Records Administration (NARA) has released a sixth set of records under the JFK Records Act. The release, which took place on December 15, was announced here. According to the announcement,

At this point, with the exception of 86 record identification numbers where additional research is required by the National Archives and the other agencies, all documents subject to section 5 of the JFK Act have been released either in full or in part

This may therefore be the end of NARA posting JFKRA records. Or it may not. Many of the records released this year have been redacted. Whether to lift these redactions is not yet decided, but will be decided by April 16, 2018. My hope is that where redactions are lifted, NARA will re-post these files, with the original text replacing the redactions. Anyone who is taking a vote on this, please note that my hand is raised in favor of re-posting.

Going back to the current release, #6 is rather complicated if you want to count what’s now available. NARA provides a spreadsheet of releases with lots of important metadata (here), but you still have to do some number crunching to get the correct figures. The remainder of this post will explain how I crunched. A follow-up post will raise some questions that I still have.

Spreadsheets, filenames, and RIF numbers

After each release under the JFKRA, NARA has posted a spreadsheet that gives important information about the records that are being released. The records are also posted online as either audio (.wav) files or pdf files. These pdf files are generally scans of original documents, usually produced by government agencies. Which records should be released was determined by the Assassination Records Review Board, an independent government agency established under the JFK Records Act (JFKRA). This law was intended to open up all U.S. government records relating to the assassination of President Kennedy to the maximum extent possible, and the ARRB adopted a VERY broad definition of what counts as relevant to the assassination.

NARA’s spreadsheet of records in the releases is cumulative. That is, there is not a separate spreadsheet for each release. Instead, there is only one spreadsheet which is updated for each release, with information on the most recently released records at the top of the sheet. The cumulative spreadsheet for the December 15 release has 35,557 rows, but this does not mean there are 35,557 files posted at NARA; to understand why, one needs to crunch numbers.

The two key bits of information in each spreadsheet row are the name of the file posted at NARA, and a RIF number. RIF (Record Identification Form) is a form developed by NARA and ARRB for use in a database of JFK records. The JFKRA mandated the creation of this database. An RIF includes a bunch of information, but the key component is the RIF number. This is a unique number which identifies every individual record in the database.

The relation between a document and a “record” in NARA’s database is not simple. There are cases where the same document may have more than one RIF number. For example, if the ARRB acquired the same document from two different sources, say a document from the FBI investigation of the JFK assassination, and then a copy of that document from one of the government bodies that investigated the FBI’s investigation of the JFK assassination, these are usually treated as two different records. Another example would be where single components of larger documents are assigned one RIF number, and the larger document they came from is assigned another RIF number.

Another complication occurs in the NARA release spreadsheet. The simplest situation for the spreadsheet data would be if there was a one to one correspondence between the name of the file posted at NARA for each release, and an RIF number from the JFK database. In other words, each file represents a record in the database. This is not the case, however. Instead, a single filename may be listed multiple times in the spreadsheet, each time with a different RIF number. I discussed a number of these cases in earlier posts. A single RIF number may also be linked to more than one file. I also found several cases of this which I discussed in earlier posts. The cumulative spreadsheet for the 6th release has more of both of these. Hence the need for additional number crunching.

Release 6 counts

The revised spreadsheet for release 6 has an additional 4238 rows of record metadata compared to the spreadsheet for release 5. According to the NARA announcement, however, there are 3539 “documents” posted on NARA’s website. This difference arises in part because in the new rows, there are many cases where a single filename occurs in multiple rows. Each row in the spreadsheet has a different RIF number, so this means one file is listed under different RIF numbers.

This, I believe, is what NARA means when it says

Within records released on December 15th, there are instances where multiple record identification numbers are associated with the same pdf. This is due to the fact that the files were scanned in batches.

I don’t understand what “files were scanned in batches” means or why it would cause this situation. But in any case, this explains, in part, the discrepancy between the number of pdf files posted at NARA and the number of spreadsheet rows. In one case, a single file posted at NARA is listed in 25 different rows. Unfortunately, most of the metadata is missing for these rows, so it is quite hard to say why one file should be listed so many times.

Multiple RIF numbers for one file do not completely resolve the discrepancy between the number of spreadsheet rows and the number of files posted at NARA, however. The second reason for this discrepancy is that there are 12 rows in the spreadsheet that list multiple files. This means that there are multiple files with one RIF number.

In the previous cases where I found this, the different files keyed to the same RIF number occurred in different releases, and it seemed that the later versions of the files either had extra material added or redactions removed. But in the case of the files in release 6 where multiple files are associated with one RIF, I have not yet figured out what is going on. As an example of this, there is one case where a single row with one RIF number lists four different files (RIF 124-10183-10291). When one adds up all these multiple files listed under one RIF, one gets another 15 files.

Summary

Summing up, there are 3510 files listed in the spreadsheet for release #6 which are associated with 4226 RIF numbers. There are also 27 files listed under 12 RIF numbers. 4226+12 = 4238 and this is the total number of new rows in the release 6 spreadsheet (new as compared to the spreadsheet for release 5). The total number of files in release 6 should therefore be 3510+27 = 3537. NARA, however, says there are 3539. I have not yet figured out where the missing 2 files are.
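The bookkeeping in this summary can be sketched in Python as follows. The function and the toy rows are hypothetical, and they assume each spreadsheet row can be loaded as a RIF number plus a list of file names; the actual spreadsheet layout may differ. The final lines simply redo the arithmetic with the figures reported above.

```python
def count_rows_and_files(rows):
    """rows: list of (rif_number, [file names]) tuples, one per spreadsheet row.

    Returns (number of rows, number of distinct file names).
    """
    distinct_files = set()
    for _rif, filenames in rows:
        distinct_files.update(filenames)
    return len(rows), len(distinct_files)


# Invented toy rows: one file shared by two RIF numbers, one RIF with two files.
toy_rows = [
    ("RIF-A", ["file1.pdf"]),
    ("RIF-B", ["file1.pdf"]),
    ("RIF-C", ["file2.pdf", "file3.pdf"]),
]
toy_counts = count_rows_and_files(toy_rows)
print(toy_counts)

# The same bookkeeping with the figures reported in the summary above.
new_rows = 4226 + 12       # rows for single-file listings + rows listing several files
total_files = 3510 + 27    # files behind those two groups of rows
shortfall = 3539 - total_files  # NARA's announced count minus the count here
print(new_rows, total_files, shortfall)
```

The toy rows show why the number of rows and the number of files need not agree in either direction: shared files push the row count up, and multi-file rows push the file count up.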

Posted in History, JFK ARCA | Comments Off on JFK Records Act Releases: 12-15 update

ARRB Electronic Records at NARA

In all the news coverage about the release of documents under the JFK Records Act, one release has attracted almost no attention. As the NARA press release on the October 26 release noted (here), the electronic records of the Assassination Records Review Board, the agency which determined what government records would be released under the JFKRA, were also released.

The ARRB records are themselves a noteworthy release. Previously few records from the ARRB were available on line, except for their Final Report, and transcripts of the public hearings they held from 1994 to 1998. The Mary Ferrell Foundation website, maryferrell.org, has only one set of files from an ARRB staffer (here), and a series of memos from ARRB staffer Douglas Horne (here). The newly released electronic ARRB records are therefore useful indeed for anyone who wants to know not just what the ARRB did, but how and why they did it. The rest of this post provides a summary of the content of the files, as I did earlier for the JFKRA releases (more boring numbers, little on file content).

Number of Files

The ARRB electronic records were released in 83 files. These consist of 13 email archives in csv format, and 70 zip files (all available here). This post will cover only the zip files. The follow-up post will cover the email archives.

In the October release, there are 58 zip files associated with specific ARRB staff members, and another 12 zip files which are general board files. Each zip file is divided into two main directories, Electronic Records and Technical Documentation. The TD directories all have the same components: a readme file explaining a dating discrepancy in some files, and a csv file of metadata for the original files in the ER directory. The ER directory has all the documents from each staff account, converted from their original formats into pdf files, and arranged under sub-directories such as wp-docs, excel, etc.

The pdf files were all created between March and September 2017, the majority in June and July. These of course were not the dates the original files were created, but fortunately the TD directory csv files have the ‘date last modified’ for the original files. All you have to do is match up the directory and file names in the csv file to the pdfs in the zip file.

Oddly, this is not entirely possible, even though the csv files were mostly made in August 2017, after the pdf files were created. There are several dozen files in the zip archives that do not match up with the listings in the csv files. Some of these mismatches are due to things like files in different directories, file extensions missing or misshapen, misspellings, and some cases that defy explanation, such as rec.id becoming non-rec.id. There are also pdfs in the zip archives that clearly were not registered in the csv file, and files listed in the csv data that are not present in the zip archives. The numbers are small: files from the zip archives that I have not been able to match back to the csv file lists total 31, and pdfs listed in the csv file that do not match to pdfs from the zip archives number only 49. So not a tremendous deal, but I don’t understand how these differences came about.
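The matching described here amounts to two set differences, which can be sketched in Python. Everything specific in this sketch is an assumption: the csv column name `path`, the idea that the csv holds the pdf path directly, the normalisation rules, and the invented file names in the demo. The real metadata files may need a different mapping between original names and pdf names.

```python
import csv
import io
import zipfile


def normalise(path):
    """Assumed normalisation: forward slashes, trimmed, case-insensitive."""
    return path.replace("\\", "/").strip().lower()


def compare_zip_to_csv(zip_bytes, csv_text, path_column="path"):
    """Return (in_zip_only, in_csv_only) as sorted lists of normalised paths."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        zip_paths = {normalise(name) for name in zf.namelist()
                     if name.lower().endswith(".pdf")}
    reader = csv.DictReader(io.StringIO(csv_text))
    csv_paths = {normalise(row[path_column]) for row in reader}
    return sorted(zip_paths - csv_paths), sorted(csv_paths - zip_paths)


# Demo with an in-memory zip and invented names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Electronic Records/wp-docs/MEMO1.pdf", b"")
    zf.writestr("Electronic Records/excel/budget.pdf", b"")
csv_text = (
    "path\n"
    "Electronic Records/wp-docs/memo1.pdf\n"
    "Electronic Records/wp-docs/memo2.pdf\n"
)
zip_only, csv_only = compare_zip_to_csv(buffer.getvalue(), csv_text)
print("in zip but not csv:", zip_only)
print("in csv but not zip:", csv_only)
```

Normalising case and slashes before comparing removes the trivial mismatches, so that what remains in the two lists is closer to the genuinely unexplained cases.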

In addition, a significant number of files from the zip archives are not pdfs of the original file, but instead have been replaced by forms that state the original files were “withdrawn.” These replacement forms are clearly marked by adding the string “_wd_NNNNNNNN” to the file name (wd standing for ‘withdrawn’ I presume). Instead of the original file, these pdf forms consist of a one page document explaining the reasons for the withdrawal.
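Picking out the withdrawal forms by file name can be sketched with a regular expression. This assumes that the eight N's in “_wd_NNNNNNNN” stand for digits, which the post does not confirm, and the file names in the demo are invented.

```python
import re

# Assumption: "NNNNNNNN" is an eight-digit number somewhere in the file name.
WITHDRAWN = re.compile(r"_wd_\d{8}", re.IGNORECASE)


def is_withdrawal_form(filename):
    """True if the name carries the withdrawal marker described above."""
    return WITHDRAWN.search(filename) is not None


# Invented file names for illustration.
names = [
    "memo_wd_01234567.pdf",
    "SETUP.EXE_wd_00000042.pdf",
    "budget.pdf",
    "notes_wd_123.pdf",  # too few digits, so not counted under this assumption
]
withdrawn = [name for name in names if is_withdrawal_form(name)]
print(len(withdrawn), "of", len(names), "look like withdrawal forms")
```

The file name alone does not say why a file was withdrawn; telling the two kinds of withdrawal apart would still require reading the text of each one page form.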

There are two reasons for file withdrawal: 1) The file “contains electronic characters that are unintelligible and therefore cannot be authoritatively reviewed according to the John F. Kennedy Records Collection Act of 1992.” These files are almost all binary files, with extensions such as .bmp, .dll, .exe, etc. and are thus not worth looking at anyway. 2) The file “has been withdrawn according to (one of the exemptions of) the John F. Kennedy Records Collection Act of 1992.” The form then lists which of the five exemption categories the document falls into (some of them fall into multiple categories). The binary junk files I will call WDI files, the exempt files I will call WDE files. The totals for the different types of files in the zip files are as follows:

File type                                No. files
Non pdf files                                  142
pdf files (not withdrawn)                    14167
withdrawn pdf files (unintelligible)          2063
withdrawn pdf files (exempt)                   396
Total                                        16768

The non-pdf files are all from the Technical Documentation directories except for two miscellaneous items. The NARA press release states that the ARRB files consisted of “16,627 files from the ARRB drives.” This must refer to the total number of pdf files (and exclude the 140 csv and readme.txt files in the Technical Documentation directories). My count is one more than theirs; who knows what the extra file is (perhaps the novelty mpg file staffer Carrie Fletcher had on her drive).

The withdrawal of files must have been done by NARA, though why they did it according to the JFKRA rules is not clear to me (the JFKRA does not seem to say that it applies to ARRB records), and why NARA is entitled to carry out such a withdrawal is also unclear. Unlike documents released under the JFKRA, the ARRB withdrawal sheets state that one may apply for the release of the withheld materials under FOIA provisions.

The zip with by far the largest number of files is s-adm-g_pub-20171001.zip, which is general “administrative” matters and has 5838 files. After this jumbo zip, the top three zips are from ARRB staff members T. Jeremy Gunn, Laura Denk, and Tracy Shycoff, each of whom has a zip file with from 1000 to 800 plus files. As we should expect, the zips with the smallest number of files are from the interns, and the Board members, who have only two files apiece, except for Board Chairman John Tunheim, who has nine.

I have put up a list of the 58 ARRB personnel with zip files here.

Pages in files

The number of pages in each file ranges from a 2982 page monster (S-ADM-G/PRG/FBIFIX.TXT.pdf; this seems to be a database table dump), to one page scraps. Here is a table of number of pages per file (this does not include non-pdf files or the WDI and WDE files):

Page range    No. of Docs
> 100                  78
51 – 100              188
31 – 50               168
21 – 30               141
16 – 20               177
11 – 15               297
6 – 10                751
4 – 5                1076
3                    1312
2                    3978
1                    6002

Based on this count, the total number of pages released from ARRB files is 79962.

Dates of files

After matching up the pdfs in the zip files with the files in the csv lists, I can also summarize when the files were last modified:

Year      Number of documents
< 1994                     63
1994                      235
1995                     3769
1996                     3609
1997                     3648
1998                     2841

Because the csv lists include the date last modified for the withdrawn files, I could have put these into the summary, but I have chosen to omit withdrawn files from this count, for consistency with the counts above. I have also included in the pre-1994 count unmatched files from the csv lists.

Other than withdrawn files and unmatched files, the pre-1994 records are all text files from various software applications, such as database programs or printer drivers. None of these materials were created for or by the ARRB.

The 1994 files also include a number of such files. The earliest files that clearly were created by ARRB staffers are from September 1994 (by then-executive director David Marwell), so that the actual figure for 1994 is more like 189 files.

Most of the staff accounts were set up in January 1995, as dated by the creation of an EXCEL subdirectory with a text file at this time.

The ARRB closed on September 30, 1998, and the last files in this release are some thank you notes by the ARRB’s final executive director Laura Denk, dated September 26.

(There are also three files dated from 2015 whose date stamps I cannot explain. Two of these are excerpts from a 1998 article by David Mantik, the other what looks like a version of a 1995 staff memo.)

Withdrawal of files

As the withdrawal notifications provided in the ARRB files indicate, files were reviewed for withdrawal from December 2016 to July 2017, with most of the reviews done in June and July 2017.

There were two reasons for withdrawal of files, as noted above: files had unintelligible (non-ASCII) characters that made review impossible (WDI), or they fell under one of the ‘five exemptions’ of the JFKRA (WDE). In fact, as the withdrawal sheets show, there is one more exemption, 11(A), which exempts tax returns from the JFKRA. This exemption was invoked in four records.

Exemption 6(5), which exempts records revealing Secret Service protection measures, is in fact never invoked at all for the ARRB file withdrawals. The most commonly invoked exemption was 6(3), which was invoked 293 times. This exemption covers records whose release would constitute an unwarranted invasion of privacy. Judging from the titles of these records, many were documents dealing with ARRB budget information, and probably revealed Board and staff salaries.

Final notes

There are some quite interesting memos in this release, and overall one gets a good idea of the ARRB’s sometimes contradictory goals and attitudes. My own interest at this point is in comparing what ARRB hoped to release, and what has come out so far. This will be the focus of later posts.

Posted in History | Comments Off on ARRB Electronic Records at NARA

Review of Reporting the Chinese Revolution: The Letters of Rayna Prohme

Letters from a fading past

This book consists of a narrative by two editors, Baruch Hirson and Arthur Knodel, written around a few dozen letters by Rayna Prohme. Rayna was an American who, together with her husband Bill Prohme, ran the People’s Tribune, the English language newspaper of the Nationalist Party (the KMT) in China from 1926-27. You have never heard of her unless you read Vincent Sheean’s 1934 autobiography, Personal History. Personal History was a best seller, one of those books that convinced people that the most adventurous thing to do with your life was become a reporter. Rayna is a central figure in Sheean’s book, where she is portrayed as the very spirit of Revolution, an event which “Jimmy” Sheean thought was just around the corner.

Apparently quite a few men who read the book decided that the most romantic thing a reporter could do was fall in love with this radical spirit, and went around for years searching for someone like her. The book did not have that effect on me, but it certainly made me wonder what she was really like and how she had got into a very unusual situation. This book (partly) answers both questions, giving a moving description of Rayna and her husband Bill through letters she wrote from 1926 to 1927. The letters are to her sister, Grace Simons, her friend in Berkeley, Helen Freeland, and her husband Bill. They end a few days before she died in Moscow, and are supplemented by a few more letters written by Jimmy Sheean and Bill Prohme, describing the aftermath of her death from some type of meningitis or encephalitis.

In the early letters, Rayna is excited to be in Canton and Hankow, working for a cause with people like Eugene Chen, foreign minister of the KMT regime in Hankow, Michael Borodin, chief Russian advisor to the KMT, and Soong Ching-ling, the widow of Sun Yat-sen. In the end though, the KMT expelled both the Chinese Communists and their former Russian advisors, for reasons sketchily explained in the preface to the book. Rayna strongly identified with the Communists (it is not clear whether she was a party member), so she and Bill quit the paper (or were fired) and returned to Shanghai.

Mrs. Sun chose to go to Moscow rather than remain in China, perhaps to express her rejection of the KMT’s change of direction (or perhaps not). For reasons quite unclear, Rayna was invited to accompany Mrs. Sun, and Bill was asked to stay in Shanghai. Mrs. Sun and her fellow travelers arrived in Moscow in early September, at the climax of the struggle between Stalin and Trotsky. The failure of the Russian efforts in China played an important part in the struggle, and as an inconvenient witness, Rayna was very unwelcome. She was still looking for a regular job and a place to stay, when she became ill and died suddenly on Nov. 23, 1927. The last few letters she wrote to Bill, which unknowingly describe the onset and progress of the disease that killed her, are truly heartbreaking.

That these letters survived after all those decades is simply eerie. When Helen Freeland died in 1956, Helen’s sister Nancy gave the letters to Marian Parry, a friend of Rayna from Berkeley in the 1920s. Marian (who died in 1986) then gave the letters to Knodel, who had been a fan of Sheean’s book since the 1940s. Knodel began preparing a manuscript based on the letters, and apparently had a typescript by the early 1980s (according to the C. Frank Glass papers in the Hoover Library), but eventually he must have put it aside.

The letters to Bill Prohme had an especially turbulent passage. Bill was tubercular, and after some very difficult times he killed himself in 1935, on the anniversary of Rayna’s death. He destroyed all his papers except for the letters Rayna sent him from Moscow, which he gave to Rayna’s sister Grace. The letters were found in Grace’s papers after she died in 1985, and finally passed into the hands of Baruch Hirson, who was interested in writing a biography of Grace’s husband, C. Frank Glass. Hirson was unaware that there were more letters from Rayna until two years later, when he was shown Knodel’s manuscript of Rayna’s letters to Helen. The two then collaborated on this book, but it is hard to say when. In any case, their collective work must have lain in yet another box for many years: Hirson died in 1999 and Knodel in 2001. How it was rescued from this final oblivion is also hard to say; Gregor Benton, who wrote the introduction, does not explain. Perhaps a letter to Benton might be in order before he too passes away! Truly a haunted book.

For those interested in the period, this book is fascinating. If you have read Andre Malraux’s book Man’s Fate, read this to find out about real radicals in China in the 1920s. If you have read Sheean’s book, read this to find out what kind of person Rayna really was. Read this even if you haven’t read Sheean. Despite an extremely difficult situation, she comes across as a talented, resilient, and loving woman.

Posted in Book Reviews | Comments Off on Review of Reporting the Chinese Revolution: The Letters of Rayna Prohme

An index to Hsiao-hsueh k’ao

Today I’m posting another recycled paper, an index to a book called 小學考 (Hsiao-hsueh k’ao, pinyin Xiaoxue kao). The link to download the index is here.

I compiled this index in June 1991, when I was still a graduate student. As the preface indicates, I thought I was going to do a revised version later that year, but here it is 26 years later and the revision never happened. Off to the Internet bitbucket with you! The PDF is scanned from the original printout. The files and database I used are either lost or in WordPerfect 4.2 format with Chinese added, which cannot be easily reconstituted into any readable format. Apparently OCR doesn’t work very well on these older Chinese fonts, so I have just posted the scan without doing anything more. The author of the Hsiao-hsueh k’ao, Hsieh Ch’i-k’un (謝啟昆, pinyin Xie Qikun), has a short biography in Hummel’s Eminent Chinese of the Ch’ing Period, but it is not under his name, and thus not easy to find. If I have the time, I may do another post on Hsieh and his book.

Posted in Linguistics, Recycled papers | Comments Off on An index to Hsiao-hsueh k’ao

How to select/display all rows with duplicate values in MySQL

(This should be a fub, but the description is too long.)

One of the most frequent uses of database tables is to identify duplicate values in your data. Assume you have a table with a list of names, and you want to know how many of the names appear in the table more than one time. Most books on MySQL will tell you how to do this. Here is a recipe from Paul Dubois (MySQL Cookbook 3rd ed., O’Reilly p. 556):
SELECT COUNT(*), last_name, first_name
  FROM catalog_list
  GROUP BY last_name, first_name
  HAVING COUNT(*) > 1;

Suppose, however, that you want more information about the duplicates, for example their birthdates, which are stored in another field of each row. Just adding the field birth_date to this SQL command will show you the birthdate for only one row in each set of duplicates (and recent MySQL versions, which enable ONLY_FULL_GROUP_BY by default, will reject the query outright). Suppose you want to show data for all of the duplicates? Dubois actually suggests how to do this (p. 560), but his solution involves making a temporary table and joining it against the main table. Is it possible to accomplish the same task without an intermediate table? The answer is yes: use a self-join, that is, join the table against a grouped subquery of itself. I have great difficulty understanding and using self-joins, so it took me quite a while to get it. Even worse, I’m pretty sure I’ve spent considerable time figuring this out more than once. So here is the solution, for permanent reference:

SELECT t1.last_name, t1.first_name, t1.birth_date
  FROM catalog_list AS t1
    INNER JOIN (
    SELECT last_name, first_name
      FROM catalog_list
      GROUP BY last_name, first_name
      HAVING COUNT(*) > 1
    ) AS t2
  ON t1.last_name = t2.last_name AND t1.first_name = t2.first_name
ORDER BY t1.last_name, t1.first_name, t1.birth_date;
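
To check that the join really returns every duplicated row, and not just one per name, here is a small test harness. It is only a sketch under some assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, the sample names and birthdates are made up, and find_duplicate_rows is a hypothetical helper name. The query itself is plain SQL and should behave the same way in MySQL.

```python
import sqlite3

# Made-up sample data: two names appear twice, one appears once.
ROWS = [
    ("Baker", "Ann", "1970-01-15"),
    ("Baker", "Ann", "1982-06-30"),
    ("Chen", "Li", "1965-03-02"),
    ("Diaz", "Rosa", "1990-11-11"),
    ("Diaz", "Rosa", "1991-12-12"),
]

# Join the table against a grouped subquery of itself, so that every
# row whose (last_name, first_name) pair occurs more than once is kept.
DUPLICATE_ROWS_SQL = """
SELECT t1.last_name, t1.first_name, t1.birth_date
  FROM catalog_list AS t1
    INNER JOIN (
    SELECT last_name, first_name
      FROM catalog_list
      GROUP BY last_name, first_name
      HAVING COUNT(*) > 1
    ) AS t2
  ON t1.last_name = t2.last_name AND t1.first_name = t2.first_name
ORDER BY t1.last_name, t1.first_name, t1.birth_date
"""


def find_duplicate_rows(rows):
    """Load rows into a throwaway in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE catalog_list "
            "(last_name TEXT, first_name TEXT, birth_date TEXT)"
        )
        conn.executemany("INSERT INTO catalog_list VALUES (?, ?, ?)", rows)
        return conn.execute(DUPLICATE_ROWS_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in find_duplicate_rows(ROWS):
        print(row)
```

With this sample data the unique name (Chen) drops out, and both rows for each repeated name come back, each with its own birthdate.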

Posted in Programming, software | Comments Off on How to select/display all rows with duplicate values in MySQL