Tuesday, January 17

More Bandit Graphs

Without much explanation (yet), here are some ngrams-style topic graphs from my second topic analysis of the Qing Shilu. This time I cleaned up the data a lot better - I got rid of very short entries that were giving strange results, and got rid of chapter headers - and got better results. This time round, I tried out 40, 50 and 60 topics. 50 gave the best results; I started getting lots of garbage at 60, so I didn't try more topics.
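For anyone who wants to try the same kind of cleanup, here is a rough sketch of what that filtering step could look like. It is not my actual script: the length cutoff (`MIN_CHARS`) and the chapter-header pattern (`HEADER_RE`, matching lines like 卷之一百二) are assumptions for illustration, and the right values depend on how the text was digitized.

```python
import re

# Hypothetical values: neither the cutoff nor the header format is
# taken from the actual cleanup, they are placeholders for illustration.
MIN_CHARS = 20
HEADER_RE = re.compile(r"^卷之?[一二三四五六七八九十百千〇零]+$")

def clean_entries(entries, min_chars=MIN_CHARS, header_re=HEADER_RE):
    """Drop chapter headers and very short entries before topic modeling."""
    kept = []
    for text in entries:
        text = text.strip()
        if header_re.match(text):
            continue  # chapter header, not a real entry
        if len(text) < min_chars:
            continue  # too short to give the model anything useful
        kept.append(text)
    return kept
```

The headers are checked first so that they are dropped regardless of length; everything else is kept only if it reaches the cutoff.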
First off, the topic model (as before) does a nice job separating rebellions (4 topics this time) from regular law and order issues (4 topics). Here is the graph of the two composites:

Note that, again, law and order (solid line) is fairly consistent - lots of variation (some suppressed by the 12 month moving average), but at a low drone. On the other hand, rebellion (dashed line) is basically zero (plus a bit of noise related to the statistical nature of the model), except during major events - some minor rebellions in the late 18th and early 19th centuries, and the well-known crises of the mid-to-late 19th century.
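To make the composites and the smoothing concrete, here is a minimal sketch of the two steps, assuming each topic has already been reduced to a list of monthly weights. The function names and the dictionary layout are hypothetical, and I am assuming a trailing window here; a centered window would shift the peaks slightly.

```python
def composite(series_by_topic, topic_ids):
    """Sum the monthly weights of several topics into one composite series."""
    length = len(series_by_topic[topic_ids[0]])
    return [sum(series_by_topic[t][i] for t in topic_ids) for i in range(length)]

def moving_average(values, window=12):
    """Trailing moving average over `window` months.

    The first window-1 points average only the months available so far.
    """
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the month leaving the window
        out.append(running / min(i + 1, window))
    return out
```

This also shows why the smoothing suppresses variation: a one-month spike gets spread across the following twelve months at one twelfth of its height, so short bursts flatten out while sustained events like the major rebellions still stand out.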
In fact, this cleaned-up model "wasted" fewer topics on noisy data and on the preponderance of minor imperial rituals in the very short Shilu entries. That left it more topics for usefully differentiating within the ideas of "rebellion" and "crime":

Note that the Taiping Rebellion has essentially received its own topic (dotted line) that is basically zero except during the rebellion, and the rest of the rebellion entries have been split between three topics. Two are shown here - a general rebels topic (dashed line) that spikes with every rebellion, and a 19th century rebels topic that only seems to pick up rebellions starting in the 1790s. I haven't figured out what this second one is describing yet.

The third is a topic for northwestern rebellion that spikes at the expected time. Check out how the topic model differentiates between the Western Campaigns during Yongzheng and Qianlong (dotted line, not included in the composites above) and the Northwestern rebellions in the late 18th century (solid line), despite the fact that similar terms are used in both cases (bing 兵, zei 贼, jiao 剿, fan 番, various characters used to transliterate foreign words, etc.).


Finally, there are four topics that divvy up the general law-and-order issues. I have yet to figure out what all of these mean, but it is clear that there is one dominated by Qianlong's neuroses (how Kuhnian), a policing topic (all about capturing criminals) that is fairly consistent, a trials topic that seems to generally increase over the course of the dynasty, and one that tends to decrease:


Anyway, this is preliminary. The results are clearly better - less noisy, more clearly divided - than what my first model produced. But I haven't had a chance to read dozens of memorials yet to verify my suspicions. Please comment - on the results, analysis and visuals. I'm playing with black and white because I suspect it will do better in print. On the screen, which is clearer, that or color?


