WeSay: A tool for community-participatory lexicography?

In the most recent volume of Language Documentation and Conservation there is an article (here) about a piece of lexicographical software called WeSay. The interesting thing about WeSay is that it is designed to be used by lay members of language communities — rather than professional linguists — to build dictionaries of their own languages. There are obvious reasons why this application would be interesting to many of us involved in language documentation, but I want to relate a personal experience that indicates why something like WeSay would definitely fill a serious gap in the range of currently available lexicographical software.

One of the major goals of the Iquito Language Documentation Project, in which I participated, was to integrate trained community members (‘community linguists’) into the day-to-day research activities of the project. One area in which we felt the community linguists could be especially productive was lexicography, especially since the tasks involved squared with the community linguists’ personal interests in the language. The question that immediately arose was how to coordinate the community linguists’ lexicographical work with the task of building the Shoebox lexical database.

Our first idea was simply to teach the community linguists how to enter data into the Shoebox database. We were already in the process of teaching the community linguists how to use PC laptops and word-processing software, so we thought that extending their training to include Shoebox would be a relatively straightforward matter. Unfortunately, this did not turn out to be the case. Shoebox can be difficult to use even for individuals with considerable computer-related experience, and for the community linguists, who were learning to use computers for the first time in their lives, the application proved far too finicky and difficult to use.

One of our team members had significant programming experience, however, and suggested that he write a front end for Shoebox that would considerably simplify the community linguists’ interactions with the database. The idea was a good one, but I had two misgivings. First, I was concerned that regardless of how foolproof the front end seemed, over the course of the nine months we were away from the community, and the community linguists were working on the dictionary independently, *something* unforeseen would happen with the front end, and bring work to a halt. Second, I was concerned that the team member who promised to maintain the front end would not stay with the project for its entire duration, and we would be left with a piece of home-grown software that we didn’t know how to modify or fix, should the need arise. We debated the issue at length, but the team leaned towards the front end idea, so we decided to try it.

At first, everything went well. The front end worked very well, and the community linguists found it easy and comfortable to use. The visiting linguists (including me) left at the end of the summer, and it was then that the problems arose. After about four months, something happened that disrupted the connection between the front end and the Shoebox database, and that was that until the team of visiting linguists returned five months later. The community linguists were smart and started entering their data into an Excel spreadsheet, so their work didn’t grind to a halt, but we had to spend a lot of time transferring the data into Shoebox. So, all in all, the front end experiment was not a great success. And, to top it off, the team member who wrote the front end didn’t return — he decided to quit linguistics and go into real estate.

From then on, the community linguists collected their data in notebooks, and every June, when the visiting linguists arrived, we spent many hours entering the data into Shoebox. Hardly a very efficient process, but the best we could manage at the time.

It should be obvious, then, why I was very excited to read about WeSay. Before I provide a brief description, let me add that apart from the LD&C article, information can also be obtained at the WeSay website (www.wesay.org), which includes a page of screenshots and Flash movies that illustrate how the program works (here).

Basically, WeSay is a much better implemented and more comprehensive version of the front end idea we came up with in the field. The user interface consists of relatively simple forms into which one enters data, and the entire paraphernalia of directories, data field codes, and the like are hidden from view. The program also provides guidance, in terms of semantic fields, to prompt the collection of lexical data, further facilitating the independent work of community members. Significantly, WeSay also provides localization tools, so that the interface can be translated into the locally appropriate language. Despite the simplicity of the interface, however, WeSay can also export in data formats used by more powerful lexicographical software. And note that WeSay is free, open-source software, and can be downloaded from the WeSay site.
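
To make the "hidden field codes" point concrete, here is a minimal sketch of what any such front end has to do: take what the user typed into plainly labeled form fields and turn it into a backslash-coded record of the kind Shoebox stores. This is not WeSay's code or its actual export format. The form labels, the `FIELD_MARKERS` table, and the `form_to_record` function are all hypothetical, and I am assuming the common MDF-style markers `\lx` (lexeme), `\ps` (part of speech), and `\ge` (English gloss). The entry itself is placeholder data.

```python
# Hypothetical mapping from plain form labels to MDF-style field markers.
# The labels are made up for illustration; the markers assume the usual
# Shoebox/MDF conventions (\lx lexeme, \ps part of speech, \ge gloss).
FIELD_MARKERS = {
    "word": "lx",
    "part_of_speech": "ps",
    "meaning": "ge",
}


def form_to_record(form):
    """Render one filled-in form as a backslash-coded record."""
    lines = []
    for label, marker in FIELD_MARKERS.items():
        value = form.get(label, "").strip()
        if value:  # skip fields the user left empty
            lines.append(f"\\{marker} {value}")
    return "\n".join(lines)


# Placeholder entry, not real lexical data.
entry = {"word": "example", "part_of_speech": "n", "meaning": "a sample entry"}
print(form_to_record(entry))
```

The person filling in the form only ever sees "word" and "meaning"; the markers, and whatever file the record ends up in, stay behind the scenes.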

In many respects, then, WeSay sounds like the answer for those who are interested in linguistic documentation projects with significant community participation. I have yet to try it out myself, but I look forward to doing so when I have the time. If any readers have had personal experience with WeSay, I’d be interested to hear about it.

3 thoughts on “WeSay: A tool for community-participatory lexicography?”

  1. Yeah, interesting piece of software. I also like the fact that the underlying database structure plays well with SIL Fieldworks Language Explorer, a truly useful and well-executed successor to Shoebox/Toolbox. I’ve been planning to do a review of that for some time.

  2. Hey Mark,

    I should have guessed that you had heard about WeSay. Do you know anyone who is actually using it?
