CWB beta version (pre-3.0), binaries for various platforms
----------------------------------------------------------

To install the beta version, copy the appropriate archive for your
operating system (Linux, Solaris, IRIX, AIX) into /usr/local (recommended
directory) or any other directory (you'll then have to adjust the path),
and un-gzip and un-tar it.

Available versions:

  cwb-2.2.b72-sparc-solaris.tar.gz    (SUN Solaris 2.6+)
  cwb-2.2.b72-i386-linux.tar.gz       (Linux/i386 2.0+)

"Unsupported" versions:

  cwb-2.2.b17-mips-irix.tar.gz        (SGI IRIX)
  cwb-2.2.b17-powerpc-aix.tar.gz      (IBM AIX)

For example, if you're working on a Linux machine, type the following:

  gunzip cwb-2.2.b72-i386-linux.tar.gz
  tar xvf cwb-2.2.b72-i386-linux.tar

(If you want to see the contents of the tar file first, use "tvf".)

This creates, amongst others, a directory bin/, which contains the
binaries for CQP, the Corpus Query Processor, and encode/makeall, which
are used for encoding corpora.

To find a corpus, CQP uses an environment variable $CORPUS_REGISTRY. This
has to point to a directory registry/ where the corpora on your system
are defined.

You'll find an example corpus in this directory, the file
"susanne.tar.gz", and the registry file for it, "susi". Here's an example
of how to proceed to query this corpus:

  gunzip susanne.tar.gz
  tar xvf susanne.tar

This will create a subdirectory susi/ in the current directory. Now
create another subdirectory registry/.

  mkdir registry

Copy the file "susi" to the subdirectory registry/ and change the path
after the string "HOME" to the current path followed by "susi". The file
will then look like this (without the "-----"):

-----
NAME "Susanne-Corpus"
ID   susi
HOME /susi
ATTRIBUTE word
ATTRIBUTE tag
ATTRIBUTE pos
ATTRIBUTE lemma
-----

Set the environment variable $CORPUS_REGISTRY to the registry
subdirectory.

  setenv CORPUS_REGISTRY "/registry"

(Alternatively, in a Bourne-type shell:
  CORPUS_REGISTRY=/registry; export CORPUS_REGISTRY )

(If that doesn't work, you can call cqp with the option "-r /registry".)
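The registry setup described above can be sketched as a small
Bourne-shell fragment. This is only an illustration, not part of the
distribution: it assumes the example corpus has been unpacked into susi/
below the current directory, and the variable name "base" is made up for
this sketch. Instead of editing the copied file by hand, it writes the
registry file with HOME set to the current path followed by "susi".

```shell
# Sketch only: "base" is a hypothetical helper variable, and the corpus
# data is assumed to live in ./susi (as created by unpacking susanne.tar).
base=$(pwd)

# Create the registry directory next to the corpus data.
mkdir -p "$base/registry"

# Write the registry file; HOME is the current path followed by "susi".
cat > "$base/registry/susi" <<EOF
NAME "Susanne-Corpus"
ID   susi
HOME $base/susi
ATTRIBUTE word
ATTRIBUTE tag
ATTRIBUTE pos
ATTRIBUTE lemma
EOF

# Make the registry visible to cqp when it is started from this shell.
CORPUS_REGISTRY="$base/registry"
export CORPUS_REGISTRY
```

Run the fragment in the shell from which you start cqp (or source it), so
that the exported variable is still set. csh/tcsh users would use setenv
for the last step instead.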
Now CQP will find the registry subdirectory, and because the path to the
actual corpus data is given in the file "susi" in the registry
subdirectory, the corpus can be queried. Here's an example of some simple
cqp commands:

  bin/cqp -e
  [no corpus]> SUSI;
  SUSI> "interesting";
     18480: the Legislature with an dilemma . Since the cons
     55388: llum renaissance . It is , however , that despite
     59525: « Why , it 's extremely . But I - - would never
     ...
  SUSI> set Last collocate leftmost [pos="V.*"] within 5 words right from match exclusive;

(For details on the syntax, please have a look at the manual,
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/UsersCorner.html,
or at the file new_features.txt.)

  SUSI> delete without collocate;
  SUSI> cat;
  SUSI> group Last target word;
  SUSI> cat Last > "/tmp/query_results";
  SUSI> exit

Calling cqp with the option "-h" will show you all the command line
switches. If you know the name of the corpus you're going to work with,
you can, for example, type "cqp -D SUSI" (the name of the corpus has to
be typed in capital letters). The "-e" switch allows for command line
editing.

If there are any questions, or something doesn't seem to work, don't
hesitate to contact us:

  Arne Fitschen    fitschen@ims.uni-stuttgart.de
  Stefan Evert     evert@ims.uni-stuttgart.de

Stuttgart, 20.12.2001, Arne Fitschen/Stefan Evert