ALLC, the Association for Literary and Linguistic Computing, aims to support the application of computing in the study of language and literature. Membership in the association is by subscription to its journal, Literary and Linguistic Computing (LLC). An annual conference is held by ALLC in collaboration with the Association for Computers and the Humanities. The ALLC site is clearly structured and includes information about the association, its members, and minutes from previous meetings; conferences, projects, events, and bursaries; publications (LLC, Humanist, Computing in the Humanities Working Papers); and links to related organisations, institutions and projects.
ALSIC is an electronic journal for language researchers and practitioners which aims to present and exchange theoretical and practical work in the areas of: didactics; applied linguistics; psycholinguistics; educational sciences; computational linguistics; and computer science. The journal is published twice a year, in June and in December. As well as providing information about ALSIC itself, the website provides access to current and previous issues, and allows authors to submit their work. Sample articles cover: images of the self; computer-assisted collective language learning; research and development in language teacher education; and building student communities in online learning.
The Association for Computational Linguistics (ACL) describes itself as "the international scientific and professional society for people working on problems involving natural language and computation". The Association has a European and a North American chapter. It publishes a quarterly journal, Computational Linguistics, as well as organising conferences and sponsoring publications and events. Membership of the ACL provides: subscription to the journal; reduced registration and discounts for most ACL-sponsored conferences and publications; and participation in special interest groups. The site offers: information on the Association; current and past conferences; resources related to computational linguistics; paper archives; a search engine; special interest groups; and more.
The ATT-Meta Project: Metaphor, Metonymy and Mental States website presents and describes a project to create a reasoning system called ATT-Meta, whose purpose is to reason about metaphorical utterances and mental states. The reasoning is rule-based and is carried out within a general-purpose framework. The system cannot take natural language input but works, instead, with logical representations of such utterances. The site contains a list of publications on the project and on the subject in general, with the articles downloadable as PDF files. In addition, there is a set of pages concerned with the project's databank and the formalisation of data within the system. This site is useful for anyone interested in the computational implementation of reasoning systems, and in metaphor and mental states in general.
The website of the Centre for English Corpus Linguistics at the Catholic University of Louvain contains useful information about corpus linguistics and in particular learner corpora. The site offers a comprehensive learner corpus bibliography and a description of the ICLE (International Corpus of Learner English) project. The site also provides annotated links to other resources, such as corpus linguistics websites, discussion lists, and concordances. The major difference between this site and other corpus linguistics sites is the emphasis on learner corpora: research, resources, and publications. The page about ICLE contains information about the project and the resulting corpus resources, as well as advice for those who wish to join the project or create a similar corpus. Similar information is provided about LINDSEI (the Louvain International Database of Spoken English Interlanguage), a corpus of spoken learner language. There is also information about LOCNESS, a corpus of native English essays used for comparative studies.
Centre for Speech Technology Research (CSTR) is the website of an interdisciplinary research centre at the University of Edinburgh engaged in the study and application of speech recognition and synthesis technologies. The Centre publishes research papers and develops software for academic and commercial users. It also offers postgraduate research facilities. The CSTR website describes the Centre's objectives and the projects it is involved with. These projects are grouped into three principal areas: speech synthesis; automatic speech recognition; and database collection and labelling. Separate pages are devoted to each project within these fields. There are also sections for miscellaneous projects that do not fit into these main areas, including: named entity extraction; acoustic-to-articulatory inversion using neural networks; and voice transformation (or voice morphing). Several pieces of software may be downloaded from the site. These include the Festival Speech Synthesis System and the Unisyn lexicon. Research papers published by members of the CSTR may be downloaded from the site in PDF or PostScript formats. There is also news of upcoming workshops and conferences.
The Child Language Data Exchange System, CHILDES, provides resources and tools for the study of conversational interactions. The tools include: a database of transcripts; programs for the computer analysis of transcripts; methods for linguistic coding; and systems for linking transcripts to digitised audio and video. The site offers a mailing list; a tool for language analysis, CLAN; a manual and online tutorial for using CLAN; teaching tips; the CHILDES database; and a language acquisition bibliography. A hardcopy of the manual and a CD-ROM are also available. Ground rules for using the CHILDES database and for contributing new data are available from the TalkBank Project.
CLUK : UK Special-interest Group for Computational Linguistics is the website of a special-interest group for UK-based computational linguists (CL) and natural language processing (NLP) researchers. CLUK aims to bring together and create a forum for CL specialists in the UK, as well as to foster interaction with neighbouring disciplines such as: linguistics; sociology; psycholinguistics; cognitive science; and data retrieval engineering. The site gives information about CLUK's aims and posts reports from the group's annual research colloquium. There is a listing of NLP researchers and their research interests, as well as a list of NLP departments in the UK. There is also a mailing list and a number of links on the site.
Computational Linguistics Conferences is an online listing of forthcoming conferences in the field of computational linguistics. In addition to listing the dates and locations of conferences, the website includes the submission and notification dates for those wishing to submit papers. Links to the conference websites are also included where available. The site also includes links to other conference listing websites.
Computational Linguistics of Slavic languages : a Hands-on Introduction is an online course geared towards the acquisition of skills useful in so-called language industries such as: computer-assisted language learning (CALL); information retrieval (IR); machine translation (MT); and other forms of natural language processing (NLP). The main components of the course focus on various fields of computational linguistics including: CALL; the Unicode Standard; digitizing; textual corpora; statistical analysis; data mining on the Web; computational lexicography; and programming in Perl. Practical tasks are set in each field. Also provided are: links to corpora for a Russian frequency dictionary; a list of the 30,000 most frequent German words from the Institut für Deutsche Sprache; and Serbian, Croatian, and Danish corpora. Tabular examples of lexical and morphological problems encountered in Slavic languages are provided. However, technical information seems to eclipse linguistic input, with homework often assuming a certain level of previous knowledge and expertise. This online course is designed for undergraduates taking a course in computational linguistics, but it could also be useful to those in need of basic computer skills used in certain linguistic applications.
Computer-aided Summarisation Tool (CAST) is the website of a project that aims to provide methods for automatic summarisation. It includes a term-based summarisation program, built using modules from CAST, and an annotated corpus for automatic summarisation, both freely available for download. The site is of use to students and researchers interested in: computing for the humanities; linguistics; or any textual analysis. The site provides several of the papers given on the subject in PDF form, together with the guidelines for the project. The Reuters Corpus of text was used with the Palinka annotation tool. This project received funding from the Arts and Humanities Research Council (AHRC) within the Research Grants scheme.
Concordance : Software for Concordancing and Text Analysis is the website of a software package for Windows which will generate: concordances; word lists; and indices from single and multiple texts. Results can be: printed; saved; or exported as text, as HTML, and as Web Concordance files for dissemination online. The Web Concordance format provides a user-friendly interface for exploring texts processed with Concordance. It allows the viewing of: the original text; a wordlist with frequencies for each entry; and hyperlinks from entries to their occurrences in the text. Concordance supports multiple languages and alphabets, and can convert from OEM to ANSI character sets, and from Unix to PC files. The software is available on a commercial basis, though an evaluation version is available for download from the website. The software has been developed by R.J.C. Watt (University of Dundee) and is used by the author for the teaching of Shelley, Coleridge, Keats, and Blake.
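The kind of output such concordancers generate can be suggested with a minimal keyword-in-context (KWIC) sketch in Python; the tokenisation, window width, and function names here are illustrative assumptions, not Concordance's actual algorithm:

```python
import re

def kwic(text, keyword, width=30):
    """Return keyword-in-context lines: each occurrence of `keyword`
    with `width` characters of context on either side."""
    lines = []
    for m in re.finditer(r'\b%s\b' % re.escape(keyword), text, re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append('%s[%s]%s' % (left.rjust(width), m.group(0), right.ljust(width)))
    return lines

def word_frequencies(text):
    """A simple frequency word list, of the kind concordancers also produce."""
    counts = {}
    for w in re.findall(r"[a-z']+", text.lower()):
        counts[w] = counts.get(w, 0) + 1
    return sorted(counts.items(), key=lambda kv: -kv[1])

sample = "The west wind swept the leaves. The wind rose; the wild west wind."
for line in kwic(sample, "wind"):
    print(line)
```

A full concordancer adds sorting on left or right context, lemmatisation, and multi-file handling; the sketch only shows the core windowing idea.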
Corpus is an online electronic journal devoted to all aspects of corpus linguistics. Published annually in October, the journal is available in print or (six months later) online. All articles are available in full text once the online version appears. Its remit includes theoretical, epistemological, and methodological aspects of corpus linguistics, concentrating particularly on heuristic processes that link empirical data and linguistic hypotheses. In addition to the full text of past issues, the website also includes: author and keyword indexes; calls for papers; links to related journals; and contact information. The website is mainly in French, although the journal publishes articles in both French and English.
Corpus Encoding Standard (CES) is an online set of encoding guidelines for corpus-based work in natural language processing, developed by the Expert Advisory Group on Language Engineering Standards (EAGLES) and designed to be optimally suited for use in language engineering research and applications. The CES is a subset of SGML (Standard Generalised Markup Language) compliant with the TEI (Text Encoding Initiative) Guidelines. The document starts with an overview of the general principles of corpus encoding, and the 'recommendations common to all documents', which include a description of SGML syntax and a discussion of issues related to character sets, including the International Phonetic Alphabet. The sections that follow describe: building the TEI header; encoding of primary data; and encoding of linguistic annotation. For primary data the CES identifies three levels of encoding, from the minimum encoding level required for CES conformance to more detailed tagging. The section on linguistic annotation includes chapters on: locators; encoding conventions for segmentation and grammatical annotation; and encoding conventions for parallel text alignment. The document ends with: a bibliography; lists of relevant standards and URLs; the CES DTD; the tag index; and the recommendations on using the CES. The page provides links to an XML version of the CES DTD and a list of projects using the CES.
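To give a flavour of the header-building the CES describes, here is a schematic, TEI-style header fragment; the element names follow the TEI Guidelines on which the CES builds, but the content and level of detail are illustrative only, not a CES-conformant document:

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Sample corpus text (illustrative)</title>
    </titleStmt>
    <publicationStmt>
      <distributor>Hypothetical Corpus Project</distributor>
    </publicationStmt>
    <sourceDesc>
      <bibl>Converted from a printed source (details would go here)</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```

A real CES document would validate against the CES DTD linked from the page, and would add the encoding and profile descriptions appropriate to its conformance level.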
Data-Intensive Linguistics (DIL) is an online book introducing tools and techniques for using text corpora and giving the basics of statistical natural language processing. It presents: UNIX corpus tools; probability and information theory and their application to computational linguistics; fundamental techniques of probabilistic language modelling; and implementation techniques for corpus tools. The book is divided into five chapters: an overview of DIL and its historical roots; finding information in text (tools for: finding and displaying text; concordances; and collocations); collecting and annotating corpora (corpus design; SGML (Standard Generalised Markup Language) for computational linguistics; annotation tools); statistics for DIL; and applications of DIL. The book is a useful resource for anyone interested in computational linguistics, corpus linguistics and natural language processing.
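The collocation statistics that the book introduces can be illustrated with a small Python sketch scoring adjacent word pairs by pointwise mutual information (PMI), one of the standard association measures in corpus statistics; the regex tokenisation and function names are illustrative choices, not the book's own code:

```python
import math
import re
from collections import Counter

def collocations(text, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (PMI)."""
    words = re.findall(r"[a-z']+", text.lower())
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    n = len(words)
    scores = {}
    for (w1, w2), c in bigrams.items():
        if c >= min_count:
            # PMI = log2( P(w1,w2) / (P(w1) * P(w2)) )
            pmi = math.log2((c / (n - 1)) /
                            ((unigrams[w1] / n) * (unigrams[w2] / n)))
            scores[(w1, w2)] = pmi
    return sorted(scores.items(), key=lambda kv: -kv[1])

sample = "strong tea and strong coffee but powerful computers and strong tea"
for pair, score in collocations(sample)[:3]:
    print(pair, round(score, 2))
```

PMI is known to over-reward rare pairs (in this toy sample the one-off bigram outscores the repeated one), which is one reason corpus statistics offers several competing association measures.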
Digital Humanities Quarterly (DHQ) is a freely available, peer-reviewed digital journal, published online, covering all aspects of digital media in the humanities. The first issue was published in spring 2007. The journal covers a wide area of the field, with articles on topics ranging from text-based computer games to electronic texts, linguistics, and literary theory. Indeed, the ejournal intends to co-publish articles with Literary and Linguistic Computing (the print digital humanities journal). DHQ is published by the Alliance of Digital Humanities Organizations (ADHO). An RSS feed is available so that you can learn of new DHQ material as it is posted.
'Digital Studies' (aka 'Le Champ Numérique') is a full-text refereed ejournal from Canada. The journal covers "the digital humanities, broadly construed", and publishes three issues each year. The journal is published online using a standard open ejournal system; although this places a "user" and "password" box on the front page, registration is not required to access articles. As of March 2009 there are 11 issues freely available online, some of which are 'reprints'. Some issues are themed and some contain conference proceedings. Themes include: 'Historical Dictionary Databases'; 'Technologising the Humanities / Humanitising the Technologies'; 'Collaborative Mind Technologies'; and the most recent issue, 'Reassembling the Disassembled Book', among others. Preference is given by the editors to interdisciplinary articles. Articles are available in HTML format, and there are also abstracts. The website has details of the editors, the Editorial Board, and the submissions process. Despite the secondary French title for this journal, it appears that the overwhelming majority of the articles are in English. The journal is offered under a Creative Commons Attribution 3.0 licence.
The Empirical Language Research (ELR) journal is an online peer-reviewed e-journal for any kind of linguistic research based on empirical corpus data. ELR journal is the relaunched English Language Research journal, with a changed name to reflect the emphasis on the use of empirical data for linguistic research. The journal will focus particularly on: empirical approaches to linguistic theory; multilingual corpora and translation; data-driven learning; natural language processing; and corpus-driven lexicography and lexicology. Launching the journal as a free online e-journal is a statement of support for 'the movement towards making academic research open and freely available rather than obscure, expensive, and inaccessible'. At the time of review the journal contained only two issues with three articles. This publication is of interest to both scholars and students of empirical linguistics.
Entwicklung und Implementierung eines Datenbanksystems zur Speicherung und Verarbeitung von Textkorpora is an online dissertation on the design of database systems for the storage and processing of text corpora. The dissertation starts with: an introduction to corpus linguistics; and an overview of some early and modern corpora including: the British National Corpus (BNC); the Bank of English; and the German language corpora developed at the Institut für Deutsche Sprache in Mannheim. The author discusses various aspects of corpus annotation, including: the choice of part-of-speech tag sets; automatic part-of-speech tagging; disambiguation; and parsing. The chapter on corpus analysis tools gives an overview of: text analysis and concordancing software; and such corpus analysis systems as: SARA developed for the BNC; COSMAS developed for work with German language corpora at the Institut für Deutsche Sprache in Mannheim; and the IMS Corpus Workbench developed at the Institut für Maschinelle Sprachverarbeitung in Stuttgart.
Subsequent chapters discuss the encoding of texts for linguistic corpora using SGML (Standard Generalised Markup Language) and the TEI (Text Encoding Initiative) Guidelines. The discussion of corpus markup covers 'Tokenisierung', the identification of the whitespace, words, sentences, figures and punctuation to be encoded, and the building of corpus and text headers following the guidelines for the TEI header. The rest of the dissertation describes the design and building of a text database using the corpus database system CORSICA.
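The 'Tokenisierung' step described above can be sketched in Python; the regular expression and the category names are illustrative assumptions, not CORSICA's actual rules:

```python
import re

# Classify each token as a word, a figure (number), or punctuation.
# Numbers are matched first so that decimal figures like "4,5" stay whole.
TOKEN_RE = re.compile(r"\d+(?:[.,]\d+)*|\w+(?:-\w+)*|[^\w\s]")

def tokenise(sentence):
    tokens = []
    for t in TOKEN_RE.findall(sentence):
        if t[0].isdigit():
            kind = "figure"
        elif t[0].isalpha() or t[0] == "_":
            kind = "word"
        else:
            kind = "punct"
        tokens.append((t, kind))
    return tokens

print(tokenise("Das Korpus enthält 4,5 Millionen Wörter."))
```

A production tokeniser must also handle abbreviations, sentence boundaries, and markup already present in the text, which is where most of the real difficulty lies.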
This is the website of the European Language Resources Association (ELRA), which was established in Luxembourg in 1995. The association's primary aim is to provide language resources (spoken and written) for the human language technology community and for language engineering research. The site can be viewed in English and French. ELRA's activities involve the collection, evaluation and distribution of language resources. In addition, the organisation produces a quarterly newsletter which focuses on news related to language engineering. The content outlines of the newsletters are made available on the site; however, in order to access the articles, users have to subscribe online. There are also links to LREC (the Language Resources and Evaluation Conference), whose proceedings can be ordered online, as well as some external related links. The resources cover many languages, including Arabic, French, Polish and Turkish. The catalogue can be browsed, or searched by exact title or by keyword; the resources are available for purchase only. This site is of interest to research institutions and language engineers.
The Human Communication Research Centre (HCRC) is an interdisciplinary centre based at the universities of Edinburgh and Glasgow. It studies the cognitive and computational aspects of communication: spoken and written language, as well as visual, graphic and computer-based communication. The Centre provides a framework for the work of several research groups working in related areas such as computational linguistics, cognitive psychology, psycholinguistics, and many aspects of artificial intelligence. HCRC works on a number of externally funded projects and collaborates with companies and institutions in the area of improving the effectiveness of communication. It provides expertise in a wide range of disciplines related to human communication. It is also a major centre for postgraduate study. The site has a list of working groups and current, as well as past projects; a list of recent HCRC publications; a list of HCRC academic and related staff and their contact details; and a useful search facility.
The Head-driven Phrase Structure Grammar (HPSG) project home page is a resource containing information about the theory of HPSG as well as useful links to other sites dedicated to the theory. The resource contains a manual and a short introduction to leading ideas of HPSG along with links to publications by the project and conference proceedings, many in PDF format. This site is useful for students of grammar and computational linguistics. The theory of HPSG is a lexical approach to grammar theory and is based on the assumption that the lexical heads of syntactic structures select other constituents according to both syntactic and semantic criteria.
The Institute for Language, Cognition and Computation (ILCC), formerly the Institute for Communicating and Collaborative Systems (ICCS), at the University of Edinburgh engages in research into the nature of communication amongst humans and between humans and machines. The Institute looks at various modes of communication including text, speech, graphics, and computer dialogue systems. Research is carried out in such fields as: cognitive science; artificial intelligence; computational syntax and semantics; and human reasoning and psychologically realistic knowledge representation. ICCS work is intended to contribute in particular to the disciplines of linguistics and psychology. The website explains the research and teaching conducted by the ICCS, and includes a number of research papers that may be downloaded in zipped PostScript format, as well as abstracts of many more. The research section of the site links to several online projects developed by the Institute and various web pages for working groups in particular fields. These include groups devoted to natural language generation, grammatical theories, dialogue, and external representations for communication and human reasoning. The site also publicises workshops, seminars, lectures, and conferences organised by the ICCS.
Journal of Interesting Negative Results in Natural Language Processing and Machine Learning is an online electronic journal devoted to examinations of negative results from well-conducted experiments in natural language processing. The articles describe readily reproducible experiments based on sound justifications for the premises underlying them, and the negative results are analysed for their implications for further research. In addition to the articles themselves, available in PDF format, the site includes: details of the editorial board and editors; submission guidelines; and links to a forum on negative results in computer science and to journals covering negative results in other disciplines.
The Language Technology Activities in the Web Technology Sector website contains information about the activities of the Joint Research Centre (JRC) of the European Commission. The idea is to use language technology to overcome the language barrier between different European languages and to fight the information overload encountered on the web. The centre works with document analysis and retrieval systems for that purpose. The website contains information about different tools and methods for document analysis and retrieval, along with reports and articles in the area in PDF format. To facilitate this research two important language resources have been created, both built up from parallel texts from the European Commission. The JRC-Acquis is a large aligned parallel corpus containing parallel texts in 22 languages. The DGT-TM translation memory is a collection of translations between languages. Although smaller and more limited than the JRC-Acquis, most of the alignments in the translation memory have been manually corrected. Both resources are freely downloadable from the website in XML format with information about the encoding. Although the site is not easy to navigate, it still contains some very useful information and resources, and will be of value to researchers and students in computational and corpus linguistics, especially in the area of parallel corpora and translation studies.
Language Technology Group (LTG) is the website of a research and development group which works in the area of natural language processing (NLP) and language engineering. Based in the Human Communication Research Centre and the Institute for Communicating and Collaborative Systems in Edinburgh, its various projects include: work in large-volume text handling; text classification; collecting, annotating and distributing text; development of tools, corpora and other resources for language processing; processing large volumes of unstructured text and extracting structured information; document architectures and information management; and language processing in restricted domains. There are multiple links to the group's projects and products and a list of LTG software, most of which is available for free for academic research and for a small fee to industrial users. There is also a list of recent publications in NLP, accessible in PostScript and PDF formats, and a message board with recent news.
The Linguistic Data Consortium (LDC) is an open consortium of universities, companies and government research laboratories. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes. The intended use of LDC-Online is to facilitate linguistic research and development. LDC-Online retrieves only concordances or statistical summaries, not whole documents. The LDC's catalogue currently contains over 200 corpora of language data and is continuing to expand. These corpora are usually available on CD-ROM from the LDC, with full details of their contents provided on the website. To use the full range of services provided by the LDC, membership is required, details of which are provided at the site. The website requires registration in order to access all its constituent parts.
The website of the Linguistica project at the University of Chicago provides information on its work on computer applications for performing linguistic tasks, such as: analysis of the morphological structure of any unknown language (Linguistica); and morphological analysis of a lexicon, which could be particularly useful for fieldwork (Alchemist). The programs are free to download, and comprehensive instructions for download, installation and use are given. The texts of the main Linguistica-related publications are available online. The site is straightforward and easy to navigate; links to some groups involved in the research of natural language processing are provided. The site should be of interest both to researchers working in the field of automated natural language processing and to linguists using the applications.
Literary and Linguistic Computing (LLC) is a quarterly journal published by the Association for Literary and Linguistic Computing. Individual subscription to the journal provides automatic Association membership. LLC focuses on the application of computing and information technology to literature and language research and teaching: digital libraries; corpus databases; electronic dictionaries; electronic publishing and teaching. The site gives access to: contents and abstracts dating back to 1986; instructions for authors; online alerting service; and links to related journals.
This resource is available via the Oxford Text Archive (OTA) website, and can be downloaded as a zipped file, available in plain text, C programming language, and HTML formats. It is necessary to apply for approval from the OTA before download; links are provided to the terms and conditions of use and to a form for requesting permission. The publication is based on texts made available by the OTA (Macrae-Gibson, O.D., and J.R. Lishman. "Computer assistance in the analysis of Old English metre: methods and results - a provisional report", in Poetics in the Early Middle Ages: Essays in Honour of C.B. Hieatt, ed. M.J. Toswell, 1995; and Macrae-Gibson, O.D., and J.R. Lishman. "Variety of Old English metre usage", Neuphilologische Mitteilungen, 1999.)
MPQA Releases - Corpus and Opinion Recognition System website contains information about and gives access to the MPQA (multi-perspective question answering) opinion corpus, which is a collection of news articles annotated for opinions and sentiments. The corpus is annotated with a system that encodes opinions and sentiments, expressed in the texts, in terms of contextual polarity. The site contains information about the corpus and the instructions used for annotating the corpus. The corpus itself is freely available and a request for downloading the texts can be sent from the webpage. A lexicon is downloadable directly from the page. In addition, the website enables the user to request downloading of OpinionFinder, which is a computer program that automatically identifies subjective sentences as well as various aspects of subjectivity within sentences. This website is a useful resource for researchers and students of corpus linguistics, computational linguistics and semantics.
The website of the Natural Language and Information Processing (NLIP) Group based at the Computing Laboratory of the University of Cambridge offers information about the group and its research. The group is engaged in research into different aspects of natural language processing, such as: information retrieval; parsing technology; constraint-based processing; text summarisation; rhetorically-motivated search and navigation; text generation and regeneration; language acquisition and evolution; system evaluation; and text mining and bioinformatics. The site offers selected project publications, as well as the opportunity to join dedicated mailing lists. The website also provides information about: forthcoming seminars; graduate opportunities; and contact details of researchers involved in different projects.
Natural Language Software Registry (NLSR) is an online listing of a large number of software products used for natural language processing (NLP). It lists academic, commercial, and proprietary software and gives details about the terms and conditions of their use, but it does not distribute such products. The products can be searched through a search engine in the Queries section, according to: keyword; licence (free/commercial); kind of licence (academic/commercial); operating system; category (e.g. annotation tools; spoken language); or supported language. There is also an application form and submission guidelines for NLP developers who wish to contribute software to the NLSR. The Registry was initiated by the Association for Computational Linguistics (ACL).
New Voices in Translation Studies is a peer-reviewed electronic journal that aims to disseminate work by new researchers in Translation Studies on a broad range of themes. These themes include: human and computer-aided translation; machine translation; oral and sign language interpreting; and dubbing and subtitling. The journal is sponsored by the International Association for Translation and Intercultural Studies (IATIS) and the Centre for Translation and Textual Studies at Dublin City University. Articles are published as soon as they are ready and are organised in annual issues and occasional special issues. The first issue went online in 2005. Abstracts of recently submitted PhD theses in Translation Studies are also provided. Topics covered so far include: applying translation theory in teaching; translating comedy; gender-related issues in the English translations of contemporary Spanish novelists Esther Tusquets and Rosa Montero; punctuation shifts in Italian translations of Virginia Woolf serving to eliminate salient traits of Woolf's female sentence; translating Gulliver's Travels into Finnish; corpus analysis for scientific writing and translation; and translating names in children's fantasy literature. This is a promising new journal which will be worth watching in the future.
The Open Roget's Project is attempting to create a fully functional lexical resource, based on Roget's Thesaurus, for Natural Language Processing (NLP). The resource consists of the data from the thesaurus and is implemented in Java. There are two versions of the resource, based on the two editions of the thesaurus, from 1911 and 1987. The earlier version is freely available for download, while the later one is copyright protected. Besides the software, the site offers the project's documentation for download. In addition, there is a page with articles connected to the work on the project, downloadable as PDF files. This is not an introductory site, but it is a very useful resource for anyone interested in lexicology or NLP.
This website offers a basic course in the Perl programming language, concentrating on the processing of written texts. No previous knowledge of computer programming is assumed. A number of example programs are given, and students are encouraged to understand these and then adapt them. There are plenty of exercises, and answers to some of these are provided online. The course was created by Paul Bennett, formerly of UMIST and now at the University of Manchester. It has been used in both undergraduate and postgraduate teaching. There are links to further online Perl courses.
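To give a flavour of the kind of text-processing exercise such a course typically covers, a word-frequency count over a short text might look like the following. This is a minimal sketch for illustration only, written in Python rather than Perl, and the sample text is invented; it is not taken from the course itself.

```python
import re
from collections import Counter

# A toy text standing in for a literary corpus (invented sample).
text = "The cat sat on the mat. The mat was flat."

# Lower-case the text and extract runs of letters as word tokens.
words = re.findall(r"[a-z]+", text.lower())

# Count how often each word occurs.
counts = Counter(words)

# List the words in descending order of frequency.
for word, n in counts.most_common():
    print(word, n)
```

Adapting a small program like this, for example to ignore a list of stop words or to read the text from a file, is exactly the sort of incremental exercise such courses encourage.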
The Preposition Project website describes a project to characterise English prepositions in a form suitable for use in Natural Language Processing (NLP). The descriptions cover 334 prepositions and 673 senses in total. Each sense has been given a syntactic and semantic description, along with sample usages from the Oxford Dictionary of English. The use of prepositions is by no means straightforward in English and poses a problem for NLP applications; this project aims to provide information that is useful for such applications. For the most common prepositions, examples from the FrameNet database were collected and analysed. The website is simple in its design but contains important information about the project. The database can be searched online or downloaded, and there is a collection of articles in PDF format. This site is valuable for researchers and students in the areas of semantics and NLP.
Research and Development Unit for English Studies (RDUES) is the website of a research unit, based at the University of Central England, which consists of a team of corpus linguists and statisticians engaged in developing electronic databases and tools for the description of modern English language in use. Since the Unit's inception in 1989, work has progressed on various projects, all of which are summarised on the website. These have included: Neologisms in Journalistic Text; Analysis of Verbal Interaction and Automated Text Retrieval (AVIATOR); Automatic Collocational Retrieval of NYMs (ACRONYM); Analysis and Prediction of Innovation in the Lexicon (APRIL); System of Hypermatrix Analysis, Retrieval, Evaluation and Summarisation (SHARES); and WebCorp, a suite of tools for accessing the World Wide Web as a corpus. Most of the databases are not directly accessible from this site, although demonstration entries are provided in some instances. The WebCorp search engine is, however, publicly available. The site includes a bibliography of RDUES publications, some of which are available online. Many of the project description pages are also accompanied by more specific bibliographies.
ReCALL is a journal published by Cambridge University Press on behalf of the European Association for Computer Assisted Language Learning (EUROCALL). It is issued twice a year, in May and in November, and is also available online to subscribers. The May issue normally contains selected papers from the previous year's EUROCALL conference. The journal contains articles relating to theoretical debate on language learning strategies and their influence on practical courseware design and integration, as well as regular software reviews. The website provides information about the journal, details of how to subscribe, and notes for contributors, as well as access to the contents lists of current and previous issues.
Språkteknologi.se (Language technology) is the home page of the national language technology centre in Sweden. The site functions as a gateway to organisations and companies that work on developing language technology resources, and to collections of texts, mostly in PDF format, concerning the subject. It contains news, information and links to resources concerning language technology and computational linguistics in Sweden, in both English and Swedish. Language technology is defined here as a cross-disciplinary subject concerned with the use of computers in the modelling of natural language and as tools for simplifying and improving communication between people, and between people and computers. This site may be useful for researchers and students interested in Swedish resources within the field.
Survey of the State of the Art in Human Language Technology is an online book which surveys the state of the art in human language technology research around the world. It looks at human-computer communication using natural communication skills, and at research and development activities dealing with the generation, coding, recognition, interpretation and translation of language. The book is a collection of articles written by 97 authors from different countries and provides an overview of the field: the main areas of research; the capabilities and limitations of current technology; and technical challenges and problems. The articles are organised into 13 chapters, and special efforts have been made to create a consistent and coherent piece of work out of the contributions of a large number of people. The project was supported by grants from the National Science Foundation, USA, and the European Commission.
This is the home page of the Systemic Meaning Modelling Group at Macquarie University in Sydney, Australia. It contains information about the research activities and projects of the group, along with more general information about Systemic Functional Grammar (SFG), the linguistic theory founded by M. A. K. Halliday. SFG is based on the belief that language is about meaning and that its main function is communication. The theory suggests that the study of grammar should make use of empirical data such as language corpora. The site is divided into two parts: the first is concerned with the group itself and contains information about the members and their research; the second contains more general information about Systemic Functional Grammar, aimed both at prospective students and at curious readers. Although the Virtual Library contained some dead links at the time of review, it also contains some useful articles about the theory. The article 'Systemic functional grammar: a first step into the theory' by Christian Matthiessen and M. A. K. Halliday is a good and quite comprehensive introduction to the most important elements of SFG, and is recommended for anyone wanting to get to know the theory.
Text Analysis Portal for Research at the University of Alberta (TAPoR) is a project that aims to collect and make available tools for text analysis. It functions as a portal for a large set of useful computer tools that allow the researcher to visualise and analyse any plain, HTML or XML text. One important idea is that the portal should give users access to these tools without their having to download and install software on their own computers; the tools of TAPoR are freely available online. The site is somewhat confusing, but it contains not only the tools but also tutorials and instructions for their use, along with a set of texts called recipes, which serve as instructions for a wide range of research methods. This site contains invaluable tools for anyone interested in text analysis, in both linguistics and literary studies.
Text Technology: The Journal of Computer Text Processing is an online electronic journal covering the application of computing technologies to textual studies. Its subject coverage includes the use of computers for textual creation, acquisition, analysis, editing and translation. An archive of issues from 2003 onwards is available on the website: the constituent articles of each issue are presented in full text in PDF format. The archive, which starts at volume 12, no. 2 (2003), is incomplete, with several issues missing. In addition to this archive, the website also includes submission information and contact details for the editorial board. No full-text searching of issues is possible, although the tables of contents of archived issues are listed on a single page, allowing simple retrieval with a web browser's search function.
Treebank Wiki is an online collection of links to treebanks: text corpora marked up with syntactic structures, which serve as analytical material in corpus linguistics. The treebanks listed are in a variety of languages, including: English; Portuguese; Catalan; Spanish; Danish; Dutch; Czech; Romanian; Russian; Slovenian; and Italian. Many are available in the graphical format eGXL. The wiki also includes links to the SFB 673-X1 project, which deals with multimodal alignment corpora, and to Indogram, a project which examines the automatic induction of probabilistic document grammars as models of web genres.
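As an illustration of what treebank markup looks like, syntactic structure is often recorded as a labelled bracketing, as in the well-known Penn Treebank style, where a parse is written as nested parentheses. The following sketch is hypothetical and not tied to any of the treebanks listed above; it simply reads such a bracketing into a nested list structure:

```python
import re

def parse_tree(s):
    """Parse a Penn-style bracketed string into nested lists of the
    form [label, child, child, ...], where leaf children are words."""
    # Tokenise into parentheses and non-space, non-paren symbols.
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "("   # every constituent opens with "("
        pos += 1
        label = tokens[pos]         # the category label, e.g. S, NP, VBZ
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())      # nested constituent
            else:
                children.append(tokens[pos])  # word token at a leaf
                pos += 1
        pos += 1                    # consume the closing ")"
        return [label] + children

    return parse()

# "The cat sat", annotated with a sentence, noun phrase and verb phrase.
tree = parse_tree("(S (NP (DT The) (NN cat)) (VP (VBZ sat)))")
```

Real treebanks use richer label sets and annotation conventions than this toy example, but the underlying idea — a corpus sentence paired with a labelled hierarchical structure — is the same.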
The Web-SLS: European Student Journal of Language and Speech website contains a number of articles concerned with speech technology. The articles are written by students but have been through a reviewing process before publication, and the texts are freely downloadable as PDF files. The project is supported by the European Speech Communication Association (ESCA, now the International Speech Communication Association, ISCA); the European Chapter of the Association for Computational Linguistics (EACL); and the European Network in Language and Speech (ELSNET). Although the site claims to have been updated in April 2009, no article is more recent than March 2002, and the links to the supporting associations are out of date. Despite this, it is a valuable resource for anyone studying computational linguistics and, in particular, speech technology.