No.31  
January, 2013    

Language Grid Jakarta Operation Center’s Web Site

Indonesian Language Services

Language Grid Jakarta Operation Center

The Information Retrieval Laboratory at the University of Indonesia began operating the Language Grid, which can be used for non-profit and research purposes, as part of the Language Grid Jakarta Operation Center.

Currently, four of their Indonesian language services (http://langrid.cs.ui.ac.id/langrid-2.0/language-services) have been registered: two Indonesian morphological analyzers, an Indonesian-English translator, and an Indonesian speech recognition engine. The Jakarta Operation Center continues to register various types of language services centered on the Indonesian language.

The Jakarta Operation Center has already concluded a service grid agreement with the Kyoto Operation Center as an affiliated operator, which enables users of the Jakarta Language Grid to access language services on the Kyoto Language Grid. The reverse does not apply: users of the Kyoto Language Grid are not permitted to access language services on the Jakarta Language Grid. Those wishing to use the Indonesian language services registered on the Jakarta Language Grid should therefore first conclude a service grid agreement with the Language Grid Jakarta Operation Center. For directions on how to join the Language Grid, please see:
(http://langrid.portal.cs.ui.ac.id/langrid/procedure.html).





Language Services and Resources

◆A new Service Type was added.

【Morphemes Dependency Parser】A dependency parser that takes the lemma, part of speech, and word as input.


◆17 new Language Services were added. (Service Names, Provider, (Copyright), Supported Languages)

【Morphemes Dependency Parser】
  • MaltParser, Language Grid Kyoto Operation Center, (Johan Hall, Jens Nilsson and Joakim Nivre), All Languages.
【Morphological Analyzer】
  • Yahoo! Morphological Analyzer, Language Grid Kyoto Operation Center, (Yahoo! Japan), Japanese.
  • Frog, Language Grid Kyoto Operation Center, (ILK Research Group (Tilburg University, the Netherlands) and CLiPS Research Centre (University of Antwerp, Belgium)), Dutch.
  • HunPos, Language Grid Kyoto Operation Center, (Peter Halacsy, Andras Kornai, Csaba Oravecz), All Languages.
  • Illinois Part-of-Speech Tagger, Language Grid Kyoto Operation Center, (Nick Rizzolo and Dan Roth), English.
  • Kyoto Text Analysis Toolkit, Language Grid Kyoto Operation Center, (Graham Neubig), Japanese and Chinese.
  • Z Part-of-Speech Tagger, Language Grid Kyoto Operation Center, (Yue Zhang and Stephen Clark), English and Chinese.
【Keyphrase Extractor】
  • Gensen Web (Euro), Language Grid Kyoto Operation Center, (Hiroshi Nakagawa, Akira Maeda and Hiroyuki Kojima), English, Spanish, French, Italian, Finnish and Swedish.
  • Gensen Web (Japanese), Language Grid Kyoto Operation Center, (Hiroshi Nakagawa, Akira Maeda and Hiroyuki Kojima), Japanese.
  • Gensen Web (Chinese), Language Grid Kyoto Operation Center, (Hiroshi Nakagawa, Akira Maeda and Hiroyuki Kojima), Chinese.
  • Yahoo! Keyphrase Extractor, Language Grid Kyoto Operation Center, (Yahoo! Japan), Japanese.
【Text Summarizer】
  • Japanese Summarizer, Language Grid Kyoto Operation Center, (Kazuhiro Osawa), Japanese.
  • Open Text Summarizer, Language Grid Kyoto Operation Center, (Nadav Rotem), English, German, Spanish, Russian, Hebrew and Esperanto.
【Dependency Parser】
  • Yahoo! Dependency Parser, Language Grid Kyoto Operation Center, (Yahoo! Japan), Japanese.
  • Z Parser, Language Grid Kyoto Operation Center, (Yue Zhang and Stephen Clark), Chinese.
【Named Entity Tagger】
  • NExT, Language Grid Kyoto Operation Center, (Nagoya Institute of Technology, Department of Computer Science), Japanese.
【Text to Speech】
  • Open JTalk, Language Grid Kyoto Operation Center, (Mie University, Ritsumeikan University), Japanese.

Language Grid Project Introduction: YMC-Viet project

Sending heat/humidity data on their cell phones
(during YMCViet Youth Training)

Measuring the height of the rice-plant
(during YMCViet Youth Training)

In 2010, the NPO Pangaea held the YMCViet Project in Vietnam together with stakeholders to try out the "YMC" model. YMC stands for "Youth Mediated Communication," a support model for rural communities in developing countries that especially targets people who cannot read or write. Because of their historical backgrounds and lifestyles, many farmers have difficulty reading and writing. They rarely attend agricultural workshops for fear of not understanding the content. Statistics show that more than half of Vietnamese farmers are likely to be farming without correct agricultural knowledge. As a result, they struggle with problems such as low income and environmental destruction.

Under the YMCViet project, children come to the ICT center once a week to learn about agriculture, observe their parents' rice fields twice a week, and measure temperature and humidity every day. In the rice fields, they observe the plants' height and color, and they use their cell phone cameras to photograph any pests or problems they find. They ask questions online using the YMC System provided by Pangaea, and Japanese agricultural experts answer those questions via the same system. The children then go home and teach their parents what they learned in the YMCViet project. The Language Grid service eases the language barrier: it translates questions and offers automatic responses, and volunteer staff members correct translation mistakes when necessary. After each workshop at the ICT center, children explained what they had learned to their parents for 15 minutes on average. Details of the first YMCViet project were featured in the "Language Grid News Letter #28."

We saw how well the YMC project progressed the last time. Youth who had never been interested in agriculture before participated actively in the project by going to the rice fields, asking their parents about agricultural problems, and finding solutions by asking rice experts. We were convinced that the project made the connections between parents and children much stronger. Both the national and local governments of Vietnam are looking forward to future development. At a previous YMCViet project press conference, Dr. Nguyen Chien (Head of Statistical Information Center at the Department of Agriculture and Rural Development in Vietnam) said, "More than one million farmers in Vietnam need the YMC project." On the other hand, we also found that many challenging tasks remain. For example, how do we reduce the translators' burden when they correct mistakes made by machine translation? How do we interpret the information about rice fields that was captured on the children's cell phones? How do we share knowledge and expertise between Japanese experts and local experts?

To address these tasks, we started the second round of the YMCViet project in the same area of the Mekong Delta in Vietnam. Although there are only half as many youth participants as in the previous trial, we are focusing especially on improving communication between the Japanese agricultural experts and the youth participants in Vietnam. Children usually attend the YMCViet Activity at the YMCViet center every Sunday morning, and local staff members send an Activity Report after every activity. The Language Grid Toolbox, a multilingual bulletin board system, is used as the communication channel. Pangaea has used it for several years in collaboration with Kyoto University. With the Toolbox system, we can communicate with local staff who cannot speak English and help solve any problems they have.

Many researchers see the project's potential as a research field and have already published academic journal papers and conference papers in various research domains such as service sciences, agricultural information, and sociology. Pangaea has been collaborating with those academic communities and trying to improve the YMC model in order to extend the service not only to the "more than one million farmers in Vietnam" but also to other knowledge domains and to more countries that are in need of YMC! (November 2013, NPO Toshiyuki Takasaki)


Screenshot of the YMC answering system

2012 YMCViet Project Group Photo


Language Grid User Introduction (the 31st): Europe's First Language Grid "Linguagrid" (CELI)

Group Photo of Linguagrid Administrators

CELI was established in 1999 in Turin, Italy, by a group of language technology researchers with the aim of addressing the growing market in language engineering. Today, CELI is a provider of software solutions in the field of Natural Language Processing (NLP) in mono-, multi-, and cross-language perspectives. With 36 employees and collaborators, CELI operates in Turin, Italy and, since 2006, in Grenoble, France.

In 2010, CELI put online a European node of the Language Grid, known as Linguagrid. Linguagrid hosts a family of Web services developed, published and maintained by CELI, and it is open to private and public institutions. Services available in Linguagrid include ISO certified services, services in the Language Grid standardized format (Morpho Analysis - MAF, Language Identification, Dictionary Translation), community recognized services (Text Classification, Text Clustering), and CELI developed services (Dependency Parsing, Sentiment Analysis).

Among the new and captivating services available in Linguagrid are domain-adaptable, corpus-based linguistic services on a standard LRaaS infrastructure. A corpus ingestion service (implemented as a Web application) allows users to upload corpora of documents and then generate a web-accessible corpus from them. A search query can then be used to filter documents from the original corpus, and the corpus-based services can use the filtered documents as a Dynamic Corpus during model generation. This feature makes it possible to fully exploit the expressive power of the Solr search engine (e.g. faceted browsing, proximity search, wildcard queries) for domain adaptation. By placing a search engine between the corpus and the services as an intermediate layer, this approach enables full reuse of corpora across different applications (classification, clustering, etc.), since distinct models can be generated from the same corpus. (November 2012, Milen Kouylekov, CELI, Turin, Italy)
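
As a rough illustration of the kind of query that could select a Dynamic Corpus, the sketch below builds Solr select URLs that use a wildcard query, a proximity query, a filter query, and a facet request. It uses only standard Solr request parameters (`q`, `fq`, `rows`, `wt`, `facet`, `facet.field`). The endpoint, the core name `corpus`, and the field names `text`, `domain` and `language` are assumptions made for this example and do not describe Linguagrid's actual configuration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and field names, for illustration only; they are not
# taken from Linguagrid.
SOLR_SELECT_URL = "http://localhost:8983/solr/corpus/select"


def build_filter_query(text_query, domain=None, facet_field=None, rows=100):
    """Return a Solr select URL that narrows a corpus down to a sub-corpus."""
    params = [("q", text_query), ("rows", str(rows)), ("wt", "json")]
    if domain is not None:
        # fq restricts the result set without affecting relevance scoring.
        params.append(("fq", f'domain:"{domain}"'))
    if facet_field is not None:
        # Ask Solr to also return document counts per value of this field.
        params.append(("facet", "true"))
        params.append(("facet.field", facet_field))
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Wildcard query: documents containing a term that starts with "classif",
# restricted to one (assumed) domain.
wildcard_url = build_filter_query("text:classif*", domain="finance")

# Proximity query: "interest" and "rate" within 5 positions of each other,
# with facet counts over an (assumed) language field.
proximity_url = build_filter_query('text:"interest rate"~5', facet_field="language")

print(wildcard_url)
print(proximity_url)
```

Two different queries against the same uploaded corpus yield two different document sets, which is the sense in which distinct models can be generated from one corpus.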


Periodic Maintenance

Periodic maintenance will be carried out as below. If you wish to use the Language Grid during this period, please contact us in advance at operation [at] langrid.org.
‐February 12th, Tuesday, from 10:00 to 13:00 (JST)
‐March 12th, Tuesday, from 10:00 to 13:00 (JST)
‐April 9th, Tuesday, from 10:00 to 13:00 (JST)

*Periodic maintenance was previously carried out on the first Monday of every month from 18:00 to 21:00 (JST). From now on, it will be carried out on the second Tuesday of every month from 10:00 to 13:00 (JST).
Please refer to the Language Grid portal sites at: http://langrid.org/en/.
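
For readers who want to add the maintenance windows to their own calendars, the rule above (second Tuesday of every month, 10:00 to 13:00 JST) can be computed as in the minimal sketch below. The function names are our own and purely illustrative, and the published schedule on the portal site remains authoritative if a date is ever moved.

```python
import calendar
from datetime import date, datetime, time, timedelta, timezone

JST = timezone(timedelta(hours=9))  # Japan Standard Time is UTC+9, no DST


def second_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st to the first Tuesday (0 if the 1st is a Tuesday).
    offset = (calendar.TUESDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def maintenance_window(year, month):
    """Return (start, end) datetimes in JST for that month's window."""
    day = second_tuesday(year, month)
    start = datetime.combine(day, time(10, 0), tzinfo=JST)
    end = datetime.combine(day, time(13, 0), tzinfo=JST)
    return start, end


for month in (2, 3, 4):
    start, end = maintenance_window(2013, month)
    print(start.isoformat(), "to", end.isoformat())
```

For 2013, this gives February 12, March 12 and April 9, matching the dates announced above.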