lucene-java-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject FYI: parallel corpus in 22 languages
Date Thu, 24 Jan 2008 21:02:21 GMT
Hi all,

Just FYI, perhaps this is old news for you ... The large parallel
corpus linked below is freely available, and it is pairwise
sentence-aligned for all language combinations. It looks like a good
resource for linguistic information such as frequent words and
phrases, n-gram profiles, etc.

http://wt.jrc.it/lt/Acquis/
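
As a rough illustration of the kind of n-gram profile mentioned above,
here is a minimal sketch in plain Java. It is not taken from the corpus
tooling or from Lucene; the class name NGramProfile, its two methods,
and the sample string are all hypothetical and exist only for this
example. It counts character n-grams in a string and lists the most
frequent ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or the corpus.
public class NGramProfile {

    /** Counts every character n-gram of length n in the given text. */
    public static Map<String, Integer> countNGrams(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        // Slide a window of width n over the text, one character at a time.
        for (int i = 0; i + n <= text.length(); i++) {
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns up to k n-grams, most frequent first; ties are broken alphabetically. */
    public static List<String> topNGrams(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up sample text; in practice this would be one language's side of the corpus.
        String sample = "the committee of the regions";
        Map<String, Integer> profile = countNGrams(sample.toLowerCase(), 3);
        System.out.println(topNGrams(profile, 5));
    }
}
```

Building one such profile per language from the aligned texts would
give comparable statistics across languages, though a real profile
would likely need normalization and tokenization choices that this
sketch leaves out.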


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



