Computer Program Targets Hidden Meaning of Lost Language
A computer-driven statistical modeling process, similar to those used to make sense of complex data sets such as genetic sequences and economic information, is now being applied to a written script whose meaning has been lost for millennia.
The Indus Valley region spans nearly all of Pakistan as well as portions of India, Afghanistan and Iran, and it has produced thousands of artifacts featuring a written script that includes approximately 500 symbols. The many tablets that feature the script date back to roughly 2000 B.C.
Decoding the language has been elusive, and not for want of effort. Most inscriptions include very few characters, there have been no findings that include the Indus script alongside any other language, and linguists are left only to guess what the associated spoken language could have been.
American and Indian scientists are collaborating in an effort to glean as much as they can from the mysterious text. After creating a database of the individual script symbols and loading the full texts, the team ran a statistical analysis of how and where characters appear in relation to other characters, yielding some early insights into construction and syntax.
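To give a rough sense of what counting "how and where characters appear in relation to other characters" can look like, here is a minimal Python sketch. It is not the researchers' actual model: the sign names (`fish`, `jar`, `arrow`, `wheel`), the four-inscription corpus and the function names are all made up for illustration. The sketch tallies which signs follow which, and which signs tend to open or close an inscription.

```python
from collections import Counter

# Hypothetical toy corpus: each inscription is a sequence of invented sign labels.
# Real Indus inscriptions and sign identifiers are NOT represented here.
inscriptions = [
    ["fish", "jar", "arrow"],
    ["fish", "jar"],
    ["wheel", "fish", "jar"],
    ["arrow", "wheel"],
]

def pair_counts(texts):
    """Count how often each sign is immediately followed by another sign."""
    counts = Counter()
    for text in texts:
        for left, right in zip(text, text[1:]):
            counts[(left, right)] += 1
    return counts

def position_counts(texts):
    """Count how often each sign starts or ends an inscription."""
    starts = Counter(text[0] for text in texts if text)
    ends = Counter(text[-1] for text in texts if text)
    return starts, ends

def following_probabilities(texts, sign):
    """Estimate the probability of each next sign, given the current sign."""
    pairs = pair_counts(texts)
    total = sum(n for (left, _), n in pairs.items() if left == sign)
    if total == 0:
        return {}
    return {right: n / total for (left, right), n in pairs.items() if left == sign}

pairs = pair_counts(inscriptions)
starts, ends = position_counts(inscriptions)
fish_next = following_probabilities(inscriptions, "fish")
```

In this toy corpus, "fish" is always followed by "jar" and "jar" often closes an inscription. Regularities of that kind, found across thousands of real texts, are the sort of pattern that can hint at underlying structure.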
And while the Indus script case is far from cracked, lead researcher Rajesh Rao of the University of Washington explains to Discovery News why "quit" is not in his vocabulary:
"There are some who say the script can never be deciphered without a bilingual text like the Rosetta Stone or really long texts. I am however optimistic that given a few more years, we may be able to at least narrow down the language family of the script by using computer analysis to gain an in-depth understanding of the underlying grammar."
Photo courtesy of PHGCOM, via Wikimedia Commons


