Cool Idea #9: Tatoeba

Anybody who’s taken a language course in high school or college is familiar with the thick two-way dictionaries made for learners. For those of us who are native English speakers with some experience in our target languages, those are pretty useful. But too often, their entries are useless to new students. When I took my first Spanish class, for example, I had no clue how to conjugate ser (to be) in the preterite tense, let alone what the preterite tense was. The dictionary didn’t provide that, and because of this I had no way to express things other than in the present tense.

I could elaborate more, but I’ve got a ton of other work to do right now, so I’ll have to cut this a little short. What I’m trying to say is that traditional dictionaries are great tools for seasoned language learners who just can’t recall the right word. When it comes to helping the new students, though, they fall flat because the words exist on their own. Tatoeba takes a different approach by giving translations of sentences, not individual words. Because everything is generated by users who are for the most part accessible and because it teaches with actual examples as opposed to rules, I think Tatoeba could be a great way to pick up a language.

Here are all the versions of “My name is Jack.” And here’s a video explaining the idea behind Tatoeba.

This is a great idea, and I’ve already started translating some sentences in English, Spanish, and Ukrainian. But I also see a whole lot of untapped potential. As translation and language education become more open and more collaborative processes, this same concept of bit-by-bit translation across multiple languages can be exported. I’d like to see within the next few years a large-scale project focused on translating books in the public domain. A crowdsourced version of, say, Moby-Dick in Esperanto would be a great read, if you ask me.

One issue: I can’t seem to get Arabic text (or anything else written from right to left) to type correctly. Maybe it’s just me, though.


Idea #20: Use U-Haul trucks as a “human migration indicator”

First, I’d like to apologize to any regular readers for my absence. It’s my senior year of high school, and when I’m not busy with clubs, college applications, or studying, I’m usually flat-out tired. I know that’s hardly an excuse, but it’s something.

Call me Captain Obvious, but when people in America move, they tend to bring their belongings with them. Though the amount of stuff they bring depends on how much they own or are willing to leave behind, most families can’t fit two closets, a kitchen, three bedrooms, and Fido into the Honda, so they turn to moving companies. U-Haul is one of those companies, and it’s unique in that it’s more or less a self-service system, meaning that customers do what they desire with the product. That’s what gives its network so much potential as an indicator.

This idea writeup is short and simple. People in different professions often wonder where the population moves. U-Haul vehicles are already location-tracked, and customers pick up and deposit them at their own start and finish points. By taking the start data and the finish data and putting them on a map, analysts can see trends in movement across North America. During the upcoming winter holidays, will more New Englanders travel South than West? During which month do the most people depart from Vancouver? Data from the trucks can give views that answer questions like these and allow for market research that is nothing if not interesting.
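
To make the start-and-finish idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `rentals` records, the region names, and both helper functions are made up, and real rental data (if U-Haul ever shared it) would surely look different.

```python
from collections import Counter

# Hypothetical one-way rental records: (pickup_region, dropoff_region, month).
# These rows are invented for illustration, not real U-Haul data.
rentals = [
    ("New England", "South", 12),
    ("New England", "South", 12),
    ("New England", "West", 12),
    ("Vancouver", "Toronto", 7),
    ("Vancouver", "Calgary", 7),
    ("Vancouver", "Calgary", 3),
]

def flow_counts(records, month=None):
    """Count trips per (origin, destination), optionally for one month only."""
    return Counter(
        (origin, dest)
        for origin, dest, m in records
        if month is None or m == month
    )

def busiest_departure_month(records, origin):
    """Return the month with the most departures from origin, or None if there are none."""
    months = Counter(m for o, _, m in records if o == origin)
    return months.most_common(1)[0][0] if months else None

december = flow_counts(rentals, month=12)
south = december[("New England", "South")]
west = december[("New England", "West")]
print(f"New England in December: {south} trips south, {west} trips west")
print("Busiest departure month for Vancouver:", busiest_departure_month(rentals, "Vancouver"))
```

Counting origin-destination pairs like this is about the simplest possible version; plotting the counts on a map would be the next step.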

One limitation to this idea is the budget factor. U-Haul only caters to one segment of a broad market, and more expensive companies that include labor in their services quickly snap up other potential clients. As a result, the market’s “migration patterns” that come from analysis like this probably won’t be entirely accurate.

Cool Idea #7: Sign language messaging

I should probably let them speak for themselves (bad pun), but I imagine that any mobile communication is difficult for the deaf. To my limited knowledge, phone calls are out of the question, text messages are tedious and ambiguous, and video calling is expensive and still in its infancy. That’s why this research project by University of Washington students is so promising. Essentially, the team has created software called MobileASL that works with the cameras built into video calling-capable cell phones. The software recognizes American Sign Language gestures made by one caller and transmits them, live, to the other caller’s device in the form of small files, presumably pictures. This idea is powerful because it is much less taxing than video is on users’ batteries, networks, and wallets. It should also be more accurate than choppy video calls.

All in all, this project has great potential, and I’m sure it will be adopted widely. The only dilemma I see is that two-way communication requires front-facing cameras, something that I’ve only seen on a limited number of devices. Hopefully, these will become more commonplace as competitors mimic the iPhone or as MobileASL gains steam.

For more information, including a video explanation, navigate to the project’s official website at Alternatively, check out a blog post explaining more MobileASL features at From the Moon and Beyond.

Bonus: Feel free to let me know if you think that this sounds strange, but I’d like to propose the idea of calling the software ASCell for the sake of convenience.

Idea #15: Find primordial language by using an emotional database


For decades–or maybe even centuries–scholars and enthusiasts have been fascinated by the pursuit of an ancestral language that all humans trace their native tongues back to. There are several theories and methods concerning how exactly to find roots common to all words. However, current methods all seem a bit homogeneous. From my understanding, they tend to use brute force to analyze words across several languages, viewing the vowels and consonants until some semblance of a pattern emerges from the jumble. So far, it’s worked to some extent, with experts having constructed a vocabulary of over 500 words in a language they call “Nostratic.” It is the oldest vocabulary we’ve been able to find.

Nostratic, if it existed, is thought to have been spoken about 20,000 years ago. A Proto-Human Language, on the other hand, may be up to 180,000 years older. That leaves quite a few years that we know nothing about, language-wise. However, I believe that a new method may be able to narrow this gap significantly.

The Concept:

Darwin was one of the first to establish the idea that emotions–more specifically, the facial expressions associated with them–are universal among humans. More recent work by researchers such as David Matsumoto supports this. It’s arguable that because these expressions shape the face when they are made, they can influence the sounds people make when speaking while they express an emotion, in terms of both vowel pitch and actual consonants. If this is valid, then we can use the “emotional value” of each word in several languages to see which ones match. Back when human brains were first developing Broca’s Areas, which are responsible for speech, humans had probably already been able to feel emotion. Assuming that our ancestors did what was most convenient to them, it’s possible that they assigned easy-to-make sounds to objects with emotional associations. If, say, a fire made Uggabugga (by the way, how was he named?) happy, he may have started referring to it with sounds that are convenient to make while smiling.

The approach requires the construction of a place for several people to find words in several languages. They can look at a word, then classify it emotionally. Is it a happy word? An angry word? An icky word? And so on. Doing this will help to create a list of words strongly associated with certain emotions. Additionally, participants can create a list of words that come to mind when a certain phrase is read. This will build a thesaurus of sorts.
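
As a rough sketch of how the rating step might be tallied, here is a small Python example. The `ratings` rows, the emotion labels, and the `min_votes` and `min_share` thresholds are all assumptions I made up for illustration; a real site would need far more votes and probably a smarter measure of agreement.

```python
from collections import Counter, defaultdict

# Hypothetical votes: (word, language, emotion picked by one participant).
ratings = [
    ("fire", "en", "happy"),
    ("fire", "en", "happy"),
    ("fire", "en", "angry"),
    ("fuego", "es", "happy"),
    ("fuego", "es", "happy"),
    ("mud", "en", "icky"),
]

def classify(votes_list, min_votes=2, min_share=0.6):
    """Map (word, language) to (dominant emotion, share of votes).

    A word is kept only if it has at least min_votes votes and the
    dominant emotion received at least min_share of them.
    """
    votes = defaultdict(Counter)
    for word, lang, emotion in votes_list:
        votes[(word, lang)][emotion] += 1

    labels = {}
    for key, counter in votes.items():
        total = sum(counter.values())
        emotion, count = counter.most_common(1)[0]
        if total >= min_votes and count / total >= min_share:
            labels[key] = (emotion, count / total)
    return labels

labels = classify(ratings)
for (word, lang), (emotion, share) in sorted(labels.items()):
    print(f"{word} ({lang}): {emotion}, {share:.0%} agreement")
```

With these made-up votes, "fire" and "fuego" both come out as happy words, while "mud" is dropped because a single vote is not enough to call it strongly associated with anything.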

Once enough data has been gathered on a word, researchers can use its emotional classification to determine whether it fits with the idea of convenience. If “fire” is actually a term people associate with happiness, then combining this fact with research on happy sounds can be used to divine what Uggabugga used to call a flame.

The thesaurus can be used to see if words have a common root in the past, even if they sound different and are used in the same language.

The Pros:

  • All that’s required to collect the data is a set of dictionaries and a wiki-like website in which users find and evaluate words.
  • This approach crowdsources much of the “grunt work” of rating words, so quite a bit of the research more or less does itself.
  • Creating the “thesaurus” I mentioned can be done by several means such as web games, meaning that the data is more likely to be pure than it would be if participants were focused on finding an ancient language with their input.

The Cons:

  • This will likely have to be Internet-based, so researchers will probably run into trolls.
  • It’s possible that emotional reactions to objects have changed over the ages, so researchers may be working with improper data in some cases.
  • This will only provide the framework for a very old language. Unless outside methodology were to be implemented after conclusions have been made, we would still not know the language’s grammar system or how exactly it evolved into dialects.