How communication devices could turn into personal assistants
Posted: January 23, 2012 Filed under: Uncategorized | Tags: cell phones, directive detection, intention analysis, meeting request detection, sentiment analysis, text analytics
Mobile phones are no longer just communication devices. They are also used as follows:
a) As consoles for entertainment
b) As personal task management and planning tools
Keeping the second use in mind, it would be in the cell phone manufacturers’ best interests to develop highly integrated task management tools for cell phones.
Cell phones today lack seamless integration between the email and SMS applications and the task management applications.
When SMS or email messages about a task are received, the user gets little to no assistance in transferring the relevant data to a task management application.
What seems to be needed is a way to assist the user in capturing the information related to the task and moving it to a third-party task management application on the cell phone.
Intention recognition algorithms can do just that!
If an SMS says “What is the name of the person we met the other day?” it should be possible to detect that this is an inquiry and that the recipient now has the task of sending a response.
Another example is the following SMS “Can you please send me the draft report by 2 pm”. This SMS contains a directive and the recipient now has a task to complete by a deadline.
Yet another example is the following “Would 4 pm work for you?” This is a meeting request.
These categories of tasks can be identified by looking at incoming and outgoing SMS messages and classifying them into various categories of tasks.
There is also an entity extraction task involved (identifying the deadlines and time ranges).
Once task intentions are identified, the phone could take the following steps:
a) The phone confirms with the user whether or not a task needs to be created.
b) The phone passes the task to the user’s preferred task management tool.
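The steps above can be sketched with a few hand-written rules. This is only a toy illustration and not the intention recognition algorithm the post has in mind: the regular expressions, the category labels and the function names (`classify_intention`, `extract_times`, `propose_task`) are all assumptions made up for this example, and a real system would use a trained classifier and a proper entity extractor.

```python
import re

# Hypothetical patterns, chosen only to cover the three example messages.
TIME_RE = re.compile(r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\b", re.IGNORECASE)
MEETING_RE = re.compile(r"\b(?:work for you|meet|meeting|available)\b", re.IGNORECASE)
DIRECTIVE_RE = re.compile(r"^(?:please\b|(?:can|could|would|will) you\b)", re.IGNORECASE)


def extract_times(text):
    """Entity extraction step: return clock times such as '2 pm'."""
    return TIME_RE.findall(text)


def classify_intention(text):
    """Assign a message to one of the task categories from the post."""
    message = text.strip()
    if TIME_RE.search(message) and MEETING_RE.search(message):
        return "meeting_request"
    if DIRECTIVE_RE.search(message):
        return "directive"
    if message.endswith("?"):
        return "inquiry"
    return "other"


def propose_task(text):
    """Build a task proposal that the phone could confirm with the user."""
    intention = classify_intention(text)
    if intention == "other":
        return None
    return {"intention": intention, "times": extract_times(text), "text": text}


if __name__ == "__main__":
    for sms in [
        "What is the name of the person we met the other day?",
        "Can you please send me the draft report by 2 pm",
        "Would 4 pm work for you?",
    ]:
        print(propose_task(sms))
```

Only after the user confirms the proposal would the phone hand it to the preferred task management tool; that hand-off is left out here because it depends entirely on the tool.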
—
I just wanted to point you to an interesting dissenting view from a Google engineer. In the blog post “Will Google fight Apple’s Siri with Alfred?”, Alexei Oreskovic quotes Google’s head of mobile Andy Rubin as saying:
“I don’t believe that your phone should be your assistant. Your phone is a tool for communicating. You shouldn’t be communicating with the phone; you should be communicating with somebody on the other side of the phone.”
However, the same article also says the following:
“On Tuesday Google said it had acquired the tech company that has developed Alfred, a smartphone app that acts as a “personal assistant” to make recommendations based on your interests and your “context,” such as location, time of day, intent and social information.”
Intentions and Information
Posted: January 16, 2012 Filed under: Uncategorized | Tags: entities, entity extraction, event analysis, fact analysis, intention analysis, natural language processing, nlp, relation extraction, sentiment analysis, text analysis, text analytics
In the post “Intent on Intentions”, I’d talked a bit about the Speech Act Theory of Searle and Winograd.
In this blog post, I’d like to look at all other utterances. What purpose do utterances have if they are meaningful, but are not a Speech Act?
It turns out that meaningful utterances that do not convey Speech Acts, typically convey information. Information in turn comes in two flavours – events and facts. Facts represent states of the world (they describe relations between entities or describe properties of entities). Events represent changes.
For example, “London is in England” is a fact, whereas “London Bridge is falling down” is an event.
Entities are the things being talked about. In the sentences used to illustrate events and facts above, the following entities may be observed: “London”, “England” and “London Bridge”.
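One way to picture the distinction is as two small record types that both point at entities. The class and field names below (`Fact`, `Event`, `relation`, `change`) are illustrative assumptions for this sketch, not a schema taken from any particular tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fact:
    """A state of the world: a relation between (or property of) entities."""
    relation: str
    entities: Tuple[str, ...]


@dataclass(frozen=True)
class Event:
    """A change in the state of the world involving some entities."""
    change: str
    entities: Tuple[str, ...]


def entities_mentioned(*items) -> List[str]:
    """Collect the distinct entities across facts and events, in order."""
    seen: List[str] = []
    for item in items:
        for entity in item.entities:
            if entity not in seen:
                seen.append(entity)
    return seen


fact = Fact(relation="is in", entities=("London", "England"))
event = Event(change="is falling down", entities=("London Bridge",))

print(entities_mentioned(fact, event))
```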
The distinction between intentions, events and facts is not watertight. There are times when utterances can cross the boundaries and fall into more than one of these categories.
Interestingly, there are different uses for the three kinds of text analysis (analysis of intention, analysis of events, and analysis of fact) and types of data that they may be applied to.
Data Sources
- Event Analysis: News articles, because news reports are always about important happenings or changes in the state of the world, and hence are rich with events and also with facts.
- Fact Analysis: Wikipedia, other encyclopedias and knowledge bases are full of facts, but don’t necessarily report current events. They may contain information on events that took place in another age.
- Intention Analysis: Email messages, customer feedback, social media messages
Enterprise Applications
- Event Analysis: Media Monitoring Tools, Opportunity Identification Tools, Conformance and Discovery Tools
- Fact Analysis: Enterprise Search, Semantic Web, Logic and Inference Engines
- Intention Analysis: CRM Tools, Collaboration Tools, Task Management Tools, Communication Devices
Here is a link to a whitepaper on the topic of doing a 360 degree analysis of text.
Intent on Intentions – Vakintent API
Posted: November 3, 2011 Filed under: Uncategorized | Tags: ai, intent, intention analysis, nlp, research, searle, sentiment analysis, text analytics, winograd
We have been exploring intention analysis for some time now and we are pleased to announce the launch of the first ever commercial API for broad-based intention analysis, called Vakintent.
Here is a demo of the Vakintent Intention Analysis API: Demonstration of VakIntent, the Intention Analysis API from Aiaioo Labs
Definition
Intention Analysis is the identification of intentions from text, be it the intention to purchase or the intention to sell or to complain, accuse or to inquire, in incoming customer messages or in call center transcripts.
Uses
Intention Analysis has already given us some evidence of its usefulness.
In July 2011, we used intention analysis to study the GooglePlus launch. We especially looked at quit intentions to see how frequently people were threatening to quit FB over time and saw how the number dropped sharply once people got to try GooglePlus (once the by-invite-only period ended).
This was a powerful observation, because in just four days, we could tell that GooglePlus couldn’t replace Facebook, at least not yet. Here is the study: http://www.aiaioo.com/cami
Background
The work that intention analysis is based on goes as far back as 1962 when J. L. Austin noted that not all utterances are statements whose truth and falsity are at stake, and that there was a class of utterances like “I pronounce you man and wife” that are actions [taken from Winograd, 1987].
In 1975, Searle identified the following broad categories of illocutionary (causing an action to happen) speech acts [from Winograd, 1987]:
- Assertive – Committing the speaker to the truth of a proposition
- Directive – Attempting to get the listener to do something
- Commissive – Committing the speaker to a course of action
- Declaration – Bringing about something (e.g., pronouncing someone married)
- Expressive – Expressing a psychological state
Interestingly, the expressives include expression of opinion which corresponds to the modern day task of sentiment analysis.
However, utterances have more uses than purely informative uses like “They’re planning to remodel the west wing next summer” or purely expressive uses like expression of sentiment.
In a paper in 1987, titled “A Language/Action Perspective on the Design of Cooperative Work”, Winograd proposes the concept of a “Conversation for Action (CfA)”.
Prior Work
Cognizant Technologies
There was a paper at ACL 2010 titled “Wishful Thinking – Finding suggestions and ‘buy’ wishes from product reviews” http://aclweb.org/anthology/W/W10/W10-0207.pdf by Krishna Bhavsar et al. from Cognizant Technologies.
Lampert and Dale
Another recent attempt to build computer systems capable of analysing intention was made by Robert Dale and Andrew Lampert at Macquarie University. A paper that I’d recommend to you is their work on detecting emails containing requests for action: “Andrew Lampert, Robert Dale and Cécile Paris [2010] Detecting Emails Containing Requests for Action. Pages 984–992 in Proceedings of NAACL 2010, 1st–6th June 2010, Los Angeles, USA”. Our own work leads us to believe that the difficulty of detecting directives is rather higher than for other intentions, so what they’ve done in this project is quite impressive.
WisdomTap
WisdomTap (www.wisdomtap.com) has a very interesting buy intention offering. Their value proposition is “Your Customers announce their intent to buy by asking for product and service recommendations on Twitter. We find customers who need your products and services. We connect you to your customers at the right time.”
Twitchell
Twitchell et al have studied “Using Speech Act Theory to Model Conversations for Automated Classification and Retrieval”.
Carnegie Mellon
CMU has released a speech act corpus through the Jangada and Ciranda projects.
Vakintent Demonstration Consoles
Here are some links to demos:
| Name | Description | URL |
|---|---|---|
| Vakintent Intention Demo | Demonstration of VakIntent, the Intention Analysis API from Aiaioo Labs | |
| Vaksent Sentiment Demo | Demonstration of VakSent, the Sentiment Analysis API from Aiaioo Labs | |

| Case Study | URL |
|---|---|
| Competitive Analysis | http://www.aiaioo.com/cami |
Vakintent API
The Vakintent API offered by Aiaioo Labs can identify 11 intentions, the objects of those intentions and their holders.
Please feel free to write me at cohan@aiaioo.com for more information.
Vaksent API for Sentiment Analysis
Posted: November 3, 2011 Filed under: Uncategorized | Tags: ai, natural language processing, nlp, sentiment analysis, text analytics
Aiaioo Labs has just released an API for fine-grained sentiment analysis.
A demonstration of the Vaksent Sentiment Analysis Engine is available here: http://www.aiaioo.com:8080/annotator-0.1/automation/demoView/1
The key features of the sentiment analysis system are: a) identification of the holder of the opinion (who holds that opinion), and b) identification of the object of the sentiment (what exactly is the sentiment expressed about).
Technology
We use a cascade of algorithms to identify sequentially 1) sentiment-conveying phrases, 2) entities (to identify objects being spoken about), 3) relations (to identify which sentiment applies to which entity) and 4) negations (to identify which relations are negated). This combination makes for a very sophisticated sentence level and entity level analysis of sentiment.
The main goal of this system was to have roughly domain independent behaviour (no imbalance in performance when used on financial data, product data or entertainment). Such a balance is pretty hard to achieve (some measurements suggest that human annotators agree with each other only 79% of the time when attempting to identify the sentiment of sentences/entities in certain types of text).
Evaluation
The accuracies that we measured for different domains are as follows.
Domain of Entertainment:
Accuracy = 0.7103
Precision = {negative=0.7222, positive=0.6997}
Recall = {negative=0.6837, positive=0.737}
F-Score = {negative=0.7027, positive=0.7181}
Tested on a total of 10662 sentences.
This was evaluated using the Bo Pang data set. As you can see the errors are roughly balanced on the positive and the negative side to get what we hope is a fairly unskewed error curve. This allows averaging to work as a strategy to cancel out noise.
Domain of Products:
Accuracy = 0.7266
Precision = {negative=0.5963, positive=0.8462}
Recall = {negative=0.7807, positive=0.6953}
F-Score = {negative=0.6823, positive=0.7671}
Tested on a total of 3731 sentences.
The data set used was the Bing Liu corpora (the first two) covering mostly electronic products. We have roughly the same performance again on products, but the curve is now slightly skewed.
Domain of Finance (evaluation incomplete):
Accuracy = 0.6896
Precision = {negative=0.7037, positive=0.6666}
Recall = {negative=0.7755, positive=0.5789}
F-Score = {negative=0.7387, positive=0.6212}
Tested on a total of 87 sentences.
We have roughly the same performance again on finance, but the evaluation data set is very small. We’re working on performing a more reliable evaluation.
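A quick consistency check on the reported numbers may help when comparing them with other published results. The post does not say which formula produced the F-Score lines; the sketch below simply recomputes both the usual harmonic-mean F1 and the geometric mean of precision and recall from the figures above, so any conclusion drawn from it is an inference from the numbers and not a statement about how the evaluation was actually run. The function names and the tolerance are choices made for this example.

```python
import math

# (precision, recall, reported F-score), copied from the figures above.
REPORTED = {
    ("entertainment", "negative"): (0.7222, 0.6837, 0.7027),
    ("entertainment", "positive"): (0.6997, 0.7370, 0.7181),
    ("products", "negative"): (0.5963, 0.7807, 0.6823),
    ("products", "positive"): (0.8462, 0.6953, 0.7671),
    ("finance", "negative"): (0.7037, 0.7755, 0.7387),
    ("finance", "positive"): (0.6666, 0.5789, 0.6212),
}


def harmonic_mean(p, r):
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def geometric_mean(p, r):
    """Geometric mean of precision and recall."""
    return math.sqrt(p * r)


def compare(tolerance=1e-4):
    """For each row, report whether each formula reproduces the reported value."""
    rows = []
    for key, (p, r, reported) in REPORTED.items():
        harmonic_ok = abs(harmonic_mean(p, r) - reported) <= tolerance
        geometric_ok = abs(geometric_mean(p, r) - reported) <= tolerance
        rows.append((key, harmonic_ok, geometric_ok))
    return rows


if __name__ == "__main__":
    for (domain, label), harmonic_ok, geometric_ok in compare():
        print(domain, label, "harmonic:", harmonic_ok, "geometric:", geometric_ok)
```

With these inputs the geometric mean reproduces every reported value to four decimal places, while the harmonic mean comes out slightly lower, most visibly on the products data where precision and recall differ the most.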
Examples
Here is what Vaksent http://www.aiaioo.com:8080/annotator-0.1/automation/demoView/1 says about two sentences provided as examples:
I {- deny -} that [- it can never [+ be said that this is not [- a {!+ beautiful +!} ( car ) -] +] -] . = [ negative ]
( John ) and not [- ( Bruce ) -] said that this is not [- a {!- bad -!} ( car ) -] . = [ positive ]
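To see why the negation and relation steps matter, here is a deliberately simplified sketch that reproduces the polarity of the two sentences above. It is not the Vaksent cascade: the tiny lexicon, the negator list, the flat treatment of scope (every negator flips all sentiment words that follow it) and the rule that a negator directly before an entity attaches to that entity are all assumptions made for illustration.

```python
# Hypothetical word lists covering only the two example sentences.
LEXICON = {"beautiful": 1, "bad": -1}
NEGATORS = {"not", "never", "deny"}


def sentence_polarity(tokens, entities):
    """Flip the polarity of each sentiment word once per preceding negator."""
    flips = 0
    score = 0
    for index, token in enumerate(tokens):
        word = token.lower()
        if word in NEGATORS:
            following = tokens[index + 1] if index + 1 < len(tokens) else ""
            if following in entities:
                # The negation applies to the entity ("not Bruce"),
                # so it does not touch the sentiment expressed later on.
                continue
            flips += 1
        elif word in LEXICON:
            score += LEXICON[word] * (-1) ** flips
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


first = "I deny that it can never be said that this is not a beautiful car".split()
second = "John and not Bruce said that this is not a bad car".split()

print(sentence_polarity(first, {"car"}))
print(sentence_polarity(second, {"John", "Bruce", "car"}))
```

Without the entity check, the second sentence would count two negations and come out negative, which is the kind of error that a relation step (deciding what each negation applies to) is there to prevent.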
Speech Recognition using PocketSphinx on Win32
Posted: June 9, 2011 Filed under: Uncategorized | Tags: CMU sphinx, natural language processing, nlp, speech recognition
The zeroth thing you need is the Pocketsphinx binaries.
Just download the win32 binaries from the Sphinx website (download pocketsphinx, sphinxbase, sphinxtrain and cmuclmtk from the Sphinx website).
The first thing you need to do is build a language model or a grammar.
The grammar can be something simple in a format called JSGF, and this is the easier way to get a speech recognizer up and running. Alternatively, you can use a language model. The language model can be built using the instructions on the Sphinx site. You can create it starting from a file with sentences like this:
<s> I WANT A NEXTCUBE ZERO FOUR ZERO </s>
<s> I WANT THE NEXTCUBE ZERO FOUR ZERO </s>
<s> I NEED A NEXTCUBE ZERO FOUR ZERO </s>
<s> I NEED THE NEXTCUBE ZERO FOUR ZERO </s>
<s> I AM LOOKING FOR A NEXTCUBE ZERO FOUR ZERO </s>
<s> I AM LOOKING FOR THE NEXTCUBE ZERO FOUR ZERO </s>
<s> I AM SEEKING A NEXTCUBE ZERO FOUR ZERO </s>
<s> I AM SEEKING THE NEXTCUBE ZERO FOUR ZERO </s>
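A sentence file like this follows a simple pattern, so it can be generated instead of typed by hand. The sketch below rebuilds the eight sentences above from their parts; the variable and function names are made up for this example, and writing the result to a file for the language model tools is left to the reader.

```python
from itertools import product

# The building blocks of the sample sentences, all in capitals.
VERB_PHRASES = ["I WANT", "I NEED", "I AM LOOKING FOR", "I AM SEEKING"]
ARTICLES = ["A", "THE"]
PRODUCT_NAME = "NEXTCUBE ZERO FOUR ZERO"


def corpus_lines():
    """Return one '<s> ... </s>' sentence per verb phrase and article."""
    return [
        f"<s> {verb} {article} {PRODUCT_NAME} </s>"
        for verb, article in product(VERB_PHRASES, ARTICLES)
    ]


if __name__ == "__main__":
    print("\n".join(corpus_lines()))
```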
A sample JSGF file would be (modified from the sample on the Sphinx website) … note that I’ve made all the words capitals because the CMU phonetic dictionary has all the words listed in caps (make sure that any language model is all caps as well, except for the sentence boundaries):
#JSGF V1.0;
/**
 * JSGF Grammar for Hello World example
 */
grammar hello;
public <greet> = (GOOD MORNING | HELLO | HI) ( PAUL | RITA | WILL );
The second thing you need is an Acoustic Model
An acoustic model maps sound features from the speech recognizer to phonemes.
Voxforge provides a free acoustic model for Pocketsphinx that you can use.
The third thing you need is a phonetic dictionary
The phonetic dictionary maps the recognized phonemes to actual words in your language. For English, there is a phonetic dictionary available from CMU.
You will just need to download one file: cmudict.0.7a_SPHINX_40
Now, you have all the components you need!
Running Pocketsphinx
With JSGF:
$ pocketsphinx-0.7-win32/pocketsphinx_continuous.exe \
-hmm voxforge-en-r0_1_3/model_parameters/voxforge_en_sphinx.cd_cont_3000 \
-jsgf greet.jsgf \
-dict cmudict.0.7a_SPHINX_40
With a language model:
$ pocketsphinx-0.7-win32/pocketsphinx_continuous.exe \
-hmm voxforge-en-r0_1_3/model_parameters/voxforge_en_sphinx.cd_cont_3000 \
-lm cmuclmtk-0.7-win32/output.lm.DMP \
-dict cmudict.0.7a_SPHINX_40
Any additional phonetic entries in the phonetic dictionary can be created using the CMU dictionary phoneme set.
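Before running the recognizer it can save time to check that every word in the grammar or language model actually appears in the phonetic dictionary, in the same capitalization. The sketch below assumes the dictionary has one entry per line, with the word first and its phonemes after it, and alternate pronunciations marked like `HELLO(2)`; the sample entries and the function names are illustrative only, so check them against your copy of the dictionary file.

```python
def dictionary_words(dict_lines):
    """Collect the words listed in a phonetic dictionary."""
    words = set()
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue
        # Drop an alternate-pronunciation marker such as "(2)".
        words.add(parts[0].split("(")[0])
    return words


def missing_words(sentences, dict_lines):
    """Return words used in the sentences that the dictionary does not list."""
    known = dictionary_words(dict_lines)
    missing = []
    for sentence in sentences:
        for word in sentence.split():
            if word in ("<s>", "</s>"):
                continue  # sentence boundary markers are not dictionary words
            if word not in known and word not in missing:
                missing.append(word)
    return missing


# Illustrative entries only; the real file is much larger.
SAMPLE_DICT = [
    "HELLO HH AH L OW",
    "HELLO(2) HH EH L OW",
    "PAUL P AO L",
]

print(missing_words(["<s> HELLO PAUL </s>", "hello RITA"], SAMPLE_DICT))
```

Any word the check reports would need its own entry in the dictionary, written with the phoneme set mentioned above.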
Education
- Videos on speech recognition
- Lectures on speech recognition
- Voxforge has an article on what an acoustic model is
NLP Workshop
Posted: June 9, 2011 Filed under: Uncategorized | Tags: iiit-h, speech recognition, stt, tts
The IASNLP 2011 workshop turned out to be a good opportunity to learn a little bit about speech research.
(See the article: http://www.aiaioo.com/cms/index.php?id=28)
Here are two of the faculty who work on speech at IIIT-H:
1. Yegnanarayana: http://speech.iiit.ac.in/~yegna (many publications on signal processing, noise cancellation, feature extraction, ANNs).
2. Kishore Prahallad: http://www.iiit.net/people/faculty/kishore (speech synthesis and spoken dialog systems)
IIIT-H also has research on grammar and translation.
Dr. Rajeev Sangal (http://www.iiit.net/~sangal/) works on Dependency Parsing, Transfer Based Machine Translation and Anaphora Resolution.
Robotics Workshop
Posted: June 9, 2011 Filed under: Uncategorized | Tags: actuator, feature, nlp, robot, sensor
On the first day of a three-day workshop, I built a line-follower robot that successfully navigated what the instructor promised was a very difficult course (he said it would be impossible to navigate using a simple on-off algorithm).
The trick I used to complete the course was to run the DC motors on half-voltage and adjust sensor angles so that both always fed the ‘brain’ an excellent set of signals.
I came up with the idea owing to my experience with text analytics. The most critical task in text analysis is feature engineering. With a good set of features, you can get excellent results even if the machine learning algorithm is very simple. Unfortunately, very little work goes into feature engineering and feature combination methods for NLP.
So, I guess my weekend dabbling in robotics taught me an important lesson – no matter how good your machine learning algorithms (the brains of the system) are, they can’t do anything without eyes.
An echo of voices
Posted: November 11, 2010 Filed under: Uncategorized | Tags: australia, australian, australian aboriginal languages, comparative historical linguistics, dravidian languages, dyirbal, india, native australians, pama-nyungan, phonology, tamil
A long time ago, on a different blog, I’d written about the grammatical and semantic similarities between Tamil and Japanese (and Korean).
Recently, I read that Tamil bears a striking resemblance to the aboriginal/native languages of Australia.
What I found was (thanks dad for some valuable assistance) that Tamil has or is thought to have had sound patterns that are considered distinguishing features of the languages of Australia.
Before I list the resemblances, let me give you a quick overview of some characteristics of Australian languages (most of this information has been gleaned from Wikipedia):
Feature 1
Their languages have four to six ‘n’ sounds, and these sounds are associated with places of articulation (where the tongue touches the roof of the mouth). So, in the language called Dyirbal, we have the following consonants (I’ve highlighted the nasal sounds):
| | Bilabial | Alveolar | Alveolo-Palatal | Retroflex | Velar |
|---|---|---|---|---|---|
| Plosive | p | t | c | | k |
| Nasal | m | n | ɲ | | ŋ |
| Trill | | r | | | |
| Flap | | | | ɽ | |
| Approximant (central) | | | j | | w |
| Approximant (lateral) | | l | | | |
In the languages of the Pama-Nyungan family, we have the following consonant sounds (again I’ve highlighted the nasal sounds):
| | Bilabial | Apico-alveolar | Apico-postalveolar | Laminal | Dorso-velar |
|---|---|---|---|---|---|
| Stop | p | t | rt | c, cʸ | k |
| Nasal | m | n | rn | ñ | ng |
| Lateral | | l | rl | λ | |
| Rhotic | | rr | r | | |
| Semivowel | w | | | y | |
Feature 2
Australian languages are characterised by an absence of fricatives (hissing/rubbing sounds like ‘s’, ‘h’ and ‘sh’) as you can see from the tables above.
Feature 3
Australian languages have only three vowel sounds: ‘a’, ‘i’, and ‘u’.
Now, you will notice that the above three characteristics of Australian languages are pretty distinctive: they are unusual, distinguishing features.
You would probably agree that if any other language had the above features, it might be said to resemble Australian languages in how it sounds.
Now, let me list the characteristics of Tamil that I think can help one make the argument that at the phonetic level, Tamil resembles languages spoken by native Australians:
Feature 1 in Tamil
There are six nasal sounds in Tamil:
| | Labial | Dental | Alveolar | Retroflex | Palatal | Velar |
|---|---|---|---|---|---|---|
| Plosives | p (b) | t̪ (d̪) | | ʈ (ɖ) | tʃ (dʒ) | k (ɡ) |
| | ப | த | | ட | ச | க |
| Nasals | m | n̪ | n | ɳ | ɲ | ŋ |
| | ம | ந | ன | ண | ஞ | ங |
This feature is also found in Malayalam but not in the other languages of South India.
Now for those of you who are surprised by the number of nasals, don’t be. English has four nasals. It’s just that the language does not use them to distinguish between different words.
Don’t believe me? Oh well, here goes! The first nasal sound in English is ‘m’. The second is ‘n’ as in ‘bang’. The third is ‘n’ as in ‘hand’. There is a fourth (very rare) nasal. This is the ‘n’ in ‘London’ (when the word is pronounced in a pompous manner, the ‘n’ gets to be more plosive/hard than otherwise).
Ok, I made a mistake. English does distinguish between ‘m’ and ‘n’. Notice how the script gives it away.
Feature 2 in Tamil
The Tamil script does not have letters for ‘h’, ‘s’ and ‘sh’. The lack of the corresponding consonants in the script does evoke suspicions that the sounds were not present and therefore the corresponding characters not needed at the time the early Tamil scripts came into being.
Another interesting observation that supports this hypothesis is that some dialects of Tamil prefer the use of ‘ch’ sounds to the use of the standard Tamil ‘s’ and ‘sh’ sounds. In these dialects, ‘seri’ becomes ‘cheri’, ‘sAppAdu’ becomes ‘chAppAdu’, and ‘sonnAn’ becomes ‘chonnAn’.
Feature 3 in Tamil
Establishing the third feature in Tamil is a bit difficult. Modern Tamil has five simple vowel sounds ‘a’, ‘i’, ‘u’, ‘e’, ‘o’ (taught in that order to kids, just like in Japanese; notice how ‘a’, ‘i’ and ‘u’ come before ‘e’ and ‘o’). However, there is another tentative link.
In a 1960s book, one Dr. T P Meenakshi Sundaram performed a comparative historical linguistic study of Tamil, and he surmised that early forms of Tamil had only three vowel sounds!
According to Dr. Sundaram, those three sounds were … surprise, surprise … ‘a’, ‘i’ and ‘u’! He said that the sound for ‘e’ was originally composed of ‘i’ and ‘a’ sounds.
This I have personally observed. In some rural dialects of Tamil/Malayalam, ‘Enna pEchi pEsurAn’ is still pronounced as ‘Yanna PiAchi PiAsurAn’ (come and talk to my grandma!)
Semantics
All the similarities I have listed are at a purely phonological level.
However, I did look at whole words (nouns and verbs) in Australian languages and they did not resemble corresponding Tamil words at all. But there is another level of similarity – semantic. Semantics is the way word distinctions are used to convey meaning.
One interesting pattern is the use of words to convey distinctions of importance to prevalent kinship systems. Let me explain.
Kinship Terms
The Australian languages of the Western Desert have the following words for parents and uncles and aunts (from a post on the Australian Anthropology forum by someone called Laurent Dousset):
I’ll give you an Australian example (Western Desert):
Mother: ngunytju
Mother’s sister: ngunytju
Mother’s brother: kamuru
Father: mama
Father’s brother: mama
Father’s sister: kurntili
A mama is married to a ngunytju and a kamuru is married to a kurntili. These do not have to be actual kamuru(s) and kurntili(s), but are usually classificatory ones.
Now you will agree that this is very similar to the use of words for parents and their brothers and sisters in Tamil, Malayalam and Kannada.
Now, back to Japanese. I have a test that I wish to perform to help me determine if Australian languages might really be related to Tamil, and I’m going to turn to Japanese for help.
Deictic References
One feature of Japanese that I found incredibly fascinating was the way words were used to refer to distances.
In Japanese, there are three types of distances [I believe these terms are also called deictic references, so I'm going to call them such, though I'm not really sure] and they are (koko – near the speaker, soko – near the listener, and asoko – far from both).
Such deictic references, it turns out, also used to exist in Tamil. Sri Lankan Tamil still uses all three kinds of deictic reference (ivan – he who is near the speaker, uvan – he who is near the listener, and avan – he who is far from both).
You also notice this distinction in the old saying: ‘ikkara ukkara pachcha’ which means ‘from the shore near me, the shore near you looks green’, and you can also argue that you see a bit of it in ‘unnai’ (you-accusative) and ‘avanai’ (him-accusative).
What I would love to do is find someone from Australia who can tell me if these triple deictic references are also features of Australian languages.
Conclusion
Well, I am not going to comment on the interesting question of what this means/implies. These similarities could simply mean nothing. The similarities could have been the result of random language mutations.
But then again, maybe, just maybe, the ancestors of the native inhabitants of Australia stood on these very shores a hundred thousand years ago. And just maybe, as I listen to my grandmother, I am hearing an echo of voices long gone from this world.
Acknowledgements
Thanks to dad for telling me about the work of T P Meenakshi Sundaram. Thanks to mom for helping me with the thoughts on deictic references.
Counterargument
One of my friends wrote to me with excellent counterarguments, so I’m adding them to this post, just so you have a complete picture.
The problem he discovered with my logic is as follows.
My main claim is that the three features (which I’ll refer to as F1, F2 and F3) occurring together is a very very rare event, making their occurring together in two unrelated languages even rarer. However, for this claim to hold, the joint probability p(F1, F2, F3) would have to be very very low.
My friend pointed out that p(F1,F2,F3) need not be a very low number if the features are strongly interdependent, that is, when you see one such feature, you’re bound to see the others as well.
Now my friend also mentioned that F3 is a universal feature – all languages initially started with only three vowels, so if you take any language and drill back in time far enough, you’ll be left with just ‘a’, ‘i’ and ‘u’. This also implies that F3 is independent of F1 and F2 and p(F3) is 1.
Now, because of the independence of F3, p(F1, F2, F3) can be written as p(F2|F1)p(F1)p(F3). Since p(F3) == 1, we can take it out of the picture and think of p(F1, F2, F3) as p(F2|F1)p(F1).
Now my friend pointed out that phonological features occur in clusters. So, a large number of alveolar articulation points in a language would be a good indicator that the language has a paucity of fricatives. So, p(F2|F1) is also close to 1. So we’re left with p(F1, F2, F3) = p(F1). p(F1) is not likely to be low enough to establish beyond reasonable doubt that the two languages are interrelated.
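Written out step by step, the argument of the last few paragraphs is a chain-rule factorisation followed by two simplifications. This is only a restatement of my friend’s reasoning; the values p(F3) = 1 and p(F2|F1) ≈ 1 are his assumptions, not measured quantities.

```latex
\begin{align*}
p(F_1, F_2, F_3)
  &= p(F_3 \mid F_1, F_2)\, p(F_2 \mid F_1)\, p(F_1)
     && \text{(chain rule)} \\
  &= p(F_3)\, p(F_2 \mid F_1)\, p(F_1)
     && \text{($F_3$ independent of $F_1$ and $F_2$)} \\
  &= p(F_2 \mid F_1)\, p(F_1)
     && \text{($p(F_3) = 1$)} \\
  &\approx p(F_1)
     && \text{($p(F_2 \mid F_1) \approx 1$)}
\end{align*}
```

So the whole case rests on how rare F1 is on its own, which is why more features with low probabilities and low interdependence would be needed.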
In order to complete my case, I’d still have to do all of the following:
a) find more such features
b) show that p(F) is low
c) show that the conditional probabilities are low (high feature independence)
Thanks Dr. M___ C___ for pointing this out!
Now the traditional methods of comparative historical linguistics use features of languages called cognates (similar sounding words). In doing so, they are biased in how they assign languages to language families. Using cognates alone, Japanese would be assigned to the same language family as Chinese, but not if we looked at the syntactic, semantic and phonological features of Japanese. So, I feel that the comparative methodology is incomplete and would need to be supplemented by some other features at the semantic/syntactic levels maybe wrapped into some kind of probabilistic framework.
Aiaioo World!
Posted: August 18, 2010 Filed under: Uncategorized | Tags: adjective, aiaioo, aptitude, aptness, hello, interjection, part-of-speech
The default first post on WordPress goes ‘Hello world!’ I thought I’d keep to that general theme, but include the name of the lab in place of ‘Hello’, so it became ‘Aiaioo World!’
This is an introduction to the Lab, and the funky name, so it just seemed very apt. Another reason for this seeming aptness (I was about to say ‘aptitude’) was that both Hello and Aiaioo are interjections. At a broad level, words belonging to the same ‘Part of Speech’ can be interchanged (or used instead of each other) in semantically meaningful but linguistically similar situations. However, the devil is in the semantics. For example, both ‘aptness’ and ‘aptitude’ are nouns, but in the second sentence of this paragraph, aptness is more apt than aptitude!
So, back to Aiaioo! Aiaioo is an interjection whose use seems to run across Asia, from China to India. Actually, I should take that back. I do not know if the word is used in the Koreas, Thailand, Vietnam, Cambodia or any other country in Asia other than China and India. So, what I have essentially done is generalized from a sample of two. That’s a bad thing to do in pure statistical terms, because I have not smoothed my estimators and accounted for the sparseness of my data over the sample space.
Another thing you might have noticed is that the name of the firm is made up entirely of vowels! Now that’s indeed very surprising! There are not a lot of words (in English) that are that way. How many vowels have I used? In English I would count six. In any of the languages of India, I would, most probably, count no more than three!