
Deep Bayesian Learning for NLP

Deep learning is usually associated with neural networks.

In this article, we show that generative classifiers are also capable of deep learning.

What is deep learning?

Deep learning is a method of machine learning involving the use of multiple processing layers to learn non-linear functions or boundaries.

What are generative classifiers?

Generative classifiers use Bayes’ rule to invert the probabilities of the features F given a class c into a prediction of the class c given the features F.

The class predicted by the classifier is the one yielding the highest P(c|F).

A commonly used generative classifier is the Naive Bayes classifier.  It has two layers (one for the features F and one for the classes C).

Deep learning using generative classifiers

The first thing you need for deep learning is a hidden layer.  So you add one more layer H between the C and F layers to get a Hierarchical Bayesian classifier (HBC).

Now, you can compute P(c|F) in an HBC in two ways:

Equation 1: Computing P(c|F) using a Product of Sums

Equation 2: Computing P(c|F) using a Sum of Products

The first equation computes P(c|F) using a product of sums (POS).  The second equation computes P(c|F) using a sum of products (SOP).
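
The two forms can be written out roughly as follows. This is a sketch under assumptions that the text does not spell out: a single hidden layer H whose nodes are indexed by h, features f in F that are conditionally independent given h, and a class prior P(c).

```latex
% Product of sums (POS): the sum over hidden nodes sits inside the product over features
P(c \mid F) \;\propto\; P(c) \prod_{f \in F} \sum_{h \in H} P(f \mid h)\, P(h \mid c)

% Sum of products (SOP): the product over features sits inside the sum over hidden nodes
P(c \mid F) \;\propto\; P(c) \sum_{h \in H} P(h \mid c) \prod_{f \in F} P(f \mid h)
```

The only difference between the two is the order of the sum and the product.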

POS Equation

We discovered something very interesting about these two equations.

It turns out that if you use the first equation, the HBC reduces to a Naive Bayes classifier. Such an HBC can only learn linear (or quadratic) decision boundaries.
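
One way to see the reduction, assuming hidden nodes indexed by h and a feature f that depends on the class c only through h: the inner sum of the product-of-sums form is a marginalization over h, so each factor collapses to an ordinary class-conditional probability.

```latex
\sum_{h \in H} P(f \mid h)\, P(h \mid c) = P(f \mid c)
\quad\Longrightarrow\quad
P(c) \prod_{f \in F} \sum_{h \in H} P(f \mid h)\, P(h \mid c)
  = P(c) \prod_{f \in F} P(f \mid c)
```

The right-hand side is the Naive Bayes score, so in this form the hidden layer adds no expressive power.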

Consider the discrete XOR-like function shown in Figure 1.


There is no way to separate the black dots from the white dots using one straight line.

Such a pattern can only be classified 100% correctly by a non-linear classifier.

If you train a multinomial Naive Bayes classifier on the data in Figure 1, you get the decision boundary seen in Figure 2a.

Note that the dotted area represents the class 1 and the clear area represents the class 0.

Figure 2a: The decision boundary of a multinomial NB classifier (or a POS HBC).

It can be seen that no matter what the angle of the line is, at least one point of the four will be misclassified.

In this instance, it is the point at {5, 1} that is misclassified as 0 (since the clear area represents the class 0).

You get the same result if you use a POS HBC.

SOP Equation

Our research showed us that something amazing happens if you use the second equation.

With the “sum of products” equation, the HBC becomes capable of deep learning.

SOP + Multinomial Distribution

The decision boundary learnt by a multinomial non-linear HBC (one that computes the posterior using a sum of products of the hidden-node conditional feature probabilities) is shown in Figure 2b.

Figure 2b: Decision boundary learnt by a multinomial SOP HBC.

The boundary consists of two straight lines passing through the origin. They are angled in such a way that they separate the data points into the two required categories.

All four points are classified correctly since the points at {1, 1} and {5, 5} fall in the clear conical region which represents a classification of 0 whereas the other two points fall in the dotted region representing class 1.

Therefore, the multinomial non-linear hierarchical Bayes classifier can learn the non-linear function of Figure 1.
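
The effect can be reproduced with a small sketch. Everything below is an assumption made for illustration: the four points are taken to be {1, 1} and {5, 5} for class 0 and {1, 5} and {5, 1} for class 1, the coordinates are treated as feature counts, and the hidden-node parameters are picked by hand rather than learnt (they are not the parameters from our paper). The multinomial coefficient is left out because it is the same for both classes.

```python
import math

# Hypothetical, hand-picked parameters (not learnt from data).
# Each class owns a list of hidden nodes: (P(h|c), [P(f1|h), P(f2|h)]).
HIDDEN = {
    0: [(1.0, [0.5, 0.5])],
    1: [(0.5, [0.9, 0.1]), (0.5, [0.1, 0.9])],
}
PRIOR = {0: 0.5, 1: 0.5}


def sop_score(c, x):
    """Sum of products: P(c) * sum_h P(h|c) * prod_f P(f|h)^x_f."""
    return PRIOR[c] * sum(
        p_h * math.prod(p ** n for p, n in zip(p_f, x))
        for p_h, p_f in HIDDEN[c]
    )


def pos_score(c, x):
    """Product of sums: P(c) * prod_f (sum_h P(h|c) * P(f|h))^x_f."""
    mixed = [sum(p_h * p_f[i] for p_h, p_f in HIDDEN[c]) for i in range(len(x))]
    return PRIOR[c] * math.prod(p ** n for p, n in zip(mixed, x))


def classify(score, x):
    """Return the class with the highest score for the count vector x."""
    return max(PRIOR, key=lambda c: score(c, x))


# Assumed layout of the four points: (f1 count, f2 count) -> true class.
POINTS = {(1, 1): 0, (5, 5): 0, (1, 5): 1, (5, 1): 1}

for x, label in POINTS.items():
    print(x, "true:", label, "SOP:", classify(sop_score, x),
          "POS scores:", pos_score(0, x), pos_score(1, x))
```

With these parameters the SOP score picks the right class for all four points. The POS scores come out equal for both classes at every point, because mixing the two hidden nodes inside the product averages them back to (0.5, 0.5), which is what a single Naive Bayes layer would hold.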

Gaussian Distribution

The decision boundary learnt by a Gaussian nonlinear HBC is shown in Figure 2c.

Figure 2c: Decision boundary learnt by a SOP HBC based on the Gaussian probability distribution.

The boundary consists of two quadratic curves separating the data points into the required categories.

Therefore, the Gaussian non-linear HBC can also learn the non-linear function depicted in Figure 1.
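
A Gaussian version can be sketched in the same spirit. The parameters below are hypothetical and hand-picked, not learnt: each class gets two equally weighted hidden nodes, each an isotropic Gaussian centred on one of the four points (assumed to be {1, 1} and {5, 5} for class 0, {1, 5} and {5, 1} for class 1), with equal class priors.

```python
import math

# Hypothetical, hand-picked parameters (not learnt from data).
SIGMA = 1.0
HIDDEN_MEANS = {
    0: [(1.0, 1.0), (5.0, 5.0)],
    1: [(1.0, 5.0), (5.0, 1.0)],
}


def gaussian(x, mean, sigma=SIGMA):
    """Density of an isotropic 2-D Gaussian at x.

    Because the covariance is diagonal, this density is itself a product
    of one 1-D Gaussian per feature.
    """
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-sq_dist / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def sop_score(c, x):
    """Sum over hidden nodes of P(h|c) * p(x|h), with uniform P(h|c)."""
    means = HIDDEN_MEANS[c]
    return sum(gaussian(x, m) / len(means) for m in means)


def classify(x):
    """Return the class with the highest score (equal class priors)."""
    return max(HIDDEN_MEANS, key=lambda c: sop_score(c, x))


for point in [(1, 1), (5, 5), (1, 5), (5, 1)]:
    print(point, "->", classify(point))
```

The shape of the boundary depends on the parameters. These were chosen only to show that a sum over Gaussian hidden nodes can separate the four points.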


Since SOP HBCs are multilayered (with a layer of hidden nodes) and can learn non-linear decision boundaries, they can be said to be capable of deep learning.

Applications to NLP

It turns out that the multinomial SOP HBC can outperform a number of linear classifiers at certain tasks.  For more information, read our paper.



Direct Democracy and Implications for Research

Direct democracy can be broadly interpreted to mean that the allocation of common resources is controlled by the people who pooled them.

One common resource is tax money.

In most countries, those who pay taxes only have a say in whom they can elect to power.

Those who pay taxes rarely have a say in how the tax money is spent.

There is a middleman (someone who works in government) who decides how the tax money is spent.

The problem with having a middleman decide the allocation of common resources is that the resources could end up being allocated very inefficiently due to man-in-the-middle corruption.

Here is an article about man-in-the-middle corruption:

The way out is to let the people who contributed to the common pool decide on how the resources are allocated.

Control Mechanisms

Tool 1: Apportioning

One way to do this is to embed direct democracy mechanisms into the contribution mechanism.

For example, tax-payers could be given the ability to tie a portion of their tax contribution to expenditure categories.

They could be given the right to apportion, out of every $100 that they have paid in taxes, a certain amount to each of the following major categories: education, healthcare, social security, infrastructure and defence (leaving a certain percentage to the finance minister’s discretion).
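
The arithmetic of apportioning can be sketched as follows. The function name, the input format and the "discretionary" label for the finance minister’s share are all assumptions made for illustration.

```python
# Categories from the text, plus a hypothetical "discretionary" bucket
# for the share left to the finance minister.
CATEGORIES = ["education", "healthcare", "social security",
              "infrastructure", "defence", "discretionary"]


def pooled_budget(returns):
    """Add up every taxpayer's apportioned contribution per category.

    returns: list of (tax_paid, {category: percent}) pairs, where each
    taxpayer's percentages must add up to 100.
    """
    totals = {category: 0.0 for category in CATEGORIES}
    for tax_paid, shares in returns:
        if abs(sum(shares.values()) - 100) > 1e-9:
            raise ValueError("shares must add up to 100")
        for category, percent in shares.items():
            totals[category] += tax_paid * percent / 100
    return totals


# Made-up example: two taxpayers with different priorities.
example = [
    (1000, {"education": 50, "healthcare": 30, "discretionary": 20}),
    (3000, {"defence": 40, "infrastructure": 40, "discretionary": 20}),
]
print(pooled_budget(example))
```

Each category ends up with the sum of what taxpayers tied to it, so the budget reflects contributions weighted by the amount of tax paid.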

It could also be left to the tax-payers to specify how much money the government may borrow on their behalf.

Such direct control might well have prevented the debt crises of Greece and Ireland (it would help most in countries where people are averse to taking on debt), and might also have given people in the USA some control over their government’s borrowings.

Tool 2: Referendum

Another mechanism is the referendum.  It is already being used in all democratic countries, but mainly for the selection of the middleman.

Fortunately, things seem to be moving well beyond that stage.  In India, baby steps are being taken towards bringing about a direct democratic model of government.

A new political party came to power in Delhi on an anti-corruption platform.  The first thing they did was conduct an informal referendum to ask the people of Delhi if they should form a minority government.

So, referendums are one of the mechanisms of direct democracy.

This mechanism can also be used to prevent or reduce man-in-the-middle corruption.

Take the example of a road that needs to be surfaced.  Normally, a government official would have issued contracts based on the bribes paid to him by the contending contractors (rendering a selection on the basis of quality very unlikely).

If the people living on the street that needed to be surfaced had instead been given all the relevant information and asked to select the best contractor for themselves, the middleman would have been eliminated and the quality driven up rather than down.

Now, I am going to talk about some problems that I think affect research funding in the USA (and other countries with government-funded research).  Much academic research funding in the USA comes from government bodies like the NSF, the ONR and DARPA.

Now the following is only my personal opinion, but I think that research efforts in some fields in those countries might be distorted to some extent by the needs of these funding bodies.

Well, I can only speak for the research areas in which Aiaioo Labs is active.  We focus on a narrow research space – predominantly on text analytics and natural language processing.

In this space, I see a lot of low-hanging fruit that nobody in the USA or Europe ever picks.  And that’s quite inexplicable as these are often problems with very obvious applications to the software products space.  And yet I see no papers from California on them.

There are topics on which all the papers I see are from India (often by students who don’t even publish them in international conferences) and sometimes by researchers from Singapore – completely ignored by the main research community.

At other times, I’ve noticed areas of research that DARPA had spent much money on in the 1970s and that researchers had pursued very enthusiastically in that decade.  I’ve seen those lines of research being abandoned in the 90s (possibly once the funding priorities changed) and not being revived again, though product firms are working on those technologies again in California in 2013.

I find it hard to explain why these areas of research are being ignored, except by the remote possibility that they have passed under the radar of the people at the Office of Naval Research, which would make it unlikely that grants will be provided for them.

That is again, if my conjecture is right, a man-in-the-middle problem.  The agenda for research is possibly not being driven by the research community or by the market (the needs of start-ups in California) but by people guessing at what sort of proposals might get funded (and that might be encouraging people to stay with what government knows).

So, I shall propose another direct democracy tool to solve this problem as well:

Tool 3:  Suggestions + Referendum

Here, each of the participants (researchers) bidding for the grants would put in suggestions about what the next important thing to focus on as a research community might be.  Then they could all vote on the suggestions.  The allocation of research grants could then be guided by the suggestions and the votes received by each suggestion.
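
One simple way the votes could guide the allocation is a proportional split. This is only an illustrative assumption: the mechanism above does not fix a rule, and the function and topic names are hypothetical.

```python
def allocate_grants(votes, budget):
    """Split a grant budget across suggestions in proportion to votes.

    votes: {suggestion: number of votes received}
    budget: total amount available for grants
    """
    total_votes = sum(votes.values())
    if total_votes == 0:
        return {suggestion: 0.0 for suggestion in votes}
    return {
        suggestion: budget * count / total_votes
        for suggestion, count in votes.items()
    }


# Made-up example: three suggestions and 50 votes in total.
example_votes = {"topic A": 30, "topic B": 15, "topic C": 5}
print(allocate_grants(example_votes, 1_000_000))
```

Other rules are possible, such as funding only the top few suggestions or reserving a share for proposals outside the list.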

Controls as Rights

In a sense, you can think of these three control mechanisms as three rights that people who contribute toward a common pool of resources will have in a direct democracy:

1)  The right to apportion

2)  The right to be consulted

3)  The right to suggest

There is a nice article on Wikipedia on direct democracy.  The article talks of two of the control mechanisms proposed in this article – referendum and initiative (which corresponds somewhat to suggestions) – and proposes one that I hadn’t mentioned – the right to recall.  It doesn’t talk about apportioning.

Here is an interesting video of a Mohalla Sabha (it’s an interesting participation mechanism that a political organization is experimenting with in Delhi).

Studies on how wealth might engender unseemly behaviour

A friend shared a study by Berkeley’s psychology department on how wealth or a feeling of being wealthy can make people exhibit less empathetic behaviour.

The video explains the research in a very accessible and easy-to-understand manner.

In their paper, the researchers say:

“We reason that increased resources and independence from others cause people to prioritize self-interest over others’ welfare and perceive greed as positive and beneficial, which in turn gives rise to increased unethical behavior”

I suppose that’s a pretty good explanation of the behaviour.

If you’re wealthy and don’t think you will need another person’s help some day, you don’t need to be very helpful to people.

On the other hand, if you’re not wealthy and feel insecure about your own future, you might feel compelled to try and be nice to people around you since you might need their help one day.

Here is the full paper:

I found a similar conclusion at the end of a related study by researchers at the University of Minnesota (http://citeseerx.ist.psu.edu/viewdoc/download?doi=), which a friend of mine shared with me:

“The self-sufficient pattern helps explain why people view money as both the greatest good and evil. As countries and cultures developed, money may have allowed people to acquire goods and services that enabled the pursuit of cherished goals, which in turn diminished reliance on friends and family. In this way, money enhanced individualism but diminished communal motivations, an effect that is still apparent in people’s responses to money today.”

Prior Work on Intentions

We have been exploring intention analysis for some time now and we are pleased to announce the launch of the first ever commercial API for broad-based intention analysis, called Vakintent.

Here is a demo of the Vakintent Intention Analysis API:  Demonstration of VakIntent, the Intention Analysis API from Aiaioo Labs


Intention Analysis is the identification of intentions expressed in text, such as the intention to purchase, sell, complain, accuse or inquire, in sources such as incoming customer messages or call center transcripts.
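
To make the task concrete, here is a toy sketch that tags a message with intentions using hand-written cue phrases. It is not how Vakintent works. The pattern list and function name are hypothetical, and a real system would need far richer models than keyword matching.

```python
import re

# Hypothetical cue phrases, chosen only for illustration.
INTENT_PATTERNS = {
    "purchase": re.compile(r"\b(want to buy|looking to buy|where can i buy)\b"),
    "sell": re.compile(r"\b(for sale|want to sell|selling my)\b"),
    "complain": re.compile(r"\b(not working|disappointed|terrible)\b"),
    "inquire": re.compile(r"\b(how do i|can you tell me|what is)\b"),
}


def detect_intentions(text):
    """Return the names of all intentions whose cue phrases occur in text."""
    lowered = text.lower()
    return [name for name, pattern in INTENT_PATTERNS.items()
            if pattern.search(lowered)]


print(detect_intentions("I want to buy a new phone"))
print(detect_intentions("My router is not working, how do I reset it?"))
```

A single message can carry more than one intention, which is why the function returns a list rather than a single label.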


Intention Analysis has already given us some evidence of its usefulness.

In July 2011, we used intention analysis to study the GooglePlus launch.  We looked in particular at quit intentions to see how frequently people were threatening to quit Facebook over time, and saw the number drop sharply once people got to try GooglePlus (once the by-invite-only period ended).
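
Tracking an intention over time comes down to counting matching messages per day. The sketch below assumes messages arrive as (day, text) pairs and uses a crude, hypothetical cue list in place of a real intention classifier. The sample messages are made up and are not data from the study.

```python
import re
from collections import Counter

# Crude stand-in for a quit-intention classifier (hypothetical cue list).
QUIT_CUE = re.compile(r"\b(quit|quitting|leave|leaving|delete my account)\b",
                      re.IGNORECASE)


def quit_mentions_per_day(messages, target="facebook"):
    """Count, per day, the messages that mention target alongside a quit cue.

    messages: iterable of (day, text) pairs
    """
    counts = Counter()
    for day, text in messages:
        if target in text.lower() and QUIT_CUE.search(text):
            counts[day] += 1
    return dict(counts)


# Made-up sample messages, for illustration only.
sample = [
    ("day-1", "I'm quitting Facebook for Google+"),
    ("day-1", "Time to leave Facebook"),
    ("day-2", "Facebook photos are fine"),
    ("day-3", "Thinking of leaving facebook"),
]
print(quit_mentions_per_day(sample))
```

Plotting these daily counts is what makes a sharp drop visible.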

This was a powerful observation, because in just four days, we could tell that GooglePlus couldn’t replace Facebook, at least not yet. Here is the study: http://www.aiaioo.com/cami


The work that intention analysis is based on goes as far back as 1962 when J. L. Austin noted that not all utterances are statements whose truth and falsity are at stake, and that there was a class of utterances like “I pronounce you husband and wife” that are actions [taken from Winograd, 1987].

(I recently found the Winograd paper on his website: http://hci.stanford.edu/winograd/papers/language-action.html)

In 1975, Searle identified the following broad categories of illocutionary (causing an action to happen) speech acts [from Winograd, 1987]:

  • Assertive – Committing the speaker to the truth of a proposition
  • Directive – Attempting to get the listener to do something
  • Commissive – Committing the speaker to a course of action
  • Declaration – Bringing about something (e.g., pronouncing someone married)
  • Expressive – Expressing a psychological state

Interestingly, the expressives include expression of opinion which corresponds to the modern day task of sentiment analysis.

Prior Work

Cognizant Technologies

There was a paper at ACL 2010 titled “Wishful Thinking – Finding suggestions and ‘buy’ wishes from product reviews” (http://aclweb.org/anthology/W/W10/W10-0207.pdf) by Krishna Bhavsar et al. from Cognizant Technologies.

Lampert and Dale

Another recent attempt to build computer systems capable of analysing intention was made by Robert Dale and Andrew Lampert at Macquarie University. A paper that I’d recommend to you is their work on detecting emails containing requests for action: “Andrew Lampert, Robert Dale and Cécile Paris [2010] Detecting Emails Containing Requests for Action. Pages 984–992 in Proceedings of NAACL 2010, 1st–6th June 2010, Los Angeles, USA”. Our own work leads us to believe that the difficulty of detecting directives is rather higher than for other intentions, so what they’ve done in this project is quite impressive.


WisdomTap

WisdomTap (www.wisdomtap.com) has a very interesting buy intention offering. Their value proposition is “Your Customers announce their intent to buy by asking for product and service recommendations on Twitter.  We find customers who need your products and services.  We connect you to your customers at the right time.”


Twitchell et al have studied “Using Speech Act Theory to Model Conversations for Automated Classification and Retrieval”.

Carnegie Mellon

CMU has released a speech act corpus through the Jangada and Ciranda projects.

Vakintent Demonstration Consoles

Here are some links to demos:

Name | Description | URL
Vakintent Intention Demo | Demonstration of VakIntent, the Intention Analysis API from Aiaioo Labs |
Vaksent Sentiment Demo | Demonstration of VakSent, the Sentiment Analysis API from Aiaioo Labs |

Case Study | URL
Competitive Analysis | http://www.aiaioo.com/cami

Vakintent API

The Vakintent API offered by Aiaioo Labs can identify 11 intentions, the objects of those intentions and their holders.

Please feel free to write to me at cohan@aiaioo.com for more information.