Month: September 2015

Why Google is being investigated for rigging search results

I read in an article a few days ago that Google is being investigated by the Competition Commission of India on suspicion of rigging search results.

One of the complainants was none other than Flipkart, which is, I believe, one of the largest e-commerce companies in India.

Flipkart seems to have complained that ‘it found search results to have a direct correlation with the amount of money it spent on advertising with Google through Google’s Adwords program’ (the quote is from the news article; I haven’t seen the actual complaint).

If Flipkart’s observations are indeed true, and if Flipkart can establish beyond a shadow of doubt that there is a correlation between advertising expenditure and search ranking (for a random allocation of advertising spend; we will see later why this is important), then it could have serious implications for digital marketers.

If search rankings really do correlate with advertising spend, a firm would be better off spending no money at all on advertising with Google than spending only a small amount. In other words, it affects the choice between SEO and SEM for digital marketing.

SEO vs SEM

SEO and SEM are two strategies available to firms to bring their offerings to the notice of customers seeking information using search engines.

SEO (Search Engine Optimization) involves optimizing the text and links of web pages so that they rank higher in search results.

SEM (Search Engine Marketing) is the term used to describe the strategy of paying search engines (like Google) to display ads about a firm’s offerings alongside search results.

If there is a causal relation between advertising spend and ranking, then a common strategy used by many Indian firms – that of spending a little on SEM in addition to SEO – might hurt rather than help them.

Is Flipkart’s complaint valid?

I don’t have a detailed study and can’t provide incontrovertible evidence for the validity or invalidity of Flipkart’s complaint.

However, I have some anecdotal evidence that suggests that Flipkart’s complaint might hold some water.

And I am going to present the evidence to you in the form of a search experiment that you can all perform yourselves.

Search Experiment

Here is how it goes.

There is a book called “Taming Text” that is a practical introduction to a text search platform called ‘Solr’.

When I search for the string “taming text” (from India, using a browser in which I am not signed in to Google), I get the following results:

Results for ‘taming text’ page 1 (above the fold)

As you can see, Flipkart does not appear anywhere (though it is the largest online book retailer in India).

If you scroll down to the bottom and look ‘below the fold’, you see the following:

Results for ‘taming text’ page 1 (below the fold)

The Flipkart product page does not show up here either.  But you see an Amazon India advertisement right at the bottom.

Let’s look at the second page.

Results for ‘taming text’ page 2 (above the fold)

Again, no luck.  You get a link to Google books, but no Flipkart.

(We looked through 10 pages of results but found no link to Flipkart.  Did you?)

So, we try a different search.

We enter “taming text flipkart” into the search engine.

And here are the results!

Results for ‘taming text flipkart’ page 1 (above the fold)

This time, the Flipkart page shows up right at the top!

In addition, a Flipkart advertisement shows up right above it.

So it appears that Flipkart had bid on the ‘flipkart’ keyword or on ‘taming text’ or on some combination of ‘flipkart’ and ‘taming text’.

When we scroll to the bottom of this page of results, we see:

Results for ‘taming text flipkart’ page 1 (below the fold)

Amazon, it appears, had also bid for either ‘flipkart’ or ‘taming text’.

Moreover, we see that Google has obviously indexed the Flipkart product page for ‘Taming Text’.

Then why did the Flipkart page rank so much lower than the Amazon India page for the query ‘taming text’?

Evil or Innocent

Could the difference in rankings be caused by the differences in advertisement expenditure on different keywords by different vendors?

If so, it would be evil.

Flipkart seems to think so, as evidenced by their complaint.

But, as it turns out, that does not have to be the case.

It is possible for the search rankings to be correlated with advertising spend without the latter causing the former (correlation without causation) in the following manner.

If Flipkart’s own SEM algorithms had bid higher on keywords that described their products better, that could in and of itself have resulted in a correlation between search ranking and ad-spend.

You probably saw a case of that in the example above.

The term ‘taming text flipkart’ would certainly have matched the link to the Flipkart product better than the link to the Amazon product.  This is because of the appearance of the word ‘flipkart’ in the Flipkart URL (the word ‘flipkart’ would not have appeared in the Amazon URL).

So, the fact that more words in the search string were matched would have caused the Flipkart URL to be ranked higher.  If Flipkart had bid on the keyword ‘flipkart’ but not on ‘taming text’, it would appear as if the rankings were correlated with the advertising spend.  But one (the expenditure) would not have caused the other (the rankings).
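To make the term-matching argument concrete, here is a toy sketch in Python. It is emphatically not how Google ranks pages; it only counts how many distinct query words appear in a URL. The two URLs and the function names are made up for illustration.

```python
import re

def term_overlap(query: str, document: str) -> int:
    """Count how many distinct query terms also appear in the document text."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    doc_terms = set(re.findall(r"[a-z0-9]+", document.lower()))
    return len(query_terms & doc_terms)

# Hypothetical product URLs (assumptions, not the real pages).
pages = {
    "flipkart": "http://www.flipkart.example/taming-text/p/itm123",
    "amazon": "http://www.amazon.example/Taming-Text-Grant-Ingersoll/dp/000",
}

def rank(query: str) -> list:
    """Order the page names by descending term overlap with the query."""
    return sorted(pages, key=lambda name: term_overlap(query, pages[name]), reverse=True)

for query in ("taming text", "taming text flipkart"):
    scores = {name: term_overlap(query, url) for name, url in pages.items()}
    print(query, scores, rank(query))
```

With the query ‘taming text’ both toy URLs match the same number of words, so term matching alone cannot separate them. Adding ‘flipkart’ to the query gives the Flipkart URL one extra match, and no advertising spend is involved anywhere in the calculation.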

Similarly, for the string ‘taming text’, the Flipkart product URL could have ranked lower than the Amazon URL merely because there are fewer buyers for the book in India than in the USA, where Flipkart does not operate. The larger volume of activity around Amazon could have resulted in Google’s machine learning algorithms associating the name of this book with the keyword ‘Amazon’.

Thus, there could have been a correlation of ad spend and ranking without causation.

In other words, the perceived correlation could be on account of external factors that affect both variables. The only way to eliminate those external factors would be to allocate advertising spend randomly and then see if there was still a correlation. If a correlation could still be established, there would be a strong case for saying that the relation was one of causation and not mere correlation.
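The argument for random allocation can be illustrated with a small simulation. This is only a sketch under assumptions of my own: a hidden ‘relevance’ factor drives both the bid and the ranking score, the ranking score never looks at spend, and all the numbers (noise levels, sample size) are arbitrary.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(randomize_spend: bool, n: int = 5000, seed: int = 0) -> float:
    """Return the correlation between ad spend and ranking score."""
    rng = random.Random(seed)
    spends, scores = [], []
    for _ in range(n):
        # Hidden factor: how well the keyword describes the product.
        relevance = rng.random()
        if randomize_spend:
            # Spend is allocated at random, independent of relevance.
            spend = rng.random()
        else:
            # The advertiser bids more on keywords that fit its products.
            spend = relevance + rng.gauss(0, 0.1)
        # The ranking score depends only on relevance, never on spend.
        score = relevance + rng.gauss(0, 0.1)
        spends.append(spend)
        scores.append(score)
    return pearson(spends, scores)

print("bids follow relevance:", round(simulate(randomize_spend=False), 3))
print("bids randomized:      ", round(simulate(randomize_spend=True), 3))
```

In this toy world, spend and ranking are strongly correlated when bids follow relevance, even though spend has no effect on ranking by construction. Once spend is randomized, the correlation falls to roughly zero. If it did not, spend itself would have to be influencing the ranking.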

But even without any evil intention (no causation), this ranking pattern is still unfair to Flipkart, albeit unintentionally.

In other words, Amazon’s search rankings (if my theory as to why it ranked higher is right) might have received a boost from buyer behaviour in a geography where its competitor Flipkart does not operate.

So, Flipkart’s complaint of unfairness might indeed be warranted.

Bias Prevention

In any case, this illustrates the importance of a new area of study – the study of bias in algorithms.

The article linked to above says:

Venkatasubramanian’s research revealed that you can use a test to determine if the algorithm in question is possibly biased. If the test—which ironically uses another machine-learning algorithm—can accurately predict a person’s race or gender based on the data being analyzed, even though race or gender is hidden from the data, then there is a potential problem for bias based on the definition of disparate impact.
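The idea behind the test quoted above can be sketched roughly as follows. This is not the researchers’ actual procedure, only a toy version on synthetic data: a hidden group label is withheld, a very simple threshold classifier tries to recover it from one remaining feature, and a balanced accuracy well above 0.5 suggests that the feature leaks the hidden attribute. The `leak` parameter and all function names are my own assumptions.

```python
import random

def make_data(leak: float, n: int = 2000, seed: int = 1):
    """Synthetic rows of (feature, hidden_group); the feature shifts with group if leak > 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.randint(0, 1)               # hidden protected attribute
        feature = rng.gauss(leak * group, 1.0)  # visible feature
        rows.append((feature, group))
    return rows

def fit_threshold(train):
    """Place a threshold midway between the two group means of the feature."""
    mean0 = sum(f for f, g in train if g == 0) / sum(1 for _, g in train if g == 0)
    mean1 = sum(f for f, g in train if g == 1) / sum(1 for _, g in train if g == 1)
    return (mean0 + mean1) / 2, mean1 >= mean0

def balanced_accuracy(test, threshold, higher_is_one):
    """Average of the per-group recall of the threshold classifier."""
    correct, total = {0: 0, 1: 0}, {0: 0, 1: 0}
    for feature, group in test:
        predicted = int((feature > threshold) == higher_is_one)
        total[group] += 1
        correct[group] += int(predicted == group)
    return 0.5 * (correct[0] / total[0] + correct[1] / total[1])

def leak_score(leak: float) -> float:
    """Train on the first half of the data, score on the second half."""
    rows = make_data(leak)
    half = len(rows) // 2
    threshold, higher_is_one = fit_threshold(rows[:half])
    return balanced_accuracy(rows[half:], threshold, higher_is_one)

print("feature independent of group:", round(leak_score(0.0), 3))
print("feature shifted by group:    ", round(leak_score(2.0), 3))
```

When the feature is independent of the group, the classifier does no better than a coin toss. When the feature shifts with the group, the hidden attribute can be recovered from it, which is the warning sign the quoted test looks for.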

Search bias was also described as a cause for concern by Brin and Page in their paper on Google written during their days at Stanford:

http://infolab.stanford.edu/~backrub/google.html

The paper says, and I quote none other than Sergey Brin and Lawrence Page:

The goals of the advertising business model do not always correspond to providing quality search to users.  … we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers.

Since it is very difficult even for experts to evaluate search engines, search engine bias is particularly insidious. A good example was OpenText, which was reported to be selling companies the right to be listed at the top of the search results for particular queries.  This type of bias is much more insidious than advertising, because it is not clear who “deserves” to be there, and who is willing to pay money to be listed. This business model resulted in an uproar, and OpenText has ceased to be a viable search engine. But less blatant bias are likely to be tolerated by the market. For example, a search engine could add a small factor to search results from “friendly” companies, and subtract a factor from results from competitors. This type of bias is very difficult to detect but could still have a significant effect on the market.

Other interesting articles on the subject of search result bias:

  1. http://www.politico.com/magazine/story/2015/08/how-google-could-rig-the-2016-election-121548_full.html
  2. http://www.politico.com/magazine/story/2015/08/google-2016-election-121766
  3. http://www.theguardian.com/technology/2014/may/15/google-did-not-rig-indian-elections