Tuesday, June 22, 2010

Predict the Future with Technology

Predict the Future
Recorded Future: Google's Newest Tech Acquisition. 
 Acquired June 10, 2010
Today I read an interesting article about predicting the future using social media analytics.

The premise: social media such as Twitter and Facebook do one particular thing very well - they aggregate personal and social information such as moods, emotions, trends and events. This wealth of information could be distilled into a semantic ball of words and phrases, and the volume of their appearance could then be correlated with a future event.

"Johan Bollen and his colleagues at Indiana University in Bloomington have created an anxiety rating based on an analysis of hundreds of millions of tweets by people in the US. Their paper has not yet been published, but Bollen says they too found that increases in anxiety on their scale correlated with lower than expected stock prices."
"We're astounded",  he says

 "We didn't think it would be a predictive relationship."

Anyone who practices search engine optimization knows the value of analyzing trends in social media, but may not be aware of its utility as a tool for temporal forecasting as described above. On the other hand, some are likely already using it to target their optimization strategy to "be there, before the buyer gets there."

My Attempts to Make Sense Out of It

Using a web application called TweetVolume, we can try to show (not predict yet) some prevailing trends in the real estate industry. Let's compare three phrases, namely:

House for Sale, Homes for Sale, House for Rent


Using TweetVolume

1. The demographic is limited to people who use Twitter.
2. The difference in usage between "house for sale" and "homes for sale" is negligible, which simply means that people don't have a strong preference for either phrase.
3. Predictably, the volume of "house for sale" / "homes for sale" far exceeds that of "house for rent". One possible reading is that more people intend to sell their real estate assets than to lease them out.
4. This analysis describes the past and present state of real estate trends in social networks, not the future. Not yet, but we'll get there...

Using Google's Keyword Tool External

1. Notice how the numbers from Google's Keyword Tool External differ from those of TweetVolume. First of all, one must take into account Google's user base, which is far greater than Twitter's. The image merely reflects the number of searches made on Google for the terms in question.

What we can predict using this data

Nothing yet. To predict, one must be able to compare the data with a trend that occurs consistently under specific conditions and parameters. I don't know exactly how Bollen and his team were able to correlate the word "nervous" with the rise and fall of the stock market.
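
To make the idea of "comparing with a trend" concrete, here is a minimal sketch of one common way to test whether a signal leads another series: correlate today's signal with the other series some days later. This is not Bollen's method, which I don't know. The function names and the numbers are made up purely for illustration and are not real Twitter or market data.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(signal, target, lag):
    """Correlate signal[t] with target[t + lag], for lag >= 0 periods."""
    if lag == 0:
        return pearson(signal, target)
    return pearson(signal[:-lag], target[lag:])


# Made-up daily "anxiety" scores (hypothetical, not real tweet data).
anxiety = [3, 5, 2, 8, 4, 7, 1, 6]
# Made-up index changes, constructed so that each day's change is the
# negative of the previous day's anxiety score.
index_change = [0, -3, -5, -2, -8, -4, -7, -1]

same_day = lagged_correlation(anxiety, index_change, 0)
next_day = lagged_correlation(anxiety, index_change, 1)
print(f"same day: {same_day:.2f}, one day later: {next_day:.2f}")
```

A strong correlation at a positive lag is what would make a signal interesting for forecasting. Even then it only shows that the two series moved together in the past, not that one causes the other.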

Predicting Real Estate Trends Using Google Trends

This is nothing new. A formal study has been made, not to claim that Google Trends can predict future economic activity, but rather that it can help predict the present. Read Hyunyoung Choi and Hal Varian's Predicting the Present with Google Trends. Quoting from their study:
"We are not claiming that Google Trends data help predict the future. Rather we are claiming that Google Trends may help in predicting the present."
A more optimistic position on predicting the future using Google Trends has been posited elsewhere on the Internet; however, it is not backed by statistical data. It makes for a good read, though.

To quote:
"Simply eyeballing the popularity of the search term "American Idol" in Google Trends shows that the keywords were less popular in 2007 than they were in 2006, thus a downward trend was already in the making. Also, the spike in interest in January of 2008 was less than it was in January 2007, further proof that American Idol would not be as avidly watched this year."

Using Google Insights for Search to Predict the Future

According to Google, the numbers on the graph reflect "how many searches have been done for a particular term, relative to the total number of searches done on Google over time. They don't represent absolute search volume numbers, because the data is normalized and presented on a scale from 0-100." (Google)
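
The kind of normalization described in that quote can be sketched as follows. This is only my reading of the description, not Google's actual computation; the function name and the counts are hypothetical.

```python
def normalized_interest(term_counts, total_counts):
    """Scale a term's share of all searches so the peak period equals 100."""
    # Share of all searches that were for the term, per period.
    shares = [term / total for term, total in zip(term_counts, total_counts)]
    peak = max(shares)
    return [round(100 * share / peak) for share in shares]


# Made-up monthly counts (hypothetical, not real search data).
term_searches = [120, 150, 300, 240]
all_searches = [10_000, 10_000, 15_000, 16_000]

print(normalized_interest(term_searches, all_searches))
```

Note that the fourth month has more raw searches for the term than the second month, yet both end up with the same index because total search volume grew too. This is why the graph cannot be read as absolute search volume.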

Note the seasonal rises and falls of the line graph. It is descriptive in that it shows the peak seasons for these search queries fall at almost the same time every year.

The peaks always occur around the end of the second quarter and the beginning of the third quarter.

From the middle of the third quarter the numbers decline, dipping to their lowest in December.

The data shows a correlation with rentals as well.
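
Rather than eyeballing the graph, the seasonal pattern described above could be checked by averaging the index for each calendar month across several years. A minimal sketch, assuming the data has been exported as (year, month, value) rows; the values below are invented for illustration and are not the actual Insights for Search numbers.

```python
from collections import defaultdict
from statistics import mean


def monthly_profile(rows):
    """Average value per calendar month from (year, month, value) rows."""
    buckets = defaultdict(list)
    for _year, month, value in rows:
        buckets[month].append(value)
    return {month: mean(values) for month, values in sorted(buckets.items())}


# Made-up index values for January..December (hypothetical data).
base_year = [60, 65, 72, 80, 88, 95, 100, 92, 80, 70, 58, 50]
rows = []
for offset, year in enumerate([2008, 2009]):
    for month, value in enumerate(base_year, start=1):
        # The second year is shifted up slightly to mimic annual growth.
        rows.append((year, month, value + 5 * offset))

profile = monthly_profile(rows)
peak_month = max(profile, key=profile.get)
trough_month = min(profile, key=profile.get)
print(f"peak month: {peak_month}, lowest month: {trough_month}")
```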

I could go on and on and use a wide variety of web platforms to try to predict web trends. In the example above, what the data indicates, aside from the seasonal changes, is the steady year-over-year rise in usage of the term most often used in real estate. Quite naturally, there would be more houses for sale in the coming years, more advertisements and more search queries because of constant and ever-increasing real estate development projects. The US crisis, the bailout and the efforts to mitigate the effects thereof also fuel the rise of searches for the term "house for sale". Whether this would translate to an actual increase in economic activity in the real estate sector still needs to be quantified.

Can We Really Predict the Future Using Technology?

First, I must divulge my ideological background: I am a big fan of Hari Seldon, the elder protagonist of Isaac Asimov's Foundation series, novels that portray a future predicted accurately using an arcane yet potent form of statistics.

Predicting the Future Through Technology

Apparently, Google believes that we can use statistics and technology to predict the future. In fact, it has acquired a company that can "predict the future". In a way, Google has already stepped closer to being a forecasting engine, having incorporated a "forecast" feature into Insights for Search. Take a closer look at the rightmost portion of the picture below. The broken lines represent how Google thinks the term "House sold" will trend in the latter months of 2010.

The gray portion at the far right is the forecast.
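
I don't know how Google computes its forecast line. As a rough illustration of how a seasonal series can be projected forward at all, here is a sketch of a simple "seasonal naive with drift" forecast: repeat the last full season and shift it by the average season-over-season change. The function name and the quarterly numbers are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Repeat the last full season, shifted by the average seasonal change."""
    if len(history) < 2 * season_length:
        raise ValueError("need at least two full seasons of history")
    last = history[-season_length:]
    previous = history[-2 * season_length:-season_length]
    # Average change between the same period in consecutive seasons.
    drift = sum(a - b for a, b in zip(last, previous)) / season_length
    forecast = []
    for step in range(horizon):
        seasons_ahead = step // season_length + 1
        forecast.append(last[step % season_length] + drift * seasons_ahead)
    return forecast


# Made-up quarterly index values for two years (hypothetical data).
history = [40, 70, 90, 50, 44, 74, 94, 54]
print(seasonal_naive_forecast(history, season_length=4, horizon=6))
```

A forecast like this simply assumes that next year will look like this year plus the recent growth. It says nothing about events that break the pattern.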

Watch the video about the tech startup Recorded Future that Google acquired in 2010.

Conclusion on Whether We Can Predict the Future

At this juncture, predicting the future using technology is very limited in scope. The best word to use is not actually "predict" but "forecast". Economic activity, like the buying and selling of houses through social media, could still be affected by a huge number of variables on a location-by-location basis.

At best, all of the information above could be described as predicting what people will be searching for in a given period of time and location. These forecasts are by no means an accurate portrayal of reality overall, but rather a portrayal of reality as represented in a virtual medium.

Simply stated, the numbers represent only the demographic which is included in the data, i.e., people who don't use Google and Twitter are not part of the statistics.

With that said, technology is getting better and better, and we may indeed be able to expect more accurate and specific results from prediction technologies in the coming years.

Ray Kurzweil believes that we can predict the future using quantum computing. I, for one, would subscribe to an enhanced form of statistics aided by technology, perhaps approximating the one used by Hari Seldon.