Patterns – 2.27.2009

1. Twitter Analytics
The phenomenal growth of Twitter has drawn tremendous amounts of data out of its estimated 3,400,000 users. From average Janes to elite technorati and global media stars, the spread of information shared across Twitter is immense. With the increasing presence of product brand managers and corporate reps, the Twitter channel has eyes & ears that now spread deep into commerce, government, and culture. Senators can speak to constituents, and customers can speak to big business. Since these communications are public and archived behind the Twitter API, anyone can develop tools to extract the data.

The perennial question continues to focus on Twitter’s yet-to-be-revealed business model. Whatever ace they have up their sleeve (or however large Evan Williams’s bank account remains), the API has enabled a large ecology of third-party services to grow around an open data repository of public communication about almost everything. With an openly searchable public timeline and the addition of user-generated hashtags to coordinate thread topics, the greatest emergent value of Twitter is in the trends and meaning that can be extracted from its content. As often noted, Twitter presents a reading of the zeitgeist.

Who to watch: Twitter’s meteoric rise hasn’t allowed much time for compelling solutions to be developed. Twitter itself has been playing its cards close to its chest, leaving most of the interesting development to third parties. But it’s likely they’re watching and plumbing the data in ways not yet exposed through the API. There have been hints at charging for commercial use, but if Google is any indicator (and it always is these days), Twitter will find its core footing in providing deeper access to data & user analytics.

The New York Times recently published a Twitter mashup showing timelined tweets related to the Super Bowl. It’s a simple yet compelling visualization that is immediately valuable to anyone trying to gauge viewer sentiment. The timeline is displayed over a map of the US that indicates the geographical distribution of tweets. While it focuses on general content (e.g. Steelers, Cincinnati), it’s easy to imagine visualizations focusing on ad words like Pepsi, Sobe, and GoDaddy to indicate viewer response.

Some smaller third parties like Twitstat and Tweetmeme plumb the public timeline and collate a lot of info that shows off the power of the API, but their search tools do little more than return a list of matches for a query. The real need is for targeted visualizations to extract trends and meaning from the stream.
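To make the gap between a list of matches and a trend concrete, here is a minimal Python sketch. It assumes you already have tweets as (timestamp, text) pairs; the sample data, the function names, and the bar-chart output are all invented for illustration and are not taken from any of the services mentioned above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (timestamp, text) pairs standing in for
# tweets returned by a search. Invented for illustration only.
SAMPLE_TWEETS = [
    ("2009-02-01 18:31:05", "Steelers looking strong tonight"),
    ("2009-02-01 18:31:40", "that Pepsi ad was odd"),
    ("2009-02-01 18:32:10", "GO STEELERS"),
    ("2009-02-01 18:32:55", "steelers!!!"),
    ("2009-02-01 18:34:20", "Pepsi again?"),
]


def mentions_per_minute(tweets, keyword):
    """Count tweets containing keyword (case-insensitive), bucketed by minute."""
    needle = keyword.lower()
    counts = Counter()
    for stamp, text in tweets:
        if needle in text.lower():
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
            counts[when.strftime("%H:%M")] += 1
    # Sort by minute so the result reads as a timeline.
    return dict(sorted(counts.items()))


def text_chart(series):
    """Render a minute -> count mapping as rows of '#' bars."""
    return [f"{minute} {'#' * count}" for minute, count in series.items()]


if __name__ == "__main__":
    for row in text_chart(mentions_per_minute(SAMPLE_TWEETS, "steelers")):
        print(row)
```

Simple substring matching like this says nothing about sentiment or context; it only shows how the same query results can be reshaped from a flat list into a timeline.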

Also keep an eye on Reuters/Calais as they extend their top-down semantic approach to more archives and services. And, of course, watch Google as they turn their attention towards the Twitter datacloud and start to feel the pinch of the growing buzz around Twitter & real-time search.

Related: The really interesting element of Twitter is its emerging use as a live-search service, illustrated beautifully by Erick Schonfeld:

What if you could peer into the thoughts of millions of people as they were thinking those thoughts or shortly thereafter? And what if all of these thoughts were immediately available in a database that could be mined easily to tell you what people both individually and in aggregate are thinking right now about any imaginable subject or event? Well, then you’d have a different kind of search engine altogether. A real-time search engine. A what’s-happening-right-now search engine.

In fact, the crude beginnings of this “now” search engine already exists. It is called Twitter, and it is a big reason why new investors poured another $35 million into the two-year-old startup on Friday.

(See also Chris O’Brien’s Mercury News article “How Twitter Could be a Threat to Google.”)

Twitter reports on the Now and is much closer to people and behaviors, emotions and intentions than Google can get with its static intentional search of the indexed web. Searching Twitter effectively searches human behavior. This may be a total game-changer.

2. Open Information
The much-touted “death of publishing” is essentially a realization of the declining value of static content and closed media. From books & newspapers to PDFs, value is shifting away from the container towards the information. As with all 20th-century media, the industry giants are struggling to evolve their business models to adapt to the enormous changes wrought by the hyper-linked social web.

Increasingly, stories that may have traditionally lived in fixed media are now told through multiple channels, leveraging text, video, interactive visualizations, and user input & conversation. Likewise, formerly walled gardens like The New York Times & WSJ – which tried to simply move their paper subscription models online – are quickly moving to free models that seek revenues through advertising, analytics, & partnerships. With the empowerment of user content capture & broadcast, major news outlets are increasingly relying on average people to give them data. In return, we are demanding more access to that data.

Of perhaps greater impact is the ongoing leveraging of social media tools, open APIs, and public accounting records to expose the operational transactions of corporations & governments. As a backlash against the intense secrecy of the Bush administration, and further motivated by the web of hidden banking transactions that enabled the current financial meltdown, tech-savvy public interest groups are exposing institutional accounting data to the world, heralding a new age of transparency and accountability. These trends are beginning to hit business as well, and you can expect shareholders of public companies to demand more business intelligence reports about operations.

In both cases, the trend is towards the aphorism that “information wants to be free”.

Who to watch: The most active players in this area are the Old Media news publishing giants struggling to innovate in the digital world while their paper circulations dry up and disappear. Many of them have been recording and archiving valuable information about the world for over a century. As they move their content online they’re highly motivated to explore new strategies for making it valuable and compelling.

As noted, the New York Times is very actively transforming itself into an open, highly visual multimedia resource. Also, Reuters/Calais continue with their efforts to build semantic meaning into their own databases, as well as providing services to the greater web that build standard representations into any web content.

Google, of course, has an interest in all online content and will actively pursue ways that it can help make that content more accessible and searchable. Adobe is making Flash more transparent to spiders and would also be well served to break apart video and enable algorithmic visual analysis, identification, and tagging. The Sunlight Foundation is leveraging open information and Web 2.0 service models to expose & annotate the dangerous relationships between elected officials and their special interest campaign donors.

Related: Container-aware content delivery. Obviously, reading the New York Times on a mobile device should be a different experience than reading it online, in print, or on an eReader. The usability of the information takes priority over the consistency of the format, but the structure of the data must be readily able to retarget to any interface. Expect more evolved translational systems that sit between dataclouds and interface layers.


  1. Anza

    Great article….still digesting….wanted to point out that your underlined links are returning errors….link address structure problem….

    I’m trying to envision the structure of the query mechanics required to access and parse this mountain of billions of random transactions to derive a means to retrieve specified information or keywords…. and the means to recognize and understand the context in which a keyword is being used. Sometimes, it may not matter but, how accurate could you expect the results to be? I know I’m jumping the gun but I have a tendency to want to figure out the “how” and the structure of the logic to achieve the goal. I’ve seen the Sunlight site is compiling a bunch of API tools which may provide some insight into this challenge.
