Tag Archives: google

health, crowds, and data mining

Google released Google Flu Trends yesterday, a tool that analyzes search terms for indicators of flu activity. With the onset of flu season, people start searching for keywords such as “flu vaccine”, which Google detects and charts. The example below shows that we are just a couple of weeks away from a time of year that has previously seen a large outbreak:

Google Flu Trends Sample

The true genius behind this system is that Google is not directly involved in data collection. Data is collected passively as users submit searches. Incredibly, Google Flu Trends reliably performs flu surveillance up to 2 weeks faster than the CDC (US Centers for Disease Control and Prevention)! For details on Google’s tracking method, check out their blog post Tracking Flu Trends.

In a similar fashion, David Bates of Harvard Medical School is creating an epidemic surveillance system that analyzes the electronic health records of several Boston-area medical centers every night. When an outbreak is in the works, not all the sick people go to one hospital. Two might show up at one hospital and three at another. The next day several more go. By the time authorities are aware of an outbreak, it is weeks too late. Performing surveillance on data from several hospitals simultaneously greatly expands the quantity of information available, so an emerging outbreak can be spotted sooner and potentially contained before it spreads.
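To make the pooling argument concrete, here is a minimal Python sketch. It is not the Harvard system's method: the hospital names, the daily counts, and the simple "baseline mean plus three standard deviations" rule are all assumptions made up for illustration. It only shows why a rise that looks like noise at each hospital can stand out once the counts are added together.

```python
from statistics import mean, stdev

def flag_days(counts, baseline_days=7, z=3.0):
    """Return indices of days whose count exceeds baseline mean + z * stdev.

    The first `baseline_days` entries are treated as normal activity.
    This threshold rule is an illustrative assumption, not a real
    surveillance algorithm.
    """
    baseline = counts[:baseline_days]
    threshold = mean(baseline) + z * stdev(baseline)
    return [i for i in range(baseline_days, len(counts)) if counts[i] > threshold]

# Hypothetical daily counts of flu-like visits at three hospitals (10 days).
hospital_counts = {
    "hospital_a": [2, 1, 2, 3, 2, 1, 2, 3, 4, 5],
    "hospital_b": [1, 2, 1, 2, 2, 1, 2, 3, 4, 4],
    "hospital_c": [2, 2, 3, 1, 2, 2, 1, 3, 3, 5],
}

# Pool the data: add up the counts from all hospitals for each day.
pooled = [sum(day) for day in zip(*hospital_counts.values())]

# Compare when each hospital alone would raise a flag with the pooled view.
individual_flags = {name: flag_days(c) for name, c in hospital_counts.items()}
pooled_flags = flag_days(pooled)

for name, days in individual_flags.items():
    print(name, "flags days", days)
print("pooled flags days", pooled_flags)
```

With these made-up numbers, the pooled series crosses its threshold one day before any single hospital does, because small simultaneous increases add up while the day-to-day noise partly averages out.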

Data mining in health that transcends a single unit (like a hospital) has only just begun. Personal health record systems like Google Health and Microsoft HealthVault optionally aggregate health data from a variety of sources (e.g. hospitals, clinics, insurers, pharmacies). Determining health trends is one of Google’s primary goals with this system:

Google will use aggregate data to publish trend statistics and associations. (http://www.google.com/intl/en-US/health/privacy.html)

Once again, while Google and Microsoft are both investing heavily in platform development and partner recruitment, the data is entered, imported, and managed by the consumer. For an interesting post on the positive and negative ramifications of Google Health, check out Tree of Knowledge.

Information Poverty & Global Rankings

I recently read a Google blog post entitled Information Poverty that highlighted how detrimental a lack of access to information is to quality of life and development. A couple of notable quotes:

According to the Kenya Poverty and Inequality Assessment released by the World Bank this year, 17 million Kenyans or 47% of the population were unable to meet the costs of food sufficient to fulfill basic daily caloric requirements. The vast majority of these people live in rural areas and have even less access to the information that impacts their daily life. Data on water quality, education and health budgets, and agricultural prices are nearly impossible to access.

The right information at the right time in the hands of people has enormous power…Where does [the] money go, who gets it, and what are the results of the resources invested? The power to know plus the power to act on what you know is the surest way to achieve positive social change from the bottom up.

There are a lot of possible ways to measure a population’s access to information, including internet connectivity, mobile phone proliferation, or mass media market penetration, but also literacy and education, or the presence of libraries and universities. While many indicators and rankings are available, I don’t believe that any global index exists that compresses available data together into a single Global Information Access Index.

The United Nations, namely the UN Statistics Division, tracks country data to measure development progress. The World Bank collects and calculates development indicators largely based on economic factors. A host of other organizations have created their own tools to read in publicly available data and summarize and present it in more useful ways. My favorite example of this is Gapminder World, which lets you graphically manipulate country indicators over time. The International Telecommunication Union comes closest to what I have in mind with its Digital Access Index (only 2003 data publicly available), described as:

The Digital Access Index (DAI) measures the overall ability of individuals in a country to access and use Information and Communication Technology.

The DAI ranks countries using 8 indicators: fixed telephone subscribers, mobile phone subscribers, price of internet access, bandwidth, broadband internet subscribers, literacy, education, and total internet users.
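A composite index of this kind is typically built by scaling each indicator to a common range and then averaging. The Python sketch below shows that general idea only. It is not the ITU's published formula: the indicator names, the goalpost values, the equal weighting, and the country figures are all hypothetical. Price is left out because a "lower is better" indicator would need to be inverted first.

```python
def normalize(value, goalpost):
    """Scale a raw value to the range [0, 1] against an upper goalpost."""
    return min(value / goalpost, 1.0)

def composite_index(indicators, goalposts):
    """Average the normalized indicators with equal weights (an assumption)."""
    scores = [normalize(indicators[name], goalposts[name]) for name in goalposts]
    return sum(scores) / len(scores)

# Hypothetical goalposts: the value at which an indicator earns a full score.
goalposts = {
    "mobile_per_100": 100.0,
    "literacy_pct": 100.0,
    "internet_users_per_100": 85.0,
}

# Hypothetical figures for one country.
country = {
    "mobile_per_100": 50.0,
    "literacy_pct": 80.0,
    "internet_users_per_100": 17.0,
}

index = composite_index(country, goalposts)
print(f"composite index: {index:.2f}")
```

Here the three normalized scores are 0.5, 0.8, and 0.2, so the index comes out at roughly 0.5. Changing the goalposts or the weights changes the ranking, which is why the choice of indicators matters so much.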

The DAI is a great start, but it is still lacking. For example, the DAI cannot adjust for information-restricting policies such as those in China. Libraries and library usage are not considered. The index also does not account for the huge marginal returns that can be gained when a society with little information access installs a single internet-connected computer. How information sources are distributed within a country is crucial.
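The marginal-returns point can be illustrated with a small Python sketch. It compares a linear score with a logarithmic one, under the assumption (mine, not the DAI's) that the value of an additional computer shrinks as a community gets more of them. The function names and the cap of 1000 computers are hypothetical.

```python
import math

def linear_score(computers, cap=1000):
    """Score grows by the same amount for every additional computer."""
    return min(computers / cap, 1.0)

def log_score(computers, cap=1000):
    """Score grows quickly at first, then flattens (diminishing returns)."""
    return min(math.log1p(computers) / math.log1p(cap), 1.0)

# Gain from installing the very first computer.
first_linear = linear_score(1) - linear_score(0)
first_log = log_score(1) - log_score(0)

# Gain from adding one more computer when 500 already exist.
later_linear = linear_score(501) - linear_score(500)
later_log = log_score(501) - log_score(500)

print(f"linear: first={first_linear:.4f} later={later_linear:.4f}")
print(f"log:    first={first_log:.4f} later={later_log:.4f}")
```

Under the linear score the first computer counts no more than the 501st. Under the logarithmic score the first computer is worth far more, which is closer to the intuition that a single connected machine can transform a community that previously had none.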

a singular google

Singularity is a blog dedicated to technological transformations and social change. I’ll save the full meaning of a singularity for a later date; instead, I’ll mention a singular company pursuing social change in new ways. In honor of its 10th birthday, Google introduced Project 10^100, a competition “call[ing] for ideas to change the world by helping as many people as possible”. Winners receive a portion of the $10 million prize pot. The catch? It’s got to be realizable within 1 to 2 years. The submission deadline is October 20.

Project 10^100

In case you’re short on ideas, check out Y Combinator’s list of 30 ideas it would like to fund: Startup Ideas We’d Like to Fund.

Let the brainstorming begin…