Google, Facebook and Big Data
Most of us know that Google is a search engine, and we use it all the time. It seems to have become our second DNA. Founders Larry Page and Sergey Brin met at Stanford University in 1995. By 1996 they had built a search engine together, initially called BackRub. In the end they named it "Google", a pun on the word "googol", the mathematical term for a 1 followed by 100 zeros. Google Inc. saw the light of day in 1998. A 1 followed by 100 zeros makes you think. Nobody expected the company to become as big as it is today, yet those zeros are a fitting indication of its current size. To give you an idea: 48 hours of video are uploaded to YouTube every minute, and the site gets more than 4 billion views per day. 50% of all internet users worldwide use Google every day, and 20 petabytes of data are processed daily. All of this is spread across the many services Google offers, such as Google+, Gmail, Google Translate, Google Maps, Google Earth, Google Voice, Google Wallet, Google Drive and so on.
So, what happens when you google? Like all search engines, Google continually searches through the web. The pages it finds are cataloged and stored in a smart way, and the resulting amount of data is huge. How efficient is Google? As a result of the company's efforts to be efficient, the energy used per Google search is very small. Google has compared the energy used by 100 searches with everyday activities such as drying your hands, ironing a shirt, or drinking a glass of orange juice. Specifically, Google currently uses about 0.0003 kWh of energy to answer the average search query, which translates into roughly 0.2 g of carbon dioxide.
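To put those per-search figures in perspective, here is a minimal Python sketch that scales them up to 100 searches. The constant and function names are my own, chosen for illustration. The only inputs taken from the text are 0.0003 kWh and 0.2 g of carbon dioxide per query, and the sketch assumes the footprint grows linearly with the number of searches.

```python
# Per-search figures quoted in the text above.
ENERGY_PER_SEARCH_KWH = 0.0003
CO2_PER_SEARCH_G = 0.2


def search_footprint(searches):
    """Return (energy in kWh, CO2 in grams) for a number of searches.

    Assumption: the footprint scales linearly with the number of searches.
    """
    return searches * ENERGY_PER_SEARCH_KWH, searches * CO2_PER_SEARCH_G


energy_kwh, co2_g = search_footprint(100)
print(f"100 searches: about {energy_kwh:.2f} kWh and {co2_g:.0f} g of CO2")
```

Under that assumption, 100 searches come to about 0.03 kWh and about 20 g of carbon dioxide.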
Is big data the next big hype, or is it more than that? The next logical question is: what is big data? It has clear advantages, but since I am always skeptical and critical, I also look at the drawbacks. The size, diversity and speed of today's data make an analytical architecture attuned to it necessary. Think of sophisticated data mining and more. What can it do for us, and what can it not do? A nice example: instead of "find my iPhone", some car insurance companies offer a service that may enable parents to "find my teenager". Progressive Insurance offers a tracking device that reports on a car's location, acceleration, braking, and distance traveled. To make this attractive, owners who install the device can get a discount on their policy. You may ask yourself how this can be used for the greater good. And how long will it be before companies like Google produce a car that uses all of the techniques available today?
Big players such as Google and Facebook, but also IBM, Oracle and Microsoft, are bringing more and more advanced tools to market for analyzing trends in this large amount of data. Organizations will have to use these tools if they want to make optimal use of all available data. Otherwise the data is pretty useless.
Facebook was founded in 2004. Its mission is to make the world more open and connected. People use Facebook to stay connected with friends and family, to discover what’s going on in the world, and to share and express what matters to them. With more than a billion monthly active users as of December 2012, approximately 82% of them outside the U.S. and Canada, you get a pretty good idea of how much data is processed. To give you some insight: Facebook revealed some big stats on big data, including that its system processes 2.5 billion pieces of content and 500+ terabytes of data each day. It pulls in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half hour. Facebook also revealed that over 100 petabytes of data are stored in a single Hadoop disk cluster.
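To get a feel for these numbers, the short Python sketch below converts the quoted rates to a common basis. It assumes that the half-hourly scan rate holds around the clock and that photos arrive evenly through the day, neither of which the figures themselves state. The variable names are my own.

```python
# Figures quoted in the text above.
INGESTED_TB_PER_DAY = 500        # "500+ terabytes of data each day"
SCANNED_TB_PER_HALF_HOUR = 105   # "roughly 105 terabytes ... each half hour"
PHOTOS_PER_DAY = 300_000_000     # "300 million photos per day"

HALF_HOURS_PER_DAY = 48
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: the scan rate is constant throughout the day.
scanned_tb_per_day = SCANNED_TB_PER_HALF_HOUR * HALF_HOURS_PER_DAY

# Assumption: photos are uploaded evenly over the day.
photos_per_second = PHOTOS_PER_DAY / SECONDS_PER_DAY

print(f"Scanned per day: {scanned_tb_per_day} TB")
print(f"Ingested per day: {INGESTED_TB_PER_DAY}+ TB")
print(f"Photos per second: {photos_per_second:.0f}")
```

Under those assumptions, the scanning adds up to about 5,040 terabytes per day, roughly ten times the 500+ terabytes taken in daily, and the photo uploads average out to around 3,500 per second.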
Like Google, Facebook is continuously looking for products that connect more and more people and thereby generate more data. Again, this would be meaningless if nothing were done with that data. The advertising banners in the right-hand column show exactly the things you happen to be interested in. Coincidence? Of course not!
So, where does that leave us? Big Data is more than just a huge amount of data (bits and bytes). So how does Big Data change the way we live? Data mining is key! Data mining has been used intensively and extensively by many organizations, and it is becoming increasingly popular, if not increasingly essential. Data mining is the search for (statistical) relationships in data sets with the aim of establishing profiles for scientific or commercial use. Such a data set can be formed by recording events in a practical situation, such as purchasing behavior, symptoms in patients, or web surfing and Twitter behavior. Data mining helps companies and scientists find the essential information they need. A report on an emerging threat to individual privacy, to be issued by the European data protection authorities, raises even more serious issues than those it is likely to address. The report will consider Google's asserted right to expand its data mining to combine users' personal data across all their accounts and services, including Gmail, internet searching, map and location information and photo sharing, with no way for individuals to opt out. At least one technology blogger has accused Microsoft of planning similar changes, while two new Facebook programmes to aggregate user data with other advertising and loyalty card data have also drawn concern. Whatever the merits of each case, the larger issue deserves greater public attention.
Again, although I am always skeptical, data mining can be used for the greater good. It can greatly benefit all parties involved. For example, data mining can help healthcare insurers detect fraud and abuse. It may also lead to better medicine through the analysis of fragments of data. Facebook is using the services of Datalogix, a US-based data mining company that collects and analyses information about shoppers, to gauge how many of its users make buying decisions based on the advertisements served on the social networking site. The tracking of web users has raised serious privacy concerns. Companies like Google and Facebook are not charities. In order to serve better advertising, these companies collect and retain private and personal information about consumers. The greatest challenge is picking the ‘right’ data. There is a danger that organisations or individuals simply select the data that best fits their needs. Without rules and limits, this can go spectacularly wrong. And because companies such as Google and Facebook generate so much data, we must constantly be on our guard.