Showing posts with label Big Data. Show all posts

Tuesday, February 07, 2012

Big Data Democratization By Wolfram Alpha

The Verge: Wolfram Alpha Pro democratizes data analysis: an in-depth look at the $4.99 a month service
... the ability to use images, files, and even your own data as inputs instead of simple text entry. The "reports" that Wolfram Alpha kicks out as a result of these (or any) query are also beefed up for Pro users, some will actually become interactive charts and all of them can be more easily exported in a variety of formats...... Wolfram Alpha presents a different way of interacting with knowledge and data than anything else out there on the web ..... Wolfram Alpha is excellent at returning answers to mathematical queries and scientific queries, but it also can provide results based on its structured data. Most people know it now as one of the sources for Apple's Siri feature on the iPhone 4S. ..... in many cases the service can provide better results from Siri than from text queries because there is "more structure to what they're saying" than what most people have trained themselves to type into search fields. ..... provide "reports" instead of just "answers." ..... The first and most obvious feature is that users will get an account at Wolfram Alpha with a complete history of their queries, uploads, and downloads. ...... any number of export options — including a basic Excel spreadsheet, vector graphics, and JSON data if you'd like to integrate the data into your own web app. ...... the Computable Document Format (CDF). CDF is a browser plug-in that enables interactivity with charts, graphs, and other data. ..... an "extended keyboard" which is similar to the larger keyboard available on its mobile apps. This makes it easier to enter mathematical symbols without having to remember obscure keyboard combinations ...... Users will also be able to use images as inputs ...... if you uploaded a set of dates and prices, Wolfram Alpha will actually try to determine exactly what those prices represent. ...... put yourself in the mindset of a small business owner. 
Instead of trying to hassle with interpreting a spreadsheet of website traffic and sales data, he or she could upload it to Wolfram Alpha Pro ...... not just for the realm of academics and research. Regular users may not have had much reason to dig into the service before now, but the ability to bring the entire brunt of Wolfram Alpha's computational engine on any arbitrary piece of data democratizes the idea of statistical analysis.
I did not have to invent email to be able to use Gmail. There will be Big Data versions of Gmail. I as a user should be able to do Big Data stuff without launching a Big Data company. We will see much more of that happen. If anything Big Data will show up as a hugely democratizing force. We will realize every single human being literally sits atop an oil field. There is money to be made.

Thursday, January 26, 2012

Google's Hidden Card: Become An ISP

Steve Jobs decided a long time ago that he wanted to do both hardware and software. Bill Gates' cofounder Paul Allen wanted the same. But Bill Gates vetoed the idea. He wanted to focus just on software. Software that will run on all kinds of hardware.

You could argue Bill Gates won the first round and Steve Jobs won the second round. But then Google was even more detached from hardware than was Microsoft. And yet Google bought Motorola, a hardware company. Granted it bought Motorola primarily for the patents to hit back in the Android fight. But there is no denying all that hardware.

Larry Page's Challenge

Google is going to build smartphones and tablets in-house. And that is not easy to do. Apple leads that herd.

Google, the king of search, made several clumsy efforts in the social space until it finally hit on Google Plus. Google Plus is great, but it is no Facebook. And Google is well positioned in the Big Data space as well as in next-generation industries like driverless cars. Talk about hardware-software integration. A car is conspicuous hardware.

I think, though, that what will set Google on the path to becoming the most valuable company in the world is getting into the ISP space. Hardware-software-connectivity integration beats hardware-software integration. (Not Hardware, Not Software, But Connectivity, One Gig Per Sec: This Is What I Am Talking About)

What would be some of the ingredients? One gigabit per second speed. Ad based. Use snooping technology. (Eric Schmidt's Cloud Computing And My IC Vision)

The snooping technology is that the ISP reads the web addresses of all the websites you visit and serves ads accordingly. It is like Gmail reads all your emails and serves relevant ads. Same thing. It will not be an invasion of privacy. It is machines reading.

Google as a global ISP would eclipse Google as the search engine of choice in terms of influence and revenue. That also might be the best way to conquer the mobile space with Android.

Sunday, January 15, 2012

Big Data

Big Data: Big News
Facebook And Big Data

After reading this you appreciate your Facebook stream just a little more.

O'Reilly Radar: What is big data?
Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. ..... cost-effective approaches have emerged to tame the volume, velocity and variability of massive data. Within this data lie valuable patterns and information ...... Today's commodity hardware, cloud architectures and open source software bring big data processing into the reach of the less well-resourced. ...... analytical use, and enabling new products ...... Being able to process every item of data in reasonable time removes the troublesome need for sampling ...... by combining a large number of signals from a user's actions and those of their friends, Facebook has been able to craft a highly personalized user experience and create a new kind of advertising business. It's no coincidence that the lion's share of ideas and tools underpinning big data have emerged from Google, Yahoo, Amazon and Facebook. ....... The emergence of big data into the enterprise brings with it a necessary counterpart: agility. Successfully exploiting the value in big data requires experimentation and exploration. ........ Input data to big data systems could be chatter from social networks, web server logs, traffic flow sensors, satellite imagery, broadcast audio streams, banking transactions, MP3s of rock music, the content of web pages, scans of government documents, GPS trails, telemetry from automobiles, financial market data, the list goes on. ....... the three Vs of volume, velocity and variety are commonly used to characterize different aspects of big data. ........ Having more data beats out having better models ...... If you could run that forecast taking into account 300 factors rather than 6, could you predict demand better? ......... Many companies already have large amounts of archived data, perhaps in the form of logs, but not the capacity to process it. ...... 
data warehouses or databases such as Greenplum — and Apache Hadoop-based solutions ...... Apache Hadoop.. places no conditions on the structure of the data it can process. ...... First developed and released as open source by Yahoo, it implements the MapReduce approach pioneered by Google in compiling its search indexes. Hadoop's MapReduce involves distributing a dataset among multiple servers and operating on the data: the "map" stage. The partial results are then recombined: the "reduce" stage. ......... Hadoop is not itself a database or data warehouse solution, but can act as an analytical adjunct to one. ....... A MySQL database stores the core data. This is then reflected into Hadoop, where computations occur, such as creating recommendations for you based on your friends' interests. Facebook then transfers the results back into MySQL, for use in pages served to users. ............ the increasing rate at which data flows into an organization — has followed a similar pattern to that of volume. Problems previously restricted to segments of industry are now presenting themselves in a much broader setting. Specialized companies such as financial traders have long turned systems that cope with fast moving data to their advantage. Now it's our turn. ......... Online retailers are able to compile large histories of customers' every click and interaction: not just the final sales. Those who are able to quickly utilize that information, by recommending additional purchases, for instance, gain competitive advantage. The smartphone era increases again the rate of data inflow, as consumers carry with them a streaming source of geolocated imagery and audio data. ......... The importance lies in the speed of the feedback loop, taking data from input through to decision. ........ you wouldn't cross the road if all you had was a five-minute old snapshot of traffic location. ......... "streaming data," or "complex event processing." ...... 
when the input data are too fast to store in their entirety: in order to keep storage requirements practical some level of analysis must occur as the data streams in. ........ At the extreme end of the scale, the Large Hadron Collider at CERN generates so much data that scientists must discard the overwhelming majority of it — hoping hard they've not thrown away anything useful. The second reason to consider streaming is where the application mandates immediate response to the data. Thanks to the rise of mobile applications and online gaming this is an increasingly common situation. ........ The velocity of a system's outputs can matter too. The tighter the feedback loop, the greater the competitive advantage. ....... Rarely does data present itself in a form perfectly ordered and ready for processing. A common theme in big data systems is that the source data is diverse, and doesn't fall into neat relational structures. It could be text from social networks, image data, a raw feed directly from a sensor source. None of these things come ready for integration into an application. .......... the reality of data is messy. Different browsers send different data, users withhold information, they may be using differing software versions or vendors to communicate with you. And you can bet that if part of the process involves a human, there will be error and inconsistency. ....... Is this city London, England, or London, Texas? By the time your business logic gets to it, you don't want to be guessing. ...... a principle of big data: when you can, keep everything. There may well be useful signals in the bits you throw away. ....... documents encoded as XML are most versatile when stored in a dedicated XML store such as MarkLogic. Social network relations are graphs by nature, and graph databases such as Neo4J make operations on them simpler and more efficient. ....... a disadvantage of the relational database is the static nature of its schemas. 
In an agile, exploratory environment, the results of computations will evolve with the detection and extraction of more signals. Semi-structured NoSQL databases meet this need for flexibility: they provide enough structure to organize data, but do not require the exact schema of the data before storing it. ........ three forms: software-only, as an appliance or cloud-based. ...... IT is undergoing an inversion of priorities: it's the program that needs to move, not the data. .... Financial trading systems crowd into data centers to get the fastest connection to source data, because that millisecond difference in processing time equates to competitive advantage. ...... 80% of the effort involved in dealing with data is cleaning it up in the first place ...... data science, a discipline that combines math, programming and scientific instinct. ...... The art and practice of visualizing data is becoming ever more important in bridging the human-computer gap to mediate analytical insight in a meaningful way. ...... advice to businesses starting out with big data: first, decide what problem you want to solve.
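The "map" and "reduce" stages described in the excerpt above can be sketched in a few lines of Python. This is only a toy illustration running on one machine, not Hadoop itself: the function names `map_stage`, `reduce_stage` and `word_count` and the sample chunks are my own assumptions, and a real Hadoop job also groups results by key across servers between the two stages.

```python
from collections import defaultdict
from functools import reduce


def map_stage(chunk):
    """Map: turn one chunk of text into partial word counts."""
    counts = defaultdict(int)
    for word in chunk.lower().split():
        counts[word] += 1
    return dict(counts)


def reduce_stage(left, right):
    """Reduce: merge two partial results into one."""
    merged = dict(left)
    for word, n in right.items():
        merged[word] = merged.get(word, 0) + n
    return merged


def word_count(chunks):
    # On a cluster each chunk would be handled by a different server;
    # here the chunks are simply processed one after another.
    partials = [map_stage(chunk) for chunk in chunks]
    return reduce(reduce_stage, partials, {})


# Hypothetical sample data standing in for a dataset split across servers.
chunks = ["big data is big", "data beats models", "big news"]
totals = word_count(chunks)
print(totals)
```

The point of the split is that `map_stage` never needs to see more than its own chunk, so adding servers adds capacity, and only the small partial results have to be brought together at the end.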

Sunday, January 01, 2012

Google Is Mind Blowing

Google conquered search. I remember the days when I used to fondly display the Google search engine on my personal homepage on Geocities. This was when Google had just launched. The company has come a long way since.

It conquered search. It floundered on social, the next big trend, for a few years. But now it seems to also have mastered social. Google Plus is a big hit.

It has conquered the mobile space with Android.

Big Data is the next big thing after social, I think, and many others think so too. Google is doing some really interesting things in that space. Facebook is not, Apple is not. Microsoft might, but is not. Many new startups are doing better work than Microsoft in that space. Just like social belongs to Facebook, Big Data deserves to belong to new names, not Google. But Google is proving surprisingly resilient.

People talk about the magic of Apple. I never really got it. For me the magic has always rested with Google.

And that is not even talking about Google X. Google thinks long term like no other company I know. I think Google more than Apple is poised to end up the most valuable company in the world. Google X has been working on entire new industries of the future. Much of it comes across as sci-fi.

I love Google like some people love Apple. But that is no news. That has always been true for me. But I have always been fascinated by the Steve Jobs life story.

It's a buy from me on Google stock.

If I were forced to choose between Gmail and Facebook, I would pick Gmail. But I am glad I am not being forced.

Wednesday, November 30, 2011

Big Data: Big News

Those who think GOOG is a one trick search pony, checkout GFS, BigTable, MapReduce, Tenzing, etc. These are the building blocks of Big Data
Nov 30 via web


I am not the first to make this observation, and neither is the person quoted above. But it is obvious that Big Data is waiting in the wings. Big Data will gather buzz the way social has for a few years now.