Big Data Is a Big Deal!
Computer science is projected to be a $500 billion opportunity, with an anticipated one million more jobs than students by 2020. This staggering statistic is one factor behind Computer Science Education Week (CSEdWeek), an annual program dedicated to showing K-12 students the importance of computer science education. CSEdWeek runs December 9-15, 2013. To help promote this important cause and encourage you to consider a career in computer science, Dr. Bruce Harmon offers this discussion of big data and why it’s a big deal.
By Bruce Harmon, Ph.D.
Not only are “Big Data” and “Big Data Analytics” the most hyped phrases in Computer Science and IT today (according, anyway, to the marketing-research firm Gartner), but they are also absolutely real and worthy of our attention. Let’s demystify them now.
We call this phenomenon Big Data in recognition of two things: the exponential growth in the amount of digital information available over the Internet and in offline repositories, and the fact that technology has more than kept pace in letting us store and process it. Google, Yahoo and other companies were motivated by the huge task of building efficient Internet search engines, and so they developed technologies that enabled them to process and store data at that scale. Furthermore, many of these technologies are available for other companies to use via open-source software projects hosted at Apache and elsewhere. These include Hadoop, which divides the work to be done, distributes it and reassembles the results; MapReduce, which performs much of the parallelized work; and Pig, Hive, HBase, YARN and many more. Add to these the rapid advances in memory, storage, networking and computational power, and you have the ability to derive significant competitive advantage from all this data.
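To make the divide, distribute and reassemble idea concrete, here is a minimal single-machine sketch of the MapReduce pattern in plain Python. It is only an illustration, not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are assumptions made up for this example, and a real cluster would run the map and reduce steps on many machines at once.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk could be mapped on a different machine; here we simply loop.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: the "big" data set, already divided into two chunks.
chunks = ["big data is big", "data is a big deal"]
counts = word_count(chunks)
print(counts)
```

The point of the pattern is that map_phase never needs to see more than its own chunk, so the work can be spread across as many machines as there are chunks.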
Many of you are customers of Amazon. Recall how, when you buy a book from the company, you get frequent reminders that people who bought that same book also bought “the following books,” which you might find interesting. By systematically capturing and maintaining all customer transactions, Amazon is able to upsell with precision. And did you know that Google has kept every search request you’ve ever made? Armed with that data, Google demonstrated the power of Big Data in 2009 by accurately predicting the location and date of the outbreak of the winter flu in the United States weeks before the Centers for Disease Control did. It accomplished this remarkable feat by correlating searches for flu-symptom relief products with the documented first outbreaks over the five previous years and then watching for that same kind of search activity in 2009.
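The “people who bought this also bought” idea can be sketched with simple co-purchase counting. This is a toy illustration under stated assumptions, not Amazon's actual method: the function names (co_purchase_counts, also_bought), the book titles and the order data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each ordered pair of items appears in the same order."""
    counts = Counter()
    for basket in orders:
        # Use a set so a title repeated within one order is counted once.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def also_bought(item, orders, top_n=3):
    """Return up to top_n items most often bought together with item."""
    counts = co_purchase_counts(orders)
    related = Counter({b: n for (a, b), n in counts.items() if a == item})
    return [title for title, _ in related.most_common(top_n)]


# Hypothetical transaction history: each inner list is one customer order.
orders = [
    ["Book A", "Book B"],
    ["Book A", "Book B", "Book C"],
    ["Book A", "Book C"],
    ["Book B", "Book D"],
    ["Book A", "Book B"],
]
suggestions = also_bought("Book A", orders)
print(suggestions)
```

With only five orders this fits in a loop; the same counting over a very large transaction history is the kind of job that gets split across many machines.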
“Analytics” is the vehicle by which an organization can transform its data into the kind of business intelligence that gives it a competitive advantage. Big Data Analytics (BDA) is then the use of this vehicle on a grand scale. As this is the “bleeding edge” of Computer Science today, it is often necessary for the data scientist to write computer programs and scripts to perform the statistical analysis required. CTU has recently introduced a Doctor of Computer Science concentration in BDA with the aim of helping its students gain the skills that will make them essential to their employers as data scientists.
Of this I am sure: You will be hearing a lot about Big Data for a long time to come.
Image Credit: Flickr/Daniel X. O’Neil