Question
Astronomers collect and generate petabytes of data. The volume of data is currently growing at a rate of 0.5 PB per year. The projections indicate that by 2020, more than 60 PB of archived data will be available to astronomers. Read the paper, "How Will Astronomy Archives Survive the Data Tsunami?", located in Week 6 of the online course shell.
Write a four to five (4-5) page paper in which you:
Your assignment must follow these formatting requirements:
The specific course learning outcomes associated with this assignment are:
Explanation / Answer
The whole idea of big data is still relatively new, and most discussions or presentations on the subject start off with a definition of what big data really is. Definitions are certainly helpful, especially when a topic is still relatively new, so we'll start there ourselves. Big data is, very simply, a collection of data sets so large and complex that your legacy IT systems cannot handle them. When organizations reach the point where the volume, velocity, variety, and veracity of their data exceed their storage or computing capacity, there are big challenges that need to be addressed. You know you have a big data challenge when your traditional data management systems and analysis tools are overwhelmed and it becomes difficult to process your data using the analytic or visualization tools you currently have. Meeting the big data challenge often requires advanced algorithms, infrastructure, and frameworks, and it can all seem very daunting for those just starting out. But the reality for information-age organizations is that our success is throttled by our ability to rapidly and comprehensively navigate the big data universe.
But of course, big data is relative. In the end, big data by itself has no value; it's meaningless. It's what you do with the data that matters most. Today's big data discussion often centers on how to target advertisements or customize a user experience, which makes sense given that growth in the marketplace is so closely tied to the fact that how we interact with the physical world increasingly depends on the pervasive use of mobile devices connected to the world through sensors. Having the ability to leverage our rich history of data and combine it with the new data we are receiving is a huge asset in making our missions successful.
If you are still trying to wrap your head around the difference between petabytes, exabytes, zettabytes, and yottabytes, check out the overview presentation titled "What is Big Data and why does it matter" by Tom Soderstrom, the Chief Technology Officer for IT at NASA JPL.
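For a quick sense of the jumps between those units, here is a minimal Python sketch (the prefixes are decimal, SI-style, so each step is a factor of 1,000; the 60 PB figure comes from the question above):

# Byte-scale units with decimal (SI) prefixes: each step is a factor of 1,000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

for power, unit in enumerate(UNITS):
    print(f"1 {unit} = 10^{power * 3} bytes")

# The 60 PB archive projected for 2020 (from the question above), in bytes:
archive_bytes = 60 * 10**15
print(f"60 PB = {archive_bytes:.2e} bytes")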
NASA's Big Data Challenge
NASA's big data challenge is not just a terrestrial one, and it goes beyond the stereotypical challenge. Many of our "big data" sets are described by significant metadata, but on a scale that challenges current and future data management practice. We regularly engage in missions where data streams continually from spacecraft on Earth and in space, faster than we can store, manage, and interpret it. NASA has two very different types of spacecraft: deep space spacecraft that send back data on the order of MB/s, and earth orbiters that can send back data on the order of GB/s. In our current missions, data is transferred by radio frequency, which is relatively slow. In the future, NASA will employ technologies such as optical (laser) communication to increase downlink rates, bringing roughly a 1000x increase in the volume of data. This is much more than we can handle today, and it is what we are starting to prepare for now. We are planning missions today that will easily stream more than 24 TB a day. That's roughly 2.4 times the entire Library of Congress, every day, for one mission.
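To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The 1 GB/s RF rate is an illustrative assumption (the paragraph gives only orders of magnitude), and the ~10 TB Library of Congress figure is the commonly cited estimate under which 24 TB works out to roughly 2.4 libraries:

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Illustrative assumption: an earth orbiter downlinking at 1 GB/s over RF.
rf_rate_gb_per_s = 1.0
rf_daily_tb = rf_rate_gb_per_s * SECONDS_PER_DAY / 1_000  # GB -> TB
print(f"RF downlink:      {rf_daily_tb:,.1f} TB/day")     # ~86.4 TB/day

# A 1000x optical (laser) improvement on the same link:
optical_daily_tb = rf_daily_tb * 1_000
print(f"Optical downlink: {optical_daily_tb / 1_000:,.1f} PB/day")  # TB -> PB

# The planned 24 TB/day mission, in Library-of-Congress units
# (assuming ~10 TB for the digitized print collection):
loc_tb = 10
print(f"24 TB/day is about {24 / loc_tb:.1f}x the Library of Congress, daily")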
It's still very expensive to transfer even one bit down from a spacecraft, so we want to make sure we collect what is most important. Once the data makes its way to our data centers, storing, managing, visualizing, and analyzing it becomes an issue. To give you an idea of what we are dealing with, the Climate Change data repositories alone are projected to grow to nearly 350 petabytes by 2030. For scale, 5 PB is equivalent to the total number of letters delivered by the US Postal Service in one year!
One great example of the unique challenge we face with managing space data is just starting to be demonstrated by the Australian Square Kilometer Array Pathfinder (ASKAP) project, a large array made up of 36 antennas, each 12 meters in diameter, spread out over 4,000 square meters but working together as a single instrument to unlock the mysteries of our universe. The array, which will officially be turned on and open for business tomorrow, Friday, October 5, 2012, is able to survey the whole sky very quickly and offers an ability to perform research that could never have been done before. Check out this great time-lapse video showing off the new telescope's capabilities! The array is a precursor for the larger Square Kilometre Array telescope, which will open in 2016 and will combine the signals received from thousands of small antennas spread over a distance of more than 3,000 km. When operational, as much as 700 TB/second of data will flow from the Square Kilometre Array! This is a big data challenge.
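Taking the 700 TB/second figure at face value, a short sketch shows the scale (a back-of-the-envelope check, not an instrument specification); it also compares against the 60 PB archive projected in the question:

SECONDS_PER_DAY = 86_400
rate_tb_per_s = 700

daily_eb = rate_tb_per_s * SECONDS_PER_DAY / 1_000_000  # TB -> EB
print(f"{daily_eb:,.1f} EB/day")  # ~60.5 EB/day

# Compared with the ~60 PB archive projected for 2020:
archives_per_day = daily_eb * 1_000 / 60  # EB -> PB, then per 60 PB archive
print(f"About {archives_per_day:,.0f} such archives' worth of data, per day")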
And of course, spacecraft are not the only source of our data, thanks to an ever-growing supply of mobile devices, low-cost sensors, and online platforms. As an article in Harvard Business Review this month put it, "each of us is now a walking data generator." The scale of the big data challenge for NASA, as for many organizations, is daunting.
As you can probably imagine, increasing data volumes are not our only challenge. As our wealth of data increases, the challenges of indexing, searching, transferring, and so on all increase exponentially as well. Additionally, the increasing complexity of instruments and algorithms, the increasing rate of technology refresh, and the shrinking budget environment all play a significant role in our approach. Fortunately, the entire federal government has turned its attention toward the growing challenge. In March 2012, the Obama administration announced the Big Data Research and Development Initiative to "greatly improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data." The goal is to transform government's ability to use big data for scientific discovery, environmental and biomedical research, education, and national security.