Monday, 23 June 2014

[J164.Ebook] Free Download Doing Data Science: Straight Talk from the Frontline, by Cathy O'Neil, Rachel Schutt

Simply connect to the internet to get the book Doing Data Science: Straight Talk from the Frontline, by Cathy O'Neil and Rachel Schutt. Reading a book no longer means carrying a printed copy: modern technology lets you read the soft file instead, and the content is the same. You do not need to search for the book in the conventional way, and you may not have the time to do so anyway. That is why we offer what we consider the best way to get Doing Data Science right now.

This page can help you choose the best book to read today: Doing Data Science: Straight Talk from the Frontline, by Cathy O'Neil and Rachel Schutt. The books available here are published not only in this country but in other countries as well. We suggest Doing Data Science as one of your reading materials; it is among the best books collected on this site. Browse the site and you will find many other titles on offer.

Getting Doing Data Science: Straight Talk from the Frontline is quite easy. You do not need to visit several places and spend time just to locate the book, and even then you might not always find the copy you want. Here, simply by searching for Doing Data Science, you get a list of the books you are actually looking for. Often several books are shown, and those titles may surprise you as much as this one.

Are you interested mainly in books like Doing Data Science: Straight Talk from the Frontline? If you are still unsure which book to get, this website is the place to look. Today you may need Doing Data Science as one of the most frequently recommended and most needed reference books; at other times you can enjoy other titles. It depends on your needs. Still, we always suggest that Doing Data Science can be a great investment for your life.

Although we are talking about Doing Data Science: Straight Talk from the Frontline, you will not find printed books here. The collection is offered as soft files, which has a few advantages. First, you do not need to carry the book everywhere and fill your bag with it. Because the book is a soft file, you can save it on your device, then open the device anywhere and read the book. So take advantage of this soft-file edition of Doing Data Science by downloading it from the link given on this site.

Doing Data Science: Straight Talk from the Frontline, by Cathy O'Neil, Rachel Schutt

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.

In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

Topics include:

  • Statistical inference, exploratory data analysis, and the data science process
  • Algorithms
  • Spam filters, Naive Bayes, and data wrangling
  • Logistic regression
  • Financial modeling
  • Recommendation engines and causality
  • Data visualization
  • Social networks and data journalism
  • Data engineering, MapReduce, Pregel, and Hadoop

Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

  • Sales Rank: #79136 in Books
  • Brand: O'Reilly Media
  • Published on: 2013-11-03
  • Original language: English
  • Number of items: 1
  • Dimensions: 9.00" h x .78" w x 6.00" l, 1.13 pounds
  • Binding: Paperback
  • 408 pages
Features
  • Used Book in Good Condition

Review
"Every once in a while a single book comes to crystallize a new discipline. If books still have this power in the era of electronic media, "Doing Data Science: Straight Talk from the Frontline" by Rachel Schutt and Cathy O'Neil: O'Reilly, 2013 might just be the book that defines data science." -- Joseph Rickert, Revolutions Blog

"I enjoyed Rachel and Cathy’s book, it’s readable, informative, and like no other book I’ve read on the topic of statistics or data science."
—Andrew Gelman
Professor of statistics and political science, and director of the Applied Statistics Center at Columbia University

"I got a lot out of Doing Data Science, finding the chapter organization on business problem specification, analytics formulation, data access/wrangling, and computer code to be very helpful in understanding DS solutions."
—Steve Miller
Co-founder, OpenBI, LLC, a Chicago-based business intelligence services firm

About the Author

Cathy O’Neil earned a Ph.D. in math from Harvard, was postdoc at the MIT math department, and a professor at Barnard College where she published a number of research papers in arithmetic algebraic geometry. She then chucked it and switched over to the private sector. She worked as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She is currently a data scientist on the New York start-up scene, writes a blog at mathbabe.org, and is involved with Occupy Wall Street.

Rachel Schutt is the Senior Vice President for Data Science at News Corp. She earned a PhD in Statistics from Columbia University, and was a statistician at Google Research for several years. She is an adjunct professor in Columbia’s Department of Statistics and a founding member of the Education Committee for the Institute for Data Sciences and Engineering at Columbia. She holds several pending patents based on her work at Google, where she helped build user-facing products by prototyping algorithms and building models to understand user behavior. She has a master's degree in mathematics from NYU, and a master's degree in Engineering-Economic Systems and Operations Research from Stanford University. Her undergraduate degree is in Honors Mathematics from the University of Michigan.

Most helpful customer reviews

131 of 133 people found the following review helpful.
More breadth than depth
By Carsten Jørgensen
Book review - Doing Data Science by O'Neil and Schutt, O'Reilly Media.

What is data science? The book Doing Data Science not only explains what data science is but also provides a broad overview of methods and techniques that one must master in order to call oneself a data scientist. The book is based on a course about data science given at Columbia University. However, it should not be considered a textbook about data science but rather a broad introduction to a number of topics in data science.

In the spring of 2013 I followed two Coursera courses. One about the statistical programming language R and one on Data Analysis. I had for some time been looking for a book that could be used as a follow-up reading on topics in data science. This was the reason I picked up "Doing Data Science".

The book begins with a chapter on what data science is all about, followed by four chapters on topics like statistical inference, exploratory data analysis, various machine learning algorithms, linear and logistic regression, and Naive Bayes. I have a background in both mathematics and statistics and I was able to understand these chapters, but the material is covered in such broad terms that I find it hard to believe that a newcomer to these topics will understand or gain much knowledge from reading these chapters. Basic math is presented about the models, but without some kind of detailed explanation one cannot develop any deeper intuition for the approach explained.

The best parts of the book are definitely chapters 6 to 8 and 10. Here we find interesting discussions of data science applied to financial modeling, extracting information from data, and social networks. I really enjoyed the examination of time-stamped data, the Kaggle Model, feature selection, and case-attribute data versus social network data. The math behind these topics was, however, once again explained quite superficially. Centrality measures are central to social network analysis, but it is very hard to develop intuition for these measures without a more detailed explanation of the underlying math. These chapters contain lots of useful resources for finding additional information about the discussed topics.

Data visualization is an integral part of data science for communicating results. Beginners in the field of data science need concrete and easy-to-follow instructions on how to get started with visualization. Unfortunately the book focuses more on the use of data visualization in modern art projects. The content is simply too abstract for beginners to learn about the usage of visualization in data science.

When I was browsing the book before actually buying it I was kind of thrilled to see that it covered topics like causality and epidemiology, topics that I did not find covered in any other book about data science. However, the chapter about epidemiology is not about using data science in epidemiology but 'just' about using data science to evaluate the methods used in epidemiology. Likewise there seems to be no link between data science and causality. I later discovered that the authors used an entire blog post ([...]) to explain why causality was part of the university course underlying the book. This material or parts of it should have made it into the book. I am still not convinced that causality is a topic in data science.

There are several places in which the book assumes the reader has knowledge of US government structure and organizations. Examples include page 292, which discusses US health care databases, and page 298, where the FDA is mentioned without further introduction or explanation of what the FDA is.

A book that contains programming examples should always make the code available for download. Typing in the code yourself is simply a waste of time. It is possible to download some of the datasets used in the book through GitHub, but the code does not seem to be available. I also own the electronic version of the book and I tried to copy-paste some of the examples from the e-book, but several of the code examples were not proofread or tested prior to publication. The sample code misses references to required R libraries or refers to folder structures on some local Columbia University computer. The companion datasets that can be downloaded from GitHub consist of a number of Excel files. The R sample code uses the gdata package to load these Excel files into R for further analysis. It took me quite some time to figure out why this process didn't work on a Windows computer. The gdata package requires Perl to be installed on the computer, and this is not default software on Windows. In my opinion one should always publish data in a simple format, e.g. csv files, and definitely not proprietary formats like xls for Excel files.

Data Science is both science and a lot of practical experience. I guess the title of the book Doing Data Science tries to capture that. You need to do data science in order to learn it. The covered topics are interesting but the material is more breadth than depth. Luckily there are lots of useful links and resources to additional materials. Personally I would prefer more details about the actual data science topics, such as extracting meaning from data and social network analysis, and less focus on math. The book already requires some knowledge of math, statistics and programming, so why not presume that the reader has the background knowledge and dive straight into the data science discussions?

I really like the idea of having a lot of different people present various topics in data science, and the book is well written and contains lots of useful resources for further studies of data science. I will recommend the book to people new to the subject, but be aware of the fact that source code is not available and that is a major drawback.

Disclosure: I review for the O'Reilly Reader Review Program and I want to be transparent about my reviews, so you should know that I received a free copy of this ebook in exchange for my review.

56 of 58 people found the following review helpful.
Doing Data Science Worth a Look
By Dan D. Gutierrez
I found this book to be a very odd bird indeed. It is one book you can read from back cover to front cover and not be at a disadvantage. This is because the book is really just a collection of presentations made by various people to a class taught by the primary author Rachel Schutt at Columbia University in the Fall of 2012 – Introduction to Data Science. It wasn’t entirely clear what content Schutt was directly responsible for since only some of the chapters indicate who the contributors were (one of the chapters was contributed by a group of her students!). The co-author, Cathy O’Neil, I’ve encountered before as an outspoken blogger going by the name “mathbabe” but it wasn’t specifically stated how she became part of the book project, other than to say she was one of the students in Schutt’s class. Chapter 6 was partly written by O’Neil.

Both Schutt and O’Neil hold Ph.D.s in fields appropriate to data science, but the book was not “written” by the two; rather, they seem to have performed some kind of editing function with the materials submitted by each contributor and added commentaries of their own. As a result, the book is a hodgepodge of anecdotes, factoids, R code snippets, plots, and mathematics, all from the in-class presentations. I enjoy seeing math in data science books, but the equations in this book were sort of just floating there, requiring the reader to explore further at another time.

Although I have issues with the book as it is not any sort of text for the field, I did enjoy reading it with a number of “Ah, I didn’t know that!” moments. Schutt’s credentials in data science are considerable, having worked at Google for a few years around the same time that “data science” was growing up in Silicon Valley. As a result the book has many memorable anecdotes about the early days of the data science industry, and observations about what makes big data tick. I enjoyed the story about the Google software engineer who accidentally deleted 10 petabytes of data, and I think my favorite quote from the book is from the student’s chapter 15:

Kaggle competitions could be described as the dick-measuring contests of data science.

With contributors’ chapters on statistical inference, machine learning algorithms, logistic regression, financial modeling, recommendation engines, data visualization, Hadoop, MapReduce, and more, I’d say the book is worth a read, not necessarily as a source for learning data science but more as a high-level guide and short historical account of this young industry. You get to learn about the people, companies, and technologies that have collectively built the data science arena, and you’ll be better for it, especially if you are working to become a data scientist yourself.

77 of 92 people found the following review helpful.
A spoonful of sugar...
By Dimitri Shvorob
... helps the medicine go down, as Mary Poppins used to say. An IT-focused publisher, O'Reilly has twice before used the "book as collection of chapters by different contributors" formula in its foray into the attractive "data" niche, with such titles as "Beautiful data" and "Bad data". "Doing data science" - by the way, I prefer Hastie and Tibshirani's "statistical learning" to the fuzzy and grandiose "data science" - follows the same approach, but, with its subject matter being closer to the academe, the company enlisted two young PhDs to steer the collaborative effort. Rachel Schutt took the lead as author and editor, and, assisted by Cathy O'Neil, produced an engaging, informal - you don't often see "science" in the title and "huge-ass" in the text - yet sufficiently technical to be hands-on, sequence-of-vignettes-styled book. Imagine a mash-up of a magazine article and a textbook. Neither part may be best-in-class, but their combination makes for a "unique selling proposition".

Well, maybe not a textbook. Most textbooks are carefully written and carefully checked. In contrast, when I see "Doing data science" introduce the ROC curve in three places, one of which translates the "O" as "operator", I can guess that this is a copy-paste of papers by three contributors. When Dr. O'Neil casually redefines an English word ("causal") to avoid rewriting a couple of sentences, or pronounces, on page 159, that "priors reduce degrees of freedom" - this is painfully meaningless, and neither term is defined, only name-checked - I suspect that she knows better, but just did not feel like spending more time on her half-chapter. Neither author speaks of their own projects - if this is the "frontline", then it's other soldiers' "trenches" that we are visiting. The occasional code listings are borrowed as well, thrown in without editing or comments. In this last regard, "Doing data science" lags far behind the book that seems to have informed its choice of topics, Peter Harrington's "Machine learning in action". (That's one suggestion - and if you want a good, accessible textbook, "Introduction to statistical learning" by James et al. is another).

None of it is going to matter to the book's target audience. "Doing data science" is aimed at beginners - and is bound to be interesting and useful to thousands of keen undergrads and adult learners.
