Methods in Biostatistics with R
A Rigorous and Practical Treatment of Biostatistics Foundations using R
About the Book
Biostatistics is easy to teach poorly. Too often, books focus on methodology with no emphasis on programming and practical implementation. In contrast, books focused on R programming and visualization rarely discuss the foundational topics that data analysts need to make decisions, evaluate analytic tools, and prepare for new and unforeseen challenges. This book bridges that divide, which had no reason to exist in the first place. The book is unapologetic about its focus on biostatistics, that is, statistics with biological, public health, and medical applications, though we think it could also be used successfully in large statistics and data science courses. Data and code can be downloaded here: https://github.com/muschellij2/biostatmethods
Table of Contents
1 Introduction
1.1 Biostatistics
1.2 Mathematical prerequisites
1.3 R
2 Introduction to R
2.1 R and RStudio
2.2 Reading R code
2.3 R Syntax and Jargon
2.4 Objects
2.5 Assignment
2.6 Data Types
2.7 Data Containers
2.8 Logical Operations
2.9 Subsetting
2.10 Reassignment
2.11 Libraries and Packages
2.12 dplyr, ggplot2, and the tidyverse
2.13 Problems
3 Probability, random variables, distributions
3.1 Experiments
3.2 An intuitive introduction to the bootstrap
3.3 Probability
3.4 Probability calculus
3.5 Sampling in R
3.6 Random variables
3.7 Probability mass
3.8 Probability density function
3.9 Cumulative distribution function
3.10 Quantiles
3.11 Problems
3.12 Supplementary R training
4 Mean and Variance
4.1 Mean or expected value
4.2 Sample mean and bias
4.3 Variance, standard deviation, coefficient of variation
4.4 Variance interpretation: Chebyshev’s inequality
4.5 Supplementary R training
4.6 Problems
5 Random vectors, independence, covariance, and sample mean
5.1 Random vectors
5.2 Independent events and variables
5.3 Covariance and correlation
5.4 Variance of sums of variables
5.5 Sample variance
5.6 Mixture of distributions
5.7 Problems
6 Conditional distribution, Bayes’ rule, ROC
6.1 Conditional probabilities
6.2 Bayes rule
6.3 ROC and AUC
6.4 Problems
7 Likelihood
7.1 Likelihood definition and interpretation
7.2 Maximum likelihood
7.3 Interpreting likelihood ratios
7.4 Likelihood for multiple parameters
7.5 Profile likelihood
7.6 Problems
8 Data visualization
8.1 Standard visualization tools
8.2 Problems
9 Approximation results and confidence intervals
9.1 Limits
9.2 Law of Large Numbers (LLN)
9.3 Central Limit Theorem (CLT)
9.4 Confidence intervals
9.5 Problems
10 The χ² and t distributions
10.1 The χ² distribution
10.2 Confidence intervals for the variance of a Normal
10.3 Student’s t distribution
10.4 Confidence intervals for Normal means
10.5 Problems
11 t and F tests
11.1 Independent group t confidence intervals
11.2 t intervals for unequal variances
11.3 t-tests and confidence intervals in R
11.4 The F distribution
11.5 Confidence intervals and testing for variance ratios of Normal distributions
11.6 Problems
12 Data Resampling Techniques
12.1 The jackknife
12.2 Bootstrap
12.3 Problems
13 Taking logs of data
13.1 Brief review
13.2 Taking logs of data
13.3 Interpreting logged data
13.4 Inference for the Geometric Mean
13.5 Summary
13.6 Problems
14 Interval estimation for binomial probabilities
14.1 Introduction
14.2 The Wald interval
14.3 Bayesian intervals
14.4 Connections with the Agresti/Coull interval
14.5 Conducting Bayesian inference
14.6 The exact, Clopper-Pearson method
14.7 Confidence intervals in R
14.8 Problems
15 Building a Figure in ggplot2
15.1 The qplot function
15.2 The ggplot function
15.3 Making plots better
15.4 Make the Axes/Labels Bigger
15.5 Make the Labels to be full names
15.6 Making a better legend
15.7 Legend INSIDE the plot
15.8 Saving figures: devices
15.9 Interactive graphics with one function
15.10 Conclusions
15.11 Problems
16 Hypothesis testing
16.1 Introduction
16.2 General hypothesis tests
16.3 Connection with confidence intervals
16.4 Data Example
16.5 P-values
16.6 Discussion
16.7 Problems
17 Power
17.1 Introduction
17.2 Standard normal power calculations
17.3 Power for the t test
17.4 Discussion
17.5 Problems
18 R Programming in the Tidyverse
18.1 Data objects in the tidyverse: tibbles
18.2 dplyr: pliers for manipulating data
18.3 Grouping data
18.4 Summarizing grouped data
18.5 Merging Data Sets
18.6 Left Join
18.7 Right Join
18.8 Right Join: Switching arguments
18.9 Full Join
18.10 Reshaping Data Sets
18.11 Recoding Variables
18.12 Cleaning strings: the stringr package
18.13 Problems
19 Sample size calculations
19.1 Introduction
19.2 Sample size calculation for continuous data
19.3 Sample size calculation for binary data
19.4 Sample size calculations using exact tests
19.5 Sample size calculation with preliminary data
19.6 Problems
20 References