Ben Fellows

Cybersecurity Professional


Math (ALEKS)

General / Algebra

  • Greatest Common Factor: no common factors? → GCF is 1; don’t stop at the first common factor you find – the GCF may be a multiple of it (e.g. 9, not 3, for 18 and 27)
  • Distributive Property: a(b + c) = ab + ac
    • FOIL (first, outer, inner, last): (a + b)(c + d) = ac + ad + bc + bd
  • Changing signs of numerator & denominator in a fraction: multiplying by -1/-1 (which equals 1) lets you change both signs
  • Cross products and proportions: If two ratios form a proportion then their cross products are equal.
    • if a/b = c/d, then a * d = b * c
  • When multiplying fractions, multiply the numerators and the denominators, e.g.:
    • a/b x c/d = (a x c)/(b x d)
  • When dividing fractions:
    • a/b ÷ c/d = a/b x d/c
  • Prime factorization method to find the least common denominator → find the prime factors of each number, then take each prime raised to the greatest power in which it appears and multiply those together (see the Python sketch after this list). E.g. → find the LCD of 4, 6, and 8 → prime factors of 4 are 2 and 2; of 6, 3 and 2; of 8, 2, 2, and 2. The greatest number of 2s is 3 (for 8); the greatest number of 3s is 1 (for 6). Thus 2 x 2 x 2 x 3 = 24, which is the least common denominator of 4, 6, and 8.
  • Greatest Common Factor of 2 multivariate monomials → find the GCF of the coefficients, then take the lowest power of each variable
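
A minimal Python sketch of the prime-factorization LCD method above; the helper names are my own, not from the notes:

    from collections import Counter

    def prime_factors(n):
        # Count the prime factors of n by trial division, e.g. 8 -> {2: 3}
        factors, d = Counter(), 2
        while d * d <= n:
            while n % d == 0:
                factors[d] += 1
                n //= d
            d += 1
        if n > 1:
            factors[n] += 1
        return factors

    def lcd(*nums):
        # Take each prime at the greatest power in which it appears, then multiply
        highest = Counter()
        for n in nums:
            highest |= prime_factors(n)   # Counter union keeps the larger count per prime
        result = 1
        for prime, exp in highest.items():
            result *= prime ** exp
        return result

    print(lcd(4, 6, 8))   # 24, matching the worked example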

Rules for exponents

  • Product Rule → a^m x a^n = a^(m+n)
  • Power of a Product Rule → a^m x b^m = (a x b)^m
  • Quotient Rule → x^r / x^s = x^(r–s)
  • Power of a Power → (x^r)^s = x^(rs)
  • Power of a Quotient → (x/y)^r = x^r / y^r
  • Negative Exponent → a^(–b) = 1/a^b
  • Zero exponent –> x0 = 1 if x ≠ 0

Rules for square roots

  • Quotient property → √(a/b) = √a / √b
  • Product property → √(a x b) = √a x √b

Radicals

  • x^(m/n) = ⁿ√(x^m) (the nth root of x^m)

Logarithms

  • log_a(c) = b → a^b = c (valid if c is positive and a is positive and ≠ 1)
  • Logarithm of a product → log_a(MN) = log_a(M) + log_a(N)
  • Logarithm of a quotient → log_a(M/N) = log_a(M) – log_a(N)
  • Logarithm of a power → log_a(M^p) = p · log_a(M)
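
The product and power rules are easy to sanity-check numerically in Python (the values here are arbitrary):

    import math

    M, N, a, p = 8.0, 4.0, 2.0, 3.0   # arbitrary positive values, base a ≠ 1
    # Product rule: log_a(MN) = log_a(M) + log_a(N)
    print(math.log(M * N, a), math.log(M, a) + math.log(N, a))   # ≈ 5.0 for both
    # Power rule: log_a(M^p) = p * log_a(M)
    print(math.log(M ** p, a), p * math.log(M, a))               # ≈ 9.0 for both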

Linear Equations

  • Slope intercept form of an equation → y = mx + b where m = slope and b = y intercept
  • Slopes of lines →
    • slope = m = rise/run
  • parallel lines have the same slope
  • perpendicular lines → product of the two slopes is -1

Other equations

  • Parabolas →
    • Basic form: y = ax²
    • Vertex form: y = a(x – h)² + k, where the point (h, k) is the vertex of the parabola
    • Quadratic Equations: ax² + bx + c = 0
      • axis of symmetry is x = –b / (2a)
      • solve for y at the axis of symmetry to find the vertex
      • you can use the quadratic formula to solve for x → x = (–b ± √(b² – 4ac)) / (2a) – see the sketch below the list
  • Cubes → y = x³
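
A short Python check of the vertex and quadratic-formula steps above (the coefficients are an arbitrary example):

    import math

    a, b, c = 1, -4, 3                  # y = x² – 4x + 3, arbitrary example
    axis = -b / (2 * a)                 # axis of symmetry: x = -b / (2a)
    vertex = (axis, a * axis**2 + b * axis + c)   # solve for y at the axis
    disc = b**2 - 4 * a * c             # discriminant under the square root
    roots = ((-b + math.sqrt(disc)) / (2 * a),
             (-b - math.sqrt(disc)) / (2 * a))
    print(vertex, roots)                # (2.0, -1.0) (3.0, 1.0)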

Statistics Topics

  • Measures of Center
    • mean = average = Greek letter mu (μ) for a population; x̄ (x-bar) for a sample
    • median = middle value – with an even number of values, take the mean of the middle two
    • mode = data value w/ greatest frequency – a set can have more than one mode, or no mode (all frequencies are 1)
    • weighted mean = some data points contribute more than others
  • Measures of Variability
    • standard deviation = greek letter sigma (σ) for population; roman letter s for a sample
s = √[ ((x₁ – x̄)² + … + (xₙ – x̄)²) / (n – 1) ]
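
All of these measures are in Python’s standard statistics module; a quick sketch with made-up data:

    import statistics as st

    data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample
    print(st.mean(data))              # 5
    print(st.median(data))            # 4.5 (mean of the middle two values)
    print(st.mode(data))              # 4, the most frequent value
    print(st.stdev(data))             # sample s, divides by n - 1
    print(st.pstdev(data))            # population σ, divides by n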

Frequency Polygon

  • Drawing of the classes vs. the frequencies of those classes
  • Each class is represented by its midpoint
  • Find the number of classes and the range of each, then their midpoints – the midpoint is the class “name” on the x-axis
  • y values are the count or “frequency” in each class

Contingency Table

  • Two-way table of frequencies for two categorical variables

Relative frequency = the proportion of observations in the event out of all observations

Financial Topics

Financial Ratios

  • profit margin = net income / sales
  • return on equity = net income / stockholders’ equity
  • asset turnover ratio = net sales / assets
  • current ratio = current assets / current liabilities
  • acid test or quick ratio = (cash & securities, not inventory) / current liabilities
  • debt to assets ratio = debt (owed to others) / assets – or, equivalently, total liabilities / total assets
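
The ratios are all straight division; a minimal sketch with made-up statement figures:

    # Made-up financial statement values, for illustration only
    net_income, sales = 50_000, 400_000
    current_assets, current_liabilities = 120_000, 80_000
    total_liabilities, total_assets = 300_000, 500_000

    print(net_income / sales)                    # profit margin = 0.125
    print(current_assets / current_liabilities)  # current ratio = 1.5
    print(total_liabilities / total_assets)      # debt to assets = 0.6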

Interest and Time Value of Money

  • Interest = Principal x Rate x Time
  • Maturity = Principal + Interest
  • Compound Interest = interest from period 1 is added to the principal, then the new principal is used to calculate interest in period 2, and so on
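
A minimal sketch of the period-by-period compounding described above (the numbers are illustrative):

    principal, rate, periods = 1000.0, 0.05, 3   # illustrative values

    balance = principal
    for _ in range(periods):
        balance += balance * rate     # each period's interest joins the principal
    print(balance)                    # ≈ 1157.63

    # Equivalent closed form: P * (1 + r)^n
    print(principal * (1 + rate) ** periods)     # ≈ 1157.63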

Probability Theory

  • Intersection = B ∩ C = “B and C”
  • Union = B ∪ C = “B or C”
  • Complement = B′ = “not B”
  • Addition formula for probability (holds for any two events) → P(C∪F) = P(C) + P(F) – P(C∩F)
  • Probability complement rule → P(A′) = 1 – P(A)
  • Probability of independent events → P(A∩B) = P(A) x P(B)
  • If mutually exclusive, then P(A∩B) = 0
  • “B occurs or A does not occur (or both)” → B∪A′
    • …if A and B are mutually exclusive then → B∪A′ = A′
  • Conditional probability → probability that an event will occur given another event has occurred
    • …or P(A|B) = P(A∩B)/P(B)
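
A tiny numeric check of the addition, complement, and conditional rules, with made-up probabilities and A, B assumed independent:

    p_a, p_b = 0.30, 0.50              # made-up marginal probabilities
    p_a_and_b = p_a * p_b              # independence: P(A∩B) = P(A) x P(B) = 0.15
    p_a_or_b = p_a + p_b - p_a_and_b   # addition rule: P(A∪B) = 0.65
    p_not_a = 1 - p_a                  # complement rule: P(A′) = 0.70
    p_a_given_b = p_a_and_b / p_b      # conditional: P(A|B) = 0.30
    print(p_a_or_b, p_not_a, p_a_given_b)

Note that P(A|B) comes out equal to P(A), exactly as independence implies.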

Economics Prep Work (Notes on A Concise Guide to Macroeconomics by David A. Moss)

  • Understanding the Macro Economy → Three Pillars = 1) Output, 2) Money, 3) Expectations
    • Output
      • Output is measured by Gross Domestic Product (GDP) which is the market value of all final goods and services produced within a country over a given year.
      • GDP = C + I + G + EX – IM
        • where C is consumption by households
        • I is investment in productive assets
        • G is government spending on goods and services
        • EX is exports
        • IM is imports
      • GDP is the ultimate budget constraint for a nation
        • Balance of Payments statement
      • Theory of Comparative Advantage
        • People and nations should produce where they have a comparative, not absolute, advantage
      • What makes GDP go up? → 1) ↑ labor, 2) ↑ capital, 3) ↑ efficiencies in the use of labor and capital
    • Money
      • Facilitation of exchange
      • Affects key macroeconomic variables: 1) interest rates, 2) exchange rates, 3) aggregate price level.
        • Interest Rates: price of holding money, or cost of investment funds
        • Time value of money: preference for having money now vs. later
    • Expectations

Statistics Prep, Descriptive Statistics

Chapter 2: Data Collection

A data set consists of all the values of all the variables we have chosen to observe. It is often an array with n rows and m columns. Data sets may be univariate (one variable), bivariate (two variables), or multivariate (three or more variables). There are two basic data types: categorical data (categories that are described by labels) and numerical data (meaningful numbers). Numerical data are discrete if the values are integers or can be counted, or continuous if any interval can contain more data values. Nominal measurements are names, ordinal measurements are ranks, interval measurements have meaningful distances between data values, and ratio measurements have meaningful ratios and a zero reference point. Time series data are observations measured at n different points in time or over sequential time intervals, while cross-sectional data are observations among n entities such as individuals, firms, or geographic regions. Among random samples, simple random samples pick items from a list using random numbers, systematic samples take every kth item, cluster samples select geographic regions, and stratified samples take into account known population proportions. Non-random samples include convenience or judgement samples, gaining time but sacrificing randomness. Focus groups give in-depth information. Survey design requires attention to question wording and scale definitions. Survey techniques (mail, phone, interview, web, direct observation) depend on time, budget, and the nature of the questions and are subject to various sources of error.

Chapter 3: Describing Data Visually

For a set of observations on a single numerical variable, a stem-and-leaf plot or a dot plot displays the individual data values, while a frequency distribution classifies the data into classes called bins for a histogram of frequencies for each bin. The number of bins and their limits are matters left to your judgement, though Sturges’ Rule offers advice on the number of bins. The line chart shows values of one or more time series variables plotted against time. A log scale is sometimes used in time series charts when data vary by orders of magnitude. The bar chart or column chart shows a numerical data value for each category of an attribute. However, a bar chart also can be used for a time series. A scatter plot can reveal the association (or lack of association) between two variables X and Y. The pie chart (showing a numerical data value for each category of an attribute if the data values are parts of a whole) is common but should be used with caution. Sometimes a simple table is the best visual display. Creating effective visual displays is an acquired skill. Excel offers a wide range of charts from which to choose. Consider using R (it’s free) if you want to learn more about programming. Deceptive graphs are found frequently in both media and business presentations, and the consumer should be aware of common errors.
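
Sturges’ Rule mentioned above is simple enough to compute directly; a quick sketch:

    import math

    def sturges_bins(n):
        # Sturges' Rule: about 1 + log2(n) bins, rounded up
        return math.ceil(1 + math.log2(n))

    print(sturges_bins(100))   # 8 bins suggested for 100 observations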

Chapter 4: Descriptive Statistics

The mean and median describe a sample’s center and also indicate skewness. The mode is useful for discrete data with a small range. The trimmed mean eliminates extreme values. The geometric mean mitigates high extremes but cannot be used when zeros or negative values are present. The midrange is easy to calculate but is sensitive to extremes. Variability is typically measured by the standard deviation, while relative dispersion is given by the coefficient of variation for nonnegative data. Standardized data reveal outliers or unusual data values, and the Empirical Rule offers a comparison with the normal distribution. In measuring dispersion, the mean absolute deviation (MAD) is easy to understand but lacks nice mathematical properties. Quartiles are meaningful even for fairly small data sets, while percentiles are used only for large data sets. Box plots show the quartiles and data range. The correlation coefficient measures the degree of linearity between two variables. The covariance measures the degree to which two variables move together. We can estimate many common descriptive statistics from grouped data. Sample coefficients of skewness and kurtosis allow more precise inferences about the shape of the population being sampled instead of relying on histograms.

Chapter 5: Probability

The sample space for a random experiment contains all possible outcomes. Simple events in a discrete sample space can be enumerated, while outcomes of a continuous sample space can only be described by a rule. An empirical probability is based on relative frequencies, a classical probability can be deduced from the nature of the experiment, and a subjective probability is based on judgement. An event’s complement is every outcome except the event. The odds are the ratio of an event’s probability to the probability of its complement. The union of two events is all outcomes in either or both, while the intersection is only those events in both. Mutually exclusive events cannot both occur, and collectively exhaustive events cover all possibilities. The conditional probability of an event is its probability given that another event has occurred. Two events are independent if the conditional probability of one is the same as its unconditional probability. The joint probability of independent events is the product of their probabilities. A contingency table is a cross-tabulation of frequencies for two variables with categorical outcomes and can be used to calculate probabilities. A tree visualizes events in a sequential diagram. Bayes’ Theorem shows how to revise a prior probability to obtain a conditional or posterior probability when another event’s occurrence is known. The number of arrangements of sampled items drawn from a population is found with the formula for permutations (if order is important) or combinations (if order does not matter).
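
Python’s math module implements both counting formulas mentioned at the end of the chapter:

    import math

    # Arrangements of 3 items drawn from 10:
    print(math.perm(10, 3))   # 720, order matters: n!/(n - r)!
    print(math.comb(10, 3))   # 120, order ignored: n!/(r!(n - r)!)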

Chapter 6: Discrete Probability Distributions

A random variable assigns a numerical value to each outcome in the sample space of a random process. A discrete random variable has a countable number of distinct values. Probabilities in a discrete probability distribution must be between zero and one and must sum to one. The expected value is the mean of the distribution, measuring center, and its variance is a measure of variability. A known distribution is described by its parameters, which imply its probability distribution function (PDF) and its cumulative distribution function (CDF).

As summarized below, the uniform distribution has two parameters (a, b) that define its domain. The Bernoulli distribution has one parameter (π, the probability of success) and two outcomes (0 or 1). The binomial distribution has two parameters (n, π). It describes the sum of n independent Bernoulli random experiments with constant probability of success. It may be skewed left (π > .50) or right (π < .50) or be symmetric (π = .50) but becomes less skewed as n increases. The Poisson distribution has one parameter (λ, the mean arrival rate). It describes arrivals of independent events per unit of time or space. It is always right-skewed, becoming less so as λ increases. The hypergeometric distribution has three parameters (N, n, s). It is like a binomial, except that sampling of n items is without replacement from a finite population of N items containing s successes. The geometric distribution is a one-parameter model (π, the probability of success) that describes the number of trials until the first success. The figure below shows the relationships between these five discrete models.

Model → parameters; mean; variance; characteristics; PDF (where given):

  • Bernoulli → parameter: π; mean: π; variance: π(1 – π); used to generate the binomial
  • Binomial → parameters: n, π; mean: nπ; variance: nπ(1 – π); skewed right if π < .50, left if π > .50; PDF: P(X = x) = [n!/(x!(n – x)!)] π^x (1 – π)^(n – x)
  • Geometric → parameter: π; mean: 1/π; variance: (1 – π)/π²; always skewed right and leptokurtic
  • Hypergeometric → parameters: N, n, s (where π = s/N); mean: nπ; variance: nπ(1 – π)[(N – n)/(N – 1)]; like the binomial except sampling is without replacement from a finite population
  • Poisson → parameter: λ; mean: λ; variance: λ; always skewed right and leptokurtic; PDF: P(X = x) = λ^x e^(–λ)/x!, for x = 0, 1, 2, 3, …
  • Uniform (discrete) → parameters: a, b; mean: (a + b)/2; variance: [(b – a + 1)² – 1]/12; always symmetric and platykurtic

Personal Notes

  • PDF → P(X = x)
  • CDF → P(X ≤ x)
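
A sketch of a discrete PDF and CDF for the binomial model above, using only the standard library:

    from math import comb

    def binom_pdf(x, n, p):
        # P(X = x) = nCx * p^x * (1 - p)^(n - x)
        return comb(n, x) * p**x * (1 - p)**(n - x)

    def binom_cdf(x, n, p):
        # P(X <= x): sum the PDF over 0..x
        return sum(binom_pdf(k, n, p) for k in range(x + 1))

    print(binom_pdf(2, 10, 0.5))   # ≈ 0.0439
    print(binom_cdf(2, 10, 0.5))   # ≈ 0.0547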

Chapter 7: Continuous Probability Distributions

  • 7.1: Continuous Probability Distributions
    • Continuous Random Variables → arises from measuring something such as the waiting time until the next customer arrives.
      • Can have non-integer values
      • Probabilities are described as areas under a curve called the probability density function (PDF)
      • Intervals like P(53.5 ≤ X ≤ 54.5)
      • For a continuous random variable, the PDF is an equation that shows the height of the curve f(x) at each possible value of X.
      • CDF is denoted F(x) and shows P(X ≤ x), the cumulative area to the left of a given value of X.
      • CDF is useful for probabilities, while the PDF reveals the shape of the distribution.
    • Probabilities as areas
      • With discrete random variables we take sums of prob. over groups of points. But continuous prob. functions are smooth curves, so the area at any point would be zero. We speak of areas under curves. In calculus terms → P(a < X < b) is the integral of the probability density function f(x) over the interval from a to b.
      • Area under any PDF must be 1
    • Expected Value and Variance
      • The mean and variance of a continuous random variable are analogous to E(X) and Var(X) for a discrete random variable, but with an integral in place of the summation sign. Integrals are taken over all X-values.
      • Continuous: E(X) = μ = ∫ x·f(x) dx and Var(X) = σ² = ∫ (x – μ)²·f(x) dx, integrating over –∞ < x < +∞
      • Discrete: E(X) = μ = Σ x·P(x) and Var(X) = σ² = Σ (x – μ)²·P(x), summing over all x
  • 7.2: Uniform Continuous Distribution
    • Uniform Continuous Distribution → If X is a random variable that is uniformly distributed, then its PDF has a constant height and CDF is a straight line.
    • Only used when you think no X value is more likely than any other.
  • 7.3: Normal Distribution
    • Always symmetric, bell-shaped, measured on a continuous scale; has a clear center, only one peak, and tapering tails
  • 7.4: Standard Normal Distribution
    • If X is normally distributed N(μ, σ), the standardized variable Z has a standard normal distribution. Its mean is 0 and its standard deviation is 1, denoted N(0, 1)
    • z = (x – μ)/σ
    • Standard Normal Distribution
      Parameters μ = population mean
      σ = pop standard deviation
      PDF f(z) = (1/√(2π)) e^(–z²/2), where z = (x – μ)/σ
      Domain -∞ < z < +∞
      Mean 0
      Standard Deviation 1
      Shape Symmetric, mesokurtic, and bell-shaped
    • The normal distribution shows that z = 1.96 (roughly the 2-sigma interval) captures a central 95% area under the curve (rounded to 2, this is the basis for the empirical rule)
    • Inverse Normal → solving for percentiles → use the z-score equation to solve for x
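    • A quick sketch of the standardizing and inverse-normal steps with Python’s statistics.NormalDist (parameter values are illustrative):

      from statistics import NormalDist

      nd = NormalDist(mu=100, sigma=15)    # illustrative parameters
      print((130 - 100) / 15)              # z = (x – μ)/σ = 2.0
      print(nd.cdf(130))                   # P(X ≤ 130) ≈ 0.9772
      print(nd.inv_cdf(0.95))              # 95th percentile ≈ 124.7 (inverse normal)
      print(NormalDist().inv_cdf(0.975))   # z for a central 95% area ≈ 1.96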
  • 7.5: Normal Approximations
    • Normal approximation to the binomial
      • When nπ ≥ 10 and n(1 – π) ≥ 10 → use the normal approximation to the binomial. Set the normal μ and σ to the binomial mean and standard deviation:
      • μ = nπ and σ = √(nπ(1 – π))
    • Normal approximation to the Poisson
      • Works when λ is fairly large (greater than 20)
      • Set the normal μ and σ to the Poisson mean and standard deviation:
      • μ = λ and σ = √λ
  • 7.6: Exponential Distribution
    • Focuses on the waiting time until the next event (where arrivals follow a Poisson process), a continuous variable
    • Exponential Distribution
      Parameter λ = mean arrival rate per unit of time or space (same as Poisson mean)
      PDF f(x) = λe^(–λx)
      CDF P(X ≤ x) = 1 – e^(–λx)
      Domain X ≥ 0
      Mean 1/λ
      Standard Deviation 1/λ
      Shape Always right skewed
    • Exponential waiting times are often called mean time between events (MTBE)
      • MTBE = 1/λ
      • 1/MTBE = λ
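    • A sketch of the CDF and MTBE relationship with an illustrative arrival rate:

      import math

      lam = 2.0                     # illustrative: 2 arrivals per unit of time
      mtbe = 1 / lam                # mean time between events = 0.5
      p = 1 - math.exp(-lam * 1.0)  # CDF: P(X ≤ 1) = 1 – e^(–λ·1) ≈ 0.8647
      print(mtbe, p)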
  • 7.7: Triangular distribution
    • Useful for “what-if” simulations

Chapter 8: Sampling Distributions and Estimation

  • 8.1: Sampling and Estimation
    • sampling variation → a statistic’s value varies from sample to sample – variation is inevitable, but there is a tendency for the sample means to be close to the population mean
    • This is the basis for statistical estimation
    • Can make inferences about the population based on the behavior of the sample mean and other statistical estimators, taking into account 4 factors:
      1. sample variation (uncontrollable)
      2. population variation (uncontrollable)
      3. sample size (controllable)
      4. desired confidence in the estimate (controllable)
    • Estimator → a statistic derived from a sample to infer the value of a population parameter. An estimate is the value of the estimator in a particular sample.
    • Sampling Error → difference between an estimate and the corresponding population parameter → x̄ – μ
    • Properties of estimators
      • Bias → the difference between the expected value of the estimator and the true parameter → E(X̄) – μ
      • sampling error is random, bias is systematic
      • Efficiency → the variance of the estimator’s sampling distribution
      • preference for the minimum variance unbiased estimator (MVUE)
      • Consistency → a consistent estimator converges toward the parameter being estimated as the sample size increases
  • 8.2: Central Limit Theorem
    • Sampling distribution of an estimator is the probability dist. of all values of the statistic calculated from all possible random samples of size n.
    • standard error of the mean → the sampling error of the sample mean is described by its standard deviation:
    • σx̄ = σ/√n
    • If population is normal then the sample mean has a normal dist. centered at μ and a standard error of σ/√n
    • Central Limit Theorem
      • Allows us to approximate the shape of the sampling distribution of X̄ when we don’t know what the population distribution looks like.
      • Even if the population is not normal, if the sample size is large enough the sample mean will have approx. a normal dist.
      • The sampling distribution is centered at μ with a standard error of σ/√n
      • A common rule of thumb is n ≥ 30
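      • A minimal simulation sketch of the CLT and the standard error (purely illustrative):

        import random, statistics as st

        # Population: uniform on [0, 100], so σ = 100/√12 ≈ 28.87 (not normal)
        n = 36
        means = [st.mean(random.uniform(0, 100) for _ in range(n))
                 for _ in range(10_000)]
        print(st.mean(means))    # ≈ 50, centered at the population mean
        print(st.stdev(means))   # ≈ 28.87/√36 ≈ 4.81, the standard error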
  • 8.3: Confidence interval for a mean μ with known σ
    • A sample mean x̄ calculated from a random sample is a point estimate of the unknown pop mean μ – we indicate our uncertainty about the estimate using interval estimates. We construct a confidence interval for the unknown mean μ by adding or subtracting a margin of error from x̄. The confidence level is expressed as a percentage.
    • x̄ ± z_(α/2) σ/√n, where σ/√n is the standard error of the mean
    • Greater confidence implies a loss of precision – a higher confidence level widens the interval, allowing a wider range of plausible values for μ
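    • A sketch of this interval using statistics.NormalDist for the z value (the data values are made up):

      from statistics import NormalDist
      from math import sqrt

      xbar, sigma, n = 25.0, 4.0, 36       # made-up sample mean, known σ, and n
      z = NormalDist().inv_cdf(0.975)      # z for 95% confidence ≈ 1.96
      margin = z * sigma / sqrt(n)         # margin of error
      print(xbar - margin, xbar + margin)  # ≈ 23.69, 26.31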
  • 8.4: Confidence interval for a mean μ with unknown σ
    • Use the Student’s t distribution, not the normal z distribution.
    • x̄ ± t_(α/2) s/√n, where s/√n is the estimated standard error of the mean
    • the t distribution has a slightly lower peak than the normal distribution, and thicker tails
    • degrees of freedom (abbreviated d.f.) → can be calculated once we know the sample size → sample size minus 1 → d.f. = n – 1
    • for a given confidence level, t is always larger than z (the gap shrinks as n grows)
  • 8.5: Confidence interval for a proportion (π)
    • CLT applies – sample proportion p = x/n is a consistent estimator of π
    • As sample size increases the dist. of the sample proportion p = x/n approaches a normal distribution with a mean π and a standard error:
    • σp = √(π(1 – π)/n)
    • Sample proportion p = x/n may be assumed normal when the sample has at least 10 “successes” and at least 10 “failures” – i.e. when x ≥ 10 and n – x ≥ 10
    • Confidence interval for π → p ± z_(α/2) √(p(1 – p)/n)
    • Rule of Three → If in n independent trials, no events occur, the upper 95 percent confidence bound is approximately 3/n
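    • A sketch of the proportion interval with made-up counts (note x ≥ 10 and n – x ≥ 10 both hold):

      from statistics import NormalDist
      from math import sqrt

      x, n = 40, 100                      # made-up successes and trials
      p = x / n                           # sample proportion
      z = NormalDist().inv_cdf(0.975)     # 95% confidence
      margin = z * sqrt(p * (1 - p) / n)  # z times the estimated standard error
      print(p - margin, p + margin)       # ≈ 0.304, 0.496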
  • 8.6: Estimating from Finite Populations
    • Finite population correction factor (FPCF) → √((N – n)/(N – 1))
    • reduces the margin of error and provides a more precise interval estimate
  • 8.7: Sample size determination for a mean
    • Formula to determine the sample size needed for a desired margin of error E to estimate μ:
    • n = (zσ/E)²
    • Four ways to estimate σ
      • Take a preliminary sample
      • Assume uniform population
      • Assume normal population
      • Poisson Arrivals
  • 8.8: Sample size determination for a proportion
    • Formula to determine the sample size needed for a desired margin of error E to estimate π:
    • n = (z/E)² π(1 – π)
    • Three ways to estimate π
      • Take a preliminary sample
      • Assume π = .50
      • Use a prior sample or historical data
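    • A sketch of both sample-size formulas (the desired margins of error are made up):

      from statistics import NormalDist
      from math import ceil

      z = NormalDist().inv_cdf(0.975)           # 95% confidence
      # For a mean: n = (zσ/E)², assuming σ = 10 and E = 2
      print(ceil((z * 10 / 2) ** 2))            # 97
      # For a proportion: n = (z/E)² π(1 – π), assuming π = .50 and E = .05
      print(ceil((z / 0.05) ** 2 * 0.5 * 0.5))  # 385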
  • 8.9: Confidence interval for a population variance, σ²