Data Science and Computing with Python for Pilots and Flight Test Engineers
Introduction to Statistics
Introduction
Knowing and being able to apply statistical methods is important for anyone working with data and dealing with measurements.
Real-world measurements always contain errors and uncertainties from a variety of sources, ranging from systematic biases, to random measurement inaccuracies from the instrument, to outliers due to a measurement or operator failure. This is why properly reported data come with error bars, indicating the extent of these inaccuracies (if known and correctly recognized).
Dealing with these uncertainties is an everyday occurrence for test pilots and flight test engineers and requires the knowledge of statistics.
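To make the distinction between systematic and random errors concrete, here is a minimal sketch that simulates repeated readings of a single quantity and computes an error bar for their mean. All numbers (the true value, the bias, and the scatter) are made-up assumptions for illustration only and are not taken from the NTPS textbook.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

true_value = 100.0   # hypothetical true airspeed in knots
bias = 1.5           # hypothetical systematic instrument bias in knots
sigma = 2.0          # hypothetical random measurement scatter in knots

# Simulate 25 repeated readings: truth + systematic bias + random noise
readings = true_value + bias + rng.normal(0.0, sigma, size=25)

mean = readings.mean()
std = readings.std(ddof=1)            # sample standard deviation (divides by n - 1)
sem = std / np.sqrt(readings.size)    # standard error of the mean (the error bar)

print(f"mean = {mean:.2f} +/- {sem:.2f} knots")
```

Averaging more readings shrinks the random part of the error (the standard error of the mean falls as one over the square root of the number of readings), but it does nothing to remove the systematic bias, which has to be found and corrected by calibration.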
In the Statistics Part of this course, we first follow Chapter 6 of the National Test Pilot School (NTPS) textbook Math & Physics for Flight Testers, Professional Course, Volume 1, National Test Pilot School, Mojave, California, October 2021. Specifically, we develop code to solve many of the problems in the statistics problem sets of that book, which NTPS also teaches in the second week of their course T&E 4001 Fundamentals of Flight Test. Problems of this kind also make up the end-of-course statistics exam of T&E 4001. (Note that T&E 4001 teaches much more than just statistics.)
There are three reasons why we do this. First, NTPS has made a good selection of the topics useful for flight testers. Second, we would like to refer the reader to the aforementioned Volume 1 of NTPS’s textbook series to read up on the statistics theory. While we give brief reviews of the theory in our lessons to explain what we code, these reviews are not intended as a pedagogical introduction for someone who has never seen any of the material. Our course, including this statistics part, is primarily about implementing things in code and using this as motivation to pick up the underlying theory as we go. Third, after acquiring these statistics basics, we would like to press on and focus on some more advanced aspects of statistics, e.g. some of those relevant for flight controls and autonomous flight.
With our complementary approach, we achieve a lot of synergy. There is no point in recreating all the instructional texts in statistics that NTPS has already created and, in this case, kindly makes publicly available. At the same time, NTPS students are not explicitly taught in T&E 4001 how to implement any of it in code, which is exactly what we do here. Our statistics lessons may therefore also be of considerable interest to NTPS students taking T&E 4001: if you want to be able to solve most of the problems on the final course exam with one line of code, you have come to the right place! We demonstrate how to do so in our Solutions to NTPS Statistics Problem Sets lesson.
After this start in the basics of statistics, we proceed to explore some statistics topics of our own, especially computational ones. Examples include Kalman filters for parameter estimation and object tracking (which we will use later in control theory for vehicle state estimation), and Markov Chain Monte Carlo (MCMC), an algorithm designed to sample from probability distributions, which we use to infer model parameters from measurements.
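As a small preview of the parameter estimation idea, the following is a minimal sketch of a scalar Kalman filter that estimates a single constant quantity from noisy measurements. It is an illustrative simplification under stated assumptions (a constant state, known measurement noise, made-up numbers), not the implementation developed in the later lessons.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the sketch is reproducible

true_value = 5.0     # hypothetical constant quantity being estimated
meas_sigma = 0.5     # hypothetical measurement noise standard deviation
measurements = true_value + rng.normal(0.0, meas_sigma, size=50)

x_est = 0.0          # initial state estimate (deliberately a poor guess)
p_est = 1.0e6        # initial estimate variance (large = almost no prior knowledge)
r = meas_sigma**2    # measurement noise variance

for z in measurements:
    # Constant-state model: the prediction step leaves x_est and p_est unchanged
    k = p_est / (p_est + r)            # Kalman gain
    x_est = x_est + k * (z - x_est)    # correct the estimate with the innovation
    p_est = (1.0 - k) * p_est          # updated estimate variance

print(f"estimate = {x_est:.3f}, variance = {p_est:.5f}")
```

In this special case (constant state and a very uncertain initial guess), the filter behaves like a recursive running average of the measurements, and the estimate variance approaches the measurement variance divided by the number of measurements.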
Code
All the lessons in the Statistics Part of this course assume that you have imported the following modules at the beginning of your Jupyter notebook. We do not repeat these imports at the beginning of each lesson; we only point out any additional modules that a lesson needs.
import math
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st
from scipy.optimize import curve_fit
Throughout the introductory statistics lessons, we will write code to create the following statistics classes of our own: DataStatistics, ProbabilityTables, NormalDistribution, Student_t_Distribution, Chi2_Distribution, NonparametricTests, TestOfNormality, SampleSize, CEP, and Chauvenet_Criterion.
In some of the lessons, we will run some code right away to illustrate the usage of the methods in these classes. For the most part, however, we will just define the classes in the lessons and create their methods. We will then use all of them in one big problem-solving lesson, working through many of the problems of the statistics problem sets in the aforementioned NTPS textbook (Math & Physics for Flight Testers, Volume 1, Chapter 6). It is in that lesson that the code is put to use; most of the preceding lessons serve to build the necessary tools.
Not only is such an approach pedagogically transparent, it also introduces the student casually to a practical application of object-oriented programming using classes in Python, which is one of our tangential computer programming learning goals in this course.
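To give a flavor of this class-based approach, here is a hypothetical sketch of what a small statistics class might look like. The class name DataStatistics is taken from the list above, but the methods, their names, and the sample values are assumptions made for illustration; the class developed in the lessons may look quite different.

```python
import numpy as np
import scipy.stats as st


class DataStatistics:
    """Hypothetical sketch; the course's actual DataStatistics class may differ."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def mean(self):
        return self.data.mean()

    def std(self):
        # Sample standard deviation (ddof=1 divides by n - 1)
        return self.data.std(ddof=1)

    def confidence_interval(self, confidence=0.95):
        """Two-sided confidence interval for the mean, using Student's t."""
        n = self.data.size
        sem = self.std() / np.sqrt(n)                        # standard error of the mean
        t_crit = st.t.ppf(0.5 + confidence / 2.0, df=n - 1)  # critical t value
        return self.mean() - t_crit * sem, self.mean() + t_crit * sem


stats = DataStatistics([101.2, 99.8, 100.5, 100.9, 99.6])  # made-up sample values
low, high = stats.confidence_interval(0.95)
print(f"mean = {stats.mean():.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Bundling the data together with the methods that operate on it means that, once the class is defined, a typical problem reduces to creating an object and calling one method on it.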