Edit 2022/10/10: it seems that my data source was faulty. For comparison, here is the prediction done by Guido David’s student (visually edited to protect my sources):
One may simply inspect my own time series data, sourced from DOH’s Google Data Studio, to see discrepancies with the Master’s student’s:
Notably, while the first two upsurges have relatively nearby peaks in the student’s data, the upsurges in my data are wildly different. Of course, some conclusions like no structural breaks being present (as an ARIMA by definition will not work across structural breaks) still hold, but a better analysis with better data awaits.
Original article:
The claim that “the methods of science are the only reliable ways to secure knowledge of anything” is not itself a scientific claim, nor something that can be established using scientific methods. Indeed, that science is even a rational form of inquiry (let alone the only rational form of inquiry) is not something that can be established scientifically. For scientific inquiry rests on a number of philosophical assumptions: the assumption that there is an objective world external to the minds of scientists; the assumption that this world is governed by regularities of the sort that might be captured in scientific laws; the assumption that the human intellect and perceptual apparatus can uncover and accurately describe these regularities; and so forth. Since the scientific method presupposes these things, it cannot attempt to justify them without arguing in a circle. To break out of this circle requires “getting outside” of science altogether and discovering from that extra-scientific vantage point that science conveys an accurate picture of reality – and, if scientism is to be justified, that only science does so. But then the very existence of that extra-scientific vantage point would falsify the claim that science alone gives us a rational means of investigating objective reality.
Edward Feser
A saying goes that with lies and damned lies come statistics. The field itself has developed many ways to prevent misuse of data and techniques. Hypothesis tests, robustness checks, and diagnostics, among other practices, take priority over mere point estimates. While high schoolers may scratch their heads over Excel ANOVA tests, good statisticians must apply many more tests to keep integrity in analysis.
In practice, however, statistics is misused and abused. The Pillar of Liberty has mentioned p-value hacking once, but the rot goes deeper. The distribution of residuals is ignored. The self-selection fallacy appears. Counterfactual1 estimates are more counterfeit than correct. Whether empirical research or industry, the misuse and abuse of statistics is dominant.
Some fields and professions fare better than others. Empiricist Economists surprisingly have a good knowledge of statistical techniques. Indeed, Econometrics (statistical techniques for Economics) has contributed to Statistical and Probability Theory as much as Mathematicians and Statisticians have. A combination of Austrian Economic theory and Econometrics would be a perfect alternative to the New Neoclassical Synthesis prevailing today. Other fields, and much of industry, however, need improvement. To give just one example: NEDA’s staff do not know what p-values are. They have no knowledge of impact evaluation techniques. This is from personal experience witnessing NEDA staff training: they do not have the knowledge or expertise at all to evaluate the impact of economic policies.
Government bureaucrat incompetence is one thing, which we all expect. Professors and so-called “scholars”, however, are a caste that has not yet received the tearing down that it deserves. Many of the working caste can already tell how much nonsense the regime’s priests try to feed them.
For this essay’s purposes, I will focus on OCTA Research, and specifically its Filipino co-founder Guido David. Some will remember that I had a disagreement with him on social media over the recent lottery scandal - which common sense dictates was no scandal at all. That incident, however, was only the entrance of a long rabbit hole about the undue privilege that professors get, and the mystique that their signs and symbols have.
An Anecdote
Dr Guido David once had a student who was working on his Master’s dissertation. Said dissertation involved forecasting COVID-19 cases using Artificial Neural Networks. Machine learning in general is a hot topic despite just being linear algebra, but that’s beside the point. The dissertation was hailed and praised by Guido David and many of the UP Diliman faculty, despite not amounting to anything against even BS Mathematics theses. The student, however, asked Guido David simple questions about time series analysis: autocorrelation functions, partial autocorrelation functions, stationarity, autocovariance, and so on. In fact, a conversation showed that Guido David manually uses lags in his estimations. Anyone with even the most basic time series knowledge knows that lags are supposed to be embedded in a model instead of being treated as exogenous regressors. While funny on the surface, this anecdote reveals deeper and darker truths about where our taxes go.
How we forecast
Many data, whether economic, financial, or indeed pandemic-related, are what we call time series data. Simply put, for each time period (year, month, day), we have a value assigned to it. Whatever the field, time series data can be reliably analyzed and forecasted (or projected) through rather simple methods. The simplest and most reliable method is what we call ARIMA. The AR stands for autoregression: we estimate how past values affect present ones. The MA stands for moving average: we estimate how the present value is affected by statistical noise from the past few time periods. The I stands for integrated. Explaining this last term is hard, so please read this instead2.
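To give a rough feel for the “integrated” part, here is a minimal sketch in base R. The series is simulated and the seed is arbitrary, so nothing here comes from the COVID data. A random walk is just a running sum of noise, and differencing it once (today’s value minus yesterday’s) gets the noise back. That differencing step is what the middle number in an ARIMA(p, 1, q) refers to.

```r
# Hypothetical simulated data, for illustration only
set.seed(1)
noise <- rnorm(200)

# A random walk is the running sum of noise: it wanders and has no fixed mean
random_walk <- cumsum(noise)

# Differencing once (value today minus value yesterday) undoes the running sum
differenced <- diff(random_walk)

# The differenced series matches the original noise from the 2nd value onward
is_recovered <- isTRUE(all.equal(differenced, noise[-1]))
print(is_recovered)
```

The AR and MA parts are then estimated on the differenced series rather than on the wandering original.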
More complicated models and methods exist to project time series data, such as Machine Learning, Kalman Filters, Fast Fourier Transforms, and other big words that many readers will not understand. ARIMA, however, is as simple and reliable as it gets. For economic and financial data, ARIMA is good enough when exact pinpoint accuracy is not a concern. In fact, ARIMA can project inflation much better than overcomplicated methods (as a friend who is a higher-up in the Bangko Sentral says). For physical and natural data, which is less complex than economic and financial data, ARIMA can easily do pinpoint accurate projections.
Technically speaking, an ARIMA model is an equation that relates some past data and some past noise to present ones. Depending on some statistical tests, we may say for example that the best prediction for a year’s value is 2 times last year’s value plus 3 times the value 2 years ago plus 0.9 times last year’s statistical noise. We verify that this is the right equation in two ways: by certain information criteria, and by simply seeing if the forecast for next year (or month, or day, etc.) is accurate. However, sometimes this equation may give good results for some time periods, and bad ones for others. This is what we call a structural break: when the underlying process that generates data changes.
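The idea of an ARIMA model as “an equation” can be sketched in base R. Everything below is an assumption for illustration: the series is simulated, and the coefficients 0.5, 0.3 and 0.9 are made-up values picked so that the simulated process stays stable. The sketch fits a model with two past values and one past noise term, writes the next-period forecast out by hand from the estimated coefficients, and compares it with what predict() returns.

```r
# Hypothetical coefficients, chosen so that the simulated process is stationary
set.seed(42)
x <- arima.sim(model = list(ar = c(0.5, 0.3), ma = 0.9), n = 2000)

# Fit an ARMA(2,1), i.e. ARIMA(2,0,1): two past values and one past noise term
fit <- arima(x, order = c(2, 0, 1))
b <- coef(fit)        # named: ar1, ar2, ma1, intercept (the series mean)
e <- residuals(fit)   # estimated noise (innovations) for each period
n <- length(x)

# One-step-ahead forecast written out as the ARIMA equation
by_hand <- b[["intercept"]] +
  b[["ar1"]] * (x[n] - b[["intercept"]]) +
  b[["ar2"]] * (x[n - 1] - b[["intercept"]]) +
  b[["ma1"]] * e[n]

# The same forecast from R's built-in machinery
from_predict <- as.numeric(predict(fit, n.ahead = 1)$pred)

print(c(by_hand = by_hand, from_predict = from_predict))
```

On a long series the two numbers should agree to many decimal places, which is the sense in which the fitted model is nothing more than a weighted sum of past values and past noise.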
A structural break in the COVID data is exactly what lockdowns aimed to produce: a flattened curve. To see whether a structural break exists in the data, normally we use a statistical test called the Chow test. However, this relies on the analyst’s subjective choice of break dates. Instead, we use what we call the Bai-Perron test3 to see whether multiple structural breaks exist, computing them from the data itself instead of subjective picks. Here is code and data for those interested in trying out the Bai-Perron test for themselves in R:
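# install.packages("strucchange")  # run once if the package is not yet installed
library(strucchange)
# `cases` is the DOH daily case data, assumed to be already loaded with a NewCases column
data <- cases[30:712, ]
# Use a ts object so that break dates are drawn on the same axis as the series
new_cases <- ts(data$NewCases)
# Bai-Perron: estimate breaks in the mean level of daily new cases
bp.covid <- breakpoints(new_cases ~ 1)
summary(bp.covid)
plot(bp.covid)  # RSS and BIC for each candidate number of breaks
# Overlay the estimated break dates and their confidence intervals on the series
plot(new_cases)
lines(bp.covid)
ci.covid <- confint(bp.covid)
ci.covid
lines(ci.covid)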
I have plotted whether breaks exist in the data:
There are no breakpoints at all. This shows that the lockdowns had no effect on “flattening the curve”. We may say that OCTA Research was very irresponsible in calling for more lockdowns, but at least we have observational data to confirm that lockdowns were of no use.
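For readers who want to see what a detected break looks like, here is a minimal sketch in base R of the core idea behind data-driven break dating, cut down to a single break in the mean. It is a simplification and not the strucchange implementation: the real procedure also handles several breaks and chooses how many there are. The series, the break at period 120, and the minimum segment length of 15 are all made up for illustration.

```r
# Hypothetical simulated series: the mean jumps from 10 to 15 after period 120
set.seed(7)
true_break <- 120
y <- c(rnorm(true_break, mean = 10), rnorm(80, mean = 15))
n <- length(y)

# Residual sum of squares when each segment is fitted by its own mean
rss_two_means <- function(k) {
  left <- y[1:k]
  right <- y[(k + 1):n]
  sum((left - mean(left))^2) + sum((right - mean(right))^2)
}

# Try every break date that leaves at least 15 observations on each side
candidates <- 15:(n - 15)
rss <- sapply(candidates, rss_two_means)
estimated_break <- candidates[which.min(rss)]

# Compare with the no-break model (a single mean for the whole series)
rss_no_break <- sum((y - mean(y))^2)
print(c(estimated_break = estimated_break,
        rss_no_break = rss_no_break,
        rss_one_break = min(rss)))
```

When a real shift is present, the break date comes out of the data, and the residual sum of squares drops sharply against the no-break model. No date has to be picked by hand, which is the advantage over the Chow test.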
A lesson in forecasting and estimation
We note that when we estimate models, we should not use all the data. Doing so makes overfitting a danger. Overfitting is when our equation describes the data so well that it no longer describes future events. Explaining why this happens is very technical, too much for this essay. Those interested can read this IBM blogpost instead4. Instead, I will just explain that statisticians often use only around 80% of the data to estimate the equation, keeping the rest to check the forecasts. I have attached the forecasting code here, using the same data as above. Note that instead of taking the time and effort to work through the nitty-gritty details, I automatically selected the best ARIMA settings using information criteria.
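library(forecast)
library(Metrics)
# Hold out the last observations as a test set; fit on the earlier ones only
train <- data[1:577, ]
test <- data[578:nrow(data), ]
# auto.arima() picks the ARIMA orders by information criterion
model <- auto.arima(train$NewCases)
# Forecast as many steps ahead as there are test observations
fc <- predict(model, n.ahead = nrow(test))
# Compare the forecasts against the held-out actual values
mape(test$NewCases, fc$pred)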
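To evaluate our model’s performance, we use the Mean Absolute Percentage Error (MAPE): the average absolute forecast error, measured relative to the actual values. Note that mape() from the Metrics package reports this as a proportion, not a percentage, so a value of 1 means 100% error.
[1] 4.304365
Read correctly, this is an average error of around 430% over the test period, not 4.3%, and it is an average rather than a worst case. The out-of-sample forecast is therefore far less accurate than the in-sample fit reported below (a training MAPE of about 22%). What still stands is how little such an exercise costs: a schlub with a mediocre gaming computer can fit and check a model like this in a short time, while the government pays millions to OCTA Research. For those interested, I have attached the details of the ARIMA estimation here: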
Series: train$NewCases
ARIMA(1,1,4)
Coefficients:
ar1 ma1 ma2 ma3 ma4
0.7306 -1.3208 0.2981 -0.0610 0.2423
s.e. 0.0570 0.0642 0.0798 0.0673 0.0433
sigma^2 = 1429396: log likelihood = -4897.42
AIC=9806.85 AICc=9806.99 BIC=9832.98
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 12.77429 1189.341 732.2575 -5.034779 21.96257 0.9011784 0.005116598
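To make the hold-out evaluation above concrete, here is a minimal sketch in base R that splits a series 80/20, fits on the first part, and computes the MAPE by hand. The series is simulated and merely stands in for daily counts, and the AR(1) order is fixed for simplicity instead of being picked automatically. The formula mean(abs(actual - predicted) / abs(actual)) is, to my understanding, the same definition that Metrics::mape() uses. It yields a proportion, so it is multiplied by 100 here to read it as a percentage.

```r
# Hypothetical simulated series standing in for daily counts (not the DOH data)
set.seed(123)
series <- 100 + as.numeric(arima.sim(model = list(ar = 0.7), n = 300, sd = 5))

# Estimate on the first 80% and hold out the last 20% for checking
n_train <- floor(0.8 * length(series))
train <- series[1:n_train]
test <- series[(n_train + 1):length(series)]

# Fit on the training part only, then forecast over the whole test horizon
fit <- arima(train, order = c(1, 0, 0))
pred <- as.numeric(predict(fit, n.ahead = length(test))$pred)

# MAPE by hand: average absolute error relative to the actual values
mape_proportion <- mean(abs(test - pred) / abs(test))
mape_percent <- 100 * mape_proportion

print(c(proportion = mape_proportion, percent = mape_percent))
```

Printing both forms side by side makes it harder to mistake one scale for the other when reading a single number off the console.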
Advertisement: I sent the RCEP Analysis to the Office of the President. The Office of the President sent me a letter explaining that they sent it to NEDA. NEDA sent me a letter explaining that they sent it to the DTI. The DTI sent me a letter basically sidestepping the issues I raised, and instead told me to look at a list of benefits. Interested readers may see it here:
Guido David’s credentials and ability
OCTA Research’s co-founder, Guido David, teaches in UP Diliman’s Institute of Mathematics. Even from a credentialist standpoint, however, his mettle to lead pandemic consultancy is found wanting. To begin with, he took his PhD in Biomedical Engineering5, not in Mathematics, Statistics, nor even Economics or Physics. While nothing says that he could not have dug deeper into Pure Mathematics or Mathematical Statistics, more evidence backs up how he has no business leading pandemic consultancy.
We must now turn to his list of papers6. Most of these involve computational models, with his Biomedical Engineering background standing out. A few (e.g. Markov Chain Solution to the 3-Tower Problem) are computational curiosities which, while indeed contributions to the literature, are not exactly Mathematical (proofs, not computations). His published Mathematical papers (e.g. Symmetric form of the von Neumann poker model, A Variation of Discrete Silverman’s Game with Varying Payoffs) were collaborations with undergraduate students. Unfortunately, it is a common practice in Philippine universities for professors to attach their names to students’ papers submitted for publication. Guido David cannot be faulted here for following it, but that these were his only published Mathematical papers sheds revealing light on who leads our pandemic consultancy.
Lastly, and most damning, Guido David has not shown that he had any experience or knowledge in Mathematical Epidemiology before proceeding to play the role of public authority through his group, and possibly not even since then. He has revealed the group’s methodology in this Manila Bulletin article7. Note that they give undue focus to the R0 (“R naught”, the basic reproduction number). He never says anything about standard compartmental models like the SIR model. For those who don't know, the SIR model is a system of differential equations that tracks how people move from susceptible to infected to recovered, with new infections depending on both the number of susceptible and the number of currently infected individuals. It is a complicated topic, which interested readers may see here8. One must note that the R0 is a descriptive number. It describes the situation for a certain period of time: Guido David says that their lower end estimates cover a 3 day period. However, the R0 is an effect. It is not useful for prediction or policy, except as feedback that prediction and policy are working. OCTA Research has not revealed their exact model for predicting cases, nor have they revealed their exact methodology. We are left in the dark about what they do.
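For a rough picture of what a compartmental model involves, here is a minimal sketch of the textbook SIR equations in base R, stepped forward with a simple Euler scheme. All parameter values (beta, gamma, population size, initial infections, step size) are hypothetical and chosen for illustration. They are not estimates for the Philippines or for any real epidemic, and nothing here is claimed to be OCTA Research’s method. The equations assumed are dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, and dR/dt = gamma*I, with R0 = beta/gamma.

```r
# Hypothetical parameters, for illustration only (not estimates for any real epidemic)
beta  <- 0.3   # infection rate per day
gamma <- 0.1   # recovery rate per day; basic reproduction number R0 = beta / gamma
N     <- 1e6   # population size
days  <- 200
dt    <- 0.1   # Euler step size in days

steps <- round(days / dt)
susceptible <- numeric(steps + 1)
infected    <- numeric(steps + 1)
recovered   <- numeric(steps + 1)
susceptible[1] <- N - 10
infected[1]    <- 10
recovered[1]   <- 0

# Step the three compartments forward; people only move S -> I -> R
for (k in 1:steps) {
  new_infections <- beta * susceptible[k] * infected[k] / N * dt
  new_recoveries <- gamma * infected[k] * dt
  susceptible[k + 1] <- susceptible[k] - new_infections
  infected[k + 1]    <- infected[k] + new_infections - new_recoveries
  recovered[k + 1]   <- recovered[k] + new_recoveries
}

R0 <- beta / gamma
peak_day <- (which.max(infected) - 1) * dt
print(c(R0 = R0, peak_day = peak_day, peak_infected = max(infected)))
```

In this sketch the whole trajectory, including when infections peak and how high, follows from the model’s parameters and starting values.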
Those who know their Ordinary Differential Equations and Real Analysis may read Mathematical Epidemiology by Brauer, van den Driessche, and Wu for an introduction to the topic. OCTA Research’s staff may be incompetent and ignorant, but those willing to learn no longer have to be.
Priests of power
OCTA Research’s undue influence in dictating national policy comes on the back of incompetence, hand-waving, and black box methodologies. Guido David is the prime example of obedience being valued over competence in the Managerial order. Theodore Kaczynski says it best:
It is enough to go through a training program to acquire some petty technical skill, then come to work on time and exert the very modest effort needed to hold a job. The only requirements are a moderate amount of intelligence and, most of all, simple obedience. If one has those, society takes care of one from cradle to grave.
Industrial Society and its Future, Paragraph 40
How else could OCTA Research have risen from an obscure academic consultancy to the most prominent pundit on the pandemic? OCTA has grown from small roots to a large tree overnight, its branches entering the airs of election polling, economic forecasting, financial modeling, and so many fields. News outlets give it the time of day at every turn, with other private sector sources left in the dust. OCTA has become our own Mouth of Sauron, preaching the regime's gospels through esoteric dogma that is also blatantly heretical.
However, at day’s end Guido David, his colleagues, and other government-backed academics are mere priests of power. The real power lies in the bureaucracy and managers. I will end with this last advert: those who wish to learn more about Philippine society’s true nature will do well to read A Gentle Introduction to Pillar of Liberty when it releases.
2. https://people.duke.edu/~rnau/411diff.htm
3. Bai, Jushan, and Pierre Perron. "Estimating and testing linear models with multiple structural changes." Econometrica (1998): 47-78.
4. https://www.ibm.com/cloud/learn/overfitting
5. https://orcid.org/0000-0003-2057-7862
6. https://math.upd.edu.ph/faculty/david-fredegusto-guido
7. https://mb.com.ph/2021/08/09/octa-explains-science-behind-its-covid-19-pandemic-model-projections/
8. https://www.maa.org/press/periodicals/loci/joma/the-sir-model-for-spread-of-disease-the-differential-equation-model