By Jonathan Lee
The Rise of Big Data
When Mark Zuckerberg was considering a change in Facebook’s “timeline,” users were furious. Having grown used to a time-based formula in which status updates appear based on when they were posted, large groups of users were complaining about the new “news feed.” But the data didn’t lie. Under the new system, users were spending more and more time on the Facebook app. Why change it back when the data says what’s really going on?
Large tech companies have increasingly capitalized on expansive data sets over the last two decades. For some companies like Facebook and Amazon, aggregating data defines their business model. Facebook naturally uses data analysis to change the way it organizes content. Amazon alters the way it displays products to customers through its recommendation engine. Thousands of companies target customers using search history, purchases, and time spent on applications – all data willingly given by users. The premise is simple: large volumes of data enable decision makers to make informed decisions – and often even allow artificial intelligence to do the same. As with many private sector innovations, this raises the question: what military applications exist?
The term “Big Data” first appeared in a 1997 NASA paper describing data sets that are “generally quite large, taxing the capacities of main memory, local disk, and even remote disk.”  With the evolution of supercomputing, the capacity to store, organize, and analyze large amounts of data has become mainstream. Today, however, the amount of data is less important than its utility. Aggregated data that provides insight into decisions is far more valuable than large quantities of data for the sake of size alone.
With the renewal of its Vantage contract with Palantir Technologies in December 2020, the Army is well positioned to take advantage of the analysis of big data. Vantage enables leaders to analyze HR, medical, and readiness metrics prior to making decisions. The system proved useful throughout the last year during the Army’s fight against COVID-19. Vantage is thorough – it combines multiple systems to enable an understanding of each unit’s personnel readiness, medical readiness, and training. It enables commanders at every level to see their organization down to the individual soldier. Palantir further operates software designed to sift through years of aerial surveillance footage and analyze patterns in human behavior – an excellent use of data analysis. Current Vantage data focuses on HR metrics, Digital Training Management System inputs, and medical records, with the additional capacity to see the schools, awards, and past assignments that soldiers within an organization have held. This data is phenomenal for a commander looking for someone with unit movement officer qualifications or determining which company in a brigade has the highest percentage of soldiers qualified on particular weapon systems. However, it doesn’t currently facilitate detailed statistical analysis beyond aggregation of multiple systems into a platform that can be used to view readiness metrics.
With Vantage, the Army has the potential to expand its use of data to address a much larger scope of issues. The Army should focus its future data analysis endeavors on three key aspects: personnel, equipment, and training. This data should be managed at the battalion level to facilitate decision-making for personnel and talent management, maintenance, and training by commanders at every echelon.
The Army’s recently released Army People Strategy looks to build a more ready and resilient force, proclaiming that “the Army’s greatest asset is our people.” With that strategy, data analysis on personnel becomes crucial to ensuring that people get the training and resources that they need. While this may seem counterintuitive – treating individuals as data points may seem impersonal – the resulting outcomes more than justify the use of data. Data analysis on personnel can be used to develop more efficient marksmen, better pilots, and more athletic special operations forces.
The Army could train infantry soldiers to fire more accurately and pilots to fly more efficiently using biometric measurements such as those used by Olympic athletes. Newly developed systems that attach to weapons facilitate analysis of individual marksmanship that could enhance the accuracy and lethality of the force. ACFT data has been – and can continue to be – used to evaluate strengths and weaknesses and allow commanders to tailor training to a soldier’s needs. Psychological evaluations using mental health data can also be used to reduce mental health issues within formations. Data-driven analysis such as that conducted at the Battalion Commander Assessment Program can be used to manage talent at every level, as forcing leaders to compete will lead to better job placement based on skillset.
In his book Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are, Seth Stephens-Davidowitz discusses the use of search history and Google Trends to predict illnesses and election results and to gauge personal well-being. The Army is more than capable of pulling such data to gather overall trends for an organization, barracks, or base. For example, if the Fort Bragg zip code, 28310, were to see an uptick in Google searches for “black mold” or “COVID-19 symptoms,” the garrison could predict future issues that it may need to address. By monitoring trends in Google search history within an area, garrison leaders can make informed decisions and predict issues based on intelligence that they wouldn’t otherwise have access to.
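The kind of trend monitoring described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any fielded Army or Google tool: the weekly counts are invented, the function name is_uptick is hypothetical, and the rule (flag the latest week if it exceeds the baseline average by two standard deviations) is one simple assumption among many possible ones.

```python
from statistics import mean, stdev

def is_uptick(weekly_counts, threshold=2.0):
    """Flag the latest week if it exceeds the baseline mean
    by more than `threshold` sample standard deviations."""
    # All weeks except the last form the baseline; the last is the week under test.
    *baseline, latest = weekly_counts
    return latest > mean(baseline) + threshold * stdev(baseline)

# Hypothetical weekly search counts for one term in one area (invented numbers).
black_mold_searches = [12, 15, 11, 14, 13, 40]
steady_searches = [12, 15, 11, 14, 13, 14]

print(is_uptick(black_mold_searches))  # the jump to 40 is flagged
print(is_uptick(steady_searches))      # a normal week is not
```

A fixed two-standard-deviation rule is crude; a garrison would likely want to account for seasonality and small sample sizes before acting on a flag.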
From an equipment standpoint, the Army already has a robust system of supply management that drives demand signals for many bench stock and shop stock programs. GCSS-Army, when used effectively, allows personnel to rapidly locate on-hand parts and reduce equipment downtime. Over time, it can also help identify the most in-demand shop stock and bench stock to ensure that those parts are readily available. This allows decision makers to plus up parts that are frequently used – and allocate funding to do so – which directly contributes to a higher readiness rate for vehicles within an organization. Where the program could improve is in the use of predictive analytics based on environmental factors.
The Army is called to deploy – and conduct maintenance – in a wide variety of operational environments. With climates ranging from the Alaskan winter to summer in the Mojave Desert, it should be no surprise that certain parts break more often in different weather conditions. Take, for example, the strap pack – an essential component of the AH-64 Apache rotor that holds the blade onto the aircraft. In 2017 and 2018, the nut component of the strap pack was corroding and cracking in hot and humid climates. One Army Regiment in Hawaii, for example, was replacing more than half of its strap packs due to corrosion. Just a small percentage of strap packs made it to the end of their usable lives – an abnormally low number which all parties involved attributed to the corrosive environment. 
While enduring a series of ad hoc replacements, the unit analyzed strap pack life span (Figure 2). That analysis clearly indicates that strap packs installed on the island had a significantly shorter life span than those installed prior to arrival on Oahu. The distribution in the figure shows just how corrosive the environment was to the strap packs, as the life span fell significantly after exposure to the heat and humidity on the island – down to an average of 11-14 months. Some of the strap packs that failed on Oahu in 2017 had been installed since 2010. These strap packs remained functional for six years prior to arrival on the island, after which they became unusable due to corrosion within one year.
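The comparison the unit made can be illustrated with a short Python sketch. The records below are invented purely for illustration and are not the unit’s actual data; the location labels and the function name mean_life_by_location are likewise assumptions.

```python
from statistics import mean

# Hypothetical records, invented for illustration only:
# (install location, months in service before replacement).
records = [
    ("pre-Oahu", 70), ("pre-Oahu", 66), ("pre-Oahu", 74),
    ("Oahu", 11), ("Oahu", 14), ("Oahu", 12), ("Oahu", 13),
]

def mean_life_by_location(rows):
    """Group service lives by install location and average each group."""
    groups = {}
    for location, months in rows:
        groups.setdefault(location, []).append(months)
    return {location: mean(values) for location, values in groups.items()}

summary = mean_life_by_location(records)
for location, avg in summary.items():
    print(f"{location}: {avg:.1f} months average service life")
```

Even a grouping this simple makes an environmental effect visible, provided the install location and removal date are recorded for every part.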
While the strap pack issue was occurring, the suppliers were confused, as they had never seen the effects of such an environment on this essential aircraft component. Maintaining data based on environmental conditions would prepare the Army to conduct maintenance in all climates. Predictive analytics would allow units to decide which bench stock to plus up based on demand signals within certain environments, thereby producing a more mission-capable force of ready vehicles and aircraft. If an Apache unit were deploying to the Baltics, for example, it could use such data to determine what maintenance problems may occur and conduct P4T3 on those problems prior to the deployment, potentially saving millions of dollars.
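As a rough sketch of what an environment-based demand signal might look like, the Python below counts part replacements by operating environment and ranks them. It is a simple frequency count rather than a true predictive model, and everything in it is an assumption: the log entries are invented, and the environment labels and the function name top_parts are hypothetical rather than drawn from GCSS-Army.

```python
from collections import Counter

# Hypothetical maintenance log, invented for illustration:
# (operating environment, part replaced).
log = [
    ("hot-humid", "strap pack"), ("hot-humid", "strap pack"),
    ("hot-humid", "seal"), ("arctic", "battery"),
    ("arctic", "battery"), ("arctic", "hydraulic line"),
    ("desert", "air filter"), ("hot-humid", "strap pack"),
]

def top_parts(entries, environment, n=2):
    """Return the n most frequently replaced parts in one environment."""
    counts = Counter(part for env, part in entries if env == environment)
    return counts.most_common(n)

# A unit preparing for a given climate could review the ranked list
# when deciding which bench stock to plus up.
print(top_parts(log, "hot-humid"))
print(top_parts(log, "arctic"))
```

In practice the counts would need to be normalized by fleet size and operating hours before units in different environments could be compared fairly.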
Data analysis can facilitate safe and effective training that simulates real-world environments and allows units to improve through friendly competition. Already used to train staffs in warfighter exercises, simulations and data analysis can further be used at the company and platoon level. Looking to determine which company is the most efficient at clearing a building? Monitor the movements of each individual soldier as they clear it. Trying to determine which rifle company is most lethal? Evaluate their round impacts during a stress shoot. Evaluating units based on specific data points also facilitates competition, as each unit will strive to be the most effective at its METL tasks.
One specific area for improvement was recently identified in the report of the National Commission on Military Aviation Safety. The executive summary simply states “data can save lives.” When it comes to safety in aviation training – and combat for that matter – predictive data “models future outcomes, such as determining a pilot’s likelihood of success completing a specific task.” With sufficient safety data, including cockpit voice and flight data recordings as well as biometric sensing for aircrews, aviation safety can be significantly enhanced. Aviation units in the Army already maintain annual aircrew coordination requirements that use anecdotal evidence to teach aircrews about effective crew coordination. An investment in larger data sets could provide predictive analysis to aircrews that greatly reduces the likelihood of a mishap and moves from the anecdotal approach to a more statistics-based methodology. Biometric data and data from previous accidents could inform training plans that create more effective, safer pilots. This in turn has the potential to drastically increase the quality of aviation training and reduce both training and combat-related accidents.
For unit and team level aviation training, ensuring that pilots have the opportunity to fly in a radar-contested environment and collecting that data would enhance competence as the threat of an integrated air defense system becomes a reality on the future battlefield. Evaluating crew and team capacity to avoid radar detection would give commanders an understanding of crew competence and facilitate decision making and crew selection on future missions.
The Use of Data
With all of this data available to drive decisions and influence demand signals, the question quickly becomes: who will manage it all? With company commanders and battalion staffs already saturated with a wide variety of administrative and training requirements, the answer is not obvious. To facilitate the analysis of data, the Army has a few options. It can create a new government service organization responsible for compilation and analysis of the data, it can create a course similar to the unit movement officer course and require a trained individual at each battalion, or it can create a new MTOE position at the battalion level similar to an FA49. To simplify the situation, Vantage could also be used to track and manage all of the data from the moment a soldier joins the Army or a piece of equipment is fielded. An in-processing soldier, for example, could stop at a single kiosk and input all of his or her information instead of filling out a dozen such sheets at the installation, brigade, battalion, and company level. Likewise, soldiers could track maintenance on new equipment from the day it was fielded. Each option presents a unique challenge but facilitates the compilation, analysis, and employment of big data within battalions – an essential part of effective future data-driven operations.
As data analysis makes its way into maintenance demand signals and the personnel decisions of commanders at every echelon, it is important to sound a note of caution about its potential for misuse. While using big data to spur competition and evaluate subordinates through tasks such as marksmanship and aerial gunnery is effective, it is also ripe for misuse by evaluators. Take, for example, the operational readiness rate dilemma. Commanders often see their fates determined by their ability to achieve a certain readiness rate. Commanders seeking to get ahead from a statistical standpoint can simply never run certain vehicles or aircraft in order to maintain a higher operational readiness rate. While the intent of the readiness rate is to ensure that vehicles and aircraft are combat ready, the evaluation measures create a perverse incentive to not seek out faults on equipment. Commanders should be alert to such dilemmas as the use of big data becomes more prevalent. Data analysis in military operations presents an amazing opportunity for efficiency and effective decision-making, but the Army should also be wary of the incentives it creates.
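The arithmetic behind this perverse incentive is easy to illustrate. The Python sketch below assumes a simplified definition of the readiness rate (equipment with no known fault divided by fleet size) and uses invented numbers; neither the formula nor the figures come from Army doctrine or real units.

```python
def reported_rate(fleet_size, known_faults):
    """Simplified readiness rate: share of the fleet with no KNOWN fault."""
    return (fleet_size - known_faults) / fleet_size

fleet_size = 20     # hypothetical fleet
actual_faults = 4   # hypothetical faults truly present in each unit's fleet

# Unit A operates every vehicle, so all four faults are discovered and reported.
rate_a = reported_rate(fleet_size, known_faults=actual_faults)

# Unit B leaves half its fleet parked and discovers only two of the four faults.
rate_b = reported_rate(fleet_size, known_faults=2)

print(f"Unit A reports {rate_a:.0%}; Unit B reports {rate_b:.0%}")
```

Both fleets are in the same true condition, yet the unit that looks for faults reports the lower number. Any metric built on discovered faults alone will reward not looking.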
Vantage offers an example of the readily available data that battalion and brigade-level units can use to conduct simple data analysis, provided the data is input and easy to access. Many older systems fail to synthesize data into usable form, creating either an administrative burden or a lost opportunity as units disregard data because of the cost of using it. Armed with the capacity to collect and analyze larger, more relevant data sets, the Army can make strides in managing talent and personnel, predicting maintenance in multiple operational environments, and facilitating safe and effective training. The analysis of big data will help facilitate the Army’s People First initiative while continuing to drive a more efficient, safer force.
Xiao Song, “Military Simulation Big Data: Background, State of the Art, and Challenges”, Mathematical Problems in Engineering, 2015, Accessed March 3, 2021, https://www.hindawi.com/journals/mpe/2015/298356/.
Lisa Gordon, “Army Vantage Reaffirms Palantir Partnership with $114M Agreement”, Bloomberg Business, December 21, 2020, Accessed March 4, 2021, https://www.bloomberg.com/press-releases/2020-12-21/army-vantage-reaffirms-palantir-partnership-with-114m-agreement.
Annie Jacobsen, “Palantir’s God’s-Eye View of Afghanistan”, Wired, January 20, 2021, Accessed March 4, 2021, https://www.wired.com/story/palantirs-gods-eye-view-of-afghanistan/.
Madison Bonzo and Sara Hauck, “People First: TRADOC forges path through Operationalizing the Army’s People Strategy”, October 6, 2020, Accessed March 8, 2021, https://www.army.mil/article/239727/people_first_tradoc_forges_path_through_operationalizing_the_armys_people_strategy.
Taylor Soper, “How Olympic athletes use machine learning and data analysis to reach peak performance levels”, GeekWire, August 4, 2016, Accessed March 8, 2021, https://www.geekwire.com/2016/olympic-athletes-use-machine-learning-data-analysis-reach-peak-performance-levels/.
Major Justin L. Darnell, “A shop stock optimization system”, August 23, 2018, Accessed March 9, 2021, https://www.army.mil/article/210112/a_shop_stock_optimization_sytem.
Dan Murphy, “Statistical Analysis of AH-64D Strap Pack Failure Trends in 2-6 CAV”.
Richard A. Cody, “Military Aviation Losses FY2013-2020”, National Commission on Military Aviation Safety, December 1, 2020, vi.
Ibid., 21.
John Bolton, “Overkill: Army Mission Command Systems Inhibit Mission Command”, Small Wars Journal, August 29, 2017, Accessed March 24, 2021, http://smallwarsjournal.com/jrnl/art/overkill-army-mission-command-systems-inhibit-mission-command.