 Open Access
 Authors : Kajal Sheth, Dhvanil Patel, Gautam Swami
 Paper ID : IJERTV13IS060063
 Volume & Issue : Volume 13, Issue 06 (June 2024)
 Published (First Online): 22-06-2024
 ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Strategic Insights into Vehicles Fuel Consumption Patterns: Innovative Approaches for Predictive Modeling and Efficiency Forecasting
Kajal Sheth, New York Tech, Dhvanil Patel, Texas A&M University,
Gautam Swami, Tulane University
Abstract: This study explores significant trends in fuel efficiency and carbon emissions within the passenger vehicle sector, utilizing extensive data from the U.S. Department of Energy and Environmental Protection Agency's fueleconomy.gov. Analyzing vehicle data from 1984 to the present, we identified key patterns in miles-per-gallon performance, tailpipe CO2 emissions, and economic impacts of vehicle efficiency. Our research highlights advancements in vehicle technology and shifts in consumer choices, underscoring their implications for environmental policy and sustainable transportation strategies. Employing advanced statistical analyses and machine learning in R and visualizations in Tableau, this study provides insights that support informed policymaking and strategic decisions in the transportation sector.
Keywords: Fuel efficiency; Carbon emissions; Transportation sector; Environmental policy; Predictive analytics

INTRODUCTION
The transportation sector is a significant contributor to global carbon emissions, accounting for nearly a quarter of direct CO2 emissions from fuel combustion [1]. The urgency to reduce these emissions has intensified under the pressures of climate change and environmental degradation. Advances in vehicle technology have shown promise in improving fuel efficiency and reducing emissions, yet the pace of improvement and adoption varies widely across vehicle types and regions. This study leverages a comprehensive dataset from the U.S. Department of Energy's official fuel economy information portal, fueleconomy.gov [2], to examine the evolution of fuel efficiency and carbon emissions within the passenger vehicle sector.
Despite abundant data on vehicle performance, there remains a gap in understanding the long-term trends and their implications for policy and consumer behavior. Most existing studies focus on short-term impacts or specific vehicle types, lacking a holistic view of the transportation sector's progress toward sustainability goals. This research aims to fill this gap by systematically analyzing fuel consumption patterns and emission trends over several decades, highlighting the technological and regulatory shifts that have influenced these trends. Specifically, the study seeks to identify key trends in fuel efficiency improvements, analyze the variance in emissions among different vehicle classes, and assess the economic implications of evolving fuel economy standards.
Using data curated and maintained by the U.S. Department of Energy's (DOE) Office of Energy Efficiency and Renewable Energy and supplemented by the U.S. Environmental Protection Agency (EPA), this analysis spans a broad spectrum of vehicles from 1984 to the present. The dataset includes detailed metrics such as miles-per-gallon in various driving contexts, tailpipe CO2 emissions, and potential cost savings across diverse vehicle classes, including sedans, SUVs, and trucks [2]. This rigorous dataset, derived from standardized laboratory tests, provides a reliable foundation for comparing vehicle performance equitably. By exploring these comprehensive data, the study aims to provide strategic insights that can inform both policymaking and consumer decisions, ultimately supporting the DOE and EPA's mandate under the Energy Policy Act of 1992 to deliver precise and actionable fuel economy data to the public [1].

METHODS

Exploratory Data Analysis
In our exploratory data analysis (EDA), we thoroughly examined the dataset, which consists of 43,156 entries across 83 distinct attributes. This extensive dataset provided a robust framework for our analysis. Initially, we focused on generating summary statistics for key variables that are critical to our research questions. These statistics offered initial insights into the data's central tendencies and variability, setting the stage for a deeper, more focused analysis. The subsequent section presents these summary statistics, detailing the measures of central tendency, dispersion, and the distribution of these selected variables.
Fig. 1. Summary Statistics for EDA
To better understand the distribution of our categorical variables, namely Make, FuelType1, FuelType2, Supercharger, Turbocharger, and Number of Cylinders, we conducted a frequency analysis. Frequency tables were generated to depict the occurrence of each category within these variables, providing a clear visual representation of the data distribution.
TABLE I. Frequency Table for FuelType1
FuelType1            Freq
Regular Gasoline     28,722
Premium Gasoline     12,791
Diesel                1,196
Electricity             257
Midgrade Gasoline       130
Natural Gas              60
TABLE II. Frequency Table for FuelType2
FuelType2      Freq
E85            1,477
Electricity      206
Natural Gas       20
Propane            8
TABLE III. Frequency Table for Number of Cylinders
Number of Cylinders    Freq
4                      16,809
6                      14,858
8                       9,216
5                         774
12                        668
3                         317
10                        179
2                          61
16                         14
The frequency tables provide a straightforward means to examine the distribution of categorical variables within the dataset. For instance, Table III shows that the most common engine configurations are 4 and 6 cylinders, while only 14 vehicles have 16 cylinders, highlighting the rarity of such configurations.
Following the analysis of categorical data, we proceed to examine continuous variables through visual means. Box plots have been utilized to depict the distribution of the Miles Per Gallon (MPG) attributes, tailpipe emissions, and fuel consumption rates. These visualizations are essential for understanding the spread and central tendencies of these variables, as well as for identifying outliers that may influence subsequent analyses.
Fig. 2. Spread of MPG Attributes
Fig. 3. Spread of Tailpipe Emissions
Fig. 4. Average Fuel Economy by Fuel Type
From the spread of the MPG attributes, we notice that the mean values of City MPG, Highway MPG, and Combined MPG are very close to each other, at around 20 miles per gallon. The spread of tailpipe emissions shows the mean emissions in the dataset are around 400 grams per mile, and the average number of barrels used for fuel consumption is around 15.
The other variables we use in our dataset are the Model and the Make of the vehicles. The chart below shows the top manufacturers by the number of models. The top three manufacturers by model count in the dataset are Mercedes-Benz, BMW, and Chevrolet. Cadillac, Suzuki, and Subaru have the fewest models in the dataset.
Fig. 5. Top Manufacturers by Count of Model
Two other key parameters for the analysis are the average MPG and the primary fuel type. Therefore, we plot the yearly trend of average MPG by fuel type. Intuitively, this trend is indicative of the improvement in efficiency in the transportation sector and vehicle technology in general. The results are shown below.
Fig. 6. Average Fuel Economy by Fuel Type
As can be seen in the image above, gasoline and electric vehicles show the greatest improvement in fuel economy, while diesel and CNG vehicles show fluctuations. Although the scale on the y-axis is relative for each graph, in general, electric vehicles have the highest fuel economy, followed by CNG vehicles and diesel vehicles, with gasoline vehicles showing the lowest fuel economy. This conforms with the current trends in the transportation sector.

Data Manipulation and Cleaning
The raw dataset contained 83 variables, of which only 21 were relevant to this analysis. These were selected and stored in a new data frame. Vehicle volume was a parameter of interest, so the six attributes for luggage and passenger volumes were summed to obtain the total volume of each vehicle. Replacing those six attributes with a single total leaves 16 variables, so the final dataset used for all the analysis contains 43,023 records and 16 variables. All records were kept, not just the distinct ones, since some manufacturers release models with the same name for multiple years.
Next, the categorical variables were converted to factors for the analysis. These include Make, Model, FuelType1, FuelType2, Supercharger, and Turbocharger.
Next, the empty cells in various attributes were replaced with 0 or blank values to ensure consistency in the data. The summary statistics after cleaning the data are shown below:
Fig. 7. Summary Statistics after Data Manipulation
Next, the following 2 categorical variables were grouped together for narrowing down the scope of the analysis. This was done using regular expressions. The change in FuelType1 is described below:
TABLE IV. Cleaning of FuelType1 Variable
Unclean              Clean
Regular Gasoline     Gasoline
Premium Gasoline     Gasoline
Diesel               Diesel
Natural Gas          Natural Gas
Electricity          Electricity
NA                   NA
Midgrade Gasoline    NA
The vehicle Class variable was also grouped from 34 unique values down to 5. These new levels are Other, Cars, Vans, Pickup Trucks, and SUVs.
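The regular-expression grouping of FuelType1 can be illustrated with a minimal sketch. The authors worked in R and did not publish their code, so this Python version is only an assumption about how the mapping in Table IV might be expressed; the function name `clean_fuel_type` and the exact patterns are hypothetical.

```python
import re

def clean_fuel_type(raw):
    """Map a raw FuelType1 label to its grouped label (hypothetical helper).

    Follows Table IV: Regular and Premium Gasoline become "Gasoline",
    Midgrade Gasoline and missing values become None (NA), and every
    other label is kept unchanged.
    """
    if raw is None or re.search(r"midgrade", raw, flags=re.IGNORECASE):
        return None
    if re.search(r"(regular|premium) gasoline", raw, flags=re.IGNORECASE):
        return "Gasoline"
    return raw

raw_labels = ["Regular Gasoline", "Premium Gasoline", "Diesel",
              "Natural Gas", "Electricity", "Midgrade Gasoline", None]
cleaned = [clean_fuel_type(label) for label in raw_labels]
print(cleaned)
```

The same pattern-based approach would extend to collapsing the 34 vehicle classes into 5, given a pattern for each target class.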


STATISTICAL ANALYSIS

Efficiency of Superchargers vs Turbochargers
This part aims to investigate the impact superchargers and turbochargers have on IC engine vehicles. Both these components are used to increase the combustion efficiency of the engine and increase the power output. While the supercharger draws power directly from the crankshaft, the turbocharger uses the exhaust gases created by combustion to power the compressor. This is an inferential type question which compares 2 independent attributes.
For the purpose of the analysis, the vehicles with either a turbocharger or a supercharger are filtered. The summary statistics are shown below:
TABLE V. Summary Statistics for Superchargers and Turbochargers
Turbocharger   Supercharger   N      Mean       Median   SD
True           False          7822   22.23907   22       4.696939
True           True           74     23.93243   25       3.897070
False          True           831    18.21540   18       3.193450
Fig. 8. Comparison of Superchargers and Turbochargers
It can be observed that there is some intersection between the 2 attributes. The intersecting records are eliminated to ensure that the two groups are independent. The MPG values can be assumed to be normally distributed, and hence we can use a t-test to run the hypothesis test. For this test, the hypotheses are:

$H_0: \mu_{\text{Turbocharger}} = \mu_{\text{Supercharger}}$ (i.e. there is no difference in average fuel economies)

$H_a: \mu_{\text{Turbocharger}} \neq \mu_{\text{Supercharger}}$ (i.e. there is a difference in average fuel economies)
The results of the t-test are shown below:
Fig. 9. T-Test on Superchargers and Turbochargers
Due to the extremely small p-value, the null hypothesis is rejected. Therefore, there is statistically significant evidence that fuel economy differs between vehicles with superchargers and vehicles with turbochargers. Of the two, vehicles with a turbocharger have significantly better fuel economy (a mean of about 22.2 MPG versus 18.2 MPG in Table V).
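As a rough cross-check, the test statistic can be recomputed from the summary rows of Table V alone. The sketch below is Python rather than the authors' R code, and it assumes Welch's unequal-variance form of the t-test; the paper does not state which variant was run, and the function name `welch_t` is hypothetical.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Table V: turbocharger-only group vs supercharger-only group.
t_stat, df = welch_t(7822, 22.23907, 4.696939, 831, 18.21540, 3.193450)
print(f"t = {t_stat:.2f}, df = {df:.0f}")
```

A statistic of this size lies far beyond the two-sided 5% critical value of roughly 1.96 for large degrees of freedom, which is consistent with the extremely small p-value reported.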


Fuel economy comparison
This section aims to investigate the difference in fuel economies for different classes of vehicles such as cars, vans, pickup trucks, SUVs, and others. This analysis is performed on IC engine vehicles only to ensure comparison between similar entities. This question aims to check what effect the utility of the vehicle has on the MPG. This is an inferential type question which compares the means of a common attribute between different groups.
Vehicles with FuelType1 as Gasoline, Diesel, CNG and FuelType2 as E85, Propane, CNG are filtered. This ensures no hybrid EV or battery EV is included for the analysis. The summary statistics are shown below:
TABLE VI. Summary Statistics for FuelType1
Class           N       Mean       Median   SD
Cars            22809   22.42799   21       5.216095
Other           4632    18.25712   18       4.951738
Pickup Trucks   6136    16.69801   16       3.099634
SUV             6625    19.55894   19       4.14002
Vans            2306    15.55421   15       2.739264
Fig. 10. Comparison of MPG by Vehicle Class
From the figure above, the average MPG values do not appear to differ greatly; however, the data contains outliers, which may skew the observation. The data is independent and can be assumed to be normally distributed, so we use the ANOVA method to statistically model the data and test the hypothesis. For this test, the hypotheses are:

$H_0: \mu_{\text{Vans}} = \mu_{\text{SUV}} = \mu_{\text{Trucks}} = \mu_{\text{Cars}} = \mu_{\text{Others}}$ (i.e. all means are the same)

$H_a: \mu_i \neq \mu_j$ for at least one pair of classes $i, j$ among Vans, SUV, Trucks, Cars, and Others (i.e. at least some of the means are different)
To perform the ANOVA test, a linear regression is performed, and an ANOVA table is created using the anova function. The result is shown below:
Fig. 11. ANOVA table
Due to the extremely small p-value returned, the null hypothesis is rejected. Therefore, there is statistically significant evidence that fuel economy differs across classes of IC engine vehicles. This difference also sheds light on the relation between MPG and the load-carrying capacity of the vehicle. Cars, which are mainly passenger vehicles, show the highest MPG, whereas trucks and vans, which are mainly cargo vehicles, show the lowest MPG values.
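The one-way ANOVA F statistic can be approximated from the group sizes, means, and standard deviations in Table VI. This is a Python sketch, not the authors' R workflow (which fitted a linear model and called anova); the function name `anova_from_summary` is hypothetical, and rounding in the published table means the value is only approximate.

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group variation: how far each group mean sits from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group variation: recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd**2 for n, _, sd in groups)
    df_between, df_within = len(groups) - 1, total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# (N, Mean, SD) for Cars, Other, Pickup Trucks, SUV, Vans from Table VI.
table_vi = [
    (22809, 22.42799, 5.216095),
    (4632, 18.25712, 4.951738),
    (6136, 16.69801, 3.099634),
    (6625, 19.55894, 4.14002),
    (2306, 15.55421, 2.739264),
]
f_stat, df_b, df_w = anova_from_summary(table_vi)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

An F statistic in the thousands on (4, 42503) degrees of freedom is consistent with the rejection of the null hypothesis described above.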


Engine Displacement and Vehicle Volume Correlation
This part aims to find out if there is a correlation between the engine displacement and the volume of the vehicle. Before performing a correlation test on these variables, we choose all the vehicles which are not electric vehicles and consider a vehicle volume of more than 20 cubic feet but less than 230 cubic feet to eliminate outliers. The null values in the engine displacement attribute are also ignored.
The summary statistics for the attribute vehicle volume show a total of 22,788 rows with a mean of 120.23 cubic feet, a standard deviation of 36.36, and a median vehicle volume of 111 cubic feet. These summary statistics are visualized using a box plot as shown below.
Fig. 12. Box Plot for Vehicle Volume
We assume the two attributes are not normally distributed and hence use the Kendall Rank Correlation Test. The results of the test are shown below.
Fig. 13. Kendall's Rank Correlation Test
Tau is the Kendall correlation coefficient. The p-value for the test is 0.1385 and the tau value is -0.0068. Since tau is very close to zero and the p-value exceeds 0.05, there is no evidence of a correlation between the two variables. A scatter plot of the two confirms this, as shown below.
Fig. 14. Correlation between Engine Displacement and Vehicle Volume
The above scatter plot is made using the ggpubr library function ggscatter which displays the correlation coefficient and the significance level on the plot. A regression line is added to the chart shown in red and the points are grouped by the volume of the vehicle.
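To make the Kendall coefficient concrete, the following sketch computes tau-b (the tie-adjusted form) by counting concordant and discordant pairs. It is a plain Python illustration, not the authors' R call, and the sample values are hypothetical rather than taken from the dataset. The pairwise loop is quadratic, so it suits small samples only.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, adjusted for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                # tied in both variables: ignored
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom

# Hypothetical values standing in for engine displacement and vehicle volume.
displacement = [1, 2, 3, 4, 5]
volume = [3, 1, 4, 2, 5]
tau = kendall_tau_b(displacement, volume)
print(f"tau = {tau:.2f}")
```

A tau near +1 or -1 indicates a strong monotonic relationship, while a value near zero, such as the -0.0068 reported here, indicates essentially none.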

Forecast of tailpipe CO2 emissions
As of 2018, the transportation sector in the US accounted for ~28% of the total greenhouse gas (GHG) emissions. While electric vehicles (EVs) are expected to be widely adopted in the near future, it is necessary to reduce the tailpipe emissions from internal combustion (IC) vehicles in the meantime. Thus, the aim of this question was to analyze the trend in the tailpipe CO2 emissions from 1984 until 2020 and forecast the emissions until 2030.
To perform this analysis, the autoregressive integrated moving average (ARIMA) forecasting method was used. ARIMA models are widely used in time series forecasting. In this method, lagged values of the time series as well as lagged forecast errors are used to predict future values. Since EVs do not have CO2 tailpipe emissions, they were filtered out of the data frame using the filter function. The five makes with the highest vehicle counts in the raw dataset were considered: Chevrolet, Ford, Dodge, GMC, and Toyota. Tailpipe emissions were averaged for each year (1984-2020) using the aggregate function. The yearly averages were then converted into a time series with a frequency of 1 year using the ts function, which is part of base R's stats package and is used alongside the forecast package. The auto.arima function was applied to the time series because it selects the best-fitting ARIMA model according to an information criterion, rather than requiring a custom specification. Finally, the forecast function was used to predict tailpipe emissions for 10 years, and the result was plotted.
Fig. 15. Tailpipe CO2 Emissions Forecast for Top 5 Manufacturers
As seen from the plots above, Toyota and Ford have a decreasing trend in tailpipe CO2 emissions, while Chevrolet, Dodge, and GMC have a forecast that stays constant until 2030. Out of all the makes considered, Toyota has the lowest emissions, with the forecast expected to reach ~230 grams/mile by 2030. This is followed by Ford, which is expected to reach ~350 grams/mile. Chevrolet, Dodge, and GMC are forecasted to have CO2 tailpipe emissions in the range of 450-500 grams/mile by 2030. It should be noted that the dataset only considers new models released every year. Thus, it can be concluded that the new cars released by Toyota are cleaner on average compared to other makes considered in this analysis.
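The forecasting step described above relied on auto.arima to choose the model order. As a much simpler illustration of the ARIMA family, the sketch below implements only one fixed member, ARIMA(0,1,0) with drift (a random walk with drift), in Python. It is not the model the authors fitted, the function name `drift_forecast` is hypothetical, and the emission values are made up for the example.

```python
def drift_forecast(series, horizon):
    """Point forecasts from ARIMA(0,1,0) with drift.

    The drift is the mean of the first differences; each forecast step
    extends the last observed value by one more unit of drift.
    """
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    return [series[-1] + drift * step for step in range(1, horizon + 1)]

# Hypothetical yearly average tailpipe emissions (grams/mile), not dataset values.
emissions = [480.0, 470.0, 465.0, 450.0, 440.0]
forecast = drift_forecast(emissions, 3)
print(forecast)
```

A flat forecast, like those reported for Chevrolet, Dodge, and GMC, corresponds to a model without a drift term, while a declining forecast implies a negative drift or trend component.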

City MPG and Highway MPG values for Gasoline vs Electric Vehicles
For this, we wanted to find out if there is a difference between the city and highway MPG values for gasoline and electric vehicles in the dataset. To analyze this, we choose only the Toyota vehicles to represent the gasoline vehicles and Tesla represents the electric vehicles in the dataset. The city and highway MPG values for gasoline and electric vehicles are represented by a ratio of the two to be able to account for both the factors. The statistical analysis is performed on a random sample of 100 gasoline vehicles and the first 100 electric vehicles. The summary statistics are shown below
TABLE VII. Summary Statistics for Gasoline and EV MPG values
Group       Count   Mean        SD
EV_ratio    100     1.0014044   0.07194879
Gas_ratio   100     0.7903093   0.07833846
From the summary statistics shown in the table above, we notice the mean MPG ratio for electric vehicles is higher than the mean MPG ratio for gasoline vehicles. This is shown in the density chart below.
Fig. 16. Density plot for Gasoline and EVs MPG Ratios
The plot shows the two distributions centered at different values, with the electric vehicle ratios near 1.0 and the gasoline vehicle ratios near 0.79.
To perform the analysis for this question, we use the F-test and the T-test to check if the two groups have the same variances and if there is a significant difference between gasoline and electric vehicle MPG ratios.
The F-test is done to check if the two populations have the same variances. The hypotheses for this test are shown below.

$H_0: \sigma^2(\text{Ratio of City MPG/Highway MPG for Gasoline}) = \sigma^2(\text{Ratio of City MPG/Highway MPG for Electric Vehicles})$ [The variances of the two groups are the same]

$H_a: \sigma^2(\text{Ratio of City MPG/Highway MPG for Gasoline}) \neq \sigma^2(\text{Ratio of City MPG/Highway MPG for Electric Vehicles})$ [The variances of the two groups are different]
The T-test is done to check if the two populations have the same means. The hypotheses for this test are shown below.

$H_0: \mu(\text{Ratio of City MPG/Highway MPG for Gasoline}) = \mu(\text{Ratio of City MPG/Highway MPG for Electric Vehicles})$ [The means of the two groups are the same]

$H_a: \mu(\text{Ratio of City MPG/Highway MPG for Gasoline}) \neq \mu(\text{Ratio of City MPG/Highway MPG for Electric Vehicles})$ [The means of the two groups are different]
The results of the F-test and T-test performed are shown below.
Fig. 17. F-Test Results for Gasoline vs EV MPG Ratios
Fig. 18. T-test Result for Gasoline vs EV MPG Ratios
The p-value of the F-test is 0.3987, which is greater than the significance level alpha = 0.05. We therefore fail to reject the null hypothesis that the variances of the gasoline and electric vehicle ratios are equal, and we can use the T-test that assumes equal variances. The T-test gives a p-value less than 0.05, from which we conclude that the mean ratios of the two groups are significantly different. Since the mean ratio for electric vehicles is close to 1, we can also infer that the gap between City MPG and Highway MPG is smaller for electric vehicles than for gasoline vehicles.
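Both test statistics can be reproduced approximately from the summary values in Table VII. The sketch below is Python, not the authors' R code; the helper names are hypothetical, and which group the authors placed in the numerator of the variance ratio is an assumption.

```python
import math

def variance_ratio(sd_a, sd_b):
    """F statistic for comparing two variances: ratio of sample variances."""
    return sd_a**2 / sd_b**2

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Student's t statistic assuming equal variances (pooled estimate)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Table VII: EV_ratio and Gas_ratio, 100 vehicles each.
f_stat = variance_ratio(0.07833846, 0.07194879)   # gasoline over EV (assumed order)
t_stat, df = pooled_t(100, 1.0014044, 0.07194879, 100, 0.7903093, 0.07833846)
print(f"F = {f_stat:.3f}, t = {t_stat:.2f}, df = {df}")
```

A variance ratio close to 1 agrees with the non-significant F-test, and a t statistic of this size on 198 degrees of freedom agrees with the reported p-value below 0.05.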


The trend of new vehicle models each year by fuel type
While gasoline has been the most popular fuel when it comes to passenger vehicles, the need to reduce GHG emissions in the transportation sector has led automakers to diversify their portfolios to include cars that use alternative fuels. The aim of this question was to study the trend of vehicles by fuel type from 1984 to 2020.
To perform this analysis, the fuel type of each vehicle was identified. The fuel types included in this analysis were gasoline, hybrid electric vehicle (HEV), battery electric vehicle (BEV), diesel, natural gas and hybrid. Since the original dataset contained 2 variables for fuel types depending on whether the vehicle runs on 1 or 2 fuels, the mutate function was used to create a new column in the dataset which determined the fuel type of the vehicle. In this column, if the fuelType1 variable was gasoline and fuelType2 was empty, the result in the new column would return gasoline.
Similar code was written for diesel and natural gas. If the fuelType1 variable was gasoline and fuelType2 was electricity, then the result in the new column would return HEV. If the fuelType1 variable was electricity, then the result in the new column would return BEV. If the fuelType1 variable was gasoline and fuelType2 was E85 or propane or natural gas, then the result in the new column would return Hybrid. This was done using the if_else function which checks the logical condition of the inputs.
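The classification rules described above can be written as a short chain of conditions. This is a Python sketch of the logic, not the authors' R code (which used mutate and if_else); the function name `classify_fuel` and the exact label spellings are assumptions.

```python
def classify_fuel(fuel1, fuel2=None):
    """Derive a single fuel-type label from the two fuel columns (hypothetical helper)."""
    fuel2 = fuel2 or None               # treat an empty string as missing
    if fuel1 == "Electricity":
        return "BEV"
    if fuel1 == "Gasoline" and fuel2 == "Electricity":
        return "HEV"
    if fuel1 == "Gasoline" and fuel2 in {"E85", "Propane", "Natural Gas"}:
        return "Hybrid"
    if fuel1 in {"Gasoline", "Diesel", "Natural Gas"} and fuel2 is None:
        return fuel1
    return None                         # combinations the text does not describe

examples = [("Gasoline", ""), ("Gasoline", "Electricity"), ("Electricity", None),
            ("Gasoline", "E85"), ("Diesel", None)]
labels = [classify_fuel(f1, f2) for f1, f2 in examples]
print(labels)
```

Checking the electricity case first keeps the rules mutually exclusive, so the order of the remaining conditions does not change the result.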
The data for year 2021 was filtered out as it did not have enough data points and thus could have misled the result. The count of vehicles was found using the tally function and each fuel type was filtered and plotted over time to see the trend. A linear regression line was added for gasoline, HEV and BEVs for further analysis.
Fig. 19. Count of Vehicle Type with Model Year and the Linear Model
As seen from the plot, vehicles were fueled by either gasoline or diesel until the early 1990s. After 2000, the portfolio of automakers widened in terms of the fuels used. Alternative fuels for vehicles were introduced around this time. From the plot, a steep increase in the number of HEVs and BEVs can also be seen. The regression line shows that there is a steep increase in the new models of HEVs and BEVs while new gasoline vehicles are on the decline. This shows the big picture in the auto industry which suggests a shift towards electrification and the use of cleaner fuels in general.

Groups based on Average MPG and Fuel Savings
This analysis was performed using k-means clustering, which is a machine learning method. K-means is a centroid-based clustering method that groups data points by their distance to a cluster center. Tableau Desktop was used to perform this analysis due to its ease of use and strong data visualization features. The variable youSaveSpend was placed in the Columns shelf and comb08 was placed in the Rows shelf. Averages of both were taken. The vehicle make was placed in the Label and Detail tabs under the Marks section. This showed the vehicle makes based on their average MPG and average fuel savings. Finally, a cluster was added to this model from the Analytics tab. A reference line showing the averages of both variables was added to see which vehicle makes are above or below the average.
Note: The youSaveSpend variable stands for 5-year savings/spending compared to an average car. Negative savings indicate money spent as opposed to money saved.
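For readers unfamiliar with the method, the sketch below shows the basic k-means loop (Lloyd's algorithm) in plain Python. It is a generic illustration and not a description of Tableau's internal implementation; the starting centroids and the two-dimensional points are hypothetical. Real inputs such as average MPG and dollar savings sit on very different scales, so they would normally be rescaled before clustering.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

# Hypothetical, already-scaled (average MPG, average savings) pairs.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centers, groups = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
print(centers)
```

The within-group and between-group sums of squares reported by clustering tools measure how tight each cluster is and how far apart the cluster centers are.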
TABLE VIII. Summary Diagnostics for Cluster Analysis
Number of Clusters:             6
Number of Points:               138
Between-group Sum of Squares:   5.1752
Within-group Sum of Squares:    0.45609
Total Sum of Squares:           5.6313
TABLE IX. Cluster Analysis Results
Clusters        Number of Items   Center: Average MPG   Center: Average Savings ($)
Cluster 1       2                 9.3929                16589.0
Cluster 2       43                16.928                5015.5
Cluster 3       64                21.915                1948.0
Cluster 4       23                13.188                9340.2
Cluster 5       5                 63.058                1076.9
Cluster 6       1                 102.64                2743.0
Not Clustered   0
Fig. 20. K-Means Clustering for Avg. MPG and Savings by Vehicle Make
Six clusters were formed to group distinct characteristics of the data, which also helped identify some of the outliers. Cluster 1 consists of only 2 vehicle makes and represents the makes with the lowest average MPG and the lowest savings. On the other hand, cluster 6 consists of only 1 make, Tesla, and represents the vehicle make with the highest MPG as well as the most savings over 5 years. Cluster 5 includes makes which mainly have EVs in their portfolio. Cluster 3 has some of the popular makes such as GM, Fiat, Nissan, etc., which have decent mileage and are not very expensive to maintain either.
Cluster 2 includes makes such as BMW, Porsche, Audi, etc., which have some high-performance cars in their portfolio. Cluster 4 includes makes such as Lamborghini, Pagani, Bentley, etc., which have low-mileage cars due to their high-horsepower engines and are also expensive to maintain on average.


CONCLUSIONS
This study has meticulously analyzed the EPA fuel economy dataset to discern key trends and changes within the transportation sector over an extended period. The data, notably clean with minimal inconsistencies, facilitated a robust analysis of fuel economy trends, primarily focusing on miles-per-gallon (MPG) distributions and tailpipe CO2 emissions.
Our findings indicate that MPG values generally follow a normal distribution, skewed right due to outliers represented by electric vehicles, which exhibit exceptionally high MPG. These outliers were selectively filtered from the analysis to maintain focus on internal combustion engine vehicles. Statistical and machine learning models were employed to test hypotheses and yielded statistically significant results, underpinning the robustness of our analytical methodologies.
Key insights from the study include:

Vehicle Performance: Vehicles equipped with turbochargers generally demonstrated better fuel economy compared to those with superchargers.

Class Comparison: Fuel economy varied significantly across different vehicle classes, inversely related to their load-carrying capacities. Passenger cars showed higher MPG compared to heavier vehicles like vans and trucks.

Engine and Volume Correlation: There was a negligible correlation between engine displacement and vehicle volume, with a slightly negative trend.

Fuel Type Trends: The analysis of fuel types showed that electric vehicles consistently outperformed others in terms of fuel economy, aligned with global shifts towards more sustainable vehicle technologies.
Additionally, the forecast for tailpipe CO2 emissions indicated a declining trend for manufacturers like Toyota and Ford, highlighting industry-leading practices in emissions reduction.
However, some manufacturers like Chevrolet, Dodge, and GMC showed little to no reduction, pointing to areas where further improvements are necessary.
The transportation sector's shift towards Battery Electric Vehicles (BEVs) and Hybrid Electric Vehicles (HEVs) represents a promising trend towards reducing greenhouse gas emissions. This shift is further evidenced by the growing proportion of these vehicles each year, contrasting with a decline in conventional gasoline vehicles.
Lastly, our cluster analysis revealed that electric vehicles, particularly those from manufacturers like Tesla, not only offer superior fuel economy but also present considerable savings over five years compared to average internal combustion engine cars.


REFERENCES

[1] US EPA, Office of Air and Radiation. (2015, December 29). "Sources of Greenhouse Gas Emissions." Overviews and Factsheets, US EPA. Available: https://www.epa.gov/ghgemissions/sources-greenhouse-gas-emissions

[2] U.S. Department of Energy. (n.d.). "Fuel Economy Web Services." Retrieved December 11, 2020, from https://www.fueleconomy.gov/feg/ws/index.shtml#vehicle

[3] Selva, P. (2019, February 18). "ARIMA Model: Complete Guide to Time Series Forecasting in Python." ML+. Available: https://www.machinelearningplus.com/time-series/arima-model-time-series-forecasting-python/

[4] Kassambara, A. (n.d.). "Correlation Analyses in R." Easy Guides, Wiki, STHDA. Retrieved December 11, 2020, from http://www.sthda.com/english/wiki/correlation-analyses-in-r

[5] Kassambara, A. (n.d.). "Correlation Test Between Two Variables in R." Easy Guides, Wiki, STHDA. Retrieved December 11, 2020, from http://www.sthda.com/english/wiki/correlation-test-between-two-variables-in-r

[6] Kassambara, A. (n.d.). "F-Test: Compare Two Variances in R." Easy Guides, Wiki, STHDA. Retrieved December 11, 2020, from http://www.sthda.com/english/wiki/f-test-compare-two-variances-in-r

[7] Kassambara, A. (n.d.). "ggpubr: Publication Ready Plots." Easy Guides, Articles, STHDA. Retrieved December 11, 2020, from http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/

[8] Kassambara, A. (n.d.). "Unpaired Two-Samples T-test in R." Easy Guides, Wiki, STHDA. Retrieved December 11, 2020, from http://www.sthda.com/english/wiki/unpaired-two-samples-t-test-in-r