How to Import, Manipulate & Visualize Data Using the tidyverse in R | readr, dplyr & ggplot2 Package

Statistics Globe
This video demonstrates how to import, manipulate, and visualize data using the tidyverse in the R programming language. The video is part of a teaser series for the Statistics Globe online course on "Data Manipulation in R Using dplyr & the tidyverse". More info: https://statisticsglobe.com/online-co...

Attribution: The data used in this video is taken from Kaggle: https://www.kaggle.com/datasets/divya...

R code of this video:

install.packages("tidyverse")                   # Install tidyverse packages
library("tidyverse")                              # Load tidyverse packages

my_path <- "D:/Dropbox/Jock/Data Sets/"           # Specify directory path

tib_dest <- read_csv(str_c(my_path,               # Import CSV file
                          "Most_Visited_Destination_in_2018_and_2019.csv"))
tib_dest                                          # Print tibble

tib_dest %>%                                      # Class of data set
 class()

tib_dest %>%                                      # Show entire data set
 View()

tib_dest_new <- tib_dest %>%                      # Rename column
 rename(T2019 = `International  tourist  arrivals  (2019)`)
tib_dest_new                                      # Print updated tibble

tib_dest_new2 <- tib_dest_new %>%                 # Remove certain columns
 select(- ...1, - `International  tourist  arrivals  (2018)`)
tib_dest_new2                                     # Print updated tibble

tib_dest_new3 <- tib_dest_new2 %>%                # Replace values
 mutate(across(everything(), ~ replace(., . == "–", NA)),
        T2019 = as.numeric(str_replace(T2019, " million", "")) * 1e6)
tib_dest_new3                                     # Print updated tibble

tib_dest_new4 <- tib_dest_new3 %>%                # Remove NA rows
 na.omit()
tib_dest_new4                                     # Print updated tibble

tib_dest_new5 <- tib_dest_new4 %>%                # Remove duplicate row
 filter(Destination != "Egypt" | Region == "Africa")
tib_dest_new5                                     # Print updated tibble

my_ggp <- tib_dest_new5 %>%                       # Create ggplot2 plot
 mutate(Destination = reorder(Destination, - T2019)) %>%
 ggplot(aes(x = Destination,
            y = T2019,
            fill = Region)) +
 geom_col() +
 theme(axis.text.x = element_text(angle = 90,
                                  hjust = 1,
                                  vjust = 0.5))
my_ggp                                            # Draw ggplot2 plot

tib_dest %>%                                      # Do all at once
 rename(T2019 = `International  tourist  arrivals  (2019)`) %>%
 select(- ...1, - `International  tourist  arrivals  (2018)`) %>%
 mutate(across(everything(), ~ replace(., . == "–", NA)),
        T2019 = as.numeric(str_replace(T2019, " million", "")) * 1e6) %>%
 na.omit() %>%
 filter(Destination != "Egypt" | Region == "Africa") %>%
 mutate(Destination = reorder(Destination, - T2019)) %>%
 ggplot(aes(x = Destination,
            y = T2019,
            fill = Region)) +
 geom_col() +
 theme(axis.text.x = element_text(angle = 90,
                                  hjust = 1,
                                  vjust = 0.5))
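
The cleaning steps above depend on the CSV file, so they cannot be run without it. The following is a minimal sketch of the same steps on a small made-up tibble. The tibble `toy`, its values, and the destination names "A", "B" and "C" are invented for illustration and are not taken from the Kaggle data; "C" plays the role of the duplicated "Egypt" row.

```r
library(dplyr)    # mutate(), across(), filter(), %>%
library(stringr)  # str_replace()
library(tibble)   # tibble()

# Hypothetical toy data (invented values, not from the original CSV)
toy <- tibble(
  Destination = c("A", "B", "C", "C"),
  Region      = c("Europe", "Asia", "Africa", "Asia"),
  T2019       = c("3 million", "–", "10.5 million", "10.5 million")
)

toy_clean <- toy %>%
  # Replace the dash placeholder by NA in every column, then strip the
  # " million" suffix and convert the remaining text to a number
  mutate(across(everything(), ~ replace(., . == "–", NA)),
         T2019 = as.numeric(str_replace(T2019, " million", "")) * 1e6) %>%
  na.omit() %>%                                   # Drop rows containing NA
  filter(Destination != "C" | Region == "Africa") # Keep "C" only for Africa

toy_clean                                         # Print cleaned tibble

# reorder() returns a factor whose levels are sorted by -T2019,
# i.e. from the largest to the smallest value
dest_levels <- levels(reorder(toy_clean$Destination, - toy_clean$T2019))
dest_levels
```

In this toy example, the row for "B" is dropped because its value is missing, the "C" row with Region "Asia" is dropped by the filter, and "C" comes before "A" in the factor levels because it has the larger T2019 value.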

Follow me on Social Media:
Facebook – Statistics Globe Page: statisticsglobecom
Facebook – R Programming Group for Discussions & Questions: statisticsglobe
Facebook – Python Programming Group for Discussions & Questions: statisticsglobepython
LinkedIn – Statistics Globe Page: statisticsglobe
LinkedIn – R Programming Group for Discussions & Questions: 12555223
LinkedIn – Python Programming Group for Discussions & Questions: 12673534
Twitter: JoachimSchork
Instagram: statisticsglobecom
TikTok: statisticsglobe
Published 8 months ago, on 1402/08/25.
Viewed 1,383 times.