The Netflix Database (https://github.com/lerocha/netflixdb) is a publicly available dataset that can be used to demonstrate SQL queries in R. It contains real Netflix data stored in an SQLite database, which makes it well suited for educational purposes.

1. Download and Set Up the Database

Download the Netflix database from GitHub:
- Go to https://github.com/lerocha/netflixdb
- Download the netflixdb.sqlite file and save it in your working directory

2. Install and Load Required Packages

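# Install once. dbplyr lets dplyr work on database tables (step 6);
# ggplot2 is needed for the plot.
install.packages(c("DBI", "RSQLite", "dplyr", "dbplyr", "ggplot2"))

# Load in every session
library(DBI)
library(RSQLite)
library(dplyr)
library(ggplot2)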

3. Connect to the Netflix Database

con <- dbConnect(RSQLite::SQLite(), "netflixdb.sqlite")
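Note that RSQLite creates a new, empty database file when the given path does not exist, so a typo in the file name gives a connection with no tables rather than an error. A minimal sketch of a guard against this, where `connect_existing()` is a hypothetical helper name and not part of DBI:

```r
library(DBI)

# Hypothetical helper: refuse to connect unless the file is already on disk,
# because RSQLite would otherwise create a new, empty database at that path.
connect_existing <- function(path) {
  if (!file.exists(path)) {
    stop("Database file not found: ", path)
  }
  dbConnect(RSQLite::SQLite(), path)
}

# Usage, assuming the file was saved in the working directory:
# con <- connect_existing("netflixdb.sqlite")
```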

4. List Available Tables

dbListTables(con)
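Once you know the table names, you will usually want the column names too. The sketch below uses a small in-memory stand-in (`demo_con`, with a made-up `id` column) so that it runs without the download; with the real file, call the same functions on `con`.

```r
library(DBI)

# Stand-in database; the `id` column and the values are invented for illustration.
demo_con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(demo_con, "view_summary",
             data.frame(id = 1:3, cumulative_weeks_in_top10 = c(1L, 4L, 2L)))

tables <- dbListTables(demo_con)                   # names of all tables
fields <- dbListFields(demo_con, "view_summary")   # column names of one table
has_table <- dbExistsTable(demo_con, "view_summary")

dbDisconnect(demo_con)
```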

5. Querying the Database

test <- dbGetQuery(con, "SELECT * FROM view_summary;")
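SELECT * pulls the whole table into R. A more selective query names the columns it needs and filters in SQL. The sketch below also passes the filter value through `params` instead of pasting it into the SQL string. It runs on an in-memory stand-in with invented rows and a made-up `id` column; the threshold of 3 weeks is arbitrary.

```r
library(DBI)

# Stand-in for the real table so the example runs on its own.
demo_con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(demo_con, "view_summary",
             data.frame(id = 1:4, cumulative_weeks_in_top10 = c(1L, 4L, 2L, 7L)))

# The `?` placeholder is filled from `params`, so the value is bound safely.
long_runs <- dbGetQuery(
  demo_con,
  "SELECT id, cumulative_weeks_in_top10
     FROM view_summary
    WHERE cumulative_weeks_in_top10 >= ?
    ORDER BY cumulative_weeks_in_top10 DESC",
  params = list(3)
)

dbDisconnect(demo_con)
```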

6. Using dplyr

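# tbl() creates a lazy reference to the table; collect() brings the selected
# column into R before plotting. The |> pipe needs R 4.1 or later.
tbl(con, "view_summary") |>
  select(cumulative_weeks_in_top10) |>
  collect() |>
  ggplot() +
  aes(x = cumulative_weeks_in_top10) +
  geom_bar(na.rm = TRUE)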
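With dplyr, the work can also stay in the database: the verbs are translated to SQL by dbplyr and only the result is brought into R. A small sketch on an in-memory stand-in with invented values, counting rows per number of weeks:

```r
library(DBI)
library(dplyr)

# Stand-in for the real table; the values are invented.
demo_con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(demo_con, "view_summary",
             data.frame(cumulative_weeks_in_top10 = c(1L, 1L, 2L, NA, 2L, 2L)))

# Nothing is executed yet: this only builds a lazy query.
weeks_query <- tbl(demo_con, "view_summary") |>
  filter(!is.na(cumulative_weeks_in_top10)) |>
  count(cumulative_weeks_in_top10)

show_query(weeks_query)   # prints the SQL generated by dbplyr

# collect() runs the query and returns a tibble.
weeks_counts <- collect(weeks_query) |>
  arrange(cumulative_weeks_in_top10)

dbDisconnect(demo_con)
```

The counting happens in SQLite, so only one row per distinct value is transferred to R.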

7. Writing a New Table to the Database

You can add your own data to the database as a new table:

new_data <- data.frame(title = c("Test Show 1", "Test Show 2"),
                       release_year = c(2023, 2024))

dbWriteTable(con, "test_shows", new_data, overwrite = TRUE)
dbListTables(con)
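After writing a table you may want to add rows to it, check its contents, and remove it again so that the downloaded database stays unchanged. A sketch of these steps on an in-memory database; "Test Show 3" is one more invented row:

```r
library(DBI)

demo_con <- dbConnect(RSQLite::SQLite(), ":memory:")

new_data <- data.frame(title = c("Test Show 1", "Test Show 2"),
                       release_year = c(2023, 2024))
dbWriteTable(demo_con, "test_shows", new_data, overwrite = TRUE)

# Add rows to the existing table instead of replacing it.
more_data <- data.frame(title = "Test Show 3", release_year = 2025)
dbAppendTable(demo_con, "test_shows", more_data)

# Read the whole table back to confirm what is stored.
stored <- dbReadTable(demo_con, "test_shows")

# Drop the test table again and check that it is gone.
dbRemoveTable(demo_con, "test_shows")
still_there <- dbExistsTable(demo_con, "test_shows")

dbDisconnect(demo_con)
```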

8. Closing the Connection

Always disconnect when you are done:
dbDisconnect(con)
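Inside a function, one way to make sure the connection is closed even when a query fails is on.exit(). A minimal sketch, where `query_once()` is a hypothetical helper name:

```r
library(DBI)

# Hypothetical helper: opens a connection, runs one query, and closes the
# connection on exit whether the query succeeds or fails.
query_once <- function(path, sql) {
  con <- dbConnect(RSQLite::SQLite(), path)
  on.exit(dbDisconnect(con), add = TRUE)
  dbGetQuery(con, sql)
}

# Usage with an in-memory database; with the real file, pass "netflixdb.sqlite".
result <- query_once(":memory:", "SELECT 1 AS one")
```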


Last modified: Wednesday, 9 April 2025, 06:29