Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- ---
- title: "A13 by Noah Perry & Andrew Hall"
- output: html_notebook
- ---
- Use the provide SQL code for a basket of goods analysis to extract the frequency with which particular pairs of products occur in the same basket.
- Insert the SQL code using the ```{sql, connection = con, output.var = dataframe} feature of R markdown
- Create a heat map using geom_tile().
- Include code that can facilitate necessary adjustments to make the heat map useful.
- ```{r}
- library(tidyverse)
- library(kableExtra)
- library(DBI)
- library(scales)
- #online DB connect
- conn<-dbConnect(RMySQL::MySQL(),"richardtwatson.com",dbname="ClassicModels",user="student",password="student")
- ```
- ```{r}
- normalize <- function(x){
- return((x-min(x)) / (max(x)-min(x)))
- }
- ```
- ```{sql connection=conn, output.var = 'basket'}
- SELECT (SELECT productName FROM Products WHERE Products.productCode = OD1.productCode) AS `Product 1`,
- (SELECT productName FROM Products WHERE Products.productCode = OD2.productCode) AS `Product 2`,
- count(*) as Frequency
- FROM OrderDetails AS OD1 JOIN OrderDetails AS OD2 ON OD1.orderNumber = OD2.orderNumber
- WHERE OD1.productCode > OD2.productCode
- GROUP BY `Product 1`, `Product 2` HAVING Frequency > 10
- ORDER BY Frequency DESC, `Product 1`, `Product 2`;
- ```
- ```{r cleaning}
- basket<-as_tibble(basket)
- basket_scaled<-basket %>% rename("product1" = `Product 1`,
- "product2" = `Product 2`,
- "pairFrequency" = Frequency) %>% arrange(product1)%>%
- mutate(pairFrequency = normalize(pairFrequency)) %>% slice(1:15)
- ```
- ```{r heatMap}
- ggplot(basket_scaled, aes(product1, product2 )) +
- geom_tile(aes(fill = pairFrequency), color = "white") +
- scale_fill_gradient(low = "red", high = "steelblue") +
- theme(axis.text.x = element_text(angle=45, vjust=1, hjust=1))
- ```
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement