Homework 2

Basic R, Data Description and Graphics

Important

Due Friday, Sep 19, 11:59 PM

Homework Questions

# ==============================================================================
## Vector
# ==============================================================================
poker_vec <- c(140, -50, 20, -120, 240)
roulette_vec <- c(-24, -50, 100, -350, 10)
days_vec <- c("Mon", "Tue", "Wed", "Thu", "Fri")
names(poker_vec) <- days_vec
names(roulette_vec) <- days_vec
  1. Vector. The code above shows a Marquette student's poker and roulette winnings from Monday to Friday. Copy and paste it into R and complete Problem 1.
    1. Assign to the variable total_daily how much you won or lost on each day in total (poker and roulette combined).
    2. Calculate the overall winnings for the week and assign the result to total_week. Print it out.
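
As a hint for the mechanics only, here is a minimal sketch of element-wise addition and totaling with named vectors. The vectors food_vec and transit_vec and their values are hypothetical and are not part of the assignment.

```r
# Hypothetical data (not the homework vectors): spending on three days.
food_vec <- c(12, 8, 15)
transit_vec <- c(3, 3, 0)
names(food_vec) <- c("Sat", "Sun", "Mon")
names(transit_vec) <- c("Sat", "Sun", "Mon")

# `+` adds two equal-length vectors element by element;
# the names of the first operand are kept on the result.
daily_spend <- food_vec + transit_vec
print(daily_spend)

# sum() collapses a numeric vector into a single total.
total_spend <- sum(daily_spend)
print(total_spend)
```

The same two operations, applied to the poker and roulette vectors, are all Problem 1 needs.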


# ==============================================================================
## Factor
# ==============================================================================
# Create speed_vec
speed_vec <- c("medium", "slow", "slow", "medium", "fast")
  1. Factor.
    1. speed_vec above should be converted to an ordinal factor since its categories have a natural ordering. Create an ordered factor speed_fac with factor(): set the argument ordered to TRUE, and set the argument levels to c("slow", "medium", "fast"). Print speed_fac.
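
To illustrate what an ordered factor gives you, the sketch below builds one from a hypothetical vector of shirt sizes (size_vec is an assumption made for illustration, not homework data) and compares two of its elements.

```r
# Hypothetical data: shirt sizes with a natural ordering S < M < L.
size_vec <- c("M", "S", "L", "M")

# ordered = TRUE makes the factor ordinal; levels fixes the order of categories.
size_fac <- factor(size_vec, ordered = TRUE, levels = c("S", "M", "L"))
print(size_fac)

# Ordered factors support comparisons that follow the level order.
first_bigger <- size_fac[1] > size_fac[2]  # compares "M" with "S"
print(first_bigger)
```

Without ordered = TRUE, R would treat the categories as nominal and the comparison would not be meaningful.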


# ==============================================================================
## Data frame
# ==============================================================================
name <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", 
          "Uranus", "Neptune")
type <- c("Terrestrial planet", "Terrestrial planet", "Terrestrial planet", 
          "Terrestrial planet", "Gas giant", "Gas giant", 
          "Gas giant", "Gas giant")
diameter <- c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883)
rotation <- c(58.64, -243.02, 1, 1.03, 0.41, 0.43, -0.72, 0.67)
rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
  1. Data Frame. You want to construct a data frame that describes the main characteristics of the eight planets in our solar system. The necessary vectors name, type, diameter, rotation, and rings have already been created in the code above. The first element of each vector corresponds to the first observation.
    1. Use the function data.frame() to construct a data frame. Pass the vectors name, type, diameter, rotation and rings as arguments to data.frame(), in this order. Call the resulting data frame planets_df.
    2. Use str() to investigate the structure of the new planets_df variable. Which variables are categorical (qualitative) and which are numerical (quantitative)? For the categorical variables, are they nominal or ordinal? For the numerical variables, are they interval or ratio level? Are they discrete or continuous?
    3. From planets_df, select the diameter of Mercury: this is the value at the first row and the third column. Simply print out the result.
    4. From planets_df, select all data on Mars (the fourth row). Simply print out the result.
    5. Select and print out the first 5 values in the diameter column of planets_df.
    6. Use $ to select the rings variable from planets_df.
    7. Use the result from part 6 to select all columns for planets that have rings.
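
The indexing patterns used in this problem can be sketched on a small, hypothetical data frame. The name pets_df and its columns are assumptions made for illustration only.

```r
# Hypothetical data frame with three rows and three columns.
pets_df <- data.frame(
  name  = c("Rex", "Tom", "Nemo"),
  legs  = c(4, 4, 0),
  swims = c(FALSE, FALSE, TRUE)
)

str(pets_df)            # column types and a preview of the values

pets_df[1, 2]           # a single cell: row 1, column 2
pets_df[3, ]            # every column of row 3
pets_df[1:2, "legs"]    # first two values of the legs column

swims_vec <- pets_df$swims  # `$` extracts one column as a vector
pets_df[swims_vec, ]        # a logical vector in the row position keeps the TRUE rows
```

Leaving the column position empty, as in pets_df[swims_vec, ], returns all columns for the selected rows.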


  1. Data Description and Graphics. We use the data set mtcars to do data summary and graphics. Type ?mtcars for the description of the data set.
    1. Use the function pie() to create a pie chart for the number of cylinders (cyl). Show the plot. Which number of cylinders occurs most frequently in the data?
    2. Use the function barplot() to create a bar chart for the number of gears (gear). Show the plot. Which number of gears occurs most frequently in the data?
    3. Use the function hist() to generate a histogram of the gross horsepower (hp). Show the plot. Is the distribution right-skewed or left-skewed?
    4. Use the function boxplot() to generate a boxplot of car weight (wt). Show the plot. Are there any outliers?
    5. Use the function plot() to create a scatter plot of displacement (disp) vs. miles per gallon (mpg). Show the plot. As the displacement increases, does the miles per gallon tend to increase or decrease?
    6. Compute the mean, median and standard deviation of the miles per gallon (mpg).
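
The functions named in this problem can be tried first on another built-in data set. The sketch below uses iris instead of mtcars; the choice of data set and the plot titles are assumptions made for illustration.

```r
# iris ships with base R; Species is categorical, the other columns are numeric.
# pie() and barplot() expect counts, so tabulate the categorical variable first.
species_counts <- table(iris$Species)
pie(species_counts, main = "Species (pie chart)")
barplot(species_counts, main = "Species (bar chart)", ylab = "Count")

# Histogram and boxplot of single numeric variables.
hist(iris$Sepal.Length, main = "Sepal length", xlab = "Sepal length (cm)")
boxplot(iris$Sepal.Width, main = "Sepal width", ylab = "Sepal width (cm)")

# Scatter plot of two numeric variables.
plot(x = iris$Petal.Length, y = iris$Petal.Width,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)")

# Numerical summaries of one variable.
sl_mean <- mean(iris$Sepal.Length)
sl_median <- median(iris$Sepal.Length)
sl_sd <- sd(iris$Sepal.Length)
print(c(mean = sl_mean, median = sl_median, sd = sl_sd))
```

For mtcars, note that cyl and gear are stored as numbers, so they also need table() before being passed to pie() or barplot().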

To save a figure in RStudio, go to the Plots tab and choose Export > Save as Image. Choose the image format (PNG is good), the directory where the image is saved, the file name, and the width and height, then click Save. To download an image file to your local computer, select the file in the Files pane and go to More > Export > Download. You can also take screenshots of plots or code to show your work.
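
If you prefer to save a figure from code rather than through the menus, one possible approach is base R's png() graphics device. This is a sketch, not a required step; the file name example_hist.png and the image size are hypothetical.

```r
# Hypothetical output file, written to R's temporary directory.
out_file <- file.path(tempdir(), "example_hist.png")

# Open a PNG graphics device; width and height are in pixels by default.
png(filename = out_file, width = 800, height = 600)

# Any plot drawn now goes to the file instead of the Plots tab.
hist(iris$Sepal.Length, main = "Sepal length", xlab = "Sepal length (cm)")

# Close the device so the file is actually written.
dev.off()

print(file.exists(out_file))
```

Replace out_file with a path in your project folder to keep the image alongside your homework.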


  1. (MSSC) R List Data Structure. A list in R allows you to gather a variety of objects under one name (that is, the name of the list) in an ordered way. These objects can be matrices, vectors, data frames, or even other lists. Also, these objects do not need to be related to each other in any way.
    1. Use the list() function to create a list named employee with three elements, name, salary, and union, with values "Joe", 55000, and TRUE, respectively. Print employee.
    2. Construct a list named my_list that contains three components: days_vec, planets_df, and employee. Print the structure of my_list.
    3. What is the difference between my_list[[1]] and my_list[1]? What is the data type they return?
    4. Obtain the salary in employee from my_list using $.
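
To experiment with list indexing before answering, you could start from a small, hypothetical list such as the one below. The names course and bundle are assumptions made for illustration.

```r
# Hypothetical named list with elements of different types.
course <- list(title = "Intro Stats", credits = 3, online = FALSE)

# A list can hold vectors and other lists side by side.
bundle <- list(c("a", "b"), course)
str(bundle)

# Compare what single and double brackets return.
single <- bundle[1]
double <- bundle[[1]]
print(class(single))
print(class(double))

# `$` reaches a named element of a list nested inside another list.
course_credits <- bundle[[2]]$credits
print(course_credits)
```

Running class() on both forms for my_list shows the distinction asked about in part 3.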


  1. AI Usage Declaration. Using GenAI is permitted for this course. If you choose to use GenAI to assist with your homework, you must include a brief statement documenting your use. Please provide the following information:
    1. Why/How I Used AI: Why did you need to use GenAI? Which tool did you use? Describe your prompts or questions. What did you ask the AI to help you with, and how?
    2. Generated Output: Include a screenshot or excerpt (copy and paste) of the AI’s response.
    3. How I Used the Output: Did you revise it? Did you use it directly, or compare it with your answers? What decisions did you make based on the output?