# R: Local tutorial


Contributions to this tutorial are more than welcome.

**Quick links**

- Data manipulation and data mining: http://www.togaware.com/datamining/


## Basics

### Basic data types

    mode(2)                # numeric
    mode(pi)
    mode(2+3i)             # complex
    mode("the lazy dog")   # character
    mode(FALSE)            # logical

### Basic data structures

#### vector

    c(1,5,12)
    c('a','large','boat')
    c(FALSE,TRUE)
    1:12                                # sequence
    seq( from = 1, to = 12, by = 1.5)

#### matrix

    mat <- matrix( 1:12, ncol = 3)    # example of assignment. Note: does not print result
    mat                               # prints result
    ( mat2 <- matrix( 1:12, ncol = 3, byrow = TRUE) )    # assignment and printing
    rownames(mat) <- c('John','Qing','Tao','Ye')
    mat
    colnames(mat) <- LETTERS[1:3]
    t(mat)

#### array

    arr <- array( 1:24, dim = c(2,3,4))
    arr
    dimnames(arr)[[3]] <- letters[1:4]
    arr
    dimnames(arr)[c(1,2)] <- list( Row = c('row 1','row 2'), Column = paste( "Col", 1:3))
    arr
    names( dimnames( arr ) ) <- c("Row", "Column", "Panel")    # name the three dimensions
    dimnames(arr)
    arr
    aperm(arr, c(2,1,3))

#### list

- Elements can have different modes and different structures

    list( 1, 'a', FALSE)
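As a rough illustration of elements with different modes and structures, the sketch below builds a list that mixes a character string, a numeric vector, a logical value and a matrix. The object name 'person' and its element names are invented for this example.

```r
# A list whose elements differ in mode and in structure.
# 'person' and its element names are illustrative only.
person <- list(
  name   = 'Tao',                   # character, length 1
  scores = c(81, 92, 77),           # numeric vector
  passed = TRUE,                    # logical
  grid   = matrix(1:4, ncol = 2)    # numeric matrix
)

sapply(person, mode)      # mode of each element
str(person)               # compact summary of the whole structure

person$scores             # extract one element by name
person[["grid"]][2, 1]    # extract an element, then index into it
```

Note that single brackets, as in person["scores"], would return a list of length one rather than the vector itself.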

### More advanced data types

#### factors

#### selecting subsets

#### names

- Objects can have 'post-it' notes attached to them. These are attributes. The most common attribute names the elements of an object.

    x <- c( a = 11, b = 12, c = 13)
    x
    x <- 11:13
    names(x) <- c('a','b','c')
    x
    mat <-
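The 'post-it note' idea can be inspected directly. The following is a small sketch, not part of the original notes, that lists the attributes of a named vector and uses the names to select elements. The attribute called 'note' is a made-up name used only to show that other attributes can be attached as well.

```r
x <- c(a = 11, b = 12, c = 13)

attributes(x)       # a list with one component: $names
x["b"]              # names can be used to select elements
x[c("c", "a")]      # the result follows the order of the names requested

# Attach a second, arbitrary attribute ('note' is a hypothetical name)
attr(x, "note") <- "illustrative attribute"
attributes(x)       # now holds both 'names' and 'note'
```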

### Basic language

### Exercises

    library(car)
    data(Prestige)

- How many occupations have education > 12?
- Let z <- list(a=1:4, b=c(10,13,34), c=4, d=c(NA,2)). Find the maximum value in each element of the list.

- Find the mean income for each type of occupation
- Write a program 'prime(n)' that will find all primes from 1 to n
- Plot means, standard deviations, and standard errors by group

## More advanced

### Exercises

## Even more advanced

### Exercises

## Data input

In practice, most data generated by graduate students in Psychology are entered with Excel or with SPSS. This section describes how to convert Excel or SPSS files to R.

### Excel

- Format the file so the top row contains variable names. It's a good idea to use "NA" (without quotes) as a missing value code. Obey the variable name rules for R: no spaces (use '.' instead of a space), and only letters, numbers (not as the first character) and periods. Avoid the octothorp (#) because in R it turns all that follows on the line into a comment.
- Save the file as a '.csv' (comma-separated values) file. Note that only the active worksheet will be saved. Be aware that a worksheet in older versions of Excel (2003 and earlier) can have no more than 256 columns. This is a significant limitation for some projects in which data are recorded for many subscales of psychological tests. Let's suppose that you save the file as 'c:\newdata.csv' in Windows.
- In R, use the command

> newdata <- read.csv("c:/newdata.csv")

or

> newdata <- read.csv(file.choose())

(Note the use of a forward slash where the usual Windows syntax would use a backward slash.)
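To try the steps above without an Excel file at hand, here is a minimal sketch that writes a small comma-separated file to a temporary location and reads it back. The column names and values are invented for illustration. They follow the naming rules described above and use "NA" as the missing value code.

```r
# Write a small .csv file to a temporary location.
# The variable names and values are made up for this example.
csvfile <- tempfile(fileext = ".csv")
writeLines(c(
  "subject.id,test.score,group",
  "1,23,control",
  "2,NA,treatment",
  "3,31,control"
), csvfile)

newdata <- read.csv(csvfile)    # the top row becomes the variable names
names(newdata)
str(newdata)                    # check how each column was read
is.na(newdata$test.score)       # the 'NA' entry is treated as missing
```

Running str() right after reading a file is a quick way to check that numeric columns were not read as text.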

### SPSS

More up-to-date information is available at R:_Data_conversion_from_SPSS

SPSS files can be read directly with the 'read.spss' function in the 'foreign' package:

    > library( foreign )
    > newdata <- read.spss("c:/newdata.sav")

Files created with very recent versions of SPSS will produce a warning message but the problem seems innocuous. Missing data codes need to be processed further in the R file.

The plain use of the 'read.spss' command, above, produces a 'list' instead of a 'data.frame'. Also, value labels have extra spaces to stretch them to 256 characters. Generally it is better to use:

    > library( foreign )
    > newdata <- read.spss("c:/newdata.sav", trim.factor.names = TRUE, to.data.frame = TRUE)
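Once the file is in a data frame, numeric missing data codes can be recoded to NA. The sketch below uses a toy data frame in place of a real SPSS file. The variable names and the code 999 are assumptions made for the example, so substitute whatever codes your own file uses.

```r
# Toy data frame standing in for the result of read.spss().
# The variable names and the missing data code 999 are hypothetical.
newdata <- data.frame(anxiety = c(12, 999, 15, 9),
                      mood    = c(3, 4, 999, 999))

newdata[newdata == 999] <- NA    # replace the code everywhere in the data frame
colSums(is.na(newdata))          # number of missing values per variable
```

If a code is a legitimate value for some variables, recode one variable at a time instead, for example newdata$mood[newdata$mood == 999] <- NA.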

The 'read.spss' function was written for older versions of SPSS and works best if variable names in the SPSS file have at most 8 characters. If your variable names are longer they will be turned into shorter but unique names. You can change the names in the R data.frame back to the original names if you wish.
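Changing names back can be done with the 'names' function. This is only a sketch on a toy data frame. The shortened names and the longer replacements are invented and do not come from a real SPSS file.

```r
# Toy data frame with shortened names of the kind described above (hypothetical)
newdata <- data.frame(SELFEST1 = c(3, 4), SELFEST2 = c(5, 2))

names(newdata)    # current (shortened) names

# Replace all names at once, in column order
names(newdata) <- c("self.esteem.pre", "self.esteem.post")

# Or rename a single variable by matching its current name
names(newdata)[names(newdata) == "self.esteem.post"] <- "self.esteem.followup"
names(newdata)
```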

For more information, see R:_Data_conversion_from_SPSS

### Quick entry by cutting and pasting


## Handling data frames

- See some examples in the PSYC 6140 notes for week 2
- Good sample script in Fox's UCLA tutorial [1]