Monday, March 12, 2018

Machine Learning: R






  1. Expressions

  2. 1.1

    Type anything at the prompt, and R will evaluate it and print the answer.
    Let's try some simple math. Type the command below.
    > 1 + 1
    [1] 2
    
    There's your result, 2. It's printed on the console right after your entry.
  3. Type the string "Arr, matey!". (Don't forget the quotes!)
    > "Arr,matey!"
    [1] "Arr,matey!"
    

  4. Now try multiplying 6 times 7 (* is the multiplication operator).
    > 6*7
    [1] 42
    

  5. Logical Values 1.2

    Some expressions return a "logical value": TRUE or FALSE. (Many programming languages refer to these as "boolean" values.) Let's try typing an expression that gives us a logical value:
    > 3 < 4
    [1] TRUE

  6. And another logical value (note that you need a double-equals sign to check whether two values are equal - a single-equals sign won't work):
    > 2+2 ==5
    [1] FALSE
    

  7. T and F are shorthand for TRUE and FALSE. Try this:
    > T== TRUE
    [1] TRUE
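
To recap the logical-value steps as one script, here is a small sketch. It also illustrates one point the steps above don't cover: T and F are ordinary variables that merely start out equal to TRUE and FALSE, so they can be overwritten, while TRUE and FALSE are reserved words. The variable names are made up for this example.

```r
# Comparisons return logical values.
is_less  <- 3 < 4        # TRUE
is_equal <- 2 + 2 == 5   # FALSE

# In a fresh session, T and F are equal to TRUE and FALSE.
shorthand_ok <- T == TRUE       # TRUE

# Unlike TRUE, T can be reassigned, which silently changes its meaning.
T <- 0
shorthand_broken <- isTRUE(T)   # FALSE, because T is now the number 0

# Remove our copy so that T refers to the built-in value again.
rm(T)
```

For this reason, spelling out TRUE and FALSE in scripts is usually the safer habit.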
    

  8. Variables 1.3

    As in other programming languages, you can store a value in a variable to access it later. Type x <- 42 to store a value in x.
    > x <- 42

  9. x can now be used in expressions in place of the original result. Try dividing x by 2 (/ is the division operator).
    > x/2
    [1] 21
    

  10. You can re-assign any value to a variable at any time. Try assigning "Arr, matey!" to x.
    > x <- "Arr, matey!"

  11. You can print the value of a variable at any time just by typing its name in the console. Try printing the current value of x.
    > x
    [1] "Arr, matey!"
    

  12. Now try assigning the TRUE logical value to x.
    > x <- TRUE
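
The variable steps above can be sketched as a single script. The class function is not part of the steps; it is used here only to show that the type of x follows whatever value was assigned last. The names half, first_class, second_class and third_class are made up for illustration.

```r
# A variable can hold a number...
x <- 42
half <- x / 2               # 21
first_class <- class(x)     # "numeric"

# ...then a string...
x <- "Arr, matey!"
second_class <- class(x)    # "character"

# ...then a logical value. Each assignment simply replaces the old value.
x <- TRUE
third_class <- class(x)     # "logical"
```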

  13. Functions 1.4

    You call a function by typing its name, followed by one or more arguments to that function in parentheses. Let's try using the sum function, to add up a few numbers. Enter:
    > sum(1,3,5)
    [1] 9
    

  14. Some arguments have names. For example, to repeat a value 3 times, you would call the rep function and provide its times argument:
    > rep("Yo ho!",times = 3)
    [1] "Yo ho!" "Yo ho!" "Yo ho!"
    

  15. Try calling the sqrt function to get the square root of 16.
    > sqrt(16)
    [1] 4
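
As a quick recap of positional and named arguments, here is a minimal sketch. The each argument of rep and the nested call are additions for illustration, not something the steps above asked for, and the variable names are made up.

```r
# Positional arguments: sum adds up everything it is given.
total <- sum(1, 3, 5)                 # 9

# Named argument: times repeats the whole value.
chant <- rep("Yo ho!", times = 3)     # "Yo ho!" "Yo ho!" "Yo ho!"

# rep also has an each argument, which repeats each element in turn.
pairs <- rep(c("a", "b"), each = 2)   # "a" "a" "b" "b"

# Calls can be nested: the inner result becomes the outer argument.
root <- sqrt(sum(7, 9))               # 4
```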
















  1. Vectors

    The name may sound intimidating, but a vector is simply a list of values. R relies on vectors for many of its operations. This includes basic plots - we'll have you drawing graphs by the end of this chapter (and it's a lot easier than you might think)!
  2. Vectors 2.1

    A vector's values can be numbers, strings, logical values, or any other type, as long as they're all the same type. Try creating a vector of numbers, like this:
    > c(4, 7, 9)
    [1] 4 7 9
    
    The c function (c is short for Combine) creates a new vector by combining a list of values.
  3. Now try creating a vector with strings:
    > c('a', 'b', 'c')
    [1] "a" "b" "c"
    
  4. Vectors cannot hold values with different modes (types). Try mixing modes and see what happens:
    > c(1, TRUE, "three")
    [1] "1"     "TRUE"  "three"
    
    All the values were converted to a single mode (characters) so that the vector can hold them all.
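
The conversion described above follows a simple ordering: logical values can become numbers, and anything can become a string. Here is a small sketch of that behavior; the explicit conversion with as.numeric at the end is an addition for illustration, and the variable names are made up.

```r
# Logical and numeric values mixed: TRUE becomes 1, and the result is numeric.
nums <- c(1, TRUE)             # 1 1

# Anything mixed with a string becomes a string.
strs <- c(1, TRUE, "three")    # "1" "TRUE" "three"

nums_class <- class(nums)      # "numeric"
strs_class <- class(strs)      # "character"

# To go the other way, convert explicitly. Strings that are not numbers
# become NA (R also emits a warning, which is silenced here).
back <- suppressWarnings(as.numeric(c("1", "2", "three")))   # 1 2 NA
```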
  5. Sequence Vectors 2.2

    If you need a vector with a sequence of numbers you can create it with start:end notation. Let's make a vector with values from 5 through 9:
    > 5:9
    [1]  5  6  7  8  9
    
  6. A more versatile way to make sequences is to call the seq function. Let's do the same thing with seq:
    > seq(5, 9)
    [1] 5 6 7 8 9
    
  7. seq also allows you to use increments other than 1. Try it with steps of 0.5:
    > seq(5, 9, 0.5)
    [1] 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0
    
  8. Now try making a vector with integers from 9 down to 5:
    > 9:5
    [1] 9 8 7 6 5
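
The sequence steps above can be collected into one sketch. The by and length.out argument names are spelled out here for readability; length.out was not used in the steps and is shown only as a related option. Variable names are made up.

```r
up     <- 5:9                         # 5 6 7 8 9
halves <- seq(5, 9, by = 0.5)         # 5.0 5.5 6.0 ... 9.0 (nine values)
down   <- seq(9, 5, by = -1)          # 9 8 7 6 5, same values as 9:5

# length.out asks for a number of evenly spaced values instead of a step size.
spaced <- seq(5, 9, length.out = 3)   # 5 7 9
```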
    
  9. Vector Access 2.3

    We're going to create a vector with some strings in it for you, and store it in the sentence variable.
    You can retrieve an individual value within a vector by providing its numeric index in square brackets. Try getting the third value:
    > sentence <- c('walk', 'the', 'plank')
    > sentence[3]
    [1] "plank"
    
  10. Many languages start array indices at 0, but R's vector indices start at 1. Get the first value by typing:
    > sentence[1]
    [1] "walk"
    
  11. You can assign new values within an existing vector. Try changing the third word to "dog":
    > sentence[3] <- "dog"
  12. If you add new values onto the end, the vector will grow to accommodate them. Let's add a fourth word:
    > sentence[4] <- 'to'
  13. You can use a vector within the square brackets to access multiple values. Try getting the first and third words:
    > sentence[c(1, 3)]
    [1] "walk" "dog"
    
  14. This means you can retrieve ranges of values. Get the second through fourth words:
    > sentence[2:4]
    [1] "the" "dog" "to"
    
  15. You can also set ranges of values; just provide the values in a vector. Add words 5 through 7:
    > sentence[5:7] <- c('the', 'poop', 'deck')
  16. Now try accessing the sixth word of the sentence vector:
    > sentence[6]
    [1] "poop"
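
Here is the whole vector-access sequence as one sketch, so you can see how the sentence grows. The last two parts go slightly beyond the steps above and are included as illustrations: assigning past the end of a vector fills the gap with NA, and a negative index drops a position. The names whole, gappy and rest are made up.

```r
sentence <- c("walk", "the", "plank")
sentence[3] <- "dog"                         # replace the third word
sentence[4] <- "to"                          # grow by one
sentence[5:7] <- c("the", "poop", "deck")    # grow by three more

whole <- paste(sentence, collapse = " ")     # "walk the dog to the poop deck"

# Assigning past the end leaves NA in any skipped positions.
gappy <- c(1, 2, 3)
gappy[6] <- 10                               # 1 2 3 NA NA 10

# A negative index drops that position instead of selecting it.
rest <- sentence[-1]                         # everything except "walk"
```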
    
  17. Vector Names 2.4

    For this challenge, we'll make a 3-item vector for you, and store it in the ranks variable.
    You can assign names to a vector's elements by passing a second vector filled with names to the names assignment function, like this:
    > ranks <- 1:3
    > names(ranks) <- c("first", "second", "third")
  18. Names assigned to a vector act as useful labels for the data. Below, you can see what our vector looks like now.
    You can also use the names to access the vector's values. Try getting the value for the "first" rank:
    > ranks
     first second  third
         1      2      3
    > ranks["first"]
    first 
        1
    
  19. Now set the current value for the "third" rank to a different value using the name rather than the position.
    > ranks["third"] <- 4
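
A minimal sketch of the naming steps above, run start to finish. Two things here are additions for illustration: names can also be supplied inline when the vector is created, and unname strips the label from a looked-up value. The names ranks2, labelled and bare are made up.

```r
ranks <- 1:3
names(ranks) <- c("first", "second", "third")
ranks["third"] <- 4                     # update by name instead of position

# Names can also be given when the vector is created.
ranks2 <- c(first = 1, second = 2, third = 4)

# Looking up by name keeps the label; unname drops it.
labelled <- ranks["first"]              # a named value: first = 1
bare     <- unname(ranks["first"])      # 1
```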
  20. Plotting One Vector 2.5

    The barplot function draws a bar chart with a vector's values. We'll make a new vector for you, and store it in the vesselsSunk variable.
    Now try passing the vector to the barplot function:
    > vesselsSunk <- c(4, 5, 1)
    > barplot(vesselsSunk)
    
  21. If you assign names to the vector's values, R will use those names as labels on the bar plot. Let's use the names assignment function again:
    > names(vesselsSunk) <- c("England", "France", "Norway")
  22. Now, if you call barplot with the vector again, you'll see the labels:
    > barplot(vesselsSunk)
    
  23. Now, try calling barplot on a vector of integers ranging from 1 through 100:
    > barplot(1:100)
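
If you'd rather not attach names to the vector itself, barplot can take the labels directly. The sketch below is one way to do that, not the approach the steps above use. The data values and country labels are assumed for illustration, and the title and axis label are made up.

```r
# Values and labels assumed for illustration.
vesselsSunk <- c(4, 5, 1)
countries   <- c("England", "France", "Norway")

# names.arg supplies the bar labels without modifying the vector.
# barplot returns the x position of each bar's midpoint.
mids <- barplot(vesselsSunk, names.arg = countries,
                main = "Vessels sunk", ylab = "Count")
```

When run non-interactively (for example with Rscript), the chart is written to a file rather than shown in a window.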
    
  24. Vector Math 2.6

    Most arithmetic operations work just as well on vectors as they do on single values. We'll make another sample vector for you to work with, and store it in the a variable.
    If you add a scalar (a single value) to a vector, the scalar will be added to each value in the vector, returning a new vector with the results. Try adding 1 to each element in our vector:
    > a <- c(1, 2, 3)
    > a + 1
    [1] 2 3 4
    
  25. The same is true of division, multiplication, or any other basic arithmetic. Try dividing our vector by 2:
    > a / 2
    [1] 0.5 1.0 1.5
    
  26. Now try multiplying our vector by 2:
    > a * 2
    [1] 2 4 6
    
  27. If you add two vectors, R will take each value from each vector and add them. We'll make a second vector for you to experiment with, and store it in the b variable.
    Try adding it to the a vector:
    > b <- c(4, 5, 6)
    > a + b
    [1] 5 7 9
    
  28. Now try subtracting b from a:
    > a - b
    [1] -3 -3 -3
    
  29. You can also take two vectors and compare each item. See which values in the a vector are equal to those in a second vector:
    > a == c(1, 99, 3)
    [1]  TRUE FALSE  TRUE
    
    Notice that R didn't test whether the whole vectors were equal; it checked each value in the a vector against the value at the same index in our new vector.
  30. Check if each value in the a vector is less than the corresponding value in another vector:
    > a < c(1, 99, 3)
    [1] FALSE  TRUE FALSE
    
  31. Functions that normally work with scalars can operate on each element of a vector, too. Try getting the sine of each value in our vector:
    > sin(a)
    [1] 0.8414710 0.9092974 0.1411200
    
  32. Now try getting the square roots with sqrt:
    > sqrt(a)
    [1] 1.000000 1.414214 1.732051
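
The vector math above can be summed up in a short sketch. The recycling example with vectors of different lengths, and the use of all and identical to compare whole vectors, are additions for illustration rather than part of the steps. Variable names other than a and b are made up.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

plus_one <- a + 1               # 2 3 4: the single value is reused for each element
sums     <- a + b               # 5 7 9: elements are paired up by position
matches  <- a == c(1, 99, 3)    # TRUE FALSE TRUE

# When lengths differ, the shorter vector is recycled.
recycled <- c(1, 2, 3, 4) + c(10, 20)   # 11 22 13 24

# == compares element by element; all() and identical() answer
# "are the whole vectors equal?".
same_everywhere <- all(a == c(1, 2, 3))       # TRUE
same_object     <- identical(a, c(1, 2, 3))   # TRUE
```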
    
  33. Scatter Plots 2.7

    The plot function takes two vectors, one for X values and one for Y values, and draws a graph of them.
    Let's draw a graph showing the relationship of numbers and their sines.
    First, we'll need some sample data. We'll create a vector for you with some fractional values between 0 and 20, and store it in the x variable.
    Now, try creating a second vector with the sines of those values:
    > x <- seq(1, 20, 0.1)
    > y <- sin(x)
  34. Then simply call plot with your two vectors:
    > plot(x,y)
    
    Great job! Notice on the graph that values from the first argument (x) are used for the horizontal axis, and values from the second (y) for the vertical.
    [Scatter plot: x on the horizontal axis (ticks from 5 to 20), y on the vertical axis (from -1.0 to 1.0)]
  35. Your turn. We'll create a vector with some negative and positive values for you, and store it in the values variable.
    We'll also create a second vector with the absolute values of the first, and store it in the absolutes variable.
    Try plotting the vectors, with values on the horizontal axis, and absolutes on the vertical axis.
    > values <- -10:10
    > absolutes <- abs(values)
    > plot(values, absolutes)
    
    [Scatter plot: values on the horizontal axis (from -10 to 10), absolutes on the vertical axis (from 0 to 10)]
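
Both scatter plots can be reproduced with the sketch below. The starting value of 1 for x is an assumption, and the main, xlab and ylab arguments are optional extras added here to title the plot and label the axes.

```r
# Sine curve: x values in steps of 0.1 (start of 1 assumed), y is their sine.
x <- seq(1, 20, 0.1)
y <- sin(x)
plot(x, y, main = "Sine of x", xlab = "x", ylab = "sin(x)")

# Absolute values: the first vector goes on the horizontal axis,
# the second on the vertical axis.
values    <- -10:10
absolutes <- abs(values)
plot(values, absolutes, main = "Absolute values")
```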
  36. NA Values 2.8

    Sometimes, when working with sample data, a given value isn't available. But it's not a good idea to just throw those values out. R has a value that explicitly indicates a sample was not available: NA. Many functions that work with vectors treat this value specially.
    We'll create a vector for you with a missing sample, and store it in the a variable.
    Try to get the sum of its values, and see what the result is:
    > a <- c(1, 3, NA, 7, 9)
    > sum(a)
    [1] NA
    
    The sum is considered "not available" by default because one of the vector's values was NA. This is the responsible thing to do; R won't just blithely add up the numbers without warning you about the incomplete data. We can explicitly tell sum (and many other functions) to remove NA values before they do their calculations, however.
  37. Remember that command to bring up help for a function? Bring up documentation for the sum function:
    > help(sum)
    sum                    package:base                    R Documentation
    
    Sum of Vector Elements
    
    Description:
        'sum' returns the sum of all the values present in its arguments.
    
    Usage:
        sum(..., na.rm = FALSE)
    ...
    
    As you see in the documentation, sum can take an optional named argument, na.rm. It's set to FALSE by default, but if you set it to TRUE, all NA arguments will be removed from the vector before the calculation is performed.
  38. Try calling sum again, with na.rm set to TRUE:
    > sum(a, na.rm = TRUE)
    [1] 20
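
Here is a small sketch that pulls the NA steps together. The mean and is.na calls are additions for illustration: mean accepts the same na.rm argument, and is.na is the usual way to find missing values, since comparing with == NA itself returns NA. Variable names other than a are made up.

```r
a <- c(1, 3, NA, 7, 9)

total_default <- sum(a)                  # NA
total_clean   <- sum(a, na.rm = TRUE)    # 20
average       <- mean(a, na.rm = TRUE)   # 5

# is.na marks the missing positions.
missing_at <- is.na(a)                   # FALSE FALSE TRUE FALSE FALSE
how_many   <- sum(missing_at)            # 1
observed   <- a[!missing_at]             # 1 3 7 9
```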
    
  39. Chapter 2 Completed

    In this chapter, we've shown you all the basics of manipulating vectors - creating and accessing them, doing math with them, and making sequences. We've shown you how to make bar plots and scatter plots with vectors. And we've shown you how R treats vectors where one or more values are not available.
    The vector is just the first of several data structures that R offers. See you in the next chapter, where we'll talk about… the matrix.
