16.5 Tensors (cont.)

  • Deep learning frameworks generally manipulate data in the form of tensors
  • A tensor is basically a multidimensional array
  • Frameworks like TensorFlow pack all your data into one or more tensors
    • Used to perform the mathematical calculations that enable neural networks to learn
  • Tensors can become quite large

  • Chollet discusses the types of tensors typically encountered in deep learning: [Chollet, François. Deep Learning with Python. Section 2.2. Shelter Island, NY: Manning Publications, 2018.]
  • 0D (0-dimensional) tensor—Holds a single value and is known as a scalar
  • 1D tensor—Similar to a one-dimensional array and is known as a vector
    • Might represent a sequence, such as hourly temperature readings from a sensor or the words of one movie review
  • 2D tensor—Similar to a two-dimensional array and is known as a matrix
    • Could represent a grayscale image in which the tensor’s two dimensions are the image’s width and height in pixels, and the value in each element is the intensity of that pixel
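The three tensor types above can be sketched with NumPy, the array library that deep learning frameworks build on. The values below are made up for illustration; an ndarray's `ndim` attribute gives the tensor's dimensionality and `shape` gives its size along each dimension:

```python
import numpy as np

scalar = np.array(5)                       # 0D tensor (scalar)—a single value
vector = np.array([21.5, 22.1, 22.8])      # 1D tensor (vector)—e.g., hourly temperature readings
matrix = np.array([[0, 128],
                   [255, 64]])             # 2D tensor (matrix)—e.g., a 2-by-2 grayscale image

print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
print(matrix.shape)                           # (2, 2)
```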

  • 3D tensor—Similar to a three-dimensional array
    • Could represent a color image
      • First two dimensions would represent the width and height of the image in pixels
      • The depth at each location might represent the red, green and blue (RGB) components of a pixel’s color
    • Also could represent a collection of 2D tensors containing grayscale images
  • 4D tensor
    • Could represent a collection of color images in 3D tensors
    • Could represent one video
      • Each frame in a video is essentially a color image
  • 5D tensor
    • Could represent a collection of 4D tensors containing videos
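A NumPy sketch of the 3D–5D shapes described above. The dimensions here are deliberately tiny stand-ins (real images and videos would be far larger); each `zeros` call just allocates a tensor of the given shape:

```python
import numpy as np

color_image = np.zeros((4, 6, 3))          # 3D: height x width x RGB depth
image_batch = np.zeros((2, 4, 6, 3))       # 4D: a collection of color images
video       = np.zeros((5, 4, 6, 3))       # 4D: one video—frames x height x width x RGB
video_batch = np.zeros((2, 5, 4, 6, 3))    # 5D: a collection of videos

for tensor in (color_image, image_batch, video, video_batch):
    print(tensor.ndim, tensor.shape)
```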

  • Let’s assume we’re creating a deep-learning network to identify and track objects in 4K (high-resolution) videos recorded at 30 frames per second
    • Each frame in a 4K video is 3840-by-2160 pixels
  • Also assume the pixels are presented as red, green and blue components of a color
  • So each frame would be a 3D tensor containing a total of 24,883,200 elements (3840 × 2160 × 3) and each video would be a 4D tensor containing the sequence of frames
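The per-frame element count works out as follows:

```python
# elements per 4K frame: width x height x 3 color components (RGB)
frame_elements = 3840 * 2160 * 3
print(f'{frame_elements:,}')  # 24,883,200
```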

  • If the videos are one minute long, you’d have 44,789,760,000 elements per tensor!
  • Over 600 hours of video are uploaded to YouTube every minute so, in just one minute of uploads, Google could have a tensor containing 1,612,431,360,000,000 elements to use in training deep-learning models—that’s big data
  • As you can see, tensors can quickly become enormous, so manipulating them efficiently is crucial
  • This is one of the key reasons that most deep learning is performed on GPUs
  • More recently, Google created TPUs (Tensor Processing Units), which are specifically designed to perform tensor manipulations
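The element counts above can be verified directly. At 30 frames per second, a one-minute 4K video has 1,800 frames, and the slide's 600-hours-per-minute YouTube figure corresponds to 36,000 minutes of uploaded video:

```python
frame_elements = 3840 * 2160 * 3                 # elements per 4K frame
frames_per_minute = 30 * 60                      # 30 fps x 60 seconds = 1,800 frames
video_elements = frame_elements * frames_per_minute
print(f'{video_elements:,}')                     # 44,789,760,000 per one-minute video

minutes_uploaded = 600 * 60                      # 600 hours of uploads = 36,000 minutes
print(f'{video_elements * minutes_uploaded:,}')  # 1,612,431,360,000,000
```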

High-Performance Processors

  • Google TPUs (Tensor Processing Units)
  • Recognizing that deep learning is crucial to its future, Google developed TPUs, which it now uses in its Cloud TPU service, which “can provide up to 11.5 petaflops of performance in a single pod” (that’s 11.5 quadrillion floating-point operations per second)
  • TPUs are designed to be especially energy efficient—a key concern for companies like Google with already massive computing clusters that are growing exponentially and consuming vast amounts of energy

©1992–2020 by Pearson Education, Inc. All Rights Reserved. This content is based on Chapter 5 of the book Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud.

DISCLAIMER: The authors and publisher of this book have used their best efforts in preparing the book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The authors and publisher make no warranty of any kind, expressed or implied, with regard to these programs or to the documentation contained in these books. The authors and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.