Employee
table that might be used in a personnel system to store employees’ attributes:
Employee
table each department and location, and present the data in increasing order by department numbersqlite3
module, using a script provided in the ch17
example’s folder’s sql
subfoldersqlite3
command creates a SQLite database named books.db
and executes the books.sql
SQL script, which defines how to create the database’s tables and populates them with data:!sqlite3 books.db < ./sql/books.sql
<
indicates that books.sql is input into the sqlite3
command.sqlite3
connect
function connects to a database and returns a Connection
object:import sqlite3
connection = sqlite3.connect('books.db')
authors
Table¶authors
, author_ISBN
and titles
authors
table stores the authors and has three columns: id
first
—First name (a string)last
—Last name (a string)author
s Table’s Contents¶authors
tableimport pandas as pd
pd.options.display.max_columns = 10
pd.read_sql('SELECT * FROM authors', connection,
index_col=['id'])
author
s Table’s Contents (cont.)¶read_sql
executes a SQL query and returns a DataFrame
containing the query’s resultsConnection
objectindex_col
keyword argumentDataFrame
’s row indicesindex_col
index values from 0, otherwiseSELECT
gets rows and columns from one or more tablesSELECT * FROM authors
titles
Table's Contents¶titles
table stores the books and has four columns:isbn
title
—Book title stringedition
—Integer edition numbercopyright
—Copyright year as a stringtitles
Table's Contents (cont.)¶pd.read_sql('SELECT * FROM titles', connection)
author_ISBN
Table¶authors
table with their books in the titles
table: id
—Integer author id isbn
—String ISBN of a bookid
column is a foreign keyauthors
table’s id
primary-key column isbn
column is a foreign key titles
table’s isbn
primary-key columnauthor_ISBN
Table (cont.)¶books.sql
script)id
and isbn
columns in this table form a composite primary keyauthor_ISBN
Table (cont.)¶df = pd.read_sql('SELECT * FROM author_ISBN', connection)
df.head() # view first 5 of this table's many entries
author_ISBN
Table (cont.)¶books
Database¶authors
and author_ISBN
.authors
end there’s a 1
, and on the author_ISBN
end there’s an infinity symbol (¥).authors
table, there can be an arbitrary number of ISBNs for books written by that author in the author_ISBN
tableauthors
table's id
column (the primary key) to the author_ISBN
table's id
column (a foreign key)titles
and author_ISBN
tables illustrates a one-to-many relationshiptitles
table's primary key isbn
to the author_ISBN
table's foreign keyauthor_ISBN
table is to provide a many-to-many relationship between the authors
and titles
tablesbooks
database, demonstrating SQL queries and statements using the SQL keywords in the following table. SQL keyword | Description |
---|---|
SELECT |
Retrieves data from one or more tables. |
FROM |
Tables involved in the query. Required in every SELECT . |
WHERE |
Criteria for selection that determine the rows to be retrieved, deleted or updated. Optional in a SQL statement. |
GROUP BY |
Criteria for grouping rows. Optional in a SELECT query. |
ORDER BY |
Criteria for ordering rows. Optional in a SELECT query. |
INNER JOIN |
Merge rows from multiple tables. |
INSERT |
Insert rows into a specified table. |
UPDATE |
Update rows in a specified table. |
DELETE |
Delete rows from a specified table. |
SELECT
Queries¶SELECT
statements with the *
wildcard character get all the columns from a tablepd.read_sql('SELECT first, last FROM authors', connection)
WHERE
Clause Specifies Selection Criteria¶pd.read_sql("""SELECT title, edition, copyright
FROM titles
WHERE copyright > '2016'""", connection)
WHERE
clause operators: <
, >
, <=
, >=
, =
, <>
(not equal) and LIKE
LIKE
is used for pattern matchingpd.read_sql("""SELECT id, first, last
FROM authors
WHERE last LIKE 'D%'""",
connection, index_col=['id'])
pd.read_sql("""SELECT id, first, last
FROM authors
WHERE first LIKE '_b%'""",
connection, index_col=['id'])
ORDER BY
Clause¶ASC
and DESC
, respectivelyASC
pd.read_sql('SELECT title FROM titles ORDER BY title ASC',
connection)
ORDER BY
pd.read_sql("""SELECT id, first, last
FROM authors
ORDER BY last, first""",
connection, index_col=['id'])
pd.read_sql("""SELECT id, first, last
FROM authors
ORDER BY last DESC, first ASC""",
connection, index_col=['id'])
WHERE
and ORDER
BY
Clauses¶pd.read_sql("""SELECT isbn, title, edition, copyright
FROM titles
WHERE title LIKE '%How to Program'
ORDER BY title""", connection)
INNER
JOIN
¶head
of the resultpd.read_sql("""SELECT first, last, isbn
FROM authors
INNER JOIN author_ISBN
ON authors.id = author_ISBN.id
ORDER BY last, first""", connection).head()
INNER JOIN
(cont.)¶INNER JOIN
’s ON
clause uses a primary-key column in one table and a foreign-key column in the other to determine which rows to merge from each table.authors
table’s first
and last
columns with the author_ISBN
table’s isbn
column and sorts the results in ascending order by last
then first
. authors.id
in the ON
clause is a qualified name INSERT
INTO
Statement¶sqlite3
Cursor
object to modify the databaseConnection
’s cursor
methodcursor = connection.cursor()
INSERT INTO
statement inserts a row into a tableINSERT
INTO
Statement¶Cursor
method execute
cursor = cursor.execute("""INSERT INTO authors (first, last)
VALUES ('Sue', 'Red')""")
VALUES
and a comma-separated list of values in parenthesesINSERT
INTO
Statement¶id
is autoincremented, for every new row, SQLite assigns a unique id
value that is the next value in the autoincremented sequence pd.read_sql('SELECT id, first, last FROM authors',
connection, index_col=['id'])
'
).'O''Malley'
).UPDATE
Statement¶cursor = cursor.execute("""UPDATE authors SET last='Black'
WHERE last='Red' AND first='Sue'""")
UPDATE
keyword is followed by the table to update, the keyword SET
and a comma-separated list of column_name =
value pairs indicating the columns to change and their new valuesWHERE
clauseUPDATE
Statement (cont.)¶Cursor
object’s rowcount
attribute contains an integer value representing the number of rows that were modifiedUPDATE
modified one row:cursor.rowcount
authors
table’s contents:pd.read_sql('SELECT id, first, last FROM authors',
connection, index_col=['id'])
DELETE
FROM
Statement¶cursor = cursor.execute('DELETE FROM authors WHERE id=6')
cursor.rowcount
WHERE
clause determines which rows to deleteWHERE
is omitted, all the table’s rows are deletedauthors
table after the DELETE
operationpd.read_sql('SELECT id, first, last FROM authors',
connection, index_col=['id'])
Connection
’s close
method connection.close()
DataFrame
for which the data may be distributed over many computers in a Spark cluster.©1992–2020 by Pearson Education, Inc. All Rights Reserved. This content is based on Chapter 5 of the book Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud.
DISCLAIMER: The authors and publisher of this book have used their best efforts in preparing the book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The authors and publisher make no warranty of any kind, expressed or implied, with regard to these programs or to the documentation contained in these books. The authors and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.