Relational Databases

raphael krantz
2 min read · Jul 31, 2020
https://geeksgod.com/storage-structure-in-database-management/

Previous posts in this introductory series on data science have focused primarily on philosophical considerations. I’ll now start to look at data science directly through one of its most common tools: databases.

We live in a data-saturated world where almost all people have at least some familiarity with data; 5 billion people have a mobile device of some kind and half of those possess a smartphone.¹ We follow polls and listen to podcasts filled with health reports. Many people track their health using a Fitbit or other wearable device. And long before the smartphone revolution, countless people were using products such as Microsoft Excel to keep track of hours worked and complete their timesheets.

While there are more advanced tools for data processing, like Python with powerful libraries such as pandas and scikit-learn, the fact remains that one of the most commonly used tools for data analysis is a language that is almost 50 years old: SQL.

Indeed, data science jobs categorized as data analyst roles often list SQL as the prime requisite for employment. So by all means learn Python and R. But we’ll begin here.

SQL stands for Structured Query Language, and it is used for querying relational databases. What’s a relational database? Generally, it’s a way of storing data in tables such that categories of data are related to each other.

For example, let's imagine that we are operating a restaurant. What kinds of data might be important to keep track of? Employees, menus, inventory, customers and complaints, to name a few. Imagine that today a loyal customer, Elizabeth, talks to you, the manager, and tells you that last week she received terrible service from one of your waitstaff, but she doesn’t know his name. She does remember the day she visited: it was a second date on Saturday night, July 25th. You have 100 employees in your restaurant and 50 of them are waitstaff. On any given night you have 15 waiters, servers and busers on duty. So how would you proceed in tracking down the wayward employee?
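As a rough preview of where this is heading, here is a minimal sketch of how that question could be asked of a relational database, using Python's built-in sqlite3 module. The table names (employees, shifts), their columns, and the sample rows are all made up for illustration and are not taken from any real restaurant system.

```python
import sqlite3

# In-memory database; the tables, columns and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        role        TEXT NOT NULL
    );
    CREATE TABLE shifts (
        shift_id    INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employees(employee_id),
        shift_date  TEXT NOT NULL,  -- ISO date such as '2020-07-25'
        shift_name  TEXT NOT NULL   -- 'lunch' or 'dinner'
    );
""")

conn.executemany(
    "INSERT INTO employees (employee_id, name, role) VALUES (?, ?, ?)",
    [(1, "Alex", "waiter"), (2, "Blake", "waiter"), (3, "Casey", "cook"),
     (4, "Dana", "waiter"), (5, "Eli", "waiter")],
)
conn.executemany(
    "INSERT INTO shifts (employee_id, shift_date, shift_name) VALUES (?, ?, ?)",
    [(1, "2020-07-25", "dinner"), (2, "2020-07-25", "lunch"),
     (3, "2020-07-25", "dinner"), (4, "2020-07-24", "dinner"),
     (5, "2020-07-25", "dinner")],
)

def waitstaff_on_duty(conn, date, shift):
    """Return the names of waiters scheduled for the given date and shift."""
    rows = conn.execute(
        """
        SELECT e.name
        FROM employees AS e
        JOIN shifts AS s ON s.employee_id = e.employee_id
        WHERE s.shift_date = ? AND s.shift_name = ? AND e.role = 'waiter'
        ORDER BY e.name
        """,
        (date, shift),
    ).fetchall()
    return [name for (name,) in rows]

# Saturday night, July 25th
suspects = waitstaff_on_duty(conn, "2020-07-25", "dinner")
print(suspects)
```

The point is that the employee and shift data live in separate tables, and the query relates them through a shared employee_id to narrow the list of candidates.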

In relational databases, the collection of data objects is called a schema. Below is part of a schema for a generic restaurant database that displays the relationships between data objects:

Each box represents a table defined by several pieces of information:

  • Table Name
  • Name of each data field
  • Data type for each field
  • Primary key for the table
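To make those four pieces concrete, here is a small sketch of one such table, again using Python's built-in sqlite3 module. The employees table and its fields are assumptions chosen for illustration; the actual schema for the restaurant may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical table definition showing the four pieces of information:
# the table name, the field names, a data type per field, and the primary key.
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        role        TEXT NOT NULL,
        hire_date   TEXT
    )
""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(employees)").fetchall()
fields = [(name, col_type, bool(pk)) for _, name, col_type, _, _, pk in columns]

for name, col_type, is_primary_key in fields:
    print(name, col_type, "PRIMARY KEY" if is_primary_key else "")
```

Reading the definition back with PRAGMA table_info is specific to SQLite; other database systems expose the same information in their own ways.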

Next we’ll explore databases in greater detail.

Stay tuned!
