A Guide to SQL

SQL is a core skill for data engineers and data scientists. This mini-tutorial explains the four fundamental SQL operations, Create, Read, Update, and Delete, using a fun example: a database of movie quotes.

Introduction

It may be hard to believe, but SQL is all around us. Every application that manipulates data of any kind needs to store that data somewhere. Whether it’s Big Data or a table with just a few rows, a government or a small startup, a large database that spans multiple servers or a mobile phone that runs its own small database, SQL is ubiquitous.

But what is SQL? SQL stands for Structured Query Language and is usually pronounced “ess-que-el”. SQL is the language of databases, built specifically for communicating with them. It is a simple language that resembles English, as its commands are structured almost like English sentences. Those sentences declare what result is wanted rather than spelling out the steps needed to produce it, which is why SQL is called a declarative language.
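To make "declarative" concrete, here is a minimal sketch that runs one SQL query next to a hand-written loop doing the same job. It assumes Python's built-in `sqlite3` module as the database engine (the article does not name one), and the `quotes` table and its sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (movie, quote) pairs chosen for illustration.
rows = [
    ("The Terminator", "I'll be back."),
    ("Casablanca", "Here's looking at you, kid."),
    ("The Godfather", "I'm going to make him an offer he can't refuse."),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE quotes (movie TEXT, quote TEXT)")
conn.executemany("INSERT INTO quotes VALUES (?, ?)", rows)

# Declarative: state what is wanted; the database decides how to find it.
declarative = [
    row[0]
    for row in conn.execute("SELECT quote FROM quotes WHERE movie = 'Casablanca'")
]

# Imperative equivalent: spell out the loop and the test by hand.
imperative = []
for movie, quote in rows:
    if movie == "Casablanca":
        imperative.append(quote)

print(declarative)
print(imperative)
```

Both lists hold the same single quote. The SQL version reads almost like the sentence "select the quote from quotes where the movie is Casablanca".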

Why learn a whole new language when there are many tools for writing SQL queries visually? Even when working with such tools, it is important to know SQL and to understand what the visual tools are doing, and why. Sometimes you need to write a few SQL statements by hand, not only because it is the fastest way, but because it is more powerful and often the only way to achieve the goal.

What is a Database

We have mentioned that SQL is the language of databases. What is a database? A database is a storage mechanism designed to provide access to stored information and to allow its manipulation. Information in the database is stored in objects called tables. Tables are uniquely identified by their names and are made up of columns and rows. Each column has a name, a data type, and any other attributes defined for it. Rows contain the records, that is, the data for the columns. Many of the tables in a database have relationships, or links, between them, for example one-to-one or one-to-many. Databases built from tables linked in this way are called relational databases.
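The pieces described above (named tables, typed columns, and a one-to-many link) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module, and the `movies` and `quotes` tables with their column names are hypothetical, not a schema given by the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Each column gets a name, a data type, and optional attributes.
conn.execute("""
    CREATE TABLE movies (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        year  INTEGER
    )
""")

# One movie can have many quotes: quotes.movie_id links back to movies.id.
conn.execute("""
    CREATE TABLE quotes (
        id       INTEGER PRIMARY KEY,
        movie_id INTEGER NOT NULL REFERENCES movies(id),
        quote    TEXT NOT NULL
    )
""")

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [
    (name, col_type)
    for _, name, col_type, *_ in conn.execute("PRAGMA table_info(quotes)")
]
print(columns)

# A quote that points at a movie that does not exist is rejected.
try:
    conn.execute("INSERT INTO quotes (movie_id, quote) VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

The `REFERENCES movies(id)` clause is what turns two separate tables into related ones: every row in `quotes` must point at an existing row in `movies`.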

The easiest way to describe a database structure is to compare it with an Excel spreadsheet, which many people are familiar with. A database corresponds to one spreadsheet file. The sheets in that file are the tables, each with its own name. Columns and rows mean the same thing in both. SQL can be used to create new tables or alter existing ones, and to fetch, update, or delete data.
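As a rough sketch of those operations, the snippet below creates a table, alters it, and then inserts, fetches, updates, and deletes rows. It assumes Python's built-in `sqlite3` module; the table layout and the `character_name` column are hypothetical names picked for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create a new table, then alter it by adding a column.
conn.execute("CREATE TABLE quotes (id INTEGER PRIMARY KEY, movie TEXT, quote TEXT)")
conn.execute("ALTER TABLE quotes ADD COLUMN character_name TEXT")

# Create: insert rows (the first quote deliberately lacks its full stop).
insert_sql = "INSERT INTO quotes (movie, quote, character_name) VALUES (?, ?, ?)"
conn.execute(insert_sql, ("The Terminator", "I'll be back", "The Terminator"))
conn.execute(insert_sql, ("Casablanca", "Here's looking at you, kid.", "Rick Blaine"))

# Read: fetch data with SELECT.
select_sql = "SELECT quote FROM quotes WHERE movie = ?"
before = conn.execute(select_sql, ("The Terminator",)).fetchone()[0]

# Update: correct the stored quote in place.
conn.execute(
    "UPDATE quotes SET quote = ? WHERE movie = ?",
    ("I'll be back.", "The Terminator"),
)
after = conn.execute(select_sql, ("The Terminator",)).fetchone()[0]

# Delete: remove a row.
conn.execute("DELETE FROM quotes WHERE movie = ?", ("Casablanca",))
remaining = conn.execute("SELECT COUNT(*) FROM quotes").fetchone()[0]

print(before, "|", after, "|", remaining)
```

The `?` placeholders let the database driver pass values separately from the SQL text, which avoids quoting mistakes when a value itself contains an apostrophe.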

Say we have a big collection of famous movie quotes stored in assorted separate text files. Even if we are more organized and use an Excel spreadsheet, the problem is the same. With quotes stored that way, we can’t quickly get all quotes from one movie or all quotes from one character. If we move our text files or spreadsheet into a database and create tables with relations between them, all this becomes possible. What does relational really mean? The relational model is a way to describe data and the relationships between data entities. In our example, a relation is the connection between each quote and the table where movie titles are stored, or the table where characters are stored.
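The two lookups mentioned above, all quotes from one movie and all quotes from one character, might look like this once the data sits in related tables. This is a sketch under assumptions: it uses Python's built-in `sqlite3` module, and the three-table layout, the helper function names, and the handful of sample rows are illustrative choices rather than the article's own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE characters (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE quotes (
        id           INTEGER PRIMARY KEY,
        movie_id     INTEGER NOT NULL REFERENCES movies(id),
        character_id INTEGER NOT NULL REFERENCES characters(id),
        quote        TEXT NOT NULL
    );
    INSERT INTO movies VALUES
        (1, 'The Terminator'),
        (2, 'Terminator 2: Judgment Day'),
        (3, 'Casablanca');
    INSERT INTO characters VALUES (1, 'The Terminator'), (2, 'Rick Blaine');
    INSERT INTO quotes VALUES
        (1, 1, 1, 'I''ll be back.'),
        (2, 2, 1, 'Hasta la vista, baby.'),
        (3, 3, 2, 'Here''s looking at you, kid.');
""")


def quotes_from_movie(title):
    """Return every quote linked to the movie with the given title."""
    sql = """
        SELECT q.quote
        FROM quotes AS q
        JOIN movies AS m ON m.id = q.movie_id
        WHERE m.title = ?
        ORDER BY q.id
    """
    return [row[0] for row in conn.execute(sql, (title,))]


def quotes_by_character(name):
    """Return every quote linked to the character with the given name."""
    sql = """
        SELECT q.quote
        FROM quotes AS q
        JOIN characters AS c ON c.id = q.character_id
        WHERE c.name = ?
        ORDER BY q.id
    """
    return [row[0] for row in conn.execute(sql, (name,))]


print(quotes_from_movie("Casablanca"))
print(quotes_by_character("The Terminator"))
```

Each quote row stores only the ids of its movie and character; the `JOIN` follows those links at query time, so a character who appears in two movies is stored once and still matched to quotes from both.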


Copyright © 2016-2021 Data Scientist. All rights reserved.