13. Database

SELECT _ FROM _ WHERE _ : SELECT COL FROM TABLE WHERE CONDITION
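
As a small illustration of the pattern above, here is a sketch using Python's built-in sqlite3 module. The table name users, its columns, and the sample rows are made up for this example.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, email, name) VALUES (?, ?, ?)",
    [(1, "a@example.com", "Ann"), (2, "b@example.com", "Bob")],
)

# SELECT COL FROM TABLE WHERE CONDITION
rows = conn.execute(
    "SELECT name FROM users WHERE email = ?", ("b@example.com",)
).fetchall()
print(rows)  # [('Bob',)]
```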

Time Complexity
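Scan the table, checking each row => O(n)
-> Add a second table that uses email as its primary key and stores the matching id; look up the email there, then look up the row in the original table using id as a reference key => two O(log N) lookups, i.e. O(2*logN) = O(log N)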
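
The two approaches can be sketched in Python as follows. This is a simplified model, not how a real database engine works: sorted lists with binary search stand in for the tree-based indexes a database would typically use, and the rows and function names are invented for the example.

```python
from bisect import bisect_left

# Hypothetical rows (id, email, name), kept sorted by id.
table = [
    (1, "zoe@example.com", "Zoe"),
    (2, "ann@example.com", "Ann"),
    (3, "bob@example.com", "Bob"),
]

def scan_by_email(email):
    """Full scan: check every row until the email matches, O(n)."""
    for row in table:
        if row[1] == email:
            return row
    return None

# The extra lookup table: (email, id) pairs sorted by email.
email_index = sorted((email, row_id) for row_id, email, _ in table)
emails = [email for email, _ in email_index]
ids = [row[0] for row in table]

def find_by_email(email):
    """Two binary searches: email -> id, then id -> row, O(log N) each."""
    i = bisect_left(emails, email)
    if i == len(emails) or emails[i] != email:
        return None
    row_id = email_index[i][1]
    j = bisect_left(ids, row_id)  # id came from the table, so it is present
    return table[j]

print(find_by_email("bob@example.com"))  # (3, 'bob@example.com', 'Bob')
```

Both functions return the same row; the difference is only in how many comparisons they need as the table grows.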

Not good from a developer's standpoint, but useful for safety (backup)

Sharding : split the data into parts (e.g. halves) and put each part on a different machine
Replicating : write the same data to several machines
With sharding, the data can be evenly distributed across machines, and the queries along with it.
To improve query response time when reading data, replication helps. You can write to your primary and read from the secondaries to distribute the queries. The primary is then relieved of the expensive reads and only has to handle writes.
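
The primary/secondary split can be sketched with a toy in-memory store. The class name ReplicatedStore and its methods are hypothetical, and the sketch assumes writes are copied to the secondaries immediately; real systems often replicate with some delay.

```python
import itertools

class ReplicatedStore:
    """Toy primary/secondary store: writes go to the primary, reads to secondaries."""

    def __init__(self, n_secondaries=2):
        self.primary = {}
        self.secondaries = [{} for _ in range(n_secondaries)]
        # Round-robin over the secondaries to spread the reads.
        self._next = itertools.cycle(range(n_secondaries))

    def write(self, key, value):
        # The primary takes the write, then the same data is copied to each secondary.
        self.primary[key] = value
        for replica in self.secondaries:
            replica[key] = value

    def read(self, key):
        # Reads never touch the primary in this sketch.
        replica = self.secondaries[next(self._next)]
        return replica.get(key)

store = ReplicatedStore()
store.write("user:1", "Ann")
print(store.read("user:1"))  # Ann
```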

When existing shards are also replicated, there are more hosts available to respond to a query.

There is some improvement with sharding if you choose a good shard key. Write queries can be distributed too, but only if the key is chosen well. The main reason for sharding is to horizontally expand your database. When working with big data, instead of installing bigger and bigger disks, you can add new servers next to the existing ones, as many as you want.
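
One common way to route by shard key is to hash the key and take it modulo the number of shards. The sketch below assumes that scheme; NUM_SHARDS, shard_for, put and get are hypothetical names, and each dict stands in for a separate server.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of servers
shards = [{} for _ in range(NUM_SHARDS)]  # one dict per "machine"

def shard_for(key: str) -> int:
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def put(key: str, value):
    # A write only touches the one shard that owns the key.
    shards[shard_for(key)][key] = value

def get(key: str):
    # A read goes to the same shard the write went to.
    return shards[shard_for(key)].get(key)

put("ann@example.com", {"name": "Ann"})
print(get("ann@example.com"))  # {'name': 'Ann'}
```

A key whose hashes spread evenly gives each shard a similar share of the data and of the reads and writes. Note that with plain modulo routing, changing NUM_SHARDS moves most keys to a different shard.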