Saturday, April 4, 2020

Amazon DynamoDB


With so much going on in the field of databases, “NoSQL” is something that is capturing a lot of attention, so I thought I’d give it a try and understand what exactly it is. I started with basic NoSQL concepts, and while going through many articles I came across Amazon DynamoDB.

Amazon DynamoDB – Recapping what I’ve learned so far!

Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. It is a key-value and document database that delivers single-digit-millisecond performance at any scale.

Being from a SQL background, it was very important for me to relate DynamoDB to my existing knowledge. The following basic DynamoDB components map quite naturally to SQL:

        Tables – Similar to other database systems, DynamoDB stores data in tables.
        Items – An item is a group of attributes that is uniquely identifiable among all of the other items, like a row in a table.
        Attributes – Each item is composed of one or more attributes. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.

Compelling Features

       High Availability and Durability – Automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance.
       LSI and GSI – Indexes in DynamoDB are used for improving data accesses and retrievals. There are two different kinds of indexes that are supported, Local secondary indexes (LSI) and Global secondary indexes (GSI).
       SDK Options – SDKs are available for a wide variety of languages, including Java, Python, and more.
       Time to Live (TTL) – Lets you define when items in a table expire so that they can be automatically deleted from the database.
       Adaptive Capacity – Enables DynamoDB to run imbalanced workloads indefinitely.

There are other features as well, such as point-in-time recovery (PITR), Global Tables, Encryption, Auto Scaling, DynamoDB Accelerator (DAX), and much more worth exploring.

In conclusion, all I want to convey is that DynamoDB is an extremely powerful non-relational database service that addresses many limitations of existing databases, such as issues of scale and schema-less data storage, and it has immense scope in industries like Ad Tech and Gaming.

Resources and interesting reads:

Medium (https://medium.com/)
       10 Things you should know about DynamoDB
       Why Amazon DynamoDB isn’t for everyone

Monday, January 14, 2019

SQL Server 2016 New Features

STARTING WITH THE FIRST TWO FEATURES:



1.      DROP IF EXISTS (a.k.a. DIE): DROP IF EXISTS is a new T-SQL language enhancement that eliminates the need to test whether an object exists before you drop it.

BEFORE:

IF OBJECT_ID('dbo.Person', 'U') IS NOT NULL
DROP TABLE dbo.Person;

AFTER:

DROP TABLE IF EXISTS dbo.Person;

2.      SESSION_CONTEXT returns the value of the specified key in the current session context. The value is set by using the sp_set_session_context procedure.


Syntax: SESSION_CONTEXT (N'key')

NOTE: The prefix N tells SQL Server that the following string literal is of Unicode (nvarchar) type.





As mentioned earlier, the value is set by using the sp_set_session_context procedure, either with @read_only = 0 (the value can be changed later in the session) or with @read_only = 1 (the value is fixed for the session).

When a session context key has been created with @read_only set to 1, an error will be raised if an attempt is made to change the value.
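A minimal sketch of both cases (the key names here are purely illustrative):

```sql
-- Writable key (@read_only = 0): the value can be overwritten later
EXEC sp_set_session_context @key = N'user_id', @value = 4, @read_only = 0;
SELECT SESSION_CONTEXT(N'user_id') AS user_id;               -- 4
EXEC sp_set_session_context @key = N'user_id', @value = 5;   -- allowed

-- Read-only key (@read_only = 1): fixed for the lifetime of the session
EXEC sp_set_session_context @key = N'tenant_id', @value = 42, @read_only = 1;
-- The next call would raise an error, because the key is read_only:
-- EXEC sp_set_session_context @key = N'tenant_id', @value = 43;
```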



3.      DYNAMIC DATA MASKING: Data masking is the method of creating a new representation of data, using random characters or data that are structurally similar to the original data. It is normally done to protect sensitive data.

Dynamic Data Masking is a new security feature introduced in SQL Server 2016 that limits the exposure of sensitive data to unauthorized users at the database layer. In simple words,
Dynamic Data Masking (DDM) is an evolving technology that aims at real-time masking of production data. DDM changes the data stream so that the requester does not get access to the sensitive values, while no physical changes are made to the original production data.

To implement DDM, you first need to identify your sensitive data and designate the privileged users who are allowed to see it unmasked. The next step is to select and apply a masking function to each sensitive column.

Types of Masks

There are 4 types of masks offered by SQL Server 2016:

·        Default Mask
The default mask masks the full value according to the assigned datatype.
For string types (char, nchar, varchar, nvarchar, text, ntext) it replaces the data with XXXX, or fewer Xs if the field is shorter than 4 characters.
For numeric datatypes (bigint, bit, decimal, int, money, numeric, smallint, smallmoney, tinyint, float, real) it uses the value 0.
For binary datatypes (binary, varbinary, image) it uses a single byte with binary value 0.

·        Email Mask
The email mask is meant for fields that store email addresses. It exposes only the first letter of the address, followed by XXX, the @ sign, XXXX, and the constant suffix “.com”. E.g. aXXX@XXXX.com

·        Custom Mask
SQL Server 2016 allows you to define your own mask for a specific field. You specify how many prefix and suffix characters to expose, and a custom padding string to show in between: “prefix,[padding],suffix”.

·        Random Mask
The random mask can be defined on any numeric field. It exposes a random number within the defined range.
Syntax:
ALTER TABLE [Table Name]
ALTER COLUMN [Column Name] ADD MASKED WITH (FUNCTION = '[masking function]')
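As a sketch of all four masks against a hypothetical dbo.Customers table (the table and column names are mine; the masking functions are the built-in ones):

```sql
-- Default mask: full masking based on the column's datatype
ALTER TABLE dbo.Customers
ALTER COLUMN CreditLimit ADD MASKED WITH (FUNCTION = 'default()');

-- Email mask: aXXX@XXXX.com
ALTER TABLE dbo.Customers
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

-- Custom (partial) mask: expose first 2 and last 2 characters, pad with XXX
ALTER TABLE dbo.Customers
ALTER COLUMN Phone ADD MASKED WITH (FUNCTION = 'partial(2,"XXX",2)');

-- Random mask: a random number in the range 1-12
ALTER TABLE dbo.Customers
ALTER COLUMN DiscountPct ADD MASKED WITH (FUNCTION = 'random(1,12)');
```

Users see masked values unless they hold the UNMASK permission (GRANT UNMASK TO SomeUser;).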





4.      ROW LEVEL SECURITY: SQL Server 2016 introduces Row-Level Security (RLS), a feature that enables fine-grained control over access to the rows in a table. RLS allows you to easily control which users can access which data, with complete transparency to the application. This enables us to restrict data based on the user’s identity or security context.
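RLS is built from two pieces: an inline table-valued predicate function and a security policy that binds it to a table. A sketch (the table, column, and function names are invented for illustration):

```sql
-- Predicate: a row is visible only when its SalesRep column matches the
-- current database user
CREATE FUNCTION dbo.fn_SalesRepPredicate (@SalesRep AS sysname)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS fn_result WHERE @SalesRep = USER_NAME();
GO

-- Security policy: apply the predicate as a filter on dbo.Orders
CREATE SECURITY POLICY dbo.SalesFilter
ADD FILTER PREDICATE dbo.fn_SalesRepPredicate(SalesRep) ON dbo.Orders
WITH (STATE = ON);
```

With the policy ON, SELECT, UPDATE, and DELETE statements silently skip rows that fail the predicate.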




5.      ALWAYS ENCRYPTED: Always Encrypted is a feature designed to protect sensitive data, such as credit card numbers or national identification numbers (for example, U.S. social security numbers), stored in Azure SQL Database or SQL Server databases. Always Encrypted allows clients to encrypt sensitive data inside client applications and never reveal the encryption keys to the Database Engine (SQL Database or SQL Server). As a result, Always Encrypted provides a separation between those who own the data (and can view it) and those who manage the data (but should have no access). By ensuring that on-premises database administrators, cloud database operators, or other high-privileged but unauthorized users cannot access the encrypted data, Always Encrypted enables customers to confidently store sensitive data outside of their direct control: to encrypt data at rest and in use for storage in Azure, to delegate on-premises database administration to third parties, or to reduce security clearance requirements for their own DBA staff.
Always Encrypted makes encryption transparent to applications. An Always Encrypted-enabled driver installed on the client computer achieves this by automatically encrypting and decrypting sensitive data in the client application. The driver encrypts the data in sensitive columns before passing the data to the Database Engine, and automatically rewrites queries so that the semantics to the application are preserved. Similarly, the driver transparently decrypts data, stored in encrypted database columns, contained in query results.
               To implement it, run the Always Encrypted wizard, which lets you choose between two encryption types:



·        Deterministic: Always generates the same encrypted value for a given plain text value, making it possible to use the column for equality joins, point lookups, grouping, and indexing. This method is not as secure as randomizing the values because an unauthorized user might be able to guess the encrypted values by examining data patterns, especially for columns that support a small set of possible values, such as True and False.
·        Randomized: Generates encrypted values in a less predictable manner, making it more difficult for users to infer values from the data patterns. However, this also means you cannot use the column for such operations as equality joins, point lookups, grouping, or indexing.
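Outside of the wizard, the encryption type is declared per column in the table definition. A sketch, assuming a column encryption key named CEK1 has already been set up:

```sql
CREATE TABLE dbo.Patients (
    PatientId INT IDENTITY(1,1) PRIMARY KEY,
    -- Deterministic: equality lookups and joins on SSN still work;
    -- string columns must use a BIN2 collation
    SSN CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK1,
                        ENCRYPTION_TYPE = DETERMINISTIC,
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NOT NULL,
    -- Randomized: stronger protection, but no equality operations
    Salary MONEY
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK1,
                        ENCRYPTION_TYPE = RANDOMIZED,
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL
);
```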



In order to view the original data, the client must connect with Column Encryption Setting=enabled in the connection string and must have access to the column master key.



6.      STRETCH DATABASE: SQL Server Stretch Database is a technology introduced in SQL Server 2016 that lets you select tables (especially very large tables) and keep part of the data in the Azure cloud and part of the data on premises.
·        Open the stretch Database Advisor
·        Select the database to analyze
·        Read the analysis report
Once you have an eligible table to be stretched, start using this feature.

Limitations for Stretch Database
Not all tables will be eligible to be Stretch-enabled. Certain table properties, data and column types, constraints, and indexes are not supported, such as:
·        Memory-optimized and replicated tables
·        Tables that contain FILESTREAM data, or that use Change Tracking or Change Data Capture
·        Data types such as timestamp, sql_variant, XML, and geography, or columns that are Always Encrypted
·        Check and default constraints, or foreign key constraints that reference the table
·        XML, full-text, spatial, and clustered columnstore indexes, and indexed views that reference the Stretch-enabled table
·        You cannot run UPDATE or DELETE statements, or run CREATE INDEX or ALTER INDEX operations, on a Stretch-enabled table
NOTE: As it requires an Azure subscription and I haven’t used it so far, I’ll just give the best link I found that gives you an idea about Stretch Database.
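That said, the rough shape of enabling it looks like the following. Treat this as an untested sketch: the server, credential, database, and table names are all placeholders.

```sql
-- Allow this instance to use Stretch Database
EXEC sp_configure 'remote data archive', 1;
RECONFIGURE;

-- Link the database to Azure (the credential must already exist)
ALTER DATABASE MyDb
SET REMOTE_DATA_ARCHIVE = ON
    (SERVER = N'mystretchserver.database.windows.net',
     CREDENTIAL = [MyAzureCredential]);

-- Stretch an eligible table; OUTBOUND starts migrating rows to Azure
ALTER TABLE dbo.ColdOrders
SET (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));
```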




7.      TEMPORAL TABLES: Temporal, or system-versioned, tables were introduced as a database feature in SQL Server 2016. This gives us a type of table that can report the data stored at any specified point in time, rather than just the data that is current.
The most common business uses for temporal tables are:
·        Slowly changing dimensions
·        Data auditing
·        Repairing or recovering record-level corruptions
·        Reproducing financial reports, invoices, and statements
·        Analyzing trends


To create a new temporal table, you just need to set the temporal table option to ON (for example, SYSTEM_VERSIONING = ON). When the temporal table option is enabled, SQL Server 2016 generates the “historical” table automatically, and internally maintains both parent and historical tables, one for storing the actual data and the other for the historical data. The temporal table’s SYSTEM_TIME period columns (for example SysStartTime and SysEndTime) enable the mechanism to query data for a different time slice more efficiently. The updated or deleted data moves into the “historical” table, whilst the “parent” table keeps the latest row version for updated records.



However, it’s not possible to use DROP TABLE DDL for a temporal table. First, SYSTEM_VERSIONING has to be turned OFF.
ALTER TABLE dbo.Department SET (SYSTEM_VERSIONING = OFF);
-- Create and populate a system-versioned table with hidden period columns
CREATE TABLE Employee(
       EmployeeId int PRIMARY KEY,
       FirstName varchar(20) NOT NULL,
       LastName varchar(20) NOT NULL,
       DepartmentName varchar(50) NOT NULL,
    StartDate datetime2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL,
    EndDate   datetime2 GENERATED ALWAYS AS ROW END HIDDEN   NOT NULL,
       PERIOD FOR SYSTEM_TIME (StartDate, EndDate)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory))
GO

INSERT INTO Employee (EmployeeId, FirstName, LastName, DepartmentName) VALUES
 (1, 'Chandler', 'Bing', 'Executive'),
 (2, 'Rachel', 'Green', 'Fashion'),
 (3, 'Monica', 'Geller', 'Cooking')

-- Hidden period columns are not returned with SELECT *
SELECT * FROM Employee

-- Hidden period columns can be returned explicitly
SELECT EmployeeId, LastName, StartDate, EndDate FROM Employee
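
-- Querying across time with FOR SYSTEM_TIME (a sketch; run an UPDATE
-- first so the history table has something to show)
UPDATE Employee SET DepartmentName = 'Sales' WHERE EmployeeId = 2

-- State of the table as of a given instant (UTC)
SELECT EmployeeId, DepartmentName, StartDate, EndDate
FROM Employee FOR SYSTEM_TIME AS OF '2019-01-01T00:00:00'

-- Every row version, current and historical
SELECT EmployeeId, DepartmentName, StartDate, EndDate
FROM Employee FOR SYSTEM_TIME ALL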


/* Schema changes */

-- Add a column (gets added to history table automatically)
ALTER TABLE Employee
   ADD Gender VARCHAR(1) NULL

SELECT * FROM Employee
SELECT * FROM EmployeeHistory

-- Cleanup
ALTER TABLE Employee SET (SYSTEM_VERSIONING = OFF)
DROP TABLE Employee
DROP TABLE EmployeeHistory
8.      BUILT-IN JSON SUPPORT: SQL Server 2016 has added support for JSON, a lightweight format for exchanging data between different source types, similar to how XML is used. JSON, short for JavaScript Object Notation, is based on a subset of the JavaScript programming language and is noted for being human readable and easy for computers to parse and generate. SQL Server 2016 continues to use the NVARCHAR type to store JSON data.
Sample JSON Snippet
[{"BusinessEntityID":5,"Title":"Ms.","FirstName":"Gail"},{"BusinessEntityID":6,"Title":"Mr.","FirstName":"Jossef"}]
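A quick sketch of working with that snippet in both directions: OPENJSON shreds a JSON string into rows, and FOR JSON serializes rows back out (the dbo.Person table in the last query is hypothetical).

```sql
DECLARE @json NVARCHAR(MAX) =
    N'[{"BusinessEntityID":5,"Title":"Ms.","FirstName":"Gail"},
       {"BusinessEntityID":6,"Title":"Mr.","FirstName":"Jossef"}]';

-- Shred the JSON array into a rowset
SELECT *
FROM OPENJSON(@json)
WITH (BusinessEntityID INT, Title NVARCHAR(10), FirstName NVARCHAR(50));

-- Extract a single scalar value by path
SELECT JSON_VALUE(@json, '$[0].FirstName') AS FirstName;   -- Gail

-- Going the other way: serialize query results as JSON
SELECT BusinessEntityID, Title, FirstName
FROM dbo.Person
FOR JSON PATH;
```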