Some SQL basics

1, Index

An index is a set of data pointers stored on disk and associated with a single table. The main advantage is that indexes can greatly speed up select, update, and delete statements, as the query optimizer only has to search the index rather than the entire table to find the relevant rows. There are two potential disadvantages: they slow down insert statements slightly, as an index pointer must be created for every new row that is inserted, and they increase the amount of storage required for the database. In most cases the speed gained for select, update, and delete statements far outweighs the slightly slower inserts and the additional storage requirements.

CREATE INDEX index_name
ON table_name (column_name);
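As a rough illustration of how an index changes a lookup, the sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN. The table `people` and the index `idx_people_name` are made-up names for this example, and the exact plan wording varies between SQLite versions and other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [(f"user{i}", 20 + i % 50) for i in range(1000)],
)

def query_plan(connection, sql, params=()):
    """Return the human-readable steps SQLite plans to run for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

lookup = "SELECT * FROM people WHERE name = ?"
plan_before = query_plan(conn, lookup, ("user42",))

conn.execute("CREATE INDEX idx_people_name ON people (name)")
plan_after = query_plan(conn, lookup, ("user42",))

print(plan_before)  # without the index: a scan of the whole table
print(plan_after)   # with the index: a search that names idx_people_name
```

Comparing the two plans is usually enough to confirm that a query actually benefits from a new index before paying its insert and storage cost.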

By default, when you create a table with an "Id" primary key column, your data will be stored on disk sorted by that primary key. This default sort order is called the "Clustered Index". It determines the physical order of the data, so a table can have only one clustered index.

If most of your searches on the table use some other, non-primary-key column, you might want to consider making that column the clustered index instead.

There are a few things to keep in mind when changing the default clustered index in a table:

Lookups from non-clustered indexes must first look up the pointer in the clustered index to reach the actual data records, instead of going directly to the data on disk (usually this performance hit is negligible).
Inserts will be slower because each new row must be added in exactly the right place in the clustered index. (NOTE: This does not re-order the data pages. It just inserts the record in the correct position in the page that it belongs to. Data pages are stored as a doubly-linked list, so each page is pointed to by the previous and next pages. Therefore the pages themselves never need to be reordered, only their pointers, and only when the newly inserted row causes a new data page to be created.)

Non-clustered indexes are not copies of the table but a sorted list of the columns you specify that "points" back to the data pages in the clustered index. In other words, a non-clustered index is a second list that holds pointers to the physical rows. You can have many non-clustered indexes, although each new index increases the time it takes to write new records.

It is generally faster to read from a clustered index if you want to get back all the columns. You do not have to go first to the index and then to the table.

Writing to a table with a clustered index can be slower, if there is a need to rearrange the data.

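Summary: A clustered index means picking some column A and storing all of the table's records sorted by A. When a query filters on A, the matching records can be found quickly through the clustered index, because the records are physically located in the same order as the index.

A non-clustered index on the same column A stores the values of A together with a pointer to the row in the table where each value is actually stored, whereas a clustered index stores the entire record in its leaf nodes. That is why reading through a clustered index is usually faster.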

2, SQL aggregate functions:

AVG(), MIN(), MAX(), COUNT(), SUM(), FIRST(), LAST()... (FIRST() and LAST() are not standard SQL and are only available in some systems, e.g. MS Access.)

SELECT COUNT(CustomerID) AS OrdersFromCustomerID7 FROM Orders WHERE CustomerID=7;

SELECT AVG(Price) AS PriceAverage FROM Products;
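To see what these aggregate queries return, here is a small sketch using Python's built-in sqlite3 module. The table layouts and the sample rows are invented for the example; only the two queries above come from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Minimal, made-up versions of the Orders and Products tables.
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER)")
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Price REAL)")
conn.executemany("INSERT INTO Orders (CustomerID) VALUES (?)", [(7,), (7,), (3,), (7,)])
conn.executemany("INSERT INTO Products (Price) VALUES (?)", [(10.0,), (20.0,), (30.0,)])

# COUNT: how many orders belong to customer 7.
(orders_from_7,) = conn.execute(
    "SELECT COUNT(CustomerID) AS OrdersFromCustomerID7 FROM Orders WHERE CustomerID = 7"
).fetchone()

# AVG: the mean price over all products.
(price_average,) = conn.execute(
    "SELECT AVG(Price) AS PriceAverage FROM Products"
).fetchone()

# Several aggregates can be computed in a single pass.
low, high, total = conn.execute(
    "SELECT MIN(Price), MAX(Price), SUM(Price) FROM Products"
).fetchone()

print(orders_from_7, price_average, low, high, total)
```

Each aggregate collapses many rows into one value, which is why every query here returns exactly one row.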

1. DDL – Data Definition Language
DDL is used to define the structure that holds the data, for example a table (CREATE, ALTER, DROP).

2. DML – Data Manipulation Language
DML is used to manipulate the data itself. Typical operations are INSERT, DELETE, UPDATE, and retrieving data from a table (SELECT).

3, Transactions

A transaction comprises a unit of work performed within a database management system against a database, and is treated in a coherent and reliable way independent of other transactions. It can be rolled back if the system fails.

1. Atomicity
A transaction consists of many steps. Either all the steps complete and their changes are reflected in the DB, or, if any step fails, all the steps are rolled back.

2. Consistency
The database moves from one consistent state to another if the transaction succeeds, and remains in its original state if the transaction fails.

3. Isolation
Every transaction should operate as if it is the only transaction in the system.

4. Durability
Once a transaction has completed successfully, the updated rows/records must be available to all other transactions on a permanent basis.

A database lock tells a transaction whether the data items in question are being used by other transactions. A shared lock lets other transactions read the data but not modify it, while an exclusive lock prevents other transactions from both reading and modifying it.
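Atomicity is the easiest of these properties to demonstrate. The sketch below uses Python's built-in sqlite3 module; the `accounts` table and the `transfer` helper are hypothetical names chosen for the example, and it relies on the connection acting as a context manager that commits on success and rolls back when an exception escapes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one all-or-nothing unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # undoes both updates
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds: both updates are committed
failed = transfer(conn, 1, 2, 500)  # fails: both updates are rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

The failed transfer leaves no trace: the credit to account 2 is undone along with the debit from account 1.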

4, Normalization

Normalization is the process of removing redundant data by splitting up tables in a well-defined fashion.

1. First Normal Form (1NF)
A relation is said to be in first normal form if and only if all underlying domains contain atomic values only. After 1NF, we can still have redundant data.

2. Second Normal Form (2NF)
A relation is said to be in 2NF if and only if it is in 1NF and every non-key attribute is fully dependent on the primary key (that is, on the whole key, not just part of a composite key). After 2NF, we can still have redundant data.

3. Third Normal Form (3NF)
A relation is said to be in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.

5, Primary Key & Foreign Key

A primary key is a column whose values uniquely identify every row in a table. It cannot be NULL, and its value should not be modified once assigned.

A composite primary key is a set of columns whose combined values uniquely identify every row in a table.
A foreign key is a field (or collection of fields) in one table that refers to the primary key of another table, and so identifies a row of that other table.
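The following sketch shows a primary key and a foreign key being enforced, using Python's built-in sqlite3 module. The two-table schema and the `try_insert` helper are assumptions made for the example. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; most other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE Orders (
           OrderID INTEGER PRIMARY KEY,
           CustomerID INTEGER NOT NULL REFERENCES Customers (CustomerID)
       )"""
)
conn.execute("INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste')")

def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

valid_order = try_insert("INSERT INTO Orders VALUES (100, 1)")          # customer 1 exists
orphan_order = try_insert("INSERT INTO Orders VALUES (101, 99)")        # no customer 99
duplicate_key = try_insert("INSERT INTO Customers VALUES (1, 'Copy')")  # key 1 already used
print(valid_order, orphan_order, duplicate_key)
```

The orphan order is rejected by the foreign key and the duplicate customer by the primary key, so neither bad row reaches the tables.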

6, Select/Insert/Update/Delete

INSERT INTO table_name
VALUES (value1,value2,value3,...);

INSERT INTO table_name (column1,column2,column3,...)
VALUES (value1,value2,value3,...);

SELECT * FROM Customers
WHERE Country='Mexico';

SELECT Name, COUNT(Name) FROM table_name
GROUP BY Name HAVING COUNT(Name) > 1; -- HAVING filters on the result of an aggregate function

SELECT DISTINCT City FROM Customers;

SELECT * FROM Customers
ORDER BY Country; -- or: ORDER BY Country DESC;

UPDATE Customers
SET ContactName='Alfred Schmidt', City='Hamburg'
WHERE CustomerName='Alfreds Futterkiste';

DELETE FROM Customers
WHERE CustomerName='Alfreds Futterkiste' AND ContactName='Maria Anders';

SELECT * FROM Customers
WHERE City LIKE 's%';

SELECT Shippers.ShipperName,COUNT(Orders.OrderID) AS NumberOfOrders FROM Orders
LEFT JOIN Shippers
ON Orders.ShipperID=Shippers.ShipperID
GROUP BY ShipperName;

SELECT sID, sName FROM Student WHERE sID IN (SELECT sID FROM Apply WHERE major='CS') AND sID NOT IN (SELECT sID FROM Apply WHERE major='EE');

Other keywords and operators: IN, NOT IN, ALL, ANY, EXISTS, =, <=, <, >

SELECT sID FROM Student WHERE sizeHS>any(SELECT sizeHS FROM Student);
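The subquery forms above can be tried out with Python's built-in sqlite3 module. The sample rows are invented for the example. SQLite does not implement the ANY operator, so the last query is rewritten with EXISTS, which gives the same rows as `sizeHS > ANY (...)` would on a system that supports it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sID INTEGER PRIMARY KEY, sName TEXT, sizeHS INTEGER)")
conn.execute("CREATE TABLE Apply (sID INTEGER, major TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Amy", 1000), (2, "Bob", 1500), (3, "Craig", 500)],
)
conn.executemany(
    "INSERT INTO Apply VALUES (?, ?)",
    [(1, "CS"), (1, "EE"), (2, "CS"), (3, "EE")],
)

# Students who applied to CS but not to EE.
cs_not_ee = conn.execute(
    """SELECT sID, sName FROM Student
       WHERE sID IN (SELECT sID FROM Apply WHERE major = 'CS')
         AND sID NOT IN (SELECT sID FROM Apply WHERE major = 'EE')"""
).fetchall()

# Students whose high school is larger than at least one other student's,
# i.e. everyone except those from the smallest school.
not_smallest = conn.execute(
    """SELECT sID FROM Student AS s1
       WHERE EXISTS (SELECT 1 FROM Student AS s2 WHERE s1.sizeHS > s2.sizeHS)
       ORDER BY sID"""
).fetchall()

print(cs_not_ee, not_smallest)
```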

7, Join

INNER JOIN: Returns only the rows that have a match in BOTH tables
LEFT JOIN: Returns all rows from the left table, and the matched rows from the right table (NULL where there is no match)
RIGHT JOIN: Returns all rows from the right table, and the matched rows from the left table (NULL where there is no match)
FULL JOIN: Returns all rows from both tables, matched where possible and filled with NULL where there is no match

Cross join: returns the Cartesian product of the two tables.

SELECT * FROM employee CROSS JOIN department; or SELECT * FROM employee, department;

Inner join: SELECT * FROM employee, department WHERE employee.DepartmentID = department.DepartmentID; or SELECT * FROM employee INNER JOIN department ON employee.DepartmentID = department.DepartmentID;

Outer join: SELECT * FROM employee LEFT OUTER JOIN department ON employee.DepartmentID = department.DepartmentID;

Self join: SELECT F.EmployeeID, F.LastName, S.EmployeeID, S.LastName FROM Employee F INNER JOIN Employee S ON F.Country = S.Country WHERE F.EmployeeID < S.EmployeeID;
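The difference between the join types is easiest to see on a tiny data set. The sketch below uses Python's built-in sqlite3 module with made-up rows, including one employee who has no department. RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT)")
conn.execute("CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)", [(31, "Sales"), (33, "Engineering")])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Smith", 31), ("Jones", 33), ("Williams", None)],  # Williams has no department
)

# Inner join: only employees with a matching department.
inner_rows = conn.execute(
    """SELECT e.LastName, d.DepartmentName
       FROM employee AS e INNER JOIN department AS d
         ON e.DepartmentID = d.DepartmentID
       ORDER BY e.LastName"""
).fetchall()

# Left outer join: every employee, with NULL (None) where no department matches.
left_rows = conn.execute(
    """SELECT e.LastName, d.DepartmentName
       FROM employee AS e LEFT OUTER JOIN department AS d
         ON e.DepartmentID = d.DepartmentID
       ORDER BY e.LastName"""
).fetchall()

# Cross join: every employee paired with every department (3 x 2 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employee CROSS JOIN department"
).fetchone()[0]

print(inner_rows, left_rows, cross_count)
```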

SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders -- This will display all rows from table Customers.
ON Customers.CustomerID=Orders.CustomerID
ORDER BY Customers.CustomerName;

SELECT City FROM Customers
UNION -- Use UNION ALL to keep duplicate values as well.
SELECT City FROM Suppliers
ORDER BY City;

8, Constraints: NOT NULL, PRIMARY KEY, FOREIGN KEY, UNIQUE.

SQL constraints are used to specify rules for the data in a table.

If a data action violates a constraint, the action is aborted.

Constraints can be specified when the table is created (inside the CREATE TABLE statement) or after the table is created (inside the ALTER TABLE statement).
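As a small illustration of an action being aborted by a constraint, the sketch below uses Python's built-in sqlite3 module. The `Email` column and the `rejected` helper are hypothetical additions for the example and do not appear elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Persons (
           PersonID INTEGER PRIMARY KEY,
           LastName TEXT NOT NULL,
           Email TEXT UNIQUE
       )"""
)
conn.execute("INSERT INTO Persons VALUES (1, 'Hansen', 'ola@example.com')")

def rejected(sql):
    """Return True if the statement is aborted by a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

missing_name = rejected("INSERT INTO Persons VALUES (2, NULL, 'kari@example.com')")         # NOT NULL
duplicate_email = rejected("INSERT INTO Persons VALUES (3, 'Svendson', 'ola@example.com')")  # UNIQUE
valid_row = rejected("INSERT INTO Persons VALUES (4, 'Pettersen', NULL)")                    # accepted
row_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(missing_name, duplicate_email, valid_row, row_count)
```

Only the original row and the valid insert end up in the table; the two violating inserts leave it unchanged.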

9, Table create/alter/drop

CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);

ALTER TABLE Persons
ADD DateOfBirth date;

DROP TABLE table_name;

CREATE INDEX PIndex
ON Persons (LastName);

Drop index (MySQL syntax): ALTER TABLE table_name DROP INDEX index_name;

10, View

Views are virtual tables. Unlike tables that contain data, views simply contain queries that dynamically retrieve data when used. A modification made through a view is rewritten into a modification of the underlying base tables.
create view CSaccept as
select sID, cName
from Apply
where major = 'CS' and decision='Y';
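To show that a view stores a query rather than data, the sketch below creates the CSaccept view with Python's built-in sqlite3 module and reads it before and after the base table changes. The sample rows are invented for the example. Views in SQLite are read-only, so the example only queries the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT, decision TEXT)")
conn.executemany(
    "INSERT INTO Apply VALUES (?, ?, ?, ?)",
    [(1, "Stanford", "CS", "Y"), (2, "MIT", "CS", "N"), (3, "Cornell", "EE", "Y")],
)
conn.execute(
    """CREATE VIEW CSaccept AS
       SELECT sID, cName FROM Apply
       WHERE major = 'CS' AND decision = 'Y'"""
)

before = conn.execute("SELECT sID, cName FROM CSaccept ORDER BY sID").fetchall()

# Change the base table; the view reflects it on the next read.
conn.execute("UPDATE Apply SET decision = 'Y' WHERE sID = 2")
after = conn.execute("SELECT sID, cName FROM CSaccept ORDER BY sID").fetchall()

print(before, after)
```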

Fix:

FIX (Financial Information eXchange) is the global protocol used for electronic trading of different asset classes, e.g. equities, fixed income, FX (foreign exchange), derivatives, futures, and options. Knowing it is essential to understanding electronic trading and FIX messages. FIX is widely used by both the buy side (institutions) and the sell side (brokers/dealers) of the financial markets.

Since different exchanges use their own proprietary protocols (e.g. HKSE uses OG, TSE uses the Arrowhead protocol, and NASDAQ uses the OUCH protocol), FIX engines are used less on the exchange side than on the client side, where clients and firms prefer the FIX protocol for sending orders and receiving executions.

FIX messages are composed of a header, a body, and a trailer.

Up to FIX.4.4, the header contained three fields: tags 8 (BeginString), 9 (BodyLength), and 35 (MsgType). The body of the message is entirely dependent on the message type defined in the header (tag 35, MsgType). The last field of the message is tag 10, the FIX message checksum. It is always expressed as a three-digit number (e.g. 10=002). The FIX checksum algorithm sums the values of all the bytes of the message up to, but not including, the checksum field (which is last) and returns the result modulo 256.
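The checksum rule can be sketched in a few lines of Python. The helper names `fix_checksum` and `build_message` are made up for this example. The builder emits only the three header fields named above plus the trailer, whereas a real session message also carries fields such as SenderCompID, TargetCompID, MsgSeqNum, and SendingTime.

```python
SOH = b"\x01"  # the byte that terminates every FIX field

def fix_checksum(data: bytes) -> str:
    """Sum all bytes, take the result modulo 256, and format it as three digits."""
    return f"{sum(data) % 256:03d}"

def build_message(begin_string: str, msg_type: str, body_fields=()) -> bytes:
    """Assemble a minimal FIX message: tags 8, 9, 35, optional body, then tag 10."""
    body = f"35={msg_type}".encode("ascii") + SOH
    for tag, value in body_fields:
        body += f"{tag}={value}".encode("ascii") + SOH
    # BodyLength (tag 9) counts the bytes after its own field up to the checksum field.
    head = f"8={begin_string}".encode("ascii") + SOH + f"9={len(body)}".encode("ascii") + SOH
    without_checksum = head + body
    return without_checksum + b"10=" + fix_checksum(without_checksum).encode("ascii") + SOH

heartbeat = build_message("FIX.4.2", "0")
# Show the SOH delimiter as "|" so the message is readable.
print(heartbeat.replace(SOH, b"|").decode("ascii"))  # 8=FIX.4.2|9=5|35=0|10=161|
```

The receiver repeats the same sum over the bytes before tag 10 and rejects the message if the result differs from the transmitted value.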

FIX Engine:

1) Establishes FIX connectivity by sending session-level messages.

2) Manages the FIX session.

3) Recovers if the FIX session is lost.

4) Creates, sends, and parses FIX messages for electronic trading.

5) Handles replay.

6) Supports different FIX protocol versions and tags.

FIX session parameters: SocketConnectHost, SocketConnectPort, DataDictionary, SenderCompID, TargetCompID, StartTime, EndTime, HeartBtInt, BeginString.

ClOrdID (tag 11) and OrderID (tag 37)?
ClOrdID is a unique id assigned by the buy side, while OrderID is assigned by the sell side. OrderID normally remains the same across a message chain (e.g. on cancel and modify requests), while ClOrdID changes with each cancel or modification.

TransactTime (tag 60) and SendingTime (tag 52)?
TransactTime: Time of execution/order creation, expressed in UTC (Universal Time Coordinated, also known as 'GMT').
SendingTime: Time of message transmission, always expressed in UTC (Universal Time Coordinated, also known as 'GMT').

MsgSeqNum (tag 34)?
All FIX messages are identified by a unique sequence number. Sequence numbers are initialized at the start of each FIX session, starting at 1, and increment throughout the session. Monitoring sequence numbers enables the parties to identify and react to missed messages and to gracefully synchronize applications when reconnecting during a FIX session.
Each session establishes independent incoming and outgoing sequence series: participants maintain one series to assign to outgoing messages and a separate series to monitor for sequence gaps on incoming messages.
The incoming sequence number is the number a FIX engine expects from its counterparty, and the outgoing sequence number is the number the FIX engine sends to its counterparty.

The NewOrderSingle message is denoted by MsgType=D and is used to place an order. OrderCancelReplaceRequest is the modification request, denoted by MsgType=G in the FIX protocol, and is used to modify an order, e.g. to change its quantity or price.
OrderCancelRequest is the third in this category, denoted by MsgType=F in the FIX protocol, and is used to cancel an order placed into the market. OrderCancelReject is denoted by MsgType=9 and ExecutionReport by MsgType=8.

What are the most common issues encountered when two FIX engines communicate?
When clients connect to a broker via the FIX protocol, their FIX engines connect to each other. Many issues can occur during setup and further communication; below are some of the most common ones:
Issues related to network connectivity
Issues related to firewall rules
Issues related to an incorrect host name or port when connecting
Incorrect SenderCompID (tag 49) and TargetCompID (tag 56)
Sequence number mismatch
Issues related to FIX version mismatch

What happens if the client connects with a sequence number higher than expected?
Suppose the client FIX engine connects to the broker FIX engine with a sequence number higher than expected (e.g. the broker is expecting sequence number 10 and the client sends 15). As per the FIX protocol, the broker accepts the connection and issues a Resend Request (MsgType=2) asking the client to resend the missing messages (10 through 14). The client can then either replay those messages or issue a Gap Fill message (MsgType=4 as per the FIX protocol) in case replaying them doesn't make sense (e.g. admin messages such as Heartbeat).
What happens if the client connects with a sequence number lower than expected?
If the client FIX engine connects to the broker FIX engine with a sequence number lower than expected, then the broker FIX engine will drop the connection. As per the FIX protocol, the client may then retry, increasing its sequence number until the broker accepts its connection.
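The two cases above amount to a simple comparison on the receiving side. The sketch below is a simplified model, not the behaviour of any particular FIX engine: the function name and the returned action labels are invented for the example, and it ignores details such as PossDupFlag (tag 43) on resent messages and Sequence Reset handling.

```python
def on_incoming_seq_num(expected: int, received: int):
    """Decide how a receiver reacts to the MsgSeqNum (tag 34) of an incoming message.

    Returns a pair (action, detail), where action is one of
    "process", "resend_request", or "disconnect".
    """
    if received == expected:
        # In sequence: handle the message; the next expected number is expected + 1.
        return ("process", None)
    if received > expected:
        # Gap detected: ask the sender to resend everything that was missed,
        # from the expected number up to the one just before the received message.
        return ("resend_request", (expected, received - 1))
    # Lower than expected: treated as a serious error, so the session is dropped.
    return ("disconnect", None)

print(on_incoming_seq_num(expected=10, received=10))  # ('process', None)
print(on_incoming_seq_num(expected=10, received=15))  # ('resend_request', (10, 14))
print(on_incoming_seq_num(expected=10, received=7))   # ('disconnect', None)
```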

Which of the following orders would be automatically canceled if not executed immediately?
Fill or Kill (FOK) and Immediate or Cancel (IOC) orders are order types that are either executed immediately or cancelled by the exchange. TimeInForce (tag 59) in the FIX protocol is used to mark an order as FOK or IOC.

What is the difference between FOK order and IOC Order?
The main difference between FOK and IOC orders is that an FOK order demands full execution, i.e. the whole quantity has to be filled, while an IOC order also accepts partial fills.
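The difference can be expressed as a small fill rule. The sketch below is a toy model with an invented function name that looks only at the quantity available right now; a real matching engine also considers price levels and order priority. In FIX, TimeInForce (tag 59) is 3 for IOC and 4 for FOK.

```python
def immediate_fill(order_qty: int, available_qty: int, time_in_force: str):
    """Return (filled_qty, cancelled_qty) for an order that may not rest on the book."""
    if time_in_force == "FOK":
        # Fill or Kill: all of the quantity or none of it.
        filled = order_qty if available_qty >= order_qty else 0
    elif time_in_force == "IOC":
        # Immediate or Cancel: take whatever is available, cancel the remainder.
        filled = min(order_qty, available_qty)
    else:
        raise ValueError(f"unsupported time in force: {time_in_force}")
    return filled, order_qty - filled

print(immediate_fill(100, 60, "FOK"))   # (0, 100): not fully fillable, so killed
print(immediate_fill(100, 60, "IOC"))   # (60, 40): partial fill, rest cancelled
print(immediate_fill(100, 150, "FOK"))  # (100, 0): fully filled
```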
Initial connection
Once the network connection is established, you are ready to connect to the client. At the TCP layer, a socket connection is first established between the client's IP and your IP, with your FIX engine listening on the specified port. The client then sends a Logon request (MsgType=A) with sequence number 1 (at the start of the day) and with the SenderCompID and TargetCompID agreed upon. When your FIX engine receives the Logon request, it validates the content and the authenticity of the client and, if all is OK, replies with a Logon message of its own. The FIX session is now established and orders can be sent over this connection.