These notes are by no means comprehensive or complete
Nodes on the internet are identified by IP addresses. Data is passed from one router to another, moving each packet closer to its destination, in the hope that it will ultimately be delivered.
Each router must regularly be supplied with up-to-date routing tables. Individual IP addresses are grouped into prefixes. Autonomous Systems (AS) own these prefixes. Routing tables between ASes are maintained using Border Gateway Protocol (BGP).
An Autonomous System can best be defined as a collection of IP routing prefixes, that are under the control of 1 or more network operators, on behalf of a single entity or domain, that present a single,common and clearly defined routing policy to the Internet.
Peering is when two ASes allow free-flow of traffic between them and their downstream customers. Peering relations are free of charge. Peers cannot see each other’s upstreams.
Transit is when an AS (provider) agrees to:
A transit fee is charged by the provider to the customer.
The transit provider gives the customer a list of routes that belong to other ASes/ISPs that the provider is connected to. The transit client can send/receive traffic from those routes through the provider.
The transit provider also advertises its customers’ routes to its connections, so they know that traffic to those routes needs to go via the transit provider.
The transit provider may themselves use other transit networks too.
A transit-free network is a network based only on peering. These are generally the top-level ASes. They provide transit to smaller ASes, like ISPs.
IMPORTANT
A TRANSIT PROVIDER WILL NOT ANNOUNCE A PEER ROUTE TO OTHER PEERS, OR TO NETWORKS IT BUYS TRANSIT FROM
In the above represented network,
BGP was designed without security - it is based on trust. If an AS says it controls certain IPs, all its peers will believe it, and route app traffic to those IPs, to that AS. Security extensions and third-party validations exist, but they’re not widely used.
An AS may announce an IP it doesn’t own, or claim to have a shorter path to it than is already available - even if that “shortest” path doesn’t actually exist.
Generally, ISPs filter BGP traffic. They allow BGP advertisements from their downstream networks to contain only valid IP space (i.e, IP addresses that the downstream network is known to possess). However, hackings have occurred because this isn’t always true.
If an AS is hacked, this can be exploited. However, this will be quickly found out and reversed.
Open Systems Interconnection.
Widely referenced in academic literature, not often used exactly in industry. For example, the TCP/IP protocol doesn’t fit neatly on top of the OSI model.
From lowest to highest, the 7 layers are:
Physical Layer
Deals in bits. Handles transmission and reception of raw bit streams over a physical medium.
Data Link
Deals in frames. Transmission of data frames b/w 2 nodes. Handles frame sync, errors, QoS, physical addressing, etc.
Network Layer
Deals in packets. Handles structuring and managing a multi-node network, including addressing, routing,etc.
Transport Layer
Deals in segments/datagrams.
Handles reliable transmission of data segments b/w points on a network. Handles segmentation, acknowledgment, multiplexing, handshakes, etc.
Session Layer
Manages communication sessions, i.e, continuous exchange of information in the form of multiple back and forth transmissions btw 2 nodes. Creates the setup and controls the connection. Performs teardown. DNS is often put in this layer.
Logon, name lookup, log off occur here.
Authentication in FTP is built into session layer.
Presentation Layer
Translates data b/w a network and an application. Includes character encoding, data compression and encryption/decryption.
AKA Syntax layer.
TLS/SSL is generally considered to be in this layer.
Application Layer High level protocols, such as HTTP or FTP. Generally include file sharing, message handling, DB access, etc.
The Internet Protocol Suite (TCP/IP) is different than the OSI model. Many layers are similar, but their differences start after the network layer.
TCP/IP layers don’t fit neatly into OSI layers.
Layers in TCP/IP suite are:
TCP/IP doesn’t differentiate between session,presentation and application layer.
This is why it doesn’t fit correctly into OSI.
For example, TLS is built on top of transport layer such as TCP. But, applications generally use it as if it was a transport layer.
Some mentions of TCP/IP don’t even contain a physical layer. The Data Link and Physical layers are merged into a common “Link” layer. It is unspecified as if the Link layer does the job of the physical layer, or if TCP/IP assumes that physical hardware already exists underneath.
Nodes organize themselves into an overlay network that is used to spread information. Network routes aren’t involved in group membership.
If event a happens before event b, it is denoted by a → b
If a and b occur in the same process, they occur in the order they were observed.
If
T(a) → T(b)
then a → b
If a is the event of sending a message, and b is the event of the message being received (by another process),then a → b.
If a and b occur in 2 processes that never exchange any messages, then
a → b
and
b → a
are both false. This basically means that nothing can be said, and nothing needs to be said, about when the events happened, or which one happened first.
Each process keeps an internal clock, that is incremented between 2 consecutive events.
Let’s say P sends a message that is received by the receiving process Q.
The message must also include the time when it was sent, i.e, T(a).
When the message is received, Q will set its internal clock to:
max(T(a)+d,current_clock) Generally, d is set to 1.
Totally ordered multicasting is when events must occur in the same order on all nodes.
For example, consider the following scenario.
A bank has 2 branches, each with their own copy of the database. One branch is in Mumbai, another in Delhi.
A particular customer has 1000 in his account.
Two events occur (nearly) simultaneously in real time:
In Mumbai, 1 → 2.
The customer’s final bank balance = 1110
In Delhi, 2 → 1.
The customer’s final bank balance = 1111
This is an inconsistency. The bank will be okay with either amount in the customer’s account, but the amount on all nodes should match with each other.
We solve this using totally ordered multicasting. The order of events must be the same on all nodes, even if the order is different than the real-world order of events. It should just be the same.
This ensures that all nodes have the same copy of the queue.
Replicas are created for servers, to reduce latency and to provide resilience. They need to managed efficiently and consistency needs to be maintained across all replicas.
What data should we propagate?
Propagate only notification of an update. Replica servers will request the actual data from the main server whenever they need it.
Transfer data from one copy to another.
Propagate the update operation.
Server Based.
Updates are propagated to replicas without request.
Server has to keep a list of client replicas as well as their caches.
Client Based.
Client has to explicitly request data.
This approach is based on leases.
Expiration Times for the leases can be based on:
Age - If an object hasn’t changed in a long time, it probably won’t change in the future. We can provide a longer lease for such an object.
Renewal Frequency Based - If a client requests an object more frequently, the expiration time for that client, for that object, will be higher.
State Based - If a server has high load, it will grant leases with shorter expiry time.
Replicas can be of 3 types in this protocol.
Server P picks another server Q at random. They exchange info using one of the following approaches:
Pull-based approaches are generally better than push-based. This is because a susceptible node may not find any node that is willing to push to it. With pull-based approaches, it can pull the new info itself.
Combination of pull and push has been found to work the best in practice.
A mix of anti-entropy and gossipping is the best
Deletion of data is done through the use of death certificates. An item that has been deleted is given a death certificate, and these certificates are propagated the same way as normal data would be.
The accumulation of death certificates becomes an issue. If a lot of items are deleted everyday, then a lot of storage space will be taken by just the death certificates.
Hence,old death certificates are removed using expiration dates.This always runs the risk that there was a node that didn’t receive the death certificate.
A special server can be set up that never removes any death certificate. IF a deleted item is seen again (after its death certificate has expired), this special server will see it as well. This special server will circulate the death certificate once again.
Consistency Protocols have the following classification:
Primary Based
Remote Write
Local Write
Single Copy
Multiple Copies
Replicated Write
Active Replication
Quorum Based
Each data item has a primary node, which is the node responsible for it.
The status of which node is primary for a particular data item x can change.
Only the primary has the data item x. Processes who need x need to request it from the primary (for read access). If a process wants to update x:
Multiple copies exist, but one is primary.
If a process wants to update x:
In this, writes can be carried out anywhere, not just at the primary.
A special process carries out updates at each replica.
A sequencer can be used that assigns a unique ID to each update. Each update is then propagated with this unique ID.
Clients must request and acquire permission from multiple replicas before a read/write operation.
For e.g, get the majority vote before reading/writing.
All processes that vote yes will perform the update and have a newer version of the data.
When reading, majority must say yes and they must all have the same version number of the data. This ensures that the client doesn’t request the data from a replica that has an older version of the data.
Let the number of votes required to read be Nr , and the number of votes to write be Nw. Let the total number of nodes be N.
Gifford’s method states that the value of Nr and Nw should follow the following two rules.
The first rule prevents read-write conflicts.
The second rule prevents write-write conflicts.
A special case of Gifford’s method is Read One Write All
In this, Nr = 1 and Nw = N
By ensuring that updates are written to all nodes, we can read any one of them and be sure that we have the latest version of the data.
Fault Tolerance has the following criteria.
Availability - It measures the degree to which the system is up and ready for use at any instant.
Reliability - It measures the degree to which the system is ready to use for longer periods of time.
Safety - Nothing catastrophic should happen even if the system fails.
Maintainability - The degree to which the system can be brought back up in case it fails, and the underlying issue can be fixed.
Availability | Reliability |
---|---|
Measures instant readiness of system | Measures readiness of system across long periods of time |
Measures the ability of a system to do its job if needed | Measures the ability of a system to perform its function for some interval, without failure. |
Measures the (average) amount of time the service was down | Measures the frequency/probability of failure |
A system that goes down one millisecond every hour has high availability but low reliability | A system that works perfectly usually, but crashes for 2 weeks every August is highly reliable but has only 96% availability |
Availability is 1- (1/3600000), Reliability(measured as failure rate) is once every hour | Availability is 50/52, Reliability (measured as failure rate) is once every year |
Server halts, but worked correctly until it halts. No response is seen from the server after it halts.
AKA Fail-Silent
Server halts, but worked correctly until it halts. Other servers are able to detect it has halted.
Server produces junk output, and other processes are able to recognize it as junk.
Server fails to respond to incoming messages.
Server’s response is outside the specified time interval.
Incorrect response.
Value Failure - Wrong value
State Transition Failure
Server goes through the wrong flow of control.
Server sends arbitrary data at arbitrary times.
AKA conventional cryptography/shared-key systems/secret-key systems.
Sender and receiver share the same key, which is used both for encryption and decryption.
The shared key must be kept private. Anyone in possession of the key can read encrypted messages.
The notation Ka, b is used to denote a secret-key shared by A and B.
AKA Public-key cryptography.
The keys for encryption and decryption are different, but form a unique pair. The key for decryption can only decrypt the data encrypted with its pair key.
Key for encryption - KE.
Key for decryption - KD.
One of the keys is made public, and the other one kept private.
The notation KA+ is used to denote a public key belonging to A, and KA− denotes a private key belonging to A.
If Bob wants to send a message to Alice, he should encrypt it using Alice’s public key. Since Alice is the only person who possessed the corresponding private key, only she can decrypt the message.
A hash function H takes a message m of arbitrary length, and produces a bit-string h having a fixed length.
h = H(m)
By the pigeonhole principle, many inputs can result in the same output.
A good hash function should have the following properties.
i.e, the hash function should be REPEATABLE and IRREVERSIBLE (ONE-WAY).
An asymmetric encryption algorithm named after its inventors - Rivest, Shamir and Adleman.
Based on the fact that prime factorization of very large numbers is a difficult and time-consuming process.
Take 2 very large prime numbers - p and q.
Calculate n = p * q z = (p−1) * (q−1)
Choose d such that d is relatively prime to z.
Compute e such that
(e*d)%z = 1
Now, the number d can be used for decryption, and e for encryption.
One of these is kept private, and the other is made public.
Let the message to be sent be m. Here, m is interpreted simply as a binary number.
0 ≤ mi ≤ n
Each mi is also interpreted as a binary number.
The sender calculates ci = (mie)%n All such ci are calculated and concatenated into a single variable c.
c is sent to the receiver.
The receiver calculates yi = (cid)%n
Based on the properties of modulus, and the way we have chosen e and d, we can easily see that yi = mi∀i.
This way, the receiver is able to reconstruct the message.
Securely sending messages has the following problems to solve.
Confidentiality
No one else other than the intended recipient should be able to read the message.
Integrity
The recipient should have a way to be sure that the contents of the message weren’t tampered.
Authentication
Both parties should have a way to be confident that they are sending messages to the right person.
Alice is denoted by A and Bob by B. Their shared key is denoted by KA, B.
Bob has now verified Alice’s identity. But Alice hasn’t verified Bob’s.
Suppose we try to optimize the above approach.
Alice has to eventually send her challenge to Bob anyway, so she can just send her challenge when she’s sending her identity (as part of step 1).
Similarly, Bob can return the response to Alice’s challenge, and his own challenge, in a single message.
The above protocol can easily be defeated using a reflection attack.
A reflection attack is a way to attack a challenge response system which uses the same protocol in both directions.
We basically try to trick the target into answering its own challenge.
The attacker initiates a connection to a target.
The target attempts to authenticate the attacker by sending it a challenge.
The attacker opens another connection to the target, and sends the target this challenge as its own.
The target responds to that challenge.
The attacker sends that response back to the target (“reflects” it) on the first connection.
Chuck wants to pretend to be Alice, and talk to Bob. Chuck does not possess the key KA, B. So, Chuck will trick Bob into encrypting his own challenge and giving it to Chuck.
The same protocol was being used in both directions. It is always better to use a different challenge for the initiator and the responder.
A basic example would be to make it so that challenges by the initiator (Alice/Chuck) are always odd numbers, and by the responder (Bob) are always even numbers.
(However even this approach would be susceptible to a MITM (Man in the middle) attack)
Bob gave away valuable information in the form of the response KA, B(RC) and KA, B(RB), without knowing who he was giving it to.
This wasn’t violated in the original protocol, where Alice had to prove her identity first.
This system assumes there is some way to verify everyone’s public keys.
i.e, if Alice wants to check whether a particular message was sent by Bob,
This assumes that Alice has a verified way, of getting Bob’s public key, and being absolutely sure that the public key she has indeed belongs to Bob.
Generally, this involves the use of a trusted source, such as a KDC (Key Distribution Center), or a CA (Certificate Authority).
Alice wants to set up a secure channel with Bob. Both possess each other’s public key.
Alice sends a challenge RA to Bob, encrypted with his public key KB+. She knows that only Bob will be able to decrypt this message.
The message takes the form KB+(A,RA).
When Bob receives the message, he will do the following things:
The final message takes the form KA+(RA,RB,KA, B)
Alice encrypts Bob’s challenge (RB) with the session key KA, B and sends it back to Bob. This lets Bob know that the person on the other end is in fact Alice, since only Alice could have decrypted the previous message and extracted RB and KA, B from it.
Alice and Bob communicate further using KA, B for this session. Once the session ends, KA, B is destroyed.
Confidentiality and Integrity needs to be maintained in secure channels.
Digital Signatures are used for this. The document is signed using the sender’s public key, which uniquely ties the sender to the message.
A solution for the second problem is a message digest.
It’s a fixed length string h that’s computed from a message m of arbitrary length, using a hash function H.
If m is changed to m′, then it’s hash H(m′) will not be the same as before (H(m)). Thus, modifications will easily be detected.
Instead of signing m, Alice signs H(m), which becomes the signature.
The message sent to Bob is now KB+(m,KA−(H(m))), where KA−(H(m)) is the signature.
On Bob’s end, Bob will hash the entire message himself, decrypt the signature, and compare the hashes. If they match, all is good.
This is a method for 2 parties to exchange keys without the use of a third party.
This way both Alice and Bob get the same session key, and no one listening from outside will be able to recreate it. This is based on the same principle as RSA.