Scalable cloud

Lessons learned from papers

Exam Prep SSD

Lecturer seems sound and quite good

Overview of general distributed scalable systems
- Search engines (crawl, index and search)
- Social Networking (response time, large amount of data)
- Cloud Computing (availability and access to scalable resources)
- CDNs (Scalable web hosting, file distribution media streaming)
Course topics: design, data centres and cloud computing, scalable storage and querying, and scalable compute
Papers for scalable storage and querying:
- "Bigtable: A Distributed Storage System for Structured Data", Seventh Symposium on Operating System Design and Implementation (OSDI), Seattle, WA, November 2006
- "Dynamo: Amazon's Highly Available Key-Value Store", ACM Symposium on Operating Systems Principles (SOSP), Stevenson, WA, October 2007
- "Spanner: Google's Globally-Distributed Database", Tenth Symposium on Operating System Design and Implementation (OSDI), Hollywood, CA, October 2012
Papers for scalable compute:
- "MapReduce: Simplified Data Processing on Large Clusters", Sixth Symposium on Operating System Design and Implementation (OSDI), San Francisco, CA, December 2004
- "Resilient Distributed Datasets", 9th USENIX conference on Networked Systems Design and Implementation (NSDI), San Jose, CA, April 2012
Method for reading papers:
1. Skim the paper and get the gist
2. Come back for a deep read
3. Look at sample questions and find answers in the paper
Heinis will deal with scalable data
Prepare and work on research papers in lectures and seminars, as one of the courseworks is answering questions on a paper
The exam will also have a paper-based question
Resources:
- “Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems”, Martin Kleppmann, O'Reilly Media, September 2014:
  - Focuses more on the data management side
  - Recommended
- “The Art of Scalability: Scalable Web Architecture, Processes and Organizations for the Modern Enterprise”, Martin L. Abbott, Michael T. Fisher, Addison Wesley, 1st Edition, December 2009:
  - A little more high-level
  - A little outdated
Blogs:
- http://highscalability.com/
- http://www.allthingsdistributed.com/ (Werner Vogels’ blog)
- http://perspectives.mvdirona.com/ (James Hamilton’s blog)
Spanner is the hardest paper covered

Scalable Distributed Systems

Mainframe:
1. Single point of failure
2. Does not scale incrementally
3. Slow if used as a CDN
Data Centres:
1. Scale out (horizontally) by adding more machines
Types of Scalable Systems:
1. Online and user-facing (latency of < 100 ms)
2. Batch processing systems (> 1 hr)
  - Hadoop, Spark
  - Offline data processing
3. Nearline systems (< 1 sec)
  - Dynamic content presented to users
  - CDN-ed content
  - Prediction, recommendations, etc.
Design principles:
- Stateless services
- Caching
- Partition/aggregation pattern
- Weaker consistency
- Efficient failure recovery
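To make the partition/aggregation principle concrete, here is a minimal sketch, not taken from the lectures. It assumes the data is already split into partitions, fans a query out to every partition in parallel, and sums the partial results. The names PARTITIONS, search_partition and partition_aggregate are made up for illustration, and threads stand in for what would be separate worker machines in a real data centre.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: a tiny document collection split across three partitions.
PARTITIONS = [
    ["scalable cloud storage", "cloud compute"],
    ["batch processing with spark", "cloud cdn"],
    ["nearline recommendations"],
]


def search_partition(docs, term):
    """Worker step: count the documents in one partition that contain the term."""
    return sum(1 for doc in docs if term in doc.split())


def partition_aggregate(partitions, term):
    """Fan the query out to every partition in parallel, then sum the partial counts."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(lambda docs: search_partition(docs, term), partitions))
    return sum(partials)


if __name__ == "__main__":
    print(partition_aggregate(PARTITIONS, "cloud"))
```

Each worker only touches its own partition and keeps no state between queries, so a failed worker can simply be re-run; the aggregator is the only place where partial results meet.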