# Coding Interview University
I originally created this as a short to-do list of study topics for becoming a software engineer, but it grew to the large list you see today. After going through this study plan, I got hired as a Software Development Engineer at Amazon! You probably won't have to study as much as I did. Anyway, everything you need is here.
I studied about 8-12 hours a day, for several months. This is my story: Why I studied full-time for 8 months for a Google interview
The items listed here will prepare you well for a technical interview at just about any software company, including the giants: Amazon, Facebook, Google, and Microsoft.
Best of luck to you!
Translations:
Translations in progress:
- हिन्दी
- עברית
- Bahasa Indonesia
- Arabic
- Turkish
- French
- Ukrainian
- Korean(한국어)
- Telugu
- Urdu
- Thai
- Greek
- Italian
- Malayalam
# What is it?
This is my multi-month study plan for going from web developer (self-taught, no CS degree) to software engineer for a large company.
This is meant for new software engineers or those switching from software/web development to software engineering (where computer science knowledge is required). If you have many years of experience and are claiming many years of software engineering experience, expect a harder interview.
If you have many years of software/web development experience, note that large software companies like Google, Amazon, Facebook and Microsoft view software engineering as different from software/web development, and they require computer science knowledge.
If you want to be a reliability engineer or operations engineer, study more from the optional list (networking, security).
# Table of Contents
- What is it?
- Why use it?
- How to use it
- Don't feel you aren't smart enough
- About Video Resources
- Interview Process & General Interview Prep
- Pick One Language for the Interview
- Book List
- Before you Get Started
- What you Won't See Covered
- Prerequisite Knowledge
- The Daily Plan
- Algorithmic complexity / Big-O / Asymptotic analysis
- Data Structures
- More Knowledge
- Trees
- Trees - Notes & Background
- Binary search trees: BSTs
- Heap / Priority Queue / Binary Heap
- balanced search trees (general concept, not details)
- traversals: preorder, inorder, postorder, BFS, DFS
- Sorting
- selection
- insertion
- heapsort
- quicksort
- merge sort
- Graphs
- directed
- undirected
- adjacency matrix
- adjacency list
- traversals: BFS, DFS
- Even More Knowledge
- System Design, Scalability, Data Handling (if you have 4+ years experience)
- Final Review
- Coding Question Practice
- Coding exercises/challenges
- Once you're closer to the interview
- Your Resume
- Be thinking of for when the interview comes
- Have questions for the interviewer
- Once You've Got The Job
---------------- Everything below this point is optional ----------------
# Additional Resources
- Additional Books
- Additional Learning
- Compilers
- Emacs and vi(m)
- Unix command line tools
- Information theory
- Parity & Hamming Code
- Entropy
- Cryptography
- Compression
- Computer Security
- Garbage collection
- Parallel Programming
- Messaging, Serialization, and Queueing Systems
- A*
- Fast Fourier Transform
- Bloom Filter
- HyperLogLog
- Locality-Sensitive Hashing
- van Emde Boas Trees
- Augmented Data Structures
- Balanced search trees
- AVL trees
- Splay trees
- Red/black trees
- 2-3 search trees
- 2-3-4 Trees (aka 2-4 trees)
- N-ary (K-ary, M-ary) trees
- B-Trees
- k-D Trees
- Skip lists
- Network Flows
- Disjoint Sets & Union Find
- Math for Fast Processing
- Treap
- Linear Programming
- Geometry, Convex hull
- Discrete math
- Machine Learning
- Additional Detail on Some Subjects
- Video Series
- Computer Science Courses
- Papers
# Why use it?
When I started this project, I didn't know a stack from a heap, didn't know Big-O anything, anything about trees, or how to traverse a graph. If I had to code a sorting algorithm, I can tell ya it wouldn't have been very good. Every data structure I've ever used was built into the language, and I didn't know how they worked under the hood at all. I've never had to manage memory unless a process I was running would give an "out of memory" error, and then I'd have to find a workaround. I've used a few multidimensional arrays in my life and thousands of associative arrays, but I've never created data structures from scratch.
It's a long plan. It may take you months. If you are familiar with a lot of this already it will take you a lot less time.
# How to use it
Everything below is an outline, and you should tackle the items in order from top to bottom.
I'm using GitHub's special markdown flavor, including task lists to check progress.
Create a new branch so you can check items like this, just put an x in the brackets: [x]
Fork the repo, create a branch, and follow the commands below:
Fork the GitHub repo https://github.com/jwasham/coding-interview-university by clicking on the Fork button
Clone to your local repo
git clone git@github.com:<your_github_username>/coding-interview-university.git
git checkout -b progress
git remote add jwasham https://github.com/jwasham/coding-interview-university
git fetch --all
Mark all boxes with an x after you complete your changes
git add .
git commit -m "Marked x"
git rebase jwasham/master
git push --set-upstream origin progress
git push --force
More about GitHub-flavored Markdown
# Don't feel you aren't smart enough
- Successful software engineers are smart, but many have an insecurity that they aren't smart enough.
- The myth of the Genius Programmer
- It's Dangerous to Go Alone: Battling the Invisible Monsters in Tech
# About Video Resources
Some videos are available only by enrolling in a Coursera or edX class. These are called MOOCs. Sometimes a class is not in session, so you have no access and may have to wait a couple of months.
I'd appreciate your help adding free and always-available public sources, such as YouTube videos, to accompany the online course videos.
I like using university lectures.
# Interview Process & General Interview Prep
- [ ] ABC: Always Be Coding
- [ ] Whiteboarding
- [ ] Demystifying Tech Recruiting
- [ ] How to Get a Job at the Big 4:
- [ ] Cracking The Coding Interview Set 1:
- [ ] Cracking the Facebook Coding Interview:
- [ ] Prep Course:
- [ ] Software Engineer Interview Unleashed (paid course):
- Learn how to make yourself ready for software engineer interviews from a former Google interviewer.
- [ ] Python for Data Structures, Algorithms, and Interviews (paid course):
- A Python centric interview prep course which covers data structures, algorithms, mock interviews and much more.
- [ ] Intro to Data Structures and Algorithms using Python (Udacity free course):
- A free Python centric data structures and algorithms course.
- [ ] Data Structures and Algorithms Nanodegree! (Udacity paid Nanodegree):
- Get hands-on practice with over 100 data structures and algorithm exercises and guidance from a dedicated mentor to help prepare you for interviews and on-the-job scenarios.
- [ ] Grokking the Behavioral Interview (Educative free course):
- Many times, it’s not your technical competency that holds you back from landing your dream job, it’s how you perform on the behavioral interview.
# Pick One Language for the Interview
You can use a language you are comfortable in to do the coding part of the interview, but for large companies, these are solid choices:
- C++
- Java
- Python
You could also use these, but read around first. There may be caveats:
- JavaScript
- Ruby
Here is an article I wrote about choosing a language for the interview: Pick One Language for the Coding Interview.
You need to be very comfortable in the language and be knowledgeable.
Read more about choices:
- http://www.byte-by-byte.com/choose-the-right-language-for-your-coding-interview/
- http://blog.codingforinterviews.com/best-programming-language-jobs/
You'll see some C, C++, and Python learning included below, because I'm learning. There are a few books involved, see the bottom.
# Book List
This is a shorter list than what I used. This is abbreviated to save you time.
# Interview Prep
- [ ] Programming Interviews Exposed: Coding Your Way Through the Interview, 4th Edition
- answers in C++ and Java
- this is a good warm-up for Cracking the Coding Interview
- not too difficult, most problems may be easier than what you'll see in an interview (from what I've read)
- [ ] Cracking the Coding Interview, 6th Edition
- answers in Java
# If you have tons of extra time:
Choose one:
- [ ] Elements of Programming Interviews (C++ version)
- [ ] Elements of Programming Interviews in Python
- [ ] Elements of Programming Interviews (Java version)
# Language Specific
You need to choose a language for the interview (see above).
Here are my recommendations by language. I don't have resources for all languages. I welcome additions.
If you read through one of these, you should have all the data structures and algorithms knowledge you'll need to start doing coding problems. You can skip all the video lectures in this project, unless you'd like a review.
Additional language-specific resources here.
# C++
I haven't read these two, but they are highly rated and written by Sedgewick. He's awesome.
- [ ] Algorithms in C++, Parts 1-4: Fundamentals, Data Structure, Sorting, Searching
- [ ] Algorithms in C++ Part 5: Graph Algorithms
- [ ] Open Data Structures in C++
- Rich and detailed collection of Data Structures and Algorithms
- Great for first-timers
If you have a better recommendation for C++, please let me know. Looking for a comprehensive resource.
# Java
- [ ] Algorithms (Sedgewick and Wayne)
- videos with book content (and Sedgewick!) on coursera:
OR:
- [ ] Data Structures and Algorithms in Java
- by Goodrich, Tamassia, Goldwasser
- used as optional text for CS intro course at UC Berkeley
- see my book report on the Python version below. This book covers the same topics
# Python
- [ ] Data Structures and Algorithms in Python
- by Goodrich, Tamassia, Goldwasser
- I loved this book. It covered everything and more
- Pythonic code
- my glowing book report: https://startupnextdoor.com/book-report-data-structures-and-algorithms-in-python/
- [ ] Open Data Structures in Python
# Before you Get Started
This list grew over many months, and yes, it kind of got out of hand.
Here are some mistakes I made so you'll have a better experience.
# 1. You Won't Remember it All
I watched hours of videos and took copious notes, and months later there was much I didn't remember. I spent 3 days going through my notes and making flashcards, so I could review.
Please, read so you won't make my mistakes:
Retaining Computer Science Knowledge.
A course recommended to me (haven't taken it): Learning how to Learn.
# 2. Use Flashcards
To solve the problem, I made a little flashcards site where I could add flashcards of 2 types: general and code. Each card has different formatting.
I made a mobile-first website, so I could review on my phone and tablet, wherever I am.
Make your own for free:
- Flashcards site repo
- My flash cards database (old - 1200 cards):
- My flash cards database (new - 1800 cards):
Keep in mind I went overboard and have cards covering everything from assembly language and Python trivia to machine learning and statistics. It's way too much for what's required.
Note on flashcards: The first time you recognize you know the answer, don't mark it as known. You have to see the same card and answer it several times correctly before you really know it. Repetition will put that knowledge deeper in your brain.
An alternative to using my flashcard site is Anki, which has been recommended to me numerous times. It uses a repetition system to help you remember. It's user-friendly, available on all platforms and has a cloud sync system. It costs $25 on iOS but is free on other platforms.
My flashcard database in Anki format: https://ankiweb.net/shared/info/25173560 (thanks @xiewenya).
# 3. Start doing coding interview questions while you're learning data structures and algorithms
You need to apply what you're learning to solving problems, or you'll forget. I made this mistake. Once you've learned a topic, and feel comfortable with it, like linked lists, open one of the coding interview books and do a couple of questions regarding linked lists. Then move on to the next learning topic. Then later, go back and do another linked list problem, or recursion problem, or whatever. But keep doing problems while you're learning. You're not being hired for knowledge, but how you apply the knowledge. There are several books and sites I recommend. See here for more: Coding Question Practice.
# 4. Review, review, review
I keep a set of cheat sheets on ASCII, OSI stack, Big-O notations, and more. I study them when I have some spare time.
Take a break from programming problems for a half hour and go through your flashcards.
# 5. Focus
There are a lot of distractions that can take up valuable time. Focus and concentration are hard. Turn on some music without lyrics and you'll be able to focus pretty well.
# What you won't see covered
These are prevalent technologies but not part of this study plan:
- SQL
- JavaScript
- HTML, CSS, and other front-end technologies
# The Daily Plan
Some subjects take one day, and some will take multiple days. Some are just learning with nothing to implement.
Each day I take one subject from the list below, watch videos about that subject, and write an implementation in:
- C - using structs and functions that take a struct * and something else as args
- C++ - without using built-in types
- C++ - using built-in types, like STL's std::list for a linked list
- Python - using built-in types (to keep practicing Python)
- and write tests to ensure I'm doing it right, sometimes just using simple assert() statements
- You may do Java or something else, this is just my thing
You don't need all these. You need only one language for the interview.
Why code in all of these?
- Practice, practice, practice, until I'm sick of it, and can do it with no problem (some have many edge cases and bookkeeping details to remember)
- Work within the raw constraints (allocating/freeing memory without help of garbage collection (except Python or Java))
- Make use of built-in types, so I have experience using the built-in tools for real-world use (not going to write my own linked list implementation in production)
I may not have time to do all of these for every subject, but I'll try.
You can see my code here:
You don't need to memorize the guts of every algorithm.
Write code on a whiteboard or paper, not a computer. Test with some sample inputs. Then test it out on a computer.
# Prerequisite Knowledge
[ ] Learn C
- C is everywhere. You'll see examples in books, lectures, videos, everywhere while you're studying
- [ ] The C Programming Language, 2nd Edition
- This is a short book, but it will give you a great handle on the C language and if you practice it a little you'll quickly get proficient. Understanding C helps you understand how programs and memory work
- Answers to questions
[ ] How computers process a program:
# Algorithmic complexity / Big-O / Asymptotic analysis
- Nothing to implement
- There are a lot of videos here. Just watch enough until you understand it. You can always come back and review
- If some lectures are too mathy, you can jump down to the bottom and watch the discrete mathematics videos to get the background knowledge
- [ ] Harvard CS50 - Asymptotic Notation (video)
- [ ] Big O Notations (general quick tutorial) (video)
- [ ] Big O Notation (and Omega and Theta) - best mathematical explanation (video)
- [ ] Skiena:
- [ ] A Gentle Introduction to Algorithm Complexity Analysis
- [ ] Orders of Growth (video)
- [ ] Asymptotics (video)
- [ ] UC Berkeley Big O (video)
- [ ] UC Berkeley Big Omega (video)
- [ ] Amortized Analysis (video)
- [ ] Illustrating "Big O" (video)
- [ ] TopCoder (includes recurrence relations and master theorem):
- [ ] Cheat sheet
# Data Structures
# Arrays
- Implement an automatically resizing vector.
- [ ] Description:
- [ ] Implement a vector (mutable array with automatic resizing):
- [ ] Practice coding using arrays and pointers, and pointer math to jump to an index instead of using indexing.
- [ ] New raw data array with allocated memory
- can allocate int array under the hood, just not use its features
- start with 16, or if starting number is greater, use power of 2 - 16, 32, 64, 128
- [ ] size() - number of items
- [ ] capacity() - number of items it can hold
- [ ] is_empty()
- [ ] at(index) - returns item at given index, blows up if index out of bounds
- [ ] push(item)
- [ ] insert(index, item) - inserts item at index, shifts that index's value and trailing elements to the right
- [ ] prepend(item) - can use insert above at index 0
- [ ] pop() - remove from end, return value
- [ ] delete(index) - delete item at index, shifting all trailing elements left
- [ ] remove(item) - looks for value and removes index holding it (even if in multiple places)
- [ ] find(item) - looks for value and returns first index with that value, -1 if not found
- [ ] resize(new_capacity) // private function
- when you reach capacity, resize to double the size
- when popping an item, if size is 1/4 of capacity, resize to half
- [ ] Time
- O(1) to add/remove at end (amortized for allocations for more space), index, or update
- O(n) to insert/remove elsewhere
- [ ] Space
- contiguous in memory, so proximity helps performance
- space needed = (array capacity, which is >= n) * size of item, but even if 2n, still O(n)
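Here's a minimal Python sketch of the resizing vector described above. It's only an illustration, not my actual implementation: the class name `Vector`, the `MIN_CAPACITY` floor on shrinking, and the use of a pre-sized Python list as the "raw" storage are all assumptions, and only some of the listed operations are shown.

```python
class Vector:
    """Resizable array backed by a fixed-capacity list (stand-in for raw memory)."""

    MIN_CAPACITY = 16  # assumption: never shrink below the starting size

    def __init__(self, capacity=MIN_CAPACITY):
        # Round the requested capacity up to a power of two: 16, 32, 64, ...
        self._capacity = self.MIN_CAPACITY
        while self._capacity < capacity:
            self._capacity *= 2
        self._size = 0
        self._data = [None] * self._capacity

    def size(self):
        return self._size

    def capacity(self):
        return self._capacity

    def is_empty(self):
        return self._size == 0

    def at(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of bounds")
        return self._data[index]

    def push(self, item):
        if self._size == self._capacity:
            self._resize(self._capacity * 2)  # double when full
        self._data[self._size] = item
        self._size += 1

    def insert(self, index, item):
        if not 0 <= index <= self._size:
            raise IndexError("index out of bounds")
        if self._size == self._capacity:
            self._resize(self._capacity * 2)
        # Shift the value at index and all trailing elements one slot right.
        for i in range(self._size, index, -1):
            self._data[i] = self._data[i - 1]
        self._data[index] = item
        self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty vector")
        self._size -= 1
        item = self._data[self._size]
        self._data[self._size] = None
        # Halve the storage once only a quarter of it is in use.
        if (self._capacity > self.MIN_CAPACITY
                and self._size <= self._capacity // 4):
            self._resize(self._capacity // 2)
        return item

    def find(self, item):
        for i in range(self._size):
            if self._data[i] == item:
                return i
        return -1

    def _resize(self, new_capacity):
        # Private: copy the live elements into a new block.
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity
```

Shrinking at 1/4 rather than 1/2 keeps a push/pop sequence near the boundary from resizing on every call.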
# Linked Lists
- [ ] Description:
- [ ] C Code (video) - not the whole video, just portions about Node struct and memory allocation
- [ ] Linked List vs Arrays:
- [ ] why you should avoid linked lists (video)
- [ ] Gotcha: you need pointer to pointer knowledge: (for when you pass a pointer to a function that may change the address where that pointer points) This page is just to get a grasp on ptr to ptr. I don't recommend this list traversal style. Readability and maintainability suffer due to cleverness.
- [ ] Implement (I did with tail pointer & without):
- [ ] size() - returns number of data elements in list
- [ ] empty() - bool returns true if empty
- [ ] value_at(index) - returns the value of the nth item (starting at 0 for first)
- [ ] push_front(value) - adds an item to the front of the list
- [ ] pop_front() - remove front item and return its value
- [ ] push_back(value) - adds an item at the end
- [ ] pop_back() - removes end item and returns its value
- [ ] front() - get value of front item
- [ ] back() - get value of end item
- [ ] insert(index, value) - insert value at index, so current item at that index is pointed to by new item at index
- [ ] erase(index) - removes node at given index
- [ ] value_n_from_end(n) - returns the value of the node at nth position from the end of the list
- [ ] reverse() - reverses the list
- [ ] remove_value(value) - removes the first item in the list with this value
- [ ] Doubly-linked List
- Description (video)
- No need to implement
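A few of the singly-linked list operations above, sketched in Python. Treat this as one possible shape rather than the reference solution: the names `Node`, `LinkedList` and the `to_list` helper are mine, there's no tail pointer, and I'm assuming `value_n_from_end(1)` means the last node.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        self.head = Node(value, self.head)

    def pop_front(self):
        if self.head is None:
            raise IndexError("pop from empty list")
        value = self.head.value
        self.head = self.head.next
        return value

    def value_n_from_end(self, n):
        """Two-pointer walk; assumes n = 1 refers to the last node."""
        if n < 1:
            raise IndexError("n must be at least 1")
        lead = self.head
        for _ in range(n):
            if lead is None:
                raise IndexError("list is shorter than n")
            lead = lead.next
        trail = self.head
        # When lead falls off the end, trail is n nodes from the end.
        while lead is not None:
            lead = lead.next
            trail = trail.next
        return trail.value

    def reverse(self):
        prev = None
        node = self.head
        while node is not None:
            next_node = node.next  # remember the rest of the list
            node.next = prev       # point this node backwards
            prev = node
            node = next_node
        self.head = prev

    def to_list(self):
        """Hypothetical helper, only here to make testing easy."""
        values = []
        node = self.head
        while node is not None:
            values.append(node.value)
            node = node.next
        return values
```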
# Stack
- [ ] Stacks (video)
- [ ] Will not implement. Implementing with array is trivial
# Queue
- [ ] Queue (video)
- [ ] Circular buffer/FIFO
- [ ] Implement using linked-list, with tail pointer:
- enqueue(value) - adds value at position at tail
- dequeue() - returns value and removes least recently added element (front)
- empty()
- [ ] Implement using fixed-sized array:
- enqueue(value) - adds item at end of available storage
- dequeue() - returns value and removes least recently added element
- empty()
- full()
- [ ] Cost:
- a bad implementation using linked list where you enqueue at head and dequeue at tail would be O(n) because you'd need the next to last element, causing a full traversal each dequeue
- enqueue: O(1) (amortized, linked list and array [probing])
- dequeue: O(1) (linked list and array)
- empty: O(1) (linked list and array)
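The fixed-sized array version can be sketched as a circular buffer. This is a rough Python illustration under my own assumptions (class name `ArrayQueue`, raising exceptions when full or empty); the point is that both ends move by index arithmetic, so enqueue and dequeue stay O(1).

```python
class ArrayQueue:
    """FIFO queue on a fixed-size array, used as a circular buffer."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._head = 0  # index of the least recently added element
        self._size = 0

    def empty(self):
        return self._size == 0

    def full(self):
        return self._size == len(self._data)

    def enqueue(self, value):
        if self.full():
            raise OverflowError("queue is full")
        # The tail wraps around to the start of the array when needed.
        tail = (self._head + self._size) % len(self._data)
        self._data[tail] = value
        self._size += 1

    def dequeue(self):
        if self.empty():
            raise IndexError("dequeue from empty queue")
        value = self._data[self._head]
        self._data[self._head] = None
        self._head = (self._head + 1) % len(self._data)
        self._size -= 1
        return value
```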
# Hash table
[ ] Videos:
- [ ] Hashing with Chaining (video)
- [ ] Table Doubling, Karp-Rabin (video)
- [ ] Open Addressing, Cryptographic Hashing (video)
- [ ] PyCon 2010: The Mighty Dictionary (video)
- [ ] (Advanced) Randomization: Universal & Perfect Hashing (video)
- [ ] (Advanced) Perfect hashing (video)
[ ] Online Courses:
- [ ] Core Hash Tables (video)
- [ ] Data Structures (video)
- [ ] Phone Book Problem (video)
- [ ] distributed hash tables:
[ ] Implement with array using linear probing
- hash(k, m) - m is size of hash table
- add(key, value) - if key already exists, update value
- exists(key)
- get(key)
- remove(key)
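Below is a small Python sketch of a linear-probing table with the operations listed above. Several things here are my assumptions, not requirements from the list: the table never resizes, deleted slots are marked with a tombstone sentinel so later lookups keep probing past them, and missing keys raise `KeyError`.

```python
_EMPTY = object()    # slot never used
_DELETED = object()  # tombstone: slot was used, then removed


class HashTable:
    def __init__(self, m=8):
        self._m = m  # fixed table size (no resizing in this sketch)
        self._keys = [_EMPTY] * m
        self._values = [None] * m

    def _hash(self, key, m):
        return hash(key) % m

    def _probe(self, key):
        """Return (index holding key or None, first reusable slot or None)."""
        start = self._hash(key, self._m)
        free = None
        for step in range(self._m):
            i = (start + step) % self._m
            slot = self._keys[i]
            if slot is _EMPTY:
                # An empty slot ends the probe sequence: key is absent.
                return None, (i if free is None else free)
            if slot is _DELETED:
                if free is None:
                    free = i
            elif slot == key:
                return i, free
        return None, free

    def add(self, key, value):
        found, free = self._probe(key)
        if found is not None:
            self._values[found] = value  # key exists: update
            return
        if free is None:
            raise OverflowError("hash table is full")
        self._keys[free] = key
        self._values[free] = value

    def exists(self, key):
        return self._probe(key)[0] is not None

    def get(self, key):
        found, _ = self._probe(key)
        if found is None:
            raise KeyError(key)
        return self._values[found]

    def remove(self, key):
        found, _ = self._probe(key)
        if found is None:
            raise KeyError(key)
        self._keys[found] = _DELETED
        self._values[found] = None
```

Without the tombstone, removing a key in the middle of a probe chain would make every key stored after it unreachable.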
# More Knowledge
# Binary search
- [ ] Binary Search (video)
- [ ] Binary Search (video)
- [ ] detail
- [ ] Implement:
- binary search (on sorted array of integers)
- binary search using recursion
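Both variants can be sketched in a few lines of Python. The function names and the choice to return -1 when the target is missing are my own assumptions for illustration.

```python
def binary_search(items, target):
    """Iterative search on a sorted list; returns an index or -1."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1


def binary_search_recursive(items, target, lo=0, hi=None):
    """Same search, expressed with recursion on the [lo, hi] range."""
    if hi is None:
        hi = len(items) - 1
    if lo > hi:
        return -1
    mid = lo + (hi - lo) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search_recursive(items, target, mid + 1, hi)
    return binary_search_recursive(items, target, lo, mid - 1)
```

The `lo + (hi - lo) // 2` form doesn't matter in Python, but it avoids integer overflow if you port this to C or Java.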
# Bitwise operations
- [ ] Bits cheat sheet - you should know many of the powers of 2 from (2^1 to 2^16 and 2^32)
- [ ] Get a really good understanding of manipulating bits with: &, |, ^, ~, >>, <<
- [ ] words
- [ ] Good intro: Bit Manipulation (video)
- [ ] C Programming Tutorial 2-10: Bitwise Operators (video)
- [ ] Bit Manipulation
- [ ] Bitwise Operation
- [ ] Bithacks
- [ ] The Bit Twiddler
- [ ] The Bit Twiddler Interactive
- [ ] Bit Hacks (video)
- [ ] Practice Operations
- [ ] 2s and 1s complement
- [ ] Count set bits
- [ ] Swap values:
- [ ] Absolute value:
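To make the last few items concrete, here are some common bit tricks sketched in Python. These are well-known techniques rather than anything taken from the linked pages, and the function names are mine. Note that Python integers are unbounded, so `abs_32` assumes its input fits in a signed 32-bit range.

```python
def count_set_bits(n):
    """Kernighan's method: n & (n - 1) clears the lowest set bit.

    Assumes n >= 0.
    """
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count


def xor_swap(a, b):
    """Swap two integers without a temporary, using XOR."""
    a ^= b
    b ^= a
    a ^= b
    return a, b


def abs_32(x):
    """Branch-free absolute value, assuming x fits in signed 32 bits."""
    mask = x >> 31  # 0 for non-negative x, -1 (all ones) for negative x
    return (x ^ mask) - mask


def twos_complement(x, bits=8):
    """Bit pattern of x as an unsigned value in the given width."""
    return x & ((1 << bits) - 1)
```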
# Trees
# Trees - Notes & Background
- [ ] Series: Trees (video)
- basic tree construction
- traversal
- manipulation algorithms
- [ ] BFS (breadth-first search) and DFS (depth-first search) (video)
- BFS notes:
- level order (BFS, using queue)
- time complexity: O(n)
- space complexity: best: O(1), worst: O(n/2)=O(n)
- DFS notes:
- time complexity: O(n)
- space complexity: best: O(log n) - avg. height of tree, worst: O(n)
- inorder (DFS: left, self, right)
- postorder (DFS: left, right, self)
- preorder (DFS: self, left, right)
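The four traversals in these notes can be sketched in Python as follows. The `TreeNode` class and function names are assumptions for illustration; each function returns the visited values as a list.

```python
from collections import deque


class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def level_order(root):
    """BFS: visit nodes level by level using a queue."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        order.append(node.value)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return order


def inorder(root):
    """DFS: left, self, right."""
    if root is None:
        return []
    return inorder(root.left) + [root.value] + inorder(root.right)


def preorder(root):
    """DFS: self, left, right."""
    if root is None:
        return []
    return [root.value] + preorder(root.left) + preorder(root.right)


def postorder(root):
    """DFS: left, right, self."""
    if root is None:
        return []
    return postorder(root.left) + postorder(root.right) + [root.value]
```

The list concatenation keeps the sketch short but copies values at every level; passing an accumulator list down is the usual fix if that matters.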
# Binary search trees: BSTs
- [ ] Binary Search Tree Review (video)
- [ ] Series (video)
- starts with symbol table and goes through BST applications
- [ ] Introduction (video)
- [ ] MIT (video)
- C/C++:
- [ ] Binary search tree - Implementation in C/C++ (video)
- [ ] BST implementation - memory allocation in stack and heap (video)
- [ ] Find min and max element in a binary search tree (video)
- [ ] Find height of a binary tree (video)
- [ ] Binary tree traversal - breadth-first and depth-first strategies (video)
- [ ] Binary tree: Level Order Traversal (video)
- [ ] Binary tree traversal: Preorder, Inorder, Postorder (video)
- [ ] Check if a binary tree is binary search tree or not (video)
- [ ] Delete a node from Binary Search Tree (video)
- [ ] Inorder Successor in a binary search tree (video)
- [ ] Implement:
- [ ] insert // insert value into tree
- [ ] get_node_count // get count of values stored
- [ ] print_values // prints the values in the tree, from min to max
- [ ] delete_tree
- [ ] is_in_tree // returns true if given value exists in the tree
- [ ] get_height // returns the height in nodes (single node's height is 1)
- [ ] get_min // returns the minimum value stored in the tree
- [ ] get_max // returns the maximum value stored in the tree
- [ ] is_binary_search_tree
- [ ] delete_value
- [ ] get_successor // returns next-highest value in tree after given value, -1 if none
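Here is a partial Python sketch covering several of the functions above (not `delete_value` or `delete_tree`). It's one way to do it under my own assumptions: free functions on a `BSTNode` class, duplicate values ignored on insert, and `get_min`/`get_max` called only on non-empty trees.

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value and return the (possibly new) root. Duplicates ignored."""
    if root is None:
        return BSTNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root


def is_in_tree(root, value):
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False


def get_height(root):
    """Height in nodes: an empty tree is 0, a single node is 1."""
    if root is None:
        return 0
    return 1 + max(get_height(root.left), get_height(root.right))


def get_min(root):
    while root.left is not None:
        root = root.left
    return root.value


def get_max(root):
    while root.right is not None:
        root = root.right
    return root.value


def get_successor(root, value):
    """Next-highest value after the given value, or -1 if none."""
    successor = -1
    while root is not None:
        if value < root.value:
            successor = root.value  # candidate; look for a smaller one
            root = root.left
        else:
            root = root.right
    return successor


def is_binary_search_tree(root, low=None, high=None):
    """Check every node against the (low, high) bounds set by its ancestors."""
    if root is None:
        return True
    if low is not None and root.value <= low:
        return False
    if high is not None and root.value >= high:
        return False
    return (is_binary_search_tree(root.left, low, root.value)
            and is_binary_search_tree(root.right, root.value, high))
```

Checking only each node against its direct children is the classic mistake in `is_binary_search_tree`; the bounds have to be carried down from all ancestors.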
# Heap / Priority Queue / Binary Heap
- visualized as a tree, but is usually linear in storage (array, linked list)
- [ ] Heap
- [ ] Introduction (video)
- [ ] Naive Implementations (video)
- [ ] Binary Trees (video)
- [ ] Tree Height Remark (video)
- [ ] Basic Operations (video)
- [ ] Complete Binary Trees (video)
- [ ] Pseudocode (video)
- [ ] Heap Sort - jumps to start (video)
- [ ] Heap Sort (video)
- [ ] Building a heap (video)
- [ ] MIT: Heaps and Heap Sort (video)
- [ ] CS 61B Lecture 24: Priority Queues (video)
- [ ] Linear Time BuildHeap (max-heap)
- [ ] Implement a max-heap:
- [ ] insert
- [ ] sift_up - needed for insert
- [ ] get_max - returns the max item, without removing it
- [ ] get_size() - return number of elements stored
- [ ] is_empty() - returns true if heap contains no elements
- [ ] extract_max - returns the max item, removing it
- [ ] sift_down - needed for extract_max
- [ ] remove(i) - removes item at index i
- [ ] heapify - create a heap from an array of elements, needed for heap_sort
- [ ] heap_sort() - take an unsorted array and turn it into a sorted array in-place using a max heap or min heap
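A compact Python sketch of the core max-heap operations, stored in a plain list where the children of index i live at 2i+1 and 2i+2. I've written them as free functions on a list, which is my own simplification; `remove(i)` and the size helpers are left out.

```python
def sift_up(heap, i):
    """Move heap[i] up until its parent is at least as large."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            return
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def sift_down(heap, i, size):
    """Move heap[i] down within heap[:size] until both children are smaller."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < size and heap[left] > heap[largest]:
            largest = left
        if right < size and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def insert(heap, item):
    heap.append(item)
    sift_up(heap, len(heap) - 1)


def extract_max(heap):
    if not heap:
        raise IndexError("extract from empty heap")
    heap[0], heap[-1] = heap[-1], heap[0]
    top = heap.pop()
    sift_down(heap, 0, len(heap))
    return top


def heapify(items):
    """Build a max-heap in place, starting from the last parent node."""
    for i in range(len(items) // 2 - 1, -1, -1):
        sift_down(items, i, len(items))


def heap_sort(items):
    """Sort ascending in place: repeatedly move the max to the end."""
    heapify(items)
    for end in range(len(items) - 1, 0, -1):
        items[0], items[end] = items[end], items[0]
        sift_down(items, 0, end)
```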
# Sorting
[ ] Notes:
- Implement sorts & know best case/worst case, average complexity of each:
- no bubble sort - it's terrible - O(n^2), except when n <= 16
- [ ] Stability in sorting algorithms ("Is Quicksort stable?")
- [ ] Which algorithms can be used on linked lists? Which on arrays? Which on both?
- I wouldn't recommend sorting a linked list, but merge sort is doable.
- Merge Sort For Linked List
For heapsort, see Heap data structure above. Heap sort is great, but not stable
[ ] UC Berkeley:
[ ] Merge sort code:
[ ] Quick sort code:
[ ] Implement:
- [ ] Mergesort: O(n log n) average and worst case
- [ ] Quicksort: O(n log n) average case
- Selection sort and insertion sort are both O(n^2) average and worst case
- For heapsort, see Heap data structure above
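For reference, here is a Python sketch of the two sorts to implement. These are standard textbook versions, not the linked code: `merge_sort` returns a new list, and `quicksort` sorts in place using a random pivot with Lomuto partitioning, which is just one of several partition schemes.

```python
import random


def merge_sort(items):
    """Return a new sorted list. Stable: equal items keep their order."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= takes from the left first (stability)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def quicksort(items, lo=0, hi=None):
    """Sort items[lo..hi] in place. Not stable."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    # A random pivot makes the O(n^2) worst case unlikely on sorted input.
    pivot_index = random.randint(lo, hi)
    items[pivot_index], items[hi] = items[hi], items[pivot_index]
    pivot = items[hi]
    store = lo
    for i in range(lo, hi):
        if items[i] < pivot:
            items[i], items[store] = items[store], items[i]
            store += 1
    items[store], items[hi] = items[hi], items[store]
    quicksort(items, lo, store - 1)
    quicksort(items, store + 1, hi)
```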
[ ] Not required, but I recommend them:
- [ ] Sedgewick - Radix Sorts (6 videos)
- [ ] 1. Strings in Java
- [ ] 2. Key Indexed Counting
- [ ] 3. Least Significant Digit First String Radix Sort
- [ ] 4. Most Significant Digit First String Radix Sort
- [ ] 5. 3 Way Radix Quicksort
- [ ] 6. Suffix Arrays
- [ ] Radix Sort
- [ ] Radix Sort (video)
- [ ] Radix Sort, Counting Sort (linear time given constraints) (video)
- [ ] Randomization: Matrix Multiply, Quicksort, Freivalds' algorithm (video)
- [ ] Sorting in Linear Time (video)
As a summary, here is a visual representation of 15 sorting algorithms. If you need more detail on this subject, see "Sorting" section in Additional Detail on Some Subjects
# Graphs
Graphs can be used to represent many problems in computer science, so this section is long, like trees and sorting were.
Notes:
- There are 4 basic ways to represent a graph in memory:
- objects and pointers
- adjacency matrix
- adjacency list
- adjacency map
- Familiarize yourself with each representation and its pros & cons
- BFS and DFS - know their computational complexity, their trade offs, and how to implement them in real code
- When asked a question, look for a graph-based solution first, then move on if none
[ ] MIT(videos):
[ ] Skiena Lectures - great intro:
- [ ] CSE373 2012 - Lecture 11 - Graph Data Structures (video)
- [ ] CSE373 2012 - Lecture 12 - Breadth-First Search (video)
- [ ] CSE373 2012 - Lecture 13 - Graph Algorithms (video)
- [ ] CSE373 2012 - Lecture 14 - Graph Algorithms (con't) (video)
- [ ] CSE373 2012 - Lecture 15 - Graph Algorithms (con't 2) (video)
- [ ] CSE373 2012 - Lecture 16 - Graph Algorithms (con't 3) (video)
[ ] Graphs (review and more):
- [ ] 6.006 Single-Source Shortest Paths Problem (video)
- [ ] 6.006 Dijkstra (video)
- [ ] 6.006 Bellman-Ford (video)
- [ ] 6.006 Speeding Up Dijkstra (video)
- [ ] Aduni: Graph Algorithms I - Topological Sorting, Minimum Spanning Trees, Prim's Algorithm - Lecture 6 (video)
- [ ] Aduni: Graph Algorithms II - DFS, BFS, Kruskal's Algorithm, Union Find Data Structure - Lecture 7 (video)
- [ ] Aduni: Graph Algorithms III: Shortest Path - Lecture 8 (video)
- [ ] Aduni: Graph Alg. IV: Intro to geometric algorithms - Lecture 9 (video)
- [ ] CS 61B 2014 (starting at 58:09) (video)
- [ ] CS 61B 2014: Weighted graphs (video)
- [ ] Greedy Algorithms: Minimum Spanning Tree (video)
- [ ] Strongly Connected Components Kosaraju's Algorithm Graph Algorithm (video)
Full Coursera Course:
I'll implement:
- [ ] DFS with adjacency list (recursive)
- [ ] DFS with adjacency list (iterative with stack)
- [ ] DFS with adjacency matrix (recursive)
- [ ] DFS with adjacency matrix (iterative with stack)
- [ ] BFS with adjacency list
- [ ] BFS with adjacency matrix
- [ ] single-source shortest path (Dijkstra)
- [ ] minimum spanning tree
- DFS-based algorithms (see Aduni videos above):
- [ ] check for cycle (needed for topological sort, since we'll check for cycle before starting)
- [ ] topological sort
- [ ] count connected components in a graph
- [ ] list strongly connected components
- [ ] check for bipartite graph
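A few of these can be sketched in Python with an adjacency list. I'm assuming the graph is a dict mapping each node to a list of its neighbors (every node appears as a key), and the function names are mine. The topological sort does its cycle check during the DFS by tracking which nodes are still on the current path.

```python
from collections import deque


def bfs(graph, start):
    """Breadth-first order of nodes reachable from start."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


def dfs_recursive(graph, node, visited=None, order=None):
    """Depth-first order using the call stack."""
    if visited is None:
        visited, order = set(), []
    visited.add(node)
    order.append(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs_recursive(graph, neighbor, visited, order)
    return order


def dfs_iterative(graph, start):
    """Depth-first order using an explicit stack."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so neighbors are visited in their listed order.
        for neighbor in reversed(graph[node]):
            if neighbor not in visited:
                stack.append(neighbor)
    return order


def topological_sort(graph):
    """DFS-based topological order; raises ValueError on a cycle."""
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}
    order = []

    def visit(node):
        state[node] = ON_PATH
        for neighbor in graph[node]:
            if state[neighbor] == ON_PATH:
                raise ValueError("graph has a cycle")
            if state[neighbor] == UNSEEN:
                visit(neighbor)
        state[node] = DONE
        order.append(node)  # appended after all descendants

    for node in graph:
        if state[node] == UNSEEN:
            visit(node)
    order.reverse()
    return order
```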
# Even More Knowledge
# Recursion
- [ ] Stanford lectures on recursion & backtracking:
- When is it appropriate to use it?
- How is tail recursion better than non-tail recursion?
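To make the tail recursion question concrete, here's a small Python sketch comparing three ways to compute a factorial (my example, not from the lectures). In a tail call, the recursive call is the last thing the function does, so a compiler that supports tail-call elimination can reuse the stack frame. CPython does not do this, which is why the loop version is included.

```python
def factorial(n):
    """Plain recursion: the multiply happens after the call returns,
    so every frame has to stay on the stack."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)


def factorial_tail(n, acc=1):
    """Tail-recursive form: the result is carried in an accumulator,
    and the recursive call is the final action."""
    if n <= 1:
        return acc
    return factorial_tail(n - 1, acc * n)


def factorial_loop(n):
    """What tail-call elimination effectively turns the above into."""
    acc = 1
    while n > 1:
        acc *= n
        n -= 1
    return acc
```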
# Dynamic Programming
- You probably won't see any dynamic programming problems in your interview, but it's worth being able to recognize a problem as being a candidate for dynamic programming.
- This subject can be pretty difficult, as each DP-solvable problem must be defined as a recurrence relation, and coming up with it can be tricky.
- I suggest looking at many examples of DP problems until you have a solid understanding of the pattern involved.
- [ ] Videos:
- the Skiena videos can be hard to follow since he sometimes uses the whiteboard, which is too small to see
- [ ] Skiena: CSE373 2012 - Lecture 19 - Introduction to Dynamic Programming (video) (opens new window)
- [ ] Skiena: CSE373 2012 - Lecture 20 - Edit Distance (video) (opens new window)
- [ ] Skiena: CSE373 2012 - Lecture 21 - Dynamic Programming Examples (video) (opens new window)
- [ ] Skiena: CSE373 2012 - Lecture 22 - Applications of Dynamic Programming (video) (opens new window)
- [ ] Simonson: Dynamic Programming 0 (starts at 59:18) (video) (opens new window)
- [ ] Simonson: Dynamic Programming I - Lecture 11 (video) (opens new window)
- [ ] Simonson: Dynamic programming II - Lecture 12 (video) (opens new window)
- [ ] List of individual DP problems (each is short): Dynamic Programming (video) (opens new window)
- [ ] Yale Lecture notes:
- [ ] Coursera:
- [ ] The RNA secondary structure problem (video) (opens new window)
- [ ] A dynamic programming algorithm (video) (opens new window)
- [ ] Illustrating the DP algorithm (video) (opens new window)
- [ ] Running time of the DP algorithm (video) (opens new window)
- [ ] DP vs. recursive implementation (video) (opens new window)
- [ ] Global pairwise sequence alignment (video) (opens new window)
- [ ] Local pairwise sequence alignment (video) (opens new window)
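To show what "defining the problem as a recurrence relation" looks like in practice, here is a minimal bottom-up sketch of edit distance (the topic of Skiena's Lecture 20 above). This is my own illustration, not code from the lectures; it assumes unit cost for insert, delete, and substitute.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, and
    substitutions needed to turn string a into string b."""
    rows, cols = len(a) + 1, len(b) + 1
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * cols for _ in range(rows)]

    # Base cases: turning a prefix into the empty string (or back).
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            substitution_cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # delete from a
                dist[i][j - 1] + 1,                      # insert into a
                dist[i - 1][j - 1] + substitution_cost,  # match or substitute
            )
    return dist[-1][-1]


print(edit_distance("kitten", "sitting"))  # 3
```

The table takes O(len(a) * len(b)) time and space. Each cell depends only on three neighbors, which is the pattern to look for in other DP problems.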
# Object-Oriented Programming
- [ ] Optional: UML 2.0 Series (video) (opens new window)
- [ ] SOLID OOP Principles: SOLID Principles (video) (opens new window)
# Design patterns
- [ ] Quick UML review (video) (opens new window)
- [ ] Learn these patterns:
- [ ] strategy
- [ ] singleton
- [ ] adapter
- [ ] prototype
- [ ] decorator
- [ ] visitor
- [ ] factory, abstract factory
- [ ] facade
- [ ] observer
- [ ] proxy
- [ ] delegate
- [ ] command
- [ ] state
- [ ] memento
- [ ] iterator
- [ ] composite
- [ ] flyweight
- [ ] Chapter 6 (Part 1) - Patterns (video) (opens new window)
- [ ] Chapter 6 (Part 2) - Abstraction-Occurrence, General Hierarchy, Player-Role, Singleton, Observer, Delegation (video) (opens new window)
- [ ] Chapter 6 (Part 3) - Adapter, Facade, Immutable, Read-Only Interface, Proxy (video) (opens new window)
- [ ] Series of videos (27 videos) (opens new window)
- [ ] Head First Design Patterns (opens new window)
- I know the canonical book is "Design Patterns: Elements of Reusable Object-Oriented Software", but Head First is great for beginners to OO.
- [ ] Handy reference: 101 Design Patterns & Tips for Developers (opens new window)
- [ ] Design patterns for humans (opens new window)
# Combinatorics (n choose k) & Probability
- [ ] Math Skills: How to find Factorial, Permutation and Combination (Choose) (video) (opens new window)
- [ ] Make School: Probability (video) (opens new window)
- [ ] Make School: More Probability and Markov Chains (video) (opens new window)
- [ ] Khan Academy:
- Course layout:
- Just the videos - 41 (each is simple and short):
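For the "n choose k" part of this section, here is a small sketch of computing combinations without building large factorials. The `choose` function is my own illustrative helper; on Python 3.8+ the standard library's `math.comb` does the same job.

```python
import math


def choose(n: int, k: int) -> int:
    """Number of ways to pick k items from n, ignoring order."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # C(n, k) == C(n, n - k), so use the smaller k
    result = 1
    for i in range(1, k + 1):
        # After this step, result == C(n - k + i, i), which is always an integer.
        result = result * (n - k + i) // i
    return result


print(choose(5, 2))   # 10
print(choose(52, 5))  # 2598960 possible 5-card poker hands
print(choose(52, 5) == math.comb(52, 5))  # True
```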
# NP, NP-Complete and Approximation Algorithms
- Know about the most famous classes of NP-complete problems, such as traveling salesman and the knapsack problem, and be able to recognize them when an interviewer asks about them in disguise.
- Know what NP-complete means.
- [ ] Computational Complexity (video) (opens new window)
- [ ] Simonson:
- [ ] Skiena:
- [ ] Complexity: P, NP, NP-completeness, Reductions (video) (opens new window)
- [ ] Complexity: Approximation Algorithms (video) (opens new window)
- [ ] Complexity: Fixed-Parameter Algorithms (video) (opens new window)
- Peter Norvig discusses near-optimal solutions to traveling salesman problem:
- Pages 1048 - 1140 in CLRS if you have it.
# Caches
- [ ] LRU cache:
- [ ] CPU cache:
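If you want something to code up for the LRU cache item, here is one possible sketch built on `collections.OrderedDict`. The class and method names are my own choices. An interviewer may instead expect a hash map plus a doubly linked list written by hand, which is the same idea.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

Both `get` and `put` run in O(1) average time.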
# Processes and Threads
- [ ] Computer Science 162 - Operating Systems (25 videos):
- for processes and threads see videos 1-11
- Operating Systems and System Programming (video) (opens new window)
- What Is The Difference Between A Process And A Thread? (opens new window)
- Covers:
- Processes, Threads, Concurrency issues
- Difference between processes and threads
- Processes
- Threads
- Locks
- Mutexes
- Semaphores
- Monitors
- How they work
- Deadlock
- Livelock
- CPU activity, interrupts, context switching
- Modern concurrency constructs with multicore processors
- Paging, segmentation and virtual memory (video) (opens new window)
- Interrupts (video) (opens new window)
- Process resource needs (memory: code, static storage, stack, heap, and also file descriptors, i/o)
- Thread resource needs (shares the above, minus the stack, with other threads in the same process, but each has its own program counter, stack pointer, registers, and stack)
- Forking is really copy-on-write: parent and child share memory pages read-only until one of them writes to a page, and only then is that page copied.
- Context switching
- How is context switching initiated by the operating system and underlying hardware?
- [ ] threads in C++ (series - 10 videos) (opens new window)
- [ ] concurrency in Python (videos):
- [ ] Short series on threads (opens new window)
- [ ] Python Threads (opens new window)
- [ ] Understanding the Python GIL (2010) (opens new window)
- [ ] David Beazley - Python Concurrency From the Ground Up: LIVE! - PyCon 2015 (opens new window)
- [ ] Keynote David Beazley - Topics of Interest (Python Asyncio) (opens new window)
- [ ] Mutex in Python (opens new window)
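To tie the locks and mutexes items to something runnable, here is a minimal sketch using Python's `threading.Lock`. The `Counter` class and `run_workers` helper are hypothetical names for this example. Without the lock, the read-modify-write in `increment` could interleave between threads and lose updates.

```python
import threading


class Counter:
    """Shared counter protected by a mutex."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run_workers(num_threads: int = 4, increments: int = 10_000) -> int:
    """Start several threads that all bump the same counter, then join them."""
    counter = Counter()

    def work() -> None:
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, `run_workers(4, 10_000)` always returns 40000.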
# Testing
- To cover:
- how unit testing works
- what are mock objects
- what is integration testing
- what is dependency injection
- [ ] Agile Software Testing with James Bach (video) (opens new window)
- [ ] Open Lecture by James Bach on Software Testing (video) (opens new window)
- [ ] Steve Freeman - Test-Driven Development (that’s not what we meant) (video) (opens new window)
- [ ] Dependency injection:
- [ ] How to write tests (opens new window)
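Here is a small sketch of how the "To cover" items fit together: a unit test that uses dependency injection to pass in a mock object. `Greeter` and its `clock.hour()` dependency are invented for this example; only `unittest` and `unittest.mock.Mock` are real standard-library APIs.

```python
import unittest
from unittest.mock import Mock


class Greeter:
    """Receives its clock from the caller (dependency injection)
    instead of creating one internally, so tests can swap it out."""

    def __init__(self, clock):
        self.clock = clock  # hypothetical object with an hour() method

    def greeting(self) -> str:
        return "Good morning" if self.clock.hour() < 12 else "Good afternoon"


class GreeterTest(unittest.TestCase):
    def test_morning(self):
        clock = Mock()                # mock object standing in for a real clock
        clock.hour.return_value = 9   # control what the dependency reports
        self.assertEqual(Greeter(clock).greeting(), "Good morning")
        clock.hour.assert_called_once()

    def test_afternoon(self):
        clock = Mock()
        clock.hour.return_value = 15
        self.assertEqual(Greeter(clock).greeting(), "Good afternoon")
```

Saved in a file, the tests can be run with `python -m unittest`. An integration test would instead wire `Greeter` to a real clock and check the pieces working together.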
# Scheduling
- How does it work in an OS?
- Can be gleaned from Operating System videos
# String searching & manipulations
- [ ] Sedgewick - Suffix Arrays (video) (opens new window)
- [ ] Sedgewick - Substring Search (videos) (opens new window)
- [ ] Search pattern in text (video) (opens new window)
If you need more detail on this subject, see "String Matching" section in Additional Detail on Some Subjects.
# Tries
- Note there are different kinds of tries. Some have prefixes, some don't, and some use strings instead of bits to track the path.
- I read through code, but will not implement
- [ ] Sedgewick - Tries (3 videos) (opens new window)
- [ ] Notes on Data Structures and Programming Techniques (opens new window)
- [ ] Short course videos:
- [ ] The Trie: A Neglected Data Structure (opens new window)
- [ ] TopCoder - Using Tries (opens new window)
- [ ] Stanford Lecture (real world use case) (video) (opens new window)
- [ ] MIT, Advanced Data Structures, Strings (can get pretty obscure about halfway through) (video) (opens new window)
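For reference, here is a minimal sketch of one common kind of trie: a character-keyed prefix tree where each node marks whether a word ends there. The class and method names are my own, and this is only one of the variants mentioned above.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _find(self, prefix: str):
        """Walk the path for prefix; return the last node or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word: str) -> bool:
        node = self._find(word)
        return node is not None and node.is_word

    def starts_with(self, prefix: str) -> bool:
        return self._find(prefix) is not None
```

Insert and lookup both take time proportional to the length of the word, regardless of how many words are stored.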
# Floating Point Numbers
# Unicode
# Endianness
- [ ] Big And Little Endian (opens new window)
- [ ] Big Endian Vs Little Endian (video) (opens new window)
- [ ] Big And Little Endian Inside/Out (video) (opens new window)
- Very technical talk for kernel devs. Don't worry if most is over your head.
- The first half is enough.
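A quick way to see the difference for yourself is to lay out the same integer in both byte orders. This is just an illustrative snippet; the value `0x12345678` is arbitrary.

```python
import sys

value = 0x12345678

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex())      # 12345678
print(little.hex())   # 78563412
print(sys.byteorder)  # byte order of the machine running this

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(big, byteorder="big") == value
assert int.from_bytes(little, byteorder="little") == value
```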
# Networking
- If you have networking experience or want to be a reliability engineer or operations engineer, expect questions.
- Otherwise, this is just good to know
- [ ] Khan Academy (opens new window)
- [ ] UDP and TCP: Comparison of Transport Protocols (video) (opens new window)
- [ ] TCP/IP and the OSI Model Explained! (video) (opens new window)
- [ ] Packet Transmission across the Internet. Networking & TCP/IP tutorial. (video) (opens new window)
- [ ] HTTP (video) (opens new window)
- [ ] SSL and HTTPS (video) (opens new window)
- [ ] SSL/TLS (video) (opens new window)
- [ ] HTTP 2.0 (video) (opens new window)
- [ ] Video Series (21 videos) (video) (opens new window)
- [ ] Subnetting Demystified - Part 5 CIDR Notation (video) (opens new window)
- [ ] Sockets:
# System Design, Scalability, Data Handling
You can expect system design questions if you have 4+ years of experience.
- Scalability and System Design are very large topics with many subtopics and resources, since there is a lot to consider when designing a software/hardware system that can scale. Expect to spend quite a bit of time on this.
- Considerations:
- Scalability
- Distill large data sets to single values
- Transform one data set to another
- Handling obscenely large amounts of data
- System design
- feature sets
- interfaces
- class hierarchies
- designing a system under certain constraints
- simplicity and robustness
- tradeoffs
- performance analysis and optimization
- [ ] START HERE: The System Design Primer (opens new window)
- [ ] System Design from HiredInTech (opens new window)
- [ ] How Do I Prepare To Answer Design Questions In A Technical Interview?
- [ ] 8 Things You Need to Know Before a System Design Interview (opens new window)
- [ ] Algorithm design (opens new window)
- [ ] Database Normalization - 1NF, 2NF, 3NF and 4NF (video) (opens new window)
- [ ] System Design Interview (opens new window) - There are a lot of resources in this one. Look through the articles and examples. I put some of them below
- [ ] How to ace a systems design interview (opens new window)
- [ ] Numbers Everyone Should Know (opens new window)
- [ ] How long does it take to make a context switch? (opens new window)
- [ ] Transactions Across Datacenters (video) (opens new window)
- [ ] A plain English introduction to CAP Theorem (opens new window)
- [ ] Consensus Algorithms:
- [ ] Consistent Hashing (opens new window)
- [ ] NoSQL Patterns (opens new window)
- [ ] Scalability:
- You don't need all of these. Just pick a few that interest you.
- [ ] Great overview (video) (opens new window)
- [ ] Short series:
- [ ] Scalable Web Architecture and Distributed Systems (opens new window)
- [ ] Fallacies of Distributed Computing Explained (opens new window)
- [ ] Pragmatic Programming Techniques (opens new window)
- [ ] Jeff Dean - Building Software Systems At Google and Lessons Learned (video) (opens new window)
- [ ] Introduction to Architecting Systems for Scale (opens new window)
- [ ] Scaling mobile games to a global audience using App Engine and Cloud Datastore (video) (opens new window)
- [ ] How Google Does Planet-Scale Engineering for Planet-Scale Infra (video) (opens new window)
- [ ] The Importance of Algorithms (opens new window)
- [ ] Sharding (opens new window)
- [ ] Scale at Facebook (2012), "Building for a Billion Users" (video) (opens new window)
- [ ] Engineering for the Long Game - Astrid Atkinson Keynote(video) (opens new window)
- [ ] 7 Years Of YouTube Scalability Lessons In 30 Minutes (opens new window)
- [ ] How PayPal Scaled To Billions Of Transactions Daily Using Just 8VMs (opens new window)
- [ ] How to Remove Duplicates in Large Datasets (opens new window)
- [ ] A look inside Etsy's scale and engineering culture with Jon Cowie (video) (opens new window)
- [ ] What Led Amazon to its Own Microservices Architecture (opens new window)
- [ ] To Compress Or Not To Compress, That Was Uber's Question (opens new window)
- [ ] Asyncio Tarantool Queue, Get In The Queue (opens new window)
- [ ] When Should Approximate Query Processing Be Used? (opens new window)
- [ ] Google's Transition From Single Datacenter, To Failover, To A Native Multihomed Architecture (opens new window)
- [ ] Spanner (opens new window)
- [ ] Machine Learning Driven Programming: A New Programming For A New World (opens new window)
- [ ] The Image Optimization Technology That Serves Millions Of Requests Per Day (opens new window)
- [ ] A Patreon Architecture Short (opens new window)
- [ ] Tinder: How Does One Of The Largest Recommendation Engines Decide Who You'll See Next? (opens new window)
- [ ] Design Of A Modern Cache (opens new window)
- [ ] Live Video Streaming At Facebook Scale (opens new window)
- [ ] A Beginner's Guide To Scaling To 11 Million+ Users On Amazon's AWS (opens new window)
- [ ] How Does The Use Of Docker Effect Latency? (opens new window)
- [ ] A 360 Degree View Of The Entire Netflix Stack (opens new window)
- [ ] Latency Is Everywhere And It Costs You Sales - How To Crush It (opens new window)
- [ ] Serverless (very long, just need the gist) (opens new window)
- [ ] What Powers Instagram: Hundreds of Instances, Dozens of Technologies (opens new window)
- [ ] Cinchcast Architecture - Producing 1,500 Hours Of Audio Every Day (opens new window)
- [ ] Justin.Tv's Live Video Broadcasting Architecture (opens new window)
- [ ] Playfish's Social Gaming Architecture - 50 Million Monthly Users And Growing (opens new window)
- [ ] TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data (opens new window)
- [ ] PlentyOfFish Architecture (opens new window)
- [ ] Salesforce Architecture - How They Handle 1.3 Billion Transactions A Day (opens new window)
- [ ] ESPN's Architecture At Scale - Operating At 100,000 Duh Nuh Nuhs Per Second (opens new window)
- [ ] See "Messaging, Serialization, and Queueing Systems" way below for info on some of the technologies that can glue services together
- [ ] Twitter:
- For even more, see "Mining Massive Datasets" video series in the Video Series section
- [ ] Practicing the system design process: Here are some ideas to try working through on paper, each with some documentation on how it was handled in the real world:
- review: The System Design Primer (opens new window)
- System Design from HiredInTech (opens new window)
- cheat sheet (opens new window)
- flow:
- Understand the problem and scope:
- Define the use cases, with interviewer's help
- Suggest additional features
- Remove items that interviewer deems out of scope
- Assume high availability is required, add as a use case
- Think about constraints:
- Ask how many requests per month
- Ask how many requests per second (they may volunteer it or make you do the math)
- Estimate reads vs. writes percentage
- Keep 80/20 rule in mind when estimating
- How much data written per second
- Total storage required over 5 years
- How much data read per second
- Abstract design:
- Layers (service, data, caching)
- Infrastructure: load balancing, messaging
- Rough overview of any key algorithm that drives the service
- Consider bottlenecks and determine solutions
- Exercises:
- Design a CDN network: old article (opens new window)
- Design a random unique ID generation system (opens new window)
- Design an online multiplayer card game (opens new window)
- Design a key-value database (opens new window)
- Design a picture sharing system (opens new window)
- Design a recommendation system (opens new window)
- Design a URL-shortener system: copied from above (opens new window)
- Design a cache system (opens new window)
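For the "Think about constraints" step in the flow above, the math is simple enough to sketch. The helper names and the example numbers below (100 million requests per month, 10% writes, 1 KB per write) are assumptions for illustration, not figures from any real system.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000, assuming a 30-day month
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def requests_per_second(requests_per_month: float) -> float:
    """Average request rate implied by a monthly total."""
    return requests_per_month / SECONDS_PER_MONTH


def storage_bytes(writes_per_second: float, bytes_per_write: float, years: int = 5) -> float:
    """Total bytes written over the given number of years."""
    return writes_per_second * bytes_per_write * years * SECONDS_PER_YEAR


# Hypothetical workload: 100M requests/month, 10% writes, 1 KB per write.
total_rps = requests_per_second(100_000_000)  # roughly 38.6 requests/second
write_rps = total_rps * 0.10
read_rps = total_rps - write_rps
five_year_storage = storage_bytes(write_rps, 1024)

print(f"{total_rps:.1f} req/s, {read_rps:.1f} reads/s, {write_rps:.1f} writes/s")
print(f"{five_year_storage / 1e9:.0f} GB over 5 years")
```

These are averages. Peak traffic is usually several times higher, so it's worth saying that out loud in the interview.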
# Final Review
This section will have shorter videos that you can watch pretty quickly to review most of the important concepts.
It's nice if you want a refresher often.
- [ ] Series of short-subject videos, 2-3 minutes each (23 videos)
- [ ] Series of short-subject videos, 2-5 minutes each - Michael Sambol (18 videos):
- [ ] Sedgewick Videos - Algorithms I (opens new window)
- [ ] Sedgewick Videos - Algorithms II (opens new window)
# Coding Question Practice
Now that you know all the computer science topics above, it's time to practice answering coding problems.
Coding question practice is not about memorizing answers to programming problems.
Why you need to practice doing programming problems:
- Problem recognition, and where the right data structures and algorithms fit in
- Gathering requirements for the problem
- Talking your way through the problem like you will in the interview
- Coding on a whiteboard or paper, not a computer
- Coming up with time and space complexity for your solutions
- Testing your solutions
There is a great intro for methodical, communicative problem solving in an interview. You'll get this from the programming interview books, too, but I found this outstanding: Algorithm design canvas (opens new window)
No whiteboard at home? That makes sense. I'm a weirdo and have a big whiteboard. Instead of a whiteboard, pick up a large drawing pad from an art store. You can sit on the couch and practice. This is my "sofa whiteboard". I added the pen in the photo for scale. If you use a pen, you'll wish you could erase. Gets messy quick. I use a pencil and eraser.
Supplemental:
- Mathematics for Topcoders (opens new window)
- Dynamic Programming – From Novice to Advanced (opens new window)
- MIT Interview Materials (opens new window)
- Exercises for getting better at a given language (opens new window)
Read and Do Programming Problems (in this order):
- [ ] Programming Interviews Exposed: Secrets to Landing Your Next Job, 2nd Edition (opens new window)
- answers in C, C++ and Java
- [ ] Cracking the Coding Interview, 6th Edition (opens new window)
- answers in Java
See Book List above
# Coding exercises/challenges
Once you've learned your brains out, put those brains to work. Take coding challenges every day, as many as you can.
- How to Find a Solution (opens new window)
- How to Dissect a Topcoder Problem Statement (opens new window)
Coding Interview Question Videos:
- IDeserve (88 videos) (opens new window)
- Tushar Roy (5 playlists) (opens new window)
- Super for walkthroughs of problem solutions
- Nick White - LeetCode Solutions (187 Videos) (opens new window)
- Good explanations of solution and the code
- You can watch several in a short time
- FisherCoder - LeetCode Solutions (opens new window)
Challenge sites:
- LeetCode (opens new window)
- My favorite coding problem site. It's worth the subscription money for the 1-2 months you'll likely be preparing
- LeetCode solutions from FisherCoder (opens new window)
- See Nick White Videos above for short code-throughs
- HackerRank (opens new window)
- TopCoder (opens new window)
- InterviewCake (opens new window)
- Geeks for Geeks (opens new window)
- InterviewBit (opens new window)
- Project Euler (math-focused) (opens new window)
- Code Exercises (opens new window)
Language-learning sites, with challenges:
- Codewars (opens new window)
- Codility (opens new window)
- HackerEarth (opens new window)
- Sphere Online Judge (spoj) (opens new window)
- Codechef (opens new window)
Challenge repos:
Mock Interviews:
- Gainlo.co: Mock interviewers from big companies (opens new window) - I used this and it helped me relax for the phone screen and on-site interview
- Pramp: Mock interviews from/with peers (opens new window) - peer-to-peer model of practice interviews
- Refdash: Mock interviews and expedited interviews - also helps candidates fast-track by skipping multiple interviews with tech companies
- interviewing.io: Practice mock interview with senior engineers - anonymous algorithmic/systems design interviews with senior engineers from FAANG
# Once you're closer to the interview
- Cracking The Coding Interview Set 2 (videos):
# Your Resume
- See Resume prep items in Cracking The Coding Interview and back of Programming Interviews Exposed
# Be thinking of for when the interview comes
Think of about 20 interview questions you'll get, along the lines of the items below. Have 2-3 answers for each. Have a story, not just data, about something you accomplished.
- Why do you want this job?
- What's a tough problem you've solved?
- Biggest challenges faced?
- Best/worst designs seen?
- Ideas for improving an existing product
- How do you work best, as an individual and as part of a team?
- Which of your skills or experiences would be assets in the role and why?
- What did you most enjoy at [job x / project y]?
- What was the biggest challenge you faced at [job x / project y]?
- What was the hardest bug you faced at [job x / project y]?
- What did you learn at [job x / project y]?
- What would you have done better at [job x / project y]?
# Have questions for the interviewer
Some of mine (I may already know the answers, but I want their opinion or team perspective):
- How large is your team?
- What does your dev cycle look like? Do you do waterfall/sprints/agile?
- Are rushes to deadlines common? Or is there flexibility?
- How are decisions made in your team?
- How many meetings do you have per week?
- Do you feel your work environment helps you concentrate?
- What are you working on?
- What do you like about it?
- What is the work life like?
- How is work/life balance?
# Once You've Got The Job
Congratulations!
Keep learning.
You're never really done.
*****************************************************************************************************
*****************************************************************************************************
Everything below this point is optional.
By studying these, you'll get greater exposure to more CS concepts, and will be better prepared for
any software engineering job. You'll be a much more well-rounded software engineer.
*****************************************************************************************************
*****************************************************************************************************
# Additional Books
These are here so you can dive into a topic you find interesting.
The Unix Programming Environment (opens new window)
- An oldie but a goodie
The Linux Command Line: A Complete Introduction (opens new window)
- A modern option
Head First Design Patterns (opens new window)
- A gentle introduction to design patterns
Design Patterns: Elements of Reusable Object-Oriented Software (opens new window)
- AKA the "Gang Of Four" book, or GOF
- The canonical design patterns book
UNIX and Linux System Administration Handbook, 5th Edition (opens new window)
Algorithm Design Manual (opens new window) (Skiena)
- As a review and problem recognition
- The algorithm catalog portion is well beyond the scope of difficulty you'll get in an interview
- This book has 2 parts:
- Class textbook on data structures and algorithms
- Pros:
- Is a good review as any algorithms textbook would be
- Nice stories from his experiences solving problems in industry and academia
- Code examples in C
- Cons:
- Can be as dense or impenetrable as CLRS, and in some cases, CLRS may be a better alternative for some subjects
- Chapters 7, 8, 9 can be painful to try to follow, as some items are not explained well or require more brain than I have
- Don't get me wrong: I like Skiena, his teaching style, and mannerisms, but I may not be Stony Brook material
- Algorithm catalog:
- This is the real reason you buy this book
- About to get to this part. Will update here once I've made my way through it
- Can rent it on kindle
- Answers:
- Errata (opens new window)
Write Great Code: Volume 1: Understanding the Machine (opens new window)
- The book was published in 2004, and is somewhat outdated, but it's a terrific resource for understanding a computer in brief
- The author invented HLA (opens new window), so take mentions and examples in HLA with a grain of salt. Not widely used, but decent examples of what assembly looks like
- These chapters are worth the read to give you a nice foundation:
- Chapter 2 - Numeric Representation
- Chapter 3 - Binary Arithmetic and Bit Operations
- Chapter 4 - Floating-Point Representation
- Chapter 5 - Character Representation
- Chapter 6 - Memory Organization and Access
- Chapter 7 - Composite Data Types and Memory Objects
- Chapter 9 - CPU Architecture
- Chapter 10 - Instruction Set Architecture
- Chapter 11 - Memory Architecture and Organization
Introduction to Algorithms (opens new window)
- Important: Reading this book will only have limited value. This book is a great review of algorithms and data structures, but won't teach you how to write good code. You have to be able to code a decent solution efficiently
- AKA CLR, sometimes CLRS, because Stein was late to the game
Computer Architecture, Sixth Edition: A Quantitative Approach (opens new window)
- For a richer, more up-to-date (2017), but longer treatment
Programming Pearls (opens new window)
- The first couple of chapters present clever solutions to programming problems (some very old, using data tape), but that is just an intro. This is a guidebook on program design and architecture.
# Additional Learning
I added them to help you become a well-rounded software engineer, and to be aware of certain
technologies and algorithms, so you'll have a bigger toolbox.
# Compilers
# Emacs and vi(m)
- Familiarize yourself with a unix-based code editor
- vi(m):
- emacs:
- Basics Emacs Tutorial (video) (opens new window)
- set of 3 (videos):
- Emacs Tutorial (Beginners) -Part 1- File commands, cut/copy/paste, cursor commands (opens new window)
- Emacs Tutorial (Beginners) -Part 2- Buffer management, search, M-x grep and rgrep modes (opens new window)
- Emacs Tutorial (Beginners) -Part 3- Expressions, Statements, ~/.emacs file and packages (opens new window)
- Evil Mode: Or, How I Learned to Stop Worrying and Love Emacs (video) (opens new window)
- Writing C Programs With Emacs (opens new window)
- (maybe) Org Mode In Depth: Managing Structure (video) (opens new window)
# Unix command line tools
- I filled in the list below from good tools.
- bash
- cat
- grep
- sed
- awk
- curl or wget
- sort
- tr
- uniq
- strace (opens new window)
- tcpdump (opens new window)
# Information theory (videos)
- Khan Academy (opens new window)
- More about Markov processes:
- See more in MIT 6.050J Information and Entropy series below
# Parity & Hamming Code (videos)
# Entropy
- Also see videos below
- Make sure to watch information theory videos first
- Information Theory, Claude Shannon, Entropy, Redundancy, Data Compression & Bits (video) (opens new window)
# Cryptography
- Also see videos below
- Make sure to watch information theory videos first
- Khan Academy Series (opens new window)
- Cryptography: Hash Functions (opens new window)
- Cryptography: Encryption (opens new window)
# Compression
- Make sure to watch information theory videos first
- Computerphile (videos):
- Compressor Head videos (opens new window)
- (optional) Google Developers Live: GZIP is not enough! (opens new window)
# Computer Security
- MIT (23 videos) (opens new window)
- Introduction, Threat Models (opens new window)
- Control Hijacking Attacks (opens new window)
- Buffer Overflow Exploits and Defenses (opens new window)
- Privilege Separation (opens new window)
- Capabilities (opens new window)
- Sandboxing Native Code (opens new window)
- Web Security Model (opens new window)
- Securing Web Applications (opens new window)
- Symbolic Execution (opens new window)
- Network Security (opens new window)
- Network Protocols (opens new window)
- Side-Channel Attacks (opens new window)
# Garbage collection
# Parallel Programming
# Messaging, Serialization, and Queueing Systems
- Thrift (opens new window)
- Protocol Buffers (opens new window)
- gRPC (opens new window)
- Redis (opens new window)
- Amazon SQS (queue) (opens new window)
- Amazon SNS (pub-sub) (opens new window)
- RabbitMQ (opens new window)
- Celery (opens new window)
- ZeroMQ (opens new window)
- ActiveMQ (opens new window)
- Kafka (opens new window)
- MessagePack (opens new window)
- Avro (opens new window)
# A*
# Fast Fourier Transform
# Bloom Filter
- Given a Bloom filter with m bits and k hashing functions, both insertion and membership testing are O(k)
- Bloom Filters (video) (opens new window)
- Bloom Filters | Mining of Massive Datasets | Stanford University (video) (opens new window)
- Tutorial (opens new window)
- How To Write A Bloom Filter App (opens new window)
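To see why insertion and membership testing are both O(k), here is a minimal sketch of a Bloom filter with m bits and k hash functions. The class is my own illustration. Deriving the k hash functions by salting SHA-256 with an index is an assumption made for simplicity; real implementations typically use faster non-cryptographic hashes.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m = m                # number of bits
        self.k = k                # number of hash functions
        self.bits = bytearray(m)  # one byte per bit, for readability

    def _positions(self, item: str):
        """Yield the k bit positions for item (k hash evaluations)."""
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))
```

An item that was added always reports `True`. An item that was never added usually reports `False`, but can collide with set bits and give a false positive. There are never false negatives.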
# HyperLogLog
# Locality-Sensitive Hashing
- Used to determine the similarity of documents
- The opposite of MD5 or SHA which are used to determine if 2 documents/strings are exactly the same
- Simhashing (hopefully) made simple (opens new window)
# van Emde Boas Trees
# Augmented Data Structures
# Balanced search trees
Know at least one type of balanced binary tree (and know how it's implemented):
"Among balanced search trees, AVL and 2/3 trees are now passé, and red-black trees seem to be more popular. A particularly interesting self-organizing data structure is the splay tree, which uses rotations to move any accessed key to the root." - Skiena
Of these, I chose to implement a splay tree. From what I've read, you won't implement a balanced search tree in your interview. But I wanted exposure to coding one up and let's face it, splay trees are the bee's knees. I did read a lot of red-black tree code
- Splay tree: insert, search, delete functions
If you end up implementing red/black tree try just these:
- Search and insertion functions, skipping delete
I want to learn more about B-Tree since it's used so widely with very large data sets
AVL trees
- In practice: From what I can tell, these aren't used much in practice, but I could see where they would be: The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly balanced than red–black trees, leading to slower insertion and removal but faster retrieval. This makes it attractive for data structures that may be built once and loaded without reconstruction, such as language dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter)
- MIT AVL Trees / AVL Sort (video) (opens new window)
- AVL Trees (video) (opens new window)
- AVL Tree Implementation (video) (opens new window)
- Split And Merge (opens new window)
Splay trees
- In practice: Splay trees are typically used in the implementation of caches, memory allocators, routers, garbage collectors, data compression, ropes (replacement of string used for long text strings), in Windows NT (in the virtual memory, networking and file system code) etc
- CS 61B: Splay Trees (video) (opens new window)
- MIT Lecture: Splay Trees:
- Gets very mathy, but watch the last 10 minutes for sure.
- Video (opens new window)
Red/black trees
- These are a translation of a 2-3 tree (see below).
- In practice: Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time. Not only does this make them valuable in time-sensitive applications such as real-time applications, but it also makes them valuable building blocks in other data structures which provide worst-case guarantees; for example, many data structures used in computational geometry can be based on red–black trees, and the Completely Fair Scheduler used in current Linux kernels uses red–black trees. In Java 8, HashMap was changed so that a bucket holding many colliding keys (for example, because of poor hash codes) is stored as a red-black tree instead of a linked list.
- Aduni - Algorithms - Lecture 4 (link jumps to starting point) (video) (opens new window)
- Aduni - Algorithms - Lecture 5 (video) (opens new window)
- Red-Black Tree (opens new window)
- An Introduction To Binary Search And Red Black Tree (opens new window)
2-3 search trees
- In practice: 2-3 trees have faster inserts at the expense of slower searches (since height is more compared to AVL trees).
- You would use 2-3 tree very rarely because its implementation involves different types of nodes. Instead, people use Red Black trees.
- 23-Tree Intuition and Definition (video) (opens new window)
- Binary View of 23-Tree (opens new window)
- 2-3 Trees (student recitation) (video) (opens new window)
2-3-4 Trees (aka 2-4 trees)
- In practice: For every 2-4 tree, there are corresponding red–black trees with data elements in the same order. The insertion and deletion operations on 2-4 trees are also equivalent to color-flipping and rotations in red–black trees. This makes 2-4 trees an important tool for understanding the logic behind red–black trees, and this is why many introductory algorithm texts introduce 2-4 trees just before red–black trees, even though 2-4 trees are not often used in practice.
- CS 61B Lecture 26: Balanced Search Trees (video)
- Bottom Up 234-Trees (video)
- Top Down 234-Trees (video)
N-ary (K-ary, M-ary) trees
- Note: the N or K is the branching factor (the maximum number of children per node)
- A binary tree is a 2-ary tree, with branching factor = 2
- A 2-3 tree is 3-ary
- K-Ary Tree
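To make the branching-factor idea concrete, here is a minimal sketch of a k-ary tree node. The names `KAryNode`, `add_child`, and `height` are my own for illustration, not from any of the linked resources.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class KAryNode:
    """A node in a k-ary tree; `k` caps the number of children (hypothetical class)."""
    value: Any
    k: int = 2
    children: List["KAryNode"] = field(default_factory=list)

    def add_child(self, value: Any) -> "KAryNode":
        # Enforce the branching factor: at most k children per node.
        if len(self.children) >= self.k:
            raise ValueError(f"node already has {self.k} children")
        child = KAryNode(value, self.k)
        self.children.append(child)
        return child


def height(node: KAryNode) -> int:
    """Number of edges on the longest path from `node` down to a leaf."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)
```

With `k=2` this behaves like a plain binary tree node; with `k=3` a node can hold up to three children, which is the shape of the widest node in a 2-3 tree.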
B-Trees
- Fun fact: it's a mystery, but the B could stand for Boeing, Balanced, or Bayer (co-inventor).
- In practice: B-Trees are widely used in databases. Most modern filesystems also use B-trees (or variants) to allow quick random access to an arbitrary block in a particular file. The basic problem is turning the file block address i into a disk block (or perhaps a cylinder-head-sector) address.
- B-Tree
- B-Tree Datastructure
- Introduction to B-Trees (video)
- B-Tree Definition and Insertion (video)
- B-Tree Deletion (video)
- MIT 6.851 - Memory Hierarchy Models (video) - covers cache-oblivious B-Trees, very interesting data structures - the first 37 minutes are very technical, may be skipped (B is block size, cache line size)
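The file-block-to-disk-block lookup mentioned above can be sketched as a B-tree search. This is only an illustration under my own assumptions: the `BTreeNode` layout and the `lookup` function are hypothetical, the tree is built by hand, and insertion, splitting, and on-disk storage are left out.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BTreeNode:
    keys: List[int]                      # sorted file block indices
    values: List[int]                    # disk block address for each key
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def lookup(node: BTreeNode, file_block: int) -> Optional[int]:
    """Return the disk block address for `file_block`, or None if it is absent."""
    while True:
        # Binary search inside the node for the first key >= file_block.
        i = bisect_left(node.keys, file_block)
        if i < len(node.keys) and node.keys[i] == file_block:
            return node.values[i]
        if not node.children:
            return None                  # reached a leaf without finding the key
        node = node.children[i]          # descend into the i-th subtree


# A small hand-built tree: one root with two keys and three leaf children.
root = BTreeNode(
    keys=[10, 20],
    values=[1000, 2000],
    children=[
        BTreeNode(keys=[1, 5], values=[100, 500]),
        BTreeNode(keys=[12, 15], values=[1200, 1500]),
        BTreeNode(keys=[25, 30], values=[2500, 3000]),
    ],
)
```

Each node holds many keys, so the tree stays shallow and a lookup touches only a few nodes. That is the property that matters when each node is a disk block.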
# k-D Trees
- Great for counting the points in a rectangle or a higher-dimensional box
- A good fit for k-nearest neighbors
- Kd Trees (video)
- kNN K-d tree algorithm (video)
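As a rough sketch of the "points in a rectangle" use case, here is a small k-d tree with a range count. The names `KDNode`, `build`, and `count_in_box` are mine, and the build step simply re-sorts at every level, which is fine for study purposes but not the fastest way to construct the tree.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class KDNode:
    def __init__(self, point: Point, axis: int,
                 left: Optional["KDNode"], right: Optional["KDNode"]):
        self.point = point
        self.axis = axis      # the coordinate this node splits on
        self.left = left
        self.right = right


def build(points: List[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(points[mid], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))


def count_in_box(node: Optional[KDNode],
                 lo: Sequence[float], hi: Sequence[float]) -> int:
    """Count points p with lo[d] <= p[d] <= hi[d] in every dimension d."""
    if node is None:
        return 0
    inside = all(l <= c <= h for l, c, h in zip(lo, node.point, hi))
    total = 1 if inside else 0
    split = node.point[node.axis]
    # The left subtree only holds coordinates <= split, the right only >= split,
    # so a subtree is skipped when the box cannot overlap its side.
    if lo[node.axis] <= split:
        total += count_in_box(node.left, lo, hi)
    if hi[node.axis] >= split:
        total += count_in_box(node.right, lo, hi)
    return total
```

The pruning in `count_in_box` is the whole point: large parts of the tree are never visited when the query box is small.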
# Skip lists
- "These are somewhat of a cult data structure" - Skiena
- Randomization: Skip Lists (video)
- For animations and a little more detail
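If you want something to type along with while watching, here is a minimal skip list sketch with insert and membership test. The class names, the `MAX_LEVEL` cap of 16, and the coin-flip probability of 0.5 are my own choices for illustration. Deletion is left out, and duplicate keys are simply stored twice.

```python
import random
from typing import Any, Iterator, List, Optional


class SkipNode:
    def __init__(self, key: Any, level: int):
        self.key = key
        # forward[i] is the next node on level i (level 0 is the full list).
        self.forward: List[Optional["SkipNode"]] = [None] * level


class SkipList:
    MAX_LEVEL = 16  # assumed cap, enough for small experiments

    def __init__(self, seed: Optional[int] = None):
        self._rng = random.Random(seed)
        self._head = SkipNode(None, self.MAX_LEVEL)
        self._level = 1  # number of levels currently in use

    def _random_level(self) -> int:
        # Flip a fair coin: each extra level is half as likely as the last.
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key: Any) -> None:
        # update[i] is the last node on level i whose key is < key.
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self._level = max(self._level, level)
        new = SkipNode(key, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def __contains__(self, key: Any) -> bool:
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def __iter__(self) -> Iterator[Any]:
        node = self._head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]
```

Iterating walks level 0, so the keys always come out sorted no matter how the coin flips went.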
# Network Flows
# Disjoint Sets & Union Find
# Math for Fast Processing
# Treap
- Combination of a binary search tree and a heap
- Treap
- Data Structures: Treaps explained (video)
- Applications in set operations
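The "binary search tree plus heap" combination is easiest to see in insertion. Below is a minimal sketch, assuming a max-heap on random priorities. The function names are mine, and deletion, split, and merge are not shown.

```python
import random
from typing import Any, List, Optional


class TreapNode:
    def __init__(self, key: Any, priority: float):
        self.key = key
        self.priority = priority
        self.left: Optional["TreapNode"] = None
        self.right: Optional["TreapNode"] = None


def rotate_right(node: TreapNode) -> TreapNode:
    left = node.left
    node.left = left.right
    left.right = node
    return left


def rotate_left(node: TreapNode) -> TreapNode:
    right = node.right
    node.right = right.left
    right.left = node
    return right


def insert(node: Optional[TreapNode], key: Any, rng: random.Random) -> TreapNode:
    """Insert as in a BST, then rotate up to restore the heap order on priorities."""
    if node is None:
        return TreapNode(key, rng.random())
    if key < node.key:
        node.left = insert(node.left, key, rng)
        if node.left.priority > node.priority:
            node = rotate_right(node)
    elif key > node.key:
        node.right = insert(node.right, key, rng)
        if node.right.priority > node.priority:
            node = rotate_left(node)
    return node  # equal keys are ignored


def inorder(node: Optional[TreapNode]) -> List[Any]:
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def is_max_heap(node: Optional[TreapNode]) -> bool:
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.priority > node.priority:
            return False
    return is_max_heap(node.left) and is_max_heap(node.right)
```

Keys obey the search-tree order and priorities obey the heap order at the same time. Because the priorities are random, the tree is balanced in expectation even for sorted input.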
# Linear Programming (videos)
# Geometry, Convex hull (videos)
# Discrete math
- See videos below
# Machine Learning
- Why ML?
- Google's Cloud Machine learning tools (video)
- Google Developers' Machine Learning Recipes (Scikit Learn & Tensorflow) (video)
- Tensorflow (video)
- Tensorflow Tutorials
- Practical Guide to implementing Neural Networks in Python (using Theano)
- Courses:
- Great starter course: Machine Learning - videos only - see videos 12-18 for a review of linear algebra (14 and 15 are duplicates)
- Neural Networks for Machine Learning
- Google's Deep Learning Nanodegree
- Google/Kaggle Machine Learning Engineer Nanodegree
- Self-Driving Car Engineer Nanodegree
- Metis Online Course ($99 for 2 months)
- Resources:
# Additional Detail on Some Subjects
I added these to reinforce some ideas already presented above, but didn't want to include them
above because it's just too much. It's easy to overdo it on a subject.
You want to get hired in this century, right?
SOLID
- [ ] Bob Martin SOLID Principles of Object Oriented and Agile Design (video)
- [ ] S - Single Responsibility Principle | Single responsibility to each Object
- [ ] O - Open/Closed Principle | On production level, Objects are ready for extension but not for modification
- [ ] L - Liskov Substitution Principle | Base Class and Derived class follow the ‘IS A’ principle
- [ ] I - Interface Segregation Principle | Clients should not be forced to implement interfaces they don't use
- [ ] D - Dependency Inversion Principle | Reduce the dependency in composition of objects.
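As one small illustration of the last principle (dependency inversion), here is a sketch where a high-level class depends on an abstraction instead of a concrete sender. All of the class names (`MessageSender`, `InMemorySender`, `Notifier`) are made up for this example.

```python
from abc import ABC, abstractmethod
from typing import List


class MessageSender(ABC):
    """The abstraction that both the high-level and low-level code depend on."""

    @abstractmethod
    def send(self, text: str) -> None:
        ...


class InMemorySender(MessageSender):
    """A low-level detail: records messages instead of sending them anywhere."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, text: str) -> None:
        self.sent.append(text)


class Notifier:
    """High-level policy: knows what to say, not how it is delivered."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender

    def notify(self, user: str, event: str) -> None:
        self._sender.send(f"{user}: {event}")
```

Swapping in an email or SMS sender would mean adding another `MessageSender` subclass. `Notifier` itself would not change, which also lines up with the Open/Closed Principle above.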
Union-Find
More Dynamic Programming (videos)
- 6.006: Dynamic Programming I: Fibonacci, Shortest Paths
- 6.006: Dynamic Programming II: Text Justification, Blackjack
- 6.006: DP III: Parenthesization, Edit Distance, Knapsack
- 6.006: DP IV: Guitar Fingering, Tetris, Super Mario Bros.
- 6.046: Dynamic Programming & Advanced DP
- 6.046: Dynamic Programming: All-Pairs Shortest Paths
- 6.046: Dynamic Programming (student recitation)
Advanced Graph Processing (videos)
MIT Probability (mathy, and goes slowly, which is good for mathy things) (videos):
- MIT 6.042J - Probability Introduction
- MIT 6.042J - Conditional Probability
- MIT 6.042J - Independence
- MIT 6.042J - Random Variables
- MIT 6.042J - Expectation I
- MIT 6.042J - Expectation II
- MIT 6.042J - Large Deviations
- MIT 6.042J - Random Walks
Simonson: Approximation Algorithms (video)
String Matching
- Rabin-Karp (videos):
- Knuth-Morris-Pratt (KMP):
- Boyer–Moore string search algorithm
- Coursera: Algorithms on Strings
- starts off great, but by the time it gets past KMP it gets more complicated than it needs to be
- nice explanation of tries
- can be skipped
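For a feel of what the Rabin-Karp videos cover, here is a minimal rolling-hash sketch. The function name, the base of 256, and the modulus are my own choices, and it returns only the first match.

```python
def rabin_karp(text: str, pattern: str,
               base: int = 256, mod: int = 1_000_000_007) -> int:
    """Return the index of the first occurrence of `pattern` in `text`, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    for i in range(n - m + 1):
        # Equal hashes can be a collision, so confirm with a real comparison.
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Roll the window: drop text[i], shift, then add text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1
```

The hash of each new window is computed in constant time from the previous one, which is what keeps the expected running time linear in the length of the text.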
Sorting
- Stanford lectures on sorting:
- Shai Simonson, Aduni.org:
- Steven Skiena lectures on sorting:
# Video Series
Sit back and enjoy. "Netflix and skill" 😛
List of individual Dynamic Programming problems (each is short)
x86 Architecture, Assembly, Applications (11 videos)
MIT 18.06 Linear Algebra, Spring 2005 (35 videos)
Excellent - MIT Calculus Revisited: Single Variable Calculus
Discrete Mathematics by Shai Simonson (19 videos)
Discrete Mathematics Part 1 by Sarada Herke (5 videos)
CSE373 - Analysis of Algorithms (25 videos)
UC Berkeley 61B (Spring 2014): Data Structures (25 videos)
UC Berkeley 61B (Fall 2006): Data Structures (39 videos)
UC Berkeley 61C: Machine Structures (26 videos)
OOSE: Software Dev Using UML and Java (21 videos)
UC Berkeley CS 152: Computer Architecture and Engineering (20 videos)
MIT 6.004: Computation Structures (49 videos)
Carnegie Mellon - Computer Architecture Lectures (39 videos)
MIT 6.006: Intro to Algorithms (47 videos)
MIT 6.033: Computer System Engineering (22 videos)
MIT 6.034 Artificial Intelligence, Fall 2010 (30 videos)
MIT 6.042J: Mathematics for Computer Science, Fall 2010 (25 videos)
MIT 6.046: Design and Analysis of Algorithms (34 videos)
MIT 6.050J: Information and Entropy, Spring 2008 (19 videos)
MIT 6.851: Advanced Data Structures (22 videos)
MIT 6.854: Advanced Algorithms, Spring 2016 (24 videos)
Harvard COMPSCI 224: Advanced Algorithms (25 videos)
MIT 6.858 Computer Systems Security, Fall 2014
Stanford: Programming Paradigms (27 videos)
Introduction to Cryptography by Christof Paar
Mining Massive Datasets - Stanford University (94 videos)
# Computer Science Courses
- Directory of Online CS Courses
- Directory of CS Courses (many with online lectures)
# Algorithms implementation
# Papers
- Love classic papers?
- 1978: Communicating Sequential Processes
- 2003: The Google File System
- replaced by Colossus in 2012
- 2004: MapReduce: Simplified Data Processing on Large Clusters
- mostly replaced by Cloud Dataflow?
- 2006: Bigtable: A Distributed Storage System for Structured Data
- 2006: The Chubby Lock Service for Loosely-Coupled Distributed Systems
- 2007: Dynamo: Amazon’s Highly Available Key-value Store
- The Dynamo paper kicked off the NoSQL revolution
- 2007: What Every Programmer Should Know About Memory (very long, and the author encourages skipping of some sections)
- 2010: Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
- 2010: Dremel: Interactive Analysis of Web-Scale Datasets
- 2012: Google's Colossus
- paper not available
- 2012: AddressSanitizer: A Fast Address Sanity Checker:
- 2013: Spanner: Google’s Globally-Distributed Database:
- 2014: Machine Learning: The High-Interest Credit Card of Technical Debt
- 2015: Continuous Pipelines at Google
- 2015: High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads
- 2015: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
- 2015: How Developers Search for Code: A Case Study
- 2016: Borg, Omega, and Kubernetes