Data structrue and algorithm

递归(Recursion)：

将大的计算不断分割成小的计算，递归函数不断调用自身，比如斐波拉契数列f(n)=f(n-1)+f(n-2)

搜索

顺序搜索(linear search)：对于无序Array，需要遍历整个array；对于sorted array，只需要遍历一部分
Binary search：适用于sorted array，运用divide and conquer方法

排序

1. 冒泡排序(Bubble sort)：

像泡泡一样，大的值不断向上走

def bubbleSort( theSeq ):
	n = len( theSeq )
	for i in range( n - 1 ) :
		for j in range( n –i - 1 ) :
			if theSeq[j] > theSeq[j + 1] :
				tmp = theSeq[j]
        		theSeq[j] = theSeq[j + 1]
        		theSeq[j + 1] = tmp

2. 选择排序(Selection sort)：

最直观的排序方法，就像打扑克一样，先选出最小的，再选出第二小的，以此类推

def selectionSort( theSeq ):
	n = len( theSeq ) 	for i in range( n - 1 ): 
		smallNdx = i 		for j in range( i + 1, n ): 
			if theSeq[j] < theSeq[smallNdx] : 
				smallNdx = j 
	if smallNdx != i : 
		tmp = theSeq[i] 
     		theSeq[i] = theSeq[smallNdx]
     		theSeq[smallNdx] = tmp

3. 插入排序(Insertion sort)：

这个也跟打扑克有点像，事实上应该是大部分人打扑克时的排序方法，就是不停的将后面的每个item插入到前面排好的序列中

def insertionSort( theSeq ): 
	n = len( theSeq ) 	for i in range( n-1 ) : 
    	value = theSeq[i]
		pos = i
		while pos > 0 and value < theSeq[pos - 1] : 
     		theSeq[pos] = theSeq[pos - 1]
      		pos -= 1
    	theSeq[pos] = value

4. 归并排序(Merge sort)：

运用divide and conquer (recursion)，将array不停的分割一直分到只有两个item，然后将两个item排序，最后再归并起来

def pythonMergeSort( theList ): 
	# Check the base case. 
	if len(theList) <= 1 : 
		return theList 
	else : 		# Compute the midpoint. 
       mid = len(theList) // 2
       # Split the list and perform the recursive step.
        leftHalf = pythonMergeSort( theList[ :mid ] )
        rightHalf = pythonMergeSort( theList[ mid: ] )
        #Merge the two ordered sublists.
        newList = mergeOrderedLists( leftHalf, rightHalf ) 
	return newList 

5. 快速排序(Quick sort)：

运用divide and conquer (recursion)，随便选择一个item当做pivot，然后将大于它的元素放在左边，将小于它的元素放在右边，这样循环(recursion)下去，

	\# Sorts an array or list using the recursive quick sort algorithm. 
def quickSort( theSeq ):
n = len( theSeq ) recQuickSort( theSeq, 0, n-1 ) 

	\# The recursive implementation using virtual segments. 
def recQuickSort( theSeq, first, last ): 
	if first >= last : # Check the base case. 
		return 
	else : 
      	# Save the pivot value.
     	pivot = theSeq[first]
		# Partition the sequence and obtain the pivot position. 
		pos = partitionSeq( theSeq, first, last ) 
		# Repeat the process on the two subsequences. 
     	recQuickSort( theSeq, first, pos - 1 )
     	recQuickSort( theSeq, pos + 1, last )

6. 基数排序(Radix sort)：

先根据个位数从0-9将元素划分，下一次在I 基础上根据十位数从0-9划分，以此类推

from llistqueue import Queue 
from array import Array 
def radixSort( intList, numDigits ):
\#the latter represents the maximum number of digits possible in the largest key value.
	binArray = Array( 10 ) 
	for k in range( 10 ): 
  		binArray[k] = Queue()
	column = 1
		for d in range( numDigits ): 
			for key in intList : 				digit = (key // column) % 10 
				binArray[digit].enqueue( key ) 
			i=0 			for bin in binArray : 
				while not bin.isEmpty() : 
					intList[i] = bin.dequeue() 
					i += 1 
			column *= 10 

二叉树

1. 基本概念

由分支(edges)和节点(nodes)组成，每棵树都有一个根，节点可以分为父节点和子节点，也可以分为内节点(interior node，至少又一个子节点)，和叶节点(leaf node，没有子节点了）。二叉树分为完美二叉树(perfect binary tree，所有的叶节点都在同一level)和完全二叉树(complete binary tree，从头到h-1层是完美二叉树，最后一层从左到右没有间隔)

2. 遍历二叉树：

1. Preorder traversal (先序遍历)：

先访问Node，再遍历左边，然后遍历右边，这样递归

def preorderTrav( subtree ): 
	if subtree is not None : 
    	print( subtree.data )
    	preorderTrav( subtree.left )
    	preorderTrav( subtree.right )

2. Inorder traversal(中序遍历)：

先遍历左边，再访问Node，最后访问右边

def inorderTrav( subtree ): 
	if subtree is not None : 
        inorderTrav( subtree.left )
        print( subtree.data )
        inorderTrav( subtree.right )

3. Postorder traversal(后序遍历)：

先遍历左边，再遍历右边，最后访问Node

def postorderTrav( subtree ): 
	if subtree is not None :
		postorderTrav( subtree.left )
		postorderTrav( subtree.right )
		print( subtree.data )

4. Breath-First traversal(宽度优先遍历)：

从左往右一层一层的遍历

def breadthFirstTrav( bintree ): # Create a queue and add the root node to it. 
Queue q
q.enqueue( bintree )
\# Visit each node in the tree. 
while not q.isEmpty() :
	# Remove the next node from the queue and visit it. 
   node = q.dequeue()
   print( node.data )
   # Add the two children to the queue.
   if node.left is not None :
   		q.enqueue( node.left ) 
	if node.right is not None : 
		q.enqueue( node.right ) 

3. Expression tree：

叶节点上是数值，其它节点上是操作符，处于低层的操作符优先执行，根节点的操作符最后执行

class ExpressionTree :
	def __init__( self, expStr ): 
    	self._expTree = None
    	self._buildTree( expStr )
	def evaluate( self, varMap ):
		return self._evalTree( self._expTree, varMap ) 
	def __str__( self ): 		return self._buildString( self._expTree ) 
\# ... 
\# Storage class for creating the tree nodes. 

class _ExpTreeNode : 	def __init__( self, data ): 
   		self.element = data
    	self.left = None
    	self.right = None

4. 用二叉树来构建heaps(堆栈)

Max heap(最大堆)：每个节点的数值都大于子节点; Min heap(最小堆)：每个节点的数值都大于子节点

class MaxHeap : # ... # Add a new value to the heap. 
def add( self, value ): 	assert self._count < self.capacity(), 
		"Cannot add to a full heap."
	# Add the new value to the end of the list. 
	self._elements[ self._count ] = value self._count += 1 
   	# Sift the new value up the tree.
  	self._siftUp( self._count - 1 )
			
def _siftUp( self, ndx ): 
	if ndx > 0 : 
    	parent = ndx // 2
		if self._elements[ndx] > self._elements[parent] : 
			tmp = self._elements[ndx]
			self._elements[ndx] = self._elements[parent] 
			self._elements[parent] = tmp 
			self._siftUp( parent ) 
class MaxHeap : 
\# ... 
	def extract( self ): 
		assert self._count > 0, 
			"Cannot extract from an empty heap."
		value = self._elements[0]
		self._count -= 1 		self._elements[0] = self._elements[ self._count ] 
		self._siftDown( 0 ) 

动态规划

1. 公式：

2. 伪代码：

3. Sequence alignment序列比对:

penalty score. For match, mismatch and gap; fill out the score matrix by choosing the max number; backtracking

3. 0-1 Knapsack(0-1背包问题)：

假设背包的承重上限为6，有四种物品，价值分别为30，40，45，100；对应的重量为1，2，3，4.

分析过程：

当只有0个物品或者背包空间为0的时候，什么也拿不了，因此拿到的总重量和价值都为0（第一行和第一列）
当只有一个物品(30, 1)，即价值是30重量是1时，不管背包大小是多少，都只能拿一个，总价值为30
当物品有两个(30,1)和(20,2)时，可以只拿(30,1)，也可以只拿(40,2)，也可以两个都拿(总价值为70)
当物品有三个时，即(30,1)、(40,2)、(45,3)时，我们可以选择(30,1)、(40,2)、(45,3)、(30,1)和(40,2)、(30,1)和(45,3)、(40,2)和(45,3)、(30,1)和(40,2)和(45,3)。所有的情况都没有超出背包的极限
当物品有4个时，以此类推，注意不能超过背包的极限

图论

1. 概念

图G=(V, E), V代表nodes也叫vertices，E表示Edges
Kn：complete graph，所有的edges都相连
Cn：circle graph，不一定所有的edges都相连，但首尾相接
trees：没有circle的graph

2. 遍历图Graph traversal

a. Depth-First search：

DFS keeps walking down a path until it is forced to backtrack; it backtracks until it finds a new path to go down

Recursive implementation: u, s – start node

Stack-based implementa:on of DFS

b. Breath-First search：

3. Minimum spanning tree (MST)

Prim’s algorithm: 在graph中随便找个node作为起点，然后一个一个的加入下一个节点让新的节点加入的edge长度最小。
Kruskal’s algorithm：Add edges in increasing weight and skip those whose addition would create a cycle.
Dijkstra’s algorithm：针对directed graph，根Prim‘s差不多，只是有方向了而已

贪婪算法：

跟动态规划算法相对，可以获取最佳结果

Basic data structrue and algorithm