"Xanadu" and the Flaming-X logo are registered trademarks of Project Xanadu. |
|
8 Visitor(s) on Site 98 Wiki Pages Recently Modified
Sections (edit)How This Wiki Works
|
|||
| Recent Changes Printable View Page History Edit Page | |||
|
Content Last Modified on December 30, 2007, at 12:05 PM CST
About Tree Data Structures- trees versus hashes (we need range operations, which are lacking in hashes) - most trees are organzized along one dimension of search (TBD: add lots of links to background material on wikipedia et. al.) OperationsDynamic set operations, in time proportional to the height of the tree.
Balanced versus UnbalancedA fully balanced tree has a height "logb n" where n is the number of nodes in the tree, and b is the number of children per node. The objective of balancing is to minimize the number of levels in the tree in order to obtain the best running time. Types of Trees
A binary tree balanced by rotating nodes to adjust the height of the tree. A rotation
A
/ \
X B
/ \
Y Z
Goes to:
B
/ \
A Z
/ \
X Y
This will reduce the height of the tree, if Y or Z are more than 2 nodes taller than X. Red/black trees have an invariant that all black nodes are perfectly balanced, and red nodes can fill the levels between black nodes. Alternatively, all paths in the tree have the same number of black nodes in them, and any red node has a black father. See Corman, Leiserson, and Rivest for good explanations. All nodes have 2 or 3 children -- simpler code than Red/Black, but also non-uniform nodes, which complicates storage management.
graphical query tree that where each node divides a 2-d index space into 4 quadrants based on the key value. These are very hard to balance incrementally -- I don't know that anyone does so. Octrees are the 3-d variant. Another multidimensional tree, these support search in N-dimensional spaces: at each level one dimension is used. the 2-variant are XY trees, splitting first on x-coord, then y-coord, then back to x, etc. again, incremental balancing not sensible. CachingLarge trees often cannot all be held in primary storage (RAM) at one time. Instead a virtualization of storage, by combining an in-memory cache with secondary storage is used in these cases. Optimizing the number of nodes becomes more important with splitting storage across primary and secondary. LockingFor external storage, B-trees rule the tree roost because they are easy to work with in multi-user scenarios -- if you split full nodes on the way down, at most 2 nodes need be locked at a time, and the locks are always taken from the top-down, preventing deadlock. High branching factors of 512-4096, typically mean that most b-trees are 3-4 levels even for enormous databases, with the root kept in memory. This limits disk accesses to 2 at most to get to any data -- caching can make the average much smaller. Mutability versus ImmutabilityConcurrency IssuesPerformance re the Big O(insert table of factors) Relevant Citations
|
|||
| Recent Changes Printable View Page History Edit Page | |||
| All trademarks and copyrights on this page are owned by their respective companies. Comments are owned by the Poster. The Rest (c) 2001-2007 Jeff Rush | |||