Parser Stats: automatically and incrementally reported with each parse.
See the AboutUs Getting Started blog post announcing the open-sourcing of this technology. github
Developed a sediment-inspired uniform sampling algorithm. Add elements at period p until the sediment reaches its limit. Then decimate the sediment by half and double the period. github
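The description above can be sketched roughly as follows. This is a minimal illustration, not the code in the linked repository: the class name `SedimentSampler`, the `limit` parameter, and the choice to keep every other element when decimating are all assumptions made for the example.

```python
class SedimentSampler:
    """Keep a bounded, evenly spaced sample of a stream (hypothetical sketch)."""

    def __init__(self, limit=8):
        self.limit = limit      # assumed: sediment size that triggers decimation
        self.period = 1         # keep one element out of every `period` seen
        self.count = 0          # number of elements seen so far
        self.sediment = []      # the retained, evenly spaced sample

    def add(self, item):
        # Only elements that land on the current period are kept.
        if self.count % self.period == 0:
            self.sediment.append(item)
            if len(self.sediment) >= self.limit:
                # Decimate: drop every other element, then double the period
                # so new arrivals match the spacing of what remains.
                self.sediment = self.sediment[::2]
                self.period *= 2
        self.count += 1


sampler = SedimentSampler(limit=8)
for n in range(100):
    sampler.add(n)
print(sampler.period, sampler.sediment)
```

Because decimation keeps the elements at even indexes and the period doubles at the same moment, the retained elements stay evenly spaced across the whole stream, and the sediment never holds more than `limit` items between calls.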
See the related reservoir sampling, which is happy to be uniformly random but not evenly spaced. wikipedia
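For contrast, here is a minimal sketch of the classic reservoir sampling scheme (often called Algorithm R). The function name `reservoir_sample` and its parameters are chosen for illustration only. Each element of the stream ends up in the sample with equal probability, but nothing keeps the chosen positions evenly spaced.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


print(reservoir_sample(range(100), 8))
```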
My approach here was informed by my positive experience with XML Import in Ruby.