Publications

Stream VByte: Faster Byte-Oriented Integer Compression
Information Processing Letters 130, 2018

Details PDF (arXiv) Code

Roaring Bitmaps: Implementation of an Optimized Software Library
under review

Details PDF (arXiv) Code

On Desirable Semantics of Functional Dependencies over Databases with Incomplete Information
Fundamenta Informaticae (to appear)

Details PDF (arXiv)

Faster Population Counts Using AVX2 Instructions
Computer Journal (to appear)

Details PDF (arXiv) Code

Regular and almost universal hashing: an efficient implementation
Software: Practice and Experience 47 (10), 2017

Details PDF (arXiv) Code

Efficient Integer-Key Compression in a Key-Value Store using SIMD Instructions
Information Systems 66, 2017

Details PDF (arXiv) Code

SIMD Compression and the Intersection of Sorted Integers
Software: Practice and Experience 46 (6), 2016

Details PDF (arXiv) Slides Code

Faster 64-bit universal hashing using carry-less multiplications
Journal of Cryptographic Engineering 6 (3), 2016

Details PDF (arXiv) Code

Consistently faster and smaller compressed bitmaps with Roaring
Software: Practice and Experience 46 (11), 2016

Details PDF (arXiv) Slides Code Project

Compressed bitmap indexes: beyond unions and intersections
Software: Practice and Experience 46 (2), 2016

Details PDF (arXiv) Code

Better bitmap performance with Roaring bitmaps
Software: Practice and Experience 46 (5), 2016

Details PDF (arXiv) Slides Code Project

Vectorized VByte Decoding
International Symposium on Web Algorithms, 2015

Details PDF (arXiv) Slides Code

Multidimensional Bloom Filters
Information Systems 54, 2015

Details PDF (arXiv) Code

Measuring academic influence: Not all citations are equal
Journal of the Association for Information Science and Technology 66 (2), 2015

Details PDF (arXiv) Dataset

Functional dependencies with null markers
Computer Journal 58 (5), 2015

Details PDF (arXiv)

Decoding billions of integers per second through vectorization
Software: Practice and Experience 45 (1), 2015

Details PDF (arXiv) Slides Code

A General SIMD-based Approach to Accelerating Compression Algorithms
ACM Transactions on Information Systems 33 (3), 2015

Details PDF (arXiv)

Strongly universal string hashing is fast
Computer Journal 57 (11), 2014

Details PDF (arXiv) Code

Diamond Dicing
Data & Knowledge Engineering 86, 2013

Details PDF (arXiv)

Time Series Classification by Class-Specific Mahalanobis Distances
Advances in Data Analysis and Classification 6 (3), 2012

Details PDF (arXiv)

The universality of iterated hashing over variable-length strings
Discrete Applied Mathematics 160 (4-5), 2012

Details PDF (arXiv)

Reordering Rows for Better Compression: Beyond the Lexicographic Order
ACM Transactions on Database Systems 37 (3), 2012

Details PDF (arXiv) Slides Code Code Code

Reordering Columns for Smaller Indexes
Information Sciences 181 (12), 2011

Details PDF (arXiv)

Extracting, Transforming and Archiving Scientific Data
In VLDL 2011, Berlin, Germany, 2011

Details PDF (arXiv)

A Call to Arms: Revisiting Database Design
SIGMOD Record 40 (3), 2011

Details PDF (arXiv)

Sorting improves word-aligned bitmap indexes
Data & Knowledge Engineering 69 (1), 2010

Details PDF (arXiv) Code

Recursive n-gram hashing is pairwise independent, at best
Computer Speech & Language 24 (4), pages 698-710, 2010

Details PDF (arXiv) Code

Hierarchical Bin Buffering: Online Local Moments for Dynamic External Memory Arrays
ACM Transactions on Algorithms 4 (1), article 14, 2008

Details PDF (arXiv) Code

Faster retrieval with a two-pass dynamic-time-warping lower bound
Pattern Recognition 42 (9), 2009

Details PDF (arXiv) Code

An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation
International Journal of Computer Mathematics 86 (7), 2009

Details PDF (arXiv) Code

Pruning Attribute Values From Data Cubes with Diamond Dicing
IDEAS 2008

Details PDF (arXiv)

Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes
DOLAP 2008

Details PDF (arXiv) Code

Tag-Cloud Drawing: Algorithms for Cloud Visualization
Tagging and Metadata for Social Information Organization (WWW 2007)

Details PDF (arXiv) Slides Code Dataset

Removing Manually-Generated Boilerplate from Electronic Texts: Experiments with Project Gutenberg e-Books
CASCON 2007

Details PDF (arXiv) Code

Monotonicity Analysis over Chains and Curves
In Curves and Surfaces 2006, Avignon, France, 2007

Details PDF (arXiv)

A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
DOLAP 2007, pages 17-24

Details PDF (arXiv) Slides Code

A Better Alternative to Piecewise Linear Time Series Segmentation
SIAM Data Mining 2007

Details PDF (arXiv) Code

Streaming Maximum-Minimum Filter Using No More than Three Comparisons per Element
Nordic Journal of Computing 13 (4), pages 328-339, 2006

Details PDF (arXiv) Code Code

Attribute Value Reordering For Efficient Hybrid OLAP
Information Sciences 176 (16), 2006

Details PDF (arXiv)

Slope One Predictors for Online Rating-Based Collaborative Filtering
In SIAM Data Mining (SDM 2005), Newport Beach, California, April 21-23, 2005

Details PDF (arXiv)

Scale and Translation Invariant Collaborative Filtering Systems
Information Retrieval 8 (1), 2005

Details PDF

Collaborative filtering and inference rules for context‐aware learning object recommendation
Interactive Technology and Smart Education 2 (3), 2005

Details PDF

A family of 4-point dyadic high resolution subdivision schemes
In Curves and Surfaces 2002, Saint-Malo, France, 2003

Details PDF Code

Fourier analysis of 2-point Hermite interpolatory subdivision schemes
Journal of Fourier Analysis and Applications 7 (5), 2001

Details PDF

Wavelet time entropy, T wave morphology and myocardial ischemia
IEEE Transactions on Biomedical Engineering 47 (7), 2000

Details PDF

Une famille d'ondelettes biorthogonales sur l'intervalle obtenue par un schéma d'interpolation itérative
Annales des Sciences Mathématiques du Québec 23 (1), 1999

Details PDF