Publications (2005 and later)
PDF files are provided for all papers; just follow the links. Most of my work appears on my arXiv page.
Follow my research
You can subscribe to my research papers by email, through a reader or through my Google Scholar profile.
In preparation
- Antonio Badia and Daniel Lemire, Effective functional dependencies with null markers
- Xiaodan Zhu, Peter Turney, Daniel Lemire, Andre Vellino, Measuring academic influence: Not all citations are equal
To appear
- Owen Kaser and Daniel Lemire, Strongly universal string hashing is fast, Computer Journal (arXiv:1202.4961)
- Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization, Software: Practice & Experience (arXiv:1209.2137)
2013
- Hazel Webb, Owen Kaser, Daniel Lemire, Diamond Dicing, Data & Knowledge Engineering 86, 2013. (arXiv:1006.3726)
2012
- Zoltán Prekopcsák and Daniel Lemire, Time Series Classification by Class-Based Mahalanobis Distances, Advances in Data Analysis and Classification 6 (3), 2012. (arXiv:1010.1526)
- Daniel Lemire, Owen Kaser, Eduardo Gutarra, Reordering Rows for Better Compression: Beyond the Lexicographic Order, ACM Transactions on Database Systems 37 (3), 2012. (arXiv:1207.2189)
- Daniel Lemire, The universality of iterated hashing over variable-length strings, Discrete Applied Mathematics 160 (4-5), 2012. (arXiv:1008.1715)
- Cameron Neylon, Jan Aerts, C. Titus Brown, Daniel Lemire, Jarrod Millman, Peter Murray-Rust, Fernando Perez, Neil Saunders, Arfon Smith, Gaël Varoquaux and Egon Willighagen, Changing computational research: The challenges ahead (Editorial), Source Code for Biology and Medicine 7 (2), 2012.
2011
- Antonio Badia and Daniel Lemire, A Call to Arms: Revisiting Database Design, SIGMOD Record 40 (3), 2011. (Free copy from ACM) (arXiv:1105.6001)
- Daniel Lemire and Owen Kaser, Reordering Columns for Smaller Indexes, Information Sciences 181 (12), 2011. (arXiv:0909.1346)
- Andre Vellino and Daniel Lemire, Extracting, Transforming and Archiving Scientific Data, VLDL 2011, 2011. (arXiv:1108.4041)
2010
- Daniel Lemire and Owen Kaser, Recursive n-gram hashing is pairwise independent, at best, Computer Speech & Language 24 (4), pages 698-710, 2010. (arXiv:0705.4676) (source code (C++))
- Daniel Lemire, Owen Kaser, Kamel Aouiche, Sorting improves word-aligned bitmap indexes. Data & Knowledge Engineering 69 (1), pages 3-28, 2010. (arXiv:0901.3751) (slides)
- Sylvie Noël, Daniel Lemire, On the Challenges of Collaborative Data Processing, in Collaborative Information Behaviour: User Engagement and Communication Sharing (edited by Jonathan Foster), IGI Global, April 2010. (arXiv:0906.0910)
2009
- Daniel Lemire, Martin Brooks and Yuhong Yan, An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation. International Journal of Computer Mathematics 86 (7), 2009. (arXiv:cs/0702142)
- Daniel Lemire, Faster Retrieval with a Two-Pass Dynamic-Time-Warping Lower Bound, Pattern Recognition 42 (9), pages 2169-2180, 2009. (arXiv:0811.3301) (C++ source code)
About this paper:
To our knowledge, there is only one paper that offers a plausible speedup based on a tighter lower bound—Lemire (2009) suggests a mean speedup of about 1.4 based on a tighter bound. These results are reproducible, and testing on more general data sets we obtained similar results (...) (Wang et al. 2013)
- Kamel Aouiche, Daniel Lemire and Robert Godin, Web 2.0 OLAP: From Data Cubes to Tag Clouds, Lecture Notes in Business Information Processing Vol. 18, pages 51-64, 2009. (arXiv:0905.2657)
2008
- Owen Kaser, Daniel Lemire, Kamel Aouiche, Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes, DOLAP 2008, 2008. (arXiv:0808.2083) (Free copy from ACM) (C++ source code) (slides)
- Kamel Aouiche, Daniel Lemire and Owen Kaser, Tri de la table de faits et compression des index bitmaps avec alignement sur les mots (Fact Table Sorting and Word-Aligned Compression for Bitmap Indexes), BDA'08, 2008. (arXiv:0805.3339) (C++ source code)
- Hazel Webb, Owen Kaser, Daniel Lemire, Pruning Attributes From Data Cubes with Diamond Dicing, IDEAS'08, 2008. (arXiv:0805.0747) (Free copy from ACM)
- Daniel Lemire and Owen Kaser, Hierarchical Bin Buffering: Online Local Moments for Dynamic External Memory Arrays, ACM Transactions on Algorithms 4 (1), pages 1-31, 2008. (cs.DS/0610128) (Free copy from ACM) (C++ source code)
- Kamel Aouiche, Daniel Lemire, Robert Godin, Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation, WEBIST 2008, 2008. (arXiv:0710.2156)
2007
- Kamel Aouiche and Daniel Lemire, A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP, DOLAP 2007, pp. 17-24, 2007. (cs.DB/0703058) (Free copy from ACM) (C++ source code) (slides)
- Owen Kaser and Daniel Lemire, Removing Manually-Generated Boilerplate from Electronic Texts: Experiments with Project Gutenberg e-Books. CASCON 2007, pp. 272-275, 2007. (arXiv:0707.1913) (Free copy from ACM)
- Owen Kaser and Daniel Lemire, Tag-Cloud Drawing: Algorithms for Cloud Visualization. In proceedings of Tagging and Metadata for Social Information Organization (WWW 2007), 2007. (cs.DS/0703109) (data) (slides) (C and Java software) (alternate C software)
- Kamel Aouiche and Daniel Lemire, Unassuming View-Size Estimation Techniques in OLAP, An Experimental Comparison, Proceedings of ICEIS-2007, pp. 145-150, 2007. (cs.DB/0703056)
- Daniel Lemire, A Better Alternative to Piecewise Linear Time Series Segmentation, SIAM Data Mining 2007, 2007. (cs.DB/0605103)
- Dan Kucerovsky and Daniel Lemire, Monotonicity Analysis over Chains and Curves. Proceedings of Curves and Surfaces 2006, pages 180-190, 2007. (math.GM/0701481)
- Mamadou Tadiou Koné and Daniel Lemire (Eds.), Special Issue on Canadian Semantic Web, Computational Intelligence, Blackwell Publishing, August 2007 - Vol. 23 Issue 3, pages 299-392.
2006
- Daniel Lemire, Streaming Maximum-Minimum Filter Using No More than Three Comparisons per Element. Nordic Journal of Computing, 13 (4), pages 328-339, 2006. (cs.DS/0610046) (C++ source code) (Python source code) (other Python source code)
- Owen Kaser, Daniel Lemire, and Steven Keith, The LitOLAP Project: Data Warehousing with Literature, CaSTA 2006, Fredericton, 2006.
- Owen Kaser and Daniel Lemire, Attribute Value Reordering for Efficient Hybrid OLAP, Information Sciences, Volume 176, Issue 16, pages 2279-2438, 2006. (cs.DB/0702143)
- Mamadou Tadiou Koné and Daniel Lemire (Eds.), Canadian Semantic Web, Semantic Web and Beyond: Computing for Human Experience, Springer, September 2006. Buy it on Amazon.
2005
- Steven Keith, Owen Kaser, Daniel Lemire, Analyzing Large Collections of Electronic Text Using OLAP, APICS 2005, Wolfville, Canada, October 2005. (cs.DB/0605127)
- Daniel Lemire, Martin Brooks, and Yuhong Yan, An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation, IEEE Data Mining (ICDM-05), pp. 709-712, November 2005. (cs.DS/0702142)
- Daniel Lemire, Harold Boley, Sean McGrath, Marcel Ball, Collaborative Filtering and Inference Rules for Context-Aware Learning Object Recommendation, International Journal of Interactive Technology & Smart Education, Volume 2, Issue 3, August 2005.
- Will Fitzgerald, Daniel Lemire, and Martin Brooks, Quasi-monotonic segmentation of state variable behavior for reactive control, AAAI05, Pittsburgh, USA, pp. 1145-1150, July 2005.
- Martin Brooks, Yuhong Yan, Daniel Lemire, Scale-Based Monotonicity Analysis in Qualitative Modelling with Flat Segments, IJCAI05, Edinburgh, UK, pp. 400-405, July 2005.
- Daniel Lemire and Anna Maclachlan, Slope One Predictors for Online Rating-Based Collaborative Filtering, SIAM Data Mining (SDM'05), pp. 471-476, 2005. (cs.DB/0702144)
- Daniel Lemire, Scale and Translation Invariant Collaborative Filtering Systems. Information Retrieval, 8 (1), pages 129-150, January 2005. (NRC 46508)
Complete list
- My c.v. contains a complete list (Adobe Acrobat/PDF).
Bibliography Servers
- For my work in Computer Science, I appear on the DBLP Server.
- I also appear on the ACM portal.
- Most of my work appears on Google Scholar.